Network Working Group                                       V. Narayanan
Internet-Draft                                            Qualcomm, Inc.
Intended status: Standards Track                       November 19, 2008
Expires: May 23, 2009


               Usage Agnostic Overlay Operation in RELOAD
              draft-vidya-p2psip-usage-agnostic-reload-00

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on May 23, 2009.

Abstract

   RELOAD [1] defines an overlay framework for providing peer-to-peer
   connectivity and storage/retrieval primitives for applications.
   Applications or usages are expected to reside on top of such an
   overlay.  In general, this is a good design that allows multiple
   applications to use the same overlay framework.  In such a design,
   however, there are some decisions to be made in terms of what is an
   overlay function and what must be defined by a usage.  These
   decisions should generally be based on whether the particular
   function is expecting an operation or guarantee from the overlay
   nodes in general or from a particular usage only.  This type of
   separation is especially crucial to avoid needing flag days for
   upgrading nodes in order to accommodate a newer usage version for
   performing the overlay operation.


Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Motivation for Usage Agnostic Overlay Behavior
   4.  Data Authorization
     4.1.  Data Authorization Properties and Requirements
     4.2.  Data Authorization in RELOAD and Proposed Changes
   5.  Storage Semantics in RELOAD and Proposed Changes
     5.1.  Store and Remove Request Processing
     5.2.  Storing and Retrieval of Related Data
   6.  Security Considerations
   7.  IANA Considerations
   8.  Acknowledgments
   9.  Normative References
   Author's Address
   Intellectual Property and Copyright Statements


1.  Introduction

   RELOAD allows usages to be defined on top of an overlay
   instantiation.  Usages may define various kinds of data, with a
   corresponding data model for each kind.  In order to store data on a
   particular node in the overlay, RELOAD requires that the node
   recognize the kind id of the data being stored and that the data
   model used be the correct one for that kind id.  Further, for data
   authorization, RELOAD requires that the resource name be computed
   using a certified user name or node id, where the certificate of
   interest and the resource name calculation mechanism may be defined
   by a usage.  For these reasons, it is necessary that all nodes in an
   overlay support all usages that may be offered on that overlay.
   Further, when usages evolve to newer versions, it is essential that
   all nodes support the same versions of all usages.  In a distributed
   system, this requires a flag day for all devices, which is usually
   unrealistic.  The goal of this draft is to bring out the need for
   keeping storage and overlay-level data authorization usage agnostic.
   With that separation, RELOAD-compatible overlays can more easily
   support heterogeneous usages and allow usages to be upgraded
   incrementally over time.  For properties and primitives provided at
   the overlay level, it is crucial that we maintain such usage
   independence.


2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [2].

   This document uses the terminology used in RELOAD.  In addition, the
   following terms are used:

   o  "Overlay level" - The term "overlay level" is used to refer to
      operations at the layers below the usage layer and above the
      transport layer in the RELOAD architecture.

   o  "Application/Usage level" - The terms "application level" or
      "usage level" are used to refer to operations at the usage layer
      in the RELOAD architecture.


3.  Motivation for Usage Agnostic Overlay Behavior

   At a very fundamental level, DHTs provide key-based routing, on top
   of which several applications may exist.  Overlay protocols, such as
   the one defined in RELOAD, provide storage and retrieval semantics
   and allow peer-to-peer connectivity for nodes.  These are very basic
   primitives that can be used by a wide range of applications.  Even
   for the SIP-based class of applications within the scope of the
   P2PSIP WG, there are several applications that may exist on various
   nodes in an overlay.  It is not necessary that all nodes in an
   overlay support the same applications.  Further, a given application
   may evolve over time, with newer versions potentially defining new
   kinds of data to be stored and retrieved on an overlay.  When overlay
   primitives such as storage function independently of the usage, the
   overlay does not require all of its nodes to support a given usage in
   order to store data corresponding to that usage.  Usages may, of
   course, provide additional mechanisms at the usage level that may be
   interpreted by nodes that do support those usages, but this is itself
   independent of the overlay primitives.

   Given the heterogeneous usages and the potential evolution of even
   the same usage, an overlay will need a flag day for upgrading all
   devices if the operations on the overlay primitives are allowed to
   be defined by the usages.  Without such a flag day, the overlay will
   be rendered unusable or will behave unpredictably when incompatible
   usages are present on various nodes.  Hence, while usages should be
   allowed to interface with the overlay layer to provide certain
   inputs, the operations on overlay primitives such as storage should
   be defined and contained within the overlay layers.


4.  Data Authorization

4.1.  Data Authorization Properties and Requirements

   Before discussing the specific data authorization model employed in
   RELOAD, let us briefly discuss the requirements for data
   authorization and what properties are expected of it.  Data
   authorization provides the following properties:

   o  Data authorization allows the storing node to verify that the data
      owner is in fact authorized to store at that particular resource
      id.  This allows the overlay to restrict the number of locations
      at which data can be stored by a given node and hence limits the
      impact of a single node on the distributed quota available in the
      overlay.  However, the ability to spread a given node's data
      around the overlay is an equally important feature; hence,
      restricting a node to exactly one storage location is also a
      problem.  Without the ability to spread data, a single storing
      node may be burdened by all the data from a heavy user.  A single
      location also gives an attacker an incentive to target that
      particular resource id for a chosen location attack.


   o  Data authorization allows the data owner to control which nodes/
      users are allowed to write to that particular resource id.
      Without this, any node may potentially overwrite the contents of a
      given resource id.

   Both properties described above are usage independent and should be
   realized as such.  The following are the requirements to consider in
   designing a data authorization solution:

   o  Data authorization for overlay storage MUST be provided in a usage
      agnostic fashion.  Hence, a node in the overlay should be able to
      verify authorization of a storage request from another node,
      without having to support the usage to which the data to be stored
      belongs.

   o  Data authorization MUST allow the data owner to ensure that
      unauthorized nodes cannot write to or modify the contents of the
      particular items in the resource id that it stored.  It should be
      noted that the assurance of the data authorization model is
      dependent on the storing node enforcing the model.

   o  An authorized store MUST be independently verifiable by any node
      in the overlay.  In other words, a node retrieving the contents of
      a resource id must be able to verify that the data owner
      corresponding to it was authorized to store at that resource id.
      It should be noted that it is not readily possible to verify that
      the storing node was responsible for a given resource id.

   o  A given node SHOULD be able to store at multiple locations in the
      overlay.  This allows a node to spread its data across an overlay
      and not rely on a single node.  In general, replication and
      caching strategies may address the limitations of a single point
      of storage.  However, having the overlay storage primitives
      natively offer this spread is inherently useful in handling
      queries and the storage itself.

   o  Data authorization at the overlay level MUST be based on overlay
      credentials (e.g., the certificate provided by an enrollment
      server or a self-signed certificate, whichever is honored by the
      overlay).  Usage specific certificates cannot be honored by nodes
      that do not support those usages and hence, are not a viable
      candidate for overlay level data authorization.

   o  The overlay MUST provide the ability for a querier to look up a
      service without having prior knowledge of the user name or node id
      of the entity offering the service.  For example, a node looking
      for TURN services cannot realistically be expected to know the
      corresponding user names or node ids.


   o  Resource id computation MUST be performed in a usage agnostic
      fashion.  Usages may be allowed to provide usage-specific inputs
      into the resource id computation.

4.2.  Data Authorization in RELOAD and Proposed Changes

   Data authorization in RELOAD is based on computing the resource name
   from a certified user name or node id.
   Essentially, the use of such a mechanism for resource id calculation
   allows a storing node to verify that the data owner is authorized to
   store at that location.  RELOAD allows the resource name that is used
   as input to resource id computation to be defined by specific usages.
   Further, each usage is also allowed to define its own rules for data
   authorization.  This makes the data authorization model usage
   dependent, lending itself to the need for homogeneous usage support
   and usage upgrade flag days for overlays.  This section suggests a
   slight deviation in the authorization model for RELOAD to allow
   independent operation of storage authorization at the overlay level.

   Resource ID is computed as a hash of a Resource Name, as specified in
   RELOAD.  In this proposal, however, the Resource Name consists of two
   parts, a usage specific string and an AuthName, separated by the
   delimiter ":".  The usage specific string may be provided by a given
   usage or defaulted to the usage id.  The AuthName is the certified
   user name or node id.  Hence, ResourceID = h(UsageString:AuthName).
   As an example, it amounts to something like
   h(sip:alice@example.dht.org).  Note that there is some ambiguity in
   the text around resource id generation in the current RELOAD draft,
   and that text may already be intended to lead to this same
   construction.
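
   The ResourceID = h(UsageString:AuthName) construction, and the check
   a storing node could perform against it, can be sketched as follows.
   This is a minimal illustration only.  The use of SHA-1 for h(), the
   function names, the assumption that the usage string accompanies the
   store request, and the representation of the certified names as a
   plain collection of strings are all assumptions of this example and
   are not taken from RELOAD.

```python
import hashlib
from typing import Iterable

DELIMITER = ":"  # separates the usage specific string from the AuthName


def resource_name(usage_string: str, auth_name: str) -> str:
    """Build the Resource Name as UsageString:AuthName."""
    return f"{usage_string}{DELIMITER}{auth_name}"


def resource_id(name: str) -> bytes:
    """Hash a Resource Name into a Resource ID (SHA-1 is an assumption)."""
    return hashlib.sha1(name.encode("utf-8")).digest()


def is_authorized(claimed_id: bytes, usage_string: str,
                  certified_names: Iterable[str]) -> bool:
    """Usage agnostic check performed by a storing node.

    The usage string is treated as an opaque value.  The node only needs
    the names certified by the requester's overlay certificate.
    """
    return any(
        resource_id(resource_name(usage_string, name)) == claimed_id
        for name in certified_names
    )


rid = resource_id(resource_name("sip", "alice@example.dht.org"))
print(is_authorized(rid, "sip", ["alice@example.dht.org"]))  # True
print(is_authorized(rid, "sip", ["bob@example.dht.org"]))    # False
```

   Because the storing node never interprets the usage string, the same
   check applies to data from usages that the node does not implement.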

   When a usage is interested in allowing queries without prior
   knowledge of the user name or node id, it MUST NOT include the
   AuthName as part of the Resource Name.  In such cases, authorization
   may be provided at the value level.  For instance, ":AuthName" may be
   appended to the value for this purpose.  Note that this is more
   straightforward in the Single Value and Dictionary data models and
   more challenging in the array case.  Usages must evaluate the need
   for overlay level authorization and determine which of the two forms
   to use.  Without the possibility of value-level data authorization,
   however, querying without knowledge of the user name or node id will
   require more complicated overlay crawling and service publication
   for discovery purposes.  In order to support this dual level
   authorization, the protocol must provide the ability to signal the
   authorization type in use.  Further, authorization MUST NOT be
   applied at both the resource id and value levels at the same time.
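
   One way a storing node might act on a signalled authorization type is
   sketched below.  The AuthType values, the StoreRequest fields, and
   the example names and values are hypothetical; this draft does not
   define an encoding for the signal.  The sketch also assumes that the
   AuthName is the text after the last ":" in the Resource Name or in
   the value.

```python
from dataclasses import dataclass
from enum import Enum


class AuthType(Enum):
    """Hypothetical signal naming the single level that carries authorization."""
    RESOURCE_ID = "resource-id"  # AuthName is part of the Resource Name
    VALUE = "value"              # ":AuthName" is appended to the value


@dataclass
class StoreRequest:
    resource_name: str
    value: str
    auth_type: AuthType  # exactly one level is signalled per request


def is_authorized(req: StoreRequest, certified_names) -> bool:
    """Check the AuthName at the signalled level only, never at both."""
    if req.auth_type is AuthType.RESOURCE_ID:
        subject = req.resource_name
    else:
        subject = req.value
    # The AuthName is assumed to follow the last ":" delimiter.
    _, sep, auth_name = subject.rpartition(":")
    return sep == ":" and auth_name in certified_names


by_name = StoreRequest("sip:alice@example.dht.org", "contact-data",
                       AuthType.RESOURCE_ID)
by_value = StoreRequest("turn", "192.0.2.10:3478:node42", AuthType.VALUE)

print(is_authorized(by_name, {"alice@example.dht.org"}))  # True
print(is_authorized(by_value, {"node42"}))                # True
print(is_authorized(by_value, {"node99"}))                # False
```

   In the value-level case the Resource Name carries no AuthName, so a
   querier can compute the Resource ID from the usage string alone.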


5.  Storage Semantics in RELOAD and Proposed Changes

5.1.  Store and Remove Request Processing

   RELOAD requires a node receiving a store/remove request to verify
   that the kind id is known to it and the corresponding data model is
   correct.  This assumes that the storing node supports the exact same
   usage, which may not necessarily be true.  Hence, it is recommended
   that these checks not be required in order to store the data.  A
   node may, based on local policy, give preference to data
   corresponding to usages it supports, but such local decisions should
   be outside the scope of the overlay protocol itself.
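
   The following sketch illustrates a storing node that accepts data for
   kind ids it does not recognize and treats values as opaque bytes.
   The class, its fields, and the "reserved slots" preference are
   hypothetical examples of a local policy and are not part of RELOAD
   or of this proposal.

```python
class StoringNode:
    """Stores values for any kind id, whether or not the usage is supported."""

    def __init__(self, supported_kinds, capacity, reserved):
        self.supported_kinds = set(supported_kinds)
        self.capacity = capacity  # total number of values this node will hold
        self.reserved = reserved  # slots kept free for supported usages
        self.store = {}           # (resource_id, kind_id) -> opaque bytes

    def handle_store(self, resource_id, kind_id, value):
        key = (resource_id, kind_id)
        if key not in self.store:
            # Local policy only: unknown kinds may not use the reserved slots.
            if kind_id in self.supported_kinds:
                limit = self.capacity
            else:
                limit = self.capacity - self.reserved
            if len(self.store) >= limit:
                return False
        # No kind id or data model check: the value is stored as-is.
        self.store[key] = value
        return True


node = StoringNode(supported_kinds={1}, capacity=3, reserved=1)
print(node.handle_store(b"r1", 99, b"opaque"))  # True, unknown kind accepted
print(node.handle_store(b"r2", 98, b"opaque"))  # True
print(node.handle_store(b"r3", 97, b"opaque"))  # False, only reserved space left
print(node.handle_store(b"r4", 1, b"known"))    # True, supported kind
```

   The preference logic sits entirely inside the node, so other nodes
   need no knowledge of it.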

5.2.  Storing and Retrieval of Related Data

   Going by the resource id computation described in Section 4.2, data
   corresponding to a given usage and a given user/node may be stored at
   a particular resource id.  A usage may spread its data around even
   further by giving different semantics to the usage specific string
   in the resource name.  For instance, instead of
   "sip:alice@example.dht.org" being the resource name, one may envision
   "sipaor:alice@example.dht.org", etc.  RELOAD allows storing the
   certificate corresponding to a user name or node id at the resource
   ids corresponding to the hash of the user name and/or node id.  That
   by itself is useful; however, it is also useful if the certificate
   can be stored at the same place as the SIP AOR.  There are two ways
   in which this can be accomplished.  The SIP usage may specify a
   certificate kind that may also be stored at the resource id.
   Alternatively, it can be modeled as a targeted storage of a resource
   id and corresponding value at a node that may not own that resource
   id.  This would allow a node that is storing data corresponding to
   multiple usages at the same resource id (either because it chose to
   not have authorization or because it computed its resource id as
   simply a hash of its AuthName) to just store one instance of a
   certificate even if multiple usages make use of it.  However, at
   this time, it seems reasonable to expect each usage to define all
   required related data kinds such that those objects can be stored at
   the same resource id with relative ease.
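
   The two effects described above, spreading data by varying the usage
   specific string and co-locating related kinds at one resource id,
   can be sketched as follows.  SHA-1 as the hash, the kind id numbers,
   and the stored byte strings are assumptions of this example.

```python
import hashlib


def resource_id(usage_string: str, auth_name: str) -> str:
    """Hash UsageString:AuthName (SHA-1 is an assumption of this sketch)."""
    name = f"{usage_string}:{auth_name}"
    return hashlib.sha1(name.encode("utf-8")).hexdigest()


auth_name = "alice@example.dht.org"

# Different usage specific strings map the same AuthName to different
# locations in the overlay.
rid_sip = resource_id("sip", auth_name)
rid_aor = resource_id("sipaor", auth_name)
print(rid_sip != rid_aor)  # True

# Hypothetical kind ids defined by a single usage.
KIND_SIP_REGISTRATION = 1
KIND_CERTIFICATE = 2

# Related kinds stored under the same resource id, keyed by kind id.
storage = {}
storage.setdefault(rid_sip, {})[KIND_SIP_REGISTRATION] = b"contact data"
storage.setdefault(rid_sip, {})[KIND_CERTIFICATE] = b"certificate bytes"
print(sorted(storage[rid_sip]))  # [1, 2]
```

   A querier that fetches the resource id for the SIP AOR can then
   obtain the certificate from the same storing node.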


6.  Security Considerations

   This draft relates to usage independent overlay semantics in peer-to-
   peer networks.  The related security aspects for data authorization
   have been described in the body of the draft.  There are no
   additional security considerations to document at this time.


7.  IANA Considerations

   None


8.  Acknowledgments

   This draft came about as a result of several discussions with
   Lakshminath Dondeti, Ranjith Jayaram, Radha Padmanabhan, and Saumitra
   Das.


9.  Normative References

   [1]  Jennings, C., Lowekamp, B., Rescorla, E., Baset, S., and H.
        Schulzrinne, "REsource LOcation And Discovery (RELOAD) Base
        Protocol", draft-ietf-p2psip-base-00 (work in progress),
        October 2008.

   [2]  Bradner, S., "Key words for use in RFCs to Indicate Requirement
        Levels", BCP 14, RFC 2119, March 1997.


Author's Address

   Vidya Narayanan
   Qualcomm, Inc.
   5775 Morehouse Dr
   San Diego, CA
   USA

   Phone: +1 858-845-2483
   Email: vidyan@qualcomm.com


Full Copyright Statement

   Copyright (C) The IETF Trust (2008).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

