Internet Draft   Resource Management in Diffserv Framework    April 2001

Internet Engineering Task Force                              L. Westberg
INTERNET-DRAFT                                              M. Jacobsson
Expires October 2001                                      G. Karagiannis
                                                             S. Oosthoek
                                                              D. Partain
                                                              V. Rexhepi
                                                                R. Szabo
                                                            P. Wallentin
                                                                Ericsson
                                                              April 2001

            Resource Management in Diffserv (RMD) Framework
                  draft-westberg-rmd-framework-00.txt

Status of this memo

This document is an Internet-Draft and is in full conformance with all
provisions of Section 10 of RFC2026. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas, and
its working groups. Note that other groups may also distribute working
documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Distribution of this memo is unlimited.

Copyright Notice

Copyright (C) The Internet Society (2001). All Rights Reserved.

Westberg, et al.           Expires October 2001                 [Page 1]

Abstract

This draft presents a framework for the Resource Management in Diffserv
(RMD) designed for edge-to-edge resource reservation in a Differentiated
Services (Diffserv) domain. RMD extends the Diffserv architecture with
new resource reservation concepts and features. Moreover, this framework
enhances the Load Control protocol described in [WeTu00].

The RMD framework defines two architectural concepts:

- the Per Hop Reservation (PHR)
- the Per Domain Reservation (PDR)

The PHR protocol is used in a Diffserv domain on a per-hop basis to
augment the Diffserv Per Hop Behavior (PHB) with resource reservation.
It is implemented in all nodes in a Diffserv domain. On the other hand,
the PDR protocol manages the resource reservation per Diffserv domain,
relying on the PHR resource reservation status in any of the nodes. It
is only implemented at the boundary of the domain (at the edge nodes).

The RMD framework presented in this draft describes the new reservation
concepts and features. Furthermore, it describes the:

- relationship between the PHR and PHB
- interaction between the PDR and PHR
- interoperability between the PDR and external resource reservation
  schemes

This framework is an open framework in the sense that it provides the
basis for interoperability with other resource reservation schemes and
can be applied in different types of networks as long as they are
Diffserv domains.

The framework scheme presented in this document aims at extreme
simplicity and low cost of implementation along with good scaling
properties.

1. Introduction

Today's Internet applications range from simple ones such as e-mail, web
browsing and file transfers to highly demanding real-time applications
like audio and video streaming, IP telephony and multimedia
conferencing. This diversity has influenced the user's and provider's
expectations of the Internet infrastructure for satisfying the diverse
service needs of the applications.

In a highly competitive environment such as the Internet Service
Providers' (ISPs) world, satisfying customer needs, whether they are
other ISPs or end users, is key to survival. Therefore, the ISPs' zeal
to provide value-added services to their customers is natural.

One significant class of such value-added services requires real-time
message transport. It can be expected that these real-time services will
be popular as they replicate or are natural extensions of existing
communication services like telephony.
Exact and reliable resource management (such as admission control) is
essential for achieving high utilization in networks with real-time
transport requirements. Solving this problem is difficult primarily due
to scalability issues.

The Differentiated Services (Diffserv) architecture ([RFC2475],
[RFC2638], [BeBi99]) was introduced as a result of efforts to avoid the
scalability and complexity problems of Intserv [RFC1633]. Scalability is
achieved by offering services on an aggregate basis rather than per-flow
and by forcing as much of the per-flow state as possible to the edges of
the network. The service differentiation is achieved using the
Differentiated Service (DS) field in the IP header and the Per-Hop
Behavior (PHB) as main building blocks. Packets are handled at each node
according to the PHB indicated by the DS field in the message header.

The Diffserv domain will provide to its customer, which is a host or
another domain, the required service by complying fully with the Service
Level Agreement (SLA) agreed upon. The SLA can be negotiated either
statically or dynamically. The transit service to be provided, with
accompanying parameters like transmit capacity, burst size and peak
rate, is specified in the technical part of the SLA, the Service Level
Specification (SLS).

However, the Diffserv architecture currently does not standardize any
solution for dynamic resource reservation. This memo, the RMD framework,
defines a dynamic resource reservation scheme that can be used for the
dynamic SLS provisioning in an edge-to-edge Diffserv domain. As such,
once solutions for resource reservation are introduced, Diffserv needs
to be extended with new features.

Moreover, this framework enhances the Load Control protocol described in
[WeTu00]. The basic functionality in the interior nodes as proposed by
that memo is similar to the proposal in this memo.

The RMD framework distinguishes between two types of protocols, the Per
Domain Reservation (PDR) and Per Hop Reservation (PHR) protocols:

- A Per Domain Reservation protocol is used to perform resource
  reservation in the complete Diffserv domain. A PDR protocol is used by
  the edge nodes (ingress and egress), but not by the interior nodes.

- A Per Hop Reservation protocol is used to perform a per-hop
  reservation, extending the Diffserv PHB. A PHR protocol is used in all
  nodes in the Diffserv domain (both edge and interior nodes) on a hop
  by hop basis.

Furthermore, the RMD framework defines:

- the relationship between the PHR and PHB
- the interaction between the PDR and PHR
- interoperability between the PDR and external resource reservation
  schemes

The design of the PHR and PDR protocols extends the Diffserv framework
with new features necessary for the deployment of the RMD in Diffserv
domains. The new features required in this reservation scheme are
presented in this framework draft.

As this reservation scheme is meant as a solution for a single domain,
it is very important that it is able to interoperate with other resource
reservation schemes used in other domains, and, as such, be part of
end-to-end resource reservation mechanisms.

This framework is an open framework in the sense that it provides the
basis for interoperability with other resource reservation schemes and
is to be applied in different types of networks as long as they are
Diffserv domains. Furthermore, it is possible for the RMD framework to
co-exist with statically allocated PHBs and SLSs.

The framework scheme presented in this document aims at extreme
simplicity and low cost of implementation along with good scaling
properties.

1.1. Definitions/Terminology

DS behavior aggregate (identical to [RFC2475]):

   A collection of packets with the same DS codepoint crossing a link in
   a particular direction.

DS-compliant (identical to [RFC2475]):

   Enabled to support differentiated services functions and behaviors as
   defined in [RFC2474], this document, and other differentiated
   services documents; usually used in reference to a node or device.

Per Hop Behavior (PHB) (identical to [RFC2475]):

   The externally observable forwarding behavior applied at a
   DS-compliant node to a DS behavior aggregate.

Per Hop Reservation (PHR):

   The per-hop resource reservation in a Diffserv domain, extending the
   Diffserv PHB, e.g., the bandwidth allocated to an AF PHB (see
   [RFC2597]), with resource reservation. It is implemented at both the
   interior nodes and the edge nodes.

Per Hop Reservation (PHR) protocol:

   A type of protocol that is used to perform a per hop reservation. A
   PHR protocol is used in all nodes in the Diffserv domain (both edge
   and interior nodes) on a hop by hop basis.

Per Domain Behavior (PDB) (similar to [NiKa01]):

   Describes the behavior experienced by a particular set of packets as
   they cross a DS domain. A PDB is characterized by specific metrics
   that quantify the treatment that a set of packets with a particular
   DSCP (or set of DSCPs) will receive as it crosses a DS domain.

Per Domain Reservation (PDR):

   The resource reservation in the complete Diffserv domain.

Per Domain Reservation (PDR) protocol:

   A type of protocol used to perform a per domain reservation. A PDR
   protocol is used by edge nodes (ingress and egress), but not by the
   interior nodes.

Edge nodes:

   Nodes that are located at the boundary of a Diffserv domain.

Interior node:

   All the nodes that are part of a Diffserv domain and are not edge
   nodes.

Ingress node:

   An edge node that handles the traffic as it enters the Diffserv
   domain.

Egress node:

   An edge node that handles the traffic as it leaves the Diffserv
   domain.

End Host:

   QoS-aware end terminal, either fixed or mobile, i.e., running
   QoS-aware applications.

RMD domain:

   A Diffserv domain that uses the RMD framework.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].

2. Overview of the RMD Framework Protocols

The RMD framework is based on Diffserv principles for QoS provisioning
and extends these principles with new ones necessary to provide resource
provisioning and control in Diffserv domains.

The RMD operates in a Diffserv domain and therefore support for
different levels of Quality of Service (QoS) MUST be provided using
Diffserv, as defined in [RFC2475].

The RMD framework will use the Diffserv classes Expedited Forwarding
(EF) [RFC2598] and Assured Forwarding (AF) [RFC2597] as QoS classes.
This implies that any network supporting the RMD framework MUST be able
to classify, mark, police and schedule the traffic accordingly.

It is assumed that different externally defined QoS classes can be
translated into these Diffserv classes (Per Hop Behaviors).

In order to maximize the scalability in the Diffserv domain, the
complexity imposed by the resource reservation scheme has to be moved as
far as possible away from the interior nodes. Therefore, the RMD
framework separates the problem of a complex reservation within a domain
from a simple reservation within a node. This is accomplished by
specifying two types of resource reservation protocols.

The first resource reservation protocol type is denoted as Per Hop
Reservation (PHR). It enables reservation of resources per PHB in each
node within a Diffserv domain.
This protocol type is optimized to reduce the requirements placed on the
functionality of the interior nodes. For example, the nodes that
implement this protocol type do not have per flow responsibilities.

The second protocol type is denoted as Per Domain Reservation (PDR). It
is responsible for the resource reservation within the complete Diffserv
domain and is used by edge nodes (ingress and egress), but not by the
interior nodes. This protocol introduces strict and complex requirements
on the functionality implemented on the edge nodes. An example of such
functionality is the mapping of the traffic parameters signaled by an
external QoS request to parameters that are useful to the RMD scheme.

In the RMD framework, different PDR and PHR protocols can be used within
a Diffserv domain simultaneously.

The PHR protocol is a newly defined protocol, while the PDR protocol can
be either a newly defined protocol or (one or more) already existing
protocols. Examples of such existing protocols are the Resource
Reservation Protocol (RSVP) [RFC2205], RSVP aggregation [BaIt01], Simple
Network Management Protocol (SNMP) [RFC1905] and Common Open Policy
Service (COPS) [RFC2748].

There may be different levels of granularity between external QoS
requests and PDR reservations, e.g., one-to-one, many-to-one. Similarly
there may be different levels of granularity between PDR protocol
actions and PHR protocol actions, e.g., one-to-one, one-to-many and
many-to-one.

2.1. RMD framework scenarios

Two different scenarios are identified wherein this framework is
applied. The first scenario, illustrated in Figure 1, includes ingress
nodes, egress nodes and interior nodes.
The second scenario, illustrated in Figure 2, includes, in addition to
the nodes depicted in Figure 1, an "oracle" (or "agent") that is
involved in the per domain reservation but does not provide any
resources itself. Note that combinations of the two scenarios may be
possible.

|---------|   |--------|   |--------|   |--------|   |---------|
|         |   |        |   |        |   |        |   |         |
|Ingress  |   |Interior|   |Interior|   |Interior|   | Egress  |
|  node   |<->|  node  |<->|  node  |<->|  node  |<->|  node   |
|         |   |        |   |        |   |        |   |         |
|---------|   |--------|   |--------|   |--------|   |---------|

        Figure 1: First scenario for the RMD framework

                           |--------|
                           |        |
     --------------------->| Oracle |<---------------------
     |                     |        |                     |
     |                     |--------|                     |
     v                                                    v
|---------|   |--------|   |--------|   |--------|   |---------|
|         |   |        |   |        |   |        |   |         |
|Ingress  |   |Interior|   |Interior|   |Interior|   | Egress  |
|  node   |<->|  node  |<->|  node  |<->|  node  |<->|  node   |
|         |   |        |   |        |   |        |   |         |
|---------|   |--------|   |--------|   |--------|   |---------|

        Figure 2: Second scenario for the RMD framework

Figure 3 and Figure 4 depict the peers in the communication of the PDR
and PHR protocols in the two different scenarios. Section 5 gives some
examples illustrating the usage of actual PDR and PHR protocols in
different scenarios.

External QoS <---|
Request          |
                \|/
            |---------|                                          |---------|
            |   PDR   |<---------------------------------------->|   PDR   |
            |---------|                                          |---------|
                 |                                                    |
            |---------|   |--------|   |--------|   |--------|   |---------|
            |   PHR   |<->|  PHR   |<->|  PHR   |<->|  PHR   |<->|   PHR   |
            |---------|   |--------|   |--------|   |--------|   |---------|
              ingress      interior     interior     interior      egress

        Figure 3: PDR and PHR protocol peers in the first scenario

In the first scenario, the PDR protocol is used between the ingress and
egress nodes. The ingress node receives an external QoS request and
initiates the per domain reservation.
The PHR protocol is used between all nodes on a hop-by-hop basis along
the path from the ingress to the egress. The PDR protocol may use the
PHR protocol or any underlying protocol for the transport of PDR
messages.

         External QoS <--------|
         Request               |
                              \|/
|---------|               |----------|               |---------|
|   PDR   |<------------->|   PDR    |<------------->|   PDR   |
|---------|               |----------|               |---------|
     |                                                    |
|---------|   |--------|                |--------|   |---------|
|   PHR   |<->|  PHR   |<-------------->|  PHR   |<->|   PHR   |
|---------|   |--------|                |--------|   |---------|
  ingress      interior      oracle      interior      egress

        Figure 4: PDR and PHR protocol peers in the second scenario

In the second scenario, the "oracle" receives the external QoS request
and uses a PDR protocol towards the ingress and egress nodes to perform
the per domain reservation. Note that the "oracle" does not use the PHR
protocol.

In the RMD framework all of the PHR signaling messages are to be
generated and discarded at the edge nodes (ingress and egress nodes) and
not at the end hosts. Moreover, all of the PDR messages are to be
generated and discarded either at the edge nodes or at the oracle.

2.2. PDR protocol functions

A PDR protocol implements all or a subset of the following functions:

* Mapping of external QoS request to a Diffserv Code Point (DSCP);

* Admission control and/or resource reservation within a domain;

* Maintenance of flow identifier and reservation state per flow, e.g.,
  by using soft state refresh;

* Notification of the ingress node IP address to the egress node;

* Notification of signaling messages (PHR and PDR) lost in the
  communication path from the ingress to the egress nodes;

* Notification of resource availability in all the nodes located in the
  communication path from the ingress to the egress nodes.

2.3. PHR protocol functions

A PHR protocol implements all or a subset of the following functions:

* Admission control and/or resource reservation within a node;

* Management of one reservation state per PHB by using soft state
  updates;

* Measurement of the user traffic load;

* Storage of a pre-configured threshold value on the maximum allowable
  traffic load (or resource units) per PHB;

* Adaptation to load sharing. Load sharing allows interior nodes to take
  advantage of multiple routes to the same destination by sending via
  some or all of these available routes. The PHR protocol has to adapt
  to load sharing once it is used;

* Severe congestion notification. This situation occurs as a result of
  route changes, a link failure or a long period of congestion. The PHR
  has to notify the edges about the occurrence of this situation;

* Transport of transparent PDR messages. The PHR protocol may
  encapsulate and transport PDR messages from an ingress node to an
  egress node.

3. The PDR protocols

3.1. Introduction

A PDR protocol layer interacts with external resource requests (via, for
example, RSVP [RFC2205]) and with the PHR protocol layer for handling
resources within the edge-to-edge domain.

A PDR protocol manages the reservation of the resources per Diffserv
domain and is implemented at the edges of this domain. This protocol
handles the dynamic reservation requests, that is, their admission or
rejection, possibly based on the results of the per hop reservation
(PHR) within the edge-to-edge domain. These dynamic reservation
requests, shown as "External QoS Request" in Figures 3 and 4, are
generated externally to the Diffserv domain, and various protocols might
potentially be used to make these requests (RSVP, RSVP aggregation,
etc.).

A PDR protocol layer should always be able to interpret the resource
request and map it into an appropriate DSCP to be used in the
edge-to-edge domain.
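
As an informal illustration of this mapping step, the following Python
sketch looks up a DSCP for an externally requested service class. The
codepoint values are the recommended EF and AF codepoints; the external
class names and the choice of which class maps to which DSCP are
assumptions made for this sketch only, since the framework leaves the
mapping to the PDR protocol in use.

```python
# Recommended codepoints from the EF and AF PHB specifications.
EF = 0b101110    # 46
AF11 = 0b001010  # 10
AF21 = 0b010010  # 18
AF41 = 0b100010  # 34

# Hypothetical mapping table: the class names on the left and the
# pairing with a DSCP are assumptions, not part of the RMD framework.
EXTERNAL_CLASS_TO_DSCP = {
    "guaranteed": EF,
    "controlled-load": AF41,
    "streaming": AF21,
    "background": AF11,
}


def map_request_to_dscp(external_class: str) -> int:
    """Return the DSCP used inside the domain for an external QoS class."""
    try:
        return EXTERNAL_CLASS_TO_DSCP[external_class]
    except KeyError:
        # An unmappable request cannot be carried in the domain.
        raise ValueError(f"no DSCP mapping for {external_class!r}") from None
```

A request that cannot be mapped raises an error here; an actual PDR
protocol might instead reject the external request.
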
Depending on these external protocols or resource reservation schemes,
different PDR protocols can be defined in order to comply with the above
requirement. The PDR protocol thus is a link between the external
resource reservation scheme and the edge-to-edge PHR.

A PDR protocol should be able to identify and specify any external
request for establishment and maintenance of resources using a (possibly
aggregated) flow definition, i.e., flow specification identifier (ID).
The flow specification ID is only used by the edge nodes to provide the
per-domain reservation (PDR) functionality.

Depending on the PDR type used, different flow IDs can be specified. For
example, a flow specification ID can be a combination of source IP
address, destination IP address and the DSCP field. The flow
specification ID is used to identify a (possibly aggregated) state that
will only be maintained in the edge nodes.

3.2. Per Domain Reservation (PDR) protocol features

Depending either on the external resource reservation scheme with which
the Diffserv domain has to interwork or on the characteristics of the
network, the RMD framework MAY specify that several PDRs could use one
PHR.

For example, a core network that is applying RSVP aggregation for
resource management will use a different PDR than the PDR that has to be
used in a wireless access network that is interconnected to the same
core network which is using RSVP/Intserv for resource management.
However, both Diffserv domains may use the same reservation based PHR.

For each of these PDRs, there MAY be certain specific functions defined.
However, the RMD framework defines a common set of features that need to
be realized by any PDR that uses a specific PHR, such as the RODA PHR
[RODA]. These features are described in the sections below.
Besides this common set of features, there is also an optional feature
described in Section 3.2.5.

3.2.1. Ingress node addressing

There are many situations, such as acknowledgement of a request, when
the egress node has to notify the ingress node about the resource
reservation status of the communication path between ingress and egress
nodes using the PDR protocol, i.e., whether the request is admitted or
rejected.

This means that the egress node MUST be able to send a PDR signaling
message to the ingress node. Depending on the PDR used and consequently
also on the flow specification ID (see Section 3.1), the IP address of
the ingress node can be derived in two ways:

* the egress node can determine the IP address of the ingress node from
  the available information contained in the header of a received PHR
  signaling message. This could, for example, be the source IP address
  of the PHR signaling message received.

* the ingress node has to encapsulate its IP address in the PDR
  signaling message that is encapsulated in a PHR signaling message. The
  egress node decapsulating the PHR it received is able to extract the
  PDR signaling message and the IP address of the ingress node.

3.2.2. Error control

The PHR signaling messages may be dropped in the communication path from
the ingress to the egress nodes. If a reservation-based PHR is used,
these messages might have been received by some of the intermediate
interior nodes located in this communication path before being dropped.
Some other interior nodes located on the same communication path might
not receive these PHR signaling messages. This will mean that the
interior nodes that received this PHR signaling message will reserve
resources that will not be used.

Should this occur, the PDR protocol MUST be able to handle the recovery
of the dropped reservation-based and measurement-based PHR signaling
messages.
One possible solution to this is described in Section 5.3.1.

3.2.3. Management of Reservation States

The per-domain reservation functionality MUST support the initiation and
maintenance of PDR states. This can be accomplished by using either a
newly defined PDR protocol or (one or more) already existing protocols.
Examples of such existing protocols are the Resource Reservation
Protocol (RSVP) [RFC2205], RSVP aggregation [BaIt01], Simple Network
Management Protocol (SNMP) [RFC1905] and Common Open Policy Service
(COPS) [RFC2748].

These states will be identified using the flow specification ID (see
Section 3.1) and the related requested resource unit per Diffserv class
PHB.

The egress node MUST be able to identify the flow using the flow
specification ID after receiving a PHR signaling message. Depending on
the PDR protocol type being used, the flow specification ID can be
derived in two ways:

* derived from PHR message: the flow specification ID can be derived
  from the available information contained in the header of the PHR
  signaling message received. This could, for example, be the
  combination of the source and destination IP addresses and the DSCP in
  the PHR signaling message.

* derived from PDR message: the flow specification ID is included in the
  PDR signaling message that is encapsulated by the ingress node into
  the PHR signaling message. The egress node decapsulating the PHR
  received is able to extract the PDR signaling message and the flow
  specification ID information.

Moreover, the PDR signaling message that is sent by the egress node
towards the ingress node MUST also contain the flow specification ID
information.

The PDR resource reservation states can be either hard or soft states.
If these states are hard they will have to be initiated, updated or
released explicitly. If these states are soft states then they have to
be updated regularly.
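
As a rough illustration of the state handling described above, the
following Python sketch keeps per-flow PDR state at an edge node, keyed
by a flow specification ID built from source address, destination
address and DSCP (the example combination given in Section 3.1). The
class and field names, the explicit clock argument and the 30-second
lifetime are assumptions made for this sketch; they are not defined by
the RMD framework.

```python
from dataclasses import dataclass
from typing import Dict, List, NamedTuple


class FlowSpecId(NamedTuple):
    """Hypothetical flow specification ID: (source, destination, DSCP)."""
    src: str
    dst: str
    dscp: int


@dataclass
class PdrState:
    units: int      # requested resource units for this flow
    soft: bool      # True: soft state, False: hard state
    expires: float  # expiry time; only meaningful for soft state


class PdrStateTable:
    """Per-flow PDR state kept only at an edge node (illustrative)."""

    def __init__(self, lifetime: float = 30.0) -> None:
        self.lifetime = lifetime  # assumed soft-state lifetime in seconds
        self.states: Dict[FlowSpecId, PdrState] = {}

    def install(self, fid: FlowSpecId, units: int, now: float,
                soft: bool = True) -> None:
        self.states[fid] = PdrState(units, soft, now + self.lifetime)

    def refresh(self, fid: FlowSpecId, now: float) -> bool:
        """Regular update of a state; returns False if it is unknown."""
        state = self.states.get(fid)
        if state is None:
            return False
        state.expires = now + self.lifetime
        return True

    def release(self, fid: FlowSpecId) -> None:
        """Explicit release, the only way a hard state goes away."""
        self.states.pop(fid, None)

    def expire(self, now: float) -> List[FlowSpecId]:
        """Drop soft states that were not refreshed in time."""
        stale = [f for f, s in self.states.items()
                 if s.soft and s.expires <= now]
        for f in stale:
            del self.states[f]
        return stale
```

In this sketch a hard state survives `expire()` until `release()` is
called, whereas a soft state disappears once its lifetime passes without
a refresh.
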

3.2.4. Resource Unavailability

When there are insufficient resources available in the communication
path between the ingress node and egress node, the ingress node that
generated the PHR signaling messages will have to be notified by means
of a PDR reporting message.

Any interior node that cannot admit a reservation request due to the
lack of available resources MUST be able to mark the PHR signaling
message that will be sent towards the egress node. The egress node will
then generate and send to the ingress node a marked PDR signaling
message to indicate that the communication path is not able to admit the
reservation request.

Upon receiving this message the PDR will reject the external resource
request since there are insufficient resources available internally to
satisfy this request.

In the case of reservation-based requests, interior nodes that admitted
the resource request, which will be rejected in other nodes further
along the communication path, will reserve unnecessary resources. The
PDR protocol SHOULD be able to handle the release of these unnecessarily
reserved resources (see, e.g., Section 5.2.1.1).

3.2.5. Bi-directional reservations

Making bi-directional reservations is an optional feature and does not
belong to the common set of features described in Sections 3.2.1 to
3.2.4. This feature is only relevant when using the RMD framework in
specific kinds of networks.

One method for bi-directional reservations is based on combining two
uni-directional reservations. This is because messages traveling from
the reserving entity are likely to follow a different path than messages
traveling towards it.

The bi-directional reservation imposes a few requirements on the edge
nodes, as described below:

* The edge nodes must be able to distinguish between a uni-directional
  and a bi-directional resource reservation PDR signaling message.
  This SHOULD be accomplished by using a flag in the header of the PDR
  signaling messages. Furthermore, these bi-directional messages MUST
  include the requested resource parameters for initiating a
  uni-directional reservation in the opposite direction (from the egress
  to the ingress). Note that the requested resource parameters used for
  bi-directional reservations may be asymmetric, i.e., the value of the
  requested resources used in the direction from the ingress node
  towards the egress node could be different from the requested
  resources used in the direction from the egress node towards the
  ingress node.

* When an egress node receives a bi-directional reservation request
  message, the egress node will have to construct a uni-directional PDR
  signaling message and a PHR signaling message that will be sent in the
  opposite direction. The source IP address of this PHR signaling
  message that is sent towards the ingress node will be the same as the
  destination IP address of the PHR signaling message it received, while
  the destination IP address of this PHR signaling message will be the
  same as the source IP address of the PHR signaling message it
  received.

The ingress node that performs a bi-directional reservation, assuming
that the above requirements are satisfied, will notify the egress node
by means of the PDR signaling messages. On receiving this PDR signaling
message, the egress node will initiate a uni-directional reverse PDR
signaling message, which will take care of the reservation in the
opposite direction.

4. The PHR protocols

4.1. Introduction

The Per Hop Reservation (PHR) protocols extend the PHB in Diffserv by
adding resource reservation, thus enabling reservation of resources per
Diffserv class PHB per hop in each node within a Diffserv domain.
The RMD Framework currently specifies two different PHR groups:

- The Reservation-Based PHR group

  In this PHR group, each node in the communication path from an ingress
  node to an egress node keeps only one state per PHB. The reservation
  is done in terms of resource units, which may be based on a single
  parameter, such as bandwidth, or on more sophisticated parameters.
  These resources are requested dynamically per PHB, i.e., per DSCP, and
  reserved on demand on all nodes in the communication path from an
  ingress node to an egress node. Furthermore, this PHR group has to
  maintain for each PHB a threshold that specifies the maximum number of
  reservable resource units and that could, for example, be statically
  configured. A reservation-based PHR protocol is described in detail in
  [RODA].

  The RMD framework uses soft state for updating and releasing reserved
  resources.

- The Measurement-based Admission Control (MBAC) PHR group

  This PHR group is used to check the availability of resources before
  flows are admitted and without installing any reservation state. That
  is, measurements are done on the real average traffic (user) data
  load. The main advantage of this PHR group is that the PHR
  functionality that is executed at the edge and interior nodes will not
  have to maintain any reservation states. However, the measurement
  based PHR uses two states that do not have to be maintained by the PHR
  protocol: one state per PHB that stores the measured user traffic load
  associated with the PHB, and another state per PHB that stores the
  maximum allowable traffic load per PHB.

Although this is all that is currently defined, new types of PHRs within
a PHR group may be defined in the future, as might new PHR groups.

To the extent possible, traffic patterns SHOULD be configured in the
nodes rather than signaled.
The goal is to simplify the traffic parameter mapping at the interior
nodes and keep complexity at the edges. This also simplifies the
processing of on-demand requests. For example, some of the token bucket
parameters such as token bucket peak rate and bucket size can be
configured.

The negotiated parameter within the edge-to-edge Diffserv domain in the
RMD framework is the number of the requested resource units. For
example, the RODA PHR [RODA] specifies that this parameter is a simple
"bandwidth" parameter and can have a maximum value of 2^16 = 65536
resource units. However, this unit may not necessarily be a simple
bandwidth value. It might be defined in terms of any resource unit
(e.g., effective bandwidth) to support statistical multiplexing at the
message level. A mapping MUST be performed between the type of the
resource units requested by an external reservation protocol and the
resource units understood by the RMD scheme.

4.2. Per Hop Reservation (PHR) protocol features

The required features differ between the two PHR groups
(reservation-based and measurement-based (MBAC)). These features are
described in the sections below.

4.2.1. One reservation state per Diffserv class PHB

The reservation based PHR installs and maintains one state per PHB,
i.e., per DSCP, in all the nodes located in the communication path from
the ingress node up to the egress node.

The per PHB states will have to be created, maintained and released
using soft states.

When soft state is used, a finite lifetime is set for the length of the
reservation. These reservations are then maintained by sending periodic
refresh messages. If the reservation state does not receive a refresh
message within a refresh period, this state is automatically released.
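
The following Python sketch shows one possible way a node could keep a
single aggregated soft state per PHB and let unrefreshed resource units
age out. It is not taken from this draft or from [RODA]: the two-counter
scheme, the class and method names, and the idea of calling
`end_of_period()` from a timer once per refresh period are all
assumptions made for illustration.

```python
class PhbReservation:
    """One aggregated reservation state for a single PHB (illustrative)."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold  # max reservable units, pre-configured
        self.reserved = 0           # units currently counted as reserved
        self.refreshed = 0          # units signaled in the current period

    def reserve(self, units: int) -> bool:
        """Admit a new reservation only if the threshold is not exceeded."""
        if self.reserved + units > self.threshold:
            return False
        self.reserved += units
        self.refreshed += units
        return True

    def refresh(self, units: int) -> None:
        """Count units whose refresh message arrived in this period."""
        self.refreshed += units

    def end_of_period(self) -> None:
        """Keep only what was signaled during the period that just ended;
        units that were not refreshed are released implicitly."""
        self.reserved = self.refreshed
        self.refreshed = 0


class Node:
    """A node holding one PhbReservation per DSCP, and nothing per flow."""

    def __init__(self, thresholds: dict) -> None:
        self.phb = {dscp: PhbReservation(t) for dscp, t in thresholds.items()}
```

With a threshold of 5 units, a first request for 3 units is admitted and
a second request for 3 units is refused; if no refresh arrives during a
full period, the next call to `end_of_period()` brings the reserved
count back to zero.
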
The length of the refresh period MUST be the same throughout the Diffserv domain and SHOULD be configurable.

The PHR functionality has to maintain for each PHB a threshold that specifies the maximum number of reservable resource units; this threshold could, for example, be statically configured. Furthermore, the PHR functionality has to maintain the number of currently reserved resource units that are signaled by the PHR protocol.

This feature is specific to the reservation-based PHR group.

4.2.2. Sender-initiated

In general, a resource reservation scheme can be sender-initiated or receiver-initiated. In a receiver-initiated scheme, such as the Resource reSerVation Protocol (RSVP) [RFC2205], the reservation of the resources is initiated by the receiver. This means that backward routing information has to be stored in the nodes that are located in the forwarding path between the sender and receiver. This backward routing information is used by the reservation messages sent by the receiver to the sender. All signaling messages belonging to the same flow then follow the same backward and forward path.

In order to avoid storing backward routing information, the RMD framework uses a sender-initiated scheme. The ingress node initiates and manages the resource reservation process, meaning that it generates the PHR signaling messages. Each of these messages may carry either the total amount of the requested resources or a part of the requested resources.

Assuming that typical IP routing protocols are used, i.e., packets are routed based on the IP destination address, all the PHR signaling messages that are generated by the edge nodes SHOULD use the IP addresses of the end hosts involved in the resource reservation session as the source and destination IP addresses. However, depending on the PDR used, exceptions should be allowed.
For example, the PHR signaling messages may have the IP addresses of the edge nodes as the source and destination IP addresses. This implies that the traffic (user) data associated with these PHR signaling messages must be encapsulated with the IP addresses of the edge nodes as the source and destination IP addresses.

Both PHR groups MUST be sender-initiated.

4.2.3. Adapts to Load Sharing

Load sharing, also known as load balancing, allows interior nodes to take advantage of multiple routes to the same destination by sending messages via some or all of these available routes. However, load sharing implies that the traffic (user) data will not follow exactly the same paths as the PHR signaling messages that are used to reserve the transport resources used by the traffic (user) data.

Load sharing can be characterized as equal or unequal cost (see [Doy98]), where cost is a generic term referring to any metric that is associated with the path. Equal cost load sharing (see, for example, [RFC2676]) distributes traffic equally among the multiple paths. Unequal cost load sharing, on the other hand, does not distribute the traffic equally. An example of this type is optimized multi-path (OMP), which distributes loading information, proposes a means for adjusting forwarding, and provides an algorithm for making the adjustments gradually enough to ensure stability while still providing reasonably fast adjustment when needed. Note that "reasonably fast" means adaptation within a couple of hours, i.e., following daily load fluctuations. OMP discovers multiple paths, not necessarily equal cost paths, to any destination in the network and, based on the load reported for a particular path, determines which fraction of the traffic to direct to that path. Incoming packets are subject to a (source, destination address) hash computation, and effective load sharing is accomplished by adjusting the hash thresholds.
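The idea of steering traffic by hash thresholds can be sketched as follows. This is only an illustration of the general technique, not the OMP algorithm itself: the use of CRC-32 as the hash, the function names, and the way shares are turned into thresholds are all assumptions. Each path owns a contiguous slice of the hash space, and moving a threshold changes the fraction of (source, destination) pairs mapped to that path while keeping every pair on a single path.

```python
import zlib
from bisect import bisect_right

HASH_SPACE = 2 ** 32  # zlib.crc32 returns an unsigned 32-bit value


def thresholds_from_shares(shares):
    """Turn per-path traffic shares (summing to 1.0) into upper hash bounds."""
    bounds, acc = [], 0.0
    for share in shares:
        acc += share
        bounds.append(int(acc * HASH_SPACE))
    bounds[-1] = HASH_SPACE  # guard against floating point rounding
    return bounds


def select_path(src, dst, bounds):
    """Map a (source, destination) address pair to a path index."""
    key = f"{src}>{dst}".encode()
    return bisect_right(bounds, zlib.crc32(key))


bounds = thresholds_from_shares([0.5, 0.3, 0.2])
path = select_path("192.0.2.1", "198.51.100.7", bounds)
# The same address pair always maps to the same path for given thresholds.
assert path == select_path("192.0.2.1", "198.51.100.7", bounds)
```

Because the PHR signaling messages and the user data share the same addresses in the default case, a hash of this kind would send both along the same path.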
When combined with multi-protocol label switching (MPLS) forwarding, OMP becomes an effective route optimization engine that can serve the traffic engineering (TE) requirements given in [RFC2702].

Load sharing can be accomplished in different ways:

* Per-destination load sharing: distributes the traffic based on the destination address. All messages for one destination on the network travel on the same path.

* Per-message load sharing (or round robin): given equal cost paths, the first message destined for a particular destination on the network is sent via one path, the next message to the same destination is sent via another path, and so on.

* Using a predefined hash function: the combination of the source and destination IP addresses and the source and destination ports is used in a hash function to determine for each message which load sharing path should be used. In this situation, even if the various paths have equivalent metrics, the traffic associated with one TCP connection is always routed on a single path.

The Resource Management in Diffserv framework, by means of PHR and PDR functionality, has the necessary support to adapt to load sharing when it is used.

This feature is mandatory for both PHR groups.

4.2.4. Severe Congestion Handling

Severe congestion may occur as a result of route changes, a link failure or a long period of congestion. The PHR needs to signal severe congestion to the edges. In particular, the severe congestion status in the interior nodes has to be reported to the edge nodes by means of the PHR signaling messages. The edges MUST resolve this congestion state by rejecting the reservation requests that are requesting resources at that moment. Furthermore, depending on the PDR used, on-going flows might be preempted (e.g., shifted to an alternative PHB).
Severe congestion SHOULD always be signaled to the edges by the interior nodes, regardless of the type of PHR group.

The Resource Management in Diffserv framework defines several solutions to provide this feature. Moreover, the PHR functionality detects the severe congestion and the PDR protocol informs the edge nodes about this severe congestion situation.

A number of possible methods of detecting severe congestion are listed below:

* During route change:

  1) The routing protocol functionality available in the node informs the PHR functionality available in the same node that a route has changed. This method is complex and difficult to deploy efficiently.

  2) If the number of resources per PHB that have to be refreshed by the PHR soft state refresh messages is higher than the number of resources that were previously reserved for the same PHB, then the node deduces that severe congestion has occurred. This method is easy to implement, but its speed depends on the length of the refresh period. If the refresh period is long, the detection of the severe congestion situation will be slow.

  3) By using user data traffic measurements. If the volume of the data traffic increases suddenly, this indicates that a route change may have occurred. The node will then deduce that severe congestion has occurred. This method is very fast, but it increases the complexity of the PHR functionality since measurements of user data traffic have to be performed.

* During link failures: each node must be configured such that if a link failure occurs, it is deduced that a severe congestion situation has occurred.

* Another way of detecting severe congestion is to let the ingress node monitor and store a number that identifies how many rejections per flow ID occurred within a predefined time, e.g., 1 second.
If this number is higher than a predefined value, then the ingress node will deduce that severe congestion has occurred.

The simple operation in case of severe congestion is described in Section 5.3.2.

5. Examples of RMD Operation

The RMD framework extends the Diffserv architecture by adding dynamic resource reservations. It is applied edge-to-edge in a dynamically provisioned Diffserv domain. The admission or rejection of the incoming SLS request relies on the result of the PHR signaling protocol. The PDR protocol is the one that links the SLA/SLS request and the PHR protocol. Later in this document, the SLA/SLS request will be referred to simply as a QoS request.

The functional operation of the RMD framework is described as interoperation between the PHR and PDR functions, abstracted from the details in the following scenarios:

* normal operation

* fault handling:
  - loss of PHR signaling messages
  - severe congestion handling

There are two typical example scenarios used for describing the normal operation and fault handling of the RMD framework:

Example 1: The PDR protocol initiates and maintains the PDR states in the ingress/egress nodes. In this scenario it is assumed that the external QoS request does not create any resource reservation states in the ingress/egress nodes.

Example 2: The PDR protocol uses (partially or fully) the resource reservation states initiated and maintained by an external protocol as PDR states.

The signaling message types are also explained briefly.

5.1. Examples of Signaling Message Types

The RMD Framework classifies the signaling messages into PHR and PDR signaling messages for supporting PHR and PDR functionality, respectively.

5.1.1.
PHR signaling message types

There are two types of PHR signaling messages:

* PHR_Resource_Request

  The "PHR_Resource_Request" signaling message is common to both PHR groups, but its role is different in the two PHR groups:

  1. The reservation-based "PHR_Resource_Request" signaling message is generated by the ingress node in order to initiate or update the aggregated soft state reservation in the communication path to the egress node.

  2. The measurement-based "PHR_Resource_Request" signaling message is generated by the ingress node to check the monitoring status of each node located in the communication path between the ingress node and the egress node.

* PHR_Refresh_Update

  The "PHR_Refresh_Update" signaling message is specific to the reservation-based PHR group. It is generated by the ingress node in order to initiate, update or refresh the soft state reservation per DSCP in the communication path to the egress node.

If possible, all the nodes should process the "PHR_Refresh_Update" messages with a higher priority than the "PHR_Resource_Request" messages.

5.1.2. PDR signaling message types

The PDR signaling messages are processed only by the RMD edge nodes and not by the interior nodes.

The PDR protocol can either be an entirely new protocol (see Example 1, Section 5.2.1.1) or it may use one of the existing protocols such as RSVP, RSVP aggregation, SNMP, COPS, etc. (see Example 2, Section 5.2.1.2) as part of its functionality. In order to describe the functionality of the PDR, several messages are used in this document. These are not formally specified protocol messages; they merely exemplify possible protocol messages used for exchanging the PDR information (e.g., the flow ID or the address of the ingress node) between edge nodes.
These PDR signaling messages may also be encapsulated into PHR messages if necessary. The example PDR signaling messages are listed below:

* PDR_Reservation_Request

  The "PDR_Reservation_Request" signaling message is generated by the ingress node in order to initiate or update the PDR state in the egress node.

* PDR_Refresh_Request

  The "PDR_Refresh_Request" message is sent by the ingress node to the egress node to refresh the PDR states located in the egress node.

  Either of the "PDR_Reservation_Request" and "PDR_Refresh_Request" messages may or may not be encapsulated into a PHR message. When one of these PDR messages is encapsulated into a PHR message, the PDR message SHOULD contain the information that the egress node requires to associate the encapsulating PHR signaling message with, for example, the PDR flow ID and/or the IP address of the ingress node.

* PDR_Reservation_Report

  The "PDR_Reservation_Report" messages are sent by the egress node to the ingress node to report that a "PHR_Resource_Request"/"PDR_Reservation_Request" has been received and that the request has been admitted or rejected.

* PDR_Refresh_Report

  The "PDR_Refresh_Report" messages are sent by the egress node to the ingress node to report that a "PHR_Refresh_Update"/"PDR_Refresh_Request" message has been received and has been processed.

* PDR_Request_Info

  A "PDR_Request_Info" message is encapsulated into a PHR signaling message that is sent by the ingress node towards the egress node. This PDR message contains the information that the egress node requires to associate the encapsulating PHR signaling message with, for example, the PDR flow ID and/or the IP address of the ingress node.
If possible, all the nodes should process the "PDR_Refresh_Report" messages with a higher priority than the "PDR_Reservation_Report" messages.

5.2. Example of Normal operation

Normal operation refers to the situation when no problems are occurring in the network, such as route or link failure, severe congestion, loss of PHR signaling messages, etc. Normal operation is different for the two PHR groups (the reservation-based PHR and the measurement-based PHR). Both are explained in the following sections.

5.2.1. Normal Operation using the reservation-based PHR

Depending on the functionality of the external resource reservation protocol that interoperates with the RMD domain, two scenario types can be identified:

* Example 1, where the external resource reservation protocol does not create any reservation states in ingress/egress nodes;

* Example 2, where the external resource reservation protocol creates reservation states in ingress/egress nodes.

5.2.1.1. Example 1

In this scenario the external resource reservation protocol that interoperates with the RMD framework does not create any reservation states in ingress/egress nodes.

Once a QoS request arrives at the ingress node, the PDR protocol must classify it into an appropriate Diffserv class PHB. It should calculate the associated resource unit for this QoS request, i.e., the bandwidth parameter. The PDR state will be associated with a flow specification ID. If the QoS request is satisfied locally, then the ingress node will generate the "PHR_Resource_Request" signaling message and the "PDR_Reservation_Request", which will be encapsulated in the "PHR_Resource_Request" signaling message. The PDR signaling message MAY contain information such as the IP address of the ingress node and the per-flow specification ID. The PDR signaling message MUST be decapsulated and processed by the egress node only.
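The encapsulation described above can be pictured with a minimal sketch. The field names, the class names, and the three helper functions are hypothetical; the draft does not define message formats here. The point illustrated is only that interior nodes look at the PHR part (DSCP and requested units), while the encapsulated PDR message is opened at the egress node alone.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PdrReservationRequest:
    """Hypothetical PDR payload, meaningful to edge nodes only."""
    ingress_addr: str
    flow_id: int


@dataclass(frozen=True)
class PhrResourceRequest:
    """Hypothetical PHR message carrying an optional PDR payload."""
    dscp: int
    units: int
    pdr_payload: Optional[PdrReservationRequest] = None


def ingress_build(dscp, units, ingress_addr, flow_id):
    """Ingress: build the PHR message with the PDR message encapsulated."""
    return PhrResourceRequest(dscp, units, PdrReservationRequest(ingress_addr, flow_id))


def interior_view(msg):
    """Interior node: use only the PHR fields, never the PDR payload."""
    return msg.dscp, msg.units


def egress_decapsulate(msg):
    """Egress: extract the PDR message to create or find the PDR state."""
    return msg.pdr_payload


msg = ingress_build(dscp=46, units=3, ingress_addr="192.0.2.1", flow_id=7)
print(interior_view(msg))
print(egress_decapsulate(msg))
```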
The intermediate interior nodes receiving the "PHR_Resource_Request" must identify the Diffserv class PHB (the DSCP type of the PHR signaling message) and, if possible, reserve the requested resources. The node reserves the requested resources by adding the requested amount to the total amount of reserved resources for that Diffserv class PHB.

The behavior of the egress node on admission or rejection of the "PHR_Resource_Request" is the same as in the interior nodes. After processing the "PHR_Resource_Request" message, the egress node decapsulates the "PDR_Reservation_Request" and creates/identifies the flow specification ID and the state associated with it. In order to report the successful reservation to the ingress node, the egress node will send the "PDR_Reservation_Report" message back to the ingress node. After receiving the "PDR_Reservation_Report", the ingress node will inform the external source of the successful reservation, which will in turn send traffic (user) data.

If the reserved resources need to be refreshed (updated), the ingress node will generate a "PDR_Refresh_Request" message in order to refresh the PDR soft state in the egress node. A "PHR_Refresh_Update" is used to refresh the PHR aggregated soft state in both interior and egress nodes. The "PDR_Refresh_Request" will be encapsulated into the "PHR_Refresh_Update". The refresh periods should be equal in all edge and interior nodes. Interior nodes that receive the "PHR_Refresh_Update" will refresh/update the aggregated reservation state related to the Diffserv class PHB (DSCP).

After processing the "PHR_Refresh_Update" message, the egress node MUST identify the flow specification ID carried by either the header of the PHR signaling message or the encapsulated PDR signaling message (see Section 5.1.2). In this way the PDR state associated with this flow specification ID can be refreshed instantaneously. The
egress node will send the "PDR_Refresh_Report" signaling message back to the ingress node to acknowledge the admission and processing of the "PHR_Refresh_Update" signaling message.

The resources in any node are released if no "PHR_Refresh_Update" messages are received during a refresh period.

The flow diagram for normal operation in the case of a successful reservation for Example 1 is shown in Figure 5.

      Ingress             Interior             Interior              Egress
QoS      |PHR_Resource_Request|                    |                    |
request  |   (PDR_ResReq*)    |                    |                    |
-------> |------------------->|PHR_Resource_Request|                    |
         |                    |   (PDR_ResReq*)    |                    |
         |                    |------------------->|PHR_Resource_Request|
         |                    |                    |   (PDR_ResReq*)    |
         |                    |                    |------------------->|
         |                    |                    |                    |
         |                    PDR_Reservation_Report                    |
         |<-------------------|--------------------|--------------------|
         |                    |                    |                    |
         |                    | Traffic(user) Data |                    |
-------->|------------------->|------------------->|------------------->|--->
         |                    |                    |                    |
         | PHR_Refresh_Update |                    |                    |
         |   (PDR_RefReq*)    |                    |                    |
         |------------------->| PHR_Refresh_Update |                    |
         |                    |   (PDR_RefReq*)    |                    |
         |                    |------------------->| PHR_Refresh_Update |
         |                    |                    |   (PDR_RefReq*)    |
         |                    |                    |------------------->|
         |                    |                    |                    |
         |                      PDR_Refresh_Report                      |
         |<-------------------|--------------------|--------------------|

(PDR_ResReq*) - represents the PDR_Reservation_Request message encapsulated in the PHR_Resource_Request message. This message is processed only by the ingress and egress nodes.

(PDR_RefReq*) - represents the PDR_Refresh_Request message encapsulated in the PHR_Refresh_Update message. This message is processed only by the ingress and egress nodes.

Figure 5: Normal Operation for successful reservation - Example 1
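The hop-by-hop handling of a "PHR_Resource_Request" in the reservation-based case can be traced with a small simulation. This is a sketch under assumptions: the Node class, the single threshold applied to every DSCP, and the dictionary used as a report are invented for illustration and are not part of the framework. Each node adds the requested units to its per-DSCP total if the threshold allows it, and the outcome is returned as the report the egress node would send.

```python
class Node:
    """Hypothetical interior or egress node with per-DSCP aggregate counters."""

    def __init__(self, name, threshold):
        self.name = name
        self.threshold = threshold  # assumed identical for every DSCP here
        self.reserved = {}          # dscp -> reserved resource units

    def try_reserve(self, dscp, units):
        current = self.reserved.get(dscp, 0)
        if current + units > self.threshold:
            return False
        # Reserve by adding the requested amount to the per-PHB total.
        self.reserved[dscp] = current + units
        return True


def resource_request(path, dscp, units):
    """Walk the request along the path and return the resulting report."""
    for node in path:
        if not node.try_reserve(dscp, units):
            # Nodes visited earlier keep their units until soft state expires.
            return {"type": "PDR_Reservation_Report", "admitted": False,
                    "failed_at": node.name}
    return {"type": "PDR_Reservation_Report", "admitted": True,
            "failed_at": None}


path = [Node("interior-1", 10), Node("interior-2", 4), Node("egress", 10)]
print(resource_request(path, dscp=46, units=3))  # fits on every hop
print(resource_request(path, dscp=46, units=3))  # interior-2 is now too full
```

In the second call the first interior node has already added the units before the request fails further along the path, which is the situation that soft-state expiry is meant to clean up.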
If there are no resources available locally, the ingress node will immediately reject the external QoS request and will not generate any signaling messages related to this request.

On the other hand, if resources are lacking in the interior or at the egress of the network, the interior and egress nodes MUST mark and forward the "PHR_Resource_Request" signaling message they receive, in order to indicate to the ingress node the lack of resources and that no reservation was made. Interior nodes receiving a marked "PHR_Resource_Request" message will not process it. Egress nodes receiving the marked "PHR_Resource_Request" MUST mark the "PDR_Reservation_Report" message that is sent towards the ingress node. After receiving the marked "PDR_Reservation_Report", the ingress node will reject the external QoS request. The interior nodes that have reserved resources for a QoS request that was rejected will release them within a refresh period, since no refresh PHR signaling messages will arrive.

Figure 6 depicts the normal operation for an unsuccessful reservation for Example 1.

      Ingress             Interior             Interior        Egress
QoS      |PHR_Resource_Request|                    |              |
Request  |   (PDR_ResReq*)    |                    |              |
-------->|------------------->| PHR_Resource_Request (marked)     |
         |                    |   (PDR_ResReq*)    |              |
         |                    M---------------------------------->|
         |                    |                    |              |
         |            PDR_Reservation_Report (marked)             |
         |<-------------------|--------------------|--------------|
QoS      |                    |                    |              |
Request  |                    |                    |              |
Rejected |                    |                    |              |
<--------|                    |                    |              |

(PDR_ResReq*) - represents the PDR_Reservation_Request message encapsulated in the PHR_Resource_Request message. This message is processed only by the ingress and egress nodes.

Figure 6: Normal Operation for unsuccessful reservation - Example 1

5.2.1.2.
Example 2

In this scenario the external resource reservation protocol that interoperates with the RMD domain creates reservation states in the ingress/egress nodes that are used (partially or completely) by the RMD framework as PDR resource reservation states.

As already mentioned, in this scenario an external protocol (such as RSVP or RSVP aggregation) initiates and maintains the states (per flow or per aggregate) in the ingress and egress nodes. In the RMD framework these states are used (fully or partially) as PDR states by the PDR that handles the resource reservation in the Diffserv domain. Such a PDR state consists of, for example, a flow ID and a DSCP. Furthermore, in this scenario the "PHR_Resource_Request" and "PHR_Refresh_Update" messages encapsulate "PDR_Request_Info" messages, which are used to associate the encapsulating PHR signaling message with, for example, the PDR flow ID and/or the IP address of the ingress node.

Apart from this, the generation and processing of the PDR and PHR signaling messages by the edge and interior nodes is the same as in the previous case (see Example 1).

The flow diagram for normal operation in the case of a successful reservation for Example 2 is shown in Figure 7.
      Ingress        Interior            Interior             Egress
         |               |                   |                   |
External |               |                   |                   |External
Protocol |   Protocol (used for initiation of the PDR states)   |Protocol
<------> |<--------------|-------------------|------------------>|<------>
(QoS     |               |                   |                   |
request) |               |                   |                   |
         |PHR_Resource_- |                   |                   |
         |   Request     |                   |                   |
         |(PDR_ReqInfo*) |                   |                   |
         |-------------->|PHR_Resource_Request                   |
         |               |  (PDR_ReqInfo*)   |                   |
         |               |------------------>|PHR_Resource_Request
         |               |                   |  (PDR_ReqInfo*)   |
         |               |                   |------------------>|
         |               |                   |                   |
         |                PDR_Reservation_Report                 |
         |<--------------|-------------------|-------------------|
         |               |                   |                   |
         |               | Traffic(user) Data|                   |
-------->|-------------->|------------------>|------------------>|--->
External |               |                   |                   |External
Protocol |  Protocol (used for maintenance of the PDR states)   |Protocol
<------> |<--------------|-------------------|------------------>|<------>
         |               |                   |                   |
         |PHR_Refresh_-  |                   |                   |
         |    Update     |                   |                   |
         |(PDR_ReqInfo*) |                   |                   |
         |-------------->| PHR_Refresh_Update|                   |
         |               |  (PDR_ReqInfo*)   |                   |
         |               |------------------>| PHR_Refresh_Update|
         |               |                   |  (PDR_ReqInfo*)   |
         |               |                   |------------------>|
         |               |                   |                   |
         |               | PDR_Refresh_Report|                   |
         |<--------------|-------------------|-------------------|
         |               |                   |                   |

(PDR_ReqInfo*) - represents the PDR_Request_Info message encapsulated into a PHR message. This message is processed only by the ingress and egress nodes.

Figure 7: Normal Operation for successful reservation - Example 2

When there are no resources available in the ingress/egress nodes or the interior nodes, the operation is similar to the one in Example 1, with the difference that the PDR resource reservation states are handled by the external protocol.
Furthermore, in this scenario the "PHR_Resource_Request" and "PHR_Refresh_Update" messages encapsulate "PDR_Request_Info" messages, which are used to associate the encapsulating PHR signaling message with, for example, the PDR flow ID and/or the IP address of the ingress node.

Figure 8 depicts the normal operation for an unsuccessful reservation for Example 2, where X represents the external protocol states related to the unsuccessful reservation. These states need to be released either based on the soft state principle or explicitly, depending on the external protocol.

      Ingress             Interior           Interior          Egress
External |                    |                  |                |External
Protocol |   External Protocol (used for PDR state initiation)    |Protocol
<------> |<-------------------|------------------|--------------->|<------>
(QoS     |                    |                  |                |
request) |                    |                  |                |
         |PHR_Resource_Request|                  |                |
         |   (PDR_ReqInfo*)   |                  |                |
         |------------------->| PHR_Resource_Request (marked)     |
         |                    |  (PDR_ReqInfo*)  |                |
         |                    M---------------------------------->|
         |                    |                  |                |
         |            PDR_Reservation_Report (marked)             |
         |<-------------------|------------------|----------------|
         |                    |                  |                |
External |                    |                  |                |External
Protocol |                    | External Protocol|                |Protocol
<------> X<-------------------|------------------|--------------->X<------>
(QoS     |                    |                  |                |
request) |                    |                  |                |

(PDR_ReqInfo*) - represents the PDR_Request_Info message encapsulated into a PHR message. This message is processed only by the ingress and egress nodes.

Figure 8: Normal Operation for unsuccessful reservation - Example 2

5.2.2. Normal operation using the measurement-based PHR

This RMD functionality is quite similar to that which uses the reservation-based PHR. As with the reservation-based PHR, both example scenarios are considered, and the same differences between the two in the handling of the PDR states apply here as well.
The classification of the QoS request is done as described in Sections 5.2.1.1 and 5.2.1.2, respectively.

The difference from the reservation-based PHR is that the measurement-based PHR relies on a measurement algorithm for the admission or rejection of resource requests. As such, it does not have to maintain any resource reservation state per PHB in the edge or interior nodes. However, the measurement-based PHR uses two states per PHB that are not maintained by the PHR protocol: one that stores the measured user traffic load associated with that PHB, and another that stores the maximum allowable traffic load for that PHB. In addition, the edges maintain a PDR resource reservation state (see Section 3.2.3). The initiation and maintenance of the PDR resource reservation states is accomplished in the same way as described in Sections 5.2.1.1 and 5.2.1.2, respectively.

If the QoS request can be satisfied locally, the ingress node will start the process of generating the "PHR_Resource_Request" message. In addition, depending on how the external resource reservation protocol initiates and maintains the PDR resource reservation states at the edges, the ingress node will also create either the "PDR_Reservation_Request" message or the "PDR_Request_Info" message (see Section 5.2.1.1).

On receiving the "PHR_Resource_Request" signaling message, the interior node has to check the monitoring status by, for example, measuring the real average traffic (user) data load per PHB. The "monitoring status" specifies how much of the resources allocated to a particular PHB have been consumed. If the sum of the PHR requested resources and the value specified by the monitoring status is less than or equal to the maximum node capacity associated with the given PHB, then the request is accepted. Otherwise, the node does not have the requested amount of resources, and the "PHR_Resource_Request" is marked as not admitted.
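The admission test described above reduces to a single comparison, sketched below. The class name PhbMeasurement, the function process_request, and the rule that an already marked message is passed on unchanged are assumptions for illustration; the measurement algorithm that produces the load value is outside the scope of the sketch. Note that the node changes no state when it admits a request.

```python
from dataclasses import dataclass


@dataclass
class PhbMeasurement:
    """Hypothetical per-PHB values held by a node (not PHR protocol state)."""
    measured_load: float  # measured user traffic load, in resource units
    max_load: float       # maximum allowable traffic load, in resource units


def process_request(state_by_dscp, dscp, requested_units, marked=False):
    """Return the 'marked' (not admitted) bit to forward with the message."""
    if marked:
        # Assumption: a message marked upstream is forwarded unchanged.
        return True
    state = state_by_dscp[dscp]
    admitted = state.measured_load + requested_units <= state.max_load
    return not admitted


node = {46: PhbMeasurement(measured_load=70.0, max_load=100.0)}
print(process_request(node, 46, 30))  # 70 + 30 <= 100, so not marked
print(process_request(node, 46, 31))  # 70 + 31 > 100, so marked
```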
The behavior of the egress node on admission or rejection of the "PHR_Resource_Request" is the same as in the interior nodes. The reporting process used to inform the ingress node about the monitoring status is similar to the process explained in Section 5.2.1.1.

5.3. Example of Fault Handling Operation

Fault handling operation refers to the situations when there are problems in the network, such as route or link failure, severe congestion, loss of PHR signaling messages, etc. Two typical situations will be described: the loss of PHR signaling messages and severe congestion. The fault handling operation described here is in general independent of the type of example scenario, and thus can be applied in both cases.

5.3.1. Loss of PHR signaling messages

The PHR signaling messages, and consequently the PDR signaling messages, might be dropped, for example due to route or link failure. The loss of PHR signaling messages is especially problematic for the reservation-based PHR, since the dropped signaling messages might have reserved resources in some interior nodes in the communication path that will now not be used. This does not present a problem for the measurement-based PHR since the resources are not reserved.

The ingress nodes are responsible for handling the loss of PHR signaling messages. After sending a "PDR_Reservation_Request", a "PDR_Refresh_Request" or a "PDR_Request_Info" message encapsulated in a PHR message, the ingress node will start a timer. The ingress node will then wait for a predefined amount of time to receive an acknowledgment, either as a "PDR_Reservation_Report" or a "PDR_Refresh_Report" message. If the ingress node does not receive this acknowledgment within the predefined amount of time, it will conclude that an error has occurred.
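The ingress timer described above can be sketched as a table of deadlines keyed by flow. The class AckTracker, its method names, and the use of a flow ID as the key are assumptions made for illustration; time is passed in explicitly so that the behavior is easy to follow.

```python
class AckTracker:
    """Hypothetical ingress-side tracker for outstanding PDR reports."""

    def __init__(self, timeout):
        self.timeout = timeout  # predefined waiting time, in seconds
        self._deadline = {}     # flow_id -> time by which a report is due

    def message_sent(self, flow_id, now):
        """Start (or restart) the timer when a signaling message is sent."""
        self._deadline[flow_id] = now + self.timeout

    def report_received(self, flow_id):
        """Stop the timer when the matching report arrives."""
        self._deadline.pop(flow_id, None)

    def timed_out(self, now):
        """Return flows whose report did not arrive in time, then forget them."""
        lost = sorted(f for f, d in self._deadline.items() if now >= d)
        for flow_id in lost:
            del self._deadline[flow_id]
        return lost


tracker = AckTracker(timeout=2.0)
tracker.message_sent(flow_id=1, now=0.0)
tracker.message_sent(flow_id=2, now=0.5)
tracker.report_received(flow_id=1)
print(tracker.timed_out(now=1.0))  # nothing has expired yet
print(tracker.timed_out(now=2.5))  # flow 2 never got its report
```

Keying the timers by flow lets the ingress node tell which flow session the error belongs to.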
Moreover, it will also know that this error occurred during the resource reservation process for the flow session that is associated with the "PDR_Reservation_Request" or "PDR_Refresh_Request" message it sent previously.

When a "PHR_Resource_Request" message is dropped, the ingress node will not send any new PDR and PHR signaling messages associated with the same flow session during the first subsequent refresh period. In this way all the possibly unused reserved resources will implicitly be released within one refresh period.

When a "PHR_Refresh_Update" message is dropped, the ingress node, depending on which PDR type is used, will send a PDR and "PHR_Refresh_Update" message during either the first or the second subsequent refresh period. In the first case, one or more interior nodes may reserve double the amount of the required resources, while only half of these reserved resources will be used. In the second case, the ingress node will not send any new PDR and "PHR_Refresh_Update" messages associated with the same flow session during the first subsequent refresh period. In this way all possibly unused reserved resources will implicitly be released. However, the application may experience a QoS degradation during one refresh period.

5.3.2. Severe Congestion Handling operation

Severe congestion handling in the RMD framework is the same regardless of the PHR group used. When severe congestion occurs, the ingress node MUST be informed. If the severe congestion occurs in an interior node or the egress node, then that node will set the "severe congestion" flag [RODA] in the PHR signaling message and will forward it towards the egress node. The egress node will inform the ingress node by sending a report message with the "severe congestion" flag set.
After receiving this message, the ingress node will discard all new incoming requests for the severely congested path for a predefined time.

Flow diagrams showing the severe congestion handling are depicted in Figures 9 and 10. In a), the severe congestion flag is set in PHR_Resource_Request, and as a result no new QoS requests are admitted for that communication path. In b), the severe congestion flag is set in PHR_Refresh_Update and, depending on the PDR used, on-going flows might be preempted (e.g., shifted to an alternative PHB). Note that this separation is only for illustrative purposes: once severe congestion occurs in the path, no traffic will be sent on that path within the same PHB, independently of which messages are marked with severe congestion. Figure 9 illustrates the scenario described in Section 5.2.1.1 (Example 1), and Figure 10 illustrates the scenario described in Section 5.2.1.2 (Example 2).

a)
      Ingress             Interior             Interior             Egress
         |                    |                    |                    |
QoS      |PHR_Resource_Request|                    |                    |
Request  |  (PDR_ResReq*)     |                    |                    |
-------->|------------------->|PHR_Resource_Request (severe congestion) |
         |                    |  (PDR_ResReq*)     |                    |
         |                    S---------------------------------------->|
         |                    |                    |                    |
         |      PDR_Reservation_Report (severe congestion)              |
         |<-------------------|--------------------|--------------------|
QoS      |                    |                    |                    |
Request  |                    |                    |                    |
Rejected |                    |                    |                    |
<--------|                    |                    |                    |
         |                    |                    |                    |
b)
         |                    | Traffic(user) Data |                    |
-------->|------------------->|------------------->|------------------->|--->
         |PHR_Refresh_Update  |                    |                    |
         |  (PDR_RefReq*)     |                    |                    |
         |------------------->|PHR_Refresh_Update (severe congestion)   |
         |                    |  (PDR_RefReq*)     |                    |
         |                    S---------------------------------------->|
         |                    |                    |                    |
         |      PDR_Refresh_Report (severe congestion)                  |
         |<-------------------|--------------------|--------------------|
Traffic  |                    |                    |                    |
data     |                    |                    |                    |
blocked  |                    |                    |                    |
---------X                    |                    |                    |
         |                    |                    |                    |

(PDR_ResReq*) - represents the PDR_Reservation_Request message encapsulated in the PHR_Resource_Request message. This message is processed only by the ingress and egress nodes.

(PDR_RefReq*) - represents the PDR_Refresh_Request message encapsulated in the PHR_Refresh_Update message. This message is processed only by the ingress and egress nodes.

Figure 9: Severe Congestion handling Operation applied to Example 1

a)
      Ingress             Interior             Interior             Egress
External |                    |                    |                    |Ext.
Protocol | External Protocol (used for PDR state initiation)            |Prot.
<------> |<-------------------|--------------------|------------------->|<--->
QoS      |                    |                    |                    |
Request  |PHR_Resource_Request|                    |                    |
         |  (PDR_ReqInfo*)    |                    |                    |
         |------------------->|PHR_Resource_Request (severe congestion) |
         |                    |  (PDR_ReqInfo*)    |                    |
         |                    S---------------------------------------->|
         |                    |                    |                    |
         |                    |                    |                    |
         |      PDR_Reservation_Report (severe congestion)              |
         |<-------------------|--------------------|--------------------|
QoS      |                    |                    |                    |
Request  |                    |                    |                    |
Rejected |                    |                    |                    |
<--------|                    |                    |                    |
         |                    |                    |                    |
b)
         |                    | Traffic(user) Data |                    |
-------->|------------------->|------------------->|------------------->|--->
         |PHR_Refresh_Update  |                    |                    |
         |  (PDR_ReqInfo*)    |                    |                    |
         |------------------->|PHR_Refresh_Update (severe congestion)   |
         |                    |  (PDR_ReqInfo*)    |                    |
         |                    S---------------------------------------->|
         |                    |                    |                    |
         |      PDR_Refresh_Report (severe congestion)                  |
         |<-------------------|--------------------|--------------------|
Traffic  |                    |                    |                    |
data     |                    |                    |                    |
blocked  |                    |                    |                    |
---------X                    |                    |                    |
         |                    |                    |                    |

(PDR_ReqInfo*) - represents the PDR_Request_Info message encapsulated in a PHR message. This message is processed only by the ingress and egress nodes.

Figure 10: Severe Congestion handling Operation applied to Example 2

6. Interoperability with external resource reservation schemes

The RMD framework is initially designed for a single edge-to-edge Diffserv domain. As part of the global Internet, this single edge-to-edge Diffserv domain will have to interoperate with other domains that may or may not be Diffserv-capable and that may use different resource reservation schemes.

The RMD framework, which is specified as an open framework, MUST be able to interoperate with these external resource reservation schemes. That is, the PDR functionality will have to take care of interoperability between the external resource reservation schemes and the PHR protocol. The external resource reservation scheme could be applied either on an end-to-end or on an edge-to-edge basis. In order to describe this interoperability, the two most typical scenarios are chosen:

- Interoperability of the RMD framework with an RSVP/Intserv domain

  For a description of this interoperability, the Integrated Services over Differentiated Services framework [RFC2998] is used as a reference. The framework for Integrated Services (Intserv) operation over Differentiated Services (Diffserv) views the two architectures as complementary in deploying end-to-end QoS. It is primarily intended to support the quantitative (guaranteed) end-to-end services that have not yet been commercially deployed using RSVP/Intserv because of its lack of scalability.

  The specific realization of the RSVP/Intserv - Diffserv interoperation depends on the Diffserv resource management and on the RSVP awareness of the Diffserv network region. Resource management in Diffserv can be either static (managed by human agents) or dynamic (via protocols).

  When the resource management in Diffserv is performed using the RSVP protocol, then according to the scenario described in Section 5.2.1.2, the RSVP protocol will have to be used as an external resource reservation protocol that will initiate and maintain the PDR resource reservation states used at the edges of the RMD domain.
  Independently of the Diffserv resource management, the service mapping of Intserv-defined services to Diffserv-defined services is essential for Intserv-over-Diffserv operation, unless Diffserv is used only as a transmission medium. Service mapping depends on an appropriate selection of the PHB, and on admission control and policy control applied to the Intserv request based on the available resources and policies in the Diffserv domain. In this framework, it is the edge nodes that will perform the service mapping on receipt of the RESV message.

- Dynamically Assigned Trunk Reservations

  In this case, the SLAs/SLSs between different Diffserv domains are negotiated dynamically. In this scenario, RSVP aggregation [BaIt01] is used to signal QoS requests, that is, to negotiate the SLAs/SLSs between Diffserv domains. Furthermore, the DSCP marking is performed in a domain outside the RMD domain, such as the neighboring Diffserv domain located upstream.

  When the RSVP aggregation protocol is used to dynamically assign the trunk reservations, then according to the scenario described in Section 5.2.1.2, the RSVP aggregation protocol will have to be used as an external resource reservation protocol that will initiate and maintain the PDR resource reservation states used at the edges of the RMD domain.

7. Applicability scope of the RMD framework

The RMD framework is designed to be applicable to core networks and to any type of access network, wired or wireless, as long as they use the Diffserv architecture edge-to-edge.

As a particular example, the applicability of the RMD framework to wireless cellular access networks, that is, IP-based Radio Access Networks (RANs), is considered. The specific characteristics of the RAN (see [PaKa01]) impose strict requirements on the resource management strategies applied in the IP-based RAN; these requirements are explained in [PaKa01] in detail.
These requirements are not satisfactorily met by the current resource management strategies (see [PaKa01]). The RMD framework design, on the other hand, satisfies these specific resource management requirements, which gives the RMD framework an advantage over the current resource management strategies.

In order to fulfill the specific requirements related to resource management strategies applied in the IP-based RAN given in [PaKa01], the PDR protocol in the RMD framework MUST be able to support bi-directional reservations. This means that the PDR protocol MUST support the bi-directional feature described in Section 3.2.5 in addition to the mandatory ones given in Sections 3.2.1 to 3.2.4.

8. Tunneling

When PHR/PDR signaling messages are tunneled within the RMD Diffserv domain, the tunneling messages MUST include the PHR/PDR option field.

9. Security Considerations

The general security and tunneling considerations stated in Section 6 of [RFC2475] also apply to the RMD framework.

In addition, unlike Differentiated Services PHBs and PDBs, the RMD framework allows the edge nodes to reserve bandwidth or other QoS parameters dynamically. This flexibility makes it more vulnerable to erroneous reservations and sabotage. In order to keep functioning properly, the edge nodes MUST be certain that any flow reserving resources in the core network is allowed to do so, and only up to that flow's agreed-upon limit. If the edge node detects erroneous or malicious behavior, it MUST police that flow to the agreed-upon limits or reject it entirely.

Because of the use of soft state, the RMD framework can recover relatively easily from incorrect reservations. Thus, it is quite safe to deploy the RMD framework in a well-controlled network with trustworthy edge nodes.
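As an illustration of the policing rule stated in this section, the following minimal Python sketch shows one way an edge node could bound a flow's reservation request by its agreed-upon limit. The function name, its parameters and the switch between policing and rejecting are assumptions made for this example only; the framework itself does not define them.

```python
def police_reservation(requested_units, reserved_units, agreed_limit,
                       reject_on_excess=False):
    """Return the number of resource units an edge node would grant.

    Hypothetical helper: all names and semantics are illustrative only.
    """
    if min(requested_units, reserved_units, agreed_limit) < 0:
        raise ValueError("resource units must be non-negative")
    # Units the flow may still reserve under its agreed-upon limit.
    remaining = max(agreed_limit - reserved_units, 0)
    if requested_units <= remaining:
        return requested_units   # request stays within the limit
    if reject_on_excess:
        return 0                 # reject the request entirely
    return remaining             # police the request down to the limit
```

With an agreed limit of 10 units and 4 units already reserved, a request for 8 units would be policed down to 6 units, or granted 0 units if the edge node is configured to reject excess requests.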
In order to prevent abuse of the QoS capabilities of the core network, the ingress nodes SHOULD filter any PHR- or PDR-related header information coming from the outside before sending it through the core network. Whether this information needs to be preserved and later re-inserted, whether it should be discarded from the packet, or whether the entire packet should be discarded is an open issue.

10. Conclusions

The Resource Management in Diffserv (RMD) framework presented in this memo is an open framework which, by means of the PHR and PDR functionality, provides a scalable and simple solution for resource reservation in a single edge-to-edge Diffserv domain. Furthermore, the RMD framework provides the necessary functionality for interoperability with other external resource management strategies, which makes it part of the effort to achieve end-to-end QoS deployment. The applicability of the RMD framework to Diffserv-based core networks and to various wired and wireless access networks is also of particular importance.

11. References

[BaIt01]  Baker, F., Iturralde, C., Le Faucheur, F., Davie, B., "Aggregation of RSVP for IPv4 and IPv6 Reservations", Internet Draft, Work in progress.

[Doy98]   Doyle, J., "CCIE Professional Development: Routing TCP/IP", Volume 1, Cisco Press, 1998.

[NiKa01]  Nichols, K., Carpenter, B., "Definition of Differentiated Services Per Domain Behaviors and Rules for their Specification", Internet Draft, Work in progress.

[RODA]    Westberg, L., Karagiannis, G., Partain, D., Oosthoek, S., Jacobsson, M., Rexhepi, V., "Resource Management in Diffserv On DemAnd (RODA) PHR", Internet Draft, Work in progress.

[PaKa01]  Partain, D., Karagiannis, G., Westberg, L., "Resource Reservation Issues in Cellular Access Networks", Internet Draft, Work in progress.
[RFC1633] Braden, R., Clark, D., Shenker, S., "Integrated Services in the Internet Architecture: An Overview", RFC 1633, 1994.

[RFC1905] Case, J., McCloghrie, K., Rose, M., Waldbusser, S., "Protocol Operations for Version 2 of the Simple Network Management Protocol (SNMPv2)", RFC 1905, 1996.

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC2205] Braden, R., Zhang, L., Berson, S., Herzog, S., Jamin, S., "Resource ReSerVation Protocol (RSVP) -- Version 1 Functional Specification", RFC 2205, 1997.

[RFC2474] Nichols, K., Blake, S., Baker, F., Black, D., "Definition of the Differentiated Services Field (DS Field) in the IPv4 and IPv6 Headers", RFC 2474, December 1998.

[RFC2475] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z., Weiss, W., "An Architecture for Differentiated Services", RFC 2475, 1998.

[RFC2543] Handley, M., Schulzrinne, H., Schooler, E., Rosenberg, J., "SIP: Session Initiation Protocol", RFC 2543, 1999.

[RFC2597] Heinanen, J., Baker, F., Weiss, W., Wroclawski, J., "Assured Forwarding PHB Group", RFC 2597, 1999.

[RFC2598] Jacobson, V., Nichols, K., Poduri, K., "An Expedited Forwarding PHB", RFC 2598, 1999.

[RFC2638] Nichols, K., Jacobson, V., Zhang, L., "A Two-bit Differentiated Services Architecture for the Internet", RFC 2638, 1999.

[RFC2676] Apostolopoulos, G., Williams, D., Kamat, S., Guerin, R., Orda, A., Przygienda, T., "QoS Routing Mechanisms and OSPF Extensions", Experimental RFC 2676, August 1999.

[RFC2702] Awduche, D., Malcolm, J., Agogbua, J., O'Dell, M., McManus, J., "Requirements for Traffic Engineering Over MPLS", Informational RFC 2702, September 1999.

[RFC2748] Durham, D., Boyle, J., Cohen, R., Herzog, S., Rajan, R., Sastry, A., "The COPS (Common Open Policy Service) Protocol", RFC 2748, January 2000.
[RFC2859] Fang, W., Seddigh, N., Nandy, B., "A Time Sliding Window Three Colour Marker (TSWTCM)", Experimental RFC 2859, June 2000.

[RFC2998] Bernet, Y., Yavatkar, R., Ford, P., Baker, F., Zhang, L., Speer, M., Braden, R., Davie, B., Felstaine, E., "Framework for Integrated Services Operation over Diffserv Networks", RFC 2998, 2000.

[WeTu00]  Westberg, L., Turanyi, Z. R., Partain, D., "Load Control of Real-Time Traffic", Internet Draft, Work in progress.

12. Acknowledgements

Special thanks to Geert Heijenk for reviewing this document and providing useful input.

13. Authors' Addresses

Lars Westberg
Ericsson Research
Torshamnsgatan 23
SE-164 80 Stockholm
Sweden
EMail: Lars.Westberg@era.ericsson.se

Martin Jacobsson
Ericsson EuroLab Netherlands B.V.
Institutenweg 25
P.O.Box 645
7500 AP Enschede
The Netherlands
EMail: Martin.Jacobsson@eln.ericsson.se

Georgios Karagiannis
Ericsson EuroLab Netherlands B.V.
Institutenweg 25
P.O.Box 645
7500 AP Enschede
The Netherlands
EMail: Georgios.Karagiannis@eln.ericsson.se

Simon Oosthoek
Ericsson EuroLab Netherlands B.V.
Institutenweg 25
P.O.Box 645
7500 AP Enschede
The Netherlands
EMail: Simon.Oosthoek@eln.ericsson.se

David Partain
Ericsson Radio Systems AB
P.O. Box 1248
SE-581 12 Linkoping
Sweden
EMail: David.Partain@ericsson.com

Vlora Rexhepi
Ericsson EuroLab Netherlands B.V.
Institutenweg 25
P.O.Box 645
7500 AP Enschede
The Netherlands
EMail: Vlora.Rexhepi@eln.ericsson.se

Robert Szabo
Net Lab
Ericsson Hungary Ltd.
Laborc u. 1
H-1037 Budapest
Hungary
EMail: robert.szabo@eth.ericsson.se

Pontus Wallentin
Ericsson Radio Systems AB
P.O. Box 1248
SE-581 12 Linkoping
Sweden
EMail: Pontus.Wallentin@era.ericsson.se

14. Appendix 1

An example of the algorithm used to reserve and update aggregated soft states per DSCP is the sliding window algorithm. The terms used in this algorithm are defined as follows:

- u: the number of resource units to be reserved or refreshed.

- Window: a buffer used to keep track of the reservation state over a single refresh period. The window has the same length as the refresh period.

- Cell: the window is split into a number of cells. The duration of a cell defines the reaction time of the algorithm.

- threshold: maximum number of resource units that may be reserved.

- countarray: array containing the number of reserved and refreshed resource units in each of the previous cells (array size = number of cells per window).

- rfcount: counter for the number of resource units reserved and refreshed by "PHR_Resource_Request" and "PHR_Refresh_Update" messages in the current cell.

- lastsum: sum of the reserved resource units in all the previous cells of a single window.

- newsum: sum of the reserved and refreshed resource units of all cells in the current window, including the resource units of the current cell. NB, this is a dynamic value, increasing gradually during the currently active cell.

- Interior nodes keep only the countarray, threshold, rfcount, lastsum and newsum per DSCP.

The algorithm is simple, so it can easily be implemented in hardware using simple counters. Its inputs are the refresh period length and the maximum number of reserved resource units allowed on the link. The latter is denoted by threshold. We assume external QoS requests with similar characteristics (e.g., voice). Moreover, it is assumed that the ingress node sends one "PHR_Refresh_Update" per refresh period.

If the network uses several PHBs for real-time traffic, then a separate copy of the algorithm may be run for each PHB, resulting in per-PHB admission.
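The counters defined above can be exercised with a short Python sketch. It is a minimal illustration that mirrors this appendix's sliding window pseudo-code; the class name, the method names and the Boolean return value (False standing for "mark p") are assumptions made for this example and are not part of the framework.

```python
class SlidingWindowAdmission:
    """Per-DSCP sliding window counters (illustrative sketch only)."""

    def __init__(self, threshold, nrofcells=15):
        self.threshold = threshold            # max. reservable resource units
        self.countarray = [0] * nrofcells     # units counted in previous cells
        self.rfcount = 0                      # units counted in current cell
        self.lastsum = 0                      # sum over the previous cells

    @property
    def newsum(self):
        # Treated as a "macro" that is always up to date, as in the
        # pseudo-code: drop the oldest cell, include the current cell.
        return self.lastsum - self.countarray[0] + self.rfcount

    def on_resource_request(self, u):
        """Handle PHR_Resource_Request; False means the message is marked."""
        if u + self.lastsum <= self.threshold:
            self.rfcount += u
            self.lastsum += u
            return True
        return False

    def on_refresh_update(self, u):
        """Handle PHR_Refresh_Update; False means the message is marked."""
        if u + self.newsum <= self.threshold:
            self.rfcount += u
            return True
        return False

    def on_cell_end(self):
        """Advance the window by one cell."""
        # Slide the window: the oldest cell drops out, the current one enters.
        self.countarray = self.countarray[1:] + [self.rfcount]
        self.lastsum = sum(self.countarray)
        self.rfcount = 0
```

In this sketch a reservation that is not refreshed within one window (nrofcells cells) drops out of lastsum by itself, which corresponds to the implicit release of soft state described in Section 5.3.1.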
The algorithm counts the number of resource units in "PHR_Resource_Request" and "PHR_Refresh_Update" messages in the current cell (rfcount). The result of the counting is an upper limit on the number of resource units reserved on the link, as some reservations may have gone by the end of the cell. The value of this counter is used in the next cell to decide on admission (either lastsum or newsum). When a new reservation is admitted, this value is increased to take the new reservation into account.

Pseudo-code of "sliding window":

   nrofcells = 15                    // number of cells in the periodlength
   countarray[0..nrofcells-1] = 0    // exactly nrofcells cells in
                                     // countarray
   rfcount = 0                       // count of PHR_Resource_Request
                                     // and PHR_Refresh_Update messages
                                     // in current cell

   // "invariants" at the cell boundary:
   lastsum = sum from i=0 to nrofcells-1 of countarray[i]
   // lastsum represents the total amount of reserved bandwidth up
   // to the current cell for a complete refresh period-length

   newsum = lastsum - countarray[0] + rfcount
   // see this as a macro that is constantly up to date!
   // newsum is going to be the "lastsum" for the next
   // cell. So the oldest cell is left out and the new rfcount
   // (PHR_Resource_Request/PHR_Refresh_Update) value is included.

   on event: arrival of signal message p
      // let p denote the arriving message and p(u) the number of
      // resource units to be reserved or refreshed.
      // there are 2 options for which cell p is processed in:
      // a) the cell active at the time of arrival and
      // b) the cell active at the time of processing
      // from a protocol perspective a) is preferable, but may
      // be more difficult to implement, since it requires an
      // extra check when going to the next cell, to check if
      // there are still messages in the queue for the current
      // cell.
      select on p
         case p = PHR_Resource_Request message
            if p(u) + lastsum <= threshold then
               rfcount += p(u)
               lastsum += p(u)
            else
               mark p
            end if
         case p = PHR_Refresh_Update message
            if p(u) + newsum <= threshold then
               rfcount += p(u)
            else
               mark p
            end if
      end select
   end event

   on event: cell ends
      // advance the window to the next cell
      slide_window(countarray)
      countarray[nrofcells-1] = rfcount
      // sum all the cells in countarray
      lastsum = sum(countarray)
      newsum = lastsum - countarray[0]
      rfcount = 0
   end event

   function slide_window ( a : array )
      // after this operation, a[0] contains what was previously
      // in a[1]. The same goes for all the other values in a,
      // except for the last, which is set to 0
   end function

Table of Contents

   1 Introduction
   1.1 Definitions/Terminology
   2 Overview of the RMD Framework Protocols
   2.1 RMD framework scenarios
   2.2 PDR protocol functions
   2.3 PHR protocol functions
   3 The PDR protocols
   3.1 Introduction
   3.2 Per Domain Reservation (PDR) protocol features
   3.2.1 Ingress node addressing
   3.2.2 Error control
   3.2.3 Management of Reservation States
   3.2.4 Resource Unavailability
   3.2.5 Bi-directional reservations
   4 The PHR protocols
   4.1 Introduction
   4.2 Per Hop Reservation (PHR) protocol features
   4.2.1 One reservation state per Diffserv class PHB
   4.2.2 Sender-initiated
   4.2.3 Adapts to Load Sharing
   4.2.4 Severe Congestion Handling
   5 Examples of RMD Operation
   5.1 Examples of Signaling Message Types
   5.1.1 PHR signaling message types
   5.1.2 PDR signaling message types
   5.2 Example of Normal operation
   5.2.1 Normal Operation using the reservation-based PHR
   5.2.1.1 Example 1
   5.2.1.2 Example 2
   5.2.2 Normal operation using the measurement-based PHR
   5.3 Example of Fault Handling Operation
   5.3.1 Loss of PHR signaling messages
   5.3.2 Severe Congestion Handling operation
   6 Interoperability with external resource reservation schemes
   7 Applicability scope of the RMD framework
   8 Tunneling
   9 Security Considerations
   10 Conclusions
   11 References
   12 Acknowledgements
   13 Authors' Addresses
   14 Appendix 1