Internet Engineering Task Force                              Ashish Mehra
INTERNET DRAFT                                               Dinesh Verma
                                          IBM T J Watson Research Center
                                                        25 February 1999
Expires: 25 August, 1999

           Architectural Considerations for DiffServ Servers

Status of Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as ``work in progress.''

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

Mehra Verma               Expires 25 August 1999                [Page i]

Internet Draft    draft-mehra-diffserv-servers-00.txt    25 February 1999

Abstract

This draft motivates and presents architectural considerations for Differentiated Services [DSARCH] support in servers (referred to as DiffServ servers). We discuss possible deployment scenarios for Differentiated and Integrated Services, and highlight the benefits of supporting DiffServ functions on servers. We outline the requirements for DiffServ-enabled servers and propose a policy-based architecture that allows IntServ and DiffServ functions to coexist on DiffServ servers. We then propose a DiffServ API that can be used by applications to influence installed system policies on an application-specific basis, and can serve as a candidate for standardization. We also describe a number of host-specific DiffServ functions that can be efficiently supported on servers.
1. Introduction

The differentiated services architecture [DSARCH] provides a method by which network operators can support different classes of service in a network. The service differentiation is geared towards a model where an access device determines the class of service of data packets passing through it and changes the Type of Service (TOS) byte of the IP header [DSHEAD]. The routers in the network provide support for different Per Hop Behaviors (PHBs) (e.g., Assured Forwarding [ASREF], Expedited Forwarding [EFREF]) in order to provide different classes of service. Drafts have been proposed on the architecture of a differentiated services access box [DIFFEDGE], and on how differentiated services will inter-operate with integrated services and signaled quality of service (QoS) [DIFFEDGE] [RSVPDIFF].

The prevailing model in the working group assumes that differentiated services would typically be deployed in the manner shown in Figure 1, which shows two stub networks connected by a transit network. The working assumption has been that the stub networks would support integrated services while the transit network(s) would support differentiated services.

       /  Stub   \        / Transit \        /  Stub   \
      /  Network  \      /  Network  \      /  Network  \
 |---| |     |---|   |---|       |---|   |---|     | |---|
 |Tx |-|     |ER1|---|BR1|       |BR2|---|ER2|     |-|Rx |
 |---| |     |---|   |---|       |---|   |---|     | |---|
      \           /      \           /      \           /
       \         /        \         /        \         /

             Figure 1: Sample Network Configuration

While the model above has several merits, we believe that there is a significant need for a model in which differentiated services are extended to the end-host. This is highly desirable in the deployment of servers that must process a large number of connection requests per second.
While differentiated services marking by the end-host has been allowed for in the drafts currently in the working group, adequate attention has not been given to the architecture of the servers that support differentiated services.

Another important feature overlooked in the current discussion on differentiated services is the impact of intervening proxies (including application-level gateways/relays) on packet PHB marking. Proxies are commonly deployed in current networks for security (e.g., SOCKS proxies), content filtering and caching (HTTP proxies), voice call aggregation (H.323 proxies), etc. Whenever a proxy is deployed in the network, differentiated services markings done as per the current architecture would be lost and would need to be reconstructed. Moreover, translation of QoS requirements across proxies may be necessary to honor network policies for the DiffServ domains interconnected by the proxies. Therefore, it is important to develop an architecture model that supports the existence of proxies in a network. When proxies need to be supported in a differentiated services network, it is much more efficient to provide diffserv marking support as an end-host mechanism rather than as a network mechanism.

The goal of this Internet Draft is to propose a common mechanism to extend support for diffserv functions to the end hosts. The draft presents the reasons why we believe extension of diffserv functions to the end-host is important, and it outlines an end-host architecture that can be used for this purpose. We also discuss the QoS components at end-hosts that may need to be standardized in order to provide interoperability among different vendors.

2. Deployment Scenarios for IntServ and DiffServ

In accordance with the sample network described in Figure 1, the following possible scenarios could be used for deployment of QoS features in the network.
The three networks shown in the Figure are characterized as IntServ/DiffServ networks. In a practical deployment, each of these networks can consist of multiple subnetworks. For ease of notation, we call one of the stub networks the client network and the other the server network. This is intended simply to identify the networks uniquely. The communicating end-hosts on the stub networks may indeed operate in a peer-like relationship rather than a client-server relationship. We further assume that each of the network operators would choose to offer some flavor of QoS support.

In the context of the above, the following configurations are possible:

      Client Stub Network   Transit Network   Server Stub Network

   1. DiffServ              DiffServ          DiffServ
   2. IntServ               IntServ           IntServ
   3. IntServ               DiffServ          IntServ
   4. DiffServ              IntServ           DiffServ
   5. DiffServ              IntServ           IntServ
   6. IntServ               DiffServ          DiffServ

Scenario 1 implements diffserv all through the network. The advantage of this approach is that a scalable and uniform QoS mechanism is available in the network. Using policy support [POLICY], DiffServ can be enabled in the stub networks without modifying existing applications. However, there is no end-to-end signaled QoS available for the applications.

Scenario 2 implements intserv all through the network. The advantages and limitations of such a deployment are well understood. While this approach enables per-flow signaled end-to-end QoS for the applications, it can cause significant scaling problems for the transit network.

Scenario 3 is the one described in the [DIFFEDGE] and [RSVPDIFF] drafts. It enables the aggregation of multiple flows in the transit network, and reduces the scaling problems. However, scaling issues associated with excessive signaling load on high-end servers (e.g., high-volume web servers or Voice-on-IP servers) would still remain.
Scenario 4 is in a sense the inverse of Scenario 3: the stub networks support DiffServ while the transit network supports IntServ. This may be used in cases where IntServ mechanisms are used to set up aggregated reserved-bandwidth tunnels in the transit network, with or without the use of tag-switching [MPLS]. The scaling issues at the server as well as in the transit networks are alleviated by the use of aggregation and/or DiffServ. However, the overhead associated with the establishment and operation of aggregated tunnels could be a potential drawback.

Scenarios 5 and 6 are mirror images of one another, where the network supports IntServ in one half and DiffServ in the other. At stub networks where scalability concerns are paramount, the DiffServ half would be more desirable. However, some stub networks may want to allow applications to support signaled QoS, which then gets mapped into an appropriate diffserv tunnel.

Each of the scenarios outlined above has its merits and drawbacks. We would expect the Internet to support a mix of the above, with best-effort networks further complicating the picture during initial deployments. Depending on the deployments of various ISPs, organizations may have to cope with one or more of the scenarios outlined above. In half of these scenarios, we see a need to support DiffServ on the end-hosts. In the next section, we give further justification for why such support would be desirable for servers.

3. Why DiffServ on Servers

In the Differentiated Services model and architecture [DSARCH,DSFRAME], it is often highly desirable for a customer's egress node to mark, meter, and/or shape all traffic leaving the egress node into a provider's network.
The egress nodes may be boundary/access routers providing value-added connectivity between the customer and provider networks, or they may be hosts such as servers directly attached to the provider's network. The latter scenario of direct-attached hosts often arises when servers (such as web content servers or frontend presentation servers) comprise the egress nodes of a content provider or e-business vendor.

Providing QoS functions such as traffic classification, marking, metering, and shaping on edge devices such as servers and proxies is desirable for the following reasons. Some of these have been briefly mentioned in [DSARCH, DSFRAME] as well.

3.1 Scalability

Pushing more complex QoS functions to the edge allows realization of a simple and highly efficient network core that provides simple forwarding and QoS functions. This in turn enables the network core to be highly scalable in terms of the resource requirements of the network elements and the total amount of traffic they can forward successfully. Providing QoS functions at network servers also simplifies packet processing and traffic handling at downstream egress nodes, since pre-marked and pre-conditioned traffic is likely to consume fewer buffer, processing, and bandwidth resources.

Unlike other edge devices, which must explicitly create and maintain per-flow state, servers and proxies already maintain per-flow state (a flow being a TCP connection or UDP session) in the form of sockets and protocol control blocks. Moreover, a network edge device must invest substantial per-flow resources to manage traffic that is in violation of specified traffic profiles, either to discard such packets (drop) or to hold back out-of-profile packets in local buffers (shape).
Generating backpressure by notifying the offending traffic source may often be too expensive or slow, and may require mechanisms in the source to honor such notifications anyway, so as to guard against applications that ignore such notifications or fail to adapt gracefully. Instead, a server or proxy implementing QoS functions can simply block the violating application(s), generate local notifications for the application immediately following a traffic violation, or return transmission failure indications for uncooperative applications.

An added advantage of performing elaborate QoS functions at servers is the opportunity to avoid incorrect packet classification at network nodes due to fragmentation of IP packets in the network. Multi-Field (MF) classifiers [DSARCH] base their traffic classification decisions on the contents of transport-layer header fields. Since only the first fragment of a packet carries the transport-layer header, MF classifiers at network nodes may not correctly classify the fragments that follow it. As pointed out in [DSARCH], maintaining fragmentation state at these network nodes is not a general solution due to fragment reordering at upstream nodes or divergent routing paths. In the absence of a well-established policy for handling fragmented packets, marking packets at the traffic source allows fragments to carry a common mark (assuming fragmentation, wherever it occurs, either does not re-mark the packet or re-marks all fragments of a packet). Behavior Aggregate (BA) classifiers use only the contents of the incoming DS field of a packet to classify it. Packet marking at the source therefore makes it possible to use BA classifiers correctly with packet fragments.

3.2 Implementation Efficiency

Supporting DiffServ functions on servers has the additional benefit of efficient classification, marking, and aggregation of flows.
Since servers are typically the endpoints of TCP connections, they can classify and aggregate traffic at connection establishment time and exploit per-connection state to subsequently classify packets traversing these connections very efficiently. Similar mechanisms enable the server to efficiently generate DS codepoints (derived from the policy rule and/or the incoming TOS marking on the connection request (Section 6.3)) during connection classification, and to mark packets subsequently.

The same benefits apply to connected UDP traffic, i.e., where the application issues an explicit connect to send UDP traffic to a single destination. For unconnected UDP traffic, per-packet classification (similar to that performed at network elements) may be necessary if each packet is sent to a different destination. Section 4 discusses additional benefits of supporting DiffServ functions on servers.

3.3 Specificity and Granularity of Packet Marking

Marking of packets at the traffic source has a number of key advantages over marking of packets at ingress nodes other than the traffic source. A traffic source can not only utilize the network-level information present in packet headers, it can also refer to the per-flow state already available at the traffic source. While much of this state is available to network nodes as well (via packet headers), the use of volatile ports by applications and of dynamic IP addresses (e.g., using DHCP) by client hosts makes accurate traffic classification at those nodes difficult and prohibitively expensive, if not impossible.

More importantly, the traffic source can classify and mark traffic based on additional information available only at the traffic source.
This "local" information includes one or more of the following:

- the type of application or service

- application-specific preferences

- specific users (e.g., those using a given application) or subscribers of a service

- specific groups to which users, applications, or services belong

- security associations or privileges and other attributes derived from these.

It may not be possible or desirable to export this information from the traffic source due to security and privacy concerns. Even if such information could be exported, doing so would require mechanisms to (a) communicate the information to the appropriate ingress nodes, (b) store the information in a suitable format at these network nodes, and (c) employ very intelligent classifiers to classify traffic appropriately. This approach requires that all information be carried or encoded in a structured manner in packet headers or the packet payload, which is not always feasible, or requires frequent out-of-band signaling, which is clearly not desirable for scalability. Moreover, given the unstructured nature of the local information, maintaining and looking up such state is likely to be very resource-intensive for egress or ingress nodes, further limiting scalability.

It seems reasonable, therefore, for the traffic source to perform elaborate traffic classification based on a wide range of local and network-level information attributes, and use appropriate DS codepoints for the set of service classes supported. While DS codepoints can only support a limited range of distinguishable service classes, the range is sufficiently large to allow the traffic source to realize a fine granularity of service differentiation.
This seems to be a reasonable compromise between very fine-grain service differentiation (the extreme being per-flow traffic handling) and the complexity of QoS mechanisms at the network egress and ingress nodes. We note that source-based packet marking does not preclude policing (and possible re-marking) at the network egress in order to check that policies are being enforced correctly.

3.4 IPsec Tunnels and Encryption

The use of encrypted tunnels, e.g., IPsec tunnels, precludes packet classification based on layer 4 and higher-layer information, because the relevant fields in the packet payload are encrypted. If IPsec encryption is used on an end-to-end basis, nodes or network elements at the edge (but not the source) or in the core of the network would be unable to classify traffic based on packet headers or the payload, and to apply the specified traffic conditioning rules.

In this scenario, being the endpoint of an end-to-end IPsec tunnel, only the traffic source has the capability to classify and condition outgoing traffic, via one or more QoS functions such as marking, metering, and shaping. This is because the source can perform traffic classification and conditioning *before* performing IPsec encryption. This has an additional advantage. For traffic conditioners such as droppers, which drop traffic in excess of the specified traffic profile, the source need not incur the overhead of IPsec encryption. Without source-based traffic classification and conditioning, excess traffic that would be dropped at downstream egress or ingress nodes would still be encrypted and transmitted by the source.

4. Requirements and Architecture for DiffServ Enabled Servers

An architecture for diffserv enabled servers must satisfy the following requirements:

- It should allow networked applications to be classified into a PHB other than best effort.

  Many existing networked applications (e.g., databases, transaction services, CICS, MQSeries, SAP) and middleware technologies (e.g., Java RMI, CORBA IIOP, HTTP) have been designed to run over best-effort networks. During the initial deployment of differentiated services, it is unlikely that these applications would be modified to generate packets with appropriate code points in their TOS marking. However, it is likely that these services would need to be given better performance than other classes of traffic in the network. Thus, an architecture for diffserv servers must be able to support such networked applications without modifying them.

- It should allow new applications to request the desired service level without a signaling protocol.

  The differentiated services architecture is based on a non-signaled approach whereby different classes of service are obtained by means of bilateral service level specifications (SLS). A new application must be able to express its desires with respect to the type of network performance it wants to receive. The server, in turn, must use these desires to determine which network PHB or SLS to use for specific applications. The server must not rely on the existence of a signaling mechanism to request a specific service level from the network.

- It should allow policy decisions to be applied uniformly across existing applications as well as new applications that signal the desired service level.

  A similar policy enforcement should occur for applications that use a signaled interface (e.g., RAPI) to signal their performance needs. Since both types of applications would require access to the same network resources, they should be controlled and administered by the same policy information.
- It should allow differentiated service support to coexist with secure communications using IPsec or SSL.

  Secure communications using IPsec or SSL is a reality in current Internet communications. Since IPsec encrypts transport-layer headers, the differentiated services architecture must perform its classification functions prior to the IPsec functions. SSL offers no hindrance as long as classification is based solely on the contents of transport and network layer headers.

- It should provide the functions needed to conform with the rate requirements enforced by network-layer SLSs.

  The end-host must not violate the terms of the contracts negotiated with the network provider. It must put into place appropriate mechanisms to ensure that applications comply with network-level SLSs.

- It must coexist with the support for integrated services in the end-host.

  Integrated Services and Differentiated Services offer a complementary set of QoS features, each of which may be most appropriate to meet a different set of application requirements. Both methods should coexist in the end-host.

An architecture of a server that implements native differentiated services support to meet the above requirements is shown in Figure 2. The differentiated services features at a server are implemented in the kernel by means of the data path. The data path enforces the functions in accordance with the diffserv traffic conditioning specifications (TCSs). All connections at the server are classified into different classes. Each class can be mapped onto a given PHB in the network, a maximum amount of outbound traffic, a maximum amount of inbound traffic, and a maximum amount of buffers that can be used by connections belonging to this class of service.
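As an illustration of the per-class state just described, the following minimal Python sketch models a service class as a record holding the PHB (represented here by a DS codepoint), the outbound and inbound traffic limits, and the buffer limit, and assigns a connection to a class once, when the connection is established. The names ServiceClass, CLASS_TABLE and classify_connection, the port-based rule, and all numeric values are hypothetical; this draft does not define a concrete data layout.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ServiceClass:
    name: str
    dscp: int          # DS codepoint of the PHB this class maps to
    max_out_bps: int   # maximum outbound traffic, bits per second
    max_in_bps: int    # maximum inbound traffic, bits per second
    max_buffers: int   # maximum buffers usable by connections in this class


# Illustrative values only; 46 is the codepoint commonly used for
# Expedited Forwarding, and 0 denotes best effort.
BEST_EFFORT = ServiceClass("best-effort", 0, 10_000_000, 10_000_000, 64)
PREMIUM = ServiceClass("premium", 46, 2_000_000, 2_000_000, 256)

# Hypothetical policy: local server port -> service class.
CLASS_TABLE: Dict[int, ServiceClass] = {443: PREMIUM}


def classify_connection(local_port: int) -> ServiceClass:
    """Pick the class for a new connection; unmatched traffic is best effort."""
    return CLASS_TABLE.get(local_port, BEST_EFFORT)
```

A data path built along these lines would look up the class once per connection and keep the result in the per-connection state, rather than re-classifying every packet.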
[Figure 2: block diagram. Labeled components: an RSVP-enabled Application, a QoS-Unaware Application, an RSVP Agent, an Application, the DiffServ API, the DiffServ Cfg (Policy Agent) with a link to Configuration/Policy, a Monitoring component with a link to SNMP, and the Data path.]

         Figure 2. Architecture of DiffServ Enabled Server

At the application level, access to differentiated services is controlled by means of a diffserv configuration agent. Under usual operation, this configuration would be driven by policies defined in a configuration file or a policy repository. Thus, the diffserv configuration agent is also the policy agent for the server.

Applications may or may not interact with the policy agent. In the initial deployments, and for the vast majority of applications, applications would not be aware of the differentiated services capability. These applications should still be able to get better than best-effort service by means of the policy agent and policy configurations. Some applications may wish to control their own choice of the classes of service they request. These applications should be able to interact with the policy agent using an API to determine the menu of available differentiated service classes and to select one of them. It would be desirable for this interface (API) to be a standard, so that applications could be developed to run on multiple vendor platforms.
In order to coexist with the integrated services support, the RSVP daemon on the end-host must interact with the differentiated services configuration agent (or the policy agent) to coordinate resource sharing and data path configuration. Both the RSVP agent and the policy agent may communicate with the data path, provided the configuration information coming from them is consistent.

The functions along the data path would be the same as the ones identified in the [DIFFEDGE] draft. As mentioned in Section 3, the server host may exploit additional local information when implementing DiffServ functions, to improve functionality, efficiency, or both. Section 6 discusses several host-specific functions that are natural candidates for DiffServ servers.

5. The DiffServ API

In order to permit inter-operation among different vendor applications, we would like to develop a standard API that would enable applications to specify their Differentiated Services requirements. In this section, we sketch the structure of such an API.

One may question whether such an API is necessary, or whether existing application interfaces to set the TOS byte on a connection are adequate for this purpose. We believe that simply marking the TOS byte is an inadequate solution, since applications cannot determine in advance the mapping of DiffServ PHBs to specific TOS encodings and so cannot make the appropriate calls. Moreover, even if this were possible, application requests for specific PHB assignments should be validated against configured policies in the system. Furthermore, an application may opt to select from a choice of different PHBs that are available to it within a specific environment. The determination of these PHBs needs to be standardized in order to promote inter-vendor interoperability.
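For comparison, the existing kind of interface referred to above can be sketched in a few lines of Python using the standard IP_TOS socket option, which is available on most Unix-like systems for IPv4 sockets. The helper names dscp_to_tos and mark_socket are hypothetical, and the placement of the 6-bit DS codepoint in the upper six bits of the TOS byte follows the DS field layout. Note that the application must hard-code a codepoint and that nothing in this path checks the request against configured policy, which is the shortcoming discussed above.

```python
import socket


def dscp_to_tos(dscp: int) -> int:
    """Place a 6-bit DS codepoint in the upper six bits of the TOS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DS codepoint must fit in 6 bits")
    return dscp << 2


def mark_socket(sock: socket.socket, dscp: int) -> None:
    """Ask the stack to send this IPv4 socket's packets with the codepoint.

    No policy validation happens here: the stack uses whatever value the
    application supplies.
    """
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))


# Example use (the codepoint 46 is an assumption made by the application):
#   s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
#   mark_socket(s, 46)
```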
Additionally, if DiffServ classification is to be performed on the basis of application-specific information (e.g., user preferences), applications using DiffServ services need a mechanism to communicate classification requests to the policy agent, which would validate these requests against configured policies in the system. The policy agent may then derive appropriate filters to be installed in the data path. Therefore, we believe that a standardized DiffServ API is highly desirable in the industry.

The abstraction offered by the API would consist of different Service Classes that are available to the application, and the ability to select one of the specific classes for its connections. A key requirement for the API is that it must support IPv4 as well as IPv6 traffic. The general API calls would consist of the following routines:

- Register: The registration call would create a handle for an application to communicate with the DiffServ Configuration Agent (Policy Agent). Its input arguments would specify actions to be invoked in case of exceptions, e.g., a callback function to denote failure of network resources. The output arguments would include a handle to use for future communications with the agent.

- UnRegister: This would release any context or state maintained in the diffserv configuration associated with the specific application. The input argument would include the handle returned from a prior registration call.

- ListServiceClasses: This would obtain a list of all service classes that are supported at the host. The input argument would consist of a handle, while the output argument would consist of a list of service class names to be returned to the application.

- CheckServiceClass: This would determine the name of the matching service class that would be most appropriate for a specific application or a socket. The input arguments would include the attributes that identify the application or the socket, and the output argument would specify the name of the corresponding service class that would be provided to the application as per the existing configured policies.

- PutinServiceClass: This function would find the name of the matching service class for a traffic flow, and configure the data path tables so that packets on that specific traffic flow would be mapped to the specific service class. A traffic flow may be identified by specifying the local and remote end-points of a socket.

- RemovefromServiceClass: This function would remove a specific traffic flow from the service class. The flow must have been put into the specific service class either by default configuration or by use of the PutinServiceClass function. The flow would then be mapped to the default service class in the system.

- FindServiceDetails: This function would obtain the specific details of a service class. The input argument would include the name of the service class, and the output argument would include the description of the service class. The details would include the maximum rate allowed on the service class, the maximum number of connections allowed in the service class, and the PHB to which the service class is mapped, and can also include information about expected or measured performance of traffic flows that belong to the specified service class.

The above API would be used by policy-aware applications to configure their diffserv-specific needs. Of course, there would be several applications that can obtain their diffserv classification by means of configuration only, without invoking a specific API. The policy agent would configure the data path tables, and no change to existing applications would be needed.
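To make the shape of the routines listed above more concrete, here is a minimal in-memory Python sketch of a policy agent exposing them. It is only an illustration under stated assumptions: the class and method names, the argument lists (for example, passing the handle to every call), the port-based matching rule, and the use of a dictionary in place of the data path tables are all hypothetical and are not defined by this draft.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Set, Tuple

# A flow is identified by the local and remote end-points of a socket.
Flow = Tuple[str, int, str, int]  # (local addr, local port, remote addr, remote port)


@dataclass(frozen=True)
class ServiceDetails:
    name: str
    phb: str              # PHB the class is mapped to
    max_rate_bps: int     # maximum rate allowed on the class
    max_connections: int  # maximum number of connections in the class


class PolicyAgent:
    """Hypothetical in-memory stand-in for the DiffServ Configuration Agent."""

    def __init__(self, classes: Iterable[ServiceDetails],
                 port_policy: Dict[int, str], default_class: str) -> None:
        self._classes = {c.name: c for c in classes}
        self._port_policy = dict(port_policy)   # local port -> class name
        self._default = default_class
        self._callbacks: Dict[int, Callable[[str], None]] = {}
        self._owned: Dict[int, Set[Flow]] = {}
        self._flow_table: Dict[Flow, str] = {}  # stands in for data path tables
        self._next_handle = 1

    def _check(self, handle: int) -> None:
        if handle not in self._callbacks:
            raise ValueError("unknown or released handle")

    def register(self, on_exception: Callable[[str], None]) -> int:
        """Register: store the exception callback and return a new handle."""
        handle = self._next_handle
        self._next_handle += 1
        self._callbacks[handle] = on_exception
        self._owned[handle] = set()
        return handle

    def unregister(self, handle: int) -> None:
        """UnRegister: release all state kept for this application."""
        self._check(handle)
        for flow in self._owned.pop(handle):
            self._flow_table.pop(flow, None)
        del self._callbacks[handle]

    def list_service_classes(self, handle: int) -> List[str]:
        self._check(handle)
        return sorted(self._classes)

    def check_service_class(self, handle: int, flow: Flow) -> str:
        """CheckServiceClass: name of the class the policy assigns to the flow."""
        self._check(handle)
        return self._port_policy.get(flow[1], self._default)

    def put_in_service_class(self, handle: int, flow: Flow) -> str:
        """PutinServiceClass: find the matching class and install the mapping."""
        name = self.check_service_class(handle, flow)
        self._flow_table[flow] = name
        self._owned[handle].add(flow)
        return name

    def remove_from_service_class(self, handle: int, flow: Flow) -> str:
        """RemovefromServiceClass: map the flow back to the default class."""
        self._check(handle)
        self._flow_table[flow] = self._default
        self._owned[handle].add(flow)
        return self._default

    def find_service_details(self, handle: int, name: str) -> ServiceDetails:
        self._check(handle)
        return self._classes[name]
```

In this sketch an application would call register once, use check_service_class to see what the configured policy would give a flow, and call put_in_service_class to have the mapping installed. The application never handles a DS codepoint directly, which keeps the PHB-to-codepoint mapping under the control of the policy agent.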
Applications with well-known server ports, or a configurable range of server ports, can use this scheme. Many existing applications already use a configurable range of server ports in order to facilitate firewall traversal.

6. Host-Specific DiffServ Functions

Differentiated Services can be realized efficiently at servers via appropriate placement of QoS functions in the protocol stack (referred to as the QoS Module). In addition to the QoS functions listed in [DIFFEDGE], DiffServ servers may also implement the following functions that enhance a server's ability to provide service differentiation while ensuring fairness.

6.1 Fair Flow Aggregation

As mentioned in Section 3.1, implementing Differentiated Services functions on servers offers significant advantages in terms of network scalability. This is because servers can aggregate flows together based on policies or SLSs. Policy rules or SLSs for Differentiated Services typically apply to sets of flows (referred to as aggregates). For example, a single policy rule may govern all traffic flowing to a particular subnet or destination. This allows the network core to be relatively simple and streamlined, with significantly reduced processing and memory requirements.

While regulating all aggregated traffic as per the specified TCS, the server may wish to ensure that traffic from each flow belonging to the aggregate is treated fairly, e.g., obtains a fair share of the specified aggregate bandwidth on the locally attached link. For a TCS specifying traffic policing or shaping, this means that in-profile traffic from each flow receives a fair share of the available link bandwidth. For a TCS specifying traffic policing with packet marking, on average the same number of packets of each flow would be marked with the specified in-profile or out-of-profile DS codepoints.

To maintain fairness or any other sharing policy, per-flow state is needed for each flow belonging to the aggregate.
Hosts (e.g., servers) already maintain complete per-flow state at the socket and transport layers in the form of data sockets and protocol control blocks. While strictly fair division of available local bandwidth (i.e., on the attached link or interface) may suffice for UDP traffic, fair allocation for TCP flows involves compensating for different congestion levels and round-trip times experienced by individual connections. The information available in the per-flow state at the server makes it possible to devise effective policies and mechanisms for fairness. Moreover, as described earlier, with minimal modifications the per-flow state can greatly facilitate traffic classification and policy rule association.

6.2 Marking of SYN-ACKs

A server supporting Differentiated Services is expected to mark traffic originating on each flow governed by an active policy rule or SLS enabled at the server. However, for TCP connections, since connection establishment packets compete with regular data packets for network resources, the server should ensure that the SYN-ACK generated on receipt of a new connection request (i.e., a SYN packet) is marked as per the policy applicable to the new connection. This further implies that the server may need to classify a connection, to determine its policy or SLS association, on receipt of a SYN packet, and not upon completion of the 3-way handshake that fully establishes a TCP connection. Note that this implies that the server must invest processing cycles in classifying a partially established connection, which might get aborted subsequently.
However, since the server has already accepted the new connection request for further processing (having found space in the partial listen queue), it is desirable to treat the generated SYN-ACK packet at the same service level as, or even a higher service level than, that specified for data packets on the connection. This reduces the likelihood that the processing cycles invested by the server in handling the new connection request would be wasted due to congestion at the nearest bottleneck router (or even the output queues at the selected outgoing interface at the server).

6.3 Outbound Packet Marking using Inbound TOS

It might be the case that an incoming packet at the server has been marked already, perhaps by the client or the upstream ingress node. For example, a SYN packet may be marked to encode the desired service level for the new connection request. Similarly, ACKs for data packets may be marked to direct them to an appropriate service class. In these cases the server may choose to derive the service level (i.e., DS codepoint) for packets sent on behalf of the new connection (including the SYN-ACK) from the DS codepoint specified in the applicable policy rule or SLS and the DS codepoint carried in the incoming packet.

In contemporary protocol stacks, TCP sets the TOS field in the packet from the value stored in the corresponding protocol control block just before handing the packet to the IP layer. In order for a DiffServ data-path to generate the correct DS codepoint, it must be able to examine the DS codepoint associated with the incoming packet before any packet is output on the connection. For incoming SYN packets, this can be achieved by storing the packet's DS codepoint in the corresponding TCP protocol control block before passing control to the QoS Module for classification.
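The SYN handling just described can be sketched as follows. This is a hedged illustration, not a stack implementation: the Pcb and PolicyRule types and the on_syn function are hypothetical, and the rule used to combine the two codepoints (honor the inbound codepoint only if the policy rule explicitly allows it, otherwise fall back to the rule's own codepoint) is an assumption, since this document does not fix a particular combination. The only fact relied upon is that the DS codepoint occupies the upper six bits of the former TOS octet [DSHEAD]. The values 46 and 10 are used as example codepoints (the recommended EF and AF11 values).

```python
from dataclasses import dataclass
from typing import FrozenSet


def dscp_from_tos(tos: int) -> int:
    """Extract the DS codepoint: the upper six bits of the TOS octet."""
    return (tos >> 2) & 0x3F


def tos_from_dscp(dscp: int, tos: int = 0) -> int:
    """Write dscp into the upper six bits, keeping the lower two bits of tos."""
    return ((dscp & 0x3F) << 2) | (tos & 0x03)


@dataclass
class Pcb:
    """Stand-in for a TCP protocol control block (hypothetical)."""
    tos: int = 0  # value TCP copies into outbound packets


@dataclass(frozen=True)
class PolicyRule:
    """Stand-in for the policy rule or SLS matched by classification."""
    default_dscp: int
    allowed_inbound: FrozenSet[int]  # inbound codepoints the rule will honor


def on_syn(pcb: Pcb, inbound_tos: int, rule: PolicyRule) -> int:
    # Step 1: remember the inbound marking in the PCB before classification.
    pcb.tos = inbound_tos
    # Step 2: classification is assumed to have selected `rule`; combine it
    # with the requested codepoint (assumed rule: honor only if allowed).
    requested = dscp_from_tos(pcb.tos)
    chosen = requested if requested in rule.allowed_inbound else rule.default_dscp
    # Step 3: overwrite the PCB value so that outbound packets, starting
    # with the SYN-ACK, carry the derived codepoint.
    pcb.tos = tos_from_dscp(chosen, pcb.tos)
    return pcb.tos


EF, AF11 = 46, 10
strict = PolicyRule(default_dscp=AF11, allowed_inbound=frozenset({AF11}))
lenient = PolicyRule(default_dscp=AF11, allowed_inbound=frozenset({AF11, EF}))
```

Under the strict rule a SYN marked EF (TOS octet 0xB8) yields outbound packets marked AF11 (0x28); under the lenient rule the EF marking is preserved.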
The datapath calculates the new DS codepoint after classifying the packet to determine the associated policy rule or SLS, and overwrites the value stored in the TCP PCB with the new value. TCP's packet output routine would automatically use the new value in the TOS field of outbound packets. A similar approach works with incoming ACK packets, except that no explicit classification need be performed.

Support for this feature would be desirable to preserve DiffServ encoding across various application-level proxies deployed within a DS-domain. The security implications of this feature are discussed in Section 7.

6.4 Inbound Connection Rate Control

In addition to marking, policing, and shaping outbound traffic, a server may exercise control over the rate of incoming TCP connection requests. This would effectively ``police'' incoming SYN packets to the desired rate of connections, while discarding the rest. Similar to the handling of outbound packets, policy rules governing incoming connection requests may specify filters to selectively police client connection requests. Inbound connection rate control may help in controlling the amount of server resources consumed by a service or application on behalf of a particular client.

6.5 Multicast Issues

Since all TCP transfers are unicast in nature, the multicast-related issues in the context of server support for Differentiated Services arise for UDP traffic. UDP traffic sent to a multicast address is sent to one or more adjacent routers on the multicast distribution tree rooted at the server. Since this function is performed at the IP layer, the actual number of multicast packets transmitted and the interfaces involved may not be known to the QoS Module (which may be integrated with the socket and transport layers).
For a given policy rule, if the same bandwidth is available at each of the interfaces, multicast packets do not pose any problems for the QoS functions. However, if the bandwidth available on different interfaces is not the same, applying QoS functions correctly on multicast packets would be difficult. We note that the treatment of multicast packets in the Differentiated Services framework remains an open issue.

7. Security Considerations

DiffServ servers should provide appropriate mechanisms to guard against denial-of-service attacks and, especially, theft of service. While various solutions exist to defend against denial-of-service attacks, the inbound connection rate control function described earlier may help limit the severity of such an attack for specific applications.

Theft of service can occur if the server uses the inbound TOS on a SYN packet or an ACK packet to derive the outbound TOS for SYN-ACK packets and data packets, respectively. A malicious attacker could steal an excessive proportion of the resources available at a given service level (such as EF) by marking SYN packets with the corresponding DS codepoint. To prevent such theft of service, a DiffServ server should ensure that control packets (such as SYN-ACK packets) carrying DS codepoints are also subject to appropriate policy-based traffic conditioning functions.

Acknowledgments

The authors would like to acknowledge the helpful comments and suggestions of the following individuals: Tsipora Barzilai, Mandis Beigi, Brian Carpenter, Zane Dodson, Edward Ellesson, Ray Jennings, Dilip Kandlur, Arvind Krishna, Vinod Peris, John Tavs, Renu Tewari and Ken White.

References

[AFREF] J. Heinanen et al., ``Assured Forwarding PHB Group'', Internet Draft draft-ietf-diffserv-af-04.txt, January 1999.

[EFREF] V. Jacobson et al., ``An Expedited Forwarding PHB'', Internet Draft draft-ietf-diffserv-phb-ef-01.txt, November 1998.
[DIFFEDGE] Y. Bernet, D. Durham and F. Reichmeyer, ``Requirements of Diff-Serv Boundary Routers'', Internet Draft draft-bernet-diffedge-01.txt, November 1998.

[DSARCH] S. Blake et al., ``An Architecture for Differentiated Services'', Internet RFC 2475, December 1998.

[DSFRAME] Y. Bernet, J. Binder, S. Blake et al., ``A Framework for Differentiated Services'', Internet Draft, November 1998.

[DSHEAD] K. Nichols et al., ``Definition of the Differentiated Services Field (DS Byte) in the IPv4 and IPv6 Headers'', Internet RFC 2474, December 1998.

[POLICY] J. Strassner and E. Ellesson, ``Policy Framework Core Information Model'', IETF Draft draft-ietf-policy-core-schema-00.txt, November 1998.

[RSVPDIFF] Y. Bernet, R. Yavatkar et al., ``A Framework for use of RSVP with DiffServ Networks'', Internet Draft draft-ietf-diffserv-rsvp-01.txt, November 1998.

Authors' Address

Ashish Mehra    Phone: (914) 784-7628
Dinesh Verma    Phone: (914) 784-7466
IBM T. J. Watson Research Center
P.O. Box 704
Yorktown Heights, NY 10598
Email: mehraa,dverma@watson.ibm.com