Internet Engineering Task Force                                  Ashish Mehra
INTERNET DRAFT                                                   Dinesh Verma
                                               IBM T J Watson Research Center
                                                              25 February 1999
                                                      Expires: 25 August, 1999

             Architectural Considerations for DiffServ Servers
                    <draft-mehra-diffserv-servers-00.txt>


    Status of Memo


       This document is an Internet-Draft and is in full
       conformance with all provisions of Section 10 of RFC2026.


       Internet-Drafts are working documents of the Internet
       Engineering Task Force (IETF), its areas, and its working
       groups.  Note that other groups may also distribute working
       documents as Internet-Drafts.


       Internet-Drafts are draft documents valid for a maximum of
       six months and may be updated, replaced, or obsoleted by
       other documents at any time.  It is inappropriate to use
       Internet-Drafts as reference material or to cite them other
       than as ``work in progress.''


       The list of current Internet-Drafts can be accessed at
       http://www.ietf.org/ietf/1id-abstracts.txt


       The list of Internet-Draft Shadow Directories can be
       accessed at http://www.ietf.org/shadow.html.

Mehra Verma                  Expires 25 August 1999                 [Page i]

Internet Draft    draft-mehra-diffserv-servers-00.txt        25 February 1999

    Abstract


       This draft motivates and presents architectural
       considerations for Differentiated Services [DSARCH]
       support in servers (referred to as DiffServ servers).  We
       discuss possible deployment scenarios for Differentiated
       and Integrated Services, and highlight the benefits of
       supporting DiffServ functions on servers.  We outline the
       requirements for DiffServ-enabled servers and propose a
       policy-based architecture that allows IntServ and DiffServ
       functions to coexist on DiffServ servers.  We then propose a
       DiffServ API that can be used by applications to influence
       installed system policies on an application-specific basis,
       and can serve as a candidate for standardization.  We also
       describe a number of host-specific DiffServ functions that
       can be efficiently supported on servers.

1.  Introduction

    The differentiated services architecture [DSARCH] provides a method
    by which network operators can support different classes of service
    in a network.  The service differentiation is geared towards a model
    where an access device determines the class of service of data
    packets passing through it and changes the Type of Service (TOS) byte
    of the IP header [DSHEAD]. The routers in the network provide support
    for different Per Hop Behaviors (PHBs) (e.g., Assured Forwarding
    [ASREF], Expedited Forwarding [EFREF]) in order to provide different
    classes of service.
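
    As an illustration of the marking step, the sketch below shows how
    a six-bit DS codepoint can be placed in, and recovered from, the
    upper six bits of the former TOS octet, following the layout in
    [DSHEAD].  The function names are hypothetical, and the sample
    codepoint 46 (binary 101110, a value commonly associated with
    Expedited Forwarding) is an assumption used only as an input.

```python
DSCP_BITS = 6      # the DS codepoint is six bits wide
LOW_BITS = 2       # the two low-order bits of the octet are not part of it


def dscp_to_octet(dscp: int, low_bits: int = 0) -> int:
    """Place a DS codepoint in the upper six bits of the TOS/DS octet."""
    if not 0 <= dscp < (1 << DSCP_BITS):
        raise ValueError("DS codepoint must fit in six bits")
    if not 0 <= low_bits < (1 << LOW_BITS):
        raise ValueError("low-order field must fit in two bits")
    return (dscp << LOW_BITS) | low_bits


def octet_to_dscp(octet: int) -> int:
    """Recover the DS codepoint from a TOS/DS octet."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("octet out of range")
    return octet >> LOW_BITS


SAMPLE_DSCP = 0b101110   # assumed sample value (decimal 46)
```

    With these definitions, dscp_to_octet(SAMPLE_DSCP) yields the octet
    value 0xB8, and octet_to_dscp() ignores the two low-order bits.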

    Drafts have been proposed on the architecture of a differentiated
    services access box [DIFFEDGE], and there has been discussion of how
    differentiated services will interoperate with integrated services
    and signaled quality of service (QoS) [DIFFEDGE] [RSVPDIFF].  The
    prevailing model in the working group assumes that differentiated
    services would typically be deployed in the manner shown in Figure 1,
    which shows two stub networks connected by a transit network.  The working
    assumption has been that the stub networks would support integrated
    services while the transit network(s) would support differentiated
    services.


          / Stub   \       /   Transit    \       /  Stub  \
         / Network  \     /    Network     \     /  Network \
  |---| |        |---|   |---|          |---|   |---|        | |---|
  |Tx |-|        |ER1|---|BR1|          |BR2|---|ER2|        |-|Rx |
  |---| |        |---|   |---|          |---|   |---|        | |---|
         \          /     \                /     \          /
          \        /       \              /       \        /


                    Figure 1: Sample Network Configuration

    While the model proposed above has several merits, we believe that
    there is a significant need for a model in which differentiated
    services is extended to the end-host.  This is highly desirable in
    the deployment of servers which need to process a large number of
    connection requests per second.  While differentiated services
    marking by the end-host has been allowed for in the drafts currently
    in the working group, adequate attention has not been given to the
    architecture of the servers that support differentiated services.

    Another important feature overlooked in the current discussion
    on differentiated services is the impact of intervening proxies
    (including application-level gateways/relays) on packet PHB marking.
    Proxies are a common deployment in current networks, for reasons of
    security (e.g., SOCKS proxies), content filtering and caching (HTTP
    proxies), voice call aggregation (H.323 proxies), etc.  Whenever a
    proxy is deployed in the network, differentiated services markings
    done as per the current architecture would be lost and would need to
    be reconstructed.  Moreover, translation of QoS requirements across
    proxies may be necessary to honor network policies for the DiffServ
    domains interconnected by the proxies.  Therefore, it is important
    to develop an architecture model which would support the existence
    of proxies in a network.  When proxies need to be supported in a
    differentiated services network, it is much more efficient to provide
    diffserv marking support as an end-host mechanism rather than as a
    network mechanism.

    The goal of this Internet Draft is to propose a common mechanism to
    extend support for diffserv functions to the end hosts.  The draft
    presents the reasons why we believe extension of diffserv functions
    to the end-host is important, and it outlines an architecture of the
    end-host that can be used towards this purpose.  We also discuss the
    QoS components at end-hosts that may need to be standardized in order
    to provide interoperability among different vendors.

2.  Deployment Scenarios for IntServ and DiffServ

    In accordance with the sample network described in Figure 1,
    the following possible scenarios could be used for deployment
    of QoS features in the network.  The three networks shown in
    the Figure are characterized as IntServ/DiffServ networks.  In a
    practical deployment, each of these networks can consist of multiple
    subnetworks.

    For ease of notation, we call one of the stub networks the client
    network, and the other stub network the server network.
    This is intended to simply identify the networks uniquely.  The
    communicating end-hosts on the stub networks may indeed operate in
    a peer-like relationship rather than a client-server relationship.
    We further assume that each of the network operators would choose to
    offer some flavor of QoS support.

    In the context of the above, the following configurations are
    possible:

        Client Stub Network      Transit Network      Server Stub Network
     1.      DiffServ                DiffServ              DiffServ
     2.      IntServ                 IntServ               IntServ
     3.      IntServ                 DiffServ              IntServ
     4.      DiffServ                IntServ               DiffServ
     5.      DiffServ                IntServ               IntServ
     6.      IntServ                 DiffServ              DiffServ

    Scenario 1 implements diffserv all through the network.  The
    advantage of this approach is that a scalable and uniform QoS
    mechanism is available in the network.  Using policy support
    [POLICY], DiffServ can be enabled in the stub networks without
    modifying existing applications.  However, there is no end-to-end
    signaled QoS available for the applications.

    Scenario 2 implements intserv all through the network.  The
    advantages and limitations of such a deployment are well understood.
    While this approach enables per-flow signaled end-to-end QoS for the
    applications, this can cause significant scaling problems for the
    transit network.

    Scenario 3 is the one described in the [DIFFEDGE] and [RSVPDIFF]
    drafts.  This enables the aggregation of multiple flows in the transit
    network, and reduces the scaling problems.  However, scaling issues
    associated with excessive signaling load on high-end servers (e.g.,
    high-volume web servers or Voice-over-IP servers) would still remain.

    Scenario 4 is in a sense the inverse of Scenario 3, where the stub
    networks support DiffServ while the transit network supports IntServ.
    This may be used in cases where IntServ mechanisms are used to set up
    aggregated reserved-bandwidth tunnels in the transit network, with or
    without the use of tag-switching [MPLS].  The scaling issues at servers
    as well as transit networks are alleviated by the use of aggregation
    and/or DiffServ.  However, the overhead associated with establishment
    and operation of aggregated tunnels could be a potential drawback.

    Scenarios 5 and 6 are mirror images of one another, where the network
    supports IntServ in one half and DiffServ in the other.  At stub
    networks where scalability concerns are paramount, the DiffServ half
    would be more desirable.  However, some stub networks may want to
    allow applications to support signaled QoS which then gets mapped
    into an appropriate diffserv tunnel.

    Each of the scenarios outlined above has its merits and drawbacks.
    We would expect the Internet to support a mix of the above, with
    best-effort networks further complicating the picture during
    initial deployments.  Depending on the deployments of various ISPs,
    organizations may have to cope with one or more of the scenarios
    outlined above.

    In half of these scenarios, we see a need to support DiffServ on the
    end-hosts.  In the next section, we give further justification for
    why such support would be desirable for servers.

3.  Why DiffServ on Servers

    In the Differentiated Services model and architecture
    [DSARCH,DSFRAME], it is often highly desirable for a customer's
    egress node to mark, meter, and/or shape all traffic leaving the
    egress node into a provider's network.  The egress nodes may be
    boundary/access routers providing value-added connectivity between
    the customer and provider networks, or they may be hosts such as
    servers directly attached to the provider's network.  The latter
    scenario of direct-attached hosts often arises when servers (such as
    web content servers or frontend presentation servers) comprise the
    egress nodes of a content provider or e-business vendor.

    Providing QoS functions such as traffic classification, marking,
    metering, and shaping on edge devices such as servers and proxies is
    desirable for the following reasons.  Some of these have been briefly
    mentioned in [DSARCH, DSFRAME] as well.

3.1 Scalability

    Pushing more complex QoS functions to the edge allows realization
    of a simple and highly efficient network core that provides simple
    forwarding and QoS functions.  This in turn enables the network
    core to be highly scalable in terms of resource requirements of
    the network elements and the total amount of traffic they can
    forward successfully.  Providing QoS functions at network servers
    also facilitates simplified packet processing and traffic handling
    at downstream egress nodes, since pre-marked and pre-conditioned
    traffic is likely to consume fewer buffer, processing, and bandwidth
    resources.  In addition, unlike other edge devices, which must
    explicitly create and maintain per-flow state, servers and proxies
    already maintain per-flow state (a flow being a TCP connection or UDP
    session) in the form of sockets and protocol control blocks.

    Moreover, a network edge device must invest substantial per-flow
    resources to manage traffic that is in violation of specified
    traffic profiles, either to discard such packets (drop) or hold
    back out-of-profile packets in local buffers (shape).  Generating
    backpressure by notifying the offending traffic source may often be
    too expensive or slow, and may require mechanisms in the source to
    honor such notifications anyway, so as to guard against applications
    that ignore such notifications or fail to adapt gracefully.  Instead,
    a server or proxy implementing QoS functions can simply block the
    violating application(s), generate local notifications for the
    application immediately following a traffic violation, or return
    transmission failure indications for uncooperative applications.

    An added advantage of performing elaborate QoS functions at
    servers is the opportunity to avoid incorrect packet classification
    at network nodes due to fragmentation of IP packets in the
    network.  Multi-Field (MF) classifiers [DSARCH] base their traffic
    classification decisions on the contents of transport-layer header
    fields.  Since only the first fragment of a packet carries the packet
    header, and hence the transport-layer header, MF classifiers at
    network nodes may not correctly classify packet fragments following
    the first fragment of a packet.  As pointed out in [DSARCH],
    maintaining fragmentation state at these network nodes is not a
    general solution due to fragment reordering at upstream nodes or
    divergent routing paths.  In the absence of a well-established
    policy for handling fragmented packets, marking packets at the
    traffic source allows fragments to carry a common mark (assuming
    fragmentation, wherever it occurs, either does not re-mark the
    packet or re-marks all fragments of a packet).  Behavior Aggregate
    (BA) classifiers use only the contents of the incoming DS field of a
    packet to do the correct
    classification.  Packet marking at the source makes it possible to
    use BA classifiers correctly with packet fragments.
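
    The difference between the two classifier types on fragmented
    traffic can be sketched as follows.  This is a simplified model
    rather than a real IP implementation: the Packet type, the
    fragment() helper, and the class tables are hypothetical, and the
    fragment index stands in for the real fragment offset.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    protocol: str
    dscp: int                        # DS codepoint carried in the IP header
    frag_index: int = 0              # 0 for a whole packet or a first fragment
    dst_port: Optional[int] = None   # transport field; absent in later fragments


def fragment(pkt: Packet, pieces: int) -> List[Packet]:
    """Split a packet; only the first fragment keeps the transport header."""
    return [
        Packet(pkt.src, pkt.dst, pkt.protocol, pkt.dscp,
               frag_index=i, dst_port=pkt.dst_port if i == 0 else None)
        for i in range(pieces)
    ]


def mf_classify(pkt: Packet, port_to_class: Dict[int, str]) -> Optional[str]:
    """Multi-Field classifier keyed on a transport-layer field."""
    if pkt.dst_port is None:
        return None                  # later fragment: nothing to match on
    return port_to_class.get(pkt.dst_port)


def ba_classify(pkt: Packet, dscp_to_class: Dict[int, str]) -> Optional[str]:
    """Behavior Aggregate classifier keyed only on the DS field."""
    return dscp_to_class.get(pkt.dscp)
```

    For a source-marked packet split into three fragments, mf_classify()
    recognizes only the first fragment, while ba_classify() assigns all
    three to the same class because each fragment carries the mark.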

3.2 Implementation Efficiency

    Supporting DiffServ functions on servers has the additional benefit
    of efficient classification, marking, and aggregation of flows.
    Since servers are typically the endpoints of TCP connections, they
    can classify and aggregate traffic at connection establishment time
    and exploit per-connection state to subsequently classify packets
    traversing these connections very efficiently.  Similar mechanisms
    enable the server to efficiently generate DS codepoints (derived from
    the policy rule and/or the incoming TOS marking on the connection
    request (Section 6.3)) during connection classification, and mark
    packets subsequently.

    The same benefits apply to connected UDP traffic, i.e., where
    the application issues an explicit connect to send UDP traffic to
    a single destination.  For unconnected UDP traffic, per-packet
    classification (similar to that performed at network elements) may be
    necessary if each packet is sent to a different destination.  Section
    4 discusses additional benefits of supporting DiffServ functions on
    servers.
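
    A minimal sketch of this idea follows, assuming a hypothetical
    ConnectionTable that stands in for the host's sockets and protocol
    control blocks.  The policy lookup runs once per connection, and
    the per-packet path is a single table lookup.  All names here are
    illustrative and are not part of any existing stack.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

# (local address, local port, remote address, remote port)
FlowKey = Tuple[str, int, str, int]


@dataclass
class ConnectionTable:
    """Per-connection state, standing in for protocol control blocks."""
    classify: Callable[[FlowKey], int]    # policy lookup; returns a DS codepoint
    dscp_by_flow: Dict[FlowKey, int] = field(default_factory=dict)
    policy_lookups: int = 0

    def on_connect(self, flow: FlowKey) -> None:
        """Run the policy lookup once, at connection establishment."""
        self.policy_lookups += 1
        self.dscp_by_flow[flow] = self.classify(flow)

    def mark(self, flow: FlowKey) -> int:
        """Per-packet path: one dictionary lookup, no rule matching."""
        return self.dscp_by_flow[flow]

    def on_close(self, flow: FlowKey) -> None:
        """Discard the cached codepoint when the connection goes away."""
        self.dscp_by_flow.pop(flow, None)
```

    Unconnected UDP traffic would not benefit from this cache, since
    each datagram may name a different destination and would need its
    own policy lookup.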

3.3 Specificity and Granularity of Packet Marking

    Marking of packets at the traffic source has a number of key
    advantages over marking of packets at ingress nodes other than the
    traffic source.  A traffic source can not only utilize the network
    level information present in packet headers, it can also refer to
    the per-flow state already available at the traffic source.  While
    much of this state is available to network nodes as well (via packet
    headers), the use of volatile ports by applications and use of
    dynamic IP addresses (e.g., using DHCP) by client hosts makes it
    difficult and prohibitively expensive, if not impossible, to perform
    accurate traffic classification.  More importantly, the traffic
    source can classify and mark traffic based on additional information
    available only at the traffic source.  This "local" information
    includes one or more of the following:

      - the type of application or service

      - application-specific preferences

      - specific users (e.g., those using a given application) or
        subscribers of a service

      - specific groups to which users, applications, or services belong

      - security associations or privileges

    and other attributes derived from these.


    It may not be possible or desirable to export this information from
    the traffic source due to security and privacy concerns.  Even if
    such information could be exported, doing so would require mechanisms
    to (a) communicate the information to the appropriate ingress nodes,
    (b) store the information in a suitable format at these network
    nodes, and (c) employ very intelligent classifiers to classify
    traffic appropriately.  This approach requires that all information
    be carried or encoded in a structured manner in packet headers
    or the packet payload, which is not always feasible, or requires
    frequent out-of-band signaling, which is clearly not desirable for
    scalability.  Moreover, given the unstructured nature of the local
    information, maintaining and looking up such state is likely to be
    very resource-intensive for egress or ingress nodes, further limiting
    scalability.

    It seems reasonable, therefore, for the traffic source to perform
    elaborate traffic classification based on a wide range of local
    and network level information attributes, and use appropriate DS
    codepoints for the set of service classes supported.  While DS
    codepoints can only support a limited range of distinguishable
    service classes, the range is sufficiently large to allow the traffic
    source to realize a fine granularity of service differentiation.
    This seems to be a reasonable compromise between very fine-grain
    service differentiation (the extreme being per-flow traffic handling)
    and the complexity of QoS mechanisms at the network egress and
    ingress nodes.
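
    One way to picture source-based classification on local
    information is a first-match rule table that maps host-only
    attributes to a service class, and the class to a DS codepoint.
    The sketch below is an assumption-laden illustration: the
    attribute names, rule format, class names, and codepoint values
    are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class FlowContext:
    """Attributes known only at the traffic source (hypothetical names)."""
    application: str
    user: str
    group: str
    authenticated: bool


@dataclass(frozen=True)
class Rule:
    service_class: str
    application: Optional[str] = None   # None matches any value
    user: Optional[str] = None
    group: Optional[str] = None
    require_auth: bool = False

    def matches(self, ctx: FlowContext) -> bool:
        return ((self.application is None or self.application == ctx.application)
                and (self.user is None or self.user == ctx.user)
                and (self.group is None or self.group == ctx.group)
                and (not self.require_auth or ctx.authenticated))


def select_dscp(ctx: FlowContext, rules: List[Rule],
                class_to_dscp: Dict[str, int],
                default_class: str = "best-effort") -> int:
    """First matching rule wins; unmatched flows get the default class."""
    for rule in rules:
        if rule.matches(ctx):
            return class_to_dscp[rule.service_class]
    return class_to_dscp[default_class]
```

    None of the matched attributes leaves the host; only the resulting
    codepoint is visible to the network.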

    We note that source-based packet marking does not preclude policing
    (and possible re-marking) at the network egress in order to check
    that policies are being enforced correctly.

3.4 IPsec Tunnels and Encryption

    The use of encrypted tunnels, e.g., IPsec tunnels, precludes packet
    classification based on layer 4 and higher layer information, by
    encrypting the relevant fields in the packet payload.  If IPsec
    encryption is used on an end-to-end basis, nodes or network elements
    at the edge (but not the source) or in the core of the network would
    be unable to classify traffic based on packet headers or the payload,
    and apply the necessary traffic conditioning rules specified.  In
    this scenario, being the endpoint of an end-to-end IPsec tunnel,
    only the traffic source has the capability to classify and condition
    outgoing traffic, via one or more QoS functions such as marking,
    metering, and shaping.

    This is because the source can perform traffic classification and
    conditioning *before* performing IPsec encryption.  This has an
    additional advantage.  For traffic conditioners such as droppers,
    which drop traffic in excess of the specified traffic profile, the
    source need not incur the overhead of IPsec encryption.  Without
    source-based traffic classification and conditioning, excess traffic
    that would otherwise be dropped at downstream egress or ingress nodes
    would still be encrypted and transmitted by the source.
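
    The ordering argument can be sketched as a sender that meters and
    drops before it encrypts.  A token bucket is used here as one
    common metering technique; this draft does not prescribe one.  The
    class names are hypothetical, and the encrypt callable is only a
    placeholder for the IPsec transform.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TokenBucket:
    """Meter: `rate` bytes per second sustained, `burst` bytes of depth."""
    rate: float
    burst: float
    tokens: float = 0.0
    last: float = 0.0

    def __post_init__(self) -> None:
        self.tokens = self.burst          # start with a full bucket

    def conforms(self, size: int, now: float) -> bool:
        """Refill for the elapsed time, then try to spend `size` tokens."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False


class Sender:
    def __init__(self, meter: TokenBucket,
                 encrypt: Callable[[bytes], bytes]) -> None:
        self.meter = meter
        self.encrypt = encrypt            # placeholder for the IPsec transform
        self.encrypted = 0
        self.dropped = 0

    def send(self, payload: bytes, now: float) -> Optional[bytes]:
        # Meter first, so out-of-profile packets never reach the cipher.
        if not self.meter.conforms(len(payload), now):
            self.dropped += 1
            return None
        self.encrypted += 1
        return self.encrypt(payload)
```

    In this arrangement the encryption count equals the number of
    in-profile packets, whereas encrypting first would spend cipher
    work on packets that a downstream dropper later discards.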

4.  Requirements and Architecture for DiffServ Enabled Servers

    An architecture for diffserv enabled servers must satisfy the
    following requirements:

      - It should allow existing networked applications to be classified
        into a PHB other than best effort.

        Many existing networked applications (e.g., databases,
        transaction services, CICS, MQSeries, SAP) and middleware
        technologies (e.g., Java RMI, CORBA IIOP, HTTP) have been
        designed to run over best effort networks.  During the initial
        deployment of differentiated services, it is unlikely
        that these applications would be modified to generate packets
        with appropriate code points in their TOS marking.  However, it
        is likely that these services would need to be given better
        performance than other classes of traffic in the network.  Thus,
        an architecture for diffserv servers must be able to support
        such applications without requiring them to be modified.

      - It should allow new applications to request the desired service
        level without a signaling protocol.

        The differentiated services architecture is based on a
        non-signaled approach whereby different classes of service are
        obtained by means of bilateral service level specifications
        (SLS).  A new application must be able to express its preferences
        with respect to the type of network performance it wants to
        receive.  The server must then use these preferences to determine
        which network PHB or SLS to use for specific applications.  The
        server must not rely on the existence of a signaling mechanism to
        request a specific service level from the network.

      - It should allow policy decisions to be applied uniformly across
        existing applications as well as new applications that signal the
        desired service level.  Similar policy enforcement should occur
        for applications that use a signaled interface (e.g., RAPI) to
        signal their performance needs.

        Since both types of applications would require access to the same
        network resources, they should be controlled and administered by
        the same policy information.

      - It should allow differentiated service support to coexist with
        secure communications using IPsec or SSL.

        Secure communications using IPsec or SSL is a reality in current
        Internet communications.  Since IPsec encrypts transport layer
        headers, the differentiated services architecture must perform
        its classification functions prior to the IPsec functions.  SSL
        offers no hindrance as long as classification is based solely on
        the contents of transport and network layer headers.

      - It should provide the functions needed to conform to the rate
        requirements enforced by network layer SLSs.

        The end-host must not violate the terms of the contracts
        negotiated with the network provider.  It must put into place
        appropriate mechanisms to ensure that applications comply with
        network level SLSs.

      - It must coexist with the support for integrated services in the
        end-host.

        Integrated Services and Differentiated Services offer a
        complementary set of QoS features, which may each be most
        appropriate to meet a different set of application requirements.
        Both methods should coexist in the end-host.

    An architecture of a server that implements native differentiated
    services support to meet the above requirements is shown in Figure 2.

    The differentiated services features at a server are implemented
    in the kernel by means of the data path.  The data path enforces
    the functions in accordance with the diffserv traffic conditioning
    specifications (TCSs).  All connections at the servers are classified
    into different classes.  Each class can be mapped onto a given
    PHB in the network, a maximum amount of outbound traffic, a maximum
    amount of inbound traffic, as well as maximum amounts of buffers that
    can be used by connections belonging to this class of service.
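
    As a rough illustration, the per-class attributes listed above
    could be held in a record such as the following.  The field names,
    units, and buffer accounting are assumptions made for this sketch
    and are not defined by this draft.

```python
from dataclasses import dataclass


@dataclass
class ServiceClass:
    name: str
    phb: str                   # PHB the class maps onto in the network
    max_outbound_bps: int      # limit on outbound traffic (assumed unit)
    max_inbound_bps: int       # limit on inbound traffic (assumed unit)
    max_buffers: int           # buffers usable by all connections in the class
    buffers_in_use: int = 0

    def try_reserve_buffers(self, count: int) -> bool:
        """Grant buffers only while the class stays within its cap."""
        if count < 0 or self.buffers_in_use + count > self.max_buffers:
            return False
        self.buffers_in_use += count
        return True

    def release_buffers(self, count: int) -> None:
        """Return buffers to the class, never dropping below zero."""
        self.buffers_in_use = max(0, self.buffers_in_use - count)
```

    Because the limits belong to the class and not to a connection, a
    burst on one connection can exhaust the buffers available to other
    connections in the same class, but not those of another class.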


          +-------------+                            +-------------+
          | RSVP-enabled|                            | QoS-Unaware |
          | Application |                            | Application |
          +-------------+                            +-------------+
                |                                          
                |                                     
                |                                   
             +--------------+                 +-------------+
             | RSVP Agent   |                 | Application |
             +--------------+                 +-------------+
                  |                                   |
                  |-----------------------------------+
                  |                   | DiffServ API 
                  |                   |
                  |            +---------------+
                  |            | DiffServ Cfg  |                 
                  |            | (Policy Agent)|<---->   Configuration/Policy 
                  |            |               |                 
                  |            +---------------+                   
                  |         
                  |                    +------------+
                  |                    | Monitoring |<------------> SNMP
                  |                    +------------+
             +-----------+                   |
             |   Data    |                   |
             |   path    |-------------------+
             +-----------+

       Figure 2. Architecture of DiffServ Enabled Server


    At the application level, access to differentiated services is
    controlled by means of a diffserv configuration agent.  Under
    normal operation, this configuration would be driven by
    policies defined in a configuration file or a policy repository.
    Thus, the diffserv configuration agent is also the policy agent for
    the server.

    Applications may or may not interact with the policy agent.  In
    the initial deployments and for the vast majority of applications,
    applications would not be aware of the differentiated services
    capability.  These applications should still be able to get better
    than best effort service by means of the policy agent and policy
    configurations.

    Some applications may wish to control their own choice of the classes
    of service they request.  These applications should be able to
    interact with the policy agent using an API to determine the menu of
    available differentiated service classes and to select one of them.
    It would be desirable for this interface (API) to be a standard,
    so that applications could be developed to run on multiple vendor
    platforms.

    In order to coexist with the integrated services support, the RSVP
    daemon on the end-host must interact with the differentiated services
    configuration agent (or the policy agent) to coordinate the resource
    sharing and data path configuration.  Both the RSVP agent and the
    policy agent may communicate with the data path, provided the
    configuration information coming from them is consistent.

    The functions along the data path would be the same as the ones
    identified in the [DIFFEDGE] draft.  As mentioned in Section 3,
    the server host may exploit additional local information when
    implementing DiffServ functions, either to improve functionality
    or efficiency or both.  Section 6 discusses several host-specific
    functions that are natural candidates for DiffServ servers.

5.  The DiffServ API

    In order to permit inter-operation among different vendor
    applications, we would like to develop a standard API that would
    enable applications to specify their Differentiated Services
    requirements.  In this section, we sketch the structure of such an
    API.

    One may question whether such an API is necessary, or whether
    existing application interfaces to set the TOS byte on a connection
    are adequate for this purpose.  We believe that simply marking the
    TOS byte is an inadequate solution, since applications cannot
    determine in advance how DiffServ PHBs map to specific TOS encodings
    and therefore cannot make the appropriate calls.  Moreover, even
    if this were possible, application requests for specific PHB
    assignments should be validated against configured policies in
    the system.  Furthermore, an application may opt to select from
    a choice of different PHBs that are available to it within a
    specific environment.  The determination of these PHBs needs to be
    standardized in order to promote inter-vendor interoperability.
    Additionally, if DiffServ classification is to be performed on the
    basis of application-specific information (e.g., user preferences),
    applications using DiffServ services need a mechanism to communicate
    classification requests to the policy agent, which would validate
    these requests against configured policies in the system.  The policy
    agent may then derive appropriate filters to be installed in the data
    path.  Therefore, we believe that a standardized DiffServ API is
    highly desirable in the industry.
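
    The validation step argued for above can be sketched as follows.
    This is not the API proposed in this draft: the PolicyAgent class,
    its method names, and the policy tables are hypothetical, and serve
    only to show a request being checked against configured policy
    before a filter is installed in the data path.

```python
from typing import Dict, Iterable, List, Set, Tuple

FlowKey = Tuple[str, int, str, int]


class PolicyError(Exception):
    """Raised when a request is not permitted by configured policy."""


class PolicyAgent:
    def __init__(self, class_to_dscp: Dict[str, int],
                 allowed_by_app: Dict[str, Iterable[str]]) -> None:
        # Local mapping from service class to DS codepoint; applications
        # never see or choose the codepoint directly.
        self.class_to_dscp = dict(class_to_dscp)
        self.allowed_by_app: Dict[str, Set[str]] = {
            app: set(classes) for app, classes in allowed_by_app.items()
        }
        # Filters as they would be installed in the data path.
        self.filters: Dict[FlowKey, int] = {}

    def available_classes(self, app: str) -> List[str]:
        """Service classes this application may choose from."""
        return sorted(self.allowed_by_app.get(app, set()))

    def request_class(self, app: str, flow: FlowKey, service_class: str) -> int:
        """Validate a request against policy, then install a filter."""
        if service_class not in self.allowed_by_app.get(app, set()):
            raise PolicyError(f"{app} may not use {service_class}")
        dscp = self.class_to_dscp[service_class]
        self.filters[flow] = dscp
        return dscp
```

    An application that set the TOS byte directly would bypass both the
    policy check and the local class-to-codepoint mapping shown here.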

    The abstraction offered by the API would consist of different Service
    Classes that are available to the application, and the ability to
    select one of the specific classes for its connections.  A key
    requirement for the API is that it must support IPv4 as well as
    IPv6 traffic.  The general API calls would consist of the following
    routines:


     -  Register:  The registration call would create a handle for an
        application to communicate with the DiffServ Configuration Agent
        (Policy Agent).  Its input arguments would specify actions to
        be invoked in case of exceptions, e.g., a callback function to
        denote failure of network resources.  The output arguments would
        include a handle to use for future communications with the agent.

     -  UnRegister:  This would release any context or state maintained
        in the DiffServ configuration associated with the specific
        application.  The input argument would include the handle
        returned from a prior registration call.

     -  ListServiceClasses:  This would obtain a list of all service
        classes that are supported at the host.  The input argument would
        consist of a handle, while the output argument would consist of a
        list of service class names to be returned to the application.

     -  CheckServiceClass:  This would determine the name of the matching
        service class that would be most appropriate for a specific
        application or a socket.  The input arguments would include the
        attributes that identify the application or the socket, and the
        output argument would specify the name of the corresponding
        service class that would be provided to the application as per
        the existing configured policies.

     -  PutinServiceClass:  This function would find the name of the
        matching service class for a traffic flow, and configure the data
        path tables so that packets on that specific traffic flow would
        be mapped to the specific service class.  A traffic flow may be
        identified by specifying the local and remote end-points of a
        socket.

     -  RemovefromServiceClass:  This function would remove a specific
        traffic flow from the service class.  The flow must have been put
        into the specific service class either from default configuration
        or from the use of the PutinServiceClass function.  The flow
        would then be mapped to the default service class in the system.

     -  FindServiceDetails:  This function would obtain the specific
        details of a service class.  The input argument would include
        the name of the service class, and the output argument would
        include the description of the service class.  The details would
        include information about the maximum rate allowed on the service
        class, the maximum number of connections allowed in the service
        class, the PHB to which the service class is mapped, and can also
        include information about expected or measured performance of
        traffic flows that belong to the specified service class.
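
    The routines above can be illustrated with a minimal in-memory
    sketch in Python.  This draft does not define a language binding,
    so every name below (PolicyAgent, ServiceClass, the method names,
    and the port-to-class form of the policy table) is a hypothetical
    choice made for illustration only.  A real agent would also install
    filters in the data path rather than update a dictionary.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# A flow is identified by the local and remote end-points of a socket:
# (local address, local port, remote address, remote port).
Flow = Tuple[str, int, str, int]


@dataclass(frozen=True)
class ServiceClass:
    """Details returned by FindServiceDetails (hypothetical field names)."""
    name: str
    phb: str               # PHB to which the class is mapped
    max_rate_kbps: int     # maximum rate allowed on the class
    max_connections: int   # maximum number of connections in the class


class PolicyAgent:
    """Illustrative stand-in for the DiffServ Configuration Agent."""

    def __init__(self, classes: Dict[str, ServiceClass],
                 port_policy: Dict[int, str],
                 default_class: str = "BestEffort") -> None:
        self.classes = classes
        # Assumed policy form: local (server) port -> service class name.
        self.port_policy = port_policy
        self.default_class = default_class
        self.flow_table: Dict[Flow, str] = {}   # stands in for data path tables
        self._callbacks: Dict[int, Optional[Callable[[str], None]]] = {}
        self._next_handle = 1

    def _require(self, handle: int) -> None:
        if handle not in self._callbacks:
            raise KeyError(f"unknown handle: {handle}")

    def register(self, on_failure: Optional[Callable[[str], None]] = None) -> int:
        """Register: store the exception callback and return a new handle."""
        handle = self._next_handle
        self._next_handle += 1
        self._callbacks[handle] = on_failure
        return handle

    def unregister(self, handle: int) -> None:
        """UnRegister: release the state kept for this handle."""
        self._require(handle)
        del self._callbacks[handle]

    def list_service_classes(self, handle: int) -> List[str]:
        """ListServiceClasses: names of all classes supported at the host."""
        self._require(handle)
        return sorted(self.classes)

    def check_service_class(self, handle: int, flow: Flow) -> str:
        """CheckServiceClass: class the configured policy assigns to a flow."""
        self._require(handle)
        return self.port_policy.get(flow[1], self.default_class)

    def put_in_service_class(self, handle: int, flow: Flow) -> str:
        """PutinServiceClass: find the matching class and record the mapping."""
        name = self.check_service_class(handle, flow)
        self.flow_table[flow] = name
        return name

    def remove_from_service_class(self, handle: int, flow: Flow) -> None:
        """RemovefromServiceClass: map the flow back to the default class."""
        self._require(handle)
        self.flow_table[flow] = self.default_class

    def find_service_details(self, handle: int, name: str) -> ServiceClass:
        """FindServiceDetails: description of the named service class."""
        self._require(handle)
        return self.classes[name]
```

    In this sketch an application would call register() once, use the
    returned handle for every later call, and call unregister() when
    it no longer needs DiffServ treatment.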



    The above API would be used by policy-aware applications to configure
    their DiffServ-specific needs.  Of course, many applications can
    obtain their DiffServ classification by means of configuration
    alone, without invoking a specific API.  The policy agent would
    configure the data path tables, and no change to existing
    applications would be needed.  Applications with well-known server
    ports, or a configurable range of server ports, can use this scheme.
    Many existing applications already use a configurable range of
    server ports in order to facilitate firewall traversal.
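
    As a rough illustration of configuration-only classification, the
    Python sketch below maps a local server port to a service class
    using a list of port ranges.  The rule format, the function name
    classify_by_port, and the class names are assumptions made for
    this example; the draft does not prescribe a configuration syntax.

```python
from typing import List, Tuple

# Hypothetical rule format: (lowest port, highest port, service class name).
PortRule = Tuple[int, int, str]


def classify_by_port(rules: List[PortRule], local_port: int,
                     default: str = "BestEffort") -> str:
    """Return the class of the first rule whose range contains local_port."""
    for low, high, name in rules:
        if low <= local_port <= high:
            return name
    # No configured rule matched, so the flow gets the default class.
    return default


# Example configuration: one well-known port and one configurable range.
RULES: List[PortRule] = [
    (80, 80, "Gold"),
    (5000, 5099, "Silver"),
]
```

    With these example rules, a connection accepted on port 5010 would
    be placed in the "Silver" class without any change to the
    application itself.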

6.  Host-Specific DiffServ Functions

    Differentiated Services can be realized efficiently at servers
    via appropriate placement of QoS functions in the protocol stack
    (referred to as the QoS Module).  In addition to the QoS functions
    listed in [DIFFEDGE], DiffServ servers may also implement the
    following functions that enhance a server's ability to provide
    service differentiation while ensuring fairness.

6.1 Fair Flow Aggregation

    As mentioned in Section 3.1, implementing Differentiated Services
    functions on servers offers significant advantages in terms of
    network scalability.  This is because servers can aggregate flows
    together based on policies or SLSs.  Policy rules or SLSs for
    Differentiated Services typically apply to sets of flows (referred
    to as aggregates).  For example, a single policy rule may govern
    all traffic flowing to a particular subnet or destination.  This
    allows the network core to be relatively simple and streamlined with
    significantly reduced processing and memory requirements.

    While regulating all aggregated traffic as per the specified TCS,
    the server may wish to ensure that traffic from each flow belonging
    to the aggregate is treated fairly, e.g., obtains a fair share of
    the specified aggregate bandwidth on the locally attached link.
    For a TCS specifying traffic policing or shaping, this means that
    in-profile traffic from each flow receives a fair share of the
    available link bandwidth.  For a TCS specifying traffic policing with
    packet marking, on average the same number of packets of each flow
    would be marked with the specified in-profile or out-of-profile DS
    codepoints.

    To maintain fairness or any other sharing policy, per-flow state
    is needed for each flow belonging to the aggregate.  Hosts (e.g.,
    servers) already maintain complete per-flow state at the socket and
    transport layers in the form of data sockets and protocol control
    blocks.  While strictly fair division of available local bandwidth
    (i.e., on the attached link or interface) may suffice for UDP
    traffic, fair allocation for TCP flows involves compensating for
    different congestion levels and round-trip times experienced by
    individual connections.  The information available in the per-flow
    state at the server makes it possible to devise effective policies
    and mechanisms for fairness.  Moreover, as described earlier, with
    minimal modifications the per-flow state can greatly facilitate
    traffic classification and policy rule association.
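
    One simple sharing policy is max-min fairness over the aggregate
    rate, sketched below in Python.  This is only an illustration of
    strictly fair division, which the text above suggests may suffice
    for UDP traffic.  It does not attempt the compensation for
    congestion levels and round-trip times that TCP flows would need.
    The function name and the notion of a per-flow "demand" are
    assumptions introduced for the example.

```python
from typing import Dict


def max_min_fair_share(aggregate_rate: float,
                       demands: Dict[str, float]) -> Dict[str, float]:
    """Divide aggregate_rate among flows using max-min fairness.

    Flows are served in order of increasing demand.  Each flow receives
    the smaller of its demand and an equal split of the bandwidth that
    is still unallocated, so capacity left unused by small flows is
    redistributed to the larger ones.
    """
    allocation: Dict[str, float] = {}
    remaining = aggregate_rate
    pending = sorted(demands.items(), key=lambda item: item[1])
    while pending:
        equal_share = remaining / len(pending)
        flow, demand = pending.pop(0)
        granted = min(demand, equal_share)
        allocation[flow] = granted
        remaining -= granted
    return allocation
```

    For example, with an aggregate rate of 12 units and demands of 2,
    4, and 10 units, the three flows would receive 2, 4, and 6 units
    respectively.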

6.2 Marking of SYN-ACKs

    A server supporting Differentiated Services is expected to mark
    traffic originating on each flow governed by an active policy rule
    or SLS enabled at the server.  However, for TCP connections, since
    connection establishment packets compete with regular data packets
    for network resources, the server should ensure that the SYN-ACK
    generated on receipt of a new connection request (i.e., a SYN packet)
    is marked as per the policy applicable to the new connection.  This
    further implies that the server may need to classify a connection, to
    determine its policy or SLS association, on receipt of a SYN packet,
    and not upon completion of the 3-way handshake that fully establishes
    a TCP connection.

    Note that this implies that the server must invest processing cycles
    classifying a partially established connection, which might get
    aborted subsequently.  However, since the server has already accepted
    the new connection request for further processing (having found
    space in the partial listen queue), it is desirable to treat the
    generated SYN-ACK packet at the same or even higher service level
    than that specified for data packets on the connection.  This reduces
    the likelihood that the processing cycles invested by the server in
    handling the new connection request would be wasted due to congestion
    at the nearest bottleneck router (or even the output queues at the
    selected outgoing interface at the server).

6.3 Outbound Packet Marking using Inbound TOS

    It might be the case that an incoming packet at the server has been
    marked already, perhaps by the client or the upstream ingress node.
    For example, a SYN packet may be marked to encode the desired service
    level for the new connection request.  Similarly, ACKs for data
    packets may be marked to direct them to an appropriate service class.
    In these cases the server may choose to derive the service level
    (i.e., DS codepoint) for packets sent on behalf of the new connection
    (including the SYN-ACK), from the DS codepoint specified in the
    applicable policy rule or SLS and the DS codepoint carried in the
    incoming packet.

    In contemporary protocol stacks, TCP sets the TOS field in the packet
    from the value stored in the corresponding protocol control block
    just before handing the packet to the IP layer.  In order for a
    DiffServ data-path to generate the correct DS codepoint, it must be
    able to examine the DS codepoint associated with the incoming packet
    before any packet is output on the connection.  For incoming SYN
    packets, this can be achieved by storing the packet's DS codepoint in
    the corresponding TCP protocol control block before passing control
    to the QoS Module for classification.  The datapath calculates the
    new DS codepoint after classifying the packet to determine the
    associated policy rule or SLS, and overwrites the value stored in
    the TCP PCB with the new value.  TCP's packet output routine would
    automatically use the new value in the TOS field of outbound packets.
    A similar approach works with incoming ACK packets, except that no
    explicit classification need be performed.
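
    The SYN handling described above can be sketched in Python as
    follows.  The draft does not say how the policy codepoint and the
    inbound codepoint are combined, so the rule used here (honor the
    inbound codepoint only if the policy rule lists it, otherwise use
    the policy's own codepoint) is purely an assumption.  The names
    ProtocolControlBlock, PolicyRule, and on_syn are likewise
    hypothetical.  The codepoint is taken to occupy the upper six bits
    of the former TOS byte, as defined in [DSHEAD].

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass
class ProtocolControlBlock:
    """Minimal stand-in for a TCP PCB; only the stored TOS byte matters."""
    local_port: int
    tos: int = 0   # value TCP output copies into the TOS field


@dataclass(frozen=True)
class PolicyRule:
    dscp: int                                       # codepoint set by policy
    honored_inbound: FrozenSet[int] = frozenset()   # assumed policy attribute


def dscp_of(tos: int) -> int:
    """Extract the 6-bit DS codepoint from a TOS/DS byte."""
    return (tos >> 2) & 0x3F


def ds_byte(dscp: int) -> int:
    """Place a 6-bit DS codepoint in the upper bits of a TOS/DS byte."""
    return (dscp & 0x3F) << 2


def on_syn(pcb: ProtocolControlBlock, inbound_tos: int,
           rule: PolicyRule) -> int:
    """Store the inbound value, classify, then overwrite the PCB value."""
    # Step 1: record the incoming packet's TOS byte in the PCB.
    pcb.tos = inbound_tos
    # Step 2: the data path derives the new codepoint from the policy
    # rule and the inbound codepoint (assumed combination rule).
    inbound = dscp_of(pcb.tos)
    chosen = inbound if inbound in rule.honored_inbound else rule.dscp
    # Step 3: overwrite the PCB so that TCP output, including the
    # SYN-ACK, picks up the new value automatically.
    pcb.tos = ds_byte(chosen)
    return pcb.tos
```

    Because the PCB is overwritten before any packet is sent, the
    SYN-ACK and all later segments on the connection would carry the
    derived codepoint without further changes to TCP's output routine.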

    Support for this feature is desirable in order to preserve DiffServ
    encodings across application-level proxies deployed within a
    DS-domain.  The security implications of this feature are discussed
    in Section 7.

6.4 Inbound Connection Rate Control

    In addition to marking, policing, and shaping outbound traffic, a
    server may exercise control over the rate of incoming TCP connection
    requests.  This would effectively ``police'' incoming SYN packets to
    the desired rate of connections, while discarding the rest.  Similar
    to the handling of outbound packets, policy rules governing incoming
    connection requests may specify filters to selectively police client
    connection requests.  Inbound connection rate control may help in
    controlling the amount of server resources consumed by a service or
    application on behalf of a particular client.
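
    A token bucket is one common way to police a stream of events to a
    target rate, and the Python sketch below applies it to incoming
    SYN packets.  The draft does not mandate a particular policing
    algorithm; the token bucket, the class name SynPolicer, and its
    parameters are assumptions chosen for illustration.  Time is
    passed in explicitly so that the example is deterministic.

```python
class SynPolicer:
    """Token-bucket policer for incoming connection requests."""

    def __init__(self, rate: float, burst: float, start: float = 0.0) -> None:
        self.rate = rate        # admitted connections per second
        self.burst = burst      # maximum number of stored tokens
        self.tokens = burst     # the bucket starts full
        self.last = start       # time of the previous update, in seconds

    def admit(self, now: float) -> bool:
        """Return True if a SYN arriving at time `now` should be accepted."""
        # Refill in proportion to the elapsed time, capped at the burst size.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0   # each admitted SYN consumes one token
            return True
        return False             # out of profile: the SYN is discarded
```

    A server could keep one such policer per policy rule, selected by
    the filters mentioned above, so that each client or client group
    is policed independently.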

6.5 Multicast Issues

    Since all TCP transfers are unicast in nature, the multicast-related
    issues in the context of server support for Differentiated Services
    arise for UDP traffic.  UDP traffic sent to a multicast address is
    sent to one or more adjacent routers on the multicast distribution
    tree rooted at the server.  Since this function is performed at the
    IP layer, the actual number of multicast packets transmitted and the
    interfaces involved may not be known to the QoS Module (which may be
    integrated with the socket and transport layers).  For a given policy
    rule, if the same bandwidth is available at each of the interfaces,
    multicast packets do not pose any problems for the QoS functions.
    However, if the bandwidth available on different interfaces is not
    the same, applying QoS functions correctly on multicast packets would
    be difficult.  We note that the treatment of multicast packets in the
    Differentiated Services framework remains an open issue.

7.  Security Considerations

    DiffServ servers should provide appropriate mechanisms to guard
    against denial-of-service attacks, especially theft of service.
    While various solutions exist to defend against denial-of-service
    attacks, the inbound connection rate control function described
    earlier may help limit the severity of such an attack for specific
    applications.  Theft of service can occur if the server uses the
    inbound TOS on a SYN packet or an ACK packet to derive the outbound
    TOS for SYN-ACK packets and data packets, respectively.  A malicious
    attacker could steal an excessive proportion of the resources at a
    given type of service (such as EF) by marking SYN packets with the
    corresponding DS codepoint.  To prevent such theft of service, a
    DiffServ server should ensure that control packets (such as SYN-ACK
    packets) carrying DS codepoints are also subject to appropriate
    policy-based traffic conditioning functions.

Acknowledgments

    The authors would like to acknowledge the helpful comments and
    suggestions of the following individuals:  Tsipora Barzilai, Mandis
    Beigi, Brian Carpenter, Zane Dodson, Edward Ellesson, Ray Jennings,
    Dilip Kandlur, Arvind Krishna, Vinod Peris, John Tavs, Renu Tewari
    and Ken White.

References


    [AFREF]  J. Heinanen et al., ``Assured Forwarding PHB Group'',
         Internet Draft draft-ietf-diffserv-af-04.txt, January 1999.


    [EFREF]  V. Jacobson et al., ``An Expedited Forwarding PHB'',
         Internet Draft draft-ietf-diffserv-phb-ef-01.txt, November 1998.


    [DIFFEDGE]  Y. Bernet, D. Durham and F. Reichmeyer, ``Requirements
         of Diff-Serv Boundary Routers'', Internet Draft
         draft-bernet-diffedge-01.txt, November 1998.


    [DSARCH]  S. Blake et al., ``An Architecture for Differentiated
         Services'', Internet RFC 2475, December 1998.


    [DSFRAME]  Y. Bernet, J. Binder, S. Blake et al., ``A
         Framework for Differentiated Services'', Internet Draft
         draft-ietf-diffserv-framework-01.txt, November 1998.


    [DSHEAD]  K. Nichols et al., ``Definition of the Differentiated
         Services Field (DS Field) in the IPv4 and IPv6 Headers'',
         Internet RFC 2474, December 1998.


    [POLICY]  J. Strassner and E. Ellesson, ``Policy Framework Core
         Information Model'', Internet Draft
         draft-ietf-policy-core-schema-00.txt, November 1998.


    [RSVPDIFF]  Y. Bernet, R. Yavatkar et al., ``A Framework for
         use of RSVP with DiffServ Networks'', Internet Draft
         draft-ietf-diffserv-rsvp-01.txt, November 1998.

Authors' Address


Ashish Mehra               Phone: (914) 784-7628
Dinesh Verma               Phone: (914) 784-7466
IBM T. J. Watson Research Center
P.O. Box 704
Yorktown Heights, NY 10598
Email: mehraa,dverma@watson.ibm.com
