Internet DRAFT - draft-floyd-pushback-messages


Internet Engineering Task Force                        Sally Floyd/ACIRI
INTERNET DRAFT                                       Steve Bellovin/AT&T
draft-floyd-pushback-messages-00.txt                 John Ioannidis/AT&T
                                                Kireeti Kompella/Juniper
                                                        Ratul Mahajan/UW
                                                       Vern Paxson/ACIRI
                                                              July, 2001
                                                  Expires: January, 2002

      Pushback Messages for Controlling Aggregates in the Network

                          Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet- Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at

   The list of Internet-Draft Shadow Directories can be accessed at


Floyd et al.                 Standards Track                    [Page 1]

Internet Draft              Pushback Messages                  July 2001

1.  Introduction

   Pushback [MB01] is designed to detect and control high bandwidth
   aggregates in the network. An aggregate is a collection of packets
   with a common property. For instance, with the destination prefix as
   the common property, all packets with a matching prefix define an
   aggregate.  During a time of severe congestion from a flash crowd or
   from a denial of service (DoS) attack, a router might enforce a rate-
   limit on the traffic aggregate responsible for the congestion.  In
   addition, the congested router could ask adjacent upstream routers to
   limit the amount of traffic they send for that aggregate.  This
   upstream rate-limiting is called pushback and can be recursively
   propagated to routers further upstream.  It serves to spatially
   isolate the traffic aggregate so that other traffic sharing the same
   downstream links is not impaired by the aggregate.

   By imposing only a rate limit, rather than a complete blockage, of
   the aggregate, pushback aims to minimize "collateral damage" suffered
   by the non-hostile traffic matching the aggregate during a DoS
   attack.  In general, the hope is that during a DoS attack, pushback
   will propagate sufficiently far in the network so that non-hostile
   traffic fits within the rate limit imposed on its specific path to
   the destination, and accordingly does not have its performance

   This document specifies messages passed between cooperating routers.
   It does not address procedures in routers for identifying aggregates
   to be rate-limited, or for determining the rate-limits for those
   identified aggregates.  The goal is to specify an experimental
   standard for pushback messages so that we can learn from the
   experimental use of pushback.  We expect that the specifications for
   pushback messages will evolve over time, as we gain more experience
   with their use.

   There are two main pushback messages - REQUEST and STATUS.  Pushback
   REQUEST messages are sent to upstream routers asking them to rate-
   limit the aggregate.  Such a request for rate-limiting is only
   advisory; the upstream router is not compelled to follow the request.
   As part of rate-limiting on behalf of the downstream router, the
   upstream router sends periodic STATUS messages to the downstream
   router. The STATUS messages report the arrival rate for that
   aggregate at the upstream router, and enable the congested router to
   take decisions regarding the continuance of pushback.  In addition to
   REQUEST and STATUS messages, REFRESH messages reinforce the soft-
   state rate-limiting, and CANCEL messages terminate it.

   Pushback messages can be used in two ways.  In one pushback type,
   pushback messages are used to request upstream rate-limiting for the

Floyd et al.                 Standards Track                    [Page 2]

Internet Draft              Pushback Messages                  July 2001

   specified aggregate.  In a second pushback type (DUMMY_PROP),
   pushback messages are used simply to get information about the
   arrival rates of an aggregate at upstream routers.

   A pushback in progress can be visualized as a tree (or, with
   multipath routing, possibly a graph), where the congested router
   initiating the pushback is the root. The parent of a router in the
   tree is the downstream router from which it got the pushback REQUEST.
   Routers that do not propagate pushback further are leaves of the

   The following sections specify the format for pushback messages and
   the timing of REFRESH and STATUS messages.  This document also
   specifies the procedures for propagating pushback REQUEST and REFRESH
   messages further upstream, and for merging the resulting STATUS
   messages from upstream routers.

2.  The Common Header

   All pushback messages have the following fields prepended in the

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    |  Version  |AdF|    Msg Type   |   Rate-Limiting Session ID    |
    |            Pushback Initiating Router's IP Address            |
    |                      Sender's IP Address                      |

   The first field specifies the version of the pushback protocol the
   sender speaks.  The protocol described in this document is Version=0.

   "AdF" specifies the type of address family used.  Currently defined
   values are IPv4=0 and IPv6=1.  Other values are reserved for future

   The message type is one of REQUEST (= 0), REFRESH (= 1), STATUS (=
   2), or CANCEL (= 3).  The fields following the common header are
   dependent on the type of the message.

   The Rate-limiting Session ID (RLSID) is generated by the congested
   router initiating pushback. It MUST be unique among all current rate-
   limiting sessions initiated by this router.  The RLSID combined with
   the IP address of the congested router defines a pushback session
   over the whole network.  A router receives both these fields from the

Floyd et al.                 Standards Track                    [Page 3]

Internet Draft              Pushback Messages                  July 2001

   downstream router requesting pushback. These fields enable the
   routers to map incoming messages to the appropriate rate-limiting
   session.  A router MAY use its different addresses when initiating
   different pushback sessions.

   Note that if the router's address reflects a private addressing
   realm, then it MUST be altered upon crossing into a different
   addressing realm.  Ideally this transformation uses a new address
   unique to the router; if not available, then the address of the
   router propagating pushback (by sending the message) into the
   different realm is used.

   The sender's IP address has been included in the pushback message,
   making message interpretation independent of the IP source address
   field.  This eliminates any confusion regarding which interface's
   address is included in that field (that is, whether it is the IP
   address of the message-sending interface, or of some other interface
   at the router).  It also helps when pushback messages traverse
   between routers that are not directly connected.  If a sender sends
   pushback messages to two peers in two different addressing realms, so
   that the sender doesn't have a unique address to send to both peers,
   then the sender will use different values for the sender's IP address
   in the two messages.

   Both the IP addresses will be 128 bit fields for IPv6.

2.  The Pushback REQUEST Message

   Pushback requests (type REQUEST=0) are sent upstream when a router
   wants the aggregate to be rate-limited upstream. The fields in a
   pushback REQUEST, in addition to the common header, are shown below.

Floyd et al.                 Standards Track                    [Page 4]

Internet Draft              Pushback Messages                  July 2001

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    | PType |SRMode |   Max Depth   | Depth in Tree |   Reserved    |
    |                      Bandwidth Limit                          |
    |                      Expiration Time                          |
    |                     Status Frequency                          |
    |                                                               |
    |                    Congestion Signature                       |
    |                                                               |

   "PType" denotes the type of pushback and determines the upstream
   router's behavior in various ways.  PType=0 (HI_DROP_PROP) requests
   the upstream router to propagate pushback if the restricted aggregate
   suffers a high drop rate due to the restriction.  (This is the usual
   mode of pushback.)

   PType=1 (ALWAYS_PROP) requests that the upstream router propagate
   pushback irrespective of the drop rate experienced by the aggregate.
   It would typically be used when the aggregate is known with high
   confidence to be malicious.

   PType=2 (DUMMY_PROP) indicates no actual rate-limiting should take
   place---the downstream router is just interested in the arrival rate
   estimate of this aggregate.  The extent of propagation of these
   pushback messages is controlled by the congested router using "Max
   Depth" (explained below) to determine the arrival rate of an
   aggregate several hops upstream.

   Other values of PType are reserved for future definition.

   The pushback requester can specify the mode in which it wants
   feedback with the "SRMode" (Status Reporting Mode) field.  SRMode=0
   (COMPACT) specifies the feedback should be just the total arrival-
   rate estimate of the aggregate.  SRMode=1 (CLOSEST) specifies the
   feedback should include per-router feedback for upstream routers, and
   if there is not room for all of them, then those closest (lowest hop
   count) should be preferred.  SRMode=2 (FURTHEST) is similar to
   CLOSEST, but prefers routers further away from the congested router.
   SRMode=3 (SAMPLE) specifies that instead a pseudo-random subset of

Floyd et al.                 Standards Track                    [Page 5]

Internet Draft              Pushback Messages                  July 2001

   the upstream routers should be included.  These last two modes
   facilitate different forms of mapping to aid with tracing back the
   attack sources.  Other values are reserved for future definition.

   Using "Max Depth" the congested router can control the maximum number
   of hops pushback will propagate to.  The special value of 255
   indicates unrestricted propagation; 254 indicates that pushback
   should be propagated up to, but not across, the AS boundary.

   The depth in the tree is the distance of the requesting node from the
   root of the pushback tree. The depth of the root is zero, and a
   child's depth is one more than the depth of its parent.  Depth
   information is useful in setting timers for sending feedback.  If a
   router receives overlapping pushback requests from multiple peers,
   its depth is one more than the minimum depth of its parents.

   The bandwidth limit is a single precision (32-bit) floating point
   number in IEEE format, as described in [SPG97].  It expresses the
   rate in bytes per second.  It is only a requested upper bound for the
   bandwidth to be given to the aggregate.  If congested, the upstream
   router could send less than that upper bound.  In addition, the
   upstream router is not *required* to observe the requested bound.

   The expiration time is the time period after which the pushback
   request expires if no REFRESH messages arrive.  The status frequency
   gives the time after which the upstream routers should send STATUS
   messages downstream.  Both the times are specified in milliseconds,
   and represented using 32-bit integers in network byte order.

2.0.1 The Congestion Signature

   The congestion signature is the description of the aggregate that is
   to be rate-limited. Its specification is a major open question.  As
   an example, the attack signature might consist of the destination
   prefix(es) characterizing the aggregate.  At the other extreme, we
   could allow arbitrary ACLs of fields in the packet header.  For this
   first experimental use of pushback, we will limit the congestion
   signature to depend only on the source and destination IP addresses
   in the packet header.  This excludes the use of aggregates defined by
   the protocol field in the IP header and the port numbers in the
   transport header; an example is an aggregate consisting of all DNS

   Congestion signatures are specified as TLVs (type-length-value):

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1

Floyd et al.                 Standards Track                    [Page 6]

Internet Draft              Pushback Messages                  July 2001

    |      Type      |    Length    |          Value ....           |
    |              .... Value and final Padding ....                |

   The Length field includes the Type and Length fields, as well as the
   length of the Value and any Padding.

   Values are padded up to 32-bit alignment.  If, after doing so, more
   data remains in the datagram, then it's interpreted as another TLV.

   Type=0 (SRC_PREFIX) indicates a source address prefix.  Its Length is
   4 bytes plus the length of an address in the format specified by AdF
   above.  The first octet of Value (bits 16 through 23 in the first
   line of the above figure) are reserved.  The second octet gives the
   prefix length, in the range 1-32 (IPv4) or 1-128 (IPv6).  Other
   values are reserved.

   Type=1 (DST_PREFIX) is the same but for destination address prefix.

   When all prefixes in the list are of the same type, the congestion
   signature describes packets that have the corresponding field (source
   or destination) matching one of the prefixes in the list.  In
   presence of both source and destination prefixes, packets belonging
   to the aggregate are those destined for one of the destination
   prefixes *and* coming from one of the source prefixes.

2.1 Propagating a Pushback REQUEST Message

   When propagating a pushback request upstream, the router MUST insert
   the correct depth information, which is one more than the depth of
   its parent(s).

   In addition, the destination prefixes in the congestion signature
   MUST be checked to see whether they have to be *narrowed*, to
   restrict the rate-limiting only to traffic headed for the downstream
   router that requested pushback, as follows.  Suppose the congested
   router X identifies a certain aggregate A with destination prefix
   128.95/16. X will ask its upstream router Y (among others) to rate-
   limit traffic from aggregate A (128.95/16).  However, Y cannot use
   the same specification directly because while Y could be forwarding
   128.95.1/24 to X, it might not be forwarding the rest of 128.95/16 to
   X. If Y (and routers upstream of Y) started rate-limiting all of
   128.95/16, the network would drop traffic which would not have

Floyd et al.                 Standards Track                    [Page 7]

Internet Draft              Pushback Messages                  July 2001

   reached the congested router X.

   To avoid this unnecessary packet-dropping, it is important that Y
   look at its routing table to find prefixes within 128.95/16 that are
   forwarded to X. Y has to check all extensions of the given prefix in
   the routing table.

   The issue of narrowing the congestion signature occurs when a
   pushback request is propagated upstream by a router (thus becoming a
   non-leaf in the tree), or when the pushback request is passed from
   the output interface to an input interface at a router.  The above
   algorithm for narrowing the congestion signature works only for
   congestion signatures with a destination address component in them.
   It cannot be applied to other signatures, pure source-based ones, for
   instance.  We do not deal with the issue of narrowing non-
   destination-based signatures in this document except noting that it
   can be done given the right routing information at the upstream

   A router could receive requests from different downstream routers
   with overlapping congestion signatures.  Future work might address
   the possibility of merging two different rate-limiting sessions in
   this case.

2.2 Pushback REFRESH Messages

   Pushback REFRESH messages are initiated by the congested router that
   started the pushback, if it wants the pushback to continue.  For
   uninterrupted rate-limiting, these messages should be generated
   before the rate-limiting expires at the upstream routers

   The REFRESH message is identical to the REQUEST message, so that if
   the upstream router has crashed in the meanwhile, the state can be
   reestablished.  However, the message type is set to REFRESH so that,
   if state already exists, it is matched against the RLSID and router
   address fields so that the receiving router does not have to go
   through the process of setting up state from scratch.

   REFRESH messages can change any field specified earlier in the
   pushback REQUEST.  On receiving the pushback REFRESH message the
   upstream routers update the expiration time for the rate-limit
   session and the limit imposed on the aggregate, and set the timer for
   the STATUS message.  Non-leaf routers in the pushback tree SHOULD
   send REFRESH messages further upstream after dividing the rate limit
   among upstream neighbors.  If the aggregate specification has
   changed, the router MUST check if the new aggregate needs to be
   narrowed, using the process described above, before propagating the
   pushback REFRESH.

Floyd et al.                 Standards Track                    [Page 8]

Internet Draft              Pushback Messages                  July 2001

2.3 Pushback CANCEL Messages

   The pushback CANCEL message is sent upstream to stop rate-limiting
   the aggregate.  It SHOULD be propagated upstream by routers that have
   propagated pushback requests (non-leaf routers in the pushback tree).

   The CANCEL message has no fields beyond those present in the common

3.  Pushback STATUS Messages

   Upstream routers that receive a pushback REQUEST send pushback STATUS
   messages to the router from whom they got their REQUEST. The
   additional fields in the STATUS message are:

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    |                   Arrival Rate Estimate                       |
    | SRMode| Rsrvd |     Height    |            NumElem            |
    |                          <Router ID>                          |
    |                         <Router Info>                         |
    |                    <Arrival Rate at Router>                   |
    |                                                               |

   The arrival rate estimate is a single precision floating-point number
   in IEEE format, as described in [SPG97].  It expresses the arrival
   rate of the aggregate in bytes per second if there was no rate-
   limiting upstream of the STATUS sender.  The arrival rate for the
   first STATUS message is computed over the interval since the receipt
   of the pushback REQUEST.  For the subsequent messages, it is computed
   over the interval since the last STATUS message.

   The SRMode field specifies the mode in which status is reported.  It
   is the same as that in the pushback REQUEST message. The supported
   modes and their semantics are described in Section 2.

   "Height" denotes the height of the sender in the pushback tree. It is
   zero for leaf nodes, and one more than the maximum height among
   children for non-leaf nodes. This field tells the receiver how far
   pushback has propagated upstream of it.

Floyd et al.                 Standards Track                    [Page 9]

Internet Draft              Pushback Messages                  July 2001

   For status modes that return a list of routers, NumElem gives the
   number of listed routers.  Then, for each router

       * Router ID gives an address identifying the router
       * Router Info gives information associated with the router.
         Currently, this consists of:

        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       |S|              Reserved                   |   Depth in Tree   |

   S is a bit that if set means that the entry is Stale, meaning that no
   new STATUS message has been received from the router since the last
   time the sending router sent a STATUS message.

   Depth in Tree reflects the depth of the router in the pushback tree.

3.1 Aggregating STATUS Messages

   A parent node uses the STATUS messages from its children to construct
   the STATUS message it sends downstream. The merger technique used
   depends on the mode of the STATUS message. In COMPACT mode, the
   parent node adds the arrival rate estimates received from the
   children and its estimate from upstream routers that were not sent
   pushback messages. In other modes, the lists obtained from the
   various children are merged, and appended with the parent node's own
   estimate, as indicated in the previous section. The height is one
   more than the maximum height recieved from the children.

3.2 Timing of STATUS Messages

   The status frequency specified in the pushback REQUEST and REFRESH
   messages is the rate at which the originating router would like to
   receive status reports. Since the upstream routers are at different
   distances from the root, the timer values they set have to be
   different.  In particular, routers further away from the root should
   set smaller timer values because they get messages late and their
   STATUS messages take time getting to the root node.

   On receiving a REQUEST or REFRESH message, the routers set a timer to
   send the STATUS message. The value of this timer is the status
   frequency minus the _depth_ * _k_, where _depth_ is the router's
   depth in the pushback tree, and _k_ is a constant that signifies the
   maximum round trip time for a message over a pushback tree edge
   (including message processing time).  _k_ should be configured to
   some comfortable upper bound like 100 ms (it is same for all the

Floyd et al.                 Standards Track                   [Page 10]

Internet Draft              Pushback Messages                  July 2001

   routers in the pushback tree).  For satellite hops or other links
   with round-trip times greater than the configured value _k_, the
   consequences will simply be stale STATUS messages.

   Setting timers in this fashion means that parents are likely to
   obtain fresh STATUS messages from their children before their own
   STATUS message timer goes off. This in turn means that fresh STATUS
   messages are sent further downstream after aggregation.  If a parent
   router's timer fires before it has received STATUS message from one
   of its children, it MUST send its own STATUS message downstream using
   the last value received from this child or its own estimate, and, if
   including an individual rate report for this child, marking it with
   S=1 to indicate it is Stale.

   The status timer is set again immediately after sending the STATUS
   messages downstream. The value of the timer is the same for all the
   routers in this case, and is equal to the status frequency, since the
   required offset has already been achieved.  If a router receives a
   REFRESH message before its status timer expires, new timers are set
   as described above.

   A small jitter can be applied to status timers so that the downstream
   router recieves STATUS messages from its children at different times.

   In some cases, the original sender of the pushback REQUEST might want
   some variation in the status timers to provide some degree of
   protection against gaming adversaries that try to time their bursts
   to avoid detection.  This variation could be achieved by the original
   sender by making changes to the Status Frequency specified in the
   pushback REFRESH messages.

4.  Authentication for Pushback Messages

   Pushback messages require some form of authentication, even if the
   pushback messages are between adjacent routers.  However, this
   document currently does not specify the form of authentication to be

5.  Messages between Routers and Local Agents

   Some routers might send packet headers from a sample of the traffic
   to an agent for outboard processing, and receive control messages
   back from the agent about identified aggregates to be rate-limited.
   The router and local agent will also exchange control messages, for
   example, to control the sampling at the router.  The formats for
   these messages will probably be addressed in a separate document.

Floyd et al.                 Standards Track                   [Page 11]

Internet Draft              Pushback Messages                  July 2001

   Because this is a purely local conversation between a router and an
   attached local agent, it is not necessary that a router and its
   attached local agent follow the protocol suggested in that document.

6.  Messages exchanged with the NOC

   In some cases the NOC (Network Operations Center) will want to have
   final approval before an aggregate is rate-limited.  Thus, one
   category of pushback messages will be the messages exchanged with the
   NOC.  This draft currently does not specify these messages.



   There is a list of people who can be either co-authors, or can be
   acknowledged in this section.  So far, this list includes the
   following.  The pushback authors:  Ratul Mahajan, Steven M. Bellovin,
   Sally Floyd, John Ioannidis, Vern Paxson, and Scott Shenker.  From
   Juniper: Kireeti Kompella.  From Cisco: Barbara Fraser, David Meyer.
   Other: Randy Bush.


   [MB01] Ratul Mahajan, Steven M. Bellovin, Sally Floyd, John
   Ioannidis, Vern Paxson, and Scott Shenker, Controlling High Bandwidth
   Aggregates in the Network, February 2001.  URL:

   [SPG97] S. Shenker, C. Partridge, R. Guerin.  Specification of
   Guaranteed Quality of Service.  RFC 2212.  September 1997.

Security Considerations

   We will eventually address the potential DoS features and security
   vulnerabilities of pushback in detail here.

IANA Considerations


      Sally Floyd
      Phone: +1 510 666 2989

Floyd et al.                 Standards Track                   [Page 12]

Internet Draft              Pushback Messages                  July 2001

      Steve Bellovin
      Phone: +1.973.360.8656
      AT&T Labs - Research

      John Ioannidis
      Phone: +1.973.360.7012
      AT&T Labs - Research

      Kireeti Kompella
      Juniper Networks
      1994 N. Mathilda Ave
      Sunnyvale, CA 94089

      Ratul Mahajan
      Phone: +1 206 616 1853
      Univerity of Washington

      Vern Paxson
      Phone: +1 510 666 2882

      This draft was created in July 2001.
      It expires January 2002.

Floyd et al.                 Standards Track                   [Page 13]