Network Working Group                                      Pedro Marques
Internet Draft                                             Nischal Sheth
Expiration Date: April 11, 2005                         Juniper Networks

                                  Robert Raszuk              Jared Mauch
                                  Barry Greene                 NTT/Verio
                                  Cisco Systems Inc.
                                                         Danny McPherson
                                                          Arbor Networks

                                                            October 2004


             Dissemination of flow specification rules

                 draft-marques-idr-flow-spec-01.txt


Status of this Memo

This document is an Internet-Draft and is subject to all provisions of section 3 of RFC 3667. By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she become aware will be disclosed, in accordance with RFC 3668.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on April 11, 2005.

Marques, et al.                                                 [Page 1]
Internet Draft    draft-marques-idr-flow-spec-01.txt        October 2004

Copyright Notice

Copyright (C) The Internet Society (2004).

Abstract

This document defines a new BGP NLRI encoding format that can be used to distribute traffic flow specifications. This allows the routing system to propagate information regarding special treatment that is desired for sub-components of a particular IP prefix. Additionally, it defines an application of that encoding format to traffic filtering of inter-domain flows, such as is needed to mitigate (distributed) denial of service attacks.
The information is carried via the Border Gateway Protocol (BGP), thereby reusing protocol algorithms, operational experience and administrative processes such as inter-provider peering agreements.

Table of Contents

    1  Introduction
    2  Flow specifications
    3  Dissemination of Information
    4  Traffic filtering
    5  Validation procedure
    6  Traffic Filtering Actions
    7  Monitoring
    8  Security considerations
    9  Acknowledgments
   10  References
   11  Authors' Addresses

1. Introduction

Modern IP routers contain both the capability to forward traffic according to aggregate IP prefixes and the capability to identify and special-case particular flows of traffic. The latter are usually referred to as ACL or firewall engines.

While forwarding information is, typically, dynamically signaled across the network via routing protocols, there is no agreed-upon mechanism to dynamically signal flows across autonomous systems.

For several applications, it may be necessary to exchange control information pertaining to aggregated traffic flow definitions which cannot be expressed using destination address prefixes only.

An aggregated traffic flow is considered to be an n-tuple consisting of several matching criteria such as source and destination address prefixes, IP protocol and transport protocol port numbers.

The intention of this document is to define a general procedure to encode such flow specification rules as a BGP NLRI which can be reused for several different control applications. Additionally, we define the mechanisms required to apply this definition to the problem of immediate concern to the authors: intra- and inter-provider distribution of traffic filtering rules to filter (Distributed) Denial of Service (DoS) attacks.

By expanding routing information with flow specifications, the routing system can take advantage of the ACL/firewall capabilities in the router's forwarding path. Flow specifications can be seen as more specific routing entries to a unicast prefix and are expected to depend upon the existing unicast data information.

For example, a flow specification received from an external autonomous system will need to be validated against unicast routing before being accepted. If the aggregate traffic flow defined by the unicast destination prefix is forwarded to a given BGP peer, then the local system can install more specific flow rules which result in different forwarding behaviour as requested by this system.

The choice of BGP as the carrier of this control information is also justified by the fact that the key issues in terms of complexity are problems which are common to unicast route distribution and have already been solved in the current environment.

From an algorithmic perspective, the main problem that presents itself is the loop-free distribution of <key, attribute> pairs. The key, in this particular instance, is a flow specification.

From an operational perspective, the use of BGP as the carrier for this information allows a network service provider to reuse both internal route distribution infrastructure (e.g., route reflector or confederation design) and existing external relationships (e.g., inter-domain BGP sessions to a customer network).

While it is certainly possible to address this problem using other mechanisms, the authors believe that this solution offers the substantial advantage of being an incremental addition to deployed mechanisms.

2. Flow specifications

A flow specification is an n-tuple consisting of several matching criteria that can be applied to IP traffic. A given IP packet is said to match the defined flow if it matches all the specified criteria.

A given flow may be associated with a set of attributes, depending on the particular application; such attributes may or may not include reachability information (i.e., NEXT_HOP). Well-known or AS-specific community attributes can be used to encode a set of predetermined actions.

A particular application is identified by a specific (AFI, SAFI) pair and corresponds to a distinct set of RIBs. Those RIBs should be treated independently from each other in order to assure non-interference between distinct applications.

BGP itself treats the NLRI as an opaque key to an entry in its databases. Entries that are placed in the Loc-RIB are then associated with a given set of semantics which is application dependent. This is consistent with existing BGP applications. For instance, IP unicast routing (AFI=1, SAFI=1) and IP multicast reverse-path information (AFI=1, SAFI=2) are handled by BGP without any particular semantics being associated with them until installed in the Loc-RIB.

Standard BGP policy mechanisms, such as UPDATE filtering by NLRI prefix and community matching, SHOULD apply to the newly defined NLRI-type. Network operators can also control propagation of such routing updates by enabling or disabling the exchange of a particular (AFI, SAFI) pair on a given BGP peering session.

3. Dissemination of Information

We define a "Flow Specification" NLRI type that may include several components such as destination prefix, source prefix, protocol, ports, etc. This NLRI is treated as an opaque bit string prefix by BGP. Each bit string identifies a key to a database entry with which a set of attributes can be associated.

This NLRI information is encoded using the MP_REACH_NLRI and MP_UNREACH_NLRI attributes as defined in [BGP-MP]. Whenever the corresponding application does not require Next Hop information, this shall be encoded as a 0 octet length Next Hop in the MP_REACH_NLRI attribute and ignored on receipt.

The NLRI field of the MP_REACH_NLRI and MP_UNREACH_NLRI is encoded as a two-byte NLRI length value, in octets, followed by a variable length NLRI value.

   +------------------------------+
   |    NLRI length (2 octets)    |
   +------------------------------+
   |    NLRI value  (variable)    |
   +------------------------------+

The Flow Specification NLRI-type consists of several optional subcomponents. A specific packet is considered to match the flow specification when it matches the intersection (AND) of all the components present in the specification.

The following component types are defined:

+ Type 1 - Route Distinguisher

  Encoding: Route Distinguisher value, encoded as specified in [2547].

  This allows this NLRI to carry information for more than one routing realm. This value should be omitted when distributing information for the Internet routing realm. When the RD is present, routes should also contain the Route Target extended community.

+ Type 2 - Destination Prefix

  Encoding: Defines the destination prefix to match. Prefixes are encoded as in BGP UPDATE messages: a length in bits is followed by enough octets to contain the prefix information.

+ Type 3 - Source Prefix

  Encoding: Defines the source prefix to match.

+ Type 4 - IP Protocol

  Encoding: Contains a set of {operator, value} pairs that are used to match the IP protocol value byte in IP packets.

  The operator byte is encoded as:

       7   6   5   4   3   2   1   0
     +---+---+---+---+---+---+---+---+
     | e | a |  len  | 0 |lt |gt |eq |
     +---+---+---+---+---+---+---+---+

  -i. End of List bit. Set in the last {op, value} pair in the list.

  -ii. And bit. If unset, the previous term is logically ORed with the current one. If set, the operation is a logical AND. It should be unset in the first operator byte of a sequence. The AND operator has higher priority than OR for the purposes of evaluating logical expressions.

  -iii. The length of the value field for this operator, in bytes, is given as (1 << len).

  -iv. lt - less than comparison between data and value.

  -v. gt - greater than comparison between data and value.

  -vi. eq - equality between data and value.

  The bits lt, gt, and eq can be combined to produce "less or equal", "greater or equal" and inequality values.

+ Type 5 - Port

  Encoding: Defines a list of {operation, value} pairs that matches source OR destination TCP/UDP ports. This list is encoded using the numeric operand format defined above. Values are encoded as 1 or 2 byte quantities.

+ Type 6 - Destination port

  Encoding: Defines a list of {operation, value} pairs used to match the destination port of a TCP or UDP packet. Values are encoded as 1 or 2 byte quantities.

+ Type 7 - Source port

  Encoding: Defines a list of {operation, value} pairs used to match the source port of a TCP or UDP packet. Values are encoded as 1 or 2 byte quantities.

+ Type 8 - ICMP type

  Encoding: Defines a list of {operation, value} pairs used to match the type field of an ICMP packet. Values are encoded using a single byte.

+ Type 9 - ICMP code

  Encoding: Defines a list of {operation, value} pairs used to match the code field of an ICMP packet. Values are encoded using a single byte.

+ Type 10 - TCP flags

  Encoding: Bitmask values are encoded using a single byte, using the bit definitions specified in the TCP header format [rfc793].

  This type uses the bitmask operand format, which differs from the numeric operator format in the lower nibble.

       7   6   5   4   3   2   1   0
     +---+---+---+---+---+---+---+---+
     | e | a |  len  | 0 | 0 |not| m |
     +---+---+---+---+---+---+---+---+

  -i. Top nibble (End of List bit, And bit and Length field), as defined for the numeric operator format.

  -ii. Not bit. If set, logical negation of the operation.

  -iii. Match bit. If set, this is a bitwise match operation defined as "(data & value) == value"; if unset, (data & value) evaluates to true if any of the bits in the value mask are set in the data.

+ Type 11 - Packet length

  Encoding: Match on the total IP packet length (excluding L2 but including the IP header). Values are encoded as 1 or 2 byte quantities.

+ Type 12 - DSCP

  Encoding: Defines a list of {operation, value} pairs used to match the IP TOS octet.

+ Type 13 - Fragment

  Encoding: Uses the bitmask operand format defined above.

  Bitmask values:

  -i. Bit 0 - Don't fragment

  -ii. Bit 1 - Is a fragment

  -iii. Bit 2 - First fragment

  -iv. Bit 3 - Last fragment

Flow specification components must follow strict type ordering. A given component type may or may not be present in the specification, but if present it MUST precede any component of higher numeric type value.

If a given component type within a prefix is unknown, the prefix in question cannot be used for traffic filtering purposes by the receiver. Since a Flow Specification has the semantics of a logical AND of all components, if a component is FALSE by definition it cannot be applied. However, for the purposes of BGP route propagation this prefix should still be transmitted, since BGP route distribution is independent of NLRI semantics.

Flow specification components are to be interpreted as a bit match at a given packet offset. When more than one component in a flow specification tests the same packet offset, the behavior is undetermined.

The encoding is chosen in order to account for future extensibility.

An example of a Flow Specification encoding for: "all packets to 10.0.1/24 and TCP port 25".

       destination      proto      port
     +----------------+----------+----------+
       02 18 0a 00 01   04 81 06   05 81 19     (hex)

Decode for protocol:

     0x04   type
     0x81   operator = end-of-list, value size=1, =
     0x06   value

An example of a Flow Specification encoding for: "all packets to 10.0.1/24 from 192/8 and port {range [137, 139] or 8080}".

       destination      source     port
     +----------------+----------+-------------------------+
       02 18 0a 00 01   03 08 c0   05 03 89 45 8b 91 1f 90   (hex)

Decode for port:

     0x05     type
     0x03     value size=1, >=
     0x89     value 137
     0x45     &, value size=1, <=
     0x8b     value 139
     0x91     end-of-list, value size=2, =
     0x1f90   value 8080

This constitutes an NLRI with an NLRI length of 16 octets.

Implementations wishing to exchange flow specification rules MUST use BGP's Capability Advertisement facility to exchange the Multiprotocol Extension Capability Code (Code 1) as defined in [BGP-MP]. The (AFI, SAFI) pair carried in the Multiprotocol Extension capability MUST be the same as the one used to identify a particular application that uses this NLRI-type.

4. Traffic filtering

Traffic filtering policies have been traditionally considered to be relatively static.
The popularity of traffic-based denial of service (DoS) attacks, which often require the network operator to be able to use traffic filters for detection and mitigation, brings with it requirements that are not fully satisfied by existing tools.

Several techniques are currently used to control traffic filtering of DoS attacks. Among those, one of the most common is to inject unicast route advertisements corresponding to a destination prefix being attacked. One variant of this technique marks such route advertisements with a community that gets translated into a discard next-hop by the receiving router. Other variants attract traffic to a particular node that serves as a deterministic drop point.

Using unicast routing advertisements to distribute traffic filtering information has the advantage of using the existing infrastructure and inter-AS communication channels. This can allow, for instance, a service provider to accept filtering requests from customers for address space they own.

There are several drawbacks, however. An issue that is immediately apparent is the granularity of filtering control: only destination prefixes may be specified. Another area of concern is the fact that filtering information is intermingled with routing information.

The mechanism defined in this document is designed to address these limitations. We use the flow specification NLRI defined above to convey information about traffic filtering rules for traffic that should be discarded.

This mechanism is designed, primarily, to allow an upstream autonomous system to perform inbound filtering, at its ingress routers, of traffic that a given downstream AS wishes to drop.

In order to achieve that goal, we define an application-specific NLRI identifier (AFI=1, SAFI=TBD) along with specific semantic rules. BGP routing updates containing this identifier use the flow specification NLRI encoding to convey particular aggregated flows that require special treatment.

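The flow specification NLRI encoding that these updates carry can be illustrated with a short decoder. The following is a minimal Python sketch under stated assumptions, not a reference implementation: the function names (decode_numeric_ops, term_matches, matches, decode_flowspec) are hypothetical, only the prefix components (types 2 and 3) and the numeric-operator components are handled, and the input is the NLRI value without its two-octet length. The sample bytes encode destination 10.0.1/24 (0a 00 01), source 192/8 and the port list "range [137, 139] or 8080" from section 3.

```python
from typing import Dict, List, Tuple

# One decoded {operator, value} pair: (and_bit, comparison, value).
Term = Tuple[bool, str, int]


def decode_numeric_ops(data: bytes, pos: int) -> Tuple[List[Term], int]:
    """Decode a numeric {operator, value} list starting at data[pos].

    Returns the terms and the offset of the first byte after the list.
    """
    terms: List[Term] = []
    while True:
        op = data[pos]
        end_of_list = bool(op & 0x80)       # bit 7: e
        and_bit = bool(op & 0x40)           # bit 6: a
        length = 1 << ((op >> 4) & 0x03)    # bits 5-4: value is (1 << len) bytes
        cmp = (("<" if op & 0x04 else "")   # bit 2: lt
               + (">" if op & 0x02 else "")  # bit 1: gt
               + ("=" if op & 0x01 else ""))  # bit 0: eq
        value = int.from_bytes(data[pos + 1:pos + 1 + length], "big")
        terms.append((and_bit, cmp, value))
        pos += 1 + length
        if end_of_list:
            return terms, pos


def term_matches(cmp: str, value: int, data: int) -> bool:
    """Apply the combined lt/gt/eq bits of one operator to a packet field."""
    return (("<" in cmp and data < value)
            or (">" in cmp and data > value)
            or ("=" in cmp and data == value))


def matches(terms: List[Term], data: int) -> bool:
    """Evaluate a term list; AND has higher priority than OR."""
    result = False
    group = None  # truth value of the AND-group currently being built
    for and_bit, cmp, value in terms:
        hit = term_matches(cmp, value, data)
        if and_bit and group is not None:
            group = group and hit
        else:
            if group is not None:
                result = result or group
            group = hit
    return result or bool(group)


def decode_flowspec(value: bytes) -> Dict[int, object]:
    """Decode an NLRI value into {component type: decoded component}."""
    components: Dict[int, object] = {}
    pos = 0
    while pos < len(value):
        ctype = value[pos]
        pos += 1
        if ctype in (2, 3):                      # destination / source prefix
            bits = value[pos]
            nbytes = (bits + 7) // 8
            raw = value[pos + 1:pos + 1 + nbytes] + bytes(4 - nbytes)
            components[ctype] = "%d.%d.%d.%d/%d" % (*raw, bits)
            pos += 1 + nbytes
        elif ctype in (4, 5, 6, 7, 8, 9, 11, 12):  # numeric operator lists
            components[ctype], pos = decode_numeric_ops(value, pos)
        else:
            # Types 1, 10 and 13 are left out of this sketch.
            raise ValueError("component type %d not handled" % ctype)
    return components


example = bytes.fromhex("02 18 0a 00 01 03 08 c0 05 03 89 45 8b 91 1f 90")
flow = decode_flowspec(example)
print(flow)
print(matches(flow[5], 138), matches(flow[5], 140), matches(flow[5], 8080))
```

The AND-group handling in matches() reflects the rule that AND binds tighter than OR, so the port list reads as "(>= 137 AND <= 139) OR = 8080".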
5. Validation procedure

Flow specifications received from a BGP peer and accepted in the respective Adj-RIB-In are used as input to the route selection process. Although the forwarding attributes of two routes for the same Flow Specification prefix may be the same, BGP is still required to perform its path selection algorithm in order to select the correct set of attributes to advertise.

The first step of the BGP Route Selection procedure [BGP-BASE] (section 9.1.2) is to exclude from the selection procedure routes that are considered non-feasible. In the context of IP routing information, this step is used to validate that the NEXT_HOP attribute of a given route is resolvable.

The concept can be extended, in the case of Flow Specification NLRI, to allow other validation procedures.

A flow specification NLRI SHOULD be validated such that it is considered unfeasible if it contains a non-empty AS_PATH and that AS_PATH does not match the AS_PATH of the best-match unicast route that includes the specified destination address prefix.

The underlying concept is that the neighboring AS that advertises the best unicast route for a destination is allowed to advertise flow spec information that conveys a more or equally specific destination prefix.

The neighboring AS is the immediate destination of the traffic described by the Flow Specification. If it requests these flows to be dropped, that request can be honored without concern that it represents a denial of service in itself. Supposedly, the traffic is being dropped by the downstream autonomous system and there is no added value in carrying the traffic to it.

BGP implementations MUST also enforce that the AS_PATH attribute of a route received via eBGP contains the neighboring AS in the left-most position of the AS_PATH attribute.
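The two checks described in this section can be sketched as follows. This is a minimal Python illustration, not the normative procedure: the names best_match and is_feasible are hypothetical, the unicast RIB is modelled as a dict from prefix to the AS_PATH of the selected best route, and treating a flow specification with no covering unicast route as unfeasible is an assumption the text does not spell out.

```python
import ipaddress
from typing import Dict, List, Optional


def best_match(rib: Dict[str, List[int]], dest: str) -> Optional[List[int]]:
    """AS_PATH of the longest unicast prefix that includes dest, if any."""
    dest_net = ipaddress.ip_network(dest)
    covering = [
        (ipaddress.ip_network(prefix), as_path)
        for prefix, as_path in rib.items()
        if dest_net.subnet_of(ipaddress.ip_network(prefix))
    ]
    if not covering:
        return None
    return max(covering, key=lambda entry: entry[0].prefixlen)[1]


def is_feasible(flow_dest: str,
                flow_as_path: List[int],
                rib: Dict[str, List[int]],
                ebgp_neighbor_as: Optional[int] = None) -> bool:
    """Return False when the flow specification should be excluded."""
    # eBGP routes: the neighboring AS must be left-most in the AS_PATH.
    if ebgp_neighbor_as is not None:
        if not flow_as_path or flow_as_path[0] != ebgp_neighbor_as:
            return False
    # An empty AS_PATH is not subject to the unicast comparison.
    if not flow_as_path:
        return True
    # Non-empty AS_PATH: it must match the best-match unicast route.
    unicast_path = best_match(rib, flow_dest)
    if unicast_path is None:
        return False  # assumption: no covering unicast route means unfeasible
    return unicast_path == flow_as_path


rib = {"10.0.0.0/8": [65001, 65010], "10.0.1.0/24": [65002, 65020]}
print(is_feasible("10.0.1.0/24", [65002, 65020], rib, ebgp_neighbor_as=65002))
print(is_feasible("10.0.1.0/24", [65001, 65010], rib, ebgp_neighbor_as=65001))
```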
While this left-most AS check is optional in the BGP specification, it becomes necessary to enforce it for security reasons.

6. Traffic Filtering Actions

The default action for a traffic filtering flow specification is to accept IP traffic that matches that particular rule.

The following extended community values can be used to specify particular actions.

   type     extended community   encoding
   --------------------------------------------------------
   0x8006   traffic-rate         2-byte as#, 4-byte float
   0x8007   sample-rate          2-byte as#, 4-byte float
   0x8008   redirect             6-byte Route Target
   0x8009   marking              1-byte DSCP value

A traffic-rate of 0 should result in all traffic that matches the particular flow being discarded.

The redirect extended community allows the traffic to be redirected to a VRF routing instance that lists the specified route-target in its import policy. If several local instances match these criteria, the choice between them is a local matter (for example, the instance with the lowest Route Distinguisher value can be elected).

The traffic marking extended community instructs a system to modify the DSCP bits of a transiting IP packet to the corresponding value. This extended community is encoded as a sequence of 5 zero bytes followed by the DSCP value.

7. Monitoring

Traffic filtering applications require monitoring and traffic statistics facilities. While this is an implementation-specific choice, implementations SHOULD provide:

- A mechanism to log the packet header of filtered traffic,

- A mechanism to count the number of matches for a given Flow Specification rule.

8. Security considerations

Inter-provider routing is based on a web of trust. Neighboring autonomous systems are trusted to advertise valid reachability information.
If this trust model is violated, a neighboring autonomous system may cause a denial of service attack by advertising reachability information for a given prefix for which it does not provide service.

As long as traffic filtering rules are restricted to match the corresponding unicast routing paths for the relevant prefixes, the security characteristics of this proposal are equivalent to the existing security properties of BGP unicast routing. Were this not the case, it would open the door to further denial of service attacks.

9. Acknowledgments

The authors would like to thank Yakov Rekhter, Dennis Ferguson and Chris Morrow for their comments.

10. References

[BGP-BASE] Y. Rekhter, T. Li, S. Hares, "A Border Gateway Protocol 4 (BGP-4)", draft-ietf-idr-bgp4-20.txt, 03/03.

[BGP-MP]   T. Bates, R. Chandra, D. Katz, Y. Rekhter, "Multiprotocol Extensions for BGP-4", RFC 2858.

11. Authors' Addresses

Pedro Marques
Juniper Networks
1194 N. Mathilda Ave.
Sunnyvale, CA 94089
Email: roque@juniper.net

Nischal Sheth
Juniper Networks
1194 N. Mathilda Ave.
Sunnyvale, CA 94089
Email: nsheth@juniper.net

Robert Raszuk
Cisco Systems, Inc.
Al. Jerozolimskie 146C
02-305 Warsaw, Poland
Email: rraszuk@cisco.com

Barry Greene
Cisco Systems, Inc.
Email: bgreene@cisco.com

Jared Mauch
NTT/VERIO
8285 Reese Lane
Ann Arbor, MI, 48103-9753
Email: jmauch@verio.net | jared@puck.nether.net

Danny McPherson
Arbor Networks
Email: danny@arbor.net

Intellectual Property Statement

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights.
Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.

Disclaimer of Validity

This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

Copyright (C) The Internet Society (2004). This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.

Acknowledgment

Funding for the RFC Editor function is currently provided by the Internet Society.