Francois Le Faucheur
Dan Tappan
Gargi Nalawade
Cisco Systems, Inc.

IETF Internet Draft
Expires: December, 2003
Document: draft-lefaucheur-mp-nh-01.txt
October, 2003

BGP-4 NEXT_HOP-v2 Attribute

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

Abstract

This document specifies a new BGP attribute, called the NEXT_HOP-v2 attribute, which can optionally be used in conjunction with the MP_REACH_NLRI defined in [MULTI-BGP], with the NLRI field defined in [BGP-4], or with the Update-v2 message defined in [UPDATE-v2], to advertise next hop information associated with a Network Layer protocol different from the one associated with the NLRI. This is desirable or required in a number of environments, but is not always easily achievable today with [MULTI-BGP]. In addition, the NEXT_HOP-v2 attribute provides a generic capability to advertise information (a set of TLVs) associated with the next hop.

The extensions proposed in this document are backward compatible: a router that supports the extensions can interoperate with a router that does not support them.

Le Faucheur et al. 1
BGP-4 NEXT_HOP-v2 Attribute October 2003

1. Introduction

[MULTI-BGP] defines extensions to BGP-4 to enable it to carry routing information for multiple Network Layer protocols (e.g.
IPv4-VPN, IPv6, IPv6-VPN). This is achieved by associating a particular Network Layer protocol with the NLRI via the AFI field. In [MULTI-BGP], the SAFI field is used to further qualify the semantics of the NLRI (e.g. unicast, multicast, label, VPN ...). The SAFI field could potentially also be used to convey the Network Layer protocol of the next hop information, but this would require allocation of one SAFI value for each possible combination of (i) NLRI semantics and (ii) next hop AFI/SAFI. Considering that there already are quite a few such combinations, and that their number is likely to grow quickly as new AFI/SAFI values are defined in the IETF for new applications ([L2VPN], [TUNN-SAFI], ...), a more flexible and scalable way is needed to advertise next hop information from a Network Layer protocol different from the one of the NLRI.

There are already many situations where the next hop information to be advertised is indeed from a Network Layer protocol different from the one of the NLRI. In a number of such situations, the [MULTI-BGP] limitation has been circumvented by embedding the meaningful next hop information inside a next hop field of the same Network Layer protocol as the NLRI, and flagging this fact through ad-hoc padding of the unused bits of the field.

[RFC2547] is an example of this, since it calls for advertisement of IPv4 next hop information along with IPv4-VPN NLRI, which is achieved by prepending a Null Route Distinguisher to the IPv4 next hop address.

[BGP-TUNN] is another example, since it calls for advertisement of IPv4 next hop information along with IPv6 NLRI, which is achieved by encoding the IPv4 next hop address as an IPv4-mapped IPv6 address.
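
The two padding workarounds described above can be sketched in a few lines of Python. This is an illustration only, not text from the cited documents: it assumes an 8-octet Route Distinguisher and the usual ::ffff:a.b.c.d form of an IPv4-mapped IPv6 address, and the function names are hypothetical.

```python
import ipaddress

# Assumption: a Route Distinguisher is 8 octets, so a "Null" one is 8 zero octets.
NULL_RD = bytes(8)


def vpn_ipv4_next_hop(ipv4: str) -> bytes:
    """IPv4 next hop padded to the IPv4-VPN format: Null RD followed by the address."""
    return NULL_RD + ipaddress.IPv4Address(ipv4).packed


def ipv4_mapped_ipv6_next_hop(ipv4: str) -> bytes:
    """IPv4 next hop padded to the IPv6 format: 80 zero bits, 16 one bits, then the address."""
    return bytes(10) + b"\xff\xff" + ipaddress.IPv4Address(ipv4).packed


if __name__ == "__main__":
    print(vpn_ipv4_next_hop("192.0.2.1").hex())
    print(ipaddress.IPv6Address(ipv4_mapped_ipv6_next_hop("192.0.2.1")))
```

In both cases the receiver has to recognise the padding pattern to recover the real next hop, which is the ad-hoc behaviour the NEXT_HOP-v2 attribute is meant to replace.
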

[IPv6-VPN] is yet another example, since it calls for advertisement of IPv4 or IPv6 next hop information along with IPv6-VPN NLRI, which is achieved by prepending a Null Route Distinguisher to the next hop address and, when the meaningful next hop is IPv4, by encoding it as an IPv4-mapped IPv6 address.

In a number of other situations, the [MULTI-BGP] limitation cannot be circumvented in similar ways, because the next hop address to convey cannot be aligned by simple padding to the format of the Network Layer protocol of the NLRI. Support of IPv4 VPNs over an IPv6 backbone is an example of this, since it calls for advertisement of IPv6 next hop information along with IPv4-VPN NLRI.

As a generic solution to this problem, this document specifies a new BGP attribute, called the BGP-4 NEXT_HOP-v2 attribute. This attribute can be used to advertise next hop information when the next hop is associated with a Network Layer protocol different from the one associated with the NLRI. It can be used in conjunction with the MP_REACH_NLRI defined in [MULTI-BGP], with the NLRI field of the Update message defined in [BGP-4], or with the Update-v2 message defined in [UPDATE-v2]. In addition, the NEXT_HOP-v2 attribute provides a generic capability to advertise information (a set of TLVs) associated with the next hop.

The extensions proposed in this document are backward compatible: a router that supports the extensions can interoperate with a router that does not support them.

2. NEXT_HOP-v2 attribute (Type Code TBD)

This is an optional non-transitive attribute that can be used to advertise the address, in any Network Layer protocol regardless of the Network Layer protocol of the NLRI, that should be used as the next hop to the destinations advertised in the NLRI field of the Update message, in the MP_NLRI field of the MP_REACH_NLRI attribute, or in the NLRI field of the Update-v2 message.

The attribute is encoded as shown below:

   +---------------------------------------------------------+
   | Length of Next Hop (2 octets)                           |
   +---------------------------------------------------------+
   | Address Family Identifier (2 octets)                    |
   +---------------------------------------------------------+
   | Subsequent Address Family Identifier (2 octets)         |
   +---------------------------------------------------------+
   | Reserved (2 octets)                                     |
   +---------------------------------------------------------+
   | Length of Next Hop Network Address (1 octet)            |
   +---------------------------------------------------------+
   | Network Address of Next Hop (variable)                  |
   +---------------------------------------------------------+
   | Set of Next Hop TLVs (variable length)                  |
   +---------------------------------------------------------+
   | Reserved (1 octet)                                      |
   +---------------------------------------------------------+

Where:

   "Length of Next Hop" (2 octets):

      This field indicates the total length in octets of all the following fields related to the Next Hop (i.e. AFI, SAFI, Length of Network Address, Network Address and set of Next Hop TLVs).

   Address Family Identifier (2 octets):

      This field carries the identity of the Network Layer protocol associated with the Next Hop Network Address that follows. Presently defined values for this field are specified in [RFC1700] (see the Address Family Numbers section).

   Subsequent Address Family Identifier (2 octets):

      This field provides additional information about the type of the Next Hop Network Address that follows. Values for this field are specified in [MULTI-BGP] as well as in other documents including [BGP-LABEL], [TUNN-SAFI] and [L2VPN].

   "Reserved" (2 octets):

      This field MUST be set to 0 by the sender and ignored by the receiver.

   Length of Next Hop Network Address (1 octet):

      This field indicates the length in octets of the Next Hop Network Address field which follows.

   Next Hop Network Address (variable length):

      This field contains the Network Address of the next router on the path to the destination system(s).

   Set of Next Hop TLVs (variable length):

      This field carries zero or more TLVs associated with the Next Hop whose address is contained in the previous field. Specification of these TLVs is beyond the scope of this document.

   "Reserved" (1 octet):

      This field MUST be set to 0 by the sender and ignored by the receiver.

3. Use of BGP Capability Advertisement

A BGP speaker that uses NEXT_HOP-v2 SHOULD use the Capability Advertisement procedures defined in [BGP-CAP] to determine whether it can use the NEXT_HOP-v2 attribute for a particular combination of NLRI AFI/SAFI and next hop AFI/SAFI with a particular peer.

The fields in the Capabilities Optional Parameter are set as follows.

The Capability Code field is set to TBD (which indicates NEXT_HOP-v2 capability).

The Capability Length field is set to a variable value which is the length of the Capability Value field (which follows).

The Capability Value field has the following format:

   +-----------------------------------------------------+
   | NLRI AFI - 1 (2 octets)                             |
   +-----------------------------------------------------+
   | NLRI SAFI - 1 (2 octets)                            |
   +-----------------------------------------------------+
   | Reserved Field (2 octets)                           |
   +-----------------------------------------------------+
   | Number of Nexthop AFI/SAFIs (1 octet)               |
   +-----------------------------------------------------+
   | Nexthop AFI - 11 (2 octets)                         |
   +-----------------------------------------------------+
   | Nexthop SAFI - 11 (2 octets)                        |
   +-----------------------------------------------------+
   | .....                                               |
   +-----------------------------------------------------+
   | Nexthop AFI - 1n (2 octets)                         |
   +-----------------------------------------------------+
   | Nexthop SAFI - 1n (2 octets)                        |
   +-----------------------------------------------------+
   |                                                     |
   | .....                                               |
   +-----------------------------------------------------+
   | NLRI AFI - m (2 octets)                             |
   +-----------------------------------------------------+
   | NLRI SAFI - m (2 octets)                            |
   +-----------------------------------------------------+
   | Reserved Field (2 octets)                           |
   +-----------------------------------------------------+
   | Number of Nexthop AFI/SAFIs (1 octet)               |
   +-----------------------------------------------------+
   | Nexthop AFI - m1 (2 octets)                         |
   +-----------------------------------------------------+
   | Nexthop SAFI - m1 (2 octets)                        |
   +-----------------------------------------------------+
   | .....                                               |
   +-----------------------------------------------------+
   | Nexthop AFI - mp (2 octets)                         |
   +-----------------------------------------------------+
   | Nexthop SAFI - mp (2 octets)                        |
   +-----------------------------------------------------+

where the list of Nexthop AFI/SAFIs indicates the next hop formats supported in the NEXT_HOP-v2 attribute for each NLRI AFI/SAFI, and where:

   AFI - Address Family Identifier (16 bits).
   Values for this field are specified in [RFC1700] as well as in other documents including [L2VPN].

   SAFI - Subsequent Address Family Identifier (8 bits). Values for this field are specified in [MULTI-BGP] as well as in other documents including [BGP-LABEL], [TUNN-SAFI] and [L2VPN].

   Res - Reserved (16 bits) field. Should be set to 0 by the sender and ignored by the receiver.

To have a bidirectional exchange of routing information between a pair of BGP speakers for NLRIs of a particular AFI/SAFI (AFI1/SAFI1), using the NEXT_HOP-v2 attribute to carry a next hop of a particular AFI/SAFI (AFI2/SAFI2), each such speaker MUST advertise to the other (via the Capability Advertisement mechanism) the capability to use the NEXT_HOP-v2 attribute with the corresponding pair in the Capability Value field.

When used in conjunction with the Update-v2 message [UPDATE-v2], the NEXT_HOP-v2 capability is inferred from the Update-v2 capability and MUST NOT be advertised separately (i.e. the NEXT_HOP-v2 Capability specified above is not used and only the Update-v2 Capability defined in [UPDATE-v2] is used).

4. Operations

When:

   - a BGP speaker wants to advertise itself as the router that should be used as the next hop to the destinations advertised in the NLRI field (of the Update message or of the Update-v2 message) or in the MP_NLRI field of the MP_REACH_NLRI attribute, and wants to advertise one of its Network Layer addresses for a Network Layer protocol (AFI2/SAFI2) which is different from the Network Layer protocol (AFI1/SAFI1) of the NLRI destinations, and

   - the BGP speaker and the BGP peer have both advertised the corresponding NEXT_HOP-v2 capability for the tuple,

the BGP speaker MUST include the NEXT_HOP-v2 attribute and convey its Network Layer address inside it.

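
The capability exchange of Section 3 and the sender-side rule above can be sketched as follows. This is a minimal illustration in Python and not part of the specification. It assumes the field widths shown in the Capability Value diagram (2-octet AFI, 2-octet SAFI, 2-octet Reserved, 1-octet count), all function names are hypothetical, and the Update-v2 case (where the capability is inferred) is not covered.

```python
import struct
from typing import Dict, Set, Tuple

AfiSafi = Tuple[int, int]                  # (AFI, SAFI)
Capability = Dict[AfiSafi, Set[AfiSafi]]   # NLRI AFI/SAFI -> supported next hop AFI/SAFIs


def encode_capability_value(cap: Capability) -> bytes:
    """Serialise the Capability Value field, one block per NLRI AFI/SAFI."""
    out = bytearray()
    for (nlri_afi, nlri_safi), next_hops in sorted(cap.items()):
        # NLRI AFI, NLRI SAFI, Reserved (0), Number of Nexthop AFI/SAFIs
        out += struct.pack("!HHHB", nlri_afi, nlri_safi, 0, len(next_hops))
        for nh_afi, nh_safi in sorted(next_hops):
            out += struct.pack("!HH", nh_afi, nh_safi)
    return bytes(out)


def decode_capability_value(data: bytes) -> Capability:
    """Parse a Capability Value field; the Reserved field is ignored."""
    cap: Capability = {}
    offset = 0
    while offset < len(data):
        nlri_afi, nlri_safi, _reserved, count = struct.unpack_from("!HHHB", data, offset)
        offset += 7
        next_hops: Set[AfiSafi] = set()
        for _ in range(count):
            next_hops.add(struct.unpack_from("!HH", data, offset))
            offset += 4
        cap[(nlri_afi, nlri_safi)] = next_hops
    return cap


def must_send_next_hop_v2(local: Capability, peer: Capability,
                          nlri: AfiSafi, next_hop: AfiSafi) -> bool:
    """True when the next hop family differs from the NLRI family and
    both speakers advertised the capability for that combination."""
    if next_hop == nlri:
        return False
    return next_hop in local.get(nlri, set()) and next_hop in peer.get(nlri, set())
```

As a usage example with illustrative values only (assuming the IANA numbers AFI 1 = IPv4, AFI 2 = IPv6, SAFI 1 = unicast, SAFI 128 = labeled VPN), a speaker holding `{(1, 128): {(2, 1)}}` would call `must_send_next_hop_v2(local, peer, (1, 128), (2, 1))` and include the attribute only if the peer's decoded capability lists the same combination.
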
When:

   - a BGP speaker and its BGP peer have advertised support of the NEXT_HOP-v2 Capability for a given tuple, and

   - the BGP speaker receives a BGP advertisement for that tuple with next hop information encoded both in the MP_REACH_NLRI and in the NEXT_HOP-v2 attribute,

the BGP speaker MUST ignore the information encoded in the next hop field of the MP_REACH_NLRI and use the next hop information encoded in the NEXT_HOP-v2 attribute.

When:

   - a BGP speaker and its BGP peer have advertised support of the NEXT_HOP-v2 Capability for a given tuple, and

   - the BGP speaker receives a BGP advertisement for that tuple containing both a NEXT_HOP attribute and a NEXT_HOP-v2 attribute,

the BGP speaker MUST ignore the NEXT_HOP attribute and use the next hop information encoded in the NEXT_HOP-v2 attribute.

When:

   - a BGP speaker and its BGP peer have advertised support of the NEXT_HOP-v2 Capability for a given tuple, and

   - the BGP speaker receives a BGP advertisement for that tuple without a NEXT_HOP-v2 attribute,

the BGP speaker should signal an error notification to the peer. For peers that support [BGP-SOFT], a BGP-v4 Soft-Notification message should be sent back to the peer for the NLRI AFI/SAFI with Error Code "Update Message Error" and Error Subcode TBD, and the speaker should soft-reset the session for that AFI/SAFI. For peers that do not support [BGP-SOFT], the BGP speaker may reset the session with the peer with Error Code "Update Message Error" and Error Subcode TBD.

If a BGP speaker that does not support the NEXT_HOP-v2 Capability for a given tuple receives a BGP advertisement for that tuple with the next hop encoded in the NEXT_HOP-v2 attribute, the BGP speaker should signal an error notification to the peer.
For peers that support [BGP-SOFT], a BGP-v4 Soft-Notification message should be sent back to the peer for the NLRI AFI/SAFI with Error Code "Update Message Error" and Error Subcode TBD, and the speaker should soft-reset the session for that AFI/SAFI. For peers that do not support [BGP-SOFT], the BGP speaker may reset the session with the peer with Error Code "Update Message Error" and Error Subcode TBD.

5. Usage Examples

5.1. IPv4 VPNs over IPv6 Core

The NEXT_HOP-v2 attribute may be used for support of IPv4 VPNs over an IPv6 backbone. In this application, PE routers would advertise IPv4-VPN NLRI information in the MP_REACH_NLRI along with an IPv6 next hop in the NEXT_HOP-v2 attribute.

During BGP Capability Advertisement, the PE routers would include the following tuple in the Capability Value field of the NEXT_HOP-v2 capability:

   -

5.2. IPv4 over IPv6 Core

The NEXT_HOP-v2 attribute may be used for support of IPv4 reachability over an IPv6 core. In this application, BGP speakers would advertise IPv4 NLRI information in the MP_REACH_NLRI along with an IPv6 next hop in the NEXT_HOP-v2 attribute.

During BGP Capability Advertisement, the PE routers would include the following tuple in the Capability Value field of the NEXT_HOP-v2 capability:

   -

5.3. L2 VPNs over IPv6

The NEXT_HOP-v2 attribute could be used for support of Layer 2 VPN autodiscovery over an IPv6 core. In this application, BGP speakers would advertise L2VPN NLRI information in the MP_REACH_NLRI along with an IPv6 next hop in the NEXT_HOP-v2 attribute.

During BGP Capability Advertisement, the L2 VPN PEs would include the following tuple in the Capability Value field:

   -

5.4. IPv4 VPNs over IPv4 with multiple Tunnel Encaps

Consider an environment where multiple IPv4 tunneling methods can be used and tunnel endpoint information is distributed as per [TUNN-SAFI].
The NEXT_HOP-v2 attribute could be used to distribute IPv4 VPN reachability information along with a next hop in the Tunnel-SAFI format.

During BGP Capability Advertisement, the PEs would include the following tuple in the Capability Value field:

   -

6. Security Considerations

This document does not raise any additional security issues beyond those of BGP-4 and the Multiprotocol Extensions for BGP-4. The same security mechanisms are applicable.

7. Acknowledgments

We thank Jim Guichard, Robert Raszuk, Pedro Marques and Himanshu Shah for their comments and suggestions on this document.

References

   [BGP-4] Rekhter et al., "A Border Gateway Protocol 4 (BGP-4)", draft-ietf-idr-bgp4-21.txt, work in progress.

   [MULTI-BGP] Bates et al., "Multiprotocol Extensions for BGP-4", draft-ietf-idr-rfc2858bis-04.txt, work in progress.

   [RFC2547] Rosen et al., "BGP/MPLS VPNs", draft-ietf-l3vpn-rfc2547bis-01.txt, work in progress.

   [BGP-TUNN] Ooms et al., "Connecting IPv6 Islands across IPv4 Clouds with BGP", draft-ooms-v6ops-bgp-tunnel-00.txt, work in progress.

   [IPv6-VPN] De Clercq et al., "BGP-MPLS VPN extension for IPv6 VPN", draft-ietf-l3vpn-bgp-ipv6-01.txt, work in progress.

   [BGP-CAP] Chandra et al., "Capabilities Advertisement with BGP-4", RFC2842.

   [RFC1700] Postel et al., "Assigned Numbers", STD2, RFC1700 (see also http://www.iana.org/iana/assignments.html).

   [BGP-LABEL] Rekhter et al., "Carrying Label Information in BGP-4", RFC3107.

   [UPDATE-v2] Nalawade et al., "BGPv4 Update-v2 Message", draft-nalawade-bgp-update-v2-00.txt, work in progress.

   [TUNN-SAFI] Nalawade et al., "IPv4-Tunnel SAFI", draft-nalawade-kapoor-tunnel-safi-00.txt, work in progress.

   [L2VPN] Kompella et al., "Layer 2 VPNs Over Tunnels", draft-kompella-ppvpn-l2vpn-00.txt, work in progress.

   [BGP-SOFT] Nalawade et al., "BGP-v4 SOFT-NOTIFICATION message", draft-nalawade-bgp-soft-notify-00.txt, work in progress.

Authors' Address:

   Francois Le Faucheur
   Cisco Systems, Inc.
   Village d'Entreprise Green Side - Batiment T3
   400, Avenue de Roumanille
   06410 Biot-Sophia Antipolis
   France
   Email: flefauch@cisco.com

   Dan Tappan
   Cisco Systems, Inc.
   300 Beaver Brook Road
   Boxborough, 01719, MA
   USA
   Email: tappan@cisco.com

   Gargi Nalawade
   Cisco Systems, Inc.
   510 McCarthy Blvd
   Milpitas, 95035, CA
   USA
   Email: gargi@cisco.com