Network Working Group                                         J. Scudder
Internet-Draft                                             Cisco Systems
Expires: February 9, 2006                                 August 8, 2005


                        BGP Monitoring Protocol
                          draft-scudder-bmp-00

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on February 9, 2006.

Copyright Notice

   Copyright (C) The Internet Society (2005).

Abstract

   This document proposes a simple protocol, BMP, which can be used to
   monitor BGP sessions.  BMP is intended to provide a more convenient
   interface for obtaining route views for research purposes than the
   screen-scraping approach in common use today.  The design goals are
   to keep BMP simple, useful, easily implemented, and minimally
   service-affecting.  BMP is not suitable for use as a routing
   protocol.


Scudder                 Expires February 9, 2006                [Page 1]

Internet-Draft           BGP Monitoring Protocol             August 2005


1.  Background

   Many researchers wish to have access to the contents of routers' BGP
   RIBs.  For their purposes, the contents of the Adj-RIB-Out are not
   sufficient -- the Adj-RIBs-In are what is desired.  At present, this
   data can only be obtained through screen-scraping.

   BMP provides access to Loc-RIB and/or Adj-RIB-In data through two key
   mechanisms -- table dump and ongoing monitoring.  Table dump provides
   a way to obtain a snapshot of the Adj-RIBs-In.  The table is dumped
   in the form of standard BGP updates wrapped in a simple
   encapsulation.  Ongoing monitoring simply forwards an encapsulated
   copy of each incoming BGP PDU.  The use of standard BGP updates means
   that it should be possible for a very lightweight BMP listener to
   decapsulate the BMP feed and pass it through to an unmodified BGP
   implementation, if desired.

   BMP operates over TCP.  All options are controlled by configuration
   on the monitored router.  Communication is unidirectional, from the
   monitored router to the monitoring station.  There is no
   initialization or handshaking phase -- messages are simply sent over
   the TCP connection, according to configuration, as soon as the
   connection is established.

   BMP monitoring is subject to tail drop in the event that the
   monitoring session cannot keep pace with incoming BGP messages.  This
   is a consequence of the requirement to keep BMP simple and
   non-service-affecting.  The monitoring station may recover state
   synchronization using table dump.  Also, the probability of tail drop
   can be reduced by a variety of means discussed below.

2.  BMP Message Format

   BMP encapsulates BGP messages as follows.

2.1  Table Dump and Monitor

   The table dump and monitor messages share a common format.  Table
   dump is used for initial synchronization or periodic polling of the
   routing table.  Monitor is used for ongoing monitoring of incoming
   BGP PDUs.  Both functions are discussed in more detail in following
   sections.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |L|R|                 Reserved                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Peer Address                          |
   |                                                               |
   |                                                               |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            Peer ID                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            Peer AS                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Timestamp (seconds)                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Timestamp (microseconds)                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            PDU ID                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            BGP PDU                            |
   ...                     (variable length)                     ...

   o  Type: Set to:

      *  Type 1: Table Dump

      *  Type 2: Table Monitor

   o  The L Flag: Set to one if the message reflects the Loc-RIB (i.e.,
      if it reflects the application of inbound policy).  It is set to
      zero if the message reflects the Adj-RIB-In.

   o  The R Flag: If this flag is set, a table dump or table monitor is
      being requested, sourced from the sender.  The sender MUST set the
      Peer Address, Peer ID, and Peer AS fields to its own in a request
      message.

   o  Reserved: Set to 0.

   o  Peer Address: The remote IP address associated with the TCP
      session over which the encapsulated PDU was received.  If an IPv4
      address is carried in this field, it is padded with zeros in the
      most significant (leftmost) bits.  During a table dump, Peer
      Address MUST be set to zero.

   o  Peer ID: The BGP Identifier of the peer from which the
      encapsulated PDU was received.  During a table dump, Peer ID MUST
      be set to zero.

   o  Peer AS: The Autonomous System number of the peer from which the
      encapsulated PDU was received.  If a 16-bit AS number is stored in
      this field, it should be padded with 16 zero bits in the most
      significant positions; the field can also hold a 32-bit AS number
      [AS-4BYTES].

   o  Timestamp: The time when the encapsulated PDU was received,
      expressed in seconds and microseconds since midnight (0 hour),
      January 1, 1970.  If zero, the time is unavailable.

   o  PDU ID: A nonzero identifier associated with the encapsulated BGP
      PDU.  If the implementation does not support identifiers, the
      value MUST be sent as zero.  Otherwise, a monotonically
      incrementing (modulo 2^32) value must be used.  The PDU ID MAY
      wrap; it is the responsibility of the monitoring station to detect
      and handle wraps.

   Following the header is a BGP PDU.  The length of the PDU can be
   determined by parsing it in the normal fashion as specified in [BGP].

2.2  Tail Drop Notification

   This message is used to indicate that a tail drop has occurred,
   probably because a buffer overflowed.  (The tail drop message should
   not be sent if no messages were dropped!)  Since a tail drop can
   result in loss of synchronization between the local router and the
   monitoring station, which can only be recovered through a table dump,
   the local router MAY terminate the monitoring session following the
   tail drop notification.  It is anticipated that the monitoring
   station will notice that this has happened and take appropriate
   recovery action.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |   Reserved    |       Messages Dropped        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   o  Type: Set to 3.

   o  Messages Dropped: Unsigned integer; provides a count of how many
      messages were dropped.  A value of zero indicates that no count is
      available.

2.3  Peer Down Notification

   This message is used to indicate that a peering session was
   terminated.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |    Reason     |  Error Code   | Error Subcode |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Peer Address                          |
   |                                                               |
   |                                                               |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            Peer ID                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            Peer AS                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Timestamp (seconds)                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Timestamp (microseconds)                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   o  Type: Set to 4.

   o  Reason: A code used to indicate why the session was closed.
      Defined values, and the associated error code and subcode, are:

      *  Reason 1: The local system closed the session.  The error code
         and error subcode indicate the reason for the termination.  The
         codes to be used correspond to those which would be sent in a
         notification message to the peer and are defined in the base
         BGP specification [BGP].

      *  Reason 2: The local system closed the session.  No notification
         message was sent.  The error code and error subcode should be
         zero.

      *  Reason 3: The remote system closed the session with a
         notification message.  The notification message (encapsulated
         in a monitor message) should precede this peer down
         notification message.  The error code and error subcode should
         be zero.

      *  Reason 4: The remote system closed the session without a
         notification message.  The error code and error subcode should
         be zero.

   o  Peer Address: The remote IP address associated with the TCP
      session over which the encapsulated PDU was received.  If an IPv4
      address is carried in this field, it is padded with zeros in the
      most significant (leftmost) bits.  During a table dump, Peer
      Address MUST be set to zero.

   o  Peer ID: The BGP Identifier of the peer from which the
      encapsulated PDU was received.
      During a table dump, Peer ID MUST be set to zero.

   o  Peer AS: The Autonomous System number of the peer from which the
      encapsulated PDU was received.  If a 16-bit AS number is stored in
      this field, it should be padded with 16 zero bits in the most
      significant positions; the field can also hold a 32-bit AS number
      [AS-4BYTES].

3.  Table Dump

   Table dump is used to provide a snapshot of the Adj-RIB-In.  It does
   so by sending all routes stored in the Adj-RIB-In using standard BGP
   Update messages, each encapsulated in a BMP Table Dump message.

   For each prefix in the Adj-RIBs-In, the active route for that prefix
   ("bestpath") MUST be the last route for that prefix which is
   transmitted.  There is no other requirement on the ordering of
   messages in the table dump.  For example, there is no requirement
   that all updates for a given prefix occur in strict succession.

   Note that the semantics of table dump differ from those of normal BGP
   in that successive updates for a given prefix do not represent
   implicit withdraws.  In fact, it does not make sense for a BMP table
   dump update to contain a non-empty withdrawn routes field, and such
   updates should not be sent as part of a table dump.  Since table dump
   has no withdraw or replacement semantics, a table dump should not
   include a given route more than once.

   Depending on the implementation or configuration, it may only be
   possible to send the Loc-RIB instead of the Adj-RIB-In.  This is
   because a BGP implementation may not store, for example, routes which
   have been filtered out by policy.  If this is the case, the
   implementation may send the Loc-RIB instead.

   If the implementation is able to provide information about when
   routes were received, it MAY provide such information in the BMP
   timestamp field.  Otherwise, the BMP timestamp field MUST be set to
   zero, indicating that time is not available.

   Following the final message in a table dump, the End-of-RIB marker
   [BGP-GR] MUST be transmitted.
   The EoR marker is encapsulated within a BMP header, just as with all
   other table dump messages.

4.  Ongoing Monitoring

   Ongoing monitoring is accomplished by encapsulating each incoming BGP
   PDU in a monitor message and forwarding it to the monitoring station.
   Ordering must be maintained among PDUs received on any given session.
   There is no requirement to maintain any particular ordering among
   PDUs received on different sessions.

   All BGP PDUs must be encapsulated and forwarded.  This includes
   keepalives as well as malformed PDUs (an encapsulated malformed PDU
   will typically be followed by a peer down message, because the local
   system will normally close the session on receiving one).

   If it is not possible to forward a BGP PDU, a tail drop notification
   message must be sent.  The session MAY subsequently be dropped, or
   forwarding of PDUs MAY continue when resources become available.

5.  Implementation Considerations

   The frequency of incoming updates will typically be much higher
   during initial convergence after BGP startup.  For this reason, to
   speed initial convergence, implementations may wish to provide the
   option to enable BMP only after BGP has converged.

6.  Using BMP

   Use of BMP to obtain a single table dump is self-explanatory.  It is
   worth devoting a few words to ongoing monitoring, however.  A
   baseline state snapshot will usually be required for any given
   monitoring application.  The suggested sequence is as follows:

   o  First, open a monitoring session.  It may be worth waiting until a
      PDU has been received over the monitoring session before
      proceeding, as a way of verifying that the session is "live".
      Since BGP keepalive messages are forwarded over the monitoring
      session just as all other BGP messages are, it is guaranteed that
      even on a very stable network, a message will be received within a
      short time.

   o  Second, open a table dump over a second session.
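
   The sequence above can be sketched as follows.  This is a minimal
   illustration in Python and not part of the protocol: the function
   name replay_order and the (session, pdu) event tuples are
   hypothetical, and the sketch assumes that the monitoring station
   holds back monitor PDUs until the dump session delivers the
   End-of-RIB marker.  Only the IPv4 unicast form of the marker (a
   minimum-length BGP UPDATE) is recognized here.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, Tuple

# IPv4 unicast End-of-RIB marker: a minimum-length BGP UPDATE, i.e. a
# 16-byte all-ones marker, length 23, type 2 (UPDATE), zero withdrawn
# routes length and zero total path attribute length.
END_OF_RIB = b"\xff" * 16 + b"\x00\x17\x02" + b"\x00\x00" + b"\x00\x00"

# Hypothetical event shape: ("dump" or "monitor", raw BGP PDU with the
# BMP header already removed), listed in arrival order.
Event = Tuple[str, bytes]


def replay_order(events: Iterable[Event]) -> Iterator[Event]:
    """Yield dump PDUs first, then the monitor PDUs that arrived meanwhile.

    Monitor PDUs seen before the dump's End-of-RIB marker are buffered
    and released, in arrival order, once the marker has been yielded.
    Monitor PDUs seen after the marker pass straight through.  If the
    input ends before the marker is seen, buffered PDUs are not yielded.
    """
    held: Deque[bytes] = deque()
    dump_done = False
    for session, pdu in events:
        if session == "monitor" and not dump_done:
            held.append(pdu)  # baseline not complete yet; hold it back
            continue
        yield session, pdu
        if session == "dump" and pdu == END_OF_RIB:
            dump_done = True
            while held:
                yield "monitor", held.popleft()


if __name__ == "__main__":
    # Placeholder byte strings stand in for real BGP PDUs.
    arrivals = [
        ("monitor", b"keepalive-1"),
        ("dump", b"update-1"),
        ("monitor", b"update-2"),
        ("dump", END_OF_RIB),
        ("monitor", b"keepalive-2"),
    ]
    for session, pdu in replay_order(arrivals):
        print(session, len(pdu))
```

   Buffering in memory keeps the sketch short; a monitoring station
   facing a large table dump might need to bound or spill the held
   queue.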

   Conceptually, the PDUs from the monitoring session come after those
   received on the table dump session.  One implementation option would
   be to buffer the monitoring session PDUs until the table dump has
   completed, then process first the dump PDUs and then the monitoring
   PDUs.

7.  IANA Considerations

   This document defines four TLV types for transferring BGP messages
   between cooperating systems:

   o  Type 1: Table Dump

   o  Type 2: Table Monitor

   o  Type 3: Tail Drop

   o  Type 4: Peer Down

   Type values 5 through 16575 MUST be assigned using the "IETF
   Consensus" policy defined in [RFC2434].

8.  Security Considerations

   This document defines a mechanism to obtain a full dump or provide
   continuous monitoring of a BGP speaker's local BGP table, including
   received BGP messages.  This capability could allow an outside party
   to obtain information not otherwise obtainable, and also open another
   avenue of attack against a BGP speaker.

   Implementations of this protocol MUST require manual configuration of
   the monitored and monitoring devices.

   Users of this protocol MAY use some type of secure transmission
   mechanism, such as IPsec [RFC2406], to transmit this data.

9.  References

9.1  Normative References

   [BGP]      Rekhter, Y. and T. Li, "A Border Gateway Protocol 4
              (BGP-4)", RFC 1771, March 1995.

   [BGP-GR]   Sangli, S., Rekhter, Y., Fernando, R., Scudder, J., and
              E. Chen, "Graceful Restart Mechanism for BGP",
              draft-ietf-idr-restart-10 (work in progress), June 2004.

   [RFC2434]  Narten, T. and H. Alvestrand, "Guidelines for Writing an
              IANA Considerations Section in RFCs", BCP 26, RFC 2434,
              October 1998.

9.2  Informative References

   [RFC2406]  Kent, S. and R. Atkinson, "IP Encapsulating Security
              Payload (ESP)", RFC 2406, November 1998.

Author's Address

   John Scudder
   Cisco Systems

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

   Copyright (C) The Internet Society (2005).
   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.