Network working group R.Wang Internet Draft Huawei Technologies Intended status: Standards Track January 4, 2011 Expires: July 2011 IMP Protocol draft-wang-igp-monitor-protocol-00.txt Status of this Memo This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet- Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt. The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. This Internet-Draft will expire on July 4, 2011. Copyright Notice Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Wang, et al. Expires July 4, 2011 [Page 1] Internet-Draft I January 2011 Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Abstract This document proposes a protocol named IMP, which can be used for monitoring IGP routing protocol. Compared to the widely used packets catch and route listening method, this protocol is more convenient for obtaining routing information and capturing routing changes, can well satisfy network routing monitoring and research purpose. Conventions used in this document The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC-2119 [RFC2119]. Table of Contents 1. Introduction ................................................ 2 2. Problem statement ........................................... 3 3. IMP Protocol ................................................ 5 3.1. IMP Session ............................................ 5 3.2. Route Report ........................................... 6 3.3. Statistics Report....................................... 7 3.4. Multiple routing instances support ...................... 7 4. Message Format .............................................. 7 4.1. IMP Header ............................................. 7 4.2. Routing Instance Header................................. 8 4.3. Open Message ........................................... 9 4.4. Keep Alive Message...................................... 9 4.5. Route Report Message.................................... 9 4.6. Statistics Report Message.............................. 10 5. Security Considerations..................................... 11 6. IANA Considerations ........................................ 11 7. Acknowledgments ............................................ 11 8. References ................................................. 11 8.1. Normative References................................... 11 8.2. Informative References................................. 11 Authors' Addresses ............................................ 11 1. Introduction Routing protocol is used to direct IP packet forwarding. The quality and usability of IP network is depend on routing protocol's Wang, et al. Expires July 4, 2011 [Page 2] Internet-Draft I January 2011 correctness, so, monitor the status of routing protocol, diagnose routing fault and recovery fault as fast as possible are important to IP network O&M. When link has fault or change, routing changing can spread automatically till routing convergence to a new usable path. When a new router linked to existing network, router's information and new import routing information can spread to whole network automatically, at the same time, the new router can get other router's routing information also. These features simplifies configuration and deployment of network extraordinarily, but meanwhile, introduce the uncertainty of service flow path, make operator hard to know the route of service , they can't make sure whether IP packet is forwarding or not, all of this make network fault locate more and more difficulty. 2. Problem statement Routing abnormity monitoring and service path detecting are hot topic in route management area. As for routing abnormity monitoring, there is no any router and SNMP-based management system can well meet customer's requirement. As for path detecting, usually use ping and traceroute tools. Ping can detect the connectivity from one router to another given IP address. Traceroute can detect the hop by hop information of given IP address peer. These tools have some disadvantages such as, 1 May get invalid path at route convergence period 2 Can only support fault locating on spot, but can't support off-line fault diagnosis, and what's more, can't provide history change information of service path, so it is hard to locate service's instantaneous fault. 3 Can't discover ECMP paths, so the operator has to take lots of time to analysis routing table for estimating service off-load status. 4 Help to locate fault manually ,but it is low-efficiency, can't well satisfy automatic fault location requirement. Wang, et al. Expires July 4, 2011 [Page 3] Internet-Draft I January 2011 There has some works done on route monitoring and path discovering now, aim to overcome these disadvantages, typical solution is using collector to establish OSPF/ISIS routing neighborhood to deployed routers. It utilize router's flooding mechanism to collect route information, then can detect routing abnormity by analysing routing information actively, and calculate required IP path by simulating router's path calculation process. Using route analysis system to collect route update message is use for routing monitor and path discovery, but the use of listener is also have some disadvantages: 1 As for establishing route neighborhood between listener and router ,the router can't tell the difference between listener peer and router peer, so there is no chance for the listened router to take suitable priority level of processing for its different type of peers. Another question is security issue, incorrect listener may import unexpected route update to network. 2 When network has many Areas, listener has to establish neighborhood to each OSPF/ISIS route Ares at least one, because OSPF/ISIS neighborhood can't across other routes, it needs deploy one listener for every Area. The management cost and hardware cost is high because it needs deploy so many listeners. Another solution is using IP Tunnel cross IP network to establish remote route neighbor session, but its configuration cost and maintenance cost is high, and besides, during route convergence process , it have chance to collect insufficient messages due to path dis-connectivity. 3 Needs hard listener device, and occupy precise router's physical ports. When listen VRF route instance, in extreme situation, PE router may needs support port to communicate with listener for each VRF. But in Metro and Access network , there is many small Wang, et al. Expires July 4, 2011 [Page 4] Internet-Draft I January 2011 and insulated route Areas, because remote ISIS/OSPF peer is not always available, so needs more listener device, it is make the hardware cost more high. 4 Complicated and discommodious deployment. Need go to scene to deploy listener and configure route peer in related routers. When needs to adjust the position of listener, it needs repeat the process above. The IGP monitoring protocol is a simplified and effective route monitoring solution, it is include two parts: 1 Routing collection: implement by router and monitoring system, the router packed LSDB and the update information as OSPF/ISIS protocol message format and send to listener system. 2 Routing analysis: implement on monitoring system. Receive route update message from routers, detect route abnormity, calculate IP path. This solution can satisfied route message collection and monitoring requirement excellently, and not import performance effect on router's other task. 3. IMP Protocol 3.1. IMP Session IMP is a connection-oriented protocol, it is composed of client and server. Server is deployed in router; Client is deployed in monitoring system. IMP message transport is based on TCP. Router listened at a given port, monitoring system establish connection to router via this port. After TCP connection established, Router send an open message to monitoring system, the message will take a session-keep time. When Wang, et al. Expires July 4, 2011 [Page 5] Internet-Draft I January 2011 Monitoring system receives open message, it check the parameters in this message, if this open message is acceptable, monitoring system send a keep alive message to router to acknowledge open message, hereto, the connection has established, then can send Route Monitoring, Statistic Report and keep alive messages. Keep alive message is sent at 1/3(but can't less than one second) of hold-time, if negotiated hold-time is ZERO, it means needs not to send keep alive message. If there is any abnormity occurs, it just needs to simply drop the connection. 3.2. Route Report Route report is aimed to obtain router's LSDB data, include initial synchronization message and real-time update. After IMP connection is established, router initiates the first initial synchronization by send a route report message which encapsulates a snapshot of LSDB to monitoring system. Meanwhile, router start real-time listen, when LSDB have change, it sends route report message to monitoring system immediately. Router need to judge whether send route report message to monitoring system or not, the principle is the same as send link state update message to IGP neighbor. The difference between them is that the router needs not to send period audit check messages(such as ISIS CSNP) and period link state updates (that is, the link state advertisements with only sequence number and remaining lifetime change) to IMP peer. i.e., as for LSA/LSP that router sends to its IGP neighbors, if there has route message change, route send a copy to IMP peer as IMP message format. Route report is not a real-time copy of LSA/LSP message. As for IMP route message, packet sending many be delayed due to routing busy, but at the condition of not effect route convergence and route calculation, routes need to notify monitoring system as soon as possible. So, these is delay between route receive LSA/LSP and create and send to monitoring system. If the link state changed again during this period, the router should try to transfer every time link state changes to the monitoring system, but also allows the router to send directly the final state to the monitoring system. Wang, et al. Expires July 4, 2011 [Page 6] Internet-Draft I January 2011 3.3. Statistics Report The purpose of statistics report is to monitor interesting events occur on router or some concerned packet counters. Following statistic types are pre-defined: Number of the received link state advertisements with invalid check sum, Number of duplicate link state advertisements received from the same peer. Statistic types can be expanded as needed. The IMP implementation can both report statistics periodically or immediately. It is not limited. However, configuration control of the timer and/or threshold values should be provided. 3.4. Multiple routing instances support Routers may support multiple routing instances. LSDB between routing instances are independent, typically example VRF. One IMP instance can simultaneously monitor multiple routing instances. IMP message carries an identifier of the routing instance to separate routing information that belong to different routing instances. 4. Message Format 4.1. IMP Header IMP header appears in all IMP messages. The rest of the data depends on the "Message Type" field in the IMP header. 0 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Version | Message Length | Message Type | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Version (1 byte) IMP protocol version, the current version is '1'. Message Length (2 bytes): Length of the whole message in bytes, including headers and data. Message Type (1 byte): Type of the IMP message. The IMP client MUST ignore unrecognized message types when received. Type = 0: Open Message (OPEN) Type = 1: Keep Alive Message (KA) Wang, et al. Expires July 4, 2011 [Page 7] Internet-Draft I January 2011 Type = 2: Route Report (RR) Type = 3: Statistics Report (SR) 4.2. Routing Instance Header The routing instance header follows the IMP header for most IMP messages except ''Open'' and ''Keep Alive'' messages. The rest of the data depends on the ''Message Type'' field in the IMP header and the ''Protocol Type'' field in the routing instance header. 0 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Protocol Type | Instance Type | Dist. Length | Reserved | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Instance Distinguisher | ~ ~ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Routing Identifier | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Timestamp (seconds) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Timestamp (microseconds) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Protocol Type (1 byte): Indicates the routing protocol type of the data following the routing instance header in the route report messages and statistics report messages. - Type = 0: OSPF - Type = 1: ISIS Instance Type (1 byte): Routers may have multiple routing instances (example L3VPNs). This field is used to indicate the type of the routing instance which the routing data following the routing instance header belongs to. - Type = 0: Global Instance - Type = 1: L3VPN Instance Distinguisher Length(1 byte) Length of instance distinguisher in bytes Wang, et al. Expires July 4, 2011 [Page 8] Internet-Draft I January 2011 Instance Distinguisher: This field is used to distinguish routing instances from each other. For a ''Global Instance'', this field is set to null, correspondingly the ''Distinguisher Length'' field is set to zero. For a ''L3VPN Instance'', it is set to route distinguisher of the particular L3VPN instance that the routing data belongs to. Routing Identifier 4 bytes Identifier used by routing protocol to separate different routing domains in router. It can be router ID for OSPF, or system ID for ISIS. Timestamp: The time when the routes were received by the router, it is used for the monitoring system to know when the route change happened. This field expressed in seconds and microseconds since midnight (zero hour), January 1, 1970 (UTC). If zero, the time is unavailable. Precision of the timestamp depends on implementation. 4.3. Open Message Open message is the first message after the session is established. Following the IMP header is a 2-byte field that indicates the hold time of the session in seconds. 0 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Hold Time | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Hold Time 2 bytes This field is used to indicate the expected session hold time in seconds, the receiver either accepts the time value, or reject the connection. 4.4. Keep Alive Message Keep alive messages have only the IMP header, does not contain any other data. If the ''Hold Time'' in the open message is set to zero, keep alive messages are not needed. 4.5. Route Report Message Route Report messages are used for initial synchronization and real- time monitoring of LSDB. Wang, et al. Expires July 4, 2011 [Page 9] Internet-Draft I January 2011 Following the IMP header and routing instance header is an OSPF or ISIS PDU, depends on the ''Protocol Type'' field in the routing instance header. 4.6. Statistics Report Message This message is used to help the monitoring system to observe interesting events that happen on the router. Following the IMP header and routing instance header is a 4-byte field that indicates the number of counters in the statistics report message where each counter is encoded as a TLV. 0 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Statistics Count | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Each counter is encoded as following 0 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Statistic Type | Statistic Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Statistic Data | ~ ~ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Statistic Count (4 bytes): Indicates the number of counters in the statistics report message. Statistic Type (2 bytes): Indicates the type of the statistic carried in the ''Statistic Data'' field. Currently following types are defined: - Statistic Type = 0: Number of link state advertisements invalidated due to bad check sum. - Statistic Type = 1: Number of duplicate link state advertisements. Statistic Length (2 bytes): Indicates the length of the ''Statistic Data'' field. Wang, et al. Expires July 4, 2011 [Page 10] Internet-Draft I January 2011 This message is optional. The IMP implementation MUST ignore the unrecognized statistic type. 5. Security Considerations TBD. 6. IANA Considerations TBD. 7. Acknowledgments 8. References 8.1. Normative References [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. 8.2. Informative References TBA Authors' Addresses Ruihong Wang Huawei Technologies Co., Ltd. Email: wrhong@huawei.com Wei Meng Huawei Technologies Co., Ltd. Email: mengwei00107932@huawei.com Liang Yu Huawei Technologies Co., Ltd. Email: liang.yu@huawei.com Wang, et al. Expires July 4, 2011 [Page 11]