Working Group Name                                         Sanjay Wadhwa
Internet Draft                                          Juniper Networks
Expires: January 04, 2007                                 Derek Harkness
                                                        Juniper Networks
                                                             Thomas Haag
                                                               T-Systems
                                                           July 04, 2006

          Control Plane Graceful Restart extensions for ANCP
               draft-wadhwa-ancp-graceful-restart-00.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on January 04, 2007.

Copyright Notice

   Copyright (C) The Internet Society (2006). All Rights Reserved.

Wadhwa et.al            Expires January 4, 2007                [Page 1]
Internet-Draft  draft-wadhwa-ancp-graceful-restart-00         July 2006

Abstract

   This document describes proposed extensions to ANCP (Access Node
   Control Protocol) to support graceful state resynchronization
   between the NAS (network access server) and the AN (access node)
   upon control plane restart. Base ANCP, a dedicated control protocol
   between a NAS and an AN, and its various use cases are defined in
   [1].

Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119.

Table of Contents

   1 Specification Requirements
   2 Introduction
   3 Graceful Restart Mechanism
   4 Graceful Restart Procedures
      4.1 Restarting Element
      4.2 ANCP Neighbor of Restarting Element
   5 IANA Considerations
   6 Security Considerations
   7 References
   Author's Addresses
   Intellectual Property Statement
   Disclaimer of Validity
   Copyright Statement
   Acknowledgment

1 Specification Requirements

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119.

2 Introduction

   ANCP typically runs on the control plane of the NAS (network access
   server) and the AN (access node). ANCP supports various use cases,
   as defined in [1]. The AN can dynamically inform the NAS of the
   upstream and downstream rates and other attributes of the DSL lines.
   The NAS uses some of these attributes to control the forwarding
   behavior of the subscriber's traffic. For example, the NAS uses the
   learnt downstream DSL train rate of a DSL line to shape the traffic
   that will traverse that line, by adjusting its hierarchical QoS
   scheduler. As another example, the NAS can use the dynamically
   learnt data-link type and protocol encapsulation overhead on the
   access loop to shape the subscriber's traffic appropriately. In
   another use case, the NAS controls line parameters in the access
   node.

   ANCP typically runs on the control plane of the NAS and the AN. On
   most current NAS platforms, the forwarding plane is decoupled from
   the control plane. In such systems, the forwarding plane can
   continue to operate even if the control plane resets. ANCP may
   restart either because of a planned control plane reset on the NAS
   or because of a control plane reset caused by a software or hardware
   failure on the NAS. In this case the ANCP adjacency is lost, but not
   all of the data provided by the AN to the NAS is necessarily lost.
   Furthermore, the subscriber sessions remain active independent of
   the ANCP adjacency state.

   Internal AN errors may cause the ANCP adjacency to be lost without a
   hard reset of the AN and the associated loss of line state in the AN
   control plane. This may be due to a similar separation of control
   and data planes in the AN or, for example, to the failure of an AN
   uplink.

   Furthermore, a failure or packet loss on any intermediate node or
   network between the AN and the NAS can cause the ANCP adjacency to
   be lost.

   All of the above will cause the ANCP adjacency to be reset. However,
   the parameters learnt via ANCP prior to the restart and applied to
   control the forwarding behavior remain applicable and in effect
   during and after the control plane restart.

   This draft proposes extensions to ANCP to implement a graceful state
   resynchronization between the NAS and the AN upon control plane
   restart. This mechanism ensures that the control plane restart and
   the subsequent state resynchronization have no impact on forwarding
   and QoS control on the NAS. The default behavior on a loss of ANCP
   adjacency without the graceful restart extensions is to clean up any
   dynamic information learnt from the ANCP neighbor and to revert to
   default values for any such information.

3 Graceful Restart Mechanism

   When the control plane on either the NAS or the AN restarts, the
   state information between the NAS and the AN needs to be
   resynchronized. This may include updating existing state on the
   restarting entity to the latest information known to the ANCP
   neighbor that originated it. Any state on the restarting entity that
   was learnt from its ANCP neighbor prior to the restart and is no
   longer applicable after the restart needs to be garbage-collected.
   Resources associated with such state need to be reclaimed.

   The ANCP neighbor supporting graceful restart SHOULD maintain and
   use state learnt from the restarting neighbor until the restart is
   complete and the state has been resynchronized. However, the
   duration for which this state SHOULD be maintained on the ANCP
   neighbor after the ANCP adjacency between the restarting entity and
   the ANCP neighbor is lost is controlled by the restarting neighbor.
   The restarting neighbor SHOULD advertise the duration, in seconds,
   in its graceful restart capability announcement. This duration is
   referred to as "restart time" and signifies the amount of time an
   ANCP neighbor should wait for session re-establishment and state
   resynchronization with the restarting entity after the ANCP
   adjacency has timed out. The restart time advertised by the
   restarting entity SHOULD be configurable.

4 Graceful Restart Procedures

4.1 Restarting Element

   Typically, an ANCP restarting element supporting control plane
   graceful restart will use the state learnt from its ANCP neighbor in
   the forwarding plane during and after the restart. Immediately after
   restart, the restarting element SHOULD mark all the state learnt
   from its ANCP neighbor prior to its restart as "stale". On
   re-establishing ANCP adjacency with each of its ANCP neighbors (e.g.
   a single NAS can have multiple ANCP adjacencies, as it can serve
   multiple partitions on an AN or multiple physical ANs), the
   restarting element will relearn state from its neighbors. The
   restarting element SHOULD update the learnt state and SHOULD unmark
   the state as being stale. Each ANCP neighbor SHOULD generate an
   "END-OF-STATE" adjacency message as soon as all existing information
   on the ANCP neighbor has been replayed to the restarting entity.
   When the restarting entity has received "END-OF-STATE" from an ANCP
   neighbor, it SHOULD remove any stale state that is relevant to that
   neighbor from both its control and forwarding planes, and SHOULD
   reclaim the associated resources.

   The restarting element SHOULD keep track of its established
   adjacencies in non-volatile storage prior to restart. It SHOULD also
   note the graceful restart capability of each established adjacency.
   The restart is deemed complete when the restarting element has
   received and handled "END-OF-STATE" from each of these adjacencies.
   The restarting element SHOULD wait for session re-establishment and
   state resynchronization from these adjacencies up to a maximum
   duration. This duration is referred to as "wait time". An absolute
   value for wait time in seconds SHOULD be configurable on the
   restarting element. At the expiry of "wait time", the restarting
   element SHOULD remove all state that is still marked stale. Also, at
   the expiry of "wait time", graceful restart for any partially
   synchronized adjacencies SHOULD be aborted. All the state for
   partially synchronized adjacencies SHOULD be removed, and these
   adjacencies SHOULD be reinitialized.

   All ANCP elements capable of restarting gracefully SHOULD advertise
   the "Graceful Restart" capability TLV in their adjacency messages.
   The capability SHOULD also contain a suggested value for the
   "restart time".
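
   The restarting-element procedure described above (mark state stale,
   relearn, purge on "END-OF-STATE", clean up at "wait time" expiry)
   can be illustrated with a minimal Python sketch. The class and
   method names, the keying of state by (neighbor, line) pairs, and the
   use of an in-memory dictionary to stand in for state retained across
   the restart are all assumptions made for illustration; none of them
   is defined by this draft or by [1].

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One piece of state learnt from an ANCP neighbor (hypothetical shape)."""
    value: dict
    stale: bool = False


@dataclass
class RestartingElement:
    # (neighbor, line_id) -> Entry; stands in for state kept across restart
    state: dict = field(default_factory=dict)
    # adjacencies recorded before restart that have not yet sent END-OF-STATE
    pending: set = field(default_factory=set)

    def on_restart(self, saved_neighbors):
        """Mark all previously learnt state stale; expect every saved neighbor."""
        for entry in self.state.values():
            entry.stale = True
        self.pending = set(saved_neighbors)

    def on_update(self, neighbor, line_id, value):
        """Relearnt information replaces the old value and clears the stale mark."""
        self.state[(neighbor, line_id)] = Entry(value, stale=False)

    def on_end_of_state(self, neighbor):
        """Purge this neighbor's remaining stale state; True if restart is complete."""
        for key in [k for k, e in self.state.items() if k[0] == neighbor and e.stale]:
            del self.state[key]
        self.pending.discard(neighbor)
        return not self.pending

    def on_wait_time_expiry(self):
        """Drop stale state and all state of partially synchronized adjacencies.

        Returns the adjacencies that should be reinitialized.
        """
        aborted = set(self.pending)
        for key in [k for k, e in self.state.items() if e.stale or k[0] in aborted]:
            del self.state[key]
        self.pending.clear()
        return aborted
```

   In this sketch, on_end_of_state() reports completion only once every
   adjacency recorded before the restart has sent "END-OF-STATE", which
   mirrors the completion condition given above.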

   The general format for a capability TLV and the general handling of
   ANCP capability TLVs are defined in [1]. The "Graceful Restart"
   capability is defined as follows.

      Capability Type:    Graceful Restart = 0x05
      Length (in bytes):  4
      Capability Data:    Restart Time (in seconds)

4.2 ANCP Neighbor of Restarting Element

   An ANCP element capable of supporting graceful restart of a
   restarting element SHOULD maintain and use information learnt from
   an ANCP neighbor that has advertised the "Graceful Restart"
   capability, even after the ANCP adjacency with the neighbor has been
   lost. The information SHOULD be maintained and used beyond adjacency
   loss only up to the "restart time" advertised in the "Graceful
   Restart" capability TLV. On expiry of this time, any state learnt
   from the ANCP neighbor SHOULD be removed. This SHOULD result in any
   state created from information announced by the ANCP neighbor
   reverting to default values.

   An ANCP element capable of supporting graceful restart of a
   restarting element SHOULD generate an adjacency message with code
   type "END-OF-STATE" after all relevant existing information has been
   replayed to the restarting element. The code field in the adjacency
   message, as defined in section 5.3 of [1], is expanded to include
   END-OF-STATE (5) as a new type. The adjacency message format as
   defined in [1] is as follows.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Ver  |  Sub  | Message Type  |     Timer     |M|     Code    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Sender Name                          |
   +                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               |                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
   |                         Receiver Name                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Sender Port                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Receiver Port                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | PType | PFlag |                Sender Instance                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Partition ID  |               Receiver Instance               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Tech Type   |   # of TLVs   |         Total Length          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                        Capability TLVs                        ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The code field specifies the function of the message. With the
   expansion of the codes, the set of codes that can be sent in
   adjacency messages is as follows:

      SYN:           Code = 1
      SYNACK:        Code = 2
      ACK:           Code = 3
      RSTACK:        Code = 4
      END-OF-STATE:  Code = 5

   The "END-OF-STATE" notification is also useful if the ANCP neighbor
   wants to perform a non-disruptive, forced state resynchronization
   operation without bringing down the ANCP adjacency.

   An ANCP element capable of supporting graceful restart of a
   restarting neighbor SHOULD advertise the "Graceful Restart Helper"
   capability TLV in its adjacency messages. This capability TLV is
   defined as follows:

      Capability Type:    Graceful Restart Helper = 0x06
      Length (in bytes):  0
      Capability Data:    NULL

   An ANCP element can be a "graceful restart helper" without itself
   being graceful restart capable, and vice versa.
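
   As an illustration of the values defined above, the following Python
   sketch encodes the two capability TLVs and the octet that carries
   the M flag and the code. The capability types, lengths, and code
   values are taken from this document. The TLV header layout (a 16-bit
   type followed by a 16-bit length, in network byte order) is an
   assumption, since the actual capability TLV format is defined in [1]
   and is not reproduced here. All function and constant names are
   hypothetical.

```python
import struct
from enum import IntEnum


class AdjacencyCode(IntEnum):
    """Adjacency message codes, including the new END-OF-STATE code."""
    SYN = 1
    SYNACK = 2
    ACK = 3
    RSTACK = 4
    END_OF_STATE = 5


CAP_GRACEFUL_RESTART = 0x05
CAP_GRACEFUL_RESTART_HELPER = 0x06


def encode_capability(cap_type: int, data: bytes = b"") -> bytes:
    # ASSUMPTION: 16-bit type, 16-bit length (bytes of data), then the data,
    # all in network byte order. Check [1] for the authoritative layout.
    return struct.pack("!HH", cap_type, len(data)) + data


def graceful_restart_tlv(restart_time_s: int) -> bytes:
    """Graceful Restart capability: 4 bytes of data holding the restart time."""
    return encode_capability(CAP_GRACEFUL_RESTART, struct.pack("!I", restart_time_s))


def helper_tlv() -> bytes:
    """Graceful Restart Helper capability: zero-length data."""
    return encode_capability(CAP_GRACEFUL_RESTART_HELPER)


def m_and_code_octet(m_flag: bool, code: AdjacencyCode) -> int:
    """Pack the 1-bit M flag and the 7-bit code as drawn in the message format."""
    return (int(m_flag) << 7) | (code & 0x7F)
```

   For example, graceful_restart_tlv(120) would advertise a restart
   time of 120 seconds under the assumed header layout.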

   The normal ANCP adjacency negotiation mechanism, which results in
   negotiation of the least common capability set, is relaxed for
   graceful restart, since the graceful restart procedures can handle
   asymmetric graceful restart capabilities.

5 IANA Considerations

   New capability types and a new code type for the adjacency message
   will need to be reserved.

6 Security Considerations

   These extensions for graceful restart do not require any additional
   security considerations beyond the general security considerations
   for ANCP as specified in [1].

7 References

   [1] Wadhwa, S. et al., "GSMP extensions for Access Node Control",
       draft-wadhwa-gsmp-l2control-configuration-02.txt, October 2006.

Author's Addresses

   Sanjay Wadhwa
   Juniper Networks
   10 Technology Park Drive
   Westford, MA 01886
   Email: swadhwa@juniper.net

   Derek Harkness
   Juniper Networks
   Email: dharkness@juniper.net

   Thomas Haag
   T-Systems
   Email: thomas.haag@t-systems.com

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed
   to pertain to the implementation or use of the technology described
   in this document or the extent to which any license under such
   rights might or might not be available; nor does it represent that
   it has made any independent effort to identify any such rights.
   Information on the procedures with respect to rights in RFC
   documents can be found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use
   of such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository
   at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard. Please address the information to the IETF at
   ietf-ipr@ietf.org

Disclaimer of Validity

   This document and the information contained herein are provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE
   INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR
   IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

   Copyright (C) The Internet Society (2006).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   "This document and the information contained herein are provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE
   IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL
   WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY
   WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE
   ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS
   FOR A PARTICULAR PURPOSE."

Acknowledgment