Network Working Group Q. Zhang Internet-Draft D. Zhang Intended status: Informational Huawei Technologies Co.,Ltd Expires: July 2, 2011 December 29, 2010 Extending the Virtual Router Redundancy Protocol for Graceful Restart draft-zhang-vrrp-gr-00 Abstract Many existing routers enable their data planes to keep forwarding data packets during the restart of their control plains. This functionality is called "graceful restart" or "non-stop forwarding". A router supporting graceful restart is able to provide more reliable packet forwarding service during system maintaining procedures and software-error recoveries. This informational document analyzes the issues with enhancing VRRP to support graceful restart and introduces several candidate solutions. Requirements Language The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119]. Status of this Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." This Internet-Draft will expire on July 2, 2011. Copyright Notice Copyright (c) 2010 IETF Trust and the persons identified as the document authors. All rights reserved. Zhang & Zhang Expires July 2, 2011 [Page 1] Internet-Draft VRRP extension supporting GR December 2010 This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . 3 3. Candidate Solutions . . . . . . . . . . . . . . . . . . . . . . 3 3.1. Implicit Solutions . . . . . . . . . . . . . . . . . . . . 4 3.2. Explicit Solutions . . . . . . . . . . . . . . . . . . . . 4 4. Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 5 6. Security Considerations . . . . . . . . . . . . . . . . . . . . 5 7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 5 8. References . . . . . . . . . . . . . . . . . . . . . . . . . . 5 8.1. Normative References . . . . . . . . . . . . . . . . . . . 5 8.2. Informative References . . . . . . . . . . . . . . . . . . 5 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 6 Zhang & Zhang Expires July 2, 2011 [Page 2] Internet-Draft VRRP extension supporting GR December 2010 1. Introduction VRRP (Virtual Router Redundancy Protocol) [RFC3768] is an election protocol which organizes a group of routers on a same LAN as a virtual router providing high available services for users. VRRP can dynamically assign the responsibility in processing the traffic for the virtual router to one of the VRRP routers in the group, which is called a master router and others are referred to as backup routers. Backup routers monitor the states of the master router by monitoring the periodical advertisement packets sent by the master router. If backup routers cannot receive the advertisements for a pre-specified period, backup routers then believe that the master router cannot work properly and carry out an election. The router with the highest priority will be assigned as the new master router. Note that if the pre-master router has a higher priority than the newly elected one, it will try to seize its privilege back after solving its problem. In practice, the frequent switch of the master responsibility may introduce considerable overheads to the involved VRRP routers. On some occasions, routers may be busy with processing state transformation and fail to send signaling packets on time, which in the worst case can seriously disrupt the operations of VRRP routers. An intuitive solution to mitigate such issues is to implement the VRRP protocol at the data plain so that VRRP state will not be affected when the control plain cannot work appropriately. However, this approach breaks the boundary between the data plain and the control plain of the router architecture, and may introduce additional complexity in the design and implementation. This issue can be mitigated by extending VRRP to support Graceful restart (GR), which is a popular technology integrated in large amounts of routers. A router supporting GR is able to forward data packets while its control plain is being restarted. When a master gracefully restarts, it is still able to process the packets of the virtual router, and thus its responsibility does not have to be taken over by others. This informational draft analyzes the issues with enhancing VRRP to support GR so as to mitigate the issues discussed above. Additionally, candidate solutions are introduced. Normally, the proposed solutions need the cooperation of GR supporting IGPs. 2. Terminology GR: Graceful Restart. 3. Candidate Solutions Zhang & Zhang Expires July 2, 2011 [Page 3] Internet-Draft VRRP extension supporting GR December 2010 3.1. Implicit Solutions In order to prevent backup routers to raise a new election before the master router finishes its GR, the master router can simply extend the interval between its hello packets during the restart period. The master can achieve this by multicasting a hello packet with an extended interval (noted by T-GR) to its router before the restart. T-GR should be long enough for the master router to finish the restart. After receiving the hello packet, backup routers notice the change of the interval and then update their MasterDown timers with it. If the master achieves the restart in time, it will send hello packets with the normal interval. Again, backup routers will notice the change of the interval and reset their MasterDown timers with the new interval. However, if backup routers do not receive any new hello packets from the master router after waiting for T-GR, they will perform a new election process. A good thing of this solution is that it does not introduce any additional signaling packets. Routers only needs modify their internal operations. 3.2. Explicit Solutions The implicit solution introduced in section 4.1 is simple and effective in supporting graceful restart. However, in many typical scenarios, it is desired to explicitly clarify the start and the end of a GR. For instance, assuming two routers supporting GR are redundantly deployed to provide a highly available BRAS (Broadband Remote Access Router) service for users. A GR supporting extension of VRRP is executed to decide the role that each router acts as. Additionally, user information for access control is synchronized between the master router and the backup router. Therefore, when the master router cannot work properly, the backup router can take over it without breaking the access of the online users to the Internet. When the master router is gracefully restarted, it is still able to forward data packets for the users which have been already registered. So, the backup router can just wait for the finish of the restart rather than taking place of the master router to serve the users. However, the user information maintained in the control plain of the master router may be lost during the restart. The backup router then needs to help the master router re-create the states after the accomplishment of the GR. Therefore, after the GR, the master router needs to explicitly inform the backup router that it has finished the GR and been ready for synchronizing the user information. 4. Issues Several issues are listed as follows: Zhang & Zhang Expires July 2, 2011 [Page 4] Internet-Draft VRRP extension supporting GR December 2010 1. In practice, a newly restarted control plain can be busy in a certain period so that it may not always be able to send signaling packets according to the required interval. This issue may result in repeatedly triggering election procedures and seriously disturb the operations of VRRP routers. This issue should be considered during the design of the GR supporting extension of VRRP. 2. As mentioned above, during a GR, the control plain of the router cannot work properly. Therefore, a VRRP backup group executing certain services (e.g., BRAS) cannot accept any register queries from users during a GR. This may negatively influence the experiences of users. 3. Because during a GR, backup routers may have to wait for a relatively long period. During this period, any newly occurred issues (e.g., the failure of the data plain of the restarted router) may be difficult to be detected in time. To address this issue, additional failure detecting approach needs to be adopted. TBD. 5. IANA Considerations This document makes no request of IANA. Note to RFC Editor: this section may be removed on publication as an RFC. 6. Security Considerations 7. Acknowledgements 8. References 8.1. Normative References [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. 8.2. Informative References [RFC2338] Knight, S., Weaver, D., Whipple, D., Hinden, R., Mitzel, D., Hunt, P., Higginson, P., Shand, M., and A. Lindem, "Virtual Router Redundancy Protocol", RFC 2338, April 1998. Zhang & Zhang Expires July 2, 2011 [Page 5] Internet-Draft VRRP extension supporting GR December 2010 [RFC3768] Hinden, R., "Virtual Router Redundancy Protocol (VRRP)", RFC 3768, April 2004. Authors' Addresses Qiang Zhang Huawei Technologies Co.,Ltd Nanjing, China Phone: Fax: Email: njzq@huawei.com URI: Dacheng Zhang Huawei Technologies Co.,Ltd Beijing, China Phone: Fax: Email: zhangdacheng@huawei.com URI: Zhang & Zhang Expires July 2, 2011 [Page 6]