Internet Draft                              Keyur Parikh
                                            Megisto Systems
                                            Andy Koscinski
                                            Amber Networks
Expires: Nov 2001                           May 2001

Proposed mechanism for L2TP failover handling

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

Abstract

The Layer Two Tunneling Protocol (L2TP) [RFC 2661] provides a standard method for tunneling PPP [RFC 1661] packets. This document describes an extension to L2TP that provides a mechanism to support failover to a hot standby unit and continued L2TP operation, with the goal of minimizing payload data loss during failover.

Table of Contents

   Status of this Memo
   Abstract
   Overview
   Conventions used in this document
   Proposed failover protocol
   Maintenance message type
   Sync Tunnels AVP
   Sync Sessions AVP
   Synchronization of tunnel sequence counters
   Synchronization of tunnel sessions
   Migration to standard attributes
   Security Considerations
   Acknowledgement
   References
   Author's Addresses
   Appendix A

Parikh, Koscinski   Informational - Expires Nov 2001   draft-parikh-l2tpext-failover   May 2001

Overview

Resiliency is a key feature for carriers deploying next-generation data networking equipment. Equipment vendors might achieve this resiliency by saving all critical control information required to recover communication with the peer ("full state redundancy") after a failover. For L2TP, this critical information includes the latest control channel and data session sequence numbers, which must be saved to the redundant ("hot standby") equipment. As the speed of the interfaces increases (e.g.
from L2TP over T1 to L2TP over OC192) and as the volume of aggregation increases, this implied "full state" method of recovery will not scale without a very significant performance impact.

In L2TP, the tunnel control channel uses the "send" and "receive" sequence numbers (Ns and Nr) for reliable delivery and detection of duplicate packets. The data sessions within the tunnel may optionally use the "send" sequence number for two purposes: detecting lost packets and reordering out-of-order packets. The implied method of preserving these sequence numbers across a failover is to keep the stateful information current at the hot standby unit. This method does not scale well and also burdens the system redundancy model. Not saving this information may cause improper operation or termination of the control channel and massive data loss for the data sessions after the failover.

To alleviate this problem, a standard mechanism should be devised to restart the sequence numbers for the control and data sessions after failover. The mechanism should be scalable and remain effective even when a significant number of tunnels exist between the endpoints.

Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC-2119 [1].

Proposed failover protocol

The proposed mechanism uses an explicit signaling method to synchronize the control plane and data plane protocol with respect to the established tunnels and their sequence numbers. The explicit signaling method provides robustness against race conditions and addresses security considerations.

Maintenance message type

   TBD (MAINT) Maintenance Message

The Maintenance Message is a control channel message type that is encoded in the Message Type AVP similar to other control channel messages.
The MAINT message SHOULD be sent on a newly established tunnel for the purpose of re-synchronization after a failover.

Sync Tunnels AVP

The Sync Tunnels AVP (see Figure 1) is valid in the MAINT (Maintenance) message. This AVP is marked optional. With this AVP, an endpoint requests its peer to synchronize the established tunnels and restart the control channel sequence numbers of the specified tunnel(s). The data sessions within each tunnel for which sequencing was enabled are also restarted. The peer receiving the MAINT message replies to the sending peer with a MAINT message of its own, in which it includes the list of tunnels it currently has in the Sync Tunnels AVP. Both peers MUST synchronize the tunnel list upon receipt of this AVP by terminating any mismatched tunnels locally. To accommodate scaling, multiple instances of this AVP may be included in the MAINT message.

The Sync Tunnels AVP is encoded with vendor ID 0 and an Attribute Type Sync Tunnels TBD.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |M|H| rsvd  |      Length       |           Vendor ID           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        Attribute Type         |        Tunnel ID list         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure 1: Sync Tunnels AVP

   Mandatory (M) bit: MUST be 0.

   Hidden (H) bit: May be 1 or 0.

   Length: The length of the entire attribute in octets. The length depends on how many tunnel IDs are present in the Tunnel ID list. Each tunnel ID is 2 octets.

   Vendor ID: A two octet value in network byte order; set to 0 to indicate that this is an IETF-assigned attribute.

   Attribute Type: A two octet value set to TBD.

   Tunnel ID List: The list of remote tunnel IDs. These tunnels are in the ESTABLISHED state at the LAC.

Sync Sessions AVP

The Sync Sessions AVP (see Figure 2) is valid in the HELLO message.
This AVP MUST be marked optional. With this AVP, an endpoint may inform its peer of all the ESTABLISHED sessions in the tunnel on which the HELLO message is sent. The peer MUST check the received list against the list of its own active sessions and terminate any mismatched sessions locally. The peer SHOULD then include this AVP, carrying its own list of sessions, in its next HELLO message to the sending end.

The Sync Sessions AVP is encoded with vendor ID 0 and an Attribute Type sync info TBD.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |M|H| rsvd  |      Length       |           Vendor ID           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        Attribute Type         |        session ID list        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure 2: Sync Sessions AVP

   Mandatory (M) bit: MUST be 0.

   Hidden (H) bit: May be 1 or 0.

   Length: The length of the entire attribute in octets. The length depends on how many session IDs are present in the session ID list. Each session ID is 2 octets.

   Vendor ID: A two octet value in network byte order; set to 0 to indicate that this is an IETF-assigned attribute.

   Attribute Type: A two octet value set to TBD.

   Session ID List: The list of affected remote session IDs. These sessions are in the ESTABLISHED state at the LAC.

Synchronization of tunnel sequence counters

In this example, the LAC is the endpoint that has just recovered from a failure. The LNS failure scenario should be similar. The assumption is that after failover, the LAC has the static configuration for all established tunnels and established sessions, but does not have the current Ns, Nr state per tunnel or per session. Information about tunnels and sessions in a transient state is also lost after failover, which is acceptable because such tunnels and sessions will fail and re-establish later.
Though the example is shown between one LAC/LNS pair, it can be extended to multiple pairs. The assumption is that the LAC must have enough free tunnel IDs to carry out this protocol with multiple LNSs at the same time. This mechanism requires the remote tunnel endpoints to have at least one free tunnel ID.

   LAC                                                LNS

   SCCRQ (Challenge AVP)
   ---------------------------------->
                     SCCRP (Challenge Response, Challenge)
                     <----------------------------------
   SCCCN (Challenge Response)
   ---------------------------------->
   MAINT (Sync Tunnels AVP)
   ---------------------------------->
                     MAINT (Sync Tunnels AVP)
                     <----------------------------------
   StopCCN
   ---------------------------------->
                     ZLB
                     <----------------------------------

1) In this example the LAC experiences a failure and, as a result, the "Redundant" LAC takes over.

2) The LAC takes the following steps prior to sending the SCCRQ to the LNS:

   a) Complete the setup of the tunnel. The tunnel setup can optionally be authenticated for originator identification.

   b) Initialize the Sync Tunnels AVP. The Sync Tunnels AVP contains the remote tunnel IDs of the LNS present at the LAC.

   c) Set Ns=0, Nr=0 for each existing control channel.

   d) Set Ns(transmit)=0 for each data session within each tunnel.

   e) Reset the Ns(receive) for each data session and flush the re-order buffers.

   f) Halt all new control traffic originating from the LAC towards the LNS, except on this tunnel, until this tunnel is terminated.

   g) Discard all control traffic arriving from the LNS on existing tunnels. New tunnel requests should be honored, as they may be due to the remote end resetting. In this case, the Tie Breaker can be used to resolve simultaneous contention.

   h) Data traffic from the LNS to the LAC carried by the existing data sessions is accepted, and Nr is updated to Ns+1 of the received packets.
   i) Data traffic from the LAC to the LNS is allowed to pass. By default, the LAC sequences the data packets starting at Ns=0. Until the LNS receives the MAINT message with the Sync Tunnels AVP, these data packets may be lost at the LNS if the sessions were doing sequencing and re-ordering. If the sessions were not doing re-ordering, the LNS MUST ignore the sequence numbers.

   j) Send the MAINT message with the Sync Tunnels AVP.

3) The MAINT message arrives at the LNS.

   a) The LNS checks the Sync Tunnels AVP to validate the tunnel IDs. Any LNS tunnel IDs not present in the list should be terminated locally at the LNS.

   b) Update (reset) the control channel retransmission queues of all established tunnels terminating at the LAC which sent the sync request (do not update tunnels terminating at a different LAC): declare all incomplete session requests to the upper layer as failed, and keep the disconnect requests so that they are sent again with a new Ns.

   c) Set Nr=0, Ns=0 for the control channels of all established tunnels terminating at the LAC which sent the Sync Tunnels AVP (do not update tunnels terminating at a different LAC).

   d) Set the Ns(receive) for each data session to the last Ns+1 and flush the re-order buffer of each data session performing re-ordering.

   e) Discard control traffic from the LAC on existing tunnels until this tunnel is terminated.

   f) Do not allow control traffic from the LNS to the LAC until this tunnel is terminated.

   g) Send a MAINT message to the LAC with the Sync Tunnels AVP. The Sync Tunnels AVP contains the remote tunnel IDs of the LAC present at the LNS.

4) The MAINT message with the Sync Tunnels AVP arrives at the LAC. The LAC processes the Sync Tunnels AVP and handles all the mismatches, just as the LNS did.

5) The LAC sends a StopCCN to terminate the tunnel.

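The receiver-side handling in steps 3a, 3c and 3d above can be sketched as follows. This is only an illustrative sketch, not part of the proposed protocol text: the names Tunnel, DataSession and handle_sync_tunnels, and the use of a string to identify the peer, are assumptions made for the example. The only outside fact relied on is that L2TP sequence numbers are 16-bit counters [RFC 2661].

```python
from dataclasses import dataclass, field


@dataclass
class DataSession:
    last_ns_seen: int = 0      # Ns of the most recent data packet received
    ns_receive: int = 0        # next Ns expected from the peer
    reorder_buffer: list = field(default_factory=list)


@dataclass
class Tunnel:
    local_id: int              # tunnel ID assigned by this endpoint
    peer: str                  # hypothetical identifier of the remote endpoint
    ns: int = 0                # control channel send sequence number
    nr: int = 0                # control channel receive sequence number
    sessions: dict = field(default_factory=dict)


def handle_sync_tunnels(tunnels, peer, synced_ids):
    """Apply a received Sync Tunnels AVP to the local tunnel table.

    tunnels    -- dict mapping local tunnel ID to Tunnel (modified in place)
    peer       -- the endpoint that sent the MAINT message
    synced_ids -- tunnel IDs carried in the Sync Tunnels AVP
    Returns the list of tunnel IDs that were terminated locally.
    """
    listed = set(synced_ids)
    terminated = []
    for tid in sorted(tunnels):
        tunnel = tunnels[tid]
        if tunnel.peer != peer:
            continue           # tunnels to a different peer are left untouched
        if tid not in listed:
            terminated.append(tid)   # step 3a: mismatched tunnel
            continue
        tunnel.ns = 0          # step 3c: restart control channel counters
        tunnel.nr = 0
        for session in tunnel.sessions.values():
            # step 3d: expect the packet after the last one seen (16-bit wrap)
            session.ns_receive = (session.last_ns_seen + 1) % 65536
            session.reorder_buffer.clear()
    for tid in terminated:
        del tunnels[tid]
    return terminated
```

For instance, if the local table holds tunnels 10 and 11 towards the sending peer and the AVP lists only tunnel 10, the function terminates tunnel 11 and restarts the counters of tunnel 10. Retransmission queue handling (step 3b) is deliberately left out of the sketch.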
Synchronization of tunnel sessions

There is an unlikely possibility that the LAC (failover node) thinks sessions are up, while the LNS (remote node) thinks the same sessions are down. This could happen when a CDN message from the LNS to the LAC was acknowledged by a ZLB message from the LAC to the LNS, but the CDN message to the "Redundant" LAC was lost during failover. To synchronize the set of active sessions between tunnel endpoints, the tunnel end that failed over MAY attach a Sync Sessions AVP, with a list of the sessions active on this tunnel, to the regularly scheduled HELLO message. The peer tunnel end SHOULD compare the received active sessions list with its own active sessions list and release any mismatched sessions locally.

Migration to standard attributes

It is intended that both the Sync Tunnels and Sync Sessions AVPs will be migrated to standard attributes. As a result, the AVPs outlined in this draft would have a Vendor ID value of 0 and standard Attribute Values.

Security Considerations

The failover mechanism introduced in this document provides authentication of the originator of the resynchronization tunnel. This prevents an unauthorized endpoint from invoking the procedure.

Acknowledgement

Thanks to W. Mark Townsley, Ishan Weerakoon, Vipin Jain.

References

   [1] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC 2661] Townsley, W., Valencia, A., Rubens, A., Pall, G., Zorn, G., Palter, B., "Layer 2 Tunnel Protocol (L2TP)", RFC 2661, August 1999.

   [RFC 1661] Simpson, W., "The Point-to-Point Protocol (PPP)", STD 51, RFC 1661, July 1994.

Author's Addresses

   Keyur Parikh
   Megisto Systems, Inc.
   20251 Century Boulevard, Suite 120
   Germantown, MD 20876
   Megisto Systems
   Phone: 1+ 301-444-1723
   Email: kparikh@megisto.com

   Andy Koscinski
   Amber Networks, Inc.
   48664 Milmont Drive
   Fremont, CA 94538
   Phone: +1 510.687.5536
   Email: andyk@ambernetworks.com

Appendix A

The following requirements were considered during the design of the mechanism:

   - Efficiency for cases where there are thousands of tunnels between two endpoints.

   - Security of the originator of the sync request.

   - Minimal loss of user data.
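To illustrate the efficiency requirement, the ID-list layout shared by Figures 1 and 2 can be sketched as follows. This is a minimal sketch under stated assumptions: the Attribute Type is still TBD in this draft, so PLACEHOLDER_ATTR_TYPE is a made-up value, the function names are hypothetical, and only the unhidden case (H = 0) is handled. The 6-octet AVP header and the 10-bit Length field follow the general AVP format of [RFC 2661].

```python
import struct

PLACEHOLDER_ATTR_TYPE = 0xFFFF   # assumption: the draft leaves this value TBD
HEADER_LEN = 6                   # flags/length (2) + Vendor ID (2) + Attribute Type (2)
MAX_AVP_LEN = 1023               # largest value of the 10-bit Length field
MAX_IDS_PER_AVP = (MAX_AVP_LEN - HEADER_LEN) // 2


def encode_id_list_avp(attr_type, ids):
    """Encode tunnel or session IDs as one AVP with M=0, H=0, Vendor ID 0."""
    length = HEADER_LEN + 2 * len(ids)
    if length > MAX_AVP_LEN:
        raise ValueError("too many IDs for a single AVP")
    # M, H and the reserved bits are all zero, so the first 16 bits
    # carry only the Length value.
    return struct.pack(f"!HHH{len(ids)}H", length, 0, attr_type, *ids)


def decode_id_list_avp(data):
    """Return (attribute type, list of IDs) from one encoded AVP."""
    flags_len, _vendor_id, attr_type = struct.unpack_from("!HHH", data)
    length = flags_len & 0x03FF          # low 10 bits hold the Length
    count = (length - HEADER_LEN) // 2   # each ID is 2 octets
    ids = list(struct.unpack_from(f"!{count}H", data, HEADER_LEN))
    return attr_type, ids


def split_ids(ids, per_avp=MAX_IDS_PER_AVP):
    """Split a long ID list into chunks that each fit in one AVP."""
    return [ids[i:i + per_avp] for i in range(0, len(ids), per_avp)]
```

Under these assumptions a single AVP carries at most 508 IDs, so an endpoint holding 1000 tunnels towards one peer would place two instances of the AVP in the message.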