Network Working Group                                          Ping Pan
Internet Draft                                     (Hammerhead Systems)
Expiration Date: December 2005
                                                              July 2005


                         Pseudo Wire Protection

                    draft-pan-pwe3-protection-01.txt

Status of this Memo

By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

Copyright Notice

Copyright (C) The Internet Society (2005). This document is subject to the rights, licenses and restrictions contained in BCP 78 and except as set forth therein, the authors retain all their rights.

Abstract

This document describes a mechanism that helps to protect and recover user traffic carried over pseudo-wires. The mechanism requires some minor modification to the existing pseudo-wire setup procedure, and is fully backward compatible. Essentially, the mechanism enables network operators to set up multiple primary and backup pseudo-wires, and to use only one of them to carry the data traffic itself. Upon network failure, user traffic can be switched over to the next "best" pseudo-wire based on preference levels.
This document first describes the motivation for the work, based on discussions with a number of carriers. It then defines the protocol extension itself.

1. Specification of Requirements

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

2. Introduction

Pseudo-wires have been deployed by a number of carriers to carry customer layer-2 data flows. Each Layer-2 flow (or Attachment Circuit) is mapped to a pseudo-wire. Pseudo-wire setup, maintenance and packet encapsulation have been extensively described in a number of IETF PWE3 drafts [PW-CTRL, PWE3-TRANSPORT].

Recently, several carriers have argued that, when offered as a service, pseudo-wires need to possess the same capabilities that have been deployed in transport networks for many years, namely, QoS guarantees, OAM, and P/R (protection and restoration). In this draft, we extend the LDP pseudo-wire proposal [PW-CTRL] to support protection and restoration operation.

Why is such work necessary?

When it comes to traffic protection, carriers have always had a rich set of tools to deal with network failures: fast-reroute [MPLS-FRR] and primary/secondary standby for MPLS, APS and UPSR for SONET/SDH, and Link Aggregation for transport Ethernet. The general concept has been that user traffic needs to be protected at every segment and every layer of the network.

The need for pseudo-wire protection and restoration arises in a number of deployment scenarios:

            +-----+       Tunnel-1      +-----+
     AC's   |     |====== OC-48 ========|     |   AC's
    <------>| PE1 |                     | PE2 |<------>
            |     |       Tunnel-2      |     |
            |     |======= OC-3 ========|     |
            +-----+                     +-----+

                 Figure-1: Bandwidth Mismatch

In Figure-1, there exist two parallel tunnels between two PE's. They have different link capacity.
Hence, whenever the bandwidth on a protecting link is smaller than that on the working link, we may run into trouble during protection and restoration.

In the example, let's assume that both tunnels are MPLS LSP's. Network operators have enabled MPLS fast-reroute so that the two LSP's protect each other. At the PE's, a number of AC's are aggregated into the LSP's as pseudo-wires. Some AC's carry mission-critical data, while others transport best-effort data only.

If Tunnel-1 fails, all traffic on Tunnel-1 will be switched into Tunnel-2. However, since the two tunnels have different bandwidth, mission-critical traffic could be dropped as a result of link congestion during switch-over.

This problem can be easily resolved if each pseudo-wire has its own preference, which allows the pseudo-wires to preempt each other when necessary. Also note that, since the pseudo-wires are always bi-directional, the preference assignment must be consistent on both ends of the pseudo-wires.

            +-----+                  +----+
     AC's   |     |====== GigE ======|    |  AC's or PW's
    <------>| CPE |                  | PE |<-------------->
            |     |====== DS3 =======|    |
            +-----+                  +----+

                 Figure-2: Network Access

Figure-2 illustrates another deployment scenario where pseudo-wire protection may become critical.

Pseudo-wires are becoming important for multi-protocol, multi-service data access. One reason is that pseudo-wires enable data aggregation, which in turn improves bandwidth utilization. In a typical metro network access location (Hub or CO), the statistical multiplexing gain is approximately 3-4 [ATT-REPORT]. So the earlier the data gets aggregated, the better the bandwidth utilization for the carriers, especially at the access locations where bandwidth is still precious.

More importantly, pseudo-wires provide a simple and uniform data transport layer, where all layer-2 packets can be processed uniformly at PE's.
When designed properly, using pseudo-wires for data access enables the carriers to bypass Layer-2-specific management interfaces, which simplifies network operation.

However, given the size of the carrier networks, the pseudo-wire access strategy has to depend on the cost of access devices or CPE's. To maintain a low cost, the access devices may not be routers, and so may not use IP routing and MPLS signaling for traffic protection and recovery. As shown in Figure-2, the access device (CPE) will run targeted LDP to exchange pseudo-wire labels with the PE. In this case, a preferred protection method is to conduct traffic protection at the pseudo-wire level.

Finally, the network operators need to have the ability to support planned traffic shifting. In Figure-1, there are two links between two PE's carrying a number of pseudo-wires. During network maintenance, carriers may decide to temporarily shift all traffic of a set of pseudo-wires from one link to another without disturbing user traffic. To support this operation, pseudo-wire protection can be manually triggered by the operators [NOTE1].

3. Design Considerations

3.1. Protection Schemes

There are three basic types of point-to-point protection: 1+1, 1:1 and 1:N.

1+1 is to transmit the same traffic over two parallel links. The receiver will only pick traffic from one link at any given time. In the event of a failure, at least one of the links still carries the actual traffic. However, in packet networks, this may not be the best way to consume link bandwidth.

1:1 protection is to use one connection to protect another connection. The most popular 1:1 protection is SONET APS.

1:N is a generalized version of 1:1. In 1:N, one connection is established to protect multiple other connections. MPLS Facility Backup is one such example.

In pseudo-wire protection, each AC may have its own layer-2 characteristics that need to be maintained separately.
When applying 1:N protection to these AC's, it would seem odd, for example, to set up one backup pseudo-wire to protect both a best-effort Ethernet VLAN connection and an ATM SPVC with CBR and VBR traffic requirements at the same time. On the other hand, if the pseudo-wires have strict bandwidth requirements, and the network needs to conserve network resources, the 1:N approach would be the better fit.

In our design, we shall consider both 1:1 and 1:N schemes. Initially, however, we will define the operation sequence and protocol extension for 1:1 only.

3.2. Protection Types

Pseudo-wire protection will support the following types: cold, warm and hot standby.

3.2.1. Cold Standby

The edge nodes will only negotiate and establish secondary (or backup) pseudo-wires after network failure.

The nodes on cold standby need to have more than one LDP Hello adjacency, one of which can be used to carry data traffic after network failure.

This type of protection can be fully supported with the existing specification [PW-CTRL]. The protection effectiveness depends on how fast the two edge nodes can react to network failure and process control messages after the failure.

3.2.2. Warm Standby

The edge nodes will negotiate backup pseudo-wires and exchange labels prior to any network failure. However, the data forwarding path will not be programmed for label processing and QoS enforcement until after the detection of network failures.

This practice and requirement come from traditional transport carriers. In SONET/SDH networks, switches reserve the protection time slots ahead of time. Upon the detection of network failure, the nodes "wake up" the protection connections.

3.2.3. Hot Standby

This is the most efficient protection method. The protecting pseudo-wires are established before any network failure. This is also known as "make-before-break".
Upon the detection of network failure, the edge nodes will switch data traffic onto the pre-established backup pseudo-wires directly. The protection efficiency therefore depends on the switch-over speed, which is on the order of milliseconds.

3.3. Link, Node and Path Protection

In today's deployment, pseudo-wire setup and management involve only two edge nodes. The recent multi-hop pseudo-wire work [MH-PWE3] has introduced the concept of pseudo-wire stitching, where each edge-to-edge pseudo-wire may consist of multiple segments. In addition to the two edge nodes, one or more transit pseudo-wire nodes are involved in switching pseudo-wires between networks.

Pseudo-wire protection thus needs to consider the following:

(1) How to protect each pseudo-wire segment?
(2) How to protect traffic in case of pseudo-wire switching point failure?
(3) How to protect the entire edge-to-edge pseudo-wire?

The first issue can be taken care of by treating each segment as an individual single-hop pseudo-wire and creating backup pseudo-wires to protect it.

The second issue is similar to the requirement in MPLS fast-reroute node-protection. In case of node failure, the pseudo-wire switching points need to relay the failure information toward the edge nodes to trigger traffic switch-over.

Since multi-hop pseudo-wire design is still under way, we will propose the solutions for the second and the third issues.

3.4. Backup Pseudo-wire Scaling

Another consideration is the number of backups a pseudo-wire may have. This is a network design and deployment issue. Any protocol extension and implementation should not pose any constraint in this area.

4. LDP Extension

PW protection is based on [PW-CTRL] and [LDP]. PW label binding uses targeted LDP, where two edge nodes first establish an LDP session using the Extended Discovery mechanism described in [LDP]. PW's are initiated via LDP Label Mapping messages.
Each message contains a FEC TLV, a Label TLV, and some optional TLVs. The PW FEC can be either the PWid FEC or the Generalized ID FEC. In case of the Generalized ID FEC, the mapping message will also include an Interface Parameters TLV, as described in [PW-CTRL].

PW protection operates under the assumption that there exists more than one link between a pair of PE's to transport data traffic, as shown in Figure-3. Each PE maintains multiple LDP Hello adjacencies, one for each link.

        +-------+                 +---------+
   AC   |       |   Primary PW    |         |   AC
    ----+-+--+--O=======L0========O--+---+-------
        | |  |  |                 |  |   |  |
        | |  |  |   Backup PW-1   |  |   |  |
        | |  +--O=======L1========O--+   |  |
        | |     |                 |      |  |
        | |    ...               ...     |  |
        | |     |                 |      |  |
        | |     |   Backup PW-N   |      |  |
        | +-----O=======LN========O------+  |
        |       |                 |         |
        +-------+                 +---------+
           PE1                        PE2

             Figure-3: PW Protection Example

For each primary (or working) PW, the PE's can set up one or multiple backup (protecting) PW's. The procedure for setting up the primary and backup PW's is the same as the one for regular PW's. The only difference is that during PW initiation, a Protection TLV will be included in the mapping messages. The new TLV describes the preference levels for each PW.

The Label Mapping messages will be sent over multiple Hello adjacencies between two PE's. All primary and backup PW's share the same attachment circuit information.

The PE's will only transmit data traffic over the PW that has the highest preference level. During network failure, the PE's will switch traffic over to the PW that has the next highest preference level. After network recovery, the PE's will revert to the previous PW.

4.1. The PROTECTION TLV

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|0|    Protection tlv (???)   |            Length             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Setup Pref Lvl | Hold Pref Lvl |Protection Type|    Scheme     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

- U bit (always set)

  The two edge nodes may not both support the protection feature. On a node that does not support the PROTECTION TLV, only one pseudo-wire will be established. In case of network failure, no fast switch-over will be available.

- Protection tlv

  The value of the new tlv type needs to be allocated by IANA.

- Setup Preference Level

  The preference level with respect to initiating a PW. The value of 0 is the highest. The Setup Preference Level is used in deciding whether this PW can preempt another PW.

- Holding Preference Level

  The preference level with respect to maintaining a PW. The value of 0 is the highest. The Holding Preference Level is used in deciding whether this PW can be preempted by another PW.

- Protection Type

  Currently we have defined the following values:

     Hot Standby:  0
     Warm Standby: 1

  The default value is 0 (Hot Standby).

- Scheme

  Currently, this can be one of the following:

     1:1 protection: 0
     1:N protection: 1

  The default value is 0 (1:1 protection).

Using the Protection TLV, the operators can configure the protection mechanism that they prefer. Since the pseudo-wires are always bidirectional, exchanging protection information between two edge nodes will help to achieve a consistent protection behavior for each pseudo-wire.

4.2. Signaling Procedures

PW protection is an extension to the PW control and maintenance draft [PW-CTRL]. Essentially, the mechanism enables network operators to set up multiple primary and backup pseudo-wires, and to use only one of them to carry the data traffic itself. Upon network failure, user traffic can be switched over to the next "best" pseudo-wire based on preference levels.
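
The TLV layout of Section 4.1 and the preference-based selection described above can be illustrated with a minimal Python sketch. It is not part of the protocol definition. It assumes the standard LDP TLV header (U bit, F bit, 14-bit type, 16-bit length), uses a placeholder type code because the real value is still to be allocated by IANA, and ranks PW's by Setup Preference Level, a choice this draft does not spell out. All names (PROTECTION_TLV_TYPE, ProtectionTlv, select_active) are hypothetical.

```python
import struct
from dataclasses import dataclass

# Hypothetical placeholder: the real type code is to be allocated by IANA.
PROTECTION_TLV_TYPE = 0x3F00

HOT_STANDBY, WARM_STANDBY = 0, 1      # Protection Type values (Section 4.1)
SCHEME_1_1, SCHEME_1_N = 0, 1         # Scheme values (Section 4.1)


@dataclass(frozen=True)
class ProtectionTlv:
    setup_pref: int                   # 0 is the highest preference
    hold_pref: int                    # 0 is the highest preference
    protection_type: int = HOT_STANDBY
    scheme: int = SCHEME_1_1

    def encode(self) -> bytes:
        # U bit = 1 and F bit = 0, followed by the 14-bit type.
        first = 0x8000 | (PROTECTION_TLV_TYPE & 0x3FFF)
        value = struct.pack("!BBBB", self.setup_pref, self.hold_pref,
                            self.protection_type, self.scheme)
        return struct.pack("!HH", first, len(value)) + value

    @classmethod
    def decode(cls, data: bytes) -> "ProtectionTlv":
        first, length = struct.unpack("!HH", data[:4])
        if (first & 0x3FFF) != PROTECTION_TLV_TYPE or length != 4:
            raise ValueError("not a Protection TLV")
        return cls(*struct.unpack("!BBBB", data[4:8]))


def select_active(pws):
    """Pick the operational PW with the highest preference (lowest value).

    pws maps a PW name to a (ProtectionTlv, is_up) pair. Ranking by
    setup_pref is an assumption of this sketch.
    """
    up = [(tlv.setup_pref, name) for name, (tlv, is_up) in pws.items() if is_up]
    return min(up)[1] if up else None
```

For example, with a primary PW at preference 0 and a backup at preference 1, select_active returns the primary while it is up and the backup once the primary is marked down, which mirrors the switch-over and revert behavior described in Section 4.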
When there is more than one LDP Hello adjacency between a pair of PE's, the operators can always configure backup pseudo-wires for data protection purposes.

As illustrated in Figure-3, to begin the setup from PE1, the operators first initiate the Primary PW over adjacency L0, and then initiate Backup PW-1 over adjacency L1 by sending Label Mapping messages.

The Label Mapping messages for both PW's must have the same PW FEC (PWid or Generalized ID) and the same AC interface information. They must have different Label and Protection TLVs. The Label TLV contains the label value to carry the actual data traffic over each PW. Each PW has different preference values in the Protection TLV.

When PE2 receives a Label Mapping message, it will perform the following checks:

If PE2 does not support the Protection TLV, it will ignore the TLV and proceed with the regular PW setup. PE2 can only set up one PW with PE1 per AC. PE2 will reply with a Label Release message to reject the extra PW's from PE1. PE2 should however notify PE1 by signaling the "Unknown TLV" status code.

If PE2 supports the Protection TLV, it will process the rest of the mapping message.

PE2 needs to check whether it already has PW's with the same attachment ID (PWid or the combination of AGI, SAII and TAII) in its database.

On each PE, all PW's with the same attachment ID must have different preference levels. PE2 will therefore always reject a mapping message that carries an already used preference level by replying with a Label Release message. PE2 should notify PE1 with a "Duplicated Preference" status code.

PE's cannot maintain more than one PW with the same attachment ID over an LDP adjacency. However, it is possible that the operators decide to adjust the preference levels or style for maintenance purposes. As a result, PE2 may receive multiple Label Mapping messages with the same attachment ID from a particular adjacency.
In this case, PE2 will overwrite the existing protection information with the new one.

If PE2 decides to accept the Label Mapping message, then it has to make sure that an LSP is set up in the opposite direction (PE1->PE2). If there is no corresponding LSP, it must initiate one by sending a Label Mapping message to PE1. Other than reversing the SAII and TAII in the PW FEC, PE2 must send the same Protection TLV back to PE1.

4.3. Consistent Protection Behavior

PW's are bidirectional. Each PW must have the same protection behavior at both ends. Otherwise, a user traffic flow may have a hot standby that can switch over within 50 milliseconds in one direction, but be slow to recover in the other direction.

If the PW is initiated from one end (PE1), the other end (PE2) must comply by replying with a Label Mapping message that carries the same Protection TLV.

However, it is possible that the operators set up a PW from both ends (PE1 and PE2) manually. In this case, if the protection parameters are inconsistent, the PE's need to reject the PW setup, and notify the operators with a "Mismatched Preference" status code.

4.4. Preference Levels

Not all PW's are created equal. Some will have a higher preference level than others. In case of network failure, the PE's will first protect the PW's with a higher preference.

Some PW's may be associated with network resources (such as bandwidth). The PE's will reject some of the backup PW's during setup when there are not enough resources available on a backup link. PE's will notify the operators with an "Out of Backup Resource" status code.

5. Protecting Multi-Segment Pseudo Wires

[Segmented-PW] describes the cases where pseudo-wires can be stitched at intermediate nodes. The proposed mechanism can be used to protect each segmented pseudo-wire, assuming LDP is the signaling protocol.

[MHOP-PW] describes a method to establish pseudo-wires over multiple intermediate networks.
At the edge of each network, the pseudo-wires will be processed and extended toward the ultimate destination. The proposed mechanism can be used to protect this type of multi-hop pseudo-wire, where each intermediate hop will use the defined TLV for per-segment protection.

6. Security Considerations

This document specifies the LDP extensions that are needed for protecting pseudo-wires. It will have the same security properties as in [LDP] and [PW-CTRL].

7. IANA Considerations

We have defined the following protocol extensions:

7.1. PW Protection TLV

This is a new LDP TLV type.

7.2. PW Status Code

The edge nodes need to inform each other of a number of error conditions. Several PW status codes need to be defined:

   0x00000XYZ "Duplicated preference levels"
   0x00000XYZ "Mismatched Preference"
   0x00000XYZ "Out of Backup Resource"

8. Acknowledgement

We are grateful for the opportunity to discuss this idea with various people over the past several months.

9. Full Copyright Statement

Copyright (C) The Internet Society (2004). This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.

This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

10. Intellectual Property Statement

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.

By submitting this Internet-Draft, I certify that any applicable patent or other IPR claims of which I am aware have been disclosed, or will be disclosed, and any of which I become aware will be disclosed, in accordance with RFC 3668.

11. Disclaimer of Validity

This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

12. Normative References

[MH-PWE3] L. Martini, et al, "Requirements for inter domain Pseudo-Wires", draft-martini-pwe3-MH-PW-requirements-00.txt

[PW-CTRL] L. Martini, et al, "Pseudowire Setup and Maintenance using LDP", draft-ietf-pwe3-control-protocol-14.txt

[PWE3-TRANSPORT] L. Martini, et al, "Transport of Layer 2 Frames Over MPLS", draft-martini-l2circuit-trans-mpls-14.txt

[LDP] L. Andersson, et al, "LDP Specification", draft-ietf-mpls-rfc3036bis-00.txt

[MPLS-FRR] P. Pan, et al, "Fast Reroute Extensions to RSVP-TE for LSP Tunnels", draft-ietf-mpls-rsvp-lsp-fastreroute-07.txt

[DRY-MARTINI] P. Pan, "Dry-Martini: Supporting PWE3 over Sub-IP Access Networks", draft-pan-pwe3-over-sub-ip-01.txt, July 2005

[ATT-REPORT] T. Afferton, et al, "Packet Aware Transport for Metro Networks", IEEE Network Magazine, April 2004.

[NOTE1] Other mechanisms may also be applicable for planned shutdown. See "LDP graceful restart for planned outages" (draft-minei-mpls-ldp-planned-restart-01.txt) by Ina Minei, et al.

[Segmented-PW] Martini, et al, "Segmented Pseudo Wire", draft-ietf-pwe3-segmented-pw-00.txt, July 2005

[MHOP-PW] Florin Balus, et al, "Multi-Segment Pseudowire Setup and Maintenance using LDP", draft-balus-mh-pw-control-protocol-02.txt, July 2005

13. Informative References

None

14. Author Information

Ping Pan
Hammerhead Systems
640 Clyde Court
Mountain View, CA 94043
e-mail: ppan@hammerheadsystems.com