Internet Draft Document                                  Vach Kompella
Category: Standards Track                                    Joe Regan
Expires: August 2008                                    Alcatel-Lucent
                                                          Shane Amante
                                                Level 3 Communications
                                                     February 18, 2008

                 Conversation Hashing for Pseudowires
                draft-vkompella-pwe3-hash-label-00.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time.  It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on August 21, 2008.

Copyright Notice

   Copyright (C) The IETF Trust (2008).

Abstract

V. Kompella              Expires August 2008                  [Page 1]

Internet-Draft           Hashing on Pseudowires          February 2008

   This draft proposes a method to introduce granularity on the
   hashing of traffic running over pseudowires.  Most forwarding
   engines are able to hash based on label stacks, so the approach
   here is to introduce additional labels that do not affect the
   handling of packets, but which identify a conversation, and can be
   hashed with granularity.

1. Introduction

   This draft proposes a method to introduce granularity on the
   hashing of traffic running over pseudowires.  Typically, forwarding
   hardware is capable of looking at some fields in packets to
   construct hash buckets for conversations or flows.
   The ingress node is able to look at the un-encapsulated packet and
   spread flows around.  At intermediate nodes, for pseudowires, there
   is no information on what layer 2 protocol encapsulation is on the
   packet, so the only thing the hardware can hash on is the label
   stack.  However, the granularity obtained over pseudowires is
   inadequate for real load-balancing, especially when the pseudowires
   emulate fat trunks.

2. The Solution

   When two PEs open up a targeted LDP session between them, the Hash
   Label TLV is exchanged as part of the Capability exchange between
   the two peers [LDP-Cap].  The Hash Label TLV specifies a set of
   labels that instruct the receiving PE to pop the label and continue
   on to the next label in the stack.  Since forwarding engines
   generate hash buckets based on the label stack, the Hash Label(s)
   can be used to provide some diversity in the conversations in a
   pseudowire.

   Suppose that an LDP session has been established between two peers,
   P and Q, and Q has signaled ten Hash Labels in the range 101
   through 110 (inclusive).  On receiving a packet from the attachment
   circuit, node P will hash the packet into one of ten buckets, one
   for each Hash Label received by P.  P will then encapsulate the
   packet with the PW label at the bottom of the stack, add the Hash
   Label corresponding to the hash bucket, and finally add the tunnel
   encapsulation.  Assume for the moment that the tunnel encapsulation
   is another label.

   At P, the layer 2 fields are visible, and a next hop can be
   determined out of the multiple (e.g., ECMP or LAG) next hops.  At
   an LSR node, where those fields are not visible, the label stack
   still varies between packets of the same pseudowire, because the
   Hash Label differs from one conversation to another.

   The same set of labels used for hashing can be used between Q and
   any other node with which it sets up a targeted LDP session, and
   the same set of labels can be used across different pseudowires.
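
   The ingress behaviour described above can be sketched in a few
   lines of Python.  This is a minimal illustration under stated
   assumptions, not part of the proposal: the draft does not say which
   hash function or which packet fields P uses, so the CRC-32 hash,
   the flow key built from two MAC addresses, and the tunnel and PW
   label values (3000 and 2001) are all hypothetical.  Only the Hash
   Label range 101 through 110 comes from the example.

```python
import zlib

# Hash Labels signaled by Q in the example (101 through 110 inclusive).
HASH_LABELS = list(range(101, 111))


def pick_hash_label(flow_key, hash_labels=HASH_LABELS):
    """Map a flow key (bytes) to one of the advertised Hash Labels.

    CRC-32 is an arbitrary stand-in (assumption); the draft does not
    mandate any particular hash function.
    """
    bucket = zlib.crc32(flow_key) % len(hash_labels)
    return hash_labels[bucket]


def build_label_stack(tunnel_label, flow_key, pw_label):
    """Return the label stack as a list, top of stack first.

    Order follows the text: PW label at the bottom, Hash Label above
    it, tunnel label on top.
    """
    return [tunnel_label, pick_hash_label(flow_key), pw_label]


# Hypothetical flow key: destination and source MAC addresses.
flow = bytes.fromhex("0011223344550a0b0c0d0e0f")

# Hypothetical tunnel and PW label values.
stack = build_label_stack(tunnel_label=3000, flow_key=flow, pw_label=2001)
```

   Because the hash is deterministic, every packet with the same flow
   key receives the same Hash Label, so a conversation keeps a single
   label stack for its lifetime.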

   Note that this solution can be extended.  For example, if P is
   capable of imposing four labels, and Q is capable of processing a
   four label stack, then P can use two Hash Labels per packet and
   hash the flows into 100 buckets (10 x 10 combinations of the ten
   Hash Labels).  This would also require that the intermediate nodes
   be capable of hashing a four label stack.

   From the bottom of the stack up, the order of the labels must be:
   the PW label, Router Alert (if present), and then the Hash
   Label(s).  The tunnel encapsulation comes last, at the top of the
   stack; it may be a single label, or a pair of labels if the MPLS
   protocol imposes them (e.g., when using facility bypass protection
   [RFC4090] or inter-area LDP [LDP-Ext]).

2.1. Protocol Format

   We introduce a new Hash Label TLV which has the following format.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |U|F|     Hash Label TLV        |            Length             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | NumPushLabels | NumPopLabels  | NumHashLabels |   AllocType   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          MBZ          |               Label 1                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          MBZ          |               Label 2                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          MBZ          |                   "                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Hash Label TLV Type.
      The type of the Hash Label TLV (TBD from IANA).

   Length.
      The length of the TLV.

   NumPushLabels.
      The number of hash labels the node can push.

   NumPopLabels.
      The number of hash labels the node can pop.

   NumHashLabels.
      The number of hash labels provided for use.

   AllocType.
      The type of allocation scheme.  If AllocType = 0, then the
      labels following the AllocType are a list of labels.
      If AllocType = 1, then exactly two labels must follow the
      AllocType, and they provide the lower and upper bound of a range
      of labels (inclusive).

   Label 1, Label 2, etc.
      If AllocType = 0, these are actual labels that may be used as
      hash labels.  If AllocType = 1, then they are the lower and
      upper bound of a range of hash labels that may be used.

3. Packet format with PW hash labels

   The following is an example of what could happen if hash labels are
   exchanged between two nodes P and Q, where P sends Q the Hash Label
   TLV with the 10 labels 101 through 110.  The figure below shows the
   PW and tunnel labels.

                    PW label 2001
            ------------------------------
            |        -----               |
            | ------>| C |-------        |
            | | 4000 ----- 7000 |        |
            | |                 v        v
           -----    -----    -----    -----
   AC1-----| P |----| A |----| B |----| Q |-----AC2
           -----    -----    -----    -----
              |      ^ |      ^ |      ^
              |      | |      | |      |
              -------- -------- --------
                3000     5000     6000

   Tunnel Labels:
      P->A: 3000
      P->C: 4000
      A->B: 5000
      B->Q: 6000
      C->B: 7000

   Q hashes a packet from attachment circuit AC2 on whatever relevant
   fields define a conversation or flow, and comes up with an index
   between 1 and 10, say 5, which selects Hash Label 105.  Q then
   constructs the packet to P to look like:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      6000 (Tunnel Label)      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       105 (Hash Label)        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     2001 (PW Label) (BOS)     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            Payload            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   When B receives the packet, it will hash the label stack
   {6000, 105, 2001} and come up with one of the next-hops A or C.
   Say the result is A.
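
   The choice that B makes can be sketched as follows.  This is an
   illustration only: the draft does not define how an LSR hashes a
   label stack, so the CRC-32 hash and the function and variable names
   are assumptions.  The label values and the next hops A and C are
   taken from the example.

```python
import zlib


def ecmp_next_hop(label_stack, next_hops):
    """Pick a next hop by hashing the whole label stack (top first).

    CRC-32 is a stand-in (assumption); a real forwarding engine uses
    its own hash function.
    """
    key = b"".join(label.to_bytes(4, "big") for label in label_stack)
    return next_hops[zlib.crc32(key) % len(next_hops)]


NEXT_HOPS_AT_B = ["A", "C"]

# The packet from the example: tunnel 6000, Hash Label 105, PW 2001.
chosen = ecmp_next_hop([6000, 105, 2001], NEXT_HOPS_AT_B)

# One stack per Hash Label 101..110: the stack now varies between
# conversations, so the next hop is allowed to vary too.
per_hash_label = {
    h: ecmp_next_hop([6000, h, 2001], NEXT_HOPS_AT_B)
    for h in range(101, 111)
}

# Without a Hash Label, every packet of PW 2001 presents the same
# stack, so all ten lookups necessarily give the same next hop.
without_hash_label = {
    ecmp_next_hop([6000, 2001], NEXT_HOPS_AT_B) for _ in range(10)
}
```

   Which next hop a given Hash Label maps to depends entirely on the
   hash function in use; the sketch only shows that the Hash Label is
   what makes the hash input differ within a single pseudowire.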
   The packet from B to A will look like:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      5000 (Tunnel Label)      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       105 (Hash Label)        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     2001 (PW Label) (BOS)     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            Payload            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   P would then receive the following packet from A:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      3000 (Tunnel Label)      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       105 (Hash Label)        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     2001 (PW Label) (BOS)     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            Payload            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   P will pop 3000, find the hash label 105 (action pop), and then
   process 2001 as the PW label to forward the packet out AC1 with
   whatever encapsulation is required for that DLC.

   The rationale for putting the hash label between the PSN tunnel
   encapsulation and the PW label is that the forwarding engine does
   not have to process the PW label, take the appropriate action, and
   then remember that context while it processes the hash labels.

4. Future considerations

   One future application of this method would be to provide hash
   diversity for IP traffic carried over LDP LSPs without having to
   peek below the label stack.

5. References

Normative References

Informative References

   [LDP-Cap]  "LDP Capabilities," R. Thomas et al,
              draft-ietf-mpls-ldp-capabilities-01.txt, work in
              progress, February 2008.

   [RFC4090]  "Fast Reroute Extensions to RSVP-TE for LSP Tunnels,"
              P. Pan et al, RFC 4090, May 2005.

   [LDP-Ext]  "LDP extension for Inter-Area LSP," B. Decraene et al,
              draft-ietf-mpls-ldp-interarea-02.txt, work in progress,
              February 2008.

6. Security Considerations

   The extensions proposed here raise no security issues beyond those
   that already exist in the base PWE3 standards.

7. IANA Considerations

   No IANA allocations have been specified yet (but a new TLV type
   will be forthcoming, as well as changes to the LDP Capability FEC
   TLV).

8. Authors' Addresses

   Vach Kompella
   Alcatel-Lucent
   vach.kompella@alcatel-lucent.com

   Joe Regan
   Alcatel-Lucent
   joe.regan@alcatel-lucent.com

   Shane Amante
   Level 3 Communications
   shane@castlepoint.net

9. Full Copyright Statement

   Copyright (C) The IETF Trust (2008).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE
   IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL
   WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY
   WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE
   ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS
   FOR A PARTICULAR PURPOSE.

10. Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed
   to pertain to the implementation or use of the technology described
   in this document or the extent to which any license under such
   rights might or might not be available; nor does it represent that
   it has made any independent effort to identify any such rights.
   Information on the procedures with respect to rights in RFC
   documents can be found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use
   of such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository
   at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

11. Acknowledgments

   Funding for the RFC Editor function is provided by the IETF
   Administrative Support Activity (IASA).