Network Working Group Seisho Yasukawa Category: Informational NTT Expires: August 2006 Adrian Farrel Old Dog Consulting February 2006 An analysis of scaling issues in MPLS-TE backbone networks draft-yasukawa-mpls-scaling-analysis-00.txt Status of this Memo By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt. The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. Abstract Traffic engineered Multiprotocol Label Switching (MPLS-TE) is being deployed in provider's backbone networks. The providers wish to grow these networks, and need to discover whether existing protocols and implementations can support the network sizes that they are planning. This document presents an analysis of some of the scaling concerns for MPLS-TE backbone networks, and examines the value of two techniques for improving scaling. Yasukawa and Farrel [Page 1] draft-yasukawa-mpls-scaling-analysis-00.txt February 2006 Contents 1. Introduction ................................................... 2 2. Network configurations ......................................... 3 2.1 Commercial drivers for selected configurations ................ 5 3. Required network sizes ......................................... 6 4. Issues of concern for scaling .................................. 6 4.1 LSP state ..................................................... 6 4.2 Processing overhead ........................................... 6 4.3 RSVP-TE implications .......................................... 7 4.4 Management .................................................... 7 4.5 Practical Numbers .... ........................................ 8 5. Scaling in flat networks ....................................... 8 6. Scaling improvements through Forwarding Adjacencies ............ 9 6.1 Two layer hierarchy ........................................... 9 6.1.1 Tuning the network topology to suit the two layer hierarchy. 10 6.2 Alternative two layer hierarchy .............................. 11 6.3 Three layer hierarchy ........................................ 11 6.4 Issues with hierarchical LSPs ................................ 12 7. Scaling improvements through multipoint-to-point LSPs ......... 13 7.1 Overview of MP2P LSPs ........................................ 13 7.2 Scaling improvements ......................................... 14 7.2.1 Comparison with other scenarios ............................ 15 7.3 Issues with MP2P LSPs ........................................ 16 8. Combined models ............................................... 17 9. Management Considerations ..................................... 17 10. Security Considerations ...................................... 18 11. Recommendations .............................................. 18 12. IANA Considerations .......................................... 18 13. Acknowledgements ............................................. 18 14. Intellectual Property Consideration .......................... 18 15. Normative References ......................................... 19 16. Informational References ..................................... 19 17. Authors' Addresses ........................................... 20 18. Disclaimer of Validity ....................................... 20 19. Full Copyright Statement ..................................... 20 1. Introduction As traffic engineered Multiprotocol Label Switching (MPLS-TE) grows in popularity, providers are looking at how they could deploy it in networks that are larger than the more simple and experimental networks that have been seen so far. This is leading them to examine the number of LSPs that may need to be supported at various points within the network, and at the consequent impact on control plane and management plane software as well as on the constraints placed by the physical limitations of the routers in the network. Physical scaling topology concerns are addressed by building networks that are not fully meshed. Network topologies tend to be meshed in Yasukawa and Farrel [Page 2] draft-yasukawa-mpls-scaling-analysis-00.txt February 2006 the core, but tree-shaped at the edges giving rise to a snow-flake design. MPLS-TE, however, establishes a logical full mesh between all edge points in the network, and this is where the scaling problems arise since the tree structure of the network tends to focus a large number of LSPs within the core of the network. This document presents a generic network topology and introduces terminology for the different scaling parameters. It then examines how many LSPs might be carried within the core of a network. Two techniques (hierarchical and multipoint-to-point LSPs) are introduced and an examination is made of the scaling benefits that they offer as well as of some of the concerns with using these techniques. Of necessity, this document makes very many generalizations. Not least among these are a set of assumptions about the symmetry and connectedness of the physical network. It is hoped that these generalizations will not impinge on the usefulness of the overview of the scaling properties that this document attmepts to give. 2. Network configurations The network configurations considered in this document are based on a hierarchy of connectivity within the core network. PE nodes have connectivity to P nodes as shown in figure 1. There may be interconnection between the PEs that are connected to a single P node, but there is no direct connectivity between PEs that are connected to different P-nodes. Dual homing of PEs to multiple P nodes is not considered in this document although it may be a valuable addition to a network configuration. P /|\ / | \ / | \ / | \ PE PE PE Figure 1 : PE to P node connectivity The relationship between P-nodes is also structured in a hierarchical way. Thus, as shown in figure 2, multiple P-nodes at one level are connected to a P-node at a higher level. We number the levels such that level 1 is the top level and level (n) is immediately above level (n+1), and we denote a P-node at level n as a P(n). There may be interconnection between P(n+1) nodes connected to a single P(n), but there is no direct connectivity between P(n+1) nodes connected to Yasukawa and Farrel [Page 3] draft-yasukawa-mpls-scaling-analysis-00.txt February 2006 different P(n) nodes. Again, dual homing of P(n+1) nodes to multiple P(n) nodes is not considered in this document although it may be a valuable addition to a network configuration. P(n) /|\ / | \ / | \ / | \ P(n+1) P(n+1) P(n+1) Figure 2 : Relationship between P-nodes At the top level, P(1) nodes are connected in a full mesh. In reality, the level 1 part of the network may be slightly less well connected than this, but assuming a full mesh provides for generality. The key multipliers for scalability are the number of P(1) nodes, and the multiplier relationship between P(n) and P(n+1) at each level including down to PEs. We define the multiplier M(n) as the number of nodes attached to any one P(n). We define S(n) as the number of nodes at level (n). Thus: S(n) = S(1)*M(1)*M(2)*...*M(n+1) So the number of PEs can be expressed as: S(PE) = S(1)*M(1)*M(2)*...*M(n) where the network has (n) layers of P-nodes. Thus we may depict an example network as shown in figure 3. In this case: S(1) = 3 M(1) = 3 S(2) = S(1)*M(1) = 9 M(2) = 2 S(PE) = S(1)*M(1)*M(2) = 18 Yasukawa and Farrel [Page 4] draft-yasukawa-mpls-scaling-analysis-00.txt February 2006 PE PE PE PE PE PE \ \/ \/ / PE--P(2) P(2) P(2) P(2)--PE \ | | / \| |/ PE--P(2)---P(1)------P(1)---P(2)--PE / \ / \ PE \ / PE \/ P(1) /|\ / | \ / | \ PE--P(2) P(2) P(2)--PE / /\ \ PE PE PE PE Figure 3 : An example network 2.1 Commercial drivers for selected configurations It is reasonable to ask why this particular network connectivity structure has been chosen. The consideration must be physical scalability. Each LSR is only able to support a limited number of physical interfaces. This necessarily reduces the ability to fully mesh a network and leads to the tree-like structure of the network towards the PEs. A realistic commercial consideration for an operator is the fact that the only revenue-generating nodes in the network are the PEs. Other nodes are needed only to support connectivity and scalability. Therefore, there is a desire to maximize S(PE) while minimizing the sum of S(n) for all values of (n). This could be achieved by minimizing the number of levels, and by maximizing the connectivity at each layer, M(n). Ultimately, however, this would produce a network of just interconnected PEs, which is clearly in conflict with the physical scaling situation. Therefore, the solution calls for a "few" levels with "relatively large" connectivity at each level. We might say that the cost-effectiveness of the network can be stated as: K = S(PE)/(S(1)+S(2) + ... + S(n)) where n is the level above the PEs This document examines the implications for control plane and data plane scalability of this type of network when MPLS-TE LSPs are used to provide full connectivity between all PEs. Yasukawa and Farrel [Page 5] draft-yasukawa-mpls-scaling-analysis-00.txt February 2006 3. Required network sizes An important question for this evaluation and analysis is the size of the network that operators require. How many PEs are required? What ratio of P to PE is acceptable. How many ports do devices have for physical connectivity? What type of MPLS-TE connectivity between PEs is required? Although presentation of figures for desired network sizes is ultimately pointless because history shows that networks grow beyond all projections, it is useful to set some acceptable lower bounds. That is, we can state that we already know that networks will be at least of a certain size. The most important features are: - The network should have at least 1000 PEs. - Each pair of PEs should be connected by at least one LSP in each direction. 4. Issues of concern for scaling This section presents some of the issues associated with the support of LSPs at an LSR or within the network that may mean that there is a a limit to the number LSPs that can be supported. 4.1 LSP state LSP state is the data (information) that must be stored at an LSR in order to maintain an LSP. While the size of the LSP state is implementation-dependent, it is clear that any implementation will require some data in order to maintain LSP state. Thus LSP state becomes a scaling concern because as the number of LSPs at an LSR increases, so the amount of memory required to maintain the LSPs increases in direct proportion. Since the memory capacity of an LSR is limited, there is a related limit placed on the number LSPs that can be supported. Note that techniques to reduce the memory requirements (such as data compression) may serve to increase the number of LSPs that can be supported, but this will only achieve a moderate multiplier and may significantly decrease the ability to process the state rapidly. 4.2 Processing overhead Depending largely on implementation issues, the number of LSPs supported by an LSR may impact the processing speed for each LSP. For example, control block search times can increase with the number of control blocks, and even excellent implementations cannot completely Yasukawa and Farrel [Page 6] draft-yasukawa-mpls-scaling-analysis-00.txt February 2006 mitigate this fact. Thus, since CPU power is constrained in any LSR, there may be a practical limit to the number of LSPs that can be supported. Further processing overhead considerations depend on issues specific to the control plane protocols and discussed in the next section. 4.3 RSVP-TE implications Like many connection-oriented signaling protocols, RSVP-TE requires that state is held within the network in order to maintain LSPs. The impact of this is described in section 4.1. Note that RSVP-TE requires that separate information is maintained for upstream and downstream relationships, but does not require any specific implementation of that state. RSVP-TE is a soft-state protocol which means that protocol messages (refresh messages) must be regularly exchanged between signaling neighbors in order to maintain the state for each LSP that runs between the neighbors. A common period for the transmission (and receipt) of refresh messages is 30 seconds meaning that each LSR must send and receive one message in each direction (upstream and downstream) for every LSP it supports every 30 seconds. This has the potential to be a significant constraint on the scaling of the network, but various improvements [RFC2961] mean that this refresh processing can be significantly reduced allowing an implementation to be optimized to nearly remove all concerns about soft state scaling in a stable network. RSVP-TE also requires that signaling adjacencies are maintained through the use of Hello message exchanges. Although [RFC3209] suggests that Hello messages should be retransmitted every 5ms, in practice values of around 3 seconds are more common. Nevertheless, the support of Hello messages can represent a scaling limitation on an RSVP-TE implementation since one message must be sent and received to/from each signaling adjacency every time period. This can impose limits on the number of neighbors (physical or logical) that an LSR supports. 4.4 Management Another practical concern for the scalability of large MPLS-TE networks is the ability to manage the network. This may be constrained by the available tools, the practicality of managing large numbers of LSPs, and the management protocols in use. Management tools are software implementations. Although such implementations should not constrain the control plane protocols, it is realistic to appreciate that network deployments will be limited by the scalability available tools. In practice, most existing tools Yasukawa and Farrel [Page 7] draft-yasukawa-mpls-scaling-analysis-00.txt February 2006 have a limit to the number of LSPs that they can support. While an NMS may be able to support a large number of LSPs, the number that can be supported by an EMS (or the number supported by an NMS per-LSR) is more likely to be limited. Similarly, practical constraints may be imposed by the operation of management protocols. For example, an LSR may be swamped by management protocol requests to read information about the LSPs that it supports and this might impact its ability to sustain those LSPs in the control plane. OAM, alarms and notifications can further add to the burden placed on an LSR and limit the number of LSPs it can support. All of these consideration encourage a reduction in the number of LSPs supported within the network and at any particular LSR. 4.5 Practical Numbers In practice, reasonable target numbers are as follows. S(PE) >= 1000 Number of levels is 3. That is: 1, 2 and PE. M(2) <= 20 M(1) <= 20 S(1) <= 100 5. Scaling in flat networks Before proceeding to examine potential scaling improvements, we need to examine how well the flat network described in figure 3 scales. Consider the requirement for a full mesh of LSPs linking all PEs. That is, each PE has an LSP to and from each other LSP. Thus, if there are S(PE) PEs in the network, there are S(PE)*(S(PE) - 1) LSPs. Define L(n) as the number of LSPs handled by a level (n) LSR. L(PE) = 2*(S(PE) - 1) L(2) can be computed as the sum of all LSPs for all attached PEs including the LSPs between the attached PEs (but being careful to only count those LSPs once). Thus, since the number of attached PEs is M(2), we have: L(2) = 2*M(2)*(S(PE) - 1) - M(2)*(M(2) - 1) L(1) can be computed as the sum of the number of LSPs for all downstream PEs less those that can be served exclusively by the attached P(2) nodes, and only counting the LSPs routed through two attached P(2) nodes once. So: Yasukawa and Farrel [Page 8] draft-yasukawa-mpls-scaling-analysis-00.txt February 2006 L(1) = 2*M(1)*M(2)*(S(PE) - 1) - 2*M(1)*M(2) - M(2)*(M(1) - 1) = 2*M(1)*M(2)*[S(PE) - 2.5] - M(2) So, for example, with S(1) = 5, M(1) = 10, and M(2) = 20, we see: S(PE) = 1000 L(PE) = 1998 L(2) = 39580 L(1) = 398980 Alternatively, with S(1) = 10, M(1) = 10, and M(2) = 20, we see: S(PE) = 2000 L(PE) = 3998 L(2) = 79580 L(1) = 798980 In both examples, the number of LSPs at the core (P(1)) nodes is probably unacceptably large even though there are only a relatively modest number of PEs. In fact, L(2) may even be too large in the second example. 6. Scaling improvements through Forwarding Adjacencies One of the purposes of LSP hierarchies [RFC4206] is to improve the scaling properties of MPLS-TE networks. LSP tunnels (sometimes known as Forwarding Adjacencies (FAs)) may be established to provide connectivity over the core of the network and multiple edge-to-edge LSPs may be tunneled down a single FA LSP. In our network it is natural to consider a mesh of FA LSPs between all core nodes at the same level. We consider two possibilities here. In the first all P(2) nodes are connected to all other P(2) nodes, and the PE-to-PE LSPs are tunneled across the core of the network. In the second, an extra layer of hierarchy is introduced by connecting all P(1) nodes in a mesh and tunneling the P(2)-to-P(2) tunnels through these. 6.1 Two layer hierarchy In this hierarchy model, the P(2) nodes are connected by a mesh of tunnels. This means that the P(1) nodes do not see the PE-to-PE LSPs. It remains the case that: L(PE) = 2*(S(PE) - 1) L(2) is slightly increased. It can be computed as the sum of all LSPs for all attached PEs including the LSPs between the attached PE plus the number of FA LSPs providing a mesh to the other P(2) nodes. Thus, Yasukawa and Farrel [Page 9] draft-yasukawa-mpls-scaling-analysis-00.txt February 2006 since the number of attached PEs is M(2) and the number of P(2) nodes is S(2), we have: L(2) = 2*M(2)*(S(PE) - 1) - M(2)*(M(2) - 1) + 2*(S(2) - 1) L(1), however, is significantly reduced and can be computed as the sum of the number of FA LSPs to and from each attached P(2) to each other P(2) in the network, including (but counting only once) the FA LSPs between attached P(2) nodes. So: L(1) = 2*M(1)*(S(2) - 1) - M(1)*(M(1) - 1) So, for example, with S(1) = 5, M(1) = 10, and M(2) = 20, we see: S(PE) = 1000 S(2) = 50 L(PE) = 1998 L(2) = 39678 L(1) = 890 Alternatively, with S(1) = 10, M(1) = 10, and M(2) = 20, we see: S(PE) = 2000 S(2) = 100 L(PE) = 3998 L(2) = 79778 L(1) = 1890 So, in both examples, the problem of the number of LSPs at the core (P(1)) nodes is solved, but any problem with L(2) is made slightly worse. 6.1.1 Tuning the network topology to suit the two layer hierarchy Clearly we can reduce L(2) by selecting appropriate values of S(1), M(1) and M(2). We can do this pretty much with immunity since no change will affect L(PE), and since L(1) is now so small. Observe that: L(2) = 2*M(2)*(S(PE) - 1) - M(2)*(M(2) - 1) + 2*(S(2) - 1) where S(PE) = S(1)*M(1)*M(2) and S(2) = S(1)*M(1). So L(2) scales with M(2)^2 and we can have the most impact by reducing M(2). For example, with S(1) = 10, M(1) = 10, and M(2) = 10, we see: S(PE) = 1000 S(2) = 100 L(PE) = 1998 L(2) = 20088 L(1) = 1890 Yasukawa and Farrel [Page 10] draft-yasukawa-mpls-scaling-analysis-00.txt February 2006 And similarly, with S(1) = 20, M(1) = 20, and M(2) = 5, we see: S(PE) = 2000 S(2) = 400 L(PE) = 3998 L(2) = 20768 L(1) = 15580 These considerable scaling benefits must be offset against the cost effectiveness of the network. Recall that K = S(PE)/(S(1)+S(2) ... + S(n)) where n is the level above the PEs, so that for our network: K = S(PE) / (S(1) + S(2)) Thus, in the first example the cost-effectiveness has been halved from 1000/55 to 1000/110. And in the second example it has been reduced to roughly one quarter, changing from 2000/110 to 2000/420. So, although the tuning changes may be necessary to reach the desired network size, they come at a considerable cost to the operator. 6.2 Alternative two layer hierarchy An alternative to the two layer hierarchy presented in section 6.1 is to provide a full mesh of FA LSPs between P(1) nodes. This technique is only of benefit to any nodes in the core of the level 1 network. Otherwise it makes no difference to the PE and P(2) nodes and increases the burden at the P(1) nodes since they have to support all of the PE-to-PE LSPs as in the flat model, plus the additional 2*(S(1) - 1) P(1)-to-P(1) FA LSPs. Thus, this approach should only be considered where there is a mesh of P-nodes wihtin the ring of P(1) nodes, and it is not considered further in this document. 6.3 Three layer hierarchy As demonstrated by section 6.2, introducing a mesh of FA LSPs at the top level (P(1)) has no benefit, but if we introduce an additional level in the network (P(3) between P(2) and PE) we can introduce a new layer of FA LSPs so that we have a full mesh of FA LSPs between all P(3) nodes to carry the PE-to-PE LSPs, and a full mesh of FA LSPs between all P(2) nodes to carry the P(3)-to-P(3) LSPs. The math starts to get a little less pretty! The number of PEs S(PE) = S(1)*M(1)*M(2)*M(3) and the number of PE-to-PE LSPs at a PE remains as L(PE) = 2*(S(PE) - 1). The number of LSPs at a P(3) can be deduced from section 6.1. It is the sum of all LSPs for all attached PEs including the LSPs between the attached PE plus the number of FA LSPs providing a mesh to the Yasukawa and Farrel [Page 11] draft-yasukawa-mpls-scaling-analysis-00.txt February 2006 other P(3) nodes. L(3) = 2*M(3)*(S(PE) - 1) - M(3)*(M(3) - 1) + 2*(S(3) - 1) The number of LSPs at P(2) can also be deduced from section 6.1 since it is the sum of all LSPs for all attached P(3) nodes including the LSPs between the attached PE plus the number of FA LSPs providing a mesh to the other P(2) nodes. L(2) = 2*M(2)*(S(3) - 1) - M(2)*(M(2) - 1) + 2*(S(2) - 1) Finally, L(1) can be copied straight from 6.1. L(1) = 2*M(1)*(S(2) - 1) - M(1)*(M(1) - 1) For example, with S(1) = 5, M(1) = 5, M(2) = 5, and M(3) = 8 we see: S(PE) = 1000 S(3) = 125 S(2) = 25 L(PE) = 1998 L(3) = 16176 L(2) = 1268 L(1) = 220 Similarly, with S(1) = 5, M(1) = 5, M(2) = 8, and M(3) = 10 we see: S(PE) = 2000 S(3) = 200 S(2) = 25 L(PE) = 3998 L(3) = 40038 L(2) = 3184 L(1) = 220 Of course, the extra level in the network tends to reduce the cost effectiveness of the networks with values of K = 1000/155 and K = 2000/230 (from 1000/55 and 2000/110) for the examples above. 6.4 Issues with hierarchical LSPs A basic observation for hierarchical scaling techniques is that it is hard to have any impact on the number of LSPs that must be supported by the level of P(n) nodes adjacent to the PEs (for example, it is hard to reduce L(3) in section 6.3). In fact, the only way we can change the number of LSPs supported by these nodes is to change the scaling ratio M(n) in the network, in other words to change the number of PEs subtended to any P(n). But such a change has a direct effect on the number of PEs in the network and so the cost-effectiveness is impacted. Yasukawa and Farrel [Page 12] draft-yasukawa-mpls-scaling-analysis-00.txt February 2006 Another concern with the hierarchical approach is that it must be configured and managed. This may not seem like a large burden, but it must be recalled that the P(n) nodes are not at the edge of the network - they are a set of nodes that must be identified so that the FA LSPs can be configured and provisioned. Although certain techniques (such as IGP auto-mesh [AUTO-MESH]) can help with this procedure, there is still a configuration and management overhead. Finally, observe that we have been explaining these techniques using conveniently symmetrical networks. Consider how we would arrange the hierarchical LSPs in a network where some PEs are connected closer to the center of the network than others. 7. Scaling improvements through multipoint-to-point LSPs An alternative (or complementary) scaling technique has been proposed using multipoint-to-point (MP2P) LSPs. The fundamental improvement in this case is achieved by reducing the number of LSPs towards the destination as LSPs towards the same destination are merged. This section presents an overview of MP2P LSPs and describes their applicability and scaling benefits. 7.1 Overview of MP2P LSPs Note that the MP2P LSPs discussed here are for MPLS-TE and are not the same concept familiar in the Label Distribution Protocol (LDP) described in [RFC3036]. Traffic flows generally converge toward their destination and this can be utilized by MPLS in constructing an MP2P LSP. With such an LSP, the LFIB mappings at each LSR are many-to-one so that multiple pairs {incoming interface, incoming label} are mapped to a single pair {outgoing interface, outgoing label}. Obviously, if global labels are used this mapping may be optimized within an implementation. It is important to note that with MP2P MPLS-TE LSPs the traffic flows themselves are not merged. That is, the flows are labeled in their own right as point-to-point (P2P) flows, and the MP2P LSPs are used as tunnels. This means that traffic can be disambiguated at the egress of the MP2P LSPs without the need to look at the payload. Techniques for establishing MP2P MPLS-TE LSPs and for assigning the correct bandwidth downstream of LSP merge points are out of the scope of this document. Yasukawa and Farrel [Page 13] draft-yasukawa-mpls-scaling-analysis-00.txt February 2006 7.2 Scaling improvements Consider the network topology shown in figure 3. Suppose that we establish MP2P LSP tunnels such that there is one tunnel terminating at each PE, and that that tunnel has every other PE as an ingress. Thus, a PE-to-PE MP2P LSP tunnel would have S(PE)-1 ingresses and one egress, and there would be S(PE) such tunnels. Note that there still remain 2*(S(PE) - 1) PE-to-PE P2P LSPs that are carried through these tunnels. Let's consider the number of LSPs handled at each node in the network. The PEs continue to handle the same number of PE-to-PE P2P LSPs, and must also handle the MP2P LSPs. So: L(PE) = 2*(S(PE) - 1) + S(PE) But all P(n) nodes in the network only handle the MP2P LSP tunnels. Nominally, this means that L(n) = S(PE) for all values of n, but life is not really that simple. The number of LSPs is no longer the only issue (although it may have some impact for some of the scaling concerns listed in section 4). We are now interested more directly in the amount of LSP state that is maintained by an LSR. We can quantify this according to the number of LSP segments managed by an LSR. So, in the case of a P2P LSP, an ingress or egress has one segment to maintain, while a transit has two segments. Similarly, for an MP2P LSP, an LSR must maintain one state for each upstream segment (which will we can assume is in a one-to-one relationship with the number of upstream neighbors) and exactly one downstream segment - ingresses obviously have no upstream neighbors, and egresses have no downstream segments. So we can start again on our examination of the scaling properties, using X(n) to represent the amount of state held at each P(n). At the PEs, there is only connectivity to one other network node, the P(2) node. But note that we have required that P2P LSPs be used tunneled within the MP2P LSPs to allow disambiguation of data at the egresses. So X(PE) is: X(PE) = 4*(S(PE) - 1) Each P(2) node has M(2) downstream PEs to each of which a single MP2P LSP must be delivered coming from the one upstream P(1), and from which one LSP segment converges towards each other PE in the network. Note that not all MP2P LSPs initiated at subtended PEs are routed out through the parent P(1) since some terminate at local PEs. Thus: Yasukawa and Farrel [Page 14] draft-yasukawa-mpls-scaling-analysis-00.txt February 2006 X(2) = 2*M(2) + M(2)*(S(PE) - 1) + S(PE) - M(2) = S(PE)*(M(2) + 1) Similarly, at each P(1) node there are M(1) downstream P(2) nodes and so a total of M(1)*M(2) downstream PEs. Each P(1) is connected in a full mesh with the other P(1) nodes so has (S(1) - 1) neighbors. So each P(1) must handle an LSP segment coming from each downstream P(2) headed to each non-downstream PE and an LSP segment to another P(1) for each non-downstream PE. At the same time, it must handle an LSP segment coming from each other P(1) headed to each downstream PE, and an LSP segment towards each downstream PE. Thus: X(1) = (M(1) + 1)*(S(PE) - M(1)*M(2)) + (S(1) - 1)*(S(PE) - M(1)*M(2)) + M(1)(M(2) = M(1)*(S(PE) + M(2)*(M(1) - S(1) + 1)) +S(1)*S(PE) So, for example, with For example, with S(1) = 10, M(1) = 10, and M(2) = 10, we see: S(PE) = 1000 S(2) = 100 X(PE) = 3996 X(2) = 11000 X(1) = 20100 And similarly, with S(1) = 20, M(1) = 20, and M(2) = 5, we see: S(PE) = 2000 S(2) = 400 X(PE) = 5996 X(2) = 12000 X(1) = 80100 7.2.1 Comparison with other scenarios For comparison with the examples in sections 5 and 6, we need to convert those LSP-based figures to our new measure of LSP state. Observe that each LSP in sections 5 and 6 generates two state units at a transit LSR and one at an ingress or egress. So we can provide conversions as follows: Section 5 L(PE) = 2*(S(PE) - 1) L(2) = 2*M(2)*(S(PE) - 1) - M(2)*(M(2) - 1) L(1) = 2*M(1)*M(2)*[S(PE) - 2.5] - M(2) X(PE) = 2*(S(PE) - 1) X(2) = 4*M(2)*(S(PE) - 1) - 2*M(2)*(M(2) - 1) X(1) = 4*M(1)*M(2)*[S(PE) - 2.5] - 2*M(2) Yasukawa and Farrel [Page 15] draft-yasukawa-mpls-scaling-analysis-00.txt February 2006 So that using the example figures in section 7.1 (S(1) = 10, M(1) = 10, and M(2) = 10) we see: X(PE) = 1998 X(2) = 39780 X(1) = 398980 Clearly this technique is a significant improvement over the flat network within the network, although the PEs are more heavily stressed. Section 6.1 L(PE) = 2*(S(PE) - 1) L(2) = 2*M(2)*(S(PE) - 1) - M(2)*(M(2) - 1) + 2*(S(2) - 1) L(1) = 2*M(1)*(S(2) - 1) - M(1)*(M(1) - 1) S(2) = S(1)*M(1) X(PE) = 2*(S(PE) - 1) X(2) = 4*M(2)*(S(PE) - 1) - 2*M(2)*(M(2) - 1) + 2*(S(2) - 1) X(1) = 4*M(1)*(S(2) - 1) - 2*M(1)*(M(1) - 1) So that using the example figures in section 7.1 (S(1) = 10, M(1) = 10, and M(2) = 10) we see: X(PE) = 1998 X(2) = 39881 X(1) = 3780 And we can observe that the MP2P model is better at P(2), but the hierarchical model is better at P(1). In fact, this can be generalized to observe that the MP2P model produces best effects towards the edge of the network, while the hierarchical model makes most impression at the core. However, the requirement for P2P LSPs tunneled within the MP2P LSPs does cause a double burden at the PEs. 7.3 Issues with MP2P LSPs The biggest challenges for MP2P LSPs are the provision of support in the control and data planes. To some extent support must also be provided in the management plane. Control plane support is just a matter of defining the protocols and procedures [MP2P-RSVP], although it must be clearly understood that this will introduce some complexity to the control plane. Hardware issues may be a little more tricky. For example, the capacity of the upstream segments must never (allowing for statistical over-subscription) exceed the capacity of the downstream segment. Similarly, data planes must be equipped with sufficient buffers to handle incoming packet collisions. Yasukawa and Farrel [Page 16] draft-yasukawa-mpls-scaling-analysis-00.txt February 2006 The management plane will be impacted in several ways. Firstly the management applications will need to handle LSPs with multiple senders. This means that, although the applications need to process fewer LSPs, they will be more complicated and will, in fact, need to process the same number of ingresses and egresses. Other issues like diagnostics and OAM would also need to be enhanced to support MP2P, but might be borrowed heavily from LDP networks. Lastly, note that the MP2P solution requires support of twice as many LSPs at the PEs and requires tunneling to be introduced at the PEs. Since PEs are not usually as fully specified as P-routers, this may cause some concern although the use of previous hop popping on the MP2P LSPs might help to reduce this issue. In all cases, care must be taken not to confuse the reduction in the number of LSPs with a reduction in the LSP state that is required. In fact, the discussion in section 7.1 is slightly over-optimistic since LSP state towards the destination will probably need to include sender information and so will increase depending on the number of sender for the MP2P LSP (the calculations do not take this into account). As a final note, observe that the MP2P scenario presented in this section may be optimistic. MP2P LSP merging may be hard to achieve between LSPs with significantly different traffic and QoS parameters. Therefore, it may be necessary to increase the number of MP2P LSPs arriving at an egress. 8. Combined models There is nothing to prevent the combination of hierarchical and MP2P solutions within a network. Note, however, that if MP2P LSPs are tunneled through P2P FA LSPs across the core, none of the benefit of LSP merging is seen for the hops during which the MP2P LSPs are tunneled. It is possible to construct solutions where FA LSPs are managed as MP2P LSPs and additional significant savings are made. 9. Management Considerations The management issues of the two models presented in this document have been discussed in line. Neither solution is without its management overhead. Note, however, that scalability of management tools is one of the motivators for this work and that network scaling solutions that reduce the active management of LSPs at the cost of additional effort to manage the more static elements of the network represent a benefit. That is, it is worth the additional effort to set up MP2P or Yasukawa and Farrel [Page 17] draft-yasukawa-mpls-scaling-analysis-00.txt February 2006 FA LSPs if it means that the network can be scaled to a larger size without being constrained by the management tools. 10. Security Considerations The techniques described in this document use existing or yet-to-be-defined signaling protocol extensions and are subject to the security provided by those extensions. Note that we are talking about tunneling techniques used within the network and both approaches are vulnerable to the creation of bogus tunnels that deliver data to an egress or consume network resources. The MP2P technique may prove harder to debug through OAM methods than the FA LSP approach. 11. Recommendations At this stage, no recommendations are made, but it would be valuable to consult more widely to discover: - The concerns of other service providers with respect to network scalability. - More opinions on the realistic constraints to the network parameters listed in section 4. - Desirable values for the cost-effectiveness of the network (parameter K) - The applicability, manageability and support for the two techniques described - The feasibility of combining the two techniques as discussed in section 8. 12. IANA Considerations This document makes no requests for IANA action. 13. Acknowledgements TBD 14. Intellectual Property Consideration The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79. Yasukawa and Farrel [Page 18] draft-yasukawa-mpls-scaling-analysis-00.txt February 2006 Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr. The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org. 15. Normative References [RFC4206] Kompella, K. and Y. Rekhter, "Label Switched Paths (LSP) Hierarchy with Generalized Multi-Protocol Label Switching (GMPLS) Traffic Engineering (TE)", RFC 4206, October 2005. 16. Informational References [RFC2961] Berger, L., Gan, D., Swallow, G., Pan, P., Tommasi, F. and S. Molendini, "RSVP Refresh Overhead Reduction Extensions", RFC 2961, April 2001. [RFC3036] Andersson, L., Doolan, P., Feldman, N., Fredette, A. and B. Thomas, "LDP Specification", RFC 3036, January 2001. [RFC3209] Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V. and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP Tunnels", RFC 3209, December 2001. [AUTO-MESH] Vasseur, JP., and Le Roux, JL., "Routing extensions for discovery of Multiprotocol (MPLS) Label Switch Router (LSR) Traffic Engineering (TE) mesh membership", draft-ietf-ccamp-automesh, work in progress. [MP2P-RSVP] Yasukawa, Y., "Supporting Multipoint-to-Point Label Switched Paths in Multiprotocol Label Switching Traffic Engineering", draft-yasukawa-mpls-mp2p-rsvpte, work in progress. Yasukawa and Farrel [Page 19] draft-yasukawa-mpls-scaling-analysis-00.txt February 2006 17. Authors' Addresses Seisho Yasukawa NTT Corporation 9-11, Midori-Cho 3-Chome Musashino-Shi, Tokyo 180-8585 Japan Phone: +81 422 59 4769 EMail: yasukawa.seisho@lab.ntt.co.jp Adrian Farrel Old Dog Consulting EMail: adrian@olddog.co.uk 18. Disclaimer of Validity This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 19. Full Copyright Statement Copyright (C) The Internet Society (2006). This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights. Yasukawa and Farrel [Page 20]