CCAMP Working Group
Internet Draft                                             Young Hwa Kim
Document: draft-kim-ccamp-cpr-reqts-00.txt                  Byung Ho Yae
Expires: August 2004                                         Jin Ho Hahm
                                                              Avri Doria
                                                                    ETRI
                                                           Jun Kyun Choi
                                                                     ICU
                                                          Jae Cheol Ryou
                                                                     CNU
                                                           February 2004

       Requirements for the Resilience of Control Plane in GMPLS

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026 [1].

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

Abstract

This document describes requirements for providing resilience in the GMPLS control plane (in other words, the control network). As is generally known, the control plane consists of control entities, control channels, and control nodes. This document proposes a framework for control plane resilience and covers terminology, basic concepts of control networks, possible configurations, necessities and requirements, and additional considerations, including the relationship with other protocols such as LMP.

Table of Contents

1. Summary for Sub-IP Area
2. Comments to readers
3. Conventions Used in the Document
4. Introduction
5. Control Networks
6. Concepts for the Resilience of Control Networks
7. Necessities for the Resilience of Control Networks
8. Requirements for the Resilience of Control Networks
9. Relation to LMP
10. Functions for the Resilience of Control Networks
Security Considerations
References
Author's Addresses

1. Summary for Sub-IP Area

1.1. Summary

See the Abstract above.

1.2. Related Documents

draft-ietf-ccamp-gmpls-architecture-07.txt, May 2003.
draft-ietf-ccamp-lmp-08.txt, March 2003.
draft-ietf-mpls-generalized-signaling-09.txt, August 2002.

1.3. Where Does it Fit in the Picture of the Sub-IP Work

This work fits in the CCAMP WG.

1.4. Why Is It Targeted at This WG

This draft is targeted at the CCAMP WG because it specifies requirements for the resilience of the control plane in GMPLS, and the CCAMP WG coordinates the work of defining a common control plane. In addition, the specification resulting from this document could be an extension to the Link Management Protocol (LMP).

1.5. Justification of Work

The CCAMP WG should consider this document because it addresses requirements for the resilience of the control plane. Until now, work on protection and restoration has focused on handling LSPs over the transport plane (in other words, the data plane). Future work on protection and restoration should therefore focus on the control plane, for which the CCAMP WG plays a leading role.

2. Comments to readers

Although this document starts at version -00 under a new name, it is based on "draft-kim-ccamp-cc-protection-04.txt".
The previous document focused on the control channels themselves. Starting with this document, we would like to focus on the resilience of the control plane as a whole, which comprises control channels, control nodes, and control entities. From this viewpoint, the document can be updated to cover a wider range of requirements for control plane resilience.

Compared to the previous document, the following contents have been modified or added. In particular, the functions for the resilience of control networks are handled in more detail.

- Correction of editing errors
- Modification of the document title
- Expansion of the resilience concept to the control plane
- Review and update of the overall contents
- Update of the section, "Priorities in Control Channels"
- Update of the section, "Reverting and Non-reverting Modes"
- Update of the section, "Relation to LMP"
- Update of the section, "Functions for Resilience of Control Channels"

3. Conventions Used in the Document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC-2119.

Terms such as ASTN, UNI, and NNI used in this document refer to [G.807].

4. Introduction

The term "control plane" may represent the conceptual aspects of control networks comprising the control channels, control entities, and control nodes used to transport control packets for signaling, link management, and routing. As described in [LMP], a control channel is a pair of mutually reachable interfaces that are used to carry control packets for routing, signaling and link management between nodes.
Control channels are used to transport control packets between neighboring nodes within a network; they may also be used to exchange control packets between neighboring networks. Control packets transferred over control channels are generated by control entities, which are functional entities that perform interoperability functions such as signaling, link management, or routing. A control node is a physical node that includes control channels, control entities, OAM functions, and so on.

Here, the survivability of the control plane refers to the ability of a control network to maintain an acceptable level of control services during failures of control channels, control entities, and control nodes. This document covers only the survivability of control channels; the survivability of control entities and control nodes is currently out of scope.

In traditional IP-based or ATM-based MPLS, there is a static association between control and transport channels within a physical link. Under this static association, the switchover of control channels may be resolved implicitly by well-known switchover schemes such as 1+1 and 1:1 protection. In other words, where control channels are not separated from the transport channels, there may be no need to consider a specific resilience capability for control networks.

However, as indicated in [GMPLS-ARCH], a control channel can be separated from the transport channel. Allowing the control channels between adjacent nodes to be separated from the associated data-bearing links means that there is no longer a one-to-one association between control and transport channels. Consequently, the survivability of these control channels may need to be considered.
If there is no switchover mechanism that allows automatic configuration of control channels between adjacent nodes, these nodes have to determine alternative control channels through operator intervention in the management plane or through the existing control channel management of LMP. This may degrade the service provided by a control network.

To resolve these problems in GMPLS-based control networks, GMPLS nodes have to maintain the resilience capability of control networks. To that end, we first have to define a framework or guideline for protecting control channels. This document describes basic concepts of control networks, related terminology, requirements, and functions to be provided. Detailed switchover mechanisms for control channels will then be covered in a separate new specification, or in an extension of the LMP specification if possible.

5. Control Networks

As indicated in [G.807], a control network (a signaling network in the narrow sense) supports the control plane by transferring service-related information between the user and the network and also between network entities. Generally, the control network is based on common control signaling, which allows a network operator to develop a separate control network. Channels for common control signaling are associated with data (or user) channels in the following modes (the following is based on the description contained in [G.807]):

- Associated mode, in which control packets between network elements are transferred over control channels that directly connect the network elements, in the same way as the transport channels;
- Quasi-associated mode, in which control packets between nodes A and B follow a predetermined routing path over several control channels while the traffic channels are routed directly between A and B;
- Non-associated mode, in which control packets between network elements A and B are routed over several control channels, while traffic signals are routed directly between A and B. The control channels used may vary with time and network conditions.

However, the concept of common control signaling does not yet seem to be reflected in IP-based control networks. Currently, the OIF defines, as a basic scheme for control networks, the following network configurations for transporting control packets:

- In-fiber configuration, in which control packets are carried over a control channel, such as the DCC, embedded in the data-carrying optical link between network elements;
- Out-of-fiber configuration, in which control packets are carried between adjacent nodes over a dedicated control channel or an external IP transport network that is separate from the data-bearing optical links.

As a variant of the out-of-fiber configuration, we can consider the case in which one or more control channels within a fiber that also includes transport channels handle the transport channels in other fibers as well as those in their own fiber.

A similar concept of a control network using various control channels is found in the material on the signaling communication network (SCN) of the automatic switched transport network (ASTN) in ITU-T. As indicated in [G.7712], there may be several physical implementations of the SCN. For example, embedded control channel (ECC) interfaces, LAN interfaces, and WAN interfaces are possible. There is an important reason for a survivability function in the control plane: as described in the ITU-T recommendation, common to each topology is that alternative diverse paths exist between the communicating entities (i.e., the ASTN capable network elements).
That is, ASTN network elements themselves may maintain alternative control channels such as ECC interfaces, LAN interfaces, and WAN interfaces.

6. Concepts for the Resilience of Control Networks

To provide resilience in control networks, several switchover concepts for a higher layer could be introduced. These include, for example, control packet types, active and standby control channels, logical and physical control channels, and protection groups. This section defines these terms. They are subject to modification, and other concepts may be added in the future.

- Control packet type: Indicates the type of a control packet, among those for routing, signaling, and link management, carried over control channels. In more detail, OSPF, IS-IS, and BGP variants are candidate control packets for routing, CR-LDP and RSVP-TE for signaling, and LMP for link management.
- Active control channels: Physical control channels that are currently carrying control packets. An active control channel becomes a standby control channel when it cannot carry control packets because of a failure.
- Standby control channels: Physical control channels that are not carrying control packets. Once a standby control channel is used to carry control packets, it becomes an active control channel.
- Physical control channels: Data communication channel (DCC) bytes of SONET/SDH, optical supervision channels (OSC), Ethernet links, or dedicated channels including timeslots and lambdas, used to carry control packets for routing, signaling, and link management. A physical control channel cannot be used for carrying data traffic.
- Logical control channels: Unique control channels within control networks. They have control channel identifiers of global significance within the relevant control networks, irrespective of the types of physical control channels. However, adjacent nodes that want to exchange control packets over control channels should use the same type of physical control channel as each other.
- Protection group: A group of control channels of the same type among the various physical control channels, or with the same protection aspects within the same type of physical control channel.

7. Necessities for the Resilience of Control Networks

As indicated in [LMP], a node could maintain in-fiber and out-of-fiber configurations simultaneously for control networks. However, the out-of-fiber configuration may have several advantages over the in-fiber configuration, such as the following:

- When the overhead bytes of SONET/SDH are used as a control channel, the in-fiber configuration may require more resources, such as HDLC controllers, whereas the out-of-fiber configuration could reduce the number of these controllers;
- In the case of the in-fiber configuration, it appears that there is no need to protect control channels, because lower-level (Layer 1 or 2) switchover schemes would support the resilience of control networks on a per-fiber basis.
However, the out-of-fiber configuration could apply various software-based switchover schemes to provide better quality of service (QoS) in control networks;
- In the case of the in-fiber configuration, it is difficult to separate the control plane from the transport plane, whereas in the out-of-fiber configuration it is easy to separate them and to add supplementary functions as needed;
- Because the in-fiber configuration supports the connection control function only within a single fiber, it cannot support connection control beyond a single fiber, whereas the out-of-fiber configuration places no restriction on the number of fibers covered by the connection control function.

When the out-of-fiber configuration is used, control channels may be shared to handle several fibers. If there is no switchover scheme for control channels in this situation, a failure of the active control channel may cause problems such as the temporary suspension of control entities even though they are running well. If recovery of the failed control channel takes a long time and the identification of an alternative control channel is delayed, the failure will degrade the service of the control plane. In the case of RSVP-variant protocols, if there is no self-refresh procedure, the failure of control channels may have side effects such as unwanted connection teardown. In particular, as described in [G.7712], such a failure may impact restoration when ASTN is used to provide restoration of existing connections. It is therefore critical for the control network to have a real-time resilience capability when transporting restoration messages. Consequently, to provide resilience in control networks, it is reasonable to use the out-of-fiber configuration rather than the in-fiber configuration in GMPLS.

In addition, to support concepts such as link bundling, explicit transport channel identification, and the independence of control channel failures from transport channel failures, the separation between the control and transport planes has been introduced in MPLS and GMPLS; see [GMPLS-ARCH] and [GMPLS-SIG]. This separation means that a failure on control channels does not cause a failure on transport channels. However, if the failed control channel is not recovered and no alternative control channel is identified within a certain time limit, previously established connections may be deleted, or operator intervention may be needed to handle the control channel failure manually. This may degrade the quality of service (QoS) from the viewpoint of connection control.

As indicated in [LMP], there may not be any active control channels available while the data links are still in use. For most applications it would be unacceptable to tear down connections in this situation. In addition, data links whose control channel has failed may not be guaranteed the same level of service as before the failure. In [LMP], the TE link is then said to be in a degraded state.

Therefore, to provide a stable and reliable communication environment between adjacent nodes, an explicit switchover mechanism for control channels should be provided.

8. Requirements for the Resilience of Control Networks

To introduce resilience of control networks into the control plane, several requirements have to be identified. These include the configuration of control networks, priorities in control channels, the choice between reverting and non-reverting modes, and so on.

8.1. Configuration of Control Networks

To begin with, to simplify the approach to the problem domain, we assume that a single control channel carries all control packets for routing, signaling, and link management.

Supporting resilience in control networks requires more than one control channel between adjacent nodes, with at least one active control channel and one or more standby control channels. Importantly, adjacent nodes that want to exchange control packets should know in advance the type of physical control channel that carries them, such as timeslots, supervision channels, or Ethernet links, regardless of the control mode (associated, non-associated, or quasi-associated). In addition, each node should identify the adjacent nodes on its control channels, using an automatic method if possible.

If a control network runs several active control channels between adjacent nodes, at least two of them could be used to carry control packets with the same sequence number. A receiving node would then ignore a control packet that arrives later with the same sequence number. In this case, two active control channels are sufficient for the configuration of the control network, and each node should support the processing of sequence numbers in the routing, signaling, and link management protocols or in functions below these protocols. However, this approach incurs unnecessary processing overhead, because at least twice as many control packets are generated and at least half of them are ignored.

The more appropriate configuration of a control network is to keep a single active control channel and several standby control channels at any time.
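
The trade-off in the multiple-active configuration described above can be illustrated with a small sketch. It is only an illustration under stated assumptions, not part of any specified mechanism: the names DuplicateFilter and receive_all, the channel labels, and the use of an unbounded set of seen sequence numbers are all hypothetical, and a real implementation would need a bounded window. The sketch shows a receiver that accepts the first copy of each sequence number and counts the copies it ignores.

```python
class DuplicateFilter:
    """Accept the first copy of each sequence number, ignore later copies."""

    def __init__(self):
        # Hypothetical simplification: remember every sequence number seen.
        self._seen = set()

    def accept(self, seq):
        """Return True if this is the first packet carrying `seq`."""
        if seq in self._seen:
            return False
        self._seen.add(seq)
        return True


def receive_all(arrivals):
    """Process (sequence number, channel) pairs in arrival order.

    Returns the delivered pairs and the number of ignored duplicates.
    """
    dup_filter = DuplicateFilter()
    delivered = []
    ignored = 0
    for seq, channel in arrivals:
        if dup_filter.accept(seq):
            delivered.append((seq, channel))
        else:
            ignored += 1
    return delivered, ignored


# Packets 1 and 2 arrive on both active channels; packet 3 arrives only on
# "cc-a" (for example, because "cc-b" has failed in the meantime).
arrivals = [(1, "cc-a"), (1, "cc-b"), (2, "cc-b"), (2, "cc-a"), (3, "cc-a")]
delivered, ignored = receive_all(arrivals)
```

In this run, two of the five received packets are discarded as duplicates, which is the processing overhead noted above; the failure of one channel is masked without any explicit switchover.
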
When a failure occurs on an active control channel, a node would switch over from the failed control channel to one of the standby control channels using provisioning rules and the results of negotiation between nodes. The standby control channel then becomes the active control channel in place of the previous one.

Distributing control packet types over more than one control channel may be possible, but this case requires further study. In this case, the switchover function should know in advance which control packet types each control channel carries.

8.2. Priorities in Control Channels

If there is more than one protection group (PG), or several control channels within a PG, between adjacent nodes, we have to assign priorities of protection groups (PPG, for short) and priorities of control channels within a PG (PCC, for short). For the PPG, we could assign control channels of the associated mode to PPG-1, control channels of the quasi-associated mode to PPG-2, and control channels of the non-associated mode to PPG-3. A lower priority number has higher priority than a higher priority number. It would not be appropriate to negotiate the PPG, because there is a clear distinction in physical aspects such as the number of relay nodes between adjacent nodes. Under this static priority assignment, the task of each node is to decide which control modes can be supported between adjacent nodes. In addition, different addressing schemes could be applied to control channels in different PGs.

For the PCC, dynamic priority assignment is appropriate in order to support several configurations of control channels. For example, when an external IP network is used, a node could assign priority according to whether the IP address of a control channel is lower or higher than that of the control channel in another node.
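
The two-level priority scheme of this section can be sketched as follows. This is a minimal illustration, not a specified algorithm: the ControlChannel type, the PPG_BY_MODE table, and the integer pcc field are hypothetical names, and the PCC values are simply assumed to have been assigned already, since the dynamic assignment method is left open here. The sketch orders channels first by the static PPG derived from the control mode and then by PCC, with lower numbers meaning higher priority.

```python
from dataclasses import dataclass

# Static PPG assignment by control mode (PPG-1 is the highest priority).
PPG_BY_MODE = {
    "associated": 1,
    "quasi-associated": 2,
    "non-associated": 3,
}


@dataclass(frozen=True)
class ControlChannel:
    name: str   # hypothetical label for the channel
    mode: str   # one of the keys of PPG_BY_MODE
    pcc: int    # priority within the protection group, assumed already assigned

    @property
    def ppg(self):
        """Priority of the protection group, derived from the control mode."""
        return PPG_BY_MODE[self.mode]


def rank(channels):
    """Return channels ordered from highest to lowest priority."""
    return sorted(channels, key=lambda ch: (ch.ppg, ch.pcc))


channels = [
    ControlChannel("oob-ip-2", "non-associated", 1),
    ControlChannel("dcc-1", "associated", 2),
    ControlChannel("dcc-0", "associated", 1),
    ControlChannel("relay-1", "quasi-associated", 1),
]
ranked_names = [ch.name for ch in rank(channels)]
```

Under automatic operation, the first entry of the ranked list would be the natural candidate for the active control channel.
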
In addition, within a PG, the same addressing scheme should be applied to all control channels.

For both PPG and PCC, control channels within the same protection group or in different protection groups should be switched by automatic operation and by operator intervention, both based on provisioning rules. Under automatic operation, among several control channels between adjacent nodes, the control channel with the highest priority becomes the active control channel. However, manual operation via operator intervention has to override automatic operation. In addition, the configuration consistency of control channels between adjacent nodes has to be validated before the switchover function operates.

8.3. Reverting and Non-reverting Modes of Control Channels

Reverting mode within a protection group may cause unnecessary switchover operations. Even if the priority of the current control channel is lower than that of another control channel that has recovered from failure, it is appropriate to keep the current status for simpler switchover operation. That is, non-reverting mode would be more effective within the same protection group.

However, switchover between protection groups differs from switchover within the same protection group. Between protection groups, reverting mode is preferable to non-reverting mode in order to use the control channels with better physical characteristics.

On forced switchover via operator intervention, the two options "switchover enabled" and "switchover disabled" are needed for operation and maintenance. Automatic switchover should be prohibited on a control channel that has undergone forced switchover with "switchover disabled".

9. Relation to LMP

As described in [LMP], LMP runs between adjacent nodes and is used to manage TE links. One of the core functions of LMP is control channel management.
This function is related to the resilience of control networks. Currently, parameter negotiation and the hello protocol are performed within control channel management. A control channel is identified, and hello-related intervals such as HelloInterval (for example, 150 ms) and HelloDeadInterval (at least three times the HelloInterval) are negotiated through parameter negotiation; failures of active control channels are detected using the hello protocol. If a node does not want to use the fast keep-alive mechanism of LMP, it has to set both the HelloInterval and the HelloDeadInterval to zero.

In the in-fiber case, which may not require resilience of control networks, a control channel could be configured automatically using the multicasting scheme. However, in the out-of-fiber case, which would require resilience, there is still no automatic configuration mechanism for control channels. Under the current LMP, on detecting the failure of an active control channel, a node activates additional control channels using control channel management with parameter negotiation and the hello protocol. This amounts to an implicit switchover of control channels, and this form of control channel configuration is not appropriate for time-critical switchover. In particular, we think that this method is not sufficient for the restoration of existing connections in the ASTN environment, or for keeping a TE link out of the degraded state.

In relation to LMP, one approach to be considered is that the resilience capability includes functions such as parameter negotiation and the hello protocol as they are defined in LMP.

10. Functions for the Resilience of Control Networks

Based on the requirements described in the previous sections, the possible functions for the resilience of control networks are as follows. They are subject to modification and are not limited to these.

10.1. Identification of an Active Control Channel

In the initial state, in which an active control channel has not yet been identified between adjacent nodes, each node should first identify its active control channel for exchanging control packets.

10.2. Negotiation of Switchover Attributes

After adjacent nodes have identified an active control channel between them, switchover attributes such as the control modes to be provided, their control channels, and the priorities within a PG should be negotiated. Through this negotiation, a node can learn the switchover aspects of its peer node.

10.3. Verification of Standby Control Channels

After negotiating switchover attributes, adjacent nodes should next verify the connectivity of their standby control channels. Standby control channels that fail the connectivity check should be removed from the candidate list, and only those that pass should be kept in the list. Successful completion of the steps up to this point completes the advance preparation against failures of the control channels connecting the nodes.

10.4. Automatic Switchover

Important conditions for automatic switchover are that the priority principle is based on the combination of PPG and PCC, and that the control channels to be switched on and off are set to "switchover enabled". Then, if a failure of an active control channel is detected, switchover is performed automatically, without operator intervention, to one of the standby control channels within the same PG or within another PG with a lower PPG.
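
The selection step of automatic switchover can be sketched as follows. This is one possible reading of the conditions above, offered as an assumption rather than as defined behavior: the Channel type and the select_standby function are hypothetical, "verified" stands for a successful connectivity check as in Section 10.3, and the preference for the failed channel's own PG before other PGs is an interpretation of the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Channel:
    name: str                        # hypothetical label
    ppg: int                         # priority of the protection group (1 = highest)
    pcc: int                         # priority within the group (1 = highest)
    verified: bool = True            # connectivity check passed (Section 10.3)
    switchover_enabled: bool = True  # "switchover enabled" / "switchover disabled"


def select_standby(failed: Channel, standbys: List[Channel]) -> Optional[Channel]:
    """Pick the standby channel to activate after `failed` goes down.

    Returns None when automatic switchover is not permitted or no
    usable standby channel remains.
    """
    # Both the channel switched off and the one switched on must allow it.
    if not failed.switchover_enabled:
        return None
    candidates = [c for c in standbys if c.verified and c.switchover_enabled]
    if not candidates:
        return None
    # Assumed ordering: same PG first, then by PPG, then by PCC.
    return min(candidates, key=lambda c: (c.ppg != failed.ppg, c.ppg, c.pcc))


active = Channel("cc-1", ppg=1, pcc=1)
standbys = [
    Channel("cc-2", ppg=1, pcc=2),
    Channel("cc-3", ppg=2, pcc=1),
    Channel("cc-4", ppg=1, pcc=3, switchover_enabled=False),
]
first_choice = select_standby(active, standbys)

# If cc-2 later fails its connectivity check, the choice falls to another PG,
# because cc-4 is excluded by "switchover disabled".
standbys[0].verified = False
second_choice = select_standby(active, standbys)
```

Keeping the candidate list pre-verified means the selection itself involves no negotiation at failure time, which is the point of the preparation steps in Sections 10.1 through 10.3.
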
In addition, if the previous active control channel recovers from its failure, and only if the current active control channel has lower priority than the previous active control channel, automatic switchover back to the previous active control channel would be performed. The peer entities for automatic switchover, both the requesting side and the requested side, are functional entities of the control plane.

10.5. Forced Switchover

In contrast to automatic switchover, forced switchover is performed manually through operator intervention for the purpose of operation and maintenance. For forced switchover, the requesting side is a functional entity of the management plane and the requested side is a functional entity of the control plane.

10.6. Inquiry of Switchover Attributes

There may be situations in which the operation system of a node wants to know switchover attributes, such as the PGs and control channel priorities, maintained in its adjacent node. In the resulting notification to the operation system, it is appropriate to also include the local information corresponding to the switchover attributes of the remote node.

10.7. Notification of Protocol Errors

As a general rule, a functional entity for the resilience of control networks should check for protocol error conditions after receiving a message from the peer entity and before acting upon it. If there is no protocol error in the received message, the resulting operations are performed. However, if a protocol error occurs, the receiving entity should notify the sending entity. The possible error cases are as follows:

- Message type error
- Unexpected message
- Mandatory information missing
- Unrecognized information

10.8. Parameter Negotiation

See [LMP].

10.9. Hello Protocol

See [LMP].

10.10. Resultant Management Functions

To support the functions described above for the resilience of control networks, the following management functions could be considered:

- Initiation of forced switchover
- Inquiry of switchover attributes
- Addition and deletion of control channels
- Treatment of provisioning rules
- Notification of protocol errors

Security Considerations

Security issues are not considered in this proposal. However, it appears that the security considerations mentioned in [LMP] could be applied to this protocol as well.

References

[G.807] ITU-T Recommendation G.807, "Requirements for Automatic Switched Transport Networks (ASTN)", July 2001.

[G.7712] ITU-T Recommendation G.7712, "Architecture and Specification of Data Communications Network", March 2003.

[G.8080] ITU-T Recommendation G.8080, "Architecture for the automatically switched optical network (ASON)", March 2003.

[O-UNI] OIF, OIF2000.125.7, "User Network Interface (UNI) 1.0 Signaling Specification", October 2001.

[GMPLS-ARCH] Eric Mannie, "Generalized Multi-Protocol Label Switching (GMPLS) Architecture", Internet Draft, draft-ietf-ccamp-gmpls-architecture-07.txt, May 2003.

[LMP] J. Lang, et al, "Link Management Protocol", Internet Draft, draft-ietf-ccamp-lmp-10.txt, October 2003.

[GMPLS-SIG] Lou Berger, et al, "Generalized MPLS Signaling Functional Description", Internet Draft, draft-ietf-mpls-generalized-signaling-09.txt, August 2002.

Author's Addresses

Young Hwa Kim
ETRI
61 Gajeong-dong, Yuseong-gu, Daejeon, 305-350
Phone: +82 42 860 5819
E-mail: yhwkim@etri.re.kr

Byung Ho Yae
ETRI
61 Gajeong-dong, Yuseong-gu, Daejeon, 305-350
Phone: +82 42 860 5819
E-mail: bhyae@etri.re.kr

Jin Ho Hahm
ETRI
161 Gajeong-dong, Yuseong-gu, Daejeon, 305-350
Phone: +82 42 860 6048
E-mail: jhhahm@etri.re.kr

Avri Doria
ETRI
161 Gajeong-dong, Yuseong-gu, Daejeon, 305-350
Phone: +82 42 860 1019
E-mail: avri@etri.re.kr

Jun Kyun Choi
Information and Communications University (ICU)
58-4 Hwa Ahm Dong, Yuseong, Daejeon, 305-732
Phone: +82-42-866-6122
E-mail: jkchoi@icu.ac.kr

Jae Cheol Ryou
Dept. of Computer Science, Chungnam National University (CNU)
220 Gung-dong, Yuseong-gu, Daejeon, 305-764
Phone: +82 42 821 7443
E-mail: jcryou@home.cnu.ac.kr