CCAMP Working Group           Richard Rabbat (Fujitsu Labs. of America) 
Internet Draft                 Ching-Fong Su (Fujitsu Labs. of America) 
Expires: April 2004                      Vishal Sharma (Metanoia, Inc.) 
                                                                        
                                                           October 2003 
    
   Observations on the Applicability of the Fault Notification Protocol 
                   draft-rabbat-fnp-applicability-00.txt 
     
Status of this Memo  
    
   This document is an Internet-Draft and is in full conformance with 
   all provisions of Section 10 of RFC2026 [1]. 
    
   Internet-Drafts are working documents of the Internet Engineering 
   Task Force (IETF), its areas, and its working groups.  Note that 
   other groups may also distribute working documents as Internet-
   Drafts.  
    
   Internet-Drafts are draft documents valid for a maximum of six months 
   and may be updated, replaced, or obsoleted by other documents at any 
   time. It is inappropriate to use Internet-Drafts as reference 
   material or to cite them other than as "work in progress."  
    
   The list of current Internet-Drafts can be accessed at  
        http://www.ietf.org/ietf/1id-abstracts.txt  
   The list of Internet-Draft Shadow Directories can be accessed at  
        http://www.ietf.org/shadow.html. 
    
    
Abstract  
    
   The Fault Notification Protocol (FNP) is a set of procedures designed 
   to enable time-bounded failure notification in networks using an IP-
   based control plane. This document discusses the applicability of FNP 
   in the context of optical transport networks. It highlights the 
   protocol’s principles of operation, and then describes the network, 
   node, fault, and operational models in optical networks for which the 
   protocol is designed. It also discusses the relationship to higher 
   layers, and issues of scalability. Some guidelines for deployment are 
   also provided. 
    
    
Rabbat, et al            Expires - April 2004                 [Page 1] 
 
                draft-rabbat-fnp-applicability-00.txt      October 2003 
 
 
Table of Contents 
    
   1. Introduction
   2. Terminology
   3. Operational Overview of FNP
   4. FNP Applicability
   4.1 Network Model
   4.2 Node Architecture
   4.3 Fault Model (Types of faults supported)
   4.4 Network Layer at which FNP Applies
   4.5 Relationship to Higher (Packet) Layers
   4.6 Operational Model
   4.7 Framing and Data Plane Considerations
   4.8 Scalability Considerations
   4.9 Guidelines for Deployment
   5. Conclusion
   6. Acknowledgements
   7. Intellectual Property Considerations
   8. References
   9. Authors' Addresses
   10. Full Copyright Statement
    
    
1.   Introduction 
    
   As carriers move towards offering advanced services on their
   networks, with a tighter integration of the different network layers,
   the ability to provide rapid and scalable restoration is crucial for
   meeting agreed-upon SLAs, either between providers or between the
   end customer and a provider. In this context, time-bounded fault
   notification will be a key component of the overall carrier
   restoration strategy.
    
   The Fault Notification Protocol (FNP) [2] is a protocol developed to 
   meet this service provider requirement. It is designed to facilitate 
   rapid restoration by enabling time-bounded fault notification in 
   networks that use an IP-based control plane. 
    
   The purpose of this memo is to discuss the applicability of FNP in 
   the context of optical transport networks. 
    
    
2.   Terminology 
    
   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 
   document are to be interpreted as described in RFC 2119 [3]. 
    
    
3.   Operational Overview of FNP 
    
   In this section, we briefly review the basic operation of FNP,
   confining our discussion to optical transport networks.
    
   Fundamentally, FNP is a set of procedures designed to provide
   time-bounded fault notification in a network with shared protection,
   that is, a network in which either the protection route between two
   nodes carries "extra traffic" from two or more disjoint trails, or
   the provider implements M:N type shared restoration.
    
   Once a network fault is detected, the node detecting the fault sends 
   out a fault notification message to each of its neighbors on the 
   control plane. The message essentially identifies the resource 
   (fiber, lambda, or node) at fault; this allows any network node 
   receiving a fault notification message to determine whether it lies 
   on the path of a backup LSP corresponding to a working LSP affected 
   by the fault. The message also carries the time (per the local clock) 
   at which the fault was detected. 
    
   Each network node, upon receipt of a fault notification message,
   first transmits the message on each of its remaining outgoing
   interfaces, and then processes the message to determine whether it
   lies on the path of one or more backup LSPs that need to be
   activated as a result of that fault. If so, the node first drops any
   extra traffic that was using the resources originally reserved for
   each such backup LSP, and then reconfigures its cross-connect
   hardware so that the working traffic arriving on the backup LSP can
   be directed to the appropriate outgoing link/interface.
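
   The per-node handling described above can be sketched in Python as
   follows. This is an illustrative sketch only, not the FNP
   specification: the class names, the fault_id field, and the
   duplicate-suppression set are assumptions introduced here, and
   setting the active flag merely stands in for the actual cross-connect
   reconfiguration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FaultNotification:
    """Carries the failed resource and the detection time named in the text."""
    fault_id: str          # hypothetical field, used to suppress duplicates
    failed_resource: str   # fiber, lambda, or node at fault
    t_detect_ms: float     # detection time per the detecting node's clock


@dataclass
class BackupLsp:
    name: str
    protects_against: frozenset   # resources used by the working LSP
    extra_traffic: list = field(default_factory=list)
    active: bool = False


@dataclass
class Node:
    name: str
    interfaces: list                            # control-plane interfaces
    backup_lsps: list = field(default_factory=list)
    seen: set = field(default_factory=set)      # fault_ids already flooded
    sent: list = field(default_factory=list)    # log of (interface, message)

    def on_notification(self, msg, arrived_on):
        """Forward first, then activate affected backup LSPs; return their names."""
        if msg.fault_id in self.seen:           # assumed duplicate suppression
            return []
        self.seen.add(msg.fault_id)
        # Step 1: transmit on every interface except the one of arrival.
        for ifname in self.interfaces:
            if ifname != arrived_on:
                self.sent.append((ifname, msg))
        # Step 2: find local backup LSPs whose working LSP used the resource.
        activated = []
        for lsp in self.backup_lsps:
            if msg.failed_resource in lsp.protects_against and not lsp.active:
                lsp.extra_traffic.clear()       # drop extra traffic first
                lsp.active = True               # placeholder for cross-connect setup
                activated.append(lsp.name)
        return activated
```

   In this sketch, a node that receives the same notification a second
   time neither forwards nor reprocesses it, which is one simple way to
   keep the flood from looping.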
    
   The flooding mechanism ensures that information about the fault is
   propagated to each network node in the minimum number of hops on the
   control plane. Provided that the fault notification packet is given
   high priority in the transmission queues at each node, it also
   ensures that the notification is propagated in the shortest possible
   time.
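
   To illustrate the hop-count claim, the following sketch models the
   control plane as an undirected graph and assumes, for simplicity,
   that every control channel has the same delay, so that flooding
   behaves like a breadth-first search. The function name and the
   topology format are hypothetical.

```python
from collections import deque


def flood_hops(adjacency, origin):
    """Return {node: hop count at which the flood first reaches it}."""
    hops = {origin: 0}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            # Only the first copy of the notification matters; later
            # copies arrive over longer paths and are ignored.
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return hops
```

   Because each node re-floods the first copy it receives, the hop count
   recorded for a node in this model is the length of its shortest
   control-plane path from the detecting node.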
    
   A protection-switching node, upon receiving the notification message,
   waits until the notification time bound (Tntf) has elapsed since the
   time at which the fault was detected (Tdetect), and then switches
   traffic from the affected working path(s) to the backup path(s). Note
   that this eliminates the phase of signaling that a signaling-based
   approach would typically need in order to activate the nodes along
   the backup LSP.
    
   The key is to ensure that by the time a protection-switching node 
   performs the switch, all intermediate nodes along the associated 
   backup path(s) will have configured themselves. This is assured by 
   selecting a backup path in such a way that for any fault on the 
   corresponding working path, all of the nodes along the backup path 
   will have been informed (and will have reconfigured themselves)
   within a time Tntf following Tdetect. Therefore, a
   protection-switching node that learns of the fault via a fault
   notification message at time Tcurrent performs the switch
   [Tntf - (Tcurrent - Tdetect)] ms later.
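
   The timing rule can be written out as a short Python sketch. It
   assumes that Tdetect and Tcurrent are taken from comparable clocks
   and expressed in milliseconds; the function names, the clamping of a
   late notification to zero, and the per-node delay table are
   assumptions made for illustration.

```python
def switch_delay_ms(t_ntf_ms, t_detect_ms, t_current_ms):
    """Time still to wait before switching: Tntf - (Tcurrent - Tdetect)."""
    elapsed_ms = t_current_ms - t_detect_ms
    # Assumption: if the bound has already expired, switch immediately.
    return max(0.0, t_ntf_ms - elapsed_ms)


def backup_path_meets_bound(t_ntf_ms, ready_after_ms):
    """Check the path-selection condition for one fault.

    ready_after_ms maps each node on the backup path to the time, measured
    from Tdetect, by which it has been notified and has reconfigured itself.
    """
    return all(delay <= t_ntf_ms for delay in ready_after_ms.values())
```

   For example, with Tntf = 50 ms, a notification that arrives 12 ms
   after detection leaves 38 ms to wait, and a backup path is acceptable
   for that fault only if every node on it is ready within 50 ms.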
    
    
4.   FNP Applicability 
    
   Our objective in this section is to clearly specify how and where FNP 
   applies in the context of optical transport networks, by discussing 
   its applicability along several dimensions, as outlined below.  
    
4.1     Network Model 
    
   FNP is initially designed to operate within a single IGP area, where 
   fine-grained signaling is used.  
    
   In fine-grained signaling, the entire backup resource (link, lambda, 
   and hence, label) is selected during the initial signaling phase for 
   the backup path. FNP could also apply to coarse-grained signaling
   (where only a link bundle is selected during the signaling of the
   backup path, and the specific lambda and, hence, label are selected
   upon the occurrence of a failure). However, that requires
   coordination with signaling between adjacent nodes, and is left for
   further study.
    
   FNP is useful in contexts where either: (a) the provider implements
   1:1 restoration and allows the bandwidth on the backup path to be
   shared by trails that originate and terminate at nodes other than the
   source-destination (s-d) pair of the backup path, or (b) the provider
   implements more general shared-mesh restoration, where multiple
   working LSPs with disjoint paths share backup resources.
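
   The disjointness condition in case (b) can be checked mechanically.
   The sketch below is an assumption-laden illustration: it represents
   each working path as a list of resource names (a hypothetical
   encoding) and treats a set of working LSPs as eligible to share
   backup resources only if no resource appears on two of the paths.

```python
from itertools import combinations


def may_share_backup(working_paths):
    """True if the working paths are pairwise disjoint."""
    return all(
        set(path_a).isdisjoint(path_b)
        for path_a, path_b in combinations(working_paths, 2)
    )
```

   Under this model, a fault on any single listed resource affects at
   most one of the working LSPs, so at most one of them needs the shared
   backup resources for that fault.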
    
4.2    Node Architecture 
    
   FNP is designed to work in networks with optical-electrical-optical
   (OEO) nodes. Its applicability to networks with
   optical-optical-optical (OOO) nodes (that is, fully transparent
   all-optical networks) depends on the monitoring capabilities of the
   OOO systems deployed, and is for further study.

   For a network with OEO nodes, fault detection and correlation
   (which happen before FNP is activated, and are outside the scope of
   this document) occur at the node closest to the fault. Once the
   detection procedure has determined that a bona fide fault has
   occurred, it activates FNP for fault notification.
    
4.3    Fault Model (Types of faults supported) 
    
   FNP is designed to support three types of faults in an optical 
   transport network – fiber cuts, transponder failures, and switch 
   failures. These correspond, respectively, to link faults, lightpath 
   or LSP faults, and node faults. 
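
   The correspondence stated above can be captured as a small lookup
   table. The enumeration and the string keys below are hypothetical
   names chosen for illustration; FNP itself does not define them.

```python
from enum import Enum


class FaultScope(Enum):
    LINK = "link"
    LSP = "lightpath or LSP"
    NODE = "node"


# Mapping taken from the text: physical failure -> fault scope.
FAULT_SCOPE = {
    "fiber cut": FaultScope.LINK,
    "transponder failure": FaultScope.LSP,
    "switch failure": FaultScope.NODE,
}


def scope_of(failure):
    """Return the fault scope for one of the three supported failures."""
    return FAULT_SCOPE[failure]
```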
    
4.4    Network Layer at which FNP Applies 
    
   In the case of optical transport networks, FNP is designed to operate 
   at the fiber and optical lightpath layers. The protocol works in the 
   context of an optical transport layer that is controlled by an IP-
   based control plane. 
    
   The operation of FNP in a multi-layer context is a complex problem,
   and is for further study. For example, in a multi-layer situation,
   the goal might be to perform notification both at the layer closest
   to the fault (as FNP currently does) and at the service layer (for
   example, at the level of a VT1.5 circuit that would typically be
   embedded inside a larger SONET/SDH circuit on a lightpath).
    
4.5    Relationship to Higher (Packet) Layers 
    
   A key aspect of using FNP at the optical transport layer to provide 
   time-bounded notification (and hence recovery) is to be able to 
   provide the higher (packet) layer some guarantees on how long the 
   optical transport layer would take to respond to a failure. 
    
   This allows carriers to implement appropriate hold-off timers at the
   higher layers, and to use this information to craft adequate SLAs
   with their customers.

   Where the client layers (the higher (packet) layers) and the server
   layer (the optical transport layer) are under the control of
   different providers, it is reasonable to expect that the
   inter-provider agreements between the carriers would incorporate
   protection switching timing bounds. In that case, the notification
   timing bound guarantees provided by the carrier owning/operating the
   server layer would enable the carrier owning/operating the client
   layer to, in turn, incorporate these bounds in the SLAs it signs.
   This notion could be applied recursively between pairs of adjacent
   carriers.
    
4.6    Operational Model 
    
   FNP is applicable in a hierarchical network layering model, for 
   example, packet over SONET/SDH over lambda over fiber, with the 
   recognition that the SONET/SDH layer is itself a layered architecture 
   (for example, VT1.5 in STS-1 in STS-3 in STS-12/48).  
    
   Note that FNP does not by itself impose any requirements on the 
   policy that the provider uses to devise pre-emption schemes in the 
   case where shared restoration and extra-traffic are used. As a 
   practical matter, however, the carrier (or carriers) involved would 
   have to devise pre-emption schemes that are not susceptible to a 
   domino effect (where the removal of some extra-traffic LSP causes a 
   cascading effect, triggering the pre-emption of a series of LSPs). A 
   carrier would be expected to ensure this simply to maintain network 
   stability. 
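
   One way to reason about the domino effect is to model a pre-emption
   policy as a relation and follow it transitively. The sketch below is
   purely illustrative: the preempts mapping (from an LSP to the LSPs
   that are pre-empted as a consequence of its removal) is a
   hypothetical representation, not something defined by FNP.

```python
def preemption_cascade(preempts, first_victim):
    """Return every LSP pre-empted as a consequence of removing first_victim."""
    cascade = []
    seen = {first_victim}
    stack = [first_victim]
    while stack:
        lsp = stack.pop()
        for victim in preempts.get(lsp, ()):
            if victim not in seen:      # guard against cycles in the policy
                seen.add(victim)
                cascade.append(victim)
                stack.append(victim)
    return cascade
```

   In this model, a policy is free of the domino effect when the
   returned list is empty for every extra-traffic LSP.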
    
4.7    Framing and Data Plane Considerations 
    
   FNP is a control-plane mechanism for disseminating fault information 
   throughout a network. As explained in Section 3, in the context of 
   transport networks, the flooding mechanism of FNP accomplishes both 
   notification and node reconfiguration simultaneously. That is, it 
   informs the intermediate nodes along a backup LSP corresponding to an 
   affected working LSP of a fault, thus allowing them to reconfigure 
   themselves, while at the same time notifying the edge nodes 
   responsible for taking a restoration action to recover the affected 
   LSP(s). 
    
   When appropriate digital framing of the optical signal is available 
   in the data plane (e.g. G.709 digital wrapper or SONET/SDH framing), 
   and the optical transport nodes can process and interpret the framing 
   overhead, FNP can interwork with the fault notification mechanisms 
   available in the data plane (e.g. the Forward/Backward Defect 
   Indication signals embedded in the framing overhead). In this case, 
   even though notification of the end nodes may occur in the data 
   plane, the notification of the nodes along the backup paths of the 
   affected working paths is still needed so that they can reconfigure 
   themselves. This can be accomplished via FNP. 
    
4.8    Scalability Considerations 
    
   FNP ensures that at most one message is exchanged on every control
   channel link, whereas fault notification using signaling may lead to
   a large number of signaling messages per link, as explained below.
   This leads to a scalability advantage for DWDM networks that have a
   large number of wavelengths, or when there are numerous LSPs, each
   corresponding to a small-granularity SONET/SDH channel.
    
   Let us define the length of a control channel between two adjacent
   nodes to be the number of hops that a control message takes to go
   from one node to the other.  Thus, an in-band control channel has
   length one.  By extension, the length of a path in the control plane
   is the sum of the lengths of the control channels used in this path.
   With signaling, the maximum number of messages per failed LSP is
   equal to the length of the path that the notification message takes
   from the detecting node to the protection switching point "s", plus
   twice the sum of the lengths of the control channels corresponding
   to each hop of the protection path from s to d (s-d protection path 
   on the control channel).  For the set of affected LSPs, that value is
   multiplied by the number of LSPs affected by the fault.  The number
   of messages, in the worst case, is thus directly proportional to the
   number of LSPs affected.  By comparison, the maximum number of
   messages for FNP is equal to the sum of the lengths of all control
   channels in the network, regardless of how many LSPs are affected.
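
   The two worst-case counts can be compared numerically. The sketch
   below transcribes the expressions given in this section; the function
   and parameter names are hypothetical, and, as in the text, every
   affected LSP is assumed to have the same notification and protection
   path lengths.

```python
def signaling_messages(notify_path_len, protection_hop_lens, affected_lsps):
    """Worst-case message count when backup LSPs are activated by signaling."""
    # Per failed LSP: notification path to "s", plus twice the s-d
    # protection path length measured on the control channels.
    per_lsp = notify_path_len + 2 * sum(protection_hop_lens)
    return per_lsp * affected_lsps


def fnp_messages(control_channel_lens):
    """Worst-case message count for FNP: one message per control channel hop."""
    return sum(control_channel_lens)
```

   For instance, with a notification path of length 2, a protection path
   of three in-band hops, and 40 affected LSPs, the signaling expression
   gives (2 + 2*3) * 40 = 320 messages, whereas a network with ten
   in-band control channels gives 10 messages for FNP.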
    
4.9    Guidelines for Deployment 
    
   While use of FNP can be appropriate in a variety of situations, we 
   provide some initial thoughts on deployment considerations here.  
    
   FNP is expected to be very useful in core optical networks where the 
   provider deploys a mesh-based topology and has a large number of 
   active lambdas (or the possibility of having several lambdas turned 
   on as the network grows). As explained in Section 4.8, this would
   save on the signaling overhead of individually activating each
   backup LSP.
    
   As explained in Section 4.5, FNP is applicable in situations where 
   adjacent client and server layers are under the control of different 
   providers.  Although FNP does not impose a limit on how many 
   providers may be involved in offering service to the end customer, 
   practical considerations would dictate that this “recursion” of 
   provider client-server relationships not be more than a few levels 
   deep. 
    
    
5.   Conclusion 
    
   This document has provided an overview of the domain of applicability
   of FNP in the context of optical transport networks. By 
   outlining the network, node, and fault models to which FNP applies, 
   the document has provided guidelines on where FNP is currently 
   usable, and outlined areas of further work. 
    
    
6.   Acknowledgements 
    
   We would like to thank the members of the CCAMP WG for on-line and
   off-line discussions that helped shape some of the ideas behind this
   document. In particular, we thank Adrian Farrel, Zafar Ali, Neil
   Harisson, Jonathan Sadler, Jonathan Lang, Fabio Ricciato, and Roberto
   Albanese.
    
    
7.   Intellectual Property Considerations 
    
   This section is taken from Section 10.4 of RFC2026 [1]. 
    

   The IETF takes no position regarding the validity or scope of any 
   intellectual property or other rights that might be claimed to 
   pertain to the implementation or use of the technology described in 
   this document or the extent to which any license under such rights 
   might or might not be available; neither does it represent that it 
   has made any effort to identify any such rights. Information on the 
   IETF's procedures with respect to rights in standards-track and 
   standards-related documentation can be found in BCP-11. Copies of 
   claims of rights made available for publication and any assurances of 
   licenses to be made available, or the result of an attempt made to 
   obtain a general license or permission for the use of such 
   proprietary rights by implementors or users of this specification can 
   be obtained from the IETF Secretariat. 
    
   The IETF invites any interested party to bring to its attention any 
   copyrights, patents or patent applications, or other proprietary 
   rights, which may cover technology that may be required to practice 
   this standard. Please address the information to the IETF Executive 
   Director. 
    
8.   References
                     
   [1] Bradner, S., "The Internet Standards Process -- Revision 3", BCP 
      9, IETF RFC 2026, October 1996. 
    
   [2] Rabbat, R., and V. Sharma (Eds.), "Fault Notification Protocol 
      for GMPLS-Based Recovery", Internet Draft, work in progress, 
      draft-rabbat-fault-notification-protocol-03.txt, June 2003. 
     
   [3] Bradner, S., "Key words for use in RFCs to Indicate Requirement 
      Levels", BCP 14, IETF RFC 2119, March 1997. 
    

9.   Authors' Addresses  
      
   Richard Rabbat                       
   Fujitsu Labs of America, Inc.        
   1240 E. Arques Ave, MS 345           
   Sunnyvale, CA 94085                  
   United States of America             
   Phone: +1-408-530-4537               
   Email: rabbat@fla.fujitsu.com        
    
   Ching-Fong Su 
   Fujitsu Labs of America, Inc. 
   1240 E. Arques Ave 
   Sunnyvale, CA 94085 
   United States of America 
   Phone: +1-408-530-4572 
   Email: csu@fla.fujitsu.com 
    
   Vishal Sharma 
   Metanoia, Inc. 
   1600 Villa Street, Unit 352 
   Mountain View, CA 94041 
   United States of America 
   Phone: +1-408-530-8313 
   Email: v.sharma@ieee.org 


10.    Full Copyright Statement 
    
   "Copyright (C) The Internet Society (2003). All Rights Reserved. 
   This document and translations of it may be copied and furnished to  
   others, and derivative works that comment on or otherwise explain it 
   or assist in its implementation may be prepared, copied, published 
   and distributed, in whole or in part, without restriction of any 
   kind, provided that the above copyright notice and this paragraph are 
   included on all such copies and derivative works. However, this 
   document itself may not be modified in any way, such as by removing 
   the copyright notice or references to the Internet Society or other 
   Internet organizations, except as needed for the purpose of 
   developing Internet standards in which case the procedures for 
   copyrights defined in the Internet Standards process must be 
   followed, or as required to translate it into languages other than 
   English. 
    
   The limited permissions granted above are perpetual and will not be 
   revoked by the Internet Society or its successors or assigns. 
    
   This document and the information contained herein is provided on an 
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING 
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING 
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION 
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF 
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE." 
    
    