Internet Engineering Task Force                               Ping Zhang                     
Internet Draft                                              Jia Jia Liao  
Expires: December 2006                                      Zheng Bin Li  
                                                               An Shi Xu
          National Laboratory on Local Fiber-Optic Communication Network
                                 & Advanced Optical Communication System
                                                Peking University, China
                                                               June 2006


           Retransmission in Optical Burst Switching Network
                  draft-zhang-ccamp-obs-00.txt 


Status of this Memo

   By submitting this Internet-Draft, each author represents that any 
   applicable patent or other IPR claims of which he or she is aware 
   have been or will be disclosed, and any of which he or she becomes 
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering 
   Task Force (IETF), its areas, and its working groups. Note that other
   groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months 
   and may be updated, replaced, or obsoleted by other documents at any 
   time. It is inappropriate to use Internet-Drafts as reference 
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at 
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at 
   http://www.ietf.org/shadow.html.

Copyright Notice

   Copyright (C) The Internet Society (2006). All Rights Reserved.

Abstract

   Reliability is one of the essential requirements of a data
   transmission network. In a traditional TCP/IP network, TCP provides
   end-to-end reliable transmission. Because the burst loss probability
   in an Optical Burst Switching (OBS) network is high, leaving all
   retransmission to the upper TCP/IP layer may overload the network.
   Besides the simple retransmission used in TCP/IP networks, an OBS
   network, which is built on a Wavelength Division Multiplexing (WDM)
   system, supports a new retransmission scheme: alternate wavelength
   retransmission. Using the network status information collected at
   the edge nodes, retransmission in the OBS layer is more efficient
   and incurs shorter delay. TCP and OBS retransmission can also work
   together; this is called multi-layer retransmission.
   
Zhang,Liao,Li,Xu                                             [Page 1]
------------------------------------------------------------------------
Internet Draft       draft-zhang-ccamp-obs-00.txt           June 2006

Conventions

   The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 
   document are to be interpreted as described in RFC2119 [RFC 2119].

Table of Contents

   1. Introduction
      1.1. OBS Network Architecture
      1.2. Multi-layer Interoperation
   2. TCP Retransmission
      2.1. Retransmission Policy
          2.1.1. First-only
          2.1.2. Batch
          2.1.3. Individual
      2.2. Retransmission Timer
          2.2.1. Round-Trip Time
          2.2.2. Exponential RTO Backoff
   3. OBS Retransmission
      3.1. Motivation
      3.2. OBS Retransmission Parameters
          3.2.1. Delay
          3.2.2. Edge Node Buffer
      3.3. OBS Retransmission Schemes
          3.3.1. Simple Retransmission
          3.3.2. Alternate Wavelength Retransmission
          3.3.3. Alternate Routing Retransmission
   4. Multi-layer Retransmission
      4.1. Coordination of TCP and OBS Retransmission
   5. Acknowledgements
   6. References
   7. AUTHORS' ADDRESSES
   8. IPR NOTICE
   9. FULL COPYRIGHT STATEMENT

1. Introduction

   The basic difference among Optical Circuit Switching (OCS), Optical
   Burst Switching (OBS) and Optical Packet Switching (OPS) is the
   granularity at which they switch. OCS, which is already widely
   deployed, switches at the wavelength, waveband or even fiber level.
   In most cases, however, individual users can hardly afford a whole
   wavelength, so TDM is applied to give each channel a fixed share of
   the total bandwidth by splitting a wavelength into recurring time
   slots. This approach has proved to be less bandwidth-efficient than
   OPS, which switches packets in the optical domain as an IP network
   does. However, some technologies essential to OPS, such as optical
   random access memory, are still far from mature. OBS, intended to
   bridge these two mechanisms, switches at the sub-wavelength level
   without buffering. With one-way reservation and statistical
   multiplexing of wavelength resources, OBS brings great flexibility
   to optical bandwidth allocation.
   
1.1. OBS Network Architecture

   An OBS network is composed of two planes, namely the data plane and
   the control plane, as shown in Figure 1. In the data plane, traffic
   from the OBS client layer (e.g. the IP or ATM layer) is aggregated
   into Data Bursts (DBs) at ingress edge nodes, which act as the
   interface to the upper layer and are located at the edge of the OBS
   layer [IPOWDM]. DBs are sent through core nodes to their egress edge
   nodes without o-e-o conversion; along the way, a few optical fiber
   delay lines may be used to reduce the overall blocking probability.
   A DB usually contains tens to thousands of kilobits, including
   payload and framing overhead.
  
  
                                 [ core ]
                                /[ node ]\
                               /     |    \
                              /  |------|  \
                             /   | -\/- |   \
             |      [ core ]/   /| -/\- |\   \[ core ]      |
      from   |------[ node ]\  / |------| \  /[ node ]------|   to 
   upperlayer|          |    \/            \/    |          |upperlayer
     ------->|      |------| /\            /\ |------|      |-------->
     ------->|------| -\/- |/  \          /  \| -\/- |------|-------->
     ------->|      | -/\- |\   \[ core ]    /| -/\- |      |-------->
     traffic |      |------| \   [ node ]   / |------|      | traffic
             |                \      |     /                |
     Ingress edge node         \ |------| /          Egress edge node
                                \| -\/- |/
                                 | -/\- |
                                 |------|    

	      
   Figure 1:  OBS Network Architecture



   The control plane mainly performs routing and resource reservation
   by configuring the optical switching fabric according to signalling.
   An offset time ahead of each DB, a Control Packet (CP) is sent
   through the core nodes to set up a light path from the ingress edge
   node to the egress edge node. At each core node, the CP undergoes
   o-e-o conversion and is processed electronically to configure the
   switching fabric. If every configuration succeeds, the DB can cross
   the network. However, if any core node along the light path fails to
   configure its fabric, the DB has to be discarded and the bandwidth
   already reserved is released.
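
   The hop-by-hop reservation described above can be illustrated with
   a minimal Python sketch. The names CoreNode and reserve_path, and
   the model of a node as a set of free wavelengths, are assumptions
   made for illustration only; they are not defined by this draft.

```python
from dataclasses import dataclass, field


@dataclass
class CoreNode:
    """Hypothetical core node modelled as a set of free wavelengths."""
    name: str
    free_wavelengths: set = field(default_factory=set)

    def reserve(self, wavelength: int) -> bool:
        """Configure the fabric for one burst; fail if the wavelength is busy."""
        if wavelength in self.free_wavelengths:
            self.free_wavelengths.remove(wavelength)
            return True
        return False

    def release(self, wavelength: int) -> None:
        """Give back a wavelength that was reserved earlier."""
        self.free_wavelengths.add(wavelength)


def reserve_path(path, wavelength):
    """Process a CP at each core node along the path, in order.

    Returns (True, None) if every node was configured. Otherwise the
    reservations already made are released and (False, name) is
    returned, where name identifies the node that failed.
    """
    reserved = []
    for node in path:
        if not node.reserve(wavelength):
            for earlier in reserved:
                earlier.release(wavelength)
            return False, node.name
        reserved.append(node)
    return True, None
```

   In this sketch the name of the failing node is returned to the
   caller, which corresponds to the failure information that an edge
   node could use when deciding how to retransmit.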
	      
1.2. Multi-layer Interoperation	      
	  
   The OBS layer, viewed as a data-link layer, aims to provide a
   reliable end-to-end path for data transmission. As shown in Figure
   2, its server layer is the optical layer, with its huge physical
   bandwidth, and its client layer can be the network layer (e.g. IP)
   or another layer such as ATM. With the growth in volume and
   importance of IP traffic, IP-based applications have become
   dominant. This draft therefore considers only IP as the client
   layer, to which OBS provides a bit pipe.
	 
	 
             |+++++++++++++++++++|          |+++++++++++++++++++|
             | Application Layer |<-------->| Application Layer |
             |+++++++++++++++++++|          |+++++++++++++++++++|
             |     IP Layer      |<-------->|   Network Layer   |
             |+++++++++++++++++++|          |+++++++++++++++++++|
             |     OBS Layer     |<-------->|  Data-link Layer  |
             |+++++++++++++++++++|          |+++++++++++++++++++|
             |   Optical Layer   |<-------->|   Physical Layer  |
             |+++++++++++++++++++|          |+++++++++++++++++++|
    
          	 
   Figure 2: Layered Network       	 

   The optical layer, lying under the OBS layer, handles optical signal
   transmission, amplification, multiplexing and demultiplexing. From
   the viewpoint of OBS, the optical layer offers Optical Channel-Path
   (OCh-P) connections between distributed OBS nodes. An OCh-P
   represents the end-to-end transport of a lightpath across multiple
   regenerators in the path [Optical].

   OBS nodes are classified as edge nodes and core nodes. An edge node
   consists of aggregating queues, a CP generator, and CP and DB
   transmitters or receivers. A core node comprises a CP processor, a
   switch driver, and a switching fabric.

2. TCP Retransmission

   In a TCP/IP network, TCP provides end-to-end reliable transmission.
   When a packet is sent, a copy of it is buffered at the source node.
   If the packet is successfully received at the destination node, the
   copy is destroyed. Otherwise, the source node automatically
   retransmits the packet after a preset time.

2.1. Retransmission Policy
  
   At the source node, all packets that have been sent but not yet
   acknowledged by the destination node are kept in a buffering queue.
   If the source node does not receive an acknowledgement within a
   given time, the corresponding packet is retransmitted. One of the
   following three retransmission policies may be adopted [Comm]:

2.1.1. First-only

   Only one retransmission timer is needed for the entire queue. When
   an acknowledgement is received, the acknowledged packet is removed
   from the buffering queue and the timer is reset. When the timer
   expires, the packet at the front of the queue is retransmitted and
   the timer is reset.

2.1.2. Batch

   One retransmission timer is also used for the entire queue. When an
   acknowledgement is received, the acknowledged packet is removed
   from the buffering queue and the timer is reset. When the timer
   expires, all packets in the queue are retransmitted and the timer
   is reset.

2.1.3. Individual

   One retransmission timer is kept for each packet in the queue. When
   an acknowledgement is received, the acknowledged packet is removed
   from the buffering queue and its timer is destroyed. When any timer
   expires, the corresponding packet is retransmitted individually and
   its timer is reset.

   
   The first-only policy is efficient in terms of traffic because only
   lost packets are retransmitted. However, packets in the queue may
   experience considerable delay, because the timer for a packet is
   not set until it moves to the front of the queue. The individual
   policy reduces this delay by keeping one timer per packet. The
   batch policy also reduces the delay, at the cost of unnecessary
   retransmissions.
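
   The difference between the three policies lies in which packets
   are resent when a timer expires. The following Python sketch shows
   only that decision. The function names and the representation of
   the buffering queue as a deque of packet identifiers are
   assumptions made for illustration.

```python
from collections import deque


def on_ack(queue, packet_id):
    """Remove an acknowledged packet from the buffering queue."""
    if packet_id in queue:
        queue.remove(packet_id)


def on_timeout(queue, policy, expired=None):
    """Return the packets to retransmit when a retransmission timer expires.

    queue   -- deque of unacknowledged packet ids, oldest first
    policy  -- "first-only", "batch" or "individual"
    expired -- id of the packet whose own timer expired (individual only)
    """
    if not queue:
        return []
    if policy == "first-only":
        return [queue[0]]            # only the packet at the front
    if policy == "batch":
        return list(queue)           # every packet still unacknowledged
    if policy == "individual":
        return [expired] if expired in queue else []
    raise ValueError(f"unknown policy: {policy}")


# Example: three packets sent, the second one acknowledged.
queue = deque(["p1", "p2", "p3"])
on_ack(queue, "p2")
print(on_timeout(queue, "first-only"))
print(on_timeout(queue, "batch"))
print(on_timeout(queue, "individual", expired="p3"))
```

   Timer management itself (one shared timer versus one timer per
   packet) is left out of the sketch; only the selection of packets
   is shown.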
   
2.2. Retransmission Timer

   The retransmission timer (also called the retransmission timeout,
   RTO) is a critical parameter in TCP congestion control. If the
   timer is set too long, the end-to-end transmission delay may not be
   acceptable. If it is set too short, it causes unnecessary
   retransmissions.

2.2.1. Round-Trip Time   

     
   The retransmission timeout (RTO) is usually set according to the
   estimated round-trip time (RTT). There are two approaches to
   estimating the round-trip time: Simple Average and Exponential
   Average [RFC 793].

   Simple Average:

        ARTT(K+1) = (K * ARTT(K) + RTT(K+1)) / (K+1)

   where RTT(i) is the round-trip time observed for the i-th
   transmitted packet, and ARTT(K) is the average round-trip time for
   the first K packets, with ARTT(0) = 0.

   Exponential Average:

        SRTT(K+1) = a * SRTT(K) + (1-a) * RTT(K+1)

   where SRTT(K) is the smoothed round-trip time for the first K
   packets, with SRTT(0) = 0, and a (0 < a < 1) is a constant. A
   smaller a gives more weight to recent observations.
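
   Both estimators can be written directly from the formulas above.
   The following Python sketch is for illustration; the function
   names and the default value a = 0.875 are assumptions, not values
   prescribed by this draft.

```python
def simple_average(samples):
    """ARTT(K+1) = (K * ARTT(K) + RTT(K+1)) / (K+1), with ARTT(0) = 0."""
    artt = 0.0
    for k, rtt in enumerate(samples):    # k = number of samples seen so far
        artt = (k * artt + rtt) / (k + 1)
    return artt


def exponential_average(samples, a=0.875):
    """SRTT(K+1) = a * SRTT(K) + (1 - a) * RTT(K+1), with SRTT(0) = 0."""
    if not 0.0 < a < 1.0:
        raise ValueError("a must satisfy 0 < a < 1")
    srtt = 0.0
    for rtt in samples:
        srtt = a * srtt + (1.0 - a) * rtt
    return srtt


rtts_ms = [100.0, 200.0, 300.0]
print(simple_average(rtts_ms))
print(exponential_average(rtts_ms, a=0.5))
```

   Because SRTT(0) = 0, the first few smoothed values are biased
   toward zero, and more strongly so when a is close to 1. An
   implementation may prefer to start from the first observed sample
   instead.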

2.2.2. Exponential RTO Backoff

   To avoid sustained congestion, a TCP source should increase its
   retransmission timeout (RTO) each time the same packet is
   retransmitted. This is called a backoff process. A simple way to do
   this is to multiply the RTO by a constant value greater than one
   each time the same packet is retransmitted.
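
   As an illustration, the backoff process can be sketched in Python
   as follows. The function name, the default factor of 2 and the
   optional upper bound max_rto are assumptions of this sketch.

```python
def backoff_schedule(initial_rto, factor=2.0, retries=4, max_rto=None):
    """Return the RTO used for each successive transmission of one packet."""
    if factor <= 1.0:
        raise ValueError("factor must be greater than 1")
    rto = initial_rto if max_rto is None else min(initial_rto, max_rto)
    schedule = []
    for _ in range(retries):
        schedule.append(rto)
        rto *= factor                    # multiply by a constant each time
        if max_rto is not None:
            rto = min(rto, max_rto)      # optional cap on the timeout
    return schedule


print(backoff_schedule(1.0, factor=2.0, retries=4))
```

   With a factor of 2 the timeout doubles on every retransmission of
   the same packet, which is why the process is called exponential.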
   
3. OBS Retransmission

   The OBS layer acts as a data-link layer located between the optical
   layer and the network layer (IP layer). Its major duties are to
   provide reliable and fast bit pipes for its client layer (the
   network layer) and to make effective use of the huge bandwidth of
   its server layer.
   
3.1. Motivation

   In an OBS network, retransmission could be left entirely to the
   upper TCP/IP layer, but this is inefficient and may result in long
   delays. It is more convenient and efficient to carry out part of
   the retransmission in the OBS layer.

   First, a burst is much larger than an IP packet: a burst may be as
   large as 10 to 100 MB, while a typical IP packet is some 10 KB. That
   is to say, a burst may contain thousands of IP packets. If a burst
   is lost and the retransmission is done by the TCP layer, thousands
   of signalling messages have to be exchanged to retransmit all the
   packets in the lost burst. By contrast, if the burst is
   retransmitted in the OBS layer, only one burst needs to be
   retransmitted.

   Second, more information about the bursts and the network status is
   available in the OBS layer. In an OBS network, the edge nodes
   assemble data bursts, create the Control Packets (CPs), and choose
   the routes for the bursts. If a burst is lost, the edge node can
   choose another wavelength or route according to the information
   returned from the node where the burst was lost.
     
3.2. OBS Retransmission Parameters

   The OBS retransmission policy is similar to the TCP individual
   retransmission policy in that each burst has its own timer. The
   difference is that the OBS layer uses negative acknowledgements
   (failures). After a burst is transmitted, a copy of the burst is
   buffered at the source edge node and a timer is set. If the source
   node has not received a failure when the timer expires, the copy is
   removed from the buffer. If a failure arrives before the timer
   expires, the source node retransmits the burst.

   Two parameters are most important when retransmission is
   implemented in the OBS layer: delay and buffer size.
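
   The negative-acknowledgement behaviour described above can be
   sketched in Python as follows. The class EdgeRetransmitBuffer, its
   method names, the string return values and the max_retries limit
   are hypothetical and serve only to illustrate the idea.

```python
class EdgeRetransmitBuffer:
    """Hypothetical buffer of burst copies kept at the ingress edge node."""

    def __init__(self, timer, max_retries=1):
        self.timer = timer                # seconds a copy is kept
        self.max_retries = max_retries    # retransmissions allowed per burst
        self.copies = {}                  # burst id -> [expiry time, retries]

    def on_send(self, burst_id, now):
        """Buffer a copy of a newly transmitted burst and start its timer."""
        self.copies[burst_id] = [now + self.timer, 0]

    def on_failure(self, burst_id, now):
        """Handle a failure (negative acknowledgement) for a burst.

        Returns "retransmit" if the burst should be sent again, or
        "give-up" if no valid copy is left or the retry limit is reached.
        """
        entry = self.copies.get(burst_id)
        if entry is None or now > entry[0] or entry[1] >= self.max_retries:
            self.copies.pop(burst_id, None)
            return "give-up"
        entry[0] = now + self.timer       # restart the timer for the retry
        entry[1] += 1
        return "retransmit"

    def expire(self, now):
        """Drop copies whose timer ran out without a failure report."""
        done = [b for b, (expiry, _) in self.copies.items() if now >= expiry]
        for burst_id in done:
            del self.copies[burst_id]
        return done
```

   Unlike TCP, silence is treated as success here: a copy is released
   when its timer expires without a failure having been received.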

3.2.1. Delay

   Retransmission introduces additional delay in both the TCP and the
   OBS layer. For some real-time traffic, this extra delay is not
   acceptable, so not every lost burst needs to be retransmitted. In
   the OBS layer, the total delay includes the burst assembly delay,
   the offset time and the propagation delay.

3.2.2. Edge Node Buffer

   As bursts in the OBS layer are much larger than IP packets, the
   buffer needed at an OBS edge node is also much larger. For example,
   a typical 10 Gbps link with the timer set to 50 ms needs up to
   500 Mb (10 Gbps x 50 ms) of buffer capacity. This large buffer
   requirement limits the number of retransmissions in the OBS
   network. Retransmitting a lost burst only once in the OBS layer may
   be a good choice.
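
   The buffer figure above is the product of the link rate and the
   timer value. A small Python sketch of the arithmetic follows; the
   function name is an assumption, and the calculation assumes that
   the link is fully loaded and that every copy is held for the whole
   timer period.

```python
def buffer_bits(link_rate_bps: int, timer_ms: int) -> int:
    """Bits buffered while every burst copy waits for its timer to expire."""
    return link_rate_bps * timer_ms // 1000


bits = buffer_bits(10 * 10**9, 50)    # 10 Gbps link, 50 ms timer
print(bits)                           # 500000000 bits, i.e. 500 Mb
print(bits // 8)                      # 62500000 bytes
```

   Each additional retransmission that restarts the timer keeps the
   copy in the buffer for another timer period, so the requirement
   grows with the number of retransmissions allowed.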
 
3.3. OBS Retransmission Schemes

   OBS has its own retransmission schemes. Besides simple
   retransmission, which is almost the same as TCP retransmission,
   there are two schemes that do not exist in TCP: Alternate
   Wavelength Retransmission and Alternate Routing Retransmission.

3.3.1. Simple Retransmission

   Each transmitted burst is buffered at the edge node and a timer is
   set. If a failure is received before the timer expires, the edge
   node simply retransmits the burst on the same wavelength and the
   same route. Because this is similar to the individual
   retransmission policy in the TCP layer, it is called Simple
   Retransmission.
   

 
3.3.2. Alternate Wavelength Retransmission
          
   OBS is applied in Wavelength Division Multiplexing (WDM) networks,
   in which one optical link carries many wavelength channels. Assume
   further that no wavelength conversion is used (which is true of
   most WDM networks built today). In this case, different wavelengths
   can be viewed as independent channels, and a lost burst can be
   retransmitted on another wavelength channel. This is called
   Alternate Wavelength Retransmission.

3.3.3. Alternate Routing Retransmission

   In an OBS network, the route is decided at the edge node, while in
   an IP network packets are routed locally, hop by hop. The loss of a
   burst suggests that its path may be heavily loaded, so an alternate
   route can be chosen for the retransmitted burst. This is called
   Alternate Routing Retransmission. By choosing the alternate route
   carefully, the load on the network can be balanced.
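
   The choice among the three schemes can be sketched in Python as
   follows. This draft does not define an order of preference; the
   order used here (alternate wavelength, then alternate route, then
   simple retransmission), the function name and the argument names
   are all assumptions made for illustration.

```python
def plan_retransmission(failed_wavelength, failed_route,
                        free_wavelengths, candidate_routes):
    """Pick (scheme, wavelength, route) for a lost burst.

    failed_wavelength -- wavelength on which the burst was lost
    failed_route      -- route (tuple of node names) of the lost burst
    free_wavelengths  -- wavelengths the edge node may use right now
    candidate_routes  -- routes known to the edge node
    """
    for wavelength in free_wavelengths:
        if wavelength != failed_wavelength:
            return "alternate-wavelength", wavelength, failed_route
    for route in candidate_routes:
        if route != failed_route:
            return "alternate-routing", failed_wavelength, route
    return "simple", failed_wavelength, failed_route


route = ("A", "B", "C")
print(plan_retransmission(1, route, [1, 2], [route]))
```

   A real edge node would also take into account the failure
   information returned by the node where the burst was lost, for
   example to avoid routes that pass through that node.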
   
4. Multi-layer Retransmission

   Retransmission in the TCP layer has been studied for several
   decades and is widely used in the Internet. It is considered useful
   and robust, although it is far from perfect: it is inefficient and
   may result in large delays.

   OBS was proposed only a few years ago and has not yet been used in
   real networks. Little work has been done on retransmission in the
   OBS layer. In addition, because the number of retransmissions in
   the OBS layer is limited, the OBS layer alone cannot guarantee
   reliable end-to-end transmission (no burst loss).

   Multi-layer retransmission can solve the problems that arise when
   retransmission is applied in a single layer: the TCP and OBS layers
   can work together on retransmission.

4.1. Coordination of TCP and OBS Retransmission
   
   OBS retransmission is more efficient and responds more quickly than
   TCP retransmission. Therefore, every time a burst is lost, the OBS
   layer should carry out the retransmission first. During this
   period, TCP retransmission should be restrained. For example, the
   OBS layer can send a message to the TCP layer to stop the
   retransmission timer.

   Suppose that bursts are retransmitted only once in the OBS layer.
   If a retransmitted burst is lost again, the OBS layer does not
   retransmit it any more but reports the failure to the TCP layer,
   and the TCP retransmission timer then continues to run.
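
   The coordination described above can be sketched in Python as
   follows. The TcpTimer class and the handle_burst_loss function are
   hypothetical; in particular, the interface by which the OBS layer
   pauses and resumes a TCP timer is an assumption of this sketch and
   is not specified by this draft. Resuming the timer after a
   successful OBS retransmission is likewise an assumption.

```python
class TcpTimer:
    """Stand-in for a TCP retransmission timer that can be paused."""

    def __init__(self):
        self.running = True

    def pause(self):
        self.running = False

    def resume(self):
        self.running = True


def handle_burst_loss(tcp_timer, retransmit_burst, max_obs_retries=1):
    """Coordinate OBS and TCP retransmission for one lost burst.

    retransmit_burst -- callable returning True if the retransmitted
                        burst reached the egress edge node
    Returns "recovered-by-obs" or "handed-to-tcp".
    """
    tcp_timer.pause()                     # restrain TCP while OBS retries
    try:
        for _ in range(max_obs_retries):
            if retransmit_burst():
                return "recovered-by-obs"
        return "handed-to-tcp"            # failure is reported to TCP
    finally:
        tcp_timer.resume()                # the TCP timer continues to run


timer = TcpTimer()
print(handle_burst_loss(timer, lambda: False))
```

   With max_obs_retries set to 1, the sketch follows the case in
   which a burst is retransmitted only once in the OBS layer before
   recovery is handed back to TCP.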

   Overall, multi-layer retransmission by TCP and OBS is expected to
   perform better than single-layer retransmission.


5. Acknowledgements

   This research is funded by the National High Technology Research and
   Development Program of China (863 Program).

   The authors are grateful to other colleagues for their work and 
   useful suggestions.
   
6. References

   [IPOWDM]       S. Dixit, "IP over WDM: Building the Next-Generation
                  Optical Internet", Wiley-Interscience, 2002.
   [Optical]      R. Ramaswami and K. N. Sivarajan, "Optical Networks",
                  Morgan Kaufmann Publishers, 2004.
   [Comm]         W. Stallings, "Data and Computer Communications",
                  Sixth Edition, Pearson Education, 2000.
   [RFC 793]      J. Postel, "Transmission Control Protocol", RFC 793,
                  September 1981.
   [RFC 2119]     S. Bradner, "Key words for use in RFCs to Indicate
                  Requirement Levels", BCP 14, RFC 2119, March 1997.
                  
7. AUTHORS' ADDRESSES
   
   Ping Zhang
   National Laboratory on Local Fiber-Optic Communication Network 
   & Advanced Optical Communication System, Peking University, 100871
   P.R. China
   Email: zhangping@pku.edu.cn

   Jia Jia Liao
   National Laboratory on Local Fiber-Optic Communication Network 
   & Advanced Optical Communication System, Peking University, 100871
   P.R. China
   Email: jjliao@ele.pku.edu.cn
      
   Zheng Bin Li
   National Laboratory on Local Fiber-Optic Communication Network 
   & Advanced Optical Communication System, Peking University, 100871
   P.R. China
   Email: lizhengbin@pku.edu.cn
   
   An Shi Xu
   National Laboratory on Local Fiber-Optic Communication Network 
   & Advanced Optical Communication System, Peking University, 100871
   P.R. China
   Email: lyrxas@pku.edu.cn
   
8. IPR NOTICE

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights. Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.
   
   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

9. FULL COPYRIGHT STATEMENT

   Copyright (C) The Internet Society (2006). This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph are
   included on all such copies and derivative works. However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.






