Internet Engineering Task Force                               Ping Zhang
Internet Draft                                              Jia Jia Liao
Expires: December 2006                                      Zheng Bin Li
                                                               An Shi Xu
                  National Laboratory on Local Fiber-Optic Communication
                         Network & Advanced Optical Communication System
                                                Peking University, China
                                                               June 2006

           Retransmission in Optical Burst Switching Network
                      draft-zhang-ccamp-obs-00.txt

Status of this Memo

By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware have
been or will be disclosed, and any of which he or she becomes aware
will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task
Force (IETF), its areas, and its working groups. Note that other groups
may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Copyright Notice

Copyright (C) The Internet Society (2006). All Rights Reserved.

Abstract

Reliability is one of the essential requirements of a data transmission
network. In a traditional IP/TCP network, TCP provides end-to-end
reliable transmission. Because the burst loss probability in an Optical
Burst Switching (OBS) network is high, leaving retransmission entirely
to the upper IP/TCP layer may overload the network. Besides the simple
retransmission used in IP/TCP networks, an OBS network, which is built
on a Wavelength Division Multiplexing (WDM) system, can use a new
retransmission scheme: alternate wavelength retransmission.
With the network status information collected at the edge nodes,
retransmission in the OBS network is more efficient and its delay is
shorter. TCP and OBS retransmission can also work together; this is
called multi-layer retransmission.

Zhang,Liao,Li,Xu                                                [Page 1]
------------------------------------------------------------------------
Internet Draft        draft-zhang-ccamp-obs-00.txt             June 2006

Conventions

The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC 2119].

Table of Contents

1. Introduction
   1.1. OBS Network Architecture
   1.2. Multi-layer Interoperation
2. TCP Retransmission
   2.1. Retransmission Policy
      2.1.1. First-only
      2.1.2. Batch
      2.1.3. Individual
   2.2. Retransmission Timer
      2.2.1. Round-Trip Time
      2.2.2. Exponential RTO Backoff
3. OBS Retransmission
   3.1. Motivation
   3.2. OBS Retransmission Parameters
      3.2.1. Delay
      3.2.2. Edge Node Buffer
   3.3. OBS Retransmission Schemes
      3.3.1. Simple Retransmission
      3.3.2. Alternate Wavelength Retransmission
      3.3.3. Alternate Routing Retransmission
4. Multi-layer Retransmission
   4.1. Coordination of TCP and OBS Retransmission
5. Acknowledgements
6. References
7. AUTHORS' ADDRESSES
8. IPR NOTICE
9. FULL COPYRIGHT STATEMENT

1. Introduction

The basic difference among Optical Circuit Switching (OCS), Optical
Burst Switching (OBS) and Optical Packet Switching (OPS) is the
granularity at which they switch. OCS, which is already widely
deployed, switches at the wavelength, waveband or even fiber level.
However, in most cases individual users can hardly afford a whole
wavelength, so TDM is applied: a wavelength is split into recurring
time slots, giving each channel a fixed percentage of the total
bandwidth. This approach has proved less bandwidth-efficient than OPS,
which switches packets in the optical domain as an IP network does.
However, some technologies essential to OPS, such as optical random
access memory, are still far from mature. OBS, intended to bridge these
two mechanisms, switches without buffering at the sub-wavelength level.
With unidirectional reservation and statistical multiplexing of
wavelength resources, OBS brings great flexibility to optical bandwidth
distribution.

1.1. OBS Network Architecture

An OBS network is composed of two sub-planes, namely the data plane and
the control plane, as shown in Figure 1. In the data plane, traffic
from the OBS client layer (e.g.
IP or ATM layer) is aggregated into Data Bursts (DBs) at the ingress
edge nodes, which act as the interface to the upper layer and are
located at the edge of the OBS layer [IPOWDM]. DBs are sent through the
core nodes to their egress edge nodes without optical-electrical-optical
(o-e-o) conversion; a few optical fiber delay lines may be used along
the path to reduce the overall blocking probability. A DB usually
contains tens to thousands of kilobits, including payload and frame
overhead.

[Figure 1: OBS Network Architecture. Traffic from the upper layer
enters an ingress edge node, crosses a mesh of interconnected core
nodes, and leaves through an egress edge node to the upper layer.]

The control plane mainly performs routing and resource reservation,
configuring the optical switching fabric according to signalling. An
offset time ahead of each DB transmission, a Control Packet (CP) is
sent through the core nodes to establish a light path from the ingress
edge node to the egress edge node. At each core node, the CP undergoes
o-e-o conversion and is processed electronically to configure the
switching fabric. If every configuration along the path succeeds, the
DB can cross the network. However, if any core node along the light
path fails to configure its fabric, the DB has to be discarded and the
bandwidth already reserved is released.

1.2. Multi-layer Interoperation

The OBS layer, viewed as a data-link layer, aims to provide a reliable
end-to-end path for data transmission.
As shown in Figure 2, its server layer is the optical layer, with huge
physical bandwidth, and its client layer can be the network layer (e.g.
IP) or another layer such as ATM. With the increase in volume and
importance of IP traffic, applications based on IP have become
dominant. Thus in this draft we consider only IP as the client layer,
for which OBS provides a bit pipe.

|+++++++++++++++++++|          |+++++++++++++++++++|
| Application Layer |<-------->| Application Layer |
|+++++++++++++++++++|          |+++++++++++++++++++|
|     IP Layer      |<-------->|   Network Layer   |
|+++++++++++++++++++|          |+++++++++++++++++++|
|     OBS Layer     |<-------->|  Data-link Layer  |
|+++++++++++++++++++|          |+++++++++++++++++++|
|   Optical Layer   |<-------->|  Physical Layer   |
|+++++++++++++++++++|          |+++++++++++++++++++|

Figure 2: Layered Network

The optical layer, which lies under the OBS layer, handles optical
signal transmission, amplification, multiplexing and demultiplexing.
From the point of view of OBS, the optical layer offers Optical
Channel-Paths (OCh-Ps) that connect distributed OBS nodes. An OCh-P
represents the end-to-end transport of a lightpath across multiple
regenerators in the path [Optical]. OBS nodes are classified as edge
nodes and core nodes. An edge node consists of aggregation queues, a CP
generator, and transmitters or receivers for CPs and DBs. A core node
comprises a CP processor, a switch driver, and a switching fabric.

2. TCP Retransmission

In the IP/TCP layer, TCP provides end-to-end reliable transmission.
When a packet is sent, a copy of it is buffered at the source node. If
the packet is successfully received at the destination node, the copy
is destroyed. Otherwise, the source node automatically retransmits the
packet after a preset time.

2.1. Retransmission Policy

At the source node, every packet that has been sent but not yet
acknowledged by the destination node is kept in a buffering queue. If
the source node does not receive an acknowledgement within a given
time, the corresponding packet is retransmitted. One of three
retransmission strategies may be adopted [Comm]:

2.1.1. First-only

Only one retransmission timer is needed for the entire queue. When an
acknowledgement is received, the acknowledged packet is removed from
the buffering queue and the timer is reset. When the timer expires, the
packet at the front of the queue is retransmitted and the timer is
reset.

2.1.2. Batch

One retransmission timer is also used for the entire queue. When the
source node receives an acknowledgement, it removes the corresponding
packet from the buffering queue and resets the timer. When the timer
expires, all packets in the queue are retransmitted and the timer is
reset.

2.1.3. Individual

One retransmission timer is kept for each packet in the queue. When an
acknowledgement is received, the acknowledged packet is removed from
the buffering queue and the corresponding timer is destroyed. If any
timer expires, the corresponding packet is retransmitted individually
and its timer is reset.

The first-only policy is efficient in terms of traffic generated,
because only lost packets are retransmitted. However, packets in the
queue may experience considerable delay, because the timer for a packet
is not set until it reaches the front of the queue. The individual
policy reduces this delay by keeping one timer per packet. The batch
policy also reduces the delay, at the cost of unnecessary
retransmissions.

2.2. Retransmission Timer

The retransmission timer (also called the retransmission timeout, RTO)
is a critical parameter in TCP congestion control. If the timer is set
too long, the end-to-end transmission delay may be unacceptable. If it
is set too short, unnecessary retransmissions result.

2.2.1. Round-Trip Time

The retransmission timer (RTO) is usually set according to the
estimated round-trip time (RTT). There are two approaches to estimating
the round-trip time: Simple Average and Exponential Average [RFC 793].

Simple Average:

   ARTT(K+1) = (K*ARTT(K) + RTT(K+1)) / (K+1)

where RTT(i) is the round-trip time observed for the i-th transmitted
packet, and ARTT(K) is the average round-trip time for the first K
packets, with ARTT(0) = 0.

Exponential Average:

   SRTT(K+1) = a*SRTT(K) + (1-a)*RTT(K+1)

where SRTT(K) is the smoothed round-trip time for the first K packets,
with SRTT(0) = 0, and a (0