Delay-Tolerant Networking Research Group                     Wenfeng Shi
Internet Draft                                                     Qi Xu
Intended status: Experimental                                 Bohao Feng
Expires: December 16, 2017                                  Huachun Zhou
                                             Beijing Jiaotong University
                                                           June 15, 2017

      A Mechanism Coping with Unexpected Disruption in Space Delay
                           Tolerant Networks
                       draft-shi-dtn-amcud-04.txt

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

This document may contain material from IETF Documents or IETF Contributions published or made publicly available before November 10, 2008. The person(s) controlling the copyright in some of this material may not have granted the IETF Trust the right to allow modifications of such material outside the IETF Standards Process. Without obtaining an adequate license from the person(s) controlling the copyright in such materials, this document may not be modified outside the IETF Standards Process, and derivative works of it may not be created outside the IETF Standards Process, except to format it for publication as an RFC or to translate it into languages other than English.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html

This Internet-Draft will expire on December 16, 2017.

Shi, et al.            Expires December 16, 2017               [Page 1]

Internet-Draft                    amcud                       June 2017

Copyright Notice

Copyright (c) 2016 IETF Trust and the persons identified as the document authors.
All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Abstract

This document proposes a coping mechanism for the unpredictable disruption problem and the congestion control problem in space Delay Tolerant Networks (DTN) [RFC4838]. Because the Licklider Transmission Protocol (LTP) [RFC5326] provides retransmission-based reliability for bundles, several consecutive retransmissions can be taken as a sign that a link has failed. The proposed mechanism redirects subsequent bundles to other nodes as soon as the selected path is detected as disrupted or congested, and probes the availability of links that have been disrupted unexpectedly.

Table of Contents

   1. Introduction
   2. Conventions used in this document
   3. The coping mechanism
   4. Security Considerations
   5. IANA Considerations
   6. References

1. Introduction

Since the trajectories of nodes in a space network are scheduled, the contact information between any pair of nodes can be known in advance. Consequently, routing algorithms such as Contact Graph Routing (CGR) [I-D.burleigh-dtnrg-cgr] can calculate a delivery path from the source to the destination hop by hop, based on the connectivity relationship, propagation delay, data rate, etc.

However, due to the complexity of the space network, satellites and their associated links frequently suffer from electromagnetic interference, which may cause unpredictable disruption for a period of time. Subsequent bundles that the source sends using the initial contact information then cannot be transmitted successfully, and retransmissions occur. As a result, the timeliness of bundles cannot be guaranteed, and the limited resources of nodes and links are wasted. A mechanism for handling unexpected disruption is therefore needed.

Moreover, when the direct path to the destination is unreachable, data is stored at intermediate nodes, which consumes their storage resources. When the remaining storage space of a contact's end node is less than the contact capacity, the risk of network congestion increases. Upstream nodes, however, have no way to learn of the congestion, so routes calculated by source nodes may not be the best choice. A scheme is therefore needed to report the congestion status to upstream nodes.

This draft proposes a coping mechanism to deal with the unexpected contact disruption problem and the network congestion problem. The disruption coping mechanism works with the Licklider Transmission Protocol (LTP) [RFC5326] and with routing algorithms such as Contact Graph Routing (CGR). It not only redirects subsequent bundles to other nodes when a disruption occurs but also probes the availability of the disrupted link during its claimed valid time. The congestion control mechanism consists of a contact congestion forecasting scheme and a congestion-aware data forwarding scheme. Contacts are divided into congestion levels according to the nodes' storage resources.
Data of different priorities is then forwarded according to the congestion level.

2. Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

3. The coping mechanism

Since LTP provides retransmission-based reliability for bundles, several consecutive retransmissions can be taken as a sign that a link has failed. Suppose CGR is used as the routing algorithm. Once retransmission has been detected more than two times, the contact used in CGR is regarded as temporarily unavailable. The node then marks this contact as temporarily disrupted and recalculates the route for subsequent bundles. In addition, a disruption advertisement for the unavailable contact is sent to upstream nodes. On receiving the advertisement, the related nodes create disrupting contacts to prevent use of the disrupted links that the advertisement indicates.

However, the advertisement is useless to nodes whose related contacts do not become available until after the advertisement has expired. A disruption advertisement group is therefore defined to keep the contact disruption advertisement effective. The group contains the nodes named in contacts whose "from time" is earlier than the disrupting contact's "to time".

After T seconds, the node sends a probing message to the destination named in the disrupted contact to check whether connectivity has been recovered. Because the disruption may be caused by damage to the satellite, a fixed, short detection duration may consume more energy.
Thus, it is necessary to set the detection duration dynamically. If the corresponding response message is received, the contact is marked as recovered and can be used for subsequent bundles, and a contact recovery advertisement is sent to the nodes in the advertisement group. Otherwise, the node sends a probe message again 2T seconds later. If the response still has not been received, the node sends the next probe message 3T seconds later. A maximum detection duration should also be set to guarantee detection accuracy. In this way, the node probes the disrupted link periodically until the contact recovers or expires.

In the space network, the communication start time, end time and transmission rate between two nodes are known in advance and are configured into the contact plan in CGR. It is therefore convenient to compute the residual capacity of a contact. When the monitoring node detects that its remaining storage capacity is less than the residual capacity of a contact whose end node is the monitoring node, it computes the congestion level of that contact. If the remaining storage capacity of the monitoring node is less than thirty percent of the residual capacity of the contact, the contact is marked as mildly congested. If it is less than ten percent, the contact is marked as severely congested. If the storage capacity of the node is exhausted, the contact is marked as completely congested. When the congestion level changes, the monitoring node records the new level of the contact in the contact plan and sends a contact congestion advertisement to other nodes. As soon as another node receives the congestion advertisement, it updates the congestion level of the corresponding contact accordingly.
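
The growing probe interval and the congestion thresholds described above can be sketched as follows. This is a minimal illustration in Python, not part of the mechanism's specification. The names CongestionLevel, residual_capacity, classify_contact and probe_intervals are hypothetical. Residual capacity is assumed to be the transmission rate multiplied by the contact time still remaining, and the probe interval is assumed to grow as T, 2T, 3T and so on until it reaches the maximum detection duration.

```python
from enum import IntEnum
from typing import List


class CongestionLevel(IntEnum):
    NONE = 0
    MILD = 1
    SEVERE = 2
    COMPLETE = 3


def residual_capacity(start: float, end: float, rate: float, now: float) -> float:
    """Volume the contact can still carry from `now` on.

    Assumption: rate * remaining contact time; bundles already queued
    for the contact are not subtracted in this sketch.
    """
    remaining_time = max(0.0, end - max(start, now))
    return rate * remaining_time


def classify_contact(remaining_storage: float, residual: float) -> CongestionLevel:
    """Congestion level of a contact whose end node is the monitoring node.

    Both arguments must be in the same unit (for example, bytes).
    """
    if remaining_storage <= 0:
        return CongestionLevel.COMPLETE      # storage exhausted
    if remaining_storage < 0.10 * residual:
        return CongestionLevel.SEVERE        # below ten percent
    if remaining_storage < 0.30 * residual:
        return CongestionLevel.MILD          # below thirty percent
    return CongestionLevel.NONE


def probe_intervals(t: float, max_interval: float, count: int) -> List[float]:
    """Waiting time before each of the first `count` probes.

    Assumption: the k-th wait is k*T, capped at the maximum detection duration.
    """
    return [min(k * t, max_interval) for k in range(1, count + 1)]
```

For example, with T = 10 seconds and a maximum detection duration of 30 seconds, probe_intervals(10, 30, 5) gives waits of 10, 20, 30, 30 and 30 seconds.
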
When calculating routes, a node computes the path congestion level as the highest congestion level among the contacts in the path and forwards bundles of different priorities according to that level. If there is no congestion on the path, bundles of all priorities can be forwarded along it. If the congestion level is mild, only urgent and standard bundles can be forwarded. If the congestion level is severe, only urgent bundles can be forwarded. If the path is completely congested, all bundles should be forwarded over a suboptimal path. In this way, data is not dropped when the network suffers from congestion, and transmission opportunities are left to high-priority bundles.

                   +----------+
                   |Satellite2|
                   +----------+
                   /    |     \
                  /     |      \
                 /      |       \
                /       |        \
    +----------+        |        +----------+      +----------+
    |Satellite1|        |        |Satellite4|------|Satellite5|
    +----------+        |        +----------+      +----------+
                \       |        /
                 \      |       /
                  \     |      /
                   \    |     /
                   +----------+
                   |Satellite3|
                   +----------+

   Fig. 1 Example of unexpected contact disruption and congestion control.

An example is given to explain the contact disruption handling mechanism. Assume that the contact between Satellite1 and Satellite2 is available from 1s to 300s, the contact between Satellite1 and Satellite3 from 100s to 300s, the contact between Satellite3 and Satellite4 from 100s to 300s, the contact between Satellite2 and Satellite4 from 1s to 300s, the contact between Satellite2 and Satellite3 from 1s to 300s, and the contact between Satellite4 and Satellite5 from 400s to 500s. Satellite1 can use either Satellite2 or Satellite3 as a relay to send bundles to Satellite5. Initially, Satellite2 is selected. Suppose that at some point the link from Satellite2 to Satellite4 is disrupted.
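
For the contact plan just listed, the disruption advertisement group defined in Section 3 can be worked out mechanically. The sketch below is one reading of that definition, under stated assumptions: every contact in the plan is treated as a "corresponding contact", the advertising node itself is left out of the group, and the names Contact, PLAN and advertisement_group are hypothetical.

```python
from typing import List, NamedTuple, Set


class Contact(NamedTuple):
    a: str            # one end of the contact
    b: str            # the other end
    from_time: int    # seconds
    to_time: int      # seconds


# Contact plan of Fig. 1, taken from the example text.
PLAN = [
    Contact("Satellite1", "Satellite2", 1, 300),
    Contact("Satellite1", "Satellite3", 100, 300),
    Contact("Satellite3", "Satellite4", 100, 300),
    Contact("Satellite2", "Satellite4", 1, 300),
    Contact("Satellite2", "Satellite3", 1, 300),
    Contact("Satellite4", "Satellite5", 400, 500),
]


def advertisement_group(plan: List[Contact], disrupted: Contact,
                        advertiser: str) -> Set[str]:
    """Nodes named in contacts that start before the disrupted contact ends.

    Assumption: the node sending the advertisement is excluded.
    """
    group: Set[str] = set()
    for contact in plan:
        if contact.from_time < disrupted.to_time:
            group.update((contact.a, contact.b))
    group.discard(advertiser)
    return group


disrupted = PLAN[3]  # Satellite2 -> Satellite4, valid from 1s to 300s
group = advertisement_group(PLAN, disrupted, "Satellite2")
print(sorted(group))  # ['Satellite1', 'Satellite3', 'Satellite4']
```

Satellite5 is left out because its only contact starts at 400s, after the disrupted contact's "to time" of 300s.
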
When Satellite2 detects the retransmission of bundles two times, it marks the contact to Satellite4 as "temporarily disrupted" and recalculates routes for the subsequent bundles. Those bundles will thus be sent to Satellite3 and then on to Satellite4 and Satellite5. In addition, Satellite2 computes the disruption advertisement group, which contains Satellite1, Satellite3 and Satellite4. When Satellite1 receives the advertisement, it marks the contact from Satellite2 to Satellite4 as "disrupted" and uses Satellite3 as the relay. Meanwhile, Satellite2 sends probe messages to Satellite4 periodically to check whether the link has recovered. If Satellite2 receives a response, it marks the contact as "recovered" and sends a contact recovery advertisement to the satellites in the advertisement group. If Satellite2 does not receive a response after sending a probing message, it resends the probing message after T seconds. If Satellite2 still has not received a response after 2T seconds, it resends the probing message after 3T seconds. Assume that the maximum detection duration is set to 3T. If Satellite2 still has not received a response after 3T, it resends the probing message every 3T seconds until the disrupted contact recovers or expires.

Another example is given to explain the congestion control scheme. Assume that the storage capacity of Satellite2 in Fig. 1 is 100 Mbytes and that the storage capacity of each of the other satellites is 200 Mbytes. Assume that Satellite1 sends one bulk bundle, one standard bundle and one urgent bundle to Satellite5 every second. We also assume that the transmission rate is 200 kbytes/s and the bundle size is 50 kbytes. Initially, Satellite2 is selected as the relay. Since the contact time between Satellite2 and Satellite4 is 100s, bundles will be stored at Satellite2 before the contact starts. At the start of the transmission, there is no congestion.
As more data is stored at Satellite2, its remaining storage decreases. When the remaining storage is less than thirty percent of the residual capacity of the contact between Satellite1 and Satellite2, Satellite2 finds that this contact is mildly congested and sends a congestion advertisement to Satellite1. After Satellite1 receives the advertisement, it marks the contact between Satellite1 and Satellite2 as mildly congested and uses Satellite3 as the relay for bulk bundles. Standard and urgent bundles are still forwarded using Satellite2 as the relay. When Satellite2 detects that the contact between Satellite1 and Satellite2 is severely congested, it sends a congestion advertisement to Satellite1. After Satellite1 updates the congestion level, it forwards bulk and standard bundles using Satellite3 as the relay. When the storage capacity of Satellite2 is exhausted, the contact between Satellite1 and Satellite2 is completely congested, and Satellite2 sends a congestion advertisement to Satellite1. After Satellite1 receives the advertisement and updates its contact plan, it uses Satellite3 as the relay for all bundles.

4. Security Considerations

To be done.

5. IANA Considerations

To be done.

6. References

[RFC4838] Cerf, V., Burleigh, S., Hooke, A., Torgerson, L., et al., "Delay-Tolerant Networking Architecture", RFC 4838, 2007.

[RFC5326] Ramadas, M., Burleigh, S., and S. Farrell, "Licklider Transmission Protocol - Specification", RFC 5326, IRTF DTN Research Group, 2008.

[RFC5050] Scott, K. and S. Burleigh, "Bundle Protocol Specification", RFC 5050, 2007.

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[I-D.burleigh-dtnrg-cgr] Burleigh, S., "Contact Graph Routing", draft-burleigh-dtnrg-cgr-01, July 2010.

Authors' Addresses

Wenfeng Shi
Beijing Jiaotong University
Beijing, 100044, P.R. China

Email: 14111038@bjtu.edu.cn

Qi Xu
Beijing Jiaotong University
Beijing, 100044, P.R. China

Email: 15111046@bjtu.edu.cn

Bohao Feng
Beijing Jiaotong University
Beijing, 100044, P.R. China

Email: 11111021@bjtu.edu.cn

Huachun Zhou
Beijing Jiaotong University
Beijing, 100044, P.R. China

Email: hchzhou@bjtu.edu.cn