Internet Engineering Task Force                      K. Subramaniam, Ed.
Internet-Draft                                                  D. Loher
Intended status: Informational                                 Microsoft
Expires: January 8, 2016                                    July 7, 2015

Router Buffer Sizes In The WAN
draft-ksubram-lmap-router-buffer-sizes-01

Abstract

This draft identifies the set of data that needs to be collected and analyzed to quantify the buffer sizes used in routers in the Wide Area Network (WAN). The scope of this draft is limited to WAN links with latencies of 40 to 150 milliseconds. Reducing router buffer sizes has many advantages, the most important being lower cost. However, little data is available today from which to calculate an appropriate size. This draft details use cases for the study and lists the data that must be taken into consideration to quantify the size of router buffers. The details of the individual measurement metrics are beyond the scope of this document, and the draft does not identify methods to gather the data. What it identifies is the need to collect and report this empirical data in a readable fashion, so that data can be studied and compared in a more standardized way.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on January 8, 2016.
Subramaniam & Loher Expires January 8, 2016 [Page 1]
Internet-Draft Router Buffer Sizes in the WAN July 2015

Copyright Notice

Copyright (c) 2015 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

This document may not be modified, and derivative works of it may not be created, except to format it for publication as an RFC or to translate it into languages other than English.

Table of Contents

1.  Introduction
2.  Terminology
3.  Use Case
  3.1.  Discards with small buffer sizes
  3.2.  Discards with large buffer sizes
4.  List of required data for study of router buffer sizes
  4.1.  Number of concurrent flows, N
  4.2.  Length of a flow, L
  4.3.  Packet Discards, D
  4.4.  Reason for Packet Discards, R
  4.5.  Resolution of time interval, T
  4.6.  5 Tuple Flow Identity, I
5.  XML Representation of an Information Model for Calculating Router Buffer Sizes
6.  Conclusion
7.  Acknowledgements
8.  IANA Considerations
9.  Security Considerations
10. References
  10.1.  Normative References
  10.2.  Informative References
Appendix A.  Additional Stuff
Authors' Addresses

1. Introduction

"How much buffering do core links need?" is a question that has been under study for a while. The question boils down to quantifying the buffer size that still achieves 100% link utilization with maximum throughput at a feasible cost.

Buffer design can substantially increase costs. While over-buffering seems intuitive, it can complicate the design of high-speed routers and lead to higher power consumption, more board space, and lower density. It can also increase end-to-end delay in the presence of congestion, which can make congestion more persistent. Additionally, there is always a tradeoff between buffer sizes and the capacity of a router.

On the other hand, under-buffering, while avoiding the above drawbacks of over-buffering, could lead us away from our primary goal of 100 percent link utilization. This could happen with simple Additive Increase Multiplicative Decrease (AIMD) TCP flows when the sender has packets to send but the advertised window size is too small, so that the receiver consumes far less than it could.

The rule of thumb for router buffers has been defined as [Villamizar]: B = RTT * C, where B is the buffer size, RTT is the Round Trip Time, and C is the capacity of the bottleneck link. [RFC3429] also talks about the buffer size being at least one TCP window size.
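As a rough numerical illustration of the rule of thumb above, the following Python sketch evaluates B = RTT * C at the two ends of the 40 to 150 ms latency range this draft covers. The 10 Gb/s link capacity, the treatment of the quoted link latency as the RTT, and the function name are assumptions made only for this example; they are not taken from the draft.

```python
def rule_of_thumb_buffer_bytes(rtt_seconds: float, capacity_bps: float) -> float:
    """Return the rule-of-thumb buffer size B = RTT * C, in bytes.

    rtt_seconds  -- round trip time in seconds
    capacity_bps -- bottleneck link capacity in bits per second
    """
    # RTT * C is in bits; divide by 8 to express the buffer in bytes.
    return rtt_seconds * capacity_bps / 8.0


if __name__ == "__main__":
    capacity_bps = 10e9  # assumed 10 Gb/s bottleneck link (illustrative only)
    for rtt_ms in (40, 150):  # ends of the latency range in the draft's scope
        size = rule_of_thumb_buffer_bytes(rtt_ms / 1000.0, capacity_bps)
        print(f"RTT {rtt_ms} ms: {size / 1e6:.1f} MB")
```

Under these assumptions the rule of thumb asks for roughly 50 MB per port at 40 ms and 187.5 MB per port at 150 ms, which gives a sense of why buffer sizing has a direct effect on router cost.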
However, later studies [Appenzeller] show that the rule of thumb holds only for a single flow or for a large number of perfectly synchronized flows. They further postulate that the required buffer size is actually (RTT * C)/sqrt(n), where n is the number of flows. This indicates a significant reduction in buffer memory, and therefore lower cost.

As seen, there have been proponents of both large and small buffers. However, most of these studies are based on theoretical models and simulations. Today, there is no model or protocol to mine big data from a provider's network to answer this question efficiently. The nature of WAN traffic can be uncertain and varying. Furthermore, traffic can vary greatly between individual ISPs. This document argues for a model of mining empirical big data in a provider's network, in order to build a network that drives down the $/GB while maximizing link utilization.

This document outlines use cases for the study of router buffer sizes in the WAN and identifies the data that needs to be collected and analyzed. The study could be extended to the edge and to datacenters, but that is outside the scope of this draft.

2. Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

3. Use Case

From an operator's perspective, it is imperative to monitor discards and link utilization over WAN links in order to study router buffer sizes. But these alone do not give an operator enough information about why the discards happened. The two use cases outlined here argue that more data needs to be collected, reported, and analyzed.

3.1. Discards with small buffer sizes

Trans-Pacific and trans-Atlantic links, with latencies in the range of 150 ms and 90 ms respectively, low link utilization of 30 percent, and small buffers, have seen dropped packets. The most intuitive response to packet discards has been to increase the buffer sizes for these links. While this might alleviate the issue temporarily, unless the right problem has been identified it could readily lead to bufferbloat, which has many issues of its own.

3.2. Discards with large buffer sizes

Operators have also observed dropped packets on WAN links within North America with buffers as large as 125 MB per port and link utilization of 60%. If this happens even though the router has not been specifically configured to drop certain types of packets and there are no routing misconfigurations, then clearly the issue is not the size of the router buffer.

4. List of required data for study of router buffer sizes

This section describes the absolute minimum set of data that needs to be collected to effectively quantify router buffer size.

+---+-------------------------+-------------------------------------+
|   | Data                    | Details                             |
+---+-------------------------+-------------------------------------+
| 1 | Number of concurrent    | For aggregate traffic               |
|   | flows, N                |                                     |
| 2 | Length of the flow, L   | [Flow stop time - flow start time]  |
| 3 | Packet Discards, D      | Per interface                       |
| 4 | Reason for Packet       | Buffer overflow, configuration,     |
|   | Discards, R             | etc.                                |
| 5 | Resolution of Time      | Granularity of short-flow lifetime, |
|   | Interval, T             | as low as 1 ms                      |
| 6 | 5 tuple flow identity,  | Src IP, Dest IP, Src port, Dest     |
|   | I                       | Port, Protocol.                     |
+---+-------------------------+-------------------------------------+

Table 1: List of required data for Router Buffer Sizes

A service provider needs to take several attributes into consideration to determine the right buffer size for its WAN routers. This section explains why the six items above have been identified as the minimum essential data needed to aid the study of router buffer sizes.

4.1. Number of concurrent flows, N

Studies [Feldmann] and [Stevens] show that 95% of flows in the Internet today are TCP [Postel] flows. The nature of these flows can vary significantly, not only across time periods but also between providers. Flows that spend most of their time in slow-start require significantly less buffering than flows that live mostly in congestion avoidance. Because of this, it is important to identify the types of concurrent flows that can live on a WAN link. Short (non-persistent) flows are those that live for less than one RTT, and large (persistent) flows are those whose lifetime is longer than one RTT with congestion overhead. Internet measurements [Avra] show that while a small number of large flows account for most of the packets transferred, short flows make up most TCP sessions, and large flows are known to have a larger effect on buffer sizes. This combination of flows could in turn have an effect on Round Trip Time (RTT), loss probability, and flow lengths.

The ability to detect large flows is necessary because, while the flows can be constant in steady state, the aggregate traffic can keep changing due to varying arrival and departure rates.

There needs to be a way for the number of concurrent flows to be collected and analyzed at the granularity of the lifetime of short flows, as low as one millisecond.

4.2. Length of a flow, L

The length of a flow can be defined as its duration, [flow stop time - flow start time], or as the number of packets or bytes sent during that time. Identifying the lengths of flows in a provider's network gives information about the mix of short and large flows present in the WAN. This has modeling implications for TCP flow control.

4.3. Packet Discards, D

The number of packet discards per interface is probably the most important metric. Of these, discards on outward (WAN) facing interfaces are the most relevant to the study of buffer sizes. Interface discard counters are described in [RFC2893].

4.4. Reason for Packet Discards, R

There can be several reasons for packet discards, especially when they are observed on lightly utilized links. Some could be due to routing misconfigurations, or to configuration that deliberately drops certain packets. Clearly stating insufficient buffer as the reason will help narrow down the data required. This is especially true in the case of smart buffer allocation, when some ports run out of buffers but not others. We could observe, say, that a port has been allocated only 30 percent of the total available buffer space but is experiencing the highest utilization and, as a result, is seeing packet drops. This would point to the dynamic buffer allocation scheme not being adaptive and predictive to the nature of the WAN traffic.

4.5. Resolution of time interval, T

The time interval should be granular enough to capture not only the number of concurrent flows in steady state but also the aggregate traffic over the lifetime of a short flow. It should also make it possible to correlate the discards per interface with the number of concurrent flows.

Today, via IPFIX, we can calculate the number of concurrent flows. Via sFlow counter or flow samples, we can calculate the discards. Using counters requires up to two times the configured granularity for any change to become visible, due to the Nyquist rate.
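To make the role of the time resolution T concrete, the following Python sketch counts the number of concurrent flows N in each interval of width T and pairs each count with the discards seen in the same interval. It is a minimal sketch under stated assumptions: flow records are assumed to be already available as (start, stop) pairs in integer milliseconds (for example after export and collection), per-interval discard counts are assumed to be given, and the function names and record layout are hypothetical rather than part of IPFIX, sFlow, or this draft.

```python
from typing import Iterable, List, Tuple

# Hypothetical flow record: (start_ms, stop_ms), integer milliseconds.
FlowRecord = Tuple[int, int]


def concurrent_flows(flows: Iterable[FlowRecord], t0_ms: int, t1_ms: int,
                     resolution_ms: int) -> List[int]:
    """Count flows active in each interval of width resolution_ms in [t0, t1).

    A flow is treated as active over [start, stop); a zero-length flow is
    counted in the interval that contains its start time.
    """
    n_bins = -(-(t1_ms - t0_ms) // resolution_ms)  # ceiling division
    counts = [0] * n_bins
    for start, stop in flows:
        end = max(stop, start + 1)          # give zero-length flows 1 ms
        if end <= t0_ms or start >= t1_ms:  # entirely outside the window
            continue
        first = max(0, (start - t0_ms) // resolution_ms)
        last = min(n_bins - 1, (end - 1 - t0_ms) // resolution_ms)
        for b in range(first, last + 1):
            counts[b] += 1
    return counts


def pair_with_discards(counts: List[int],
                       discards: List[int]) -> List[Tuple[int, int, int]]:
    """Return (interval index, N, D) for intervals that saw discards."""
    return [(i, n, d) for i, (n, d) in enumerate(zip(counts, discards)) if d > 0]


if __name__ == "__main__":
    flows = [(0, 5), (1, 2), (3, 10), (4, 4)]   # made-up sample records
    n_per_bin = concurrent_flows(flows, 0, 10, 2)
    print(n_per_bin)
    print(pair_with_discards(n_per_bin, [0, 0, 7, 0, 1]))
```

With a coarse resolution, short flows such as the second and fourth records would be merged into the same interval as the long ones and their contribution to N at the moment of a discard would be lost, which is the motivation for a fine-grained T.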
Reducing the counter export interval would increase responsiveness, but at the cost of increased overhead and reduced scalability. On the other hand, packet sampling automatically allocates monitoring resources to busy links, providing a highly scalable way to quickly detect traffic flows wherever they occur in the network. Responsiveness is important for more stable control.

4.6. 5 Tuple Flow Identity, I

A 5-tuple consists of source IP, destination IP, source port, destination port, and protocol, and identifies the endpoints of a unidirectional flow. Collecting it gives the network operator a way to identify offending flows, legitimate elephant flows, and high-priority flows, which may occur at certain periods during the day. Being able to separate traffic using the 5-tuple further increases the strength of the sample set of empirical data available for the study of router buffer sizes.

5. XML Representation of an Information Model for Calculating Router Buffer Sizes

The following is an example information model representing the data that needs to be measured in order to make buffer size estimation easier. The methods and algorithms used to determine this data are out of scope for this draft.

6. Conclusion

We see that there are numerous issues at different layers that affect, directly or indirectly, the sizing of router buffers. We also note that no existing study takes empirical data into consideration. Ideally, what would be required is an all-knowing oracle that sees the traffic flow on an end-to-end network across all layers. In the absence of such a resource, the first step in the study of router buffer sizes is to effectively mine a provider's big data repository for the data identified in this draft.
7. Acknowledgements

8. IANA Considerations

This memo includes no request to IANA.

All drafts are required to have an IANA considerations section (see the update of RFC 2434 [I-D.narten-iana-considerations-rfc2434bis] for a guide). If the draft does not require IANA to do anything, the section contains an explicit statement that this is the case (as above). If there are no requirements for IANA, the section will be removed during conversion into an RFC by the RFC Editor.

9. Security Considerations

This document does not introduce new security issues.

10. References

10.1. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

10.2. Informative References

[Appenzeller] G. Appenzeller, I. Keslassy, and N. McKeown, "Sizing Router Buffers", 2004.

[Avra] Konstantin Avrachenkov, INRIA Sophia Antipolis, "Differentiation Between Short and Long TCP Flows: Predictability of the Response Time", 2004.

[Feldmann] A. Feldmann, J. Rexford, and R. Caceres, "Efficient policies for carrying Web traffic over flow-switched networks", Dec. 1998.

[I-D.narten-iana-considerations-rfc2434bis] Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA Considerations Section in RFCs", draft-narten-iana-considerations-rfc2434bis-09 (work in progress), March 2008.

[Postel] J. Postel, "Transmission Control Protocol", RFC 793, Sep. 1981.

[RFC2893] K. McCloghrie and F. Kastenholz, "The Interfaces Group MIB", RFC 2863, Jun. 2000.

[RFC3429] R. Bush and D. Meyer, "Some Internet Architectural Guidelines and Philosophy", RFC 3439, Dec. 2002.

[Stevens] W. R. Stevens, "Transmission Control Protocol", 1994.

[Villamizar] C. Villamizar and C. Song, "High performance TCP in ANSNET", 1994.
Appendix A. Additional Stuff

This becomes an Appendix.

Authors' Addresses

Kamala Subramaniam (editor)
Microsoft
Mountain View, CA 94043
US

Email: kasubra@microsoft.com

Darren Loher
Microsoft
Redmond, WA 98052
US