AAA Working Group                                          Bernard Aboba
INTERNET-DRAFT                                                 Microsoft
Category: Standards Track
18 May 2001

AAA Transport Profile

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

1. Copyright Notice

Copyright (C) The Internet Society (2001). All Rights Reserved.

2. Abstract

This document discusses transport issues that arise with protocols for Authentication, Authorization and Accounting. It also provides recommendations on the use of transport by AAA protocols. This includes usage of standards-track RFCs as well as experimental proposals.

3. Introduction

This document discusses transport issues that arise with protocols for Authentication, Authorization and Accounting. It also provides recommendations on the use of transport by AAA protocols. This includes usage of standards-track RFCs as well as experimental proposals.

3.1. Requirements language

In this document, the key words "MAY", "MUST", "MUST NOT", "optional", "recommended", "SHOULD", and "SHOULD NOT", are to be interpreted as described in [1].

3.2. Terminology

Accounting
   The act of collecting information on resource usage for the purpose of trend analysis, auditing, billing, or cost allocation.

Administrative Domain
   An internet, or a collection of networks, computers, and databases under a common administration. Computer entities operating in a common administration may be assumed to share administratively created security associations.

Attendant
   A node designed to provide the service interface between a client and the local domain.

Authentication
   The act of verifying a claimed identity, in the form of a pre-existing label from a mutually known name space, as the originator of a message (message authentication) or as the end-point of a channel (entity authentication).

Authorization
   The act of determining if a particular right, such as access to some resource, can be granted to the presenter of a particular credential.

Billing
   The act of preparing an invoice.

Broker
   A Broker is an entity that is in a different administrative domain from both the home AAA server and the local ISP, and which provides services, such as facilitating payments between the local ISP and home administrative entities. There are two different types of brokers: proxy and routing.

Client
   A node wishing to obtain service from an attendant within an administrative domain.

End-to-End
   End-to-End is the security model that requires that security information be able to traverse, and be validated, even when an AAA message is processed by intermediate nodes such as proxies, brokers, etc.

Foreign Domain
   An administrative domain, visited by a Mobile IP client, and containing the AAA infrastructure needed to carry out the necessary operations enabling Mobile IP registrations. From the point of view of the foreign agent, the foreign domain is the local domain.

Home Domain
   An administrative domain, containing the network whose prefix matches that of a mobile node's home address, and containing the AAA infrastructure needed to carry out the necessary operations enabling Mobile IP registrations.
   From the point of view of the home agent, the home domain is the local domain.

Hop-by-hop
   Hop-by-hop is the security model that requires that each direct set of peers in a proxy network share a security association, and the security information does not traverse a AAA entity.

Inter-domain Accounting
   Inter-domain accounting is the collection of information on resource usage of an entity within an administrative domain, for use within another administrative domain. In inter-domain accounting, accounting packets and session records will typically cross administrative boundaries.

Intra-domain Accounting
   Intra-domain accounting is the collection of information on resource usage within an administrative domain, for use within that domain. In intra-domain accounting, accounting packets and session records typically do not cross administrative boundaries.

Local Domain
   An administrative domain containing the AAA infrastructure of immediate interest to a Mobile IP client when it is away from home.

Proxy
   A AAA proxy is an entity that acts as both a client and a server. When a request is received from a client, the proxy acts as a AAA server. When the same request needs to be forwarded to another AAA entity, the proxy acts as a AAA client.

Local Proxy
   A Local Proxy is a AAA server that satisfies the definition of a Proxy, and exists within the same administrative domain as the network device (e.g. NAS) that issued the AAA request. Typically, a local proxy will enforce local policies prior to forwarding responses to the network devices, and is generally used to multiplex AAA messages from a large number of network devices.

Network Access Identifier
   The Network Access Identifier (NAI) is the userID submitted by the client during network access authentication. In roaming, the purpose of the NAI is to identify the user as well as to assist in the routing of the authentication request.
   The NAI may not necessarily be the same as the user's e-mail address or the user-ID submitted in an application layer authentication.

Routing Broker
   A Routing Broker is a AAA entity that satisfies the definition of a Broker, but is NOT in the transmission path of AAA messages between the local ISP and the home domain's AAA servers. When a request is received by a Routing Broker, information is returned to the AAA requester that includes the information necessary for it to be able to contact the Home AAA server directly. Certain organizations providing Routing Broker services MAY also act as a Certificate Authority, allowing the Routing Broker to return the certificates necessary for the local ISP and the home AAA servers to communicate securely.

Non-Proxy Broker
   A Routing Broker is occasionally referred to as a Non-Proxy Broker.

Proxy Broker
   A Proxy Broker is a AAA entity that satisfies the definition of a Broker, and acts as a Transparent Proxy by acting as the forwarding agent for all AAA messages between the local ISP and the home domain's AAA servers.

Real-time Accounting
   Real-time accounting involves the processing of information on resource usage within a defined time window. Time constraints are typically imposed in order to limit financial risk.

Roaming Capability
   Roaming capability can be loosely defined as the ability to use any one of multiple Internet service providers (ISPs), while maintaining a formal, customer-vendor relationship with only one. Examples of cases where roaming capability might be required include ISP "confederations" and ISP-provided corporate network access support.

Transparent Proxy
   A Transparent Proxy is a AAA server that satisfies the definition of a Proxy, but does not enforce any local policies (meaning that it does not add, delete or modify attributes or modify information within messages it forwards).

4. Issues in AAA transport usage

Issues that arise in AAA transport usage include:

   Application-driven versus network-driven behavior
   Slow failover
   Use of Nagle Algorithm
   Multiple connections
   Duplicate detection
   Invalidation of transport parameter estimates
   Inability to use fast re-transmit
   Congestion avoidance
   Delayed acknowledgments
   Premature failover
   Head of line blocking
   Connection load balancing

We discuss each of these issues in turn.

4.1. Application-driven versus network-driven behavior

Steady state AAA transport behavior is typically application rather than network driven. For example, a 48-port NAS with an average session time of 20 minutes will on average send only 144 authentication/authorization requests/hour, and an equivalent number of accounting requests. This translates to an average inter-packet spacing of 25 seconds.

Even on much larger NAS devices, the inter-packet spacing is often larger than the Round Trip Time (RTT). For example, a 2048-port NAS with an average session time of 10 minutes will on average send 3.4 authentication/authorization requests/second, and an equivalent number of accounting requests. This translates to an average inter-packet spacing of 293 ms.

Note that transient behavior can result in much lower inter-packet spacing. For example, after a NAS reboot previously stored accounting records may be sent to the accounting server in rapid succession. Similarly, after recovery from a power failure, users may respond with a large number of simultaneous logins. Thus while application-driven AAA transport behavior is the norm, there are situations in which behavior may be network driven.

Note that even with high inter-packet spacings as seen by the NAS, it is possible for AAA clients and servers to experience congestion, even in the absence of any other traffic.
For example, while a given AAA client may not send substantial traffic, many AAA clients may interact with a given AAA proxy or server. Thus routers close to a heavily loaded proxy or server may experience congestion, even though traffic close to the client is very light.

For example, if 10,000 48-port NASes were to use the same AAA proxy or server, that proxy or server would receive 400 authentication/authorization requests/second and an equivalent number of accounting requests. For 1000 octet requests, this could generate as much as 6.4 Mbps of incoming traffic at the AAA proxy or server. While such a transaction rate is within the capabilities of the fastest AAA servers and proxies, implementations exist that cannot handle such a high load, and thus high queuing delays and/or dropped packets may be experienced at the server, even if the routers on the path are not congested.

Thus, a well designed AAA protocol needs to be able to handle congestion occurring at the AAA server, as well as congestion experienced within the network.

4.2. Slow failover

Where TCP [5] is used as the transport, AAA implementations will experience very slow failover times if they wait until a TCP connection times out before resending on another connection. This is not an issue for SCTP [6], which enables adjustment of the failover timer at the transport layer.

4.3. Use of Nagle Algorithm

AAA protocol messages are often smaller than the maximum segment size (MSS). While exceptions occur when certificate-based authentication is used or where a low path MTU is found, typically AAA protocol messages are less than 1000 octets. Therefore, the total packet count, and associated network overhead, can be reduced by combining multiple AAA messages within a single packet. While this does not reduce the work required by the application in parsing packets and responding to the messages, it does reduce the number of packets processed by routers along the path.
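
As a check on the load figures quoted in Section 4.1, the arithmetic can be reproduced with a short sketch. The function name and the assumption that every port is continuously busy (one authentication request per session) are illustrative choices, not part of the draft.

```python
def requests_per_second(ports, session_minutes):
    """Average authentication/authorization requests per second for a NAS,
    assuming (for illustration) every port is always in use and each
    session produces exactly one request."""
    return ports / (session_minutes * 60.0)

small = requests_per_second(48, 20)      # 48-port NAS, 20-minute sessions
large = requests_per_second(2048, 10)    # 2048-port NAS, 10-minute sessions

print(round(small * 3600))               # 144 requests/hour
print(round(1 / small))                  # 25 seconds between requests
print(round(1000 / large))               # 293 ms between requests

# 10,000 48-port NASes sharing one AAA proxy or server.
auth_rate = 10_000 * small               # authentication requests/second
total_rate = 2 * auth_rate               # plus an equal number of accounting requests
mbps = total_rate * 1000 * 8 / 1e6       # 1000-octet requests, in Mbps
print(round(auth_rate), round(mbps, 1))  # 400 6.4
```

The 6.4 Mbps figure therefore counts both the authentication and the accounting requests.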

However, within the application-driven regime, the NAS will typically receive a reply from the home server prior to having another request to send. This implies, for example, that accounting requests will typically be sent individually rather than being batched by the transport layer. As a result, within the application-driven regime, the Nagle algorithm [12] is ineffective.

4.4. Multiple connections

Since the RADIUS [2] Identifier field is a single octet, a maximum of 256 requests can be in progress between two endpoints described by a 5-tuple: (NAS IP address, NAS port, UDP, RADIUS server IP address, RADIUS server port). In order to get around this limitation, RADIUS clients have utilized more than one sending port, sometimes even going to the extreme of using a different sending port for each NAS port.

Were this behavior to be extended to AAA protocols operating over reliable transport, the result would be multiplication of the effective slow-start ramp-up by the number of connections. For example, if a NAS had ten connections open to a AAA proxy, and used a per-connection initial window [20] of 2, then the effective initial window would be 20. This is inappropriate, since it would permit the NAS to send a large burst of packets into the network.

4.5. Duplicate detection

In order to avoid spurious re-transmits, it is necessary for TCP [24] and SCTP [6] to include logic for estimating the re-transmission timer. However, even with a good RTO estimator, RTT distributions are typically heavy-tailed so that there will be some number of false re-transmits. As a result, AAA servers must be prepared to receive duplicate requests, and it is typical for server implementations to cache responses so as to make it possible to respond to such duplicate requests more efficiently.

4.6. Invalidation of transport parameter estimates

Congestion control principles [9],[16] require the ability of a transport protocol to respond effectively to congestion, as sensed via increasing delays, packet loss, or explicit congestion notification.

With network-driven applications, it is possible to respond to congestion on a timescale comparable to the round-trip time (RTT).

However, with application-driven AAA protocols, the time between sends may be considerably larger than the RTT, so that the network conditions cannot be assumed to persist between sends. For example, the congestion window may grow during a period in which congestion is being experienced, because few packets are sent, limiting the opportunity for feedback. Similarly, after congestion is detected, the congestion window may remain small, even though the network conditions that existed at the time of congestion no longer apply by the time when the next packets are sent. In addition, due to the low sampling interval, estimates of RTT and RTO may become invalid.

4.7. Inability to use fast re-transmit

When congestion window validation [13] is implemented, the result is that AAA protocols operate much of the time in slow-start with an initial congestion window set to 1 or 2, depending on the implementation [20]. This implies that AAA protocols gain little benefit from the windowing features of reliable transport.

Since the congestion window is so small, it is generally not possible to receive enough duplicate ACKs (3) to trigger fast re-transmit. As a result, dropped packets will require a retransmission timeout (RTO).

4.8. Congestion avoidance

The law of conservation of packets [9] suggests that a client should not send another packet into the network until it can be reasonably sure that a packet has exited the network on the same path.
In the case of a AAA client, the law suggests that it should not retransmit to the same server or choose another server until it can be reasonably sure that a packet has exited the network on the same path. If the client advances the window as responses arrive, then the client will "self clock", adjusting its transmission rate to the available bandwidth.

While a AAA client using a reliable transport such as TCP [5] or SCTP [6] will self-clock when communicating directly with a AAA server, end-to-end self-clocking is not assured when a AAA proxy is present.

As described in the Appendix, AAA proxies may be classified as Re-directs, Store and Forward Proxies, Application layer Proxies, and Transport proxies. Of these proxies, only the Transport and Re-direct proxy types result in establishment of a direct transport connection between the AAA client and AAA server. Where such direct transport connections exist, end-to-end self-clocking will occur.

However, when store and forward or application layer proxies are used, two separate and de-coupled transport connections are used, one between the AAA client and proxy, and another between the AAA proxy and server. Since the two transport connections are de-coupled, transport layer ACKs do not flow end-to-end, and self-clocking does not occur.

For example, let us consider the situation where AAA runs over a reliable transport and the bottleneck exists between the AAA proxy and a AAA server. In this situation, self-clocking will occur between the AAA client and AAA proxy, causing the AAA client to adjust its sending rate to the rate at which transport ACKs flow back from the AAA proxy. However, since this rate is higher than the bottleneck bandwidth, the overall system will not self-clock.
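
The consequence of de-coupling can be illustrated with a toy model. It is a sketch under stated assumptions, not a description of any proxy implementation: time advances in discrete ticks, the client delivers a fixed number of requests per tick, the proxy forwards a smaller fixed number, and the proxy holds requests in a finite queue with no transport-level flow control. The function name and all numeric values are hypothetical.

```python
def simulate_proxy(arrivals_per_tick, forwards_per_tick, buffer_limit, ticks):
    """Track the proxy queue depth and the number of requests dropped.

    arrivals_per_tick: requests accepted from the AAA client each tick
    forwards_per_tick: requests the proxy can pass to the AAA server each tick
    buffer_limit:      maximum number of requests the proxy can hold
    """
    queued = 0
    dropped = 0
    history = []
    for _ in range(ticks):
        queued += arrivals_per_tick
        if queued > buffer_limit:
            # The queue is full: the excess requests are lost.
            dropped += queued - buffer_limit
            queued = buffer_limit
        queued -= min(queued, forwards_per_tick)
        history.append(queued)
    return history, dropped

history, dropped = simulate_proxy(arrivals_per_tick=5, forwards_per_tick=3,
                                  buffer_limit=20, ticks=12)
print(history)  # [2, 4, 6, 8, 10, 12, 14, 16, 17, 17, 17, 17]
print(dropped)  # 7
```

The queue, and with it the delay, grows steadily because the client paces itself against the proxy rather than against the bottleneck. Once the queue is full, every further tick loses requests.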

Since there is no direct transport connection between the AAA client and AAA server, the AAA client does not have the ability to estimate the end-to-end transport parameters and adjust its sending rate to the bottleneck bandwidth between the proxy and server. As a result, the incoming rate at the AAA proxy can be higher than the rate at which packets can be sent to the AAA server.

In this case, the end-to-end performance will be determined by details of the proxy implementation. In general the end-to-end transport performance in the presence of application layer or store and forward proxies will always be worse in terms of delay and packet loss than if the AAA client and server were communicating directly.

For example, if the proxy operates with a large receive buffer, it is possible that a large queue will develop on the receiving side, since the AAA client is able to send packets to the AAA proxy more rapidly than the proxy can send them to the AAA server. Eventually, the buffer will overflow, causing wholesale packet loss as well as high delay.

Methods to induce fine-grained coupling between the two transport connections are difficult to implement. One possible solution is for the AAA proxy to operate with a receive buffer that is no larger than its send buffer. If this is done, "back pressure" (closing of the receive window) will cause the proxy to reduce the AAA client sending rate when the proxy send buffer fills.

However, unless multiple connections exist between the AAA client and AAA proxy, closing of the receive window will affect all traffic sent by the AAA client, even traffic destined to AAA servers where no bottleneck exists. Since multiple connections between a AAA client and proxy result in multiplication of the effective slow-start ramp rate, this is not recommended.
As a result, use of "back pressure" cannot enable individual AAA client-server conversations to self-clock, and this technique appears impractical for use in AAA.

4.9. Delayed Acknowledgments

As described in Appendix A, ACKs may comprise as much as half of the traffic generated in a AAA exchange. This occurs because AAA conversations are typically application-driven, and therefore there is frequently not enough traffic to enable ACK piggybacking. As a result, AAA protocols running over TCP or SCTP transport may experience a doubling of traffic as compared with implementations utilizing UDP transport.

It is typically not possible to address this issue via the sockets API. ACK parameters (such as the value of the delayed ACK timer) are typically fixed by the TCP implementation and therefore not tunable by the application.

4.10. Premature failover

RADIUS [2] failover implementations are typically based on the concept of primary and secondary servers, in which all traffic flows to the primary server unless it is unavailable. However, the failover algorithm was never specified. As a result, RADIUS failover implementations vary in quality, with some failing over prematurely, violating the law of "conservation of packets".

Where an application layer or store and forward proxy is present, the NAS has no direct connection to a AAA server, and is unable to estimate the end-to-end transport parameters. As a result, a NAS or proxy awaiting an application-layer response from the server has no transport-based mechanism for determining an appropriate failover timer.

For example, if the path between the AAA proxy and server includes a high delay link, it is possible that the NAS will fail over to another proxy while packets are still in flight.
This violates the principle of "conservation of packets" since the NAS will inject additional packets into the network before having evidence that a previously sent packet has left the network. Such behavior can result in worsening the situation on an already congested link, resulting in congestive collapse [9].

4.11. Head of line blocking

Head of line blocking occurs during periods of packet loss where the time between sends is shorter than the re-transmission timeout value (RTO). In such situations, packets back up in the send queue until the lost packet can be successfully re-transmitted.

Head of line blocking is typically an issue only on larger NASes. For example, a 48-port NAS with an average inter-packet spacing of 25 seconds is unlikely to have an RTO greater than this unless severe packet loss is experienced. However, a 2048-port NAS with an average inter-packet spacing of 293 ms may experience head-of-line blocking since the inter-packet spacing is less than the minimum RTO value of 1 second.

4.12. Connection load balancing

In order to lessen queuing delays and ameliorate the head of line blocking problem, it is desirable for a AAA protocol to be able to load balance between multiple connections. While sophisticated load balancing techniques are possible, substantial benefits can be achieved by use of static load balancing. In static load balancing, traffic is distributed between servers based on static "weights" corresponding to server capacity.

5. AAA transport profile

In order to address the issues described previously, it is recommended that AAA protocols make use of standards track as well as experimental techniques. Recommendations on AAA transport usage are described below.

5.1. Transport mappings

AAA Servers MUST support TCP and SCTP. NASes MUST support TCP, and MAY support SCTP.
As support for SCTP improves, it is possible that SCTP support will be required on NASes at some point in the future. TCP is required on NASes because not all NASes have SCTP in their protocol stacks, and because existing firewalls may not support SCTP.

5.2. Application layer heartbeat

In order to enable AAA implementations to more quickly detect transport and application-layer failures, AAA protocols MUST support an application layer heartbeat. The heartbeat is used in order to enable a NAS or proxy to determine when to resend on another connection. The heartbeat protocol is not intended as a server-server failover mechanism comparable to that proposed in [31].

The AAA heartbeat operates as follows within a primary/secondary configuration:

[1] Let us assume that each NAS is initially configured with a single primary AAA proxy or server, and one or more secondary connections.

[2] Heartbeat behavior is determined by two major parameters: the heartbeat timer (Th) and the failover timer (Tf). These timers are maintained on a per-connection basis.

    The purpose of the heartbeat timer is to control the sending of heartbeat packets between AAA client and server. The heartbeat timer is set by the AAA client after sending a request to the server. It is reset after receipt of a response from the server. Thus, the heartbeat timer will only expire in circumstances where there is no traffic between the AAA client and server. When the heartbeat timer expires, a heartbeat packet is sent, and the heartbeat timer is reset. The heartbeat timer ranges between 30 seconds and 60 seconds, and is dynamically estimated as described in [6].

    The purpose of the failover timer is to control the resending of a request on a secondary transport connection. The failover timer is set when the heartbeat timer expires, and it is cleared (not reset) on receipt of a packet from the server.

    The default value of Tf is 31 * RTO.
    Where RTO is unknown by the application, RTOmin = 1 second is assumed and thus the default value of Tf is 31 seconds. This corresponds to 5 timeouts with exponential backoff (1 + 2 + 4 + 8 + 16 = 31).

    Tf MUST NOT be set lower than 7 * RTO. Where RTO is unknown by the application, RTOmin = 1 second is assumed and thus the minimum value of Tf is 7 seconds. This corresponds to 3 timeouts with exponential backoff (1 + 2 + 4 = 7).

    Since a value of Tf lower than the default makes duplicate responses more likely, if a duplicate response is received, then it is recommended that the value of Tf be doubled until Tf reaches the default value.

[3] When the failover timer expires, the AAA client MAY fail over the request to the secondary server. However, the client MUST NOT close the primary connection until the primary heartbeat timer has expired twice without a response.

    Once the AAA client has failed over to the secondary, subsequent requests are sent to the secondary server until the heartbeat timer on the primary connection is reset, and the next secondary in the list becomes the secondary. This prevents flapping between the primary and secondary server, and ensures that the failover semantic remains consistent.

    In situations where no transport layer ACK is received on the primary connection after multiple re-transmissions, the RTO will be exponentially backed off. Due to Karn's algorithm, the RTO estimator will not be reset until another ACK is received in response to a non-re-transmitted request. Thus, after the client fails over to the secondary, the RTO of the primary will remain at a high value unless an ACK is received on the primary connection.

    As a result, subsequent requests sent on the primary connection will not receive the same service as was originally provided. For example, if Tf remains set at 7 seconds, on subsequent requests, instead of failover occurring after 3 retransmissions, it may occur without even a single retransmission.
    Suspending use of the primary connection until a response is received to a heartbeat message guarantees that the RTO timer will have been reset before the primary connection is reused. If no response is received to the second heartbeat message, then the primary connection is closed and so the temporary suspension becomes permanent.

[4] After the connection to a server is closed after the expiration of two heartbeat timers, the AAA client continues to attempt to bring up the connection by sending a heartbeat message at the heartbeat interval, Th. Thus, the heartbeat timers continue to run even when a connection is closed. Once the connection is re-opened, it is not put into service again until the heartbeat has been successfully responded to three times.

[5] This heartbeat mechanism provides support for multiple secondaries. Once a secondary ascends to primary status, its connection is suspended or closed using the same rules as apply to primaries. Thus, it is possible to fail over from a primary to a secondary, and then to have to fail over from that secondary to another secondary. Implementations will typically retain a limit on the number of connections open at a time, so that this behavior will not result in too many open connections. Typically this also implies that once a previously closed connection is brought online again, another lower priority connection will be closed.

[6] In order to enable diagnosis of failover behavior, it is recommended that a table of failover events be kept within the MIB.

5.3. Use of Nagle Algorithm

While AAA protocols typically operate in the application-driven regime, there are circumstances in which they are network driven. For example, where a NAS reboots, or where connectivity is restored between a NAS and a AAA proxy, it is possible that multiple packets will be available for sending.
As a result, there are circumstances where the transport-layer batching provided by the Nagle algorithm is useful, and AAA implementations MUST therefore enable the Nagle algorithm, RFC 896 [12].

5.4. Multiple connections

AAA protocols SHOULD use only a single persistent connection between a AAA client and a AAA proxy or server, and SHOULD provide for pipelining of requests, so that more than one request can be in progress at a time.

In order to minimize use of inactive connections in roaming situations, a AAA proxy MAY bring down a connection to a AAA server if the connection has been unused (discounting the heartbeat) for a certain period of time, which MUST NOT be less than BRINGDOWN_INTERVAL (5 minutes).

In the event that a connection goes down to a given AAA proxy or server, the AAA client MAY attempt to bring it back up periodically. However, these attempts to revive the connection MUST NOT be more aggressive than the HEARTBEAT_MINIMUM (3 seconds).

5.6. Connection load balancing

In order to support failover and failback, a AAA implementation MUST support connection failure detection, and MUST NOT send packets on a socket that it knows to be inoperative. This implies that the "weight" on a non-operable connection MUST be reduced to zero.

In order to provide additional resilience and address head of line blocking issues, a AAA client MAY maintain connections to multiple AAA proxies, and a AAA proxy MAY maintain connections to multiple AAA servers. A AAA client/proxy connected to multiple proxies/servers can treat them as primary/secondary or balance load between them.

Static load balancing SHOULD be supported using Pearson's hash [29] applied to the NAI [28]. Hashing on the NAI ensures that traffic for a given destination will be sent to the same proxy, maximizing use of the routing cache.
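
A minimal sketch of weighted static load balancing with a Pearson-style hash follows. It rests on several assumptions that are not taken from this document: the permutation table is a simple illustrative one rather than the table published in [29], the function names and proxy names are hypothetical, and the way the 8-bit hash is mapped onto the weights is one possible choice among many.

```python
# Illustrative permutation of 0..255. An odd multiplier modulo 256 always
# yields a permutation. Assumption: any fixed permutation serves for the
# sketch; a real implementation would use a well-mixed table.
TABLE = [(i * 167 + 13) % 256 for i in range(256)]

def pearson_hash(data):
    """Return an 8-bit Pearson-style hash of a bytes object."""
    h = 0
    for byte in data:
        h = TABLE[h ^ byte]
    return h

def pick_proxy(nai, weights):
    """Choose a proxy for an NAI in proportion to static weights.

    weights maps a (hypothetical) proxy name to a non-negative integer.
    A weight of zero marks a connection known to be inoperative.
    """
    live = [(name, w) for name, w in sorted(weights.items()) if w > 0]
    total = sum(w for _, w in live)
    if total == 0:
        raise ValueError("no operable connection")
    slot = pearson_hash(nai.encode("utf-8")) % total
    for name, w in live:
        if slot < w:
            return name
        slot -= w
    raise AssertionError("unreachable: slot is always below total")

weights = {"proxy-a": 3, "proxy-b": 1}
first = pick_proxy("user@example.com", weights)
second = pick_proxy("user@example.com", weights)
print(first == second)  # True: the same NAI always maps to the same proxy
```

Because the hash is only 8 bits wide, the split between proxies is approximate, and changing a weight (for example, to zero on failure) remaps some NAIs to a different proxy.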

More sophisticated load balancing techniques, such as dynamic load balancing, MAY also be supported by AAA clients and proxies.

5.7. Duplicate detection

AAA protocols MUST support an end-to-end message identifier, to enable the home server to detect duplicates. Hop-by-hop identifiers whose value may change at each hop are not sufficient, since a AAA server may receive the same message from multiple proxies. For example, a AAA client can send a request to Proxy1, then fail over and resend the request to Proxy2; both proxies forward the request to the home AAA server, with different hop-by-hop identifiers. A Session-ID is insufficient as it does not distinguish different messages for the same session.

5.8. Invalidation of transport parameter estimates

In order to address invalidation of transport parameter estimates, AAA protocol implementations MAY utilize Congestion Window Validation (RFC 2861) [13] and RTO Validation [30].

RFC 2581 [14] recommends that a connection go into slow-start after a period where no traffic has been sent within the RTO interval. RFC 2861 [13] recommends only increasing the congestion window if it was full when the ACK arrived. The congestion window is reduced by half once every RTO interval if no traffic is received.

When Congestion Window Validation is used, the congestion window will not build during application-driven periods, and instead will be decayed. As a result, AAA applications operating within the application-driven regime will typically run with a congestion window equal to the initial window [21] much of the time. This implies that AAA protocols will typically operate in "perpetual slowstart".

During periods in which AAA behavior is application-driven this will have no effect, since the time between packets will be larger than RTT, and thus AAA will operate with an effective congestion window of 1.
However, during network-driven periods, the effect will be to space out sending of AAA packets. Thus instead of being able to send a large burst of packets into the network, a NAS will need to wait several RTTs as the congestion window builds during slow-start. For example, a NAS operating with an initial window of 2, with 35 AAA requests to send would take approximately 6 RTTs to send them, as the congestion window builds during slow start: 2, 3, 3, 6, 9, 12. After the backlog is cleared, the implementation will once again be application-driven and the congestion window size will decay.

Note that RFC 2861 [13] does not address the issue of RTO validation. This is also a problem, particularly when the Congestion Manager [19] is implemented. During periods of high packet loss, the RTO may be repeatedly increased via exponential backoff, and may attain a high value. Due to lack of timely feedback on RTT and RTO during application-driven periods, the high RTO estimate may persist long after the conditions that generated it have dissipated.

In order to address this issue, an RTO validation procedure is required. The following procedure [30] is recommended, and will be documented in the form of an Internet-Draft at some point in the future:

   After the congestion window is decayed according to [13], reset the estimated RTO to 3 seconds. After the next packet comes in, re-calculate RTTavg, RTTdev, and RTO according to the method described in [14].

5.9.  Inability to use fast re-transmit

When Congestion Window Validation (RFC 2861) [13] is used, AAA implementations will operate with a congestion window equal to the initial window much of the time. As a result, the window size will often not be large enough to enable use of fast re-transmit. To address this issue, AAA implementations SHOULD implement Limited Transmit, as described in RFC 3042 [21].
Rather than reducing the number of duplicate ACKs required for triggering fast recovery, which would increase the number of inappropriate re-transmissions, Limited Transmit enables the window size to be increased, thus enabling the sending of additional packets, which in turn may trigger fast re-transmit without a change to the algorithm.

However, if congestion window validation [13] is implemented, this proposal will only have an effect in situations where the time between packets is less than the estimated retransmission timeout (RTO). If the time between packets is greater than RTO, additional packets will typically not be available for sending so as to take advantage of the increased window size. As a result, AAA protocols will typically operate with the lowest possible congestion window size, resulting in a re-transmission timeout for every lost packet.

5.10.  Head of line blocking

The head-of-line blocking problem can be addressed by a combination of Limited Transmit [21] and connection load balancing.

5.11.  Congestion avoidance

In order to improve upon default timer estimates, AAA implementations MAY implement the Congestion Manager (CM) [19]. CM is an end-system module that:

   (i)   Enables an ensemble of multiple concurrent streams from a sender destined to the same receiver and sharing the same congestion properties to perform proper congestion avoidance and control, and

   (ii)  Allows applications to easily adapt to network congestion.

The CM helps integrate congestion management across all applications and transport protocols. The CM maintains congestion parameters (available aggregate and per-stream bandwidth, per-receiver round-trip times, etc.) and exports an API that enables applications to learn about network characteristics, pass information to the CM, share congestion information with each other, and schedule data transmissions.
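The RTO validation procedure of Section 5.8 works on the same round-trip estimates (RTTavg, RTTdev) that the CM maintains, and can be sketched as follows. This is an illustration under stated assumptions, not the CM API: the class and method names are hypothetical, the smoothing constants and the 1 second lower bound are taken from RFC 2988 [24], and the clock granularity term of that RFC is omitted for brevity.

```python
class RtoEstimator:
    """Hypothetical RTO estimator with a validation (reset) step."""

    INITIAL_RTO = 3.0  # seconds; the reset value named in Section 5.8
    MIN_RTO = 1.0      # seconds; lower bound from RFC 2988
    ALPHA = 1 / 8      # gain for the smoothed RTT (RTTavg)
    BETA = 1 / 4       # gain for the RTT deviation (RTTdev)

    def __init__(self) -> None:
        self.reset()

    def reset(self) -> None:
        """Discard stale estimates, e.g. after the congestion window decays."""
        self.rtt_avg = None
        self.rtt_dev = None
        self.rto = self.INITIAL_RTO

    def on_rtt_sample(self, rtt: float) -> float:
        """Fold a new RTT measurement into the estimates and return the RTO."""
        if self.rtt_avg is None:
            # First sample after start-up or after a reset.
            self.rtt_avg = rtt
            self.rtt_dev = rtt / 2
        else:
            # The deviation is updated before the average, as in RFC 2988.
            self.rtt_dev = (1 - self.BETA) * self.rtt_dev + self.BETA * abs(self.rtt_avg - rtt)
            self.rtt_avg = (1 - self.ALPHA) * self.rtt_avg + self.ALPHA * rtt
        self.rto = max(self.MIN_RTO, self.rtt_avg + 4 * self.rtt_dev)
        return self.rto

    def on_timeout(self) -> None:
        """Exponential backoff after a retransmission timeout."""
        self.rto *= 2
```

An implementation would call reset() at the point where the congestion window is decayed, so that an RTO inflated by earlier backoff does not persist through a long application-driven period.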
The CM enables the AAA application to access transport parameters (RTTavg, RTTdev) via callbacks. RTO estimates are currently not available via the callback interface, though they probably should be. Where available, transport parameters SHOULD be used to improve upon default timer values.

5.12.  Premature Failover

To prevent premature failover, all AAA messages sent by a AAA client or proxy (including accounting) MUST include a maximum wait time. If the next hop server cannot return the reply within that time period, it MUST send an error message with an appropriate reason code. The maximum wait time MUST NOT be shorter than MINIMUM_WAIT_INTERVAL (15 seconds).

Application layer error messages are needed so that the NAS can perform appropriate failover. Failures can occur at both the transport and application layers; for example, the NAS-proxy or proxy-AAA server transport connections can fail, or a proxy/AAA server can be congested or busy. At the application layer, the AAA application can fail. In order to enable proper failover behavior, the NAS or proxy must be able to distinguish between these conditions.

The following Application Layer Status Messages are recommended:

   "Busy":  Proxy/Server too busy to handle additional requests; NAS should failover all requests to another proxy/server.

   "Forwarding":  Proxy has located AAA server, but timely response is not forthcoming; NAS should reset application layer timers, wait for final response.

   "Can't Locate":  Proxy can't locate the AAA server for the indicated realm; NAS should failover that request to another proxy.

   "Failover":  Proxy has tried primary server, is failing over to secondary server; NAS should reset application layer timers, wait for final response.

   "Can't Forward":  Proxy has tried both primary and secondary AAA servers with no response; NAS should failover to another proxy.

   "Processing":  Server cannot provide an immediate response to this request; NAS should failover this request to another server, but not all requests.

These messages differ in that some tell the NAS that the proxy/server is too busy for any request and therefore that the connection should come down for a while; some say that the proxy/AAA server can't handle a particular request, implying failover for that request alone; some indicate that the ultimate destination cannot be reached or isn't responding, implying per-request failover. Note that these messages are all hop-by-hop.

6.  References

[1]  Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[2]  Rigney, C., Willens, S., Rubens, A., Simpson, W., "Remote Authentication Dial In User Service (RADIUS)", RFC 2865, June 2000.

[3]  Rigney, C., "RADIUS Accounting", RFC 2866, June 2000.

[4]  Calhoun, P., Rubens, A., Akhtar, H., Guttman, E., "DIAMETER Base Protocol", Internet draft (work in progress), draft-ietf-aaa-diameter-00.txt, February 2001.

[5]  Postel, J., "Transmission Control Protocol", STD 7, RFC 793, September 1981.

[6]  Stewart, R., et al., "Stream Control Transmission Protocol", RFC 2960, October 2000.

[7]  Aboba, B., Vollbrecht, J., "Proxy Chaining and Policy Implementation in Roaming", RFC 2607, June 1999.

[8]  Aboba, B., Arkko, J., "Introduction to Accounting Management", RFC 2975, October 2000.

[9]  Jacobson, V., "Congestion Avoidance and Control", Computer Communications Review, ACM SIGCOMM,

[10] Blunk, L. and J. Vollbrecht, "PPP Extensible Authentication Protocol (EAP)", RFC 2284, March 1998.

[11] Rigney, C., Willats, W., Calhoun, P., "RADIUS Extensions", RFC 2869, June 2000.

[12] Nagle, J., "Congestion Control in IP/TCP", RFC 896, January 1984.

[13] Handley, M., Padhye, J., Floyd, S., "TCP Congestion Window Validation", RFC 2861, June 2000.

[14] Allman, M., Paxson, V. and W. Stevens, "TCP Congestion Control", RFC 2581, April 1999.

[15] Paxson, V., Allman, M., Dawson, S., Fenner, W., Griner, J., Heavens, I., Lahey, K., Semke, J. and B. Volz, "Known TCP Implementation Problems", RFC 2525, March 1999.

[16] Floyd, S., "Congestion Control Principles", RFC 2914, September 2000.

[17] Dawkins, S., Montenegro, G., Kojo, M. and V. Magret, "End-to-end Performance Implications of Slow Links", Internet draft (work in progress), draft-ietf-pilc-slow-04.txt, July 2000.

[18] Jacobson, V., Braden, R. and D. Borman, "TCP Extensions for High Performance", RFC 1323, May 1992.

[19] Balakrishnan, H., Seshan, S., "The Congestion Manager", Internet draft (work in progress), draft-ietf-ecm-cm-03.txt, November 2000.

[20] Allman, M., Floyd, S. and C. Partridge, "Increasing TCP's Initial Window", RFC 2414, September 1998.

[21] Allman, M., Balakrishnan, H., Floyd, S., "Enhancing TCP's Loss Recovery Using Limited Transmit", RFC 3042, January 2001.

[22] Mathis, M., Mahdavi, J., Floyd, S., Romanow, A., "TCP Selective Acknowledgment Options", RFC 2018, October 1996.

[23] Floyd, S., Henderson, T., "The NewReno Modification to TCP's Fast Recovery Algorithm", RFC 2582, April 1999.

[24] Paxson, V., Allman, M., "Computing TCP's Retransmission Timer", RFC 2988, November 2000.

[25] Floyd, S., Mahdavi, J., Mathis, M., Podolsky, M., Romanow, A., "An Extension to the Selective Acknowledgment (SACK) Option for TCP", RFC 2883, July 2000.

[26] Montenegro, G., Dawkins, S., Kojo, M., Magret, V., Vaidya, N., "Long Thin Networks", RFC 2757, January 2000.

[27] Touch, J., "TCP Control Block Interdependence", RFC 2140, April 1997.

[28] Aboba, B. and M. Beadles, "The Network Access Identifier", RFC 2486, January 1999.

[29] Volz, B., Gonczi, S., Lemon, T., Stevens, R., "DHC Load Balancing Algorithm", Internet draft (work in progress), draft-ietf-dhc-loadb-03.txt, September 2000.

[30] Mankin, A., personal communication.

[31] Droms, R., Kinnear, K., Stapp, M., Volz, B., Gonczi, S., Rabil, G., Dooley, M., Kapur, A., "DHCP Failover Protocol", Internet draft (work in progress), draft-ietf-dhc-failover-08.txt, July 2000.

7.  Appendix A - AAA proxy bestiary

As described in [2] and [7], proxies have become a common feature of the AAA landscape in order to support services such as roaming and shared use networks. Such proxies are used both for authentication/authorization and for accounting [8].

AAA proxies come in several varieties, including:

   Application-layer proxies
   Re-directs
   Store and Forward proxies
   Transport layer proxies

The transport layer behavior of each of these proxies is described in turn.

7.1.  Application-layer proxies

A conventional application-layer AAA proxy does not respond to a NAS request until it receives a response from the AAA server. Since the Nagle algorithm is typically not triggered in AAA exchanges, the typical behavior of an application-layer AAA proxy operating over reliable transport within the application-driven regime is shown below.

   Time            NAS             Proxy          Home Server
   ------          ---             -----          -----------

   0               Request
                   ------->

   OTTnp + Tpr                     Request
                                   ------->

   OTTnp + TdA                     Delayed ACK
                                   <-------

   OTTnp + OTTph +                                Reply/ACK
   Tpr + Tsr                                      <-------

   OTTnp + OTTph +
   Tpr + Tsr +                     Reply
   OTThp + TpR                     <-------

   OTTnp + OTTph +
   Tpr + Tsr +                     Delayed ACK
   OTThp + TdA                     ------->

   OTTnp + OTTph +
   OTThp + OTTpn +
   Tpr + Tsr +     Delayed ACK
   TpR + TdA       ------->

   Key
   ---
   OTT   = One-way Trip Time
   OTTnp = One-way trip time (NAS to Proxy)
   OTTpn = One-way trip time (Proxy to NAS)
   OTTph = One-way trip time (Proxy to Home server)
   OTThp = One-way trip time (Home Server to Proxy)
   TdA   = Delayed ACK timer
   Tpr   = Proxy request processing time
   TpR   = Proxy reply processing time
   Tsr   = Server request processing time

With application-layer proxies two connections are established, one from the NAS to the AAA proxy, and another from the AAA proxy to the AAA server. Since these connections are de-coupled, the end-to-end conversation between the NAS and AAA server will not self-clock.

Another thing to notice about this situation is that ACKs may comprise as much as half of the traffic. This occurs because ACK parameters (such as the value of the delayed ACK timer) are typically fixed by the TCP implementation and are not tunable by the application. Since AAA traffic is application-driven, there is frequently not enough traffic to enable ACK piggybacking. Thus, the use of reliable transport by AAA protocols may result in as much as a doubling of traffic over what would be experienced with UDP transport.

A detailed examination of the trace reveals the conditions under which this may occur. At time 0, the NAS sends a request to the proxy. Ignoring the serialization time, the request arrives at the proxy at time OTTnp, and the proxy takes an additional Tpr in order to forward the request toward the home server. At time TdA after receiving the request, the proxy sends a delayed ACK.
The delayed ACK is sent, rather than being piggybacked on the reply, as long as TdA < OTTph + OTThp + Tpr + Tsr + TpR.

Typically Tpr < TdA, so that the delayed ACK is sent after the proxy forwards the request toward the home server, but before the proxy receives the reply from the server. However, depending on the TCP implementation on the proxy and when the request is received, it is also possible for the delayed ACK to be sent prior to forwarding the request.

At time OTTnp + OTTph + Tpr, the server receives the request, and Tsr later it generates the reply. Where Tsr < TdA, the reply will contain a piggybacked ACK. However, depending on the responsiveness of the AAA server and the server's TCP implementation, it is conceivable that the ACK and reply will be sent separately. This may be the case, for example, where a slow database or filestore must be consulted by the server prior to sending the reply.

At time OTTnp + OTTph + OTThp + Tpr + Tsr the reply/ACK reaches the proxy, which then takes TpR additional time to forward the reply to the NAS. At TdA after receiving the reply, the proxy generates a delayed ACK. Typically TpR < TdA so that the delayed ACK is sent to the server after the proxy forwards the reply to the NAS. However, depending on the circumstances and the proxy TCP implementation, the delayed ACK may be sent first.

As in the case of the delayed ACK sent in response to a request, which may be piggybacked if the reply can be received quickly enough, piggybacking of the ACK sent in response to a reply from the server is only possible if additional request traffic is available to piggyback on. However, due to the high inter-packet spacings in typical AAA scenarios, this is unlikely unless the AAA protocol supports a reply ACK.

At time OTTnp + OTTph + OTThp + OTTpn + Tpr + Tsr + TpR the NAS receives the reply. TdA later, a delayed ACK is generated.

7.2.  Re-directs

Re-directs operate by referring a NAS to the AAA server, enabling the NAS to talk to the AAA server directly. The sequence of events is as follows:

   Time            NAS             Re-direct      Home Server
   ------          ---             ---------      -----------

   0               Request
                   ------->

   OTTnp + Tpr                     Redirect/ACK
                                   <-------

   OTTnp + Tpr +   Request
   OTTpn + Tnr     ------->

   OTTnp + OTTpn +
   Tpr + Tsr +                                    Reply/ACK
   OTTns                                          <-------

   OTTnp + OTTpn +
   OTTns + OTTsn +
   Tpr + Tsr +     Delayed ACK
   TdA             ------->

   Key
   ---
   OTT   = One-way Trip Time
   OTTnp = One-way trip time (NAS to Re-direct)
   OTTpn = One-way trip time (Re-direct to NAS)
   OTTns = One-way trip time (NAS to Server)
   OTTsn = One-way trip time (Server to NAS)
   TdA   = Delayed ACK timer
   Tpr   = Re-direct processing time
   Tnr   = NAS re-direct processing time
   Tsr   = Server request processing time

Since with re-directs a direct transport connection is established between the NAS and the AAA server, the end-to-end connection will self-clock. Delayed ACKs are also reduced as compared with the application-layer proxy case, since the Re-direct and Home Server will typically be able to piggyback ACKs on their replies.

7.3.  Store and Forward proxies

With a store and forward proxy, the proxy may send a reply to the NAS prior to forwarding the request to the server. While store and forward proxies are most frequently deployed for accounting [8], they also can be used to implement authentication/authorization policy, as described in [7].

With a store and forward proxy, the sequence of events is as follows:

   Time            NAS             Proxy          Home Server
   ------          ---             -----          -----------

   0               Request
                   ------->

   OTTnp + TpR                     Reply/ACK
                                   <-------

   OTTnp + Tpr                     Request
                                   ------->

   OTTnp + OTTph +                                Reply/ACK
   Tpr + Tsr                                      <-------

   OTTnp + OTTph +
   Tpr + Tsr +                     Reply
   OTThp + TpR                     <-------

   OTTnp + OTTph +
   Tpr + Tsr +                     Delayed ACK
   OTThp + TdA                     ------->

   OTTnp + OTTph +
   OTThp + OTTpn +
   Tpr + Tsr +     Delayed ACK
   TpR + TdA       ------->

   Key
   ---
   OTT   = One-way Trip Time
   OTTnp = One-way trip time (NAS to Proxy)
   OTTpn = One-way trip time (Proxy to NAS)
   OTTph = One-way trip time (Proxy to Home server)
   OTThp = One-way trip time (Home Server to Proxy)
   TdA   = Delayed ACK timer
   Tpr   = Proxy request processing time
   TpR   = Proxy reply processing time
   Tsr   = Server request processing time

As noted in [8], store and forward proxies can have a negative effect on accounting reliability. By sending a reply to the NAS without receiving one from the accounting server, store and forward proxies fool the NAS into thinking that the accounting request has been accepted by the accounting server when this is not the case. As a result, the NAS can delete the accounting packet from non-volatile storage before it has been accepted by the accounting server. This leaves the proxy responsible for delivering accounting packets. If the proxy involves moving parts (e.g. a disk drive) while the NAS does not, overall system reliability can be reduced. As a result, store and forward proxies SHOULD NOT be used.

7.4.  Transport layer proxies

With a transport layer proxy, the proxy acts as an intermediary, forwarding transport ACKs between the NAS and the Home Server. This type of proxy effectively splices together the NAS-proxy and proxy-AAA server connections into a single connection that behaves as though it operated end-to-end. As a result, transport proxies will exhibit end-to-end self-clocking.
However, since these proxies need to operate at the transport layer, they cannot be implemented purely as applications, and examples of AAA transport proxies are rare.

With a transport proxy, the sequence of events is as follows:

   Time            NAS             Proxy          Home Server
   ------          ---             -----          -----------

   0               Request
                   ------->

   OTTnp + Tpr                     Request
                                   ------->

   OTTnp + OTTph +                                Reply/ACK
   Tpr + Tsr                                      <-------

   OTTnp + OTTph +
   Tpr + Tsr +                     Reply/ACK
   OTThp + TpR                     <-------

   OTTnp + OTTph +
   OTThp + OTTpn +
   Tpr + Tsr +     Delayed ACK
   TpR + TdA       ------->

   OTTnp + OTTph +
   OTThp + OTTpn +
   Tpr + Tsr +                     Delayed ACK
   TpR + TpD                       ------->

   Key
   ---
   OTT   = One-way Trip Time
   OTTnp = One-way trip time (NAS to Proxy)
   OTTpn = One-way trip time (Proxy to NAS)
   OTTph = One-way trip time (Proxy to Home server)
   OTThp = One-way trip time (Home Server to Proxy)
   TdA   = Delayed ACK timer
   Tpr   = Proxy request processing time
   TpR   = Proxy reply processing time
   Tsr   = Server request processing time
   TpD   = Proxy delayed ACK processing time

8.  Security Considerations

General security considerations concerning TCP congestion control are discussed in RFC 2581 [14].

9.  IANA Considerations

This draft does not create any new number spaces for IANA administration.

10.  Acknowledgments

Thanks to Allison Mankin of ISI, Barney Wolff of Databus, and Pat Calhoun of Sun Microsystems for fruitful discussions relating to AAA transport.

11.  Authors' Addresses

Bernard Aboba
Microsoft Corporation
One Microsoft Way
Redmond, WA 98052

Phone: +1 (425) 936-6605
Fax:   +1 (425) 936-7329
Email: bernarda@microsoft.com

12.  Intellectual Property Statement

The IETF takes no position regarding the validity or scope of any intellectual property or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; neither does it represent that it has made any effort to identify any such rights. Information on the IETF's procedures with respect to rights in standards-track and standards-related documentation can be found in BCP-11. Copies of claims of rights made available for publication and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementors or users of this specification can be obtained from the IETF Secretariat.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights which may cover technology that may be required to practice this standard. Please address the information to the IETF Executive Director.

13.  Full Copyright Statement

Copyright (C) The Internet Society (2001). All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works.
However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

14.  Expiration Date

This memo is filed as , and expires December 1, 2001.