Reliable Server Pooling Working Group                           L. Coene
Internet-Draft                                                   Siemens
Expires: April 15, 2004                                        P. Conrad
                                                  University of Delaware
                                                                  P. Lei
                                                                   Cisco
                                                        October 16, 2003

              Reliable Server Pool Applicability Statement

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on April 15, 2004.

Copyright Notice

Copyright (C) The Internet Society (2003). All Rights Reserved.

Abstract

This document describes the applicability of the reliable server pool architecture and protocols to applications that require highly available services. High availability is accomplished by using redundant servers and by failing over between servers of the same pool in case of server failure. Processing load in a pool may be distributed/shared between the members of the pool according to a certain policy. Some guidance is also given on the choice of underlying transport protocol (and corresponding transport protocol mapping) for transporting application data and Rserpool specific control data.

Coene, et al.            Expires April 15, 2004                 [Page 1]
Internet-Draft            Rspool applicability              October 2003

Table of Contents

1.    INTRODUCTION
1.1   Scope
1.2   Terminology
2.    Reliable serverpool
2.1   Architecture
2.2   ASAP/ENRP applicability
2.2.1 Minimal Rserpool service
2.2.2 Full Rserpool service
3.    Application and Control data Transport
3.1   Rserpool use between 2 pools
3.2   state sharing via the cookie
3.3   PE Registration Services
3.4   Failover Callback Function
3.5   PE Selection Services
3.6   Upper Layer/Application Level Acknowledgements
3.7   RSerPool Managed Data Channel
4.    Transport protocols used by ENRP & ASAP
4.1   ASAP on top of UDP
4.2   ASAP on top of TCP
4.3   ASAP on top of SCTP
4.4   Address hiding
5.    Proxies and Rserpool
6.    Issues for Reliable Server pooling
6.1   State transfer across the server pool
7.    Security considerations
8.    Acknowledgments
      References
      Authors' Addresses
      Intellectual Property and Copyright Statements

1. INTRODUCTION

Reliable server pooling provides protocols for providing highly available services.
The services are located in a pool of redundant servers, and if a server fails, another server takes over. The only requirement placed on the servers belonging to the pool is that, if state is maintained by the server, this state must be transferred to the server taking over. The mechanism for transferring this state information is NOT part of the Reliable server pooling architecture or protocols and must be provided by other protocols.

The goal is to provide server-based redundancy. Transport and network level redundancy are handled by the transport and network layer protocols.

The application may choose to distribute its traffic over the servers of the pool according to a certain policy.

An application wishing to make use of Rserpool protocols may use different transport layers (such as UDP, TCP and SCTP). However, some transport layers have built-in restrictions on the way they can operate within the Rserpool architecture and its protocols.

1.1 Scope

The scope of this document is to explore the different ways that Reliable server pool protocols can be used in order to provide a highly available service to applications with different requirements.

1.2 Terminology

The terms are commonly identified in related work and can be found in the Aggregate Server Access Protocol and Endpoint Name Resolution Protocol Common Parameters document RFC ARCH [2].

2. Reliable serverpool

2.1 Architecture

An overview of the reliable server pool architecture is given in the Rserpool architecture document RFC ARCH [2].

The Rserpool architecture is made up of clients (Pool Users - PU) and servers (Pool Elements - PE). Both PUs and PEs can be grouped into a pool in which a PE provides a service (file transfer, storage, bank transaction) to a PU. The PUs may try to find out via the Endpoint Name Resolution Protocol (ENRP) which PEs are active.
The PU can set up a communication channel with a particular PE (chosen out of the server pool) by using the Aggregate Server Access Protocol (ASAP) or by directly using any of the transport protocols (UDP/TCP/SCTP/RTP). ASAP may run on top of UDP, TCP or SCTP.

The minimum mode of using Rserpool is to use only ENRP for endpoint name resolution. The PU may set up the client-server communication WITHOUT ASAP, using existing transport protocols (such as UDP or TCP) instead.

The normal use of Rserpool is to use ENRP for endpoint name resolution and ASAP for client-server communication. ASAP may use UDP, TCP or SCTP as its underlying transport protocol.

2.2 ASAP/ENRP applicability

2.2.1 Minimal Rserpool service

The minimum service provided by Rserpool is the use of ENRP for endpoint name resolution. The ENRP protocol may run over TCP or SCTP.

o Endpoint name resolution
o No automatic failover from one PE to another; failover has to be done by the application itself
o Business card or cookie mechanism not possible
o May be used by already existing applications which do not want to change the interface between PU and PE
o Only PU-NS and PE-NS communication will use Rserpool protocols

2.2.2 Full Rserpool service

The full service provided by Rserpool is the use of ENRP for endpoint name resolution and the use of ASAP for PU-PE communication. ENRP may run over TCP or SCTP, while ASAP may run over TCP, SCTP, UDP or RTP.

o Endpoint name resolution
o Automatic failover from one PE to another is transparent to the application itself
o Business card exchange for determining whether a PU is a pool or not. It allows the PE to treat the PU as a pool and use Rserpool protocols for it
o Cookie mechanism can be used for state transfer between PEs
o May be used by already existing applications which do not want to change the interface between PU and PE
o All entities will use Rserpool protocols for communication with their respective peers

3. Application and Control data Transport

3.1 Rserpool use between 2 pools

Business cards allow an endpoint to detect whether its peer is part of a pool itself. Both the PU and the PE can be part of their own pools. If the PU or PE fails, then the business card will have informed the respective peer to contact an alternative fellow PE/PU belonging to the pool.

3.2 state sharing via the cookie

Every time a response is sent back, a cookie can be sent along with the response. The cookie is "encrypted" and is stored by the PU; no modification at all is made to the cookie. If a PE fails, the cookie is sent to an alternate PE, and that PE checks whether the cookie is valid. The contents of the cookie are provided and validated only by the PEs. The cookie can be used for state sharing between the PEs.

3.3 PE Registration Services

Pool Elements ("servers") must use the following services to add themselves to or remove themselves from server pools:

REGISTER, to add the pool element to a server pool using {pool handle, mapping mode, protocol or mapping id, port, policy info}, where mapping mode is defined in Section 5. A response result code is returned.

DEREGISTER, to remove the pool element from a server pool using {pool handle, mapping mode, protocol or mapping id, port, policy info}, where mapping mode is defined in Section 5. A response result code is returned.

TBD: if REGISTER also returns an opaque instance id, the application can just use that id for DEREGISTER, instead of passing in the (same) parameters used in REGISTER.

3.4 Failover Callback Function

The charter of the RSerPool Working Group specifically states that transaction failover is out of scope for RSerPool, i.e. "if a server fails during processing of a transaction this transaction may be lost.
Some services may provide a way to handle the failure, but this is not guaranteed." Accordingly, the RSerPool framework provides a "hook" for applications to provide their own application-specific failover mechanism(s). Specifically, an application can specify a callback function that is invoked whenever a failover has taken place. This callback function is invoked immediately after the new transport layer connection/association is established with a new server, and gives the application the opportunity to send one or more messages that may help the server to resume any transaction or session that was in progress when the first server failed.

As a simple example of how such a callback is useful, consider a file transfer service built using RSerPool. Let us assume that some FTP mirroring software is used to maintain mirrored sites, and that the actual mirroring is out of scope. However, we would like to use RSerPool to select a server from among the available mirror sites, and to fail over in the middle of a file transfer if a primary server fails.

For this example, assume that a simple request/response protocol is used, where one request message results in one or more response messages. Each request message contains the filename and the desired offset within the file (default zero). Each response message contains some portion of the file, along with the offset, the length of the portion in this message, and the length of the entire file. A single request is sufficient to produce a sequence of response messages from the requested offset to the end of the file. For simplicity, assume that the response messages are delivered by the underlying transport strictly in order (although this requirement could be relaxed if a small amount of extra complexity were introduced.)

In this protocol, all that is needed for failover is for the application to keep track of the number of bytes that it has read from the server, and to provide a callback function that reissues the request to the new server, replacing the offset with this number. When there is no failover, only one request message is sent and the minimum number of response messages is returned; in the event of failover(s), a single new request message is sent for each failover that occurs.

While this is a simple example, for more complex application requirements the failover callback could be used in a variety of ways:

The client might send security credentials for authentication by the server, and/or provide a "key" by which the server could locate and set up state by accessing some application-specific (and out-of-scope) state sharing mechanism used by the servers.

The client might keep track of various synchronization points in the transaction, and use the failover callback to replay messages from a recent synchronization point.

3.5 PE Selection Services

When automatic failover is enabled, selection of a new pool element according to the pool policy in place is automatically performed by the RSerPool framework in case of a detected failure (i.e. the framework provides automatic failover). No application intervention is required. Automatic failover may be enabled by setting the appropriate send flag when used in conjunction with data channel services (described in Section 3.7), or explicitly during initialization when data channel services are not used.

FAILOVER_INDICATION, delivered by callback, indicates that a failover has occurred and that any required application level state recovery should be performed. The newly selected pool element handle is provided.
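
The file transfer example of Section 3.4 can be combined with the failover callback described above. The following is a minimal sketch in Python and is not an RSerPool API: every name in it (Request, Response, MirrorServer, FileClient, download) is hypothetical, the pool is simulated in memory, and pool elements are simply tried in list order rather than selected by a pool policy. It only illustrates the idea that the client tracks the number of bytes read and reissues a single request with that offset after each failover.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Optional


@dataclass
class Request:
    filename: str
    offset: int = 0          # offset within the file, default zero


@dataclass
class Response:
    offset: int              # offset of this portion within the file
    data: bytes              # the portion itself; its length is len(data)
    total_length: int        # length of the entire file


class ServerFailed(Exception):
    """Raised by the simulated mirror when it 'fails' mid-transfer."""


class MirrorServer:
    """Hypothetical in-memory stand-in for one pool element (mirror site)."""

    def __init__(self, files: Dict[str, bytes], chunk: int = 4,
                 fail_after: Optional[int] = None):
        self.files = files
        self.chunk = chunk
        self.fail_after = fail_after   # number of responses sent before failing

    def serve(self, req: Request) -> Iterator[Response]:
        content = self.files[req.filename]
        sent = 0
        for pos in range(req.offset, len(content), self.chunk):
            if self.fail_after is not None and sent >= self.fail_after:
                raise ServerFailed()
            yield Response(pos, content[pos:pos + self.chunk], len(content))
            sent += 1


class FileClient:
    """Application side: tracks bytes read and supplies the failover callback."""

    def __init__(self, filename: str):
        self.filename = filename
        self.received = bytearray()
        self.requests_sent = 0

    def initial_request(self) -> Request:
        self.requests_sent += 1
        return Request(self.filename)

    def on_failover(self, new_pe_handle: int) -> Request:
        # Failover callback: reissue the request, replacing the offset
        # with the number of bytes already read from the failed server.
        self.requests_sent += 1
        return Request(self.filename, offset=len(self.received))

    def on_response(self, resp: Response) -> None:
        # Strictly in-order delivery is assumed, as in the text.
        assert resp.offset == len(self.received)
        self.received += resp.data


def download(client: FileClient, pool: List[MirrorServer]) -> bytes:
    """Toy stand-in for the framework: try pool elements in list order."""
    req = client.initial_request()
    for handle, server in enumerate(pool):
        if handle > 0:
            req = client.on_failover(handle)   # new PE handle is passed in
        try:
            for resp in server.serve(req):
                client.on_response(resp)
            return bytes(client.received)
        except ServerFailed:
            continue
    raise RuntimeError("all pool elements failed")
```

With a two-element pool whose first mirror fails after two responses, download() returns the complete file and the client has sent exactly two requests: the initial one and one reissued request for the single failover.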

Business Card services: when automatic failover is used, the exchange of business cards for rendezvous services is automatically performed by the RSerPool framework (i.e. no application intervention is required).

When automatic failover is not enabled, failover detection and selection of an alternate PE must be done by the upper layer/application. The following primitives are provided:

GET_PRIMARY_SERVER takes as input a pool handle and returns the {IP address, transport protocol, transport protocol port} of the primary server.

GET_NEXT_SERVER has a dual meaning. First, it indicates to the RSerPool layer the failure of the server returned by a previous GET_PRIMARY_SERVER or GET_NEXT_SERVER call. Second, it provides the {IP address, transport protocol, transport protocol port} of the next server that should be contacted, according to the best information available to the RSerPool layer at the present time. The appropriate pool policy for server selection for the pool should be used for selecting the next server.

3.6 Upper Layer/Application Level Acknowledgements

The RSerPool framework provides an upper layer/application level ack service. The upper layer protocol may request that the peer acknowledge receipt and successful processing of its sent data, providing an additional degree of confidence over transport level message retrieval. When used in conjunction with the data channel services (described in Section 3.7), any unacknowledged data will be automatically sent to a new pool element in case of failover, if desired (i.e. if automatic failover is enabled). The following service primitive is used to acknowledge an upper layer acknowledgement request.

ULP_ACK, responds to a received upper layer acknowledgement request.

3.7 RSerPool Managed Data Channel

The RSerPool framework provides these services to send and receive application layer data, which are used in place of the direct call of transport level system functions (e.g.
send/sendto, recv/recvfrom) and provide additional functionality to those calls.

DATA_SEND, to send data to a pool element by using a pool handle, a specific pool element handle, or a transport address. An upper layer acknowledgement may be requested with this service. Appropriate error code(s) are returned. When sending to a pool handle, the specific pool element handle is returned.

DATA_INDICATION, delivered by callback, to indicate that data has been received from a pool element and to pass that data to the application layer protocol. An application layer acknowledgement request can be indicated along with the data.

The application MAY direct that the RSerPool framework multiplex both the control and data channels onto the same SCTP association/TCP connection/etc., if desired.

4. Transport protocols used by ENRP & ASAP

4.1 ASAP on top of UDP

UDP is an unreliable message delivery protocol, so if a message gets lost due to a changeover of server (or client), the message will not be retransmitted after the changeover has occurred. New messages will be sent to the alternate server/client within the server pool. This service may be of some importance to services where real-time constraints apply (example: video servers, where a few lost messages matter little as long as the bulk of the messages get through). No congestion control is done, and as such no real measure of the congestion status on the server (or client) is taken into account, which makes load sharing harder. Only the ENRP server responsible for that particular server pool will have an up-to-date view of the load distribution in the pool.

4.2 ASAP on top of TCP

TCP provides fully reliable delivery, with congestion control, of the message to its peer node.
It provides for single-homed, single-stream delivery of a byte stream from or to the server. Changeover will retrieve the unsent messages and send them on another TCP connection to a different server of the server pool.

4.3 ASAP on top of SCTP

PR-SCTP is the only known protocol which allows the choice of full, partial or no reliable delivery, with congestion control, of the message to its peer node. If the no-reliable-delivery option of SCTP is selected, then ASAP will function as described for ASAP over UDP, but with congestion control. If multihoming, streams, unsequenced and/or assured delivery are required by the application, then SCTP should be used for ASAP. See the SCTP applicability statement, RFC 3257 [9].

4.4 Address hiding

If an application requires only a single address (due to memory constraints) to reach a pool element of a pool, then ASAP can provide one address at a time when querying the ENRP server. If that pool element fails, then the client must request a new address from the ENRP server before it can fail over (as it has no information about the other pool elements of the same pool except the pool handle). This is done by ASAP itself in the full Rserpool service, but must be done by the client software itself in the minimal Rserpool service. This may require some buffering in the client during the failover.

5. Proxies and Rserpool

Applications which require absolutely no protocol changes to their clients may be able to use Rserpool protocols by using a proxy between the client and the server pool. Neither ASAP nor ENRP is used by the client application, but the proxy employs ENRP and ASAP. The client will only know the IP address and port numbers of the proxy to contact. This can be accomplished via normal DNS queries.
The main drawback is that the proxy becomes a single point of failure for the connection between the client and the server.

6. Issues for Reliable Server pooling

6.1 State transfer across the server pool

Rserpool protocols (ENRP and ASAP) do NOT provide any service for directly transferring state information of an application from one Pool Element (PE) to another PE. However, by using the ASAP cookie mechanism, the PU may be able to transfer some state, provided by the PE to the PU, to the new PE in case of failover. It is the responsibility of the PU to do this.

7. Security considerations

The protocols used in the Reliable server pool architecture only try to increase the availability of the servers in the network. Rserpool protocols do not contain any protocol mechanisms which are directly related to user message authentication, integrity and confidentiality functions. For such features, Rserpool depends on the IPsec protocols or on Transport Layer Security (TLS) protocols for its own security, and on the architecture and/or security features of its user protocols. An overview of possible threats to Reliable Server pooling protocols is given in RFC TREAT [8].

The Rserpool architecture allows the use of different transport protocols for its application and control data exchange. Those transport protocols may have mechanisms for reducing the risk of blind denial-of-service attacks and/or masquerade attacks. If such measures are required by the applications, then it is advised to check the SCTP applicability statement, RFC 3257 [9], for guidance on this issue.

8. Acknowledgments

The authors wish to thank H. Hazewinkel, M. Urena and M. Stillman and many others for their invaluable comments.

References

[1] Tuexen, M., Stewart, R., Shore, M., Xie, Q., Ong, L., Loughney, J. and M. Stillman, "Requirements for Reliable Server Pooling", RFC 3237, January 2002.

[2] Tuexen, M., Stewart, R., Shore, M., Xie, Q., Ong, L., Loughney, J. and M. Stillman, "Architecture for Reliable Server Pooling", Draft in progress, October 2002.

[3] Stewart, R., Xie, Q., Stillman, M. and M. Tuexen, "Aggregate Server Access Protocol (ASAP)", Draft in progress, October 2002.

[4] Xie, Q., Stewart, R. and M. Stillman, "Endpoint Name Resolution Protocol (ENRP)", Draft in progress, October 2002.

[5] Stewart, R., Xie, Q., Stillman, M. and M. Tuexen, "Aggregate Server Access Protocol and Endpoint Name Resolution Protocol Common Parameters", Draft in progress, October 2002.

[6] Conrad, P. and P. Lei, "Services Provided By Reliable Server Pooling", Draft in progress, January 2003.

[7] Stewart, R., Xie, Q., Morneault, K., Sharp, C., Schwarzbauer, H., Taylor, T., Rytina, I., Kalla, M., Zhang, L. and V. Paxson, "Stream Control Transmission Protocol", RFC 2960, October 2000.

[8] Stillman, M., Gopal, R., Sengodan, S., Guttman, E. and M. Holdrege, "Threats Introduced by Rserpool and Requirements for Security in response to Threats", RFC zzzz, Nov 2002.

[9] Coene, L., "Stream Control Transmission Protocol Applicability statement", RFC 3257, April 2002.

Authors' Addresses

Lode Coene
Siemens
Atealaan 32
Herentals 2200
Belgium

Phone: +32-14-252081
EMail: lode.coene@siemens.com

Phil Conrad
University of Delaware
USA

Phone: +
EMail: pconrad@acm.org

Peter Lei
Cisco
8735 W Higgins Rd, Suite 300
Chicago, IL 60631
USA

Phone: +1 847 870 7201
EMail: peter.lei@ieee.org

Intellectual Property Statement

The IETF takes no position regarding the validity or scope of any intellectual property or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; neither does it represent that it has made any effort to identify any such rights. Information on the IETF's procedures with respect to rights in standards-track and standards-related documentation can be found in BCP-11. Copies of claims of rights made available for publication and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementors or users of this specification can be obtained from the IETF Secretariat.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights which may cover technology that may be required to practice this standard. Please address the information to the IETF Executive Director.

Full Copyright Statement

Copyright (C) The Internet Society (2003). All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works.
However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assignees.

This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgment

Funding for the RFC Editor function is currently provided by the Internet Society.