MMUSIC                                                      J. Rosenberg
Internet-Draft                                               dynamicsoft
Expires: June 26, 2003                                 December 26, 2002


 Proposed Changes to Connection Oriented Media Handling in the Session
                       Description Protocol (SDP)
                 draft-rosenberg-mmusic-comedia-fix-00

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on June 26, 2003.

Copyright Notice

   Copyright (C) The Internet Society (2002).  All Rights Reserved.

Abstract

   This document describes changes that are proposed to
   draft-ietf-mmusic-sdp-comedia-04, which describes the handling of
   connection oriented media in the Session Description Protocol (SDP).
   These changes are motivated by problems encountered in using comedia
   to support instant messaging sessions in the SIMPLE working group.

Rosenberg                 Expires June 26, 2003                 [Page 1]
Internet-Draft              Comedia Changes                December 2002

Table of Contents

   1.  Introduction
   2.  Overview of Comedia
   3.  Problems with Comedia
   3.1 Port Multiplicity on Servers
   3.2 No Connection Reuse
   3.3 Security Considerations
   4.  Proposed Changes to Comedia
   4.1 The Endpoint ID
   4.2 Decoupled Lifetimes
   5.  Conclusion
   6.  Security Considerations
   7.  IANA Considerations
   8.  Contributors
       Informative References
       Author's Address
       Intellectual Property and Copyright Statements

1.  Introduction

   The Session Description Protocol (SDP) [4] defines a syntax for
   describing multimedia sessions, including audio, video, and text.
   SDP is used by several other Internet protocols, including the
   Session Initiation Protocol (SIP) [5] and the Real Time Streaming
   Protocol (RTSP) [6].  When used by SIP, SDP is exchanged by each
   participant using an offer/answer model [8].

   Although SDP is meant to support a variety of different sessions,
   its primary usage was for sessions carried using the Real Time
   Transport Protocol (RTP) [7].  RTP is carried over UDP.  As a result
   of this, although TCP-based sessions were discussed in RFC 2327,
   they were not the focus.

   Recent applications of SDP for describing TCP-based sessions led to
   the discovery of some gaps in SDP.  To remedy those gaps, a set of
   extensions to SDP for supporting connection oriented media were
   developed, generally referred to as the comedia [2] extension.
   These attributes help resolve which participant in a two-party
   session will open a connection, when the connection is to be closed,
   and when each participant can cease listening on its advertised
   ports.

   Since the development of the comedia extension, another application
   has materialized.
   This application is instant messaging sessions [1].  It has been
   proposed that IM sessions should be transported using a relatively
   simple protocol called "CPIM over TCP" [3].  Since CPIM over TCP
   runs over TCP (big surprise), it can make use of comedia.  However,
   it was discovered that comedia was unable to meet several of the
   needs for IM sessions.

   This document first briefly summarizes the comedia extension.  Then,
   it outlines the problems identified with comedia.  Lastly, it
   proposes new behavior and attributes to fix these problems.

2.  Overview of Comedia

   The comedia specification is relatively simple.  The main issue with
   TCP-based sessions is who will open the connection.  Ideally, a
   session between a pair of participants will use a single TCP
   connection to exchange data in both directions.  Since a connection
   is opened only by one participant, there must be a way to determine
   who will open it.  This determination is made through the
   a=direction attribute defined in the specification.  It can contain
   three values - active, passive, or both.  Active means that the
   participant would like to initiate the connection.  Passive means
   that it would like to receive the connection.  Both means that it
   can do either.  If one participant sends an SDP with an active
   attribute, the other sends an SDP with the passive attribute, and
   the active side opens to the passive side.  If both sides indicate a
   direction of "both", they both open connections, and if both
   succeed, an attempt is made to use just one of them, allowing the
   other to be closed after timeout.

   Comedia provides rules for connection lifetime management.  A client
   has to listen on an advertised port until a connection has been
   established on it.  The connection is kept open for the lifetime of
   the session.  Comedia also specifies that a connection closure
   requires a new exchange of SDP to re-open the connection, if it
   still needs to be used.
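   The direction negotiation summarized earlier in this section can be
   sketched roughly as follows.  This is an illustration only, not text
   from the comedia specification: the function name
   resolve_direction and the "offerer"/"answerer" labels are
   hypothetical, and the handling of an active/both or passive/both
   pairing is an assumption, since the summary above only covers the
   active/passive and both/both cases.

```python
from typing import Optional, Tuple

VALID_DIRECTIONS = {"active", "passive", "both"}


def resolve_direction(offer: str, answer: str) -> Optional[Tuple[str, ...]]:
    """Decide which side(s) open the TCP connection.

    Returns a tuple naming the side(s) that connect ("offerer",
    "answerer"), or None when the two directions are incompatible
    (for example active/active). Hypothetical helper, not a comedia API.
    """
    if offer not in VALID_DIRECTIONS or answer not in VALID_DIRECTIONS:
        raise ValueError("direction must be active, passive, or both")

    openers = []
    # A side opens a connection when it is willing to initiate and the
    # peer is willing to receive. Treating "both" as willing to do
    # either for mixed pairings is an assumption of this sketch.
    if offer in ("active", "both") and answer in ("passive", "both"):
        openers.append("offerer")
    if answer in ("active", "both") and offer in ("passive", "both"):
        openers.append("answerer")
    return tuple(openers) or None
```

   With both/both, the sketch reports that each side opens a
   connection, matching the case in which two connections may briefly
   exist before one of them is chosen.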

   Comedia also allows the active side of the connection to indicate
   the source IP/port it will use when opening the connection.  The
   specification describes two uses.  One is in firewalls, which can
   allocate a "pinhole" for this session, using a full five-tuple,
   rather than just a cone (using the destination information only)
   when the source is unknown.  The second use is allowing the passive
   side to disambiguate which client is connecting, in the case of port
   reuse.  This has clear problems with NAT, which are mentioned in the
   spec.

3.  Problems with Comedia

   We have identified several key problems with the specification as
   currently written.  Most of them manifest themselves when
   considering usage of comedia on servers that support multiple
   sessions (such as gateways to the PSTN), or on intermediaries.

3.1 Port Multiplicity on Servers

   The first problem with comedia is that, for all intents and
   purposes, a user agent needs to allocate a separate port for each
   session that it handles.  This constraint is not an issue for
   single-user devices, like PCs or wireless phones, but it is a
   serious constraint for servers handling multiple sessions.  Examples
   of such servers include gateways to other networks, such as an SMS
   gateway, and application servers, such as an IM recording
   application.

   To its credit, comedia does say that a UA can reuse a port for
   multiple sessions.  However, in order to do this, a server must be
   able to correlate a connection with the session that is using it.
   Consider an SMS gateway that receives SIP INVITE requests for IM
   sessions.  In the SDP answer it generates, it always places the same
   IP address and port information.  Each client that connects to it
   opens a new TCP connection to the server.  However, the gateway
   needs to know which session each connection is associated with.
   When it receives an SMS from the PSTN, it will look up the intended
   recipient, and figure out which SIP dialog to send the message on.
   From that, it needs to determine which connection to use.

   How is such a matching done?  Comedia suggests that it is
   accomplished using the source IP address and port of the TCP
   connection request.  When the client sends its SDP offer to the
   gateway, it can include the source IP and port it intends to use.
   The gateway could match this up with the source address of the TCP
   SYN packets.  However, this mechanism completely fails in the
   presence of NAT.  Since the gateway doesn't know whether the client
   is connecting through NAT, it has no way to know whether it can
   trust the source address information.  This limits connection
   sharing to very specific network deployments where all participants
   are aware that there are no NATs.  Unfortunately, that covers very
   few cases.

   So, since port sharing is basically impossible for all but the most
   specific cases, the server must allocate a separate port for each
   incoming session, so that the port number itself is used to
   correlate each connection to the session.  Besides having known
   scaling problems, this makes firewall policy configuration
   complicated, since a single well-known port cannot be used by the
   server.

3.2 No Connection Reuse

   A more serious problem than the one outlined in Section 3.1 is that
   there is no possibility of connection reuse between elements which
   share multiple sessions.  This occurs between servers (such as a
   pair of SMS gateways), and also between intermediaries.

   [Figure 1: Enterprise foo and Enterprise bar, each containing
   several clients and a border proxy behind a firewall; the two
   proxies are linked through the firewalls by a TLS connection.]

   Consider the network of Figure 1.  In this network, there are two
   enterprises, foo and bar.  These enterprises are both connected to
   the public Internet, and would like to "federate".  This means that
   they would like to communicate with each other through a secure
   connection between the enterprises.  To do that, both enterprises
   deploy a SIP proxy on the border of their network.  These proxies,
   in addition to processing SIP traffic, also act as an intermediary
   for instant messaging sessions.  A single TLS connection is used in
   each direction, terminating on port 5061, between the proxies to
   handle the SIP traffic.  Furthermore, it is desired that a single
   TLS connection be used between the enterprises (or two, one in each
   direction), terminating on a well-known port.  This is important for
   purposes of congestion control, scale, and security.  The
   enterprises would like to configure their firewalls to allow
   outbound connections to the proxy in the peer network, but only to
   the specific port used for the instant messaging traffic.  Indeed,
   RFC 2779 [9] explicitly identifies this as a requirement.

   This configuration is not practical with comedia.  The reason is
   that the comedia specification tightly binds the lifetime of the
   connection and the lifetime of the sessions.  Each proxy would need
   to reference count the number of sessions using the connection.
   Only when the number of sessions using the connection goes to zero
   could the proxies close the connection.
   Worse, if the connection were closed due to some kind of network
   outage, the only way to re-establish it would be for the proxies to
   generate re-INVITEs.  This would require full back-to-back user
   agent (B2BUA) functionality, rather than just a proxy.

   Furthermore, as described in Section 3.1, internally, each proxy
   would need to allocate a separate port for each client that connects
   to it.  This has serious scale problems.

3.3 Security Considerations

   Comedia introduces some important security considerations which are
   not discussed in the current specification.

   Consider an attacker who wishes to hijack a call, so that Alice
   thinks she is calling Bob, but instead communicates with the
   attacker, Carol.  Alice sends a SIP INVITE request with SDP that
   contains some comedia sessions.  Alice indicates a direction
   attribute of "both".  Carol is capable of eavesdropping on, but not
   modifying, the contents of the messages.  Carol observes the IP
   address and port listed by Alice.  Similarly, Bob places his IP
   address and port in the answer, which is also observed by Carol.
   Carol then opens two TCP connections - one to the address indicated
   by Alice, and one to the address indicated by Bob.  Both of these
   succeed, and are established before the connections initiated by
   Alice and Bob themselves.  Carol immediately uses the connections
   she just established, forcing Alice and Bob to use those connections
   instead of the ones each just established to the other.  As a
   result, Carol can now communicate bidirectionally with Alice and
   with Bob, even though Alice and Bob think they are talking to each
   other.

   This attack doesn't occur with normal SDP/RTP usage.  Hijacking a
   media stream in that case requires an attacker to act as a
   man-in-the-middle (MITM), modifying the IP address and port
   information.  Here, the attacker need only eavesdrop.
   Since eavesdropping is much simpler to do than acting as a MITM,
   comedia significantly increases the risk of this attack.

   Sadly, this attack is difficult to prevent.  Even if SIP S/MIME and
   the sips URI scheme are used, the attack can still take place.  As
   long as the attacker can guess the IP address and port that either
   Alice or Bob will use, once the attacker observes a communication
   attempt, they can launch the attack against the range of ports which
   are likely to be used.  The only way to prevent this attack is with
   secure media streams, which are difficult to deploy.

   Furthermore, the current security considerations described by
   comedia cannot be implemented.  The specification indicates that
   firewalls should make sure that the SDP comes from a trusted source
   before establishing the pinholes implied by the SDP.  However,
   firewalls are intermediaries on the signaling paths.  As such, they
   cannot participate in authentication mechanisms such as digest and
   S/MIME.  There is, therefore, no way to reliably authenticate the
   sender of the SDP.  Furthermore, the source address present in the
   SDP may not be correct, because the SDP has passed through a NAT
   which did not implement an ALG function for this SDP parameter.

4.  Proposed Changes to Comedia

   We propose several changes to the operation of comedia in order to
   address these shortcomings.  Generally speaking, these changes are
   aimed at two specific goals.  The first is to decouple the
   connection lifetimes and session lifetimes.  The second is to
   provide a way to bind connections to sessions which does not require
   IP addressing information.

4.1 The Endpoint ID

   The first part of this proposal is the usage of an endpoint
   identifier to bind connections to sessions.  Each endpoint that
   participates in a comedia session would choose a globally unique,
   cryptographically random ID which serves as an endpoint identifier.
   Single-user endpoints can choose a different identifier for each
   session.  Intermediaries would need to use more long-lived
   identifiers.  When a client generates an offer, it includes its
   endpoint identifier in the SDP, as an attribute associated with the
   comedia session.  The answerer responds with its own endpoint ID.

   Both sides proceed to open connections, as described in the comedia
   specification.  Once the connection is established, each side sends
   its own endpoint ID to the other side.  The active side always sends
   its ID first.

      Offerer               Answerer
         |                     |
         |(1) SIP INVITE       |
         |offer                |
         |c=1.2.3.4            |
         |m=message 7766       |
         |a=eid:3344           |
         |-------------------->|
         |(2) 200 OK           |
         |answer               |
         |c=5.6.7.8            |
         |m=message 8899       |
         |a=eid:1122           |
         |<--------------------|
         |                     |
         |(3) ACK              |
         |-------------------->|
         |                     |
         |(4) TCP connect      |
         |to 5.6.7.8:8899      |
         |-------------------->|
         |                     |
         |(5) EID=3344         |
         |-------------------->|
         |                     |
         |(6) EID=1122         |
         |<--------------------|
         |                     |
         |(7) TCP connect      |
         |to 1.2.3.4:7766      |
         |<--------------------|
         |                     |
         |(8) EID=1122         |
         |<--------------------|
         |                     |
         |(9) EID=3344         |
         |-------------------->|
         |                     |

                    Figure 2

   The basic flow for connection establishment is shown in Figure 2.
   The offerer sends a SIP INVITE request with SDP (1).  This SDP offer
   indicates a single media session (instant messaging), using comedia.
   The offerer is listening on IP address 1.2.3.4 port 7766.  The
   offerer also provides an endpoint ID, 3344.  The answerer indicates
   its IP address and port (5.6.7.8:8899) and its endpoint ID, 1122
   (2).  The offerer opens a TCP connection to the answerer (4).  Once
   the connection is established, the first bytes sent by the offerer
   are its endpoint ID (3344) (5), and the first bytes sent by the
   answerer are its endpoint ID (1122) (6).  In steps (7) through (9),
   the answerer likewise opens a connection to the offerer, and, being
   the active side of that connection, sends its endpoint ID first.
   The endpoint IDs are exchanged only during connection establishment,
   at the very beginning.  Once they are exchanged, the data sent over
   the connection depends on the session type.
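
   The endpoint ID exchange at the start of a connection might be
   sketched as below.  This document does not define a wire format for
   the endpoint ID, so the newline-terminated "EID=<id>" framing used
   here is purely an assumption for illustration, as are the function
   names send_eid, recv_eid, exchange_eids, and demo.

```python
import socket
import threading

MAX_EID_LINE = 256  # arbitrary cap for this sketch


def send_eid(sock: socket.socket, eid: str) -> None:
    # Hypothetical framing: "EID=<id>" followed by a newline.
    sock.sendall(f"EID={eid}\n".encode("ascii"))


def recv_eid(sock: socket.socket) -> str:
    # Read one byte at a time so that no session data following the
    # endpoint ID is consumed by accident.
    buf = bytearray()
    while not buf.endswith(b"\n"):
        chunk = sock.recv(1)
        if not chunk:
            raise ConnectionError("peer closed before sending its endpoint ID")
        buf += chunk
        if len(buf) > MAX_EID_LINE:
            raise ValueError("endpoint ID line too long")
    key, _, value = buf.decode("ascii").strip().partition("=")
    if key != "EID" or not value:
        raise ValueError("malformed endpoint ID line")
    return value


def exchange_eids(sock: socket.socket, my_eid: str, active: bool) -> str:
    """Exchange endpoint IDs; the active side sends first."""
    if active:
        send_eid(sock, my_eid)
        return recv_eid(sock)
    peer_eid = recv_eid(sock)
    send_eid(sock, my_eid)
    return peer_eid


def demo() -> dict:
    # A connected socket pair stands in for the TCP connection of
    # steps (4) through (6) in Figure 2.
    offerer_sock, answerer_sock = socket.socketpair()
    seen = {}

    def answerer() -> None:
        seen["answerer_saw"] = exchange_eids(answerer_sock, "1122", active=False)

    worker = threading.Thread(target=answerer)
    worker.start()
    seen["offerer_saw"] = exchange_eids(offerer_sock, "3344", active=True)
    worker.join()
    offerer_sock.close()
    answerer_sock.close()
    return seen
```

   Calling demo() runs both roles over a local socket pair; each side
   ends up holding the other's endpoint ID, after which session data
   would follow on the same connection.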

   The purpose of these endpoint IDs is to provide correlation.  When
   the answerer accepts the TCP connection, and receives the endpoint
   ID 3344 over it, it knows that the connection is associated with the
   SIP dialog/offer-answer exchange that just occurred.  A multi-user
   device (such as a gateway) could advertise a single IP address and
   port, accepting connections on it.  As these connections are
   established, each client would send a different endpoint ID,
   allowing the gateway to determine which dialog/session the
   connection is associated with.

   This mechanism directly addresses the problem outlined in Section
   3.1.  A single TCP port can be used for all media sessions of a
   specific type.  Sessions can be correlated to the connection using
   an ID not directly related to an IP address.  The mechanism
   therefore works through NAT.

   A side-effect of this proposal is to address the security problem
   described in Section 3.3.  Without any additional security measures,
   our proposal is subject to the same attack.  However, preventing it
   is much easier.  If clients choose sufficiently random endpoint IDs,
   so long as the SIP packets cannot be eavesdropped, an attacker
   cannot hijack the session.  Even if the attacker can guess the IP
   address and port, if they cannot send the correct endpoint ID, the
   connection is ignored.  Therefore, protecting against the attack
   only requires confidentiality protection of the SIP packets.  This
   can be done with sips, or with S/MIME.  With the current comedia
   specification, neither sips nor S/MIME is sufficient to prevent this
   attack.

   We would also propose that the source address information be removed
   from the direction attribute.  It is unreliable for connection
   identification, and cannot be validated by a firewall.  We feel that
   the brittleness it introduces far outweighs the potential benefits.
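
   The correlation role of the endpoint ID described in this section
   could be sketched as a small lookup table on a multi-user device.
   This is a minimal illustration under stated assumptions: the class
   SessionRegistry, its method names, and the use of opaque strings for
   dialogs and connections are all hypothetical and not part of this
   proposal.

```python
from typing import Dict, List, Optional


class SessionRegistry:
    """Maps peer endpoint IDs learned from SDP to SIP dialogs.

    Hypothetical helper for a gateway that accepts every connection
    on one advertised IP address and port.
    """

    def __init__(self) -> None:
        self._dialog_by_eid: Dict[str, str] = {}
        self._conns_by_dialog: Dict[str, List[object]] = {}

    def expect(self, peer_eid: str, dialog_id: str) -> None:
        # Called when an offer/answer exchange completes and the
        # peer's endpoint ID is known from its SDP.
        self._dialog_by_eid[peer_eid] = dialog_id
        self._conns_by_dialog.setdefault(dialog_id, [])

    def bind(self, peer_eid: str, conn: object) -> Optional[str]:
        # Called when an endpoint ID arrives on a new connection.
        # An unknown ID returns None so the caller can ignore the
        # connection, whatever its source address and port were.
        dialog_id = self._dialog_by_eid.get(peer_eid)
        if dialog_id is None:
            return None
        self._conns_by_dialog[dialog_id].append(conn)
        return dialog_id

    def connections(self, dialog_id: str) -> List[object]:
        # More than one connection may carry the same endpoint IDs.
        return list(self._conns_by_dialog.get(dialog_id, []))
```

   Because the lookup key is the endpoint ID rather than the source
   address of the TCP connection, the sketch behaves the same whether
   or not a NAT sits between the client and the gateway.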

4.2 Decoupled Lifetimes

   We also propose that connection lifetimes be decoupled from session
   lifetimes.  Specifically, we would propose that the active side of
   the connection MAY close it at any time.  The passive side SHOULD
   let the active side close it.  If a participant indicated a
   direction of active or both, and it needs to send data, but no
   connection is open, it can open a new connection to the same IP
   address and port last received from its peer.  No new SIP signaling
   is required.  The endpoint ID exchange occurs as above when the new
   connection is established.  A participant MUST continue to listen on
   its advertised address and port, and MUST be prepared to support
   multiple connections with the same endpoint ID pairs (that is, two
   connections which both resulted in the exchange of the same endpoint
   IDs).  In that case, the connections are treated as equivalent, and
   either can be used.  The active side may close one of them at any
   time.

   The effect of this proposal is that the TCP connection can be opened
   and closed during the lifetime of a session.  It is not strongly
   bound to the session, and indeed, the same connection could be
   reused for subsequent sessions, as long as the underlying media type
   (such as instant messages) supports a de-multiplex function.

   This addresses the problem of Section 3.2.  The intermediaries can
   use a single connection for the messaging traffic.  So long as both
   intermediaries indicated a direction of "both", either side could
   close the connection at will, and reopen it when they had data to
   send.  This avoids the need for the intermediaries to be session
   stateful.  They no longer need to reference count sessions on the
   connection.  Should the connection close due to network outage, it
   can be reopened at any time.

   The only difficulty is when there are NATs between the active and
   passive side.
   If the passive side closes the connection, and then determines that
   it has data to send, it will not be able to reopen the connection.
   That is why the passive side of the connection SHOULD let the active
   side close it.  Of course, the corollary is that the active side
   shouldn't close the connection until it knows it is done with it.
   In the case of a single-user device, this means it would keep the
   connection open for the lifetime of the session.

   With this new behavior, the reconnect attribute is no longer needed.
   If one side wishes to force the other to reconnect, it merely drops
   the connection.  When the other side has data to send, it will
   establish a new connection.

5.  Conclusion

   We have proposed a number of changes to the semantics and syntax of
   the comedia specification.  We believe these changes will allow
   comedia to be relevant for many useful situations, whereas its
   current specification limits it to corner cases.  We do not believe
   this proposal introduces any increase in complexity.

6.  Security Considerations

   Section 3.3 discusses security considerations introduced by
   connection oriented media.  We believe this proposal helps resolve
   the security considerations described there.

   If the endpoint ID is chosen randomly for each new session, the
   hijacking attack is mitigated using sips or S/MIME.  However,
   intermediaries will want to maintain a more long-lived endpoint ID
   in order to facilitate connection reuse.  Once the endpoint ID
   becomes long lived, it loses its utility in preventing the hijacking
   attack.  For this reason, intermediaries need to be very careful
   about use of endpoint IDs.  It is RECOMMENDED that the same endpoint
   ID be reused only in communicating with a next hop with whom the
   element has an administrative relationship, through which it knows
   connection reuse is desired.
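
   One way to picture the endpoint ID policy described in this section
   is sketched below.  This document does not specify the length or
   encoding of an endpoint ID, so the 128-bit hexadecimal form is an
   assumption, and the names new_endpoint_id, EndpointIdPolicy, and
   federated_peers are hypothetical.

```python
import secrets
from typing import Dict, Iterable


def new_endpoint_id(nbytes: int = 16) -> str:
    # Draws from the operating system's cryptographically secure
    # random source; 16 bytes (128 bits) is an assumed size.
    return secrets.token_hex(nbytes)


class EndpointIdPolicy:
    """Reuse an endpoint ID only toward administratively known peers."""

    def __init__(self, federated_peers: Iterable[str]) -> None:
        self._federated = set(federated_peers)
        self._long_lived: Dict[str, str] = {}

    def eid_for(self, next_hop: str) -> str:
        if next_hop in self._federated:
            # Long-lived ID, kept per next hop so that connections to
            # that hop can be shared across sessions.
            if next_hop not in self._long_lived:
                self._long_lived[next_hop] = new_endpoint_id()
            return self._long_lived[next_hop]
        # Any other peer gets a fresh ID for each session.
        return new_endpoint_id()
```

   A single-user endpoint would simply call new_endpoint_id() for each
   session, while an intermediary would consult the policy with the
   next hop it is about to contact.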

   [[OPEN ISSUE: This limits connection reuse to the federated model
   only.  Is there a way to reuse connections in an ad-hoc fashion
   without incurring security risk?  Need to think some more on that.]]

7.  IANA Considerations

   There are no IANA considerations associated with this specification.

8.  Contributors

   This work was the result of collaborations with Ben Campbell and
   Paul Kyzivat.

Informative References

   [1]  Rosenberg, J. and B. Campbell, "Instant Message Sessions in
        SIMPLE", draft-campbell-simple-im-sessions-00 (work in
        progress), October 2002.

   [2]  Yon, D., "Connection-Oriented Media Transport in SDP",
        draft-ietf-mmusic-sdp-comedia-04 (work in progress), July 2002.

   [3]  Campbell, B., "Instant Message Transport Sessions using the
        CPIM Message Format", draft-campbell-simple-cpimmsg-sessions-00
        (work in progress), October 2002.

   [4]  Handley, M. and V. Jacobson, "SDP: Session Description
        Protocol", RFC 2327, April 1998.

   [5]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A.,
        Peterson, J., Sparks, R., Handley, M. and E. Schooler, "SIP:
        Session Initiation Protocol", RFC 3261, June 2002.

   [6]  Schulzrinne, H., Rao, A. and R. Lanphier, "Real Time Streaming
        Protocol (RTSP)", RFC 2326, April 1998.

   [7]  Schulzrinne, H., Casner, S., Frederick, R. and V. Jacobson,
        "RTP: A Transport Protocol for Real-Time Applications", RFC
        1889, January 1996.

   [8]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with
        Session Description Protocol (SDP)", RFC 3264, June 2002.

   [9]  Day, M., Aggarwal, S. and J. Vincent, "Instant Messaging /
        Presence Protocol Requirements", RFC 2779, February 2000.

Author's Address

   Jonathan Rosenberg
   dynamicsoft
   72 Eagle Rock Avenue
   East Hanover, NJ  07936
   US

   Phone: +1 973 952-5000
   EMail: jdrosen@dynamicsoft.com
   URI:   http://www.jdrosen.net

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   intellectual property or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; neither does it represent that it
   has made any effort to identify any such rights.  Information on the
   IETF's procedures with respect to rights in standards-track and
   standards-related documentation can be found in BCP-11.  Copies of
   claims of rights made available for publication and any assurances
   of licenses to be made available, or the result of an attempt made
   to obtain a general license or permission for the use of such
   proprietary rights by implementors or users of this specification
   can be obtained from the IETF Secretariat.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights which may cover technology that may be required to practice
   this standard.  Please address the information to the IETF Executive
   Director.

Full Copyright Statement

   Copyright (C) The Internet Society (2002).  All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph
   are included on all such copies and derivative works.  However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assignees.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

   Funding for the RFC Editor function is currently provided by the
   Internet Society.