Robust Header Compression                                     Tom Hiller
Internet Draft                                               Pete McCann
Document: draft-hiller-rohc-gehco-01.txt             Lucent Technologies
                                                              March 2001


                 Good Enough Header COmpression (GEHCO)


Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026 [1].

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

1. Abstract

The Robust Header Compression Working Group has embarked upon the development and standardization of header compression schemes that perform well over links with high error rates and long round-trip times. The goal is that the schemes must perform well for cellular links built using technologies such as WCDMA, EDGE, and cdma2000.

Of particular importance to the 3GPP2 community is the spectral efficiency of IP/UDP/RTP header compression for the voice over IP application to mobiles. Currently, the ROHC protocol adds at least one byte of overhead to each frame, when compared with ordinary circuit voice. This draft proposes a new zero-byte profile for ROHC to enable the use of physical channel timing as a substitute for sequence numbers. This allows for vocoded frames to be sent without any headers whatsoever, dramatically reducing the bandwidth required for voice over IP flows.

2.
Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC-2119 [2].

Hiller, McCann            Expires 09/2001            GEHCO March, 2001

3. Introduction

End-to-end IP-based multimedia applications, enabled by call control protocols such as SIP [3], will allow the rapid development of new kinds of communication services. In wireless applications, the need will then arise to support the transport of RTP packets to and from a mobile endpoint.

Voice sessions transmit short frames of data, usually on a continuous basis according to known activity factors. The frequency and length of the voice packets require compression of RTP packet overhead in order to reduce the bandwidth used by the call, because the uncompressed headers will often be longer than the codec data itself. If efficiency comparable to the existing circuit voice service cannot be obtained, carriers will be hard pressed to justify deployment of end-to-end IP-based multimedia.

The Robust Header Compression Working Group has standardized a header compression scheme that performs well over links with high error rates and long round-trip times. The current standardized ROHC algorithms are able to reduce the IP/UDP/RTP overhead to one or two bytes on average, and to tolerate some loss of packets while still maintaining local decompression state without complete retransmission of a new IP/UDP/RTP header. In addition to the one or two bytes of header overhead, some amount of bandwidth is necessary for feedback between decompressor and compressor, or to retransmit a partial or complete header when the local state cannot be repaired.

Of particular importance to the 3GPP2 community is the spectral efficiency of IP/UDP/RTP header compression for the voice-over-IP application to mobiles.
3GPP2 carriers have indicated that voice over IP must have spectral efficiency comparable to legacy circuit transport over-the-air in order for them to deploy voice-over-IP to mobiles. To achieve such spectral efficiency, IP/UDP/RTP header compression must not transport any additional bytes over-the-air.

An earlier version of this draft proposed the use of the real-time physical channel timing to convey a sequence number to the decompressor. Other types of information such as static fields like IP addresses are sent over a sister data link that is assumed to exist between compressor and decompressor, in addition to the over-the-air connection that carries codec data. Because header information was sent once and never updated, this method did not exactly reproduce all bits of the IP/UDP/RTP header; however, the behavior of the reproduced header could have been "good enough" for many voice-over-IP applications over cellular links. Such header compression was categorized as non-transparent header compression.

Feedback from ROHC membership indicated that bit-identical transparency should be maintained as much as possible. This draft proposes a new ROHC profile for zero-byte compression that allows for context updates. Each update carries information relating it to the real-time stream of vocoder frames, enabling it to be properly placed with respect to the data stream. This allows for any field changes resulting from updates in the IP, UDP, or RTP header to be properly communicated to the decompressor and applied at the right point in the stream. Also, any slippage of sequence numbers with respect to the physical channel timing due to clock drift or channel reset events can be communicated and fixed.

A basic premise of this draft is that the codec data carried on the real-time physical link should be sent in a format that is largely unchanged from the existing circuit voice interface.
This will allow for the flexible development of new codecs that can make use of every bit of available payload without restrictions of any kind. The authors believe that embedding of ROHC overhead within the codec stream will be cumbersome for some phones and data nodes. In any case, it is not possible to embed all messages (both initial and dynamic updates) in the codec stream without introducing segmentation and reassembly mechanisms, because some updates will be longer than the longest air frame supported by the physical link.

This draft proposes a new ROHC profile that maintains transparency but does not embed ROHC messages in the codec stream. Instead we continue to rely on a sister data link to carry full header updates but now add synchronous information to these updates. As with the original GEHCO proposal, the scheme is applicable to certain cellular links that synchronously transport vocoded or other multimedia payloads so that the compressor may be certain that the compressed frame will be delivered with a fixed and predictable delay. One example of such a cellular link is the cdma2000 air interface with cdma2000 vocoders.

In the next section, this draft discusses assumptions about the underlying link layer that make zero byte compression schemes possible. The draft then proposes a protocol in the form of a ROHC profile for transparent zero byte RTP header compression, making extensive use of existing ROHC messages. However, instead of inserting such ROHC overhead in the codec stream, this profile uses the sister data link, adding only a couple of new extensions to current ROHC message fields. Of course, there is some overhead when packet fields must be updated, but we expect such updates to be rare.

4. cdma2000 Link Characteristics and Implications

The cdma2000 link synchronously transports physical frames every 20ms.
The number of bytes in a physical frame varies based on signaling negotiation between the user and the radio network as well as other criteria, such as backlog (for the forward direction, from the network to the mobile). The physical frames are sent in over-the-air connections. Currently, six such connections may exist simultaneously. The connections feature two modes, one in which the physical frames are subject to retransmission, and another in which they are not. In cdma2000, the over-the-air connections are referred to as service instances.

The mode with retransmission serves to improve the error rate and thereby improve the throughput of TCP. It has variable delay. The mode without retransmission behaves synchronously, and each 20ms frame has a fixed number of bits such that the physical frame will be delivered in 20ms. There are four types of physical frames: full rate (171 bits), half rate (80 bits), quarter rate (40 bits) and eighth rate (16 bits). One reason for these different "rate" frames is variable rate vocoding. Depending on speech activity and patterns, the vocoder puts out a number of bytes that exactly fits one of these four physical frame sizes.

The over-the-air connections have priority. Voice payloads are sent in over-the-air connections with high priority in the mode without any retransmission. It is possible to say with certainty for these kinds of frames that if a physical frame is transmitted it will be delivered 20ms later, if it is received at all. We consider the frame to be dropped if it has uncorrectable errors.

Because the delivery of these frames follows a very precise timing hierarchy, the underlying physical timing of the channel may be used to convey sequence number information to the decompressor, thus avoiding the overhead of actually sending sequence numbers.
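
To make the cost of per-frame header octets concrete, the following Python sketch compares header sizes with the four physical frame sizes listed above. It is an illustration only: the header sizes are the standard minimum IPv4 (20 octets), UDP (8 octets), and RTP (12 octets) header lengths, and the names FRAME_BITS and overhead_ratio are hypothetical, not part of this draft.

```python
# Physical frame sizes (bits) for the non-retransmitted mode, as listed above.
FRAME_BITS = {"full": 171, "half": 80, "quarter": 40, "eighth": 16}

# Minimum IPv4 + UDP + RTP header (no IP options, no CSRC entries), in bits.
UNCOMPRESSED_HEADER_BITS = (20 + 8 + 12) * 8

# One octet of compressed header per frame, the minimum cited in the Abstract.
ONE_BYTE_HEADER_BITS = 8


def overhead_ratio(header_bits: int, payload_bits: int) -> float:
    """Return the header size as a fraction of the payload size."""
    return header_bits / payload_bits


if __name__ == "__main__":
    for rate, bits in FRAME_BITS.items():
        full = overhead_ratio(UNCOMPRESSED_HEADER_BITS, bits)
        one = overhead_ratio(ONE_BYTE_HEADER_BITS, bits)
        print(f"{rate:8s} {bits:4d} bits  uncompressed x{full:.2f}  one-byte {one:.1%}")
```

Under these assumptions an uncompressed header is larger than even a full rate frame, and a single header octet is half the size of an eighth rate frame.
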

In addition to the foregoing use of underlying physical timing, 3G cellular data links offer another feature that may be exploited to great advantage in a transparent zero byte header compression scheme: link start time. Link start time is the absolute real time at which the network tells the mobile to transmit the first frame on an over-the-air connection. In cdma2000, link start time is referred to as "action time". Link start time is thus a synchronization event between the network and mobile node.

To achieve transparency, the network first conveys link start time to the compressor-decompressor pairs (for both directions). The compressor then sends a full header in the form of an IR over the sister data link to the decompressor. This IR contains an extension that associates a time offset from the action time with the particular full header (i.e. an RTP sequence number and RTP timestamp plus other packet fields). Recall that time offsets are simply measured in over-the-air framing times. In 3G wireless data, a frame time as explained above is 20ms. Thus the time offset is a whole number representing the number of 20ms periods from the action time. The compressor may indicate an offset of packets into the future at which the particular full header will be correct or may indicate a current time. The decompressor will be able to adjust the current timestamp context to the correct timestamp and sequence number based on this information.

The compression context may change for various reasons, e.g. a handoff in which no packets are sent because the physical layer is being re-established and therefore does not exist to carry timing information. In this case the compressor (once notified of the handoff) may send a full header in the form of an IR-DYN to update and re-synchronize the decompressor with new timing and sequence number information.
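
The reconstruction described above can be sketched in a few lines of Python. This is a minimal illustration, not a normative algorithm: the TimingContext class and reconstruct function are hypothetical names, and the timestamp stride of 160 assumes an 8 kHz RTP clock with one 20ms frame per packet, which this draft does not mandate.

```python
from dataclasses import dataclass

SEQ_MOD = 1 << 16   # RTP sequence numbers are 16 bits wide
TS_MOD = 1 << 32    # RTP timestamps are 32 bits wide


@dataclass
class TimingContext:
    """Reference point learned from an IR or IR-DYN on the sister data link."""
    ref_offset: int       # airframe times since link start time (action time)
    ref_seq: int          # RTP sequence number valid at ref_offset
    ref_ts: int           # RTP timestamp valid at ref_offset
    ts_stride: int = 160  # assumption: 8 kHz RTP clock * 20 ms per frame


def reconstruct(ctx: TimingContext, frame_offset: int) -> tuple[int, int]:
    """Return (sequence number, timestamp) for the airframe at frame_offset."""
    delta = frame_offset - ctx.ref_offset
    if delta < 0:
        # The reference applies only from ref_offset onward; earlier frames
        # belong to the previous context.
        raise ValueError("frame precedes the reference offset")
    seq = (ctx.ref_seq + delta) % SEQ_MOD
    ts = (ctx.ref_ts + delta * ctx.ts_stride) % TS_MOD
    return seq, ts
```

For example, with a reference of sequence number 65534 at offset 10, the frame at offset 13 is assigned sequence number 1, since the 16-bit counter wraps.
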

IR-DYN carrying full or partial headers could be optionally sent to refresh the state of the decompressor. We also supply a small Update packet for updating only sequence number, timestamp, and IP-ID fields. Any packet may carry extensions for updating CSRCs, as in basic ROHC-RTP.

The use of a reasonably reliable sister data link with acknowledged messaging, in conjunction with the physical timing characteristics of the 3G data link, provides a basis for a transparent zero byte overhead RTP compression profile.

5. Profile Selection

Given that there are two kinds of header compression, usual ROHC and zero byte header compression, it becomes necessary for the mobile node and network to know when to use usual ROHC header compression for RTP and when to use the zero byte version. The foregoing indicates that the zero byte approach only applies to those links which are tuned to the codec, i.e., for which the framing matches the codec size and which provide timing information.

Our approach is to first assume that the mobile node has an application client (e.g. SIP) that knows the type of media involved. If the application realizes that the codec and link type apply for a zero byte profile, the application arranges (via a suitable API) for the establishment of a suitable bearer and the transmission of a full header in the form of an IR packet to be sent to the network. As in GEHCO, the full header carries a reference to the link used for transport of codec information.

The network side compressor and decompressor, however, do not contain an application client and therefore do not know the codec type. The solution proposed below is for the mobile node to send a partial header in an IR packet that contains the IP addresses, UDP ports, and protocol (RTP) type. This IR also contains the reference to the over-the-air connection (likely the same one as in the other direction) used to carry the codec data.
When the network receives an RTP packet that matches this filter, the network compressor sends an IR packet with sequence number matching the action time to the mobile node decompressor.

6. Zero Byte Header Compression Profile

This section proposes a new profile for ROHC specifically designed to support zero-byte header compression. As with any profile, a new profile code must be assigned by IANA.

6.1 State Machine

We propose the state machine for the zero byte profile in Figure 1.

                          IR
      +------->------------>------------>----------+
      |                                            |
      |                                            |
      |                                            |
      |                           Context          v
+-----------+      +-----------+  Update     +-----------+
|IR State   |      |FO State   |  ------->   |SO State   |
+-----------+      +-----------+             +-----------+
                         |                         |
                         |                         |
                         |         IR-DYN          |
                         +-------------------------+

                   Figure 1: State Machine

In Figure 1, the decompressor begins in the IR state and transitions directly to the SO state upon receipt of a properly formed IR packet. The compressor will resend the IR packet until it receives an acknowledgment. This is important to ensure that the decompressor has a context state with link start time (an offset to the packet time of a given sequence number). The compressor may send zero byte header packets (codec samples) on the over-the-air connection at the same time. The decompressor will discard these zero byte header packets until it receives the IR packet. That is, the decompressor never decompresses while in the IR state.

The decompressor remains in the SO state until some irregularity occurs in the packet stream that requires an unpredictable change in the context state of the decompressor. Examples of irregularities are handoffs during which time there is no physical layer present as well as changes in static headers such as CSRC fields due to users adding or dropping from conference mixers.
When such irregularities occur, the compressor sends an IR-DYN with a partial header to the decompressor. At the same time the compressor continues to send zero byte header packets, i.e. codec samples embedded in units of data the size of native airframes. The decompressor transitions to the FO state upon receipt of the IR-DYN and updates the context. The decompressor continues to decompress the zero byte packets during this time. Once the context has been updated with the IR-DYN data, the decompressor transitions back to the SO state, decompressing packets according to the updated context. The decompressor acknowledges the IR-DYN as usual on the sister data link. A similar behavior applies for an IR-DYN refresh, if the particular implementation supports optional refreshes.

There is no reason for the decompressor to transition to the IR state once it reaches the SO state.

6.2 Mode

This zero byte profile operates over a bi-directional link. We specify acknowledged operation for messaging so this is R-mode with respect to the ROHC draft.

6.3 Packet Types

We require the following packet types for our new profile:

* IR packet
* IR-DYN packet
* Feedback packet
* Update packet

The sister data link in this scheme provides framing so that segmentation services of ROHC are not necessary.

Section 5.7.7 of the ROHC draft states the basic structure of the IR and IR-DYN packets. Our profile extends these formats as shown in Figures 2 and 3.
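
Before turning to the packet formats, the decompressor behavior of Section 6.1 can be summarized as a small transition table. The Python sketch below is one possible reading of Figure 1, not a normative definition: the event names and the helper functions next_state and may_decompress are hypothetical, and leaving the state unchanged on any event not named in Section 6.1 is an assumption.

```python
from enum import Enum


class State(Enum):
    IR = "IR"
    FO = "FO"
    SO = "SO"


# Transitions taken from Figure 1 and the accompanying text.
TRANSITIONS = {
    (State.IR, "IR"): State.SO,              # properly formed IR received
    (State.SO, "IR-DYN"): State.FO,          # irregularity signalled by compressor
    (State.FO, "context-update"): State.SO,  # IR-DYN data applied to the context
}


def next_state(state: State, event: str) -> State:
    """Return the decompressor state after an event.

    Assumption: events not listed in Section 6.1 leave the state unchanged,
    so the decompressor never returns to the IR state.
    """
    return TRANSITIONS.get((state, event), state)


def may_decompress(state: State) -> bool:
    """Zero byte packets are discarded only while in the IR state."""
    return state is not State.IR
```

A decompressor built this way discards codec frames until the first IR arrives and keeps decompressing while an IR-DYN is being applied.
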

  0 1 2 3 4 5 6 7
 +-+-+-+-+-+-+-+-+
 | Add CID octet |  if for small CIDs and CID != 0
 +-+-+-+-+-+-+-+-+
 |1 1 1 1 1 1 0 D|
 +-+-+-+-+-+-+-+-+
 |               |
 + 0-2 octets of +  1-2 octets for large CIDs
 |   CID info    |
 +---------------+
 |    profile    |  1 octet
 +---------------+
 |      CRC      |  1 octet
 +-+-+-+-+-+-+-+-+
 |I| Channel ID  |  1 octet
 +-+-+-+-+-+-+-+-+
 |  Offset from  |
 +  Link Start   |  2 octets
 |     Time      |
 +-+-+-+-+-+-+-+-+
 |               |
 | static chain  |  variable length
 |               |
 +---------------+
 |               |
 | dynamic chain |  variable length
 |               |
 +---------------+

 Figure 2: IR for Zero Byte ROHC RTP Profile

D: D = 1 indicates that the dynamic chain is present

Profile: Profile identifier, abbreviated as defined in section 5.2.3 of the ROHC draft

CRC: 8-bit CRC computed according to section 5.9.1 of the ROHC draft

I: The reference direction of the IR packet. I = 0 means from compressor to decompressor. I = 1 means the packet should be interpreted as a reverse flow spec and only the static chain should be sent.

Channel ID: A reference to the over-the-air connection that carries the zero byte header packets.

Offset: This is the offset in airframe times from the link start time to which the IR packet's information applies

Static chain: A chain of the static subheader information

Dynamic chain: A chain of the dynamic subheader information

  0 1 2 3 4 5 6 7
 +-+-+-+-+-+-+-+-+
 | Add CID octet |  if for small CIDs and CID != 0
 +-+-+-+-+-+-+-+-+
 |1 1 1 1 1 1 0 0|
 +-+-+-+-+-+-+-+-+
 |               |
 + 0-2 octets of +  1-2 octets for large CIDs
 |   CID info    |
 +---------------+
 |    profile    |  1 octet
 +---------------+
 |      CRC      |  1 octet
 +-+-+-+-+-+-+-+-+
 |               |
 +  Offset from  |  2 octets
 |  Action Time  |
 +-+-+-+-+-+-+-+-+
 |  Channel ID   |  1 octet
 +---------------+
 |               |
 | dynamic chain |  variable length
 |               |
 +-+-+-+-+-+-+-+-+

 Figure 3: IR-DYN for Zero Byte ROHC RTP Profile

D: D = 1 indicates that the dynamic chain is present

Profile: Profile identifier, abbreviated as defined in section 5.2.3 of the ROHC draft

CRC: 8-bit CRC computed according to section 5.9.1 of the ROHC draft

Channel ID: A reference to the over-the-air connection that carries the zero byte header packets.

Offset: This is the offset in airframe times from the link start time to which the IR-DYN packet's information applies

Dynamic chain: A chain of the dynamic subheader information. For this profile, the relevant header information will be the RTP dynamic part of section 5.7.7.6 of the ROHC draft.

As discussed above, profile selection between usual ROHC RTP and zero byte overhead ROHC RTP requires a reverse flow spec that informs the network compressor of the RTP packets to be compressed using the zero byte ROHC RTP profile. When the IR packet for this profile has the I bit set to 1, the IR packet contains a flow spec with IP addresses and UDP ports. The Rev-Flow-Spec may be viewed as a message sent from the mobile node side decompressor to the network side compressor indicating which packets to compress using the zero byte RTP profile and on which link channel the codec payloads should be sent.
This reverse flow spec packet is necessary to preserve the independence of the location of call servers, etc., in wireless architectures. Reception of this packet does not cause a state transition of the compressor in Figure 1. Rather it simply readies the network to commence zero byte ROHC RTP header compression if an RTP packet matching the flow should arrive. The Rev-Flow-Spec does not contain any RTP initialization. The actual initialization of the decompressor will occur when the compressor in the network receives an associated RTP packet and sends an IR packet to the receiver as specified above. Successful reception of that IR packet then causes the compressor to move to the SO state.

Note that the CID spaces in each direction are distinct. The CID used for the reverse flow spec is not related to the CID that will actually be used for data in the opposite direction (i.e. that appears in a later IR packet from the network to the mobile after traffic finally appears).

In comparison to the ROHC IR and IR-DYN there is no payload field because the payloads are transferred in the over-the-air connection that carries codec data. The IR and IR-DYN on the other hand are carried on the sister reliable data link.

This profile requires acknowledged transport of IR and IR-DYN control messages. Acknowledgments are carried in ROHC feedback packets for this profile as shown in Figure 4.

  0 1 2 3 4 5 6 7
 +-+-+-+-+-+-+-+-+
 |ack|0 0 0 0 0 0|  1 octet
 +-+-+-+-+-+-+-+-+
 |I| Channel ID  |  1 octet
 +-+-+-+-+-+-+-+-+
 |  Offset from  |
 +  Link Layer   +  2 octets
 |  Start Time   |
 +-+-+-+-+-+-+-+-+

 Figure 4: Feedback Packet for Zero Byte RTP Profile

ack: 0 = ACK, 1 = NAK

I: Copied from the IR or IR-DYN packet that this feedback packet acknowledges

Channel ID: Copied from the IR or IR-DYN packet that this feedback packet acknowledges

Offset: Copied from the IR or IR-DYN packet that this feedback packet acknowledges

Note that if the CID is not 0 or if large CIDs are in use from compressor to decompressor, then there are one or two extra bytes of CID preceding this feedback data. This means that the Code field from the feedback format is set to 4, 5, or 6 depending on the length (0, 1, or 2, respectively) of CID.

This profile uses the following header information from the ROHC draft.

* Initialization of IPv6 header from ROHC Section 5.7.7.3
* Initialization of IPv4 header from ROHC Section 5.7.7.4
* Initialization of UDP header from ROHC Section 5.7.7.5
* Initialization of RTP header from ROHC Section 5.7.7.6, TS=0, and the header formats from ROHC Section 5.8.6 for CSRC lists.

This profile cannot support AH or non-null ESP because the over-the-air connection does not have any header bits to carry necessary encrypted fields.

Initialization of GRE Headers occurs by the addition of an Extension type 3 of Section 5.8.5.1 with the uncompressed GRE list item of section 5.8.8.4. GRE CRCs, if used, should be regenerated from the received packets since there are no bits to transport them with the codec data. GRE sequence numbers are assumed to increment synchronously with the physical layer, as with RTP sequence numbers. The GRE sequence number in an IR is applicable at the offset time in the IR or IR-DYN.
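
As an illustration of the feedback data in Figure 4, the Python sketch below packs and unpacks the four octets. It is a sketch under stated assumptions: the function names are hypothetical, the ack field is read from the figure as the two most significant bits of the first octet, the Offset is assumed to be in network byte order, and the optional CID octets that may precede the feedback data are not handled.

```python
import struct


def pack_feedback(nak: bool, i_bit: int, channel_id: int, offset: int) -> bytes:
    """Build the 4-octet feedback data of Figure 4 (without any CID octets)."""
    if i_bit not in (0, 1):
        raise ValueError("I must be 0 or 1")
    if not 0 <= channel_id < 128:
        raise ValueError("Channel ID must fit in 7 bits")
    if not 0 <= offset < 65536:
        raise ValueError("Offset must fit in 2 octets")
    first = (1 if nak else 0) << 6        # ack field: 0 = ACK, 1 = NAK
    second = (i_bit << 7) | channel_id    # I bit followed by Channel ID
    # Assumption: Offset is carried in network byte order (big-endian).
    return struct.pack("!BBH", first, second, offset)


def unpack_feedback(data: bytes) -> dict:
    """Parse the 4-octet feedback data back into its fields."""
    first, second, offset = struct.unpack("!BBH", data)
    return {
        "nak": (first >> 6) == 1,
        "i": second >> 7,
        "channel_id": second & 0x7F,
        "offset": offset,
    }
```

The I, Channel ID, and Offset arguments would simply be copied from the IR or IR-DYN being acknowledged, as the field descriptions above require.
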

All of the above initialization headers appear in an IR or IR-DYN packet in Figures 2 and 3.

It may be necessary to update certain fields, such as when the physical layer slips forward or backwards, or when contributing sources join or leave a conference mixer. Figure 5 shows an update packet for this profile that may be used to update certain fields. As with other control packets for this profile, the update packet must be acknowledged with a feedback packet. Note that an Extension 3 of section 5.7.5 of the ROHC draft may be added to update GRE parameters or the CSRC list.

  0 1 2 3 4 5 6 7
 +-+-+-+-+-+-+-+-+
 |0| Channel ID  |  1 octet
 +-+-+-+-+-+-+-+-+
 |  Offset from  |
 +  Link Layer   +  2 octets
 |  Start Time   |
 +-+-+-+-+-+-+-+-+
 |    Seq No     |  1 octet
 +-+-+-+-+-+-+-+-+
 |  Time Stamp   |  1 octet
 +-+-+-+-+-+-+-+-+
 |     IP ID     |  1 octet
 +-+-+-+-+-+-+-+-+

 Figure 5: Update Packet for the Zero Byte RTP Profile

7. Security Considerations

Making use of end-to-end IP Security negates the effectiveness of header compression. Such packets must be carried over a data link with higher-layer framing because the physical layer frames discussed here will be too small to carry the essential security headers. Mechanisms for RTP payload encryption have been suggested; this draft allows for transparent reconstruction of timestamps and sequence numbers that can be used as cryptographic synchronization sources. Note that if silence suppression is used (something not supported by existing cellular vocoders) then the sequence numbers may become slightly out-of-sync (assuming we do not wish to send a context update at the start of every talk spurt) but the timestamps should still be accurate.

If the radio link features strong encryption this may satisfy users that only care about security on the wireless link itself; such encryption is optional and users who defer it are more susceptible to attack.
The identity of users of these radio links will be authenticated via a private key in both the radio realm and the IETF based AAA realm. In general, it is very difficult to inject frames onto the radio links; as other drafts on compression have pointed out, if a hacker is able to inject frames onto the radio links, the problems this creates far exceed just those associated with compression and decompression.

8. References

[1] Bradner, S., "The Internet Standards Process -- Revision 3", BCP 9, RFC 2026, October 1996.

[2] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[3] Handley, M., Schulzrinne, H., Schooler, E., and J. Rosenberg, "SIP: Session Initiation Protocol", RFC 2543, March 1999.

9. Acknowledgments

The authors wish to acknowledge the input of Qualcomm's Ray Hsu et al, who initially identified the spectral tax associated with IP/UDP/RTP transparent compression schemes, and Mark Lipford of SPCS and Mark Munson of Verizon for their business insights on spectral capacity and their encouragement to pursue non-transparent compression.

10. Authors' Addresses

Tom Hiller
Lucent Technologies
263 Shuman Drive
Naperville, IL. USA 60137
Phone: 630-979-7673
Email: tom.hiller@lucent.com

Pete McCann
Lucent Technologies
263 Shuman Drive
Naperville, IL. USA 60137
Phone: 630-713-9359
Email: mccap@lucent.com

Intellectual Property Statement

The IETF takes no position regarding the validity or scope of any intellectual property or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; neither does it represent that it has made any effort to identify any such rights. Information on the IETF's procedures with respect to rights in standards-track and standards-related documentation can be found in BCP-11.
Copies of claims of rights made available for publication and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF Secretariat.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to practice this standard. Please address the information to the IETF Executive Director.

Full Copyright Statement

Copyright (C) The Internet Society (2001). All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.