Network Working Group                                            R. Blom
Internet-Draft                                                  Y. Cheng
Intended status: Standards Track                             F. Lindholm
Expires: December 24, 2009                                   J. Mattsson
                                                              M. Naslund
                                                              K. Norrman
                                                       Ericsson Research
                                                           June 22, 2009


       The Use of the Secure Real-time Transport Protocol (SRTP)
                   in Store-and-Forward Applications
                       draft-naslund-srtp-saf-01

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on December 24, 2009.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.


Blom, et al.            Expires December 24, 2009               [Page 1]

Internet-Draft   SRTP in Store-and-Forward Applications        June 2009


Abstract

   This memo describes the use of so-called store-and-forward
   cryptographic transforms within the Secure Real-time Transport
   Protocol (SRTP).  The motivation is to support use cases where two
   end-points communicate via one (or more) store-and-forward
   middleboxes that are not fully trusted to access the media content.
   One of the main aspects of the transform is to make the
   confidentiality and message authentication independent of the RTP
   header.  Another central aspect is to enable identification of the
   cryptographic context (keys etc.).  In addition to the security of
   the end-points, trust assumptions regarding the store-and-forward
   middleboxes are addressed.


Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  4
     1.1.  Scope of this Document . . . . . . . . . . . . . . . . . .  5
     1.2.  Conventions used in this Document  . . . . . . . . . . . .  5
       1.2.1.  Notation and Definitions . . . . . . . . . . . . . . .  5
   2.  SRTP . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  6
   3.  The Store-and-Forward Use Cases  . . . . . . . . . . . . . . .  6
     3.1.  Problem Statement  . . . . . . . . . . . . . . . . . . . .  6
     3.2.  Trust Model and Security Requirements  . . . . . . . . . .  7
     3.3.  Problems with SRTP in SaF Scenarios  . . . . . . . . . . .  9
     3.4.  Design Rationale . . . . . . . . . . . . . . . . . . . . .  9
   4.  Usage of SaF Security within SRTP  . . . . . . . . . . . . . . 10
     4.1.  The SaF Extension  . . . . . . . . . . . . . . . . . . . . 10
     4.2.  Terminology  . . . . . . . . . . . . . . . . . . . . . . . 11
     4.3.  SRTP SaF Packet Format . . . . . . . . . . . . . . . . . . 11
     4.4.  Extension of the SRTP Cryptographic Context  . . . . . . . 13
       4.4.1.  E2e Context Definition . . . . . . . . . . . . . . . . 14
       4.4.2.  Identification of e2e Context  . . . . . . . . . . . . 15
     4.5.  SRTP SaF Processing  . . . . . . . . . . . . . . . . . . . 18
       4.5.1.  Sender . . . . . . . . . . . . . . . . . . . . . . . . 18
       4.5.2.  SaF Middlebox  . . . . . . . . . . . . . . . . . . . . 18
       4.5.3.  Receiver . . . . . . . . . . . . . . . . . . . . . . . 20
     4.6.  Use of SRTCP with SRTP SaF . . . . . . . . . . . . . . . . 20
     4.7.  Cryptographic Transforms . . . . . . . . . . . . . . . . . 21
       4.7.1.  Default hbh Transforms . . . . . . . . . . . . . . . . 21
       4.7.2.  Default e2e Transforms . . . . . . . . . . . . . . . . 21
       4.7.3.  Session Key Derivation . . . . . . . . . . . . . . . . 22
   5.  SRTP SaF Default Parameters  . . . . . . . . . . . . . . . . . 22
     5.1.  Adding Future e2e Transforms . . . . . . . . . . . . . . . 23
   6.  Security Considerations  . . . . . . . . . . . . . . . . . . . 23
     6.1.  General  . . . . . . . . . . . . . . . . . . . . . . . . . 23
     6.2.  Keystream Reuse  . . . . . . . . . . . . . . . . . . . . . 24
     6.3.  Attacks on CCIs  . . . . . . . . . . . . . . . . . . . . . 24
     6.4.  Authentication and Authorization . . . . . . . . . . . . . 25
     6.5.  Replay Protection  . . . . . . . . . . . . . . . . . . . . 25
     6.6.  Key Management Considerations  . . . . . . . . . . . . . . 26
     6.7.  Privacy  . . . . . . . . . . . . . . . . . . . . . . . . . 26
     6.8.  RTCP Considerations  . . . . . . . . . . . . . . . . . . . 27
   7.  Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 27
   8.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 27
   9.  References . . . . . . . . . . . . . . . . . . . . . . . . . . 27
     9.1.  Normative References . . . . . . . . . . . . . . . . . . . 27
     9.2.  Informative References . . . . . . . . . . . . . . . . . . 28
   Appendix A.  Use Cases . . . . . . . . . . . . . . . . . . . . . . 28
     A.1.  Streaming Pre-encrypted Media  . . . . . . . . . . . . . . 28
     A.2.  Recording Encrypted Media at Home  . . . . . . . . . . . . 28
     A.3.  Answering Machine  . . . . . . . . . . . . . . . . . . . . 28
     A.4.  Media Rewind . . . . . . . . . . . . . . . . . . . . . . . 28
   Appendix B.  Test Vector . . . . . . . . . . . . . . . . . . . . . 29
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 30


1.  Introduction

   The Secure Real-time Transport Protocol (SRTP) [RFC3711] is a profile
   of RTP, which can provide confidentiality, message authentication,
   and replay protection to the RTP traffic and to the RTP control
   protocol, the Real-time Transport Control Protocol (RTCP).  The basic
   SRTP profile in [RFC3711] solves real-time end-to-end use cases, and
   does not consider use cases requiring Store-and-Forward (SaF)
   middleboxes.  Such use cases are characterized by the need for a
   sender to deliver media to a receiver via a SaF middlebox.  A SaF
   middlebox temporarily stores media and retransmits it to the intended
   receiver.  Retransmission can be almost immediate (e.g. a push-to-
   talk group server), or be done at a much later time (e.g. a VoIP
   answering machine).  The SaF middlebox is typically considered as
   semi-trusted, meaning that a SaF middlebox will store and deliver
   media as requested, but it cannot be excluded that a SaF middlebox
   will also try to extract the information for its own purposes
   (whatever they might be).  The trust model will be made more formal
   later in this document.  What causes problems for standard end-to-end
   SRTP in these settings is its dependence on the actual RTP transport
   parameters, which differ between hops, i.e., between the
   sender-middlebox hop and the middlebox-receiver hop.

   SRTP is a framework that allows new security functions and new
   transforms to be added, and this document defines a so-called store-
   and-forward extension to SRTP to meet the additional use cases
   considered.  One of the main aspects of the transform is to make the
   confidentiality and message authentication independent of the RTP
   header.  This allows end-to-end protection to be achieved even when
   SaF middleboxes need to manipulate the RTP headers.

   Another aspect is that identification of the cryptographic context
   (keys etc.) between the end-points must be extended, as the
   parameters used in [RFC3711] are available only during transport of
   RTP packets over a "hop".  For instance, [RFC3711] specifies that the
   receiver's IP address shall be part of the context identifier, but
   this value may of course not be known to the sender when
   communicating messages via a SaF middlebox.  Indeed, the receiver may
   not even be on-line at the time when the source initiates the
   communication.  Another part of the cryptographic context identifier
   is the SSRC, which may be modified by SaF middleboxes.

   While there certainly are differences between this document and
   [RFC3711] at the mechanism level, it is worth noting that the
   extensions defined herein are conceptually almost identical to the
   SRTP extensions previously defined in [RFC4383], which adds source
   origin authentication support to SRTP.  Moreover, as far as the
   cryptographic processing is concerned, the SaF middleboxes may use
   [RFC3711]-compliant processing, and changes in cryptographic
   processing are thus only needed in the end-points.

1.1.  Scope of this Document

   The scope of this document is to specify extensions to SRTP
   (parameters, processing, and cryptographic transforms) to support the
   store-and-forward use case and its associated trust model.  The SaF
   use case and trust model are defined in Section 3.  No claims are
   made about supporting other use cases, though of course all the
   original use cases from [RFC3711] can also be supported.

   The SaF use case implies a different trust model than that originally
   considered when designing SRTP.  This manifests itself in terms of
   the need to ensure authorized access to the different cryptographic
   keys involved, i.e. the extensions defined herein MUST have support
   from some key management scheme.  Similar to the original SRTP
   specification, the actual definition of the key management solution
   is out of scope of this document.  Necessary (and sufficient)
   requirements on the key management can be found in Section 6.6.

1.2.  Conventions used in this Document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   Throughout the specification all protocol data fields are assumed to
   be byte aligned, i.e. all defined bit-sizes SHALL be multiples of 8.

1.2.1.  Notation and Definitions

      DoS: Denial of service

      e2e: end-to-end

      hbh: hop-by-hop

      SaF: Store-and-Forward

   For the purpose of this document we use the following definitions:

   A is said to trust B with information I, if A is willing to share I
   with B. In the sequel we will simply say that A trusts B.

   A is said to have sender-semi-trust in B if A considers B to be
   "honest-but-curious" in the following sense.  A trusts B to maintain
   information I provided by A, and (later) redistribute it to the
   intended recipients as specified by A (parties that A trusts with
   I).  However, A does not trust that B will not also try to extract
   the information I for him/herself and/or attempt to distribute I to
   other parties, e.g. parties that A does not trust with I.

   A is similarly said to have receiver-semi-trust in B, if A trusts B
   to maintain information intended for A and to (later) distribute this
   information to A if and only if A so requests.  However, A does not
   trust that B will not also attempt to distribute the information to
   other parties and/or try to extract it him/herself.

   When it is obvious from the context (or irrelevant) we shall omit the
   directivity (sender/receiver) and simply say that A semi-trusts B.


2.  SRTP

   The Secure Real-time Transport Protocol (SRTP) [RFC3711] is a profile
   of RTP, which can provide confidentiality, message authentication,
   and replay protection to the RTP traffic and to the RTP control
   protocol, the Real-time Transport Control Protocol (RTCP).  Note that
   the term "SRTP" may often be used to indicate SRTCP as well.  SRTP is
   a framework that allows new security functions and new transforms to
   be added.  In the sequel, we assume that the reader is familiar with
   the SRTP specification [RFC3711], its packet structure, and its
   processing rules.

   This specification defines a so-called Store-and-Forward extension to
   SRTP to permit communication via semi-trusted SaF middleboxes.  As
   mentioned, the SRTP extensions defined herein are very similar in
   nature to the SRTP extensions previously defined in [RFC4383] to add
   source origin authentication support to SRTP.  In both cases, the
   extensions needed are: definition of new cryptographic transforms, a
   new packet format including additional in-band context signaling, and
   extensions to the SRTP cryptographic context concept.


3.  The Store-and-Forward Use Cases

3.1.  Problem Statement

   We consider RTP communication solutions that include semi-trusted SaF
   middleboxes, i.e. middleboxes that should not have access to
   cleartext media, but still should be able to have access to other
   data in order to retransmit media according to RTP standard
   procedures.  Below, we provide some use cases where S, M, and R refer
   to Sender, SaF Middlebox, and Receiver.  For each use case, we
   comment on aspects of the semi-trusted model defined above.


   Streaming Pre-encrypted Media: A content creator (S) distributes high
   value, encrypted content to clients (R).  Distribution is made via a
   streaming server (M).  From the content creator's point of view it is
   important that decrypted (plaintext) content is only made available
   to authorized clients.  This means that S should be able to use a
   streaming server M, to which it assigns a bare minimum of sender-
   semi-trust.  The clients may typically have some basic privacy
   requirement related to what type of content they access, but may
   otherwise be less concerned with who else gets access to the
   content.

   Recording Encrypted Media: Encrypted IPTV is broadcast in a
   network.  Only clients trusted by the content creator (S) should have
   access.  Before having acquired a license to view the content, a user
   (R) records media on a Hard Disk Drive (M), where the media is stored
   in encrypted format, awaiting a license for rendering.  Here, the
   trust in the HDD (M) by S and R is probably very asymmetric since the
   end-user most likely has a very strong trust in his personal home
   equipment.  An additional requirement is the possibility for M to
   authenticate the data source S in order not to exhaust storage
   capacity with garbage.

   Answering Machine: Operators commonly provide an answering machine
   service to their customers.  Communicating parties (S and R) may not
   wish to disclose the media to any other party.  Thus, the answering
   machine (M) acts as a SaF middlebox, which has to store encrypted
   data and retransmit it to the callee.  In this use case, sender- and
   receiver-semi-trust in M is likely to be fairly symmetric.

   Further examples and more details can be found in Appendix A.

   The typical use case is thus to require that media is (at least)
   confidentiality protected end-to-end (e2e) between the sender and the
   receiver.  At the same time the communication should be protected
   hop-by-hop (hbh) to prevent malicious users from performing denial of
   service attacks by sending bogus data to SaF middleboxes, which the
   SaF middleboxes then would store, eventually exhausting their storage
   space and/or corrupting the data stored.

3.2.  Trust Model and Security Requirements

   Figure 1 shows the assumed trust model in terms of the previous
   definitions.

   In practice, the model means that


   o  S trusts R,

   o  S semi-trusts M to deliver information to R, and,

   o  R semi-trusts M to forward any information intended for R.


                                   Trust
               -------------------------------------------->
             +---+                 +---+                 +---+
             | S |                 | M |                 | R |
             +---+                 +---+                 +---+
               ---------------------> <---------------------
                 Sender-semi-trust      Receiver-semi-trust

          Figure 1: Trust Model (Sender, SaF Middlebox, Receiver)

   As noted in the use cases above, S may be more concerned with who
   gets access to the information than R is.  Still, this trust model,
   assuming a bare minimum of sender- and receiver-semi-trust as defined
   above, has been chosen since it is a simple trust model and seems to
   apply (qualitatively) as a common denominator for all the SaF use
   cases.  Note also that the trust between S and R may often be mutual,
   but we do not require this.

   M does not need to trust either S or R.  However, in order to
   fulfill its (assumed) duty as a semi-trusted SaF middlebox, M must at
   least be able to authenticate S and the information S provides.  If
   this were not the case, some malicious party might exhaust the
   storage resources of M, implying that it could not even be semi-
   trusted by S and R.  Similarly, a more robust implementation of M
   should have means to authenticate R as well, in order to avoid
   wasting resources responding to spoofed requests.  That is, the trust
   model also assumes the existence of other parties (not shown) that
   are not trusted by any of S, M, or R, and which may attempt to
   interfere with the communication between them and the SaF services
   provided by M.

   When there are several SaF middleboxes in the path between S and R,
   it is necessary to assume that the SaF middleboxes semi-trust each
   other, at least in a transitive sense.  Also, we may then have a
   situation where S and R do not (directly) semi-trust a common M.

   The security requirements for SRTP SaF hence are:

   1.  It SHALL be possible to provide e2e confidentiality and message
       authentication between S and R.


   2.  It SHALL be possible to provide hbh message authentication
       between S and M, and between M and R.

   To provide a basis for enhanced privacy protection against other
   parties (e.g. traffic analysis), hbh confidentiality SHOULD also be
   provided.  Some practical use cases when this trust model is likely
   to apply are given in Appendix A.

   As mentioned, hbh cryptographic processing is compatible with SRTP
   [RFC3711], and therefore RTCP SHOULD also be protected using SRTCP on
   a hbh basis according to [RFC3711].  However, the SRTP SaF extension
   defined herein makes no provision for protection of RTCP on an e2e
   basis.  Relevant considerations (rationale and caveats) relating to
   RTCP are provided in Section 4.6.

3.3.  Problems with SRTP in SaF Scenarios

   It would be desirable to be able to offer use of SRTP as a general,
   lightweight mechanism to achieve the above type of protection, but
   trying to do so reveals two main problems.

   The first problem is that an RTP stream recorded by an entity and the
   stream it later resends are, in general, independent; received SRTP-
   encrypted payloads cannot simply be stored and later retransmitted as
   they are.  For instance, a new SSRC is most likely used when
   retransmitting.  In particular, this implies that SRTP with currently
   defined transforms cannot be applied end-to-end as they depend on the
   SSRC.
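
   As an illustration of this dependence, the AES counter mode transform
   of [RFC3711] forms its IV from the session salt, the SSRC, and the
   packet index.  The following Python sketch (the function name and all
   example values are illustrative assumptions, not part of this
   specification) shows that a payload re-sent under a new SSRC would be
   processed with a different IV, and hence a different keystream:

```python
def aes_cm_iv(session_salt: int, ssrc: int, packet_index: int) -> int:
    """IV = (k_s * 2^16) XOR (SSRC * 2^64) XOR (i * 2^16), RFC 3711, 4.1.1."""
    iv = (session_salt << 16) ^ (ssrc << 64) ^ (packet_index << 16)
    return iv & ((1 << 128) - 1)  # the IV is a 128-bit value


# Hypothetical example values.
SALT = 0xF0F1F2F3F4F5F6F7F8F9FAFBFCFD  # 112-bit session salt
SSRC_SENDER = 0x1111AAAA               # SSRC used on the S -> M hop
SSRC_MIDDLEBOX = 0x2222BBBB            # SSRC chosen by M on the M -> R hop
INDEX = 42                             # same packet index on both hops

iv_first_hop = aes_cm_iv(SALT, SSRC_SENDER, INDEX)
iv_second_hop = aes_cm_iv(SALT, SSRC_MIDDLEBOX, INDEX)
ivs_differ = iv_first_hop != iv_second_hop
```

   With equal salt and packet index, the two IVs differ exactly in the
   bits occupied by the SSRC, so R could not decrypt the stored payload
   using the SSRC it observes on the second hop.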

   The second problem is that in order to provide both e2e and hbh
   protection, two independent security contexts with associated
   protection mechanisms have to coexist; a feature unavailable in SRTP
   as currently specified.  While it is not too difficult to imagine how
   two contexts in place of one might be used, a problem arises when
   specifying how the e2e part of the context should be identified and
   signaled, as current SRTP context definition rests on parameters
   which are not constant end-to-end in the SaF scenario, namely SSRC
   and receiver's IP address and port.
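
   The identification problem can be sketched in the same spirit.  In
   [RFC3711], a context is looked up by the SSRC together with the
   destination address and port.  In the Python fragment below, the type
   name, the lookup helper, and all addresses, ports, and SSRC values
   are hypothetical; it only illustrates that a context indexed with the
   values of the S-to-M hop cannot be found using the values R observes
   on the M-to-R hop:

```python
from typing import Dict, NamedTuple, Optional


class ContextId(NamedTuple):
    """Context identifier in the style of RFC 3711."""
    ssrc: int
    dst_addr: str
    dst_port: int


def lookup(contexts: Dict[ContextId, str], ssrc: int,
           dst_addr: str, dst_port: int) -> Optional[str]:
    """Return the context stored for the triplet, or None if unknown."""
    return contexts.get(ContextId(ssrc, dst_addr, dst_port))


# Context as S would index it: S's own SSRC, M's address and port.
sender_view = {ContextId(0x1111AAAA, "192.0.2.10", 5004): "keys for S and R"}

# On the M -> R hop the packet carries M's SSRC and R's address and port.
found = lookup(sender_view, 0x2222BBBB, "198.51.100.7", 6000)
```

   Here "found" is None, which is why the e2e context needs an
   identifier that stays constant end-to-end.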

   The SRTP SaF extension defined in this document addresses these
   problems.

3.4.  Design Rationale

   As noted above, different SaF scenarios may have slightly different
   security requirements and trust models and there may be many
   different possibilities to extend SRTP in different directions to
   handle a specific SaF use case (or some subset of use cases).  For
   example, the problems related to the most basic trust model extension
   (need to provide confidentiality e2e and integrity hbh) are due to
   the fact that in SRTP, parties always know both the encryption key
   and the authentication key.  This could be addressed (mainly) by just
   separating encryption and authentication keys (i.e. modifying SRTP
   key derivation and cryptographic context).  However, the solution
   would then become severely limited, e.g. it would not support pre-
   encryption of data or re-transmission of stored data.  Similarly, as
   will be seen below, SRTP SaF adds some additional in-band data
   fields, though some use cases above could probably be handled without
   them.  Again, the solution would be limited to these use cases and
   would then not allow e.g. the secure fast-forward/rewind use cases,
   which require in-band synchronization data.  By making the added
   fields optional, it is possible to support these features as needed,
   yet keeping bandwidth low when such features are not needed.

   This specification is instead based on

   -  identifying the common denominator(s) of the SaF use case
      problems, captured in the trust model and requirements of
      Section 3.2,
   -  proposing a single extension of the SRTP framework (see
      Section 4.1) powerful enough to handle all foreseen SaF use cases,
      which, by
   -  simple configuration of the extended framework (Section 4.3), can
      be adapted to support the requirements of the specific SaF use
      case at hand.

   Considering that the impacts of the present specification on SRTP are
   very similar to those of [RFC4383], there does not appear to be any
   disadvantage in having a single SaF extension compared to having per-
   SaF-use-case extensions.


4.  Usage of SaF Security within SRTP

4.1.  The SaF Extension

   The SaF extension consists of a new packet format (Section 4.3), an
   extended cryptographic context concept (Section 4.4), and new SRTP
   processing at sender/receiver (Section 4.5).  Considering only the
   cryptographic processing, SaF middleboxes are compatible with
   [RFC3711], and the necessary additional processing is defined in
   Section 4.5.2.  Senders/receivers need to support new cryptographic
   transforms (see Section 4.7).


4.2.  Terminology

   A SaF e2e session is defined as the set of SaF e2e protected data
   produced under a single e2e context (a security association between
   the sender and the ultimate receiver, see Section 4.4.1 for the exact
   definition of e2e context).  A SaF e2e session may comprise several
   so-called SaF sources, i.e. several distinct logical e2e media
   streams to be protected by the same e2e context.

   A SaF hbh session is defined as the set of SaF hbh protected data
   produced under a single hbh context (a security association between
   two entities where at least one entity is a middlebox, see
   Section 4.4 for the exact definition of hbh context).

   The cryptographic transforms, keys, etc., used for the e2e and hbh
   protection, respectively, are denoted e2e transform, hbh transform,
   e2e key, hbh key, etc.

4.3.  SRTP SaF Packet Format

   Figure 2 illustrates the format of the SRTP packet when SaF is
   applied.

   The packet format is composed of an "inner" e2e (sender-receiver)
   part embedded in an "outer" hbh (sender-middlebox or middlebox-
   receiver) part.  Between these parts, a new CCI field (explained
   below) is introduced.

   The e2e protected portion provides e2e encryption of the payload, RTP
   padding, and RTP pad count.  The e2e protected portion also defines
   two new fields (PUV and SSS) for cryptographic synchronization, and
   an e2e MAC tag field.

   The e2e MAC tag covers the e2e protected portion, except the e2e MAC
   tag itself.  Whether authentication implies source origin
   authentication or only message integrity depends on the transform
   used.  Thus, e2e encryption is provided over the Payload, RTP
   padding, and RTP pad count fields, while authentication is provided
   for the PUV and SSS as well.
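
   The header independence of the e2e MAC can be illustrated with a
   minimal Python sketch.  It assumes HMAC-SHA1 truncated to 10 bytes as
   a stand-in for the e2e authentication transform (the default e2e
   transforms are specified in Section 4.7.2) and assumes that the
   covered fields are concatenated in packet order.  The function names,
   parameter names, and example values are hypothetical:

```python
import hashlib
import hmac


def e2e_mac_tag(auth_key: bytes, e2e_encrypted: bytes, puv: bytes,
                sss: bytes = b"", tag_len: int = 10) -> bytes:
    """Tag over encrypted payload/padding/pad count, then PUV, then SSS."""
    covered = e2e_encrypted + puv + sss  # no RTP header and no CCI
    return hmac.new(auth_key, covered, hashlib.sha1).digest()[:tag_len]


def e2e_mac_ok(auth_key: bytes, e2e_encrypted: bytes, puv: bytes,
               sss: bytes, received_tag: bytes, tag_len: int = 10) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = e2e_mac_tag(auth_key, e2e_encrypted, puv, sss, tag_len)
    return hmac.compare_digest(expected, received_tag)


# Placeholder key and data; a real e2e key would be derived and secret.
KEY = bytes(range(20))
ENCRYPTED = b"\x8a\x10\x33\x7f"
PUV = b"\x00\x00\x00\x01"

tag = e2e_mac_tag(KEY, ENCRYPTED, PUV)
verified = e2e_mac_ok(KEY, ENCRYPTED, PUV, b"", tag)
```

   Since no RTP header field enters the computation, the receiver can
   verify the tag regardless of the sequence number, timestamp, or SSRC
   that a SaF middlebox places in the header.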


   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<---+
  |V=2|P|X|  CC   |M|     PT      |        sequence number        |    |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+    |
  |                           timestamp                           |    |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+    |
  |           synchronization source (SSRC) identifier            |    |
  +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+    |
  |            contributing source (CSRC) identifiers             |    |
  |                             ....                              |    |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+    |
  |                   RTP extension (OPTIONAL)                    |    |
+>+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<-+ |
| |                          payload  ...                         |  | |
| |                               +-------------------------------+  | |
| |                               |  RTP padding  | RTP pad count |  | |
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+  | |
| ~                        e2e PUV (MANDATORY)                    ~  | |
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+  | |
| ~                        e2e SSS (OPTIONAL)                     ~  | |
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+  | |
| ~                        e2e MAC (RECOMMENDED)                  ~  | |
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<-+ |
| ~                        hbh CCI (OPTIONAL)                     ~  | |
+>+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<-|-+
| ~                        hbh MKI (OPTIONAL)                     ~  | |
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+  | |
| ~                        hbh MAC (RECOMMENDED)                  ~  | |
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+  | |
|                                                                    | |
+-- hbh Encrypted Portion                   e2e Protected Portion ---+ |
                                                                       |
                                        hbh Authenticated Portion -----+

       Figure 2: The format of the SRTP packet when SaF is applied.

   Default e2e transforms, which provide both encryption and
   authentication, and which SHALL be supported are defined in
   Section 4.7.2.

   The e2e protected portion is opaque from the SaF middlebox's point of
   view.

   Thus, by treating the inner e2e protected portion and the Crypto
   Context Identifier (CCI, see below) as the (hbh) "encrypted portion"
   of [RFC3711], the overall SRTP SaF packet format conforms to standard
   [RFC3711]-compliant SRTP.  (Note that the additional fields added in
   the inner e2e part could just as well have been added by a new
   transform defined for SRTP, e.g. padding and/or crypto synch fields.)
   Hence, the hbh MAC and hbh MKI are in one-to-one correspondence with
   the MAC and MKI of [RFC3711] and will not be discussed further.

   The additional fields added by the inner e2e security processing are:

   o  SSS: SRTP SaF Source is a value used by the SRTP SaF transform as
      an identifier for the SaF source within a SaF e2e session.  Thus,
      SSS MUST be unique for all SaF sources within the SaF e2e session.
      Since there may be only one such SaF source, the SSS field is
      OPTIONAL and of configurable length.  SSS resembles the SSRC usage
      in RTP/SRTP in the sense that it ensures that two-time pads do not
      occur under the same e2e master key, see Sections 4.7 and 6.2.
      The implementation of the necessary anti-collision mechanism is
      outside the scope of this specification.

   o  PUV: Packet Unique Value for the e2e transform.  The PUV field is
      MANDATORY and of configurable length.  Its format is transform
      dependent, and security aspects need to be considered when
      defining the format, see Sections 6.2 and 6.4.  For a given SaF
      e2e session and SaF source, the PUV SHALL be unique for each
      generated e2e protected portion.  The PUV is used as input to the
      IV formation for the e2e encryption transform.

   o  e2e MAC: This field is used to carry payload authentication data
      e2e.  It is transform dependent, of configurable length and is
      RECOMMENDED to be used.  Observe that the e2e MAC SHALL cover the
      RTP payload, the PUV and SSS but SHALL NOT cover the RTP header,
      nor the CCI.

   o  CCI: Crypto Context Identifier, used to signal hbh which e2e
      cryptographic context (keys and other parameters, see
      Section 4.4.1) to use.  The field is OPTIONAL and of configurable
      length.

   Parameters which are configurable have default values (see
   Section 5), and are otherwise negotiated during SaF e2e/hbh session
   establishment, agreed upon out of band, or hard coded for a specific
   application.

4.4.  Extension of the SRTP Cryptographic Context

   An SRTP SaF cryptographic context SHALL consist of two main parts.

   1.  A hbh context.  The hbh context SHALL be an SRTP cryptographic
       context conforming to [RFC3711] and SHALL be used for the hbh
       protection between sender and SaF middlebox, between SaF
       middlebox and receiver, or, between two SaF middleboxes.  The hbh


Blom, et al.            Expires December 24, 2009              [Page 13]

       context SHALL thus be identified by the (destination IP address,
       destination port, SSRC) triplet exactly as defined in [RFC3711].

   2.  One (or more) e2e contexts: this part of the context is defined
       below and SHALL be used for the e2e protection between sender and
       receiver.

   If the context contains more than one e2e context, each e2e context
   SHALL be associated with a CCI value.  Since the length of the CCI
   field is variable, the length of the CCIs SHALL be determined by a
   length parameter, n_CCI.

4.4.1.  E2e Context Definition

   The e2e context SHALL contain the following e2e transform independent
   parameters.

   o  an identifier for the e2e encryption algorithm, i.e., the cipher
      and its mode of operation, see Section 4.7.2 for the default e2e
      encryption transform specification,

   o  an identifier for the e2e message authentication algorithm, see
      Section 4.7.2 for the default e2e message authentication transform
      specification,

   o  an identifier for the e2e pseudo-random function,

   o  an e2e master key, which MUST be random and secret to all except
      sender and receiver.  The e2e master key MUST be cryptographically
      independent of any hbh key,

   o  an e2e master salt, which MUST be random,

   o  non-negative integers n_e, n_a, n_s, and n_tag determining the
      length of the e2e session keys for encryption and message
      authentication, the e2e session salt, and the e2e authentication
      tag,

   o  non-negative integers n_PUV, and n_SSS determining the length of
      the PUV, and SSS fields.

   There may also be need to include e2e transform dependent parameters,
   see Section 4.7.2 for the parameters associated with the default e2e
   transforms.

   Observe that there is no replay protection data in the e2e context,
   see Section 4.5.3.1.  Also note that unlike [RFC3711] cryptographic
   contexts, the e2e context SHALL only contain parameters for RTP
   protection, and SHALL NOT contain parameters for RTCP protection, see
   Section 4.6.

   E2e contexts need only to be supported by end-points, i.e. senders
   and receivers.  SaF middleboxes need, however, to understand the
   usage of the e2e context identifiers (CCI) as discussed next.
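
   As an illustration only, the parameters listed in this section could
   be grouped as in the Python sketch below.  The class name
   E2eContext, the field names, and the string algorithm identifiers
   are hypothetical and not defined by this specification.  The default
   lengths assume the [RFC3711] defaults (128-bit session encryption
   key, 160-bit session authentication key, 112-bit session salt,
   80-bit tag) together with the SaF defaults of Section 5.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class E2eContext:
    """Hypothetical container for the e2e context parameters."""
    enc_alg: str         # identifier for cipher and mode, e.g. "AES-CM"
    auth_alg: str        # identifier for the MAC, e.g. "HMAC-SHA1"
    prf: str             # identifier for the pseudo-random function
    master_key: bytes    # random, secret to all but sender and receiver
    master_salt: bytes   # random
    n_e: int = 128       # session encryption key length (bits)
    n_a: int = 160       # session authentication key length (bits)
    n_s: int = 112       # session salt length (bits)
    n_tag: int = 80      # authentication tag length (bits)
    n_puv: int = 24      # PUV field length (bits)
    n_sss: int = 0       # SSS field length (bits), 0 = not present

    def __post_init__(self):
        # All length parameters are required to be non-negative integers.
        for name in ("n_e", "n_a", "n_s", "n_tag", "n_puv", "n_sss"):
            if getattr(self, name) < 0:
                raise ValueError(f"{name} must be non-negative")
```

   Note that the sketch deliberately holds neither replay protection
   state nor any RTCP parameters, in line with the observations above.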

4.4.2.  Identification of e2e Context

   A SaF context MAY contain several e2e contexts.  The motivation for
   allowing more than one e2e context is to support scenarios where the
   SaF middlebox and receiver use a single (S)RTP session into which
   they multiplex several e2e protected sessions, see Appendix A for use
   cases.

   The e2e context SHALL be identified by a combination of out-of-band
   and in-band signaling.

   The e2e context MAY be identified by simply transferring the entire
   context out-of-band.  The e2e context MUST be e2e protected so that
   middleboxes or other unauthorized entities cannot access or modify
   it.

   Alternatively, out-of-band context identification may use indirection
   and SHALL then be defined as follows.  Each sender, for each e2e
   context, defines a Content ID, CID.  When used, the CID MUST uniquely
   determine the context between a sender and a receiver but the exact
   format of the CID is outside the scope of this specification.  For
   example, a statistically unique (e.g. 256-bit) value may be used.
   The CID is communicated by out-of-band means:

   o  e2e, between sender and receiver

   o  hbh, between sender and SaF middlebox, between two SaF middleboxes
      (as applicable), and between SaF middlebox and receiver.

   How this is done (which protocol etc.) is outside the scope of this
   specification but will typically be part of session setup.

   If the SaF context contains a single e2e context, the in-band context
   identification SHALL be done as defined in [RFC3711].  Both the hbh
   context and the e2e context are uniquely identified by the triplet
   context identifier:

        <SSRC, destination network address, destination port number>

   The e2e context (either the context itself or its CID) is here
   implicit from the triplet.  Note that different triplets identify the
   e2e context on each "hop".

   If the SaF context contains more than one e2e context, the triplet
   context identifier cannot uniquely identify the e2e context and the
   in-band context identifier needs to be extended with the CCI field in
   the SRTP SaF packet (see Figure 2).  The e2e context is uniquely
   identified by the quadruplet context identifier:

     <CCI, SSRC, destination network address, destination port number>

   The CCI may thus be thought of as a mutable, short, in-band alias for
   the e2e context (possibly via additional indirection through CID) and
   is only used on an hbh basis.  If multiple pieces of content
   corresponding to multiple CIDs are transferred within the same SaF
   hbh session, the source SHALL ensure the use of distinct CCIs for all
   CIDs.  It is RECOMMENDED for privacy reasons to assign CCIs randomly,
   with the above uniqueness requirement.  During transfer of e2e
   protected content associated with a certain CID, the source (initial
   sender or SaF middlebox) SHALL add the associated CCI to each packet
   being part of that content.
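
   The lookup rule above can be sketched as a table keyed by the
   quadruplet, with the CCI component left empty when only one e2e
   context exists.  This is a minimal Python sketch; the class
   SafContextTable and its methods are hypothetical, and e2e contexts
   are treated as opaque objects.

```python
from typing import Dict, Optional, Tuple

# (SSRC, destination network address, destination port number)
Triplet = Tuple[int, str, int]


class SafContextTable:
    """Hypothetical e2e context lookup for one endpoint."""

    def __init__(self) -> None:
        self._e2e: Dict[Tuple[Optional[int], int, str, int], object] = {}

    def add(self, triplet: Triplet, e2e_ctx: object,
            cci: Optional[int] = None) -> None:
        # cci=None models a SaF context with a single e2e context,
        # where the triplet alone identifies it.
        key = (cci, *triplet)
        if key in self._e2e:
            raise ValueError("CCI already in use within this hbh session")
        self._e2e[key] = e2e_ctx

    def lookup(self, triplet: Triplet, cci: Optional[int] = None) -> object:
        # Raises KeyError when no e2e context matches the identifier.
        return self._e2e[(cci, *triplet)]
```

   Rejecting a duplicate key in add() reflects the requirement that
   distinct CCIs be used for all CIDs within the same hbh session.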

4.4.2.1.  CCI Mapping

   For each distinct e2e context (provided by the sender or a previous
   SaF middlebox through direct transfer of the protected context itself
   or a CID) the SaF middlebox SHALL ensure that when CCIs are used,
   distinct CCIs are used when forwarding messages to the receiver (or
   to another SaF middlebox) within any given hbh session.  The SaF
   middlebox SHALL, in conjunction with informing the next hop
   destination about the CID values, also inform it whether and how it
   has associated the CCIs with CIDs, e.g. as part of session setup
   signaling.  For instance, the SaF middlebox may provide pairs of the
   form

                      (CID1, CCI1), (CID2, CCI2), ...

   The CCIs are then used in-band when forwarding the media to indicate
   which e2e crypto context shall be used with each packet.  As noted
   above, the CCI SHALL NOT be e2e authenticated, in order to allow
   changes by SaF middleboxes.  Clearly, the CCIs SHOULD be hbh
   authenticated to avoid e.g.  DoS attacks, see Section 6.3 for
   security considerations.

   The figure below shows a simple example of signaling of CID/CCI and
   their use.


                                o-o-b: CID1
   +------------------------------------------------------------------+
   |          o-o-b: CID1                                             |
   |  +-------------------------+                                     |
   |  |                         |  o-o-b: (CID1, CCI1), (CID2, CCI2)  |
  +----+                        | +---------------------------------+ |
  |    |      SRTP SaF          | |                                 | |
  | S1 |------------------+     v |                                 v v
  |    |                  |    +---+                               +---+
  +----+                  +--->|   |     SRTP SaF: CCI1, CCI2      |   |
                               | M |------------------------------>| R |
  +----+                  +--->|   |                               |   |
  |    |                  |    +---+                               +---+
  | S2 |------------------+      ^                                   ^
  |    |      SRTP SaF           |                                   |
  +----+                         |                                   |
   |  |                          |                                   |
   |  +--------------------------+                                   |
   |          o-o-b: CID2                                            |
   +-----------------------------------------------------------------+
                                o-o-b: CID2

                           Figure 3: CCI Mapping

   The sender S1 wishes to store a message for R on a SaF middlebox M.
   S1 uses out-of-band (o-o-b) signaling to communicate the Content ID
   CID1 to R and M. (The communication with R typically will not happen
   at the same time as the communication with M; it may have already
   occurred, or it may occur later.)  S1 then uses SRTP SaF to transfer
   the content to M.

   Later, S2 also stores a message for R on the SaF middlebox M.  S2 uses
   out-of-band signaling to communicate the Content ID CID2 to R and M.
   S2 then uses SRTP SaF to transfer the content to M.

   When R later retrieves the content from M, both messages are
   multiplexed inside the same hbh session.  Before starting the
   streaming, however, M first (using out-of-band signaling) informs R
   about the CIDs and their corresponding CCIs.
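
   The mapping step performed by M in this example might look as
   follows.  This is a sketch under stated assumptions: the function
   name assign_ccis is hypothetical, CIDs are modeled as byte strings,
   and the CCI length n_CCI is given in bits.  CCIs are drawn at random
   (as recommended for privacy) and redrawn on collision so that they
   are distinct within the hbh session.

```python
import secrets
from typing import Dict, Iterable


def assign_ccis(cids: Iterable[bytes], n_cci: int) -> Dict[bytes, int]:
    """Assign a random, distinct n_cci-bit CCI to each CID."""
    unique_cids = list(dict.fromkeys(cids))   # drop repeats, keep order
    if n_cci <= 0 or len(unique_cids) > (1 << n_cci):
        raise ValueError("n_cci too small for the number of CIDs")
    used = set()
    mapping = {}
    for cid in unique_cids:
        cci = secrets.randbits(n_cci)
        while cci in used:                    # redraw on collision
            cci = secrets.randbits(n_cci)
        used.add(cci)
        mapping[cid] = cci
    return mapping
```

   The resulting pairs correspond to the (CID1, CCI1), (CID2, CCI2)
   list that M signals out-of-band to R before streaming.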

   In what follows, it is assumed that the sender and receiver agree
   out-of-band on the e2e cryptographic context parameters to use.

   It is similarly assumed that sender-middlebox and middlebox-receiver,
   respectively, agree on the hbh cryptographic context.


4.5.  SRTP SaF Processing

4.5.1.  Sender

   The sender SHALL first, out-of-band, establish the necessary CIDs,
   CCIs and hbh context parameters with the SaF middlebox as discussed
   above.  The rest of sender's processing is identical to [RFC3711]
   with the following exceptions and extensions.

   S1  In analogy with step 1 of [RFC3711], the sender SHALL determine
       both the hbh context and the e2e context as discussed in Sections
       4.4.1 and 4.4.2.  Next, and prior to performing step 2 of
       [RFC3711], the sender SHALL perform steps S2-S6 as defined below.

   S2  The sender SHALL from the e2e master key and master salt
       determine the e2e session key(s)/salt as discussed in
       Section 4.7.3.

   S3  The sender SHALL next apply the e2e encryption transform as
       described in Section 4.7.2.

   S4  The sender SHALL next apply the e2e authentication transform as
       described in Section 4.7.2, applying the e2e session key(s)/salt
       of step S2 to the result of S3 concatenated with the PUV and the
       SSS.

   S5  The sender SHALL then form the e2e protected portion of the SRTP
       SaF packet by concatenating the result of S3, the PUV, the SSS
       and the tag from S4.

   S6  The sender adds the CCI (if used), see also Figure 2.

   The rest of the sender's processing conforms to [RFC3711], steps 2-8,
   by treating the result of S6 as the part to be encrypted ("encrypted
   portion" of [RFC3711]) and using the hbh context.
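
   Steps S4-S6 can be sketched in Python as shown below.  Several
   points are assumptions made for illustration: the function names are
   hypothetical, field lengths are taken to be multiples of 8 bits, the
   tag is the leftmost n_tag bits of HMAC-SHA1 (as in [RFC3711]), and
   the CCI is simply appended after the tag, whereas the actual field
   order is given by Figure 2.  The input enc_payload stands for the
   output of step S3; the encryption itself is omitted since the Python
   standard library does not provide AES.

```python
import hashlib
import hmac


def build_e2e_portion(enc_payload: bytes, puv: int, sss: int, k_a: bytes,
                      n_puv: int = 24, n_sss: int = 0,
                      n_tag: int = 80) -> bytes:
    """Steps S4-S5: MAC over (payload || PUV || SSS), then append tag."""
    puv_bytes = puv.to_bytes(n_puv // 8, "big")
    sss_bytes = sss.to_bytes(n_sss // 8, "big")   # empty if n_sss == 0
    covered = enc_payload + puv_bytes + sss_bytes
    tag = hmac.new(k_a, covered, hashlib.sha1).digest()[: n_tag // 8]
    return covered + tag


def add_cci(e2e_portion: bytes, cci: int, n_cci: int) -> bytes:
    """Step S6: attach the CCI, which stays outside the e2e MAC."""
    return e2e_portion + cci.to_bytes(n_cci // 8, "big")
```

   Because the CCI is added only after the tag has been computed, a SaF
   middlebox can later replace it without invalidating the e2e MAC.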

4.5.2.  SaF Middlebox

   SaF middleboxes do not have access to the e2e contexts and may even
   be unaware of their definition.  Hence, "context" in this section
   refers to standard [RFC3711] cryptographic contexts, which in turn
   agrees with the hbh contexts defined herein.

   Generally, the SaF middlebox SHALL first, out-of-band, establish the
   necessary CIDs, CCIs, and hbh context parameters with the source or
   destination as discussed above.


4.5.2.1.  Acting as Receiver ("Store")

   MR1  When receiving media from a sender, the SaF middlebox SHALL
        retrieve the correct context and process the packet exactly
        according to the receiver behavior of [RFC3711].

   MR2  The SaF middlebox SHALL store sufficient information to later be
        able to map the correct content to the intended receiver, e.g.
        e2e context, the CID, or the intended receiver's identity (ID).
        ID format and usage is otherwise out of scope for this
        specification, but could, e.g., be retrieved during the session
        establishment.

   MR3  The SaF middlebox SHALL store information sufficient to later
        reconstruct the e2e protected portion of the packets
        (corresponding to Figure 2) and to allow the receiver to
        uniquely identify the correct e2e context, e.g. by storing the
         CID or the e2e context.  Note that information from RTCP SRs is
         used for synchronization between streams, e.g. in a multimedia
         video/audio session.  Such information also has to be stored by
         the SaF middlebox.

4.5.2.2.  Acting as Sender ("Forward")

   MS1  When forwarding media to the receiver, the SaF middlebox SHALL
        retrieve the correct hbh context as specified in [RFC3711].

   MS2  A payload SHALL be formed consisting of the e2e protected
        portion and, if used, the CCI.

   MS3  The SaF middlebox SHALL then add an RTP header and process the
        packet exactly according to the sender behavior of [RFC3711]
        using the retrieved context.  As noted above, certain
        information from RTCP messages, originating from the sender
        (e.g.  RTCP SRs), may also need to be forwarded.  These (and
        other RTCP messages) SHALL be processed according to the SRTCP
        specification of [RFC3711].

4.5.2.3.  Multiple SaF Middleboxes

   When more than one SaF middlebox is present, we consider a pair of
   adjacent SaF middleboxes M1 and M2, where M1 forwards media to M2.

   M1 SHALL act as if M2 was the (final) receiver for the media by
   providing M2 CIDs, CCIs, and hbh protected packets, i.e. according to
   Section 4.5.2.2.

   M2 SHALL act as a SaF middlebox receiver (Section 4.5.2.1).


4.5.3.  Receiver

   R1  The receiver SHALL first, out-of-band, establish the necessary
       CIDs, CCIs and hbh context parameters with the SaF middlebox as
       discussed above.

   R2  Steps 1 to 8 of [RFC3711] SHALL then be applied, using the hbh
       context to perform hbh processing.

   The remainder of the processing concerns the e2e protection.  The
   result after performing the hbh authentication check and decryption
   as described above MAY be stored at the receiver for later
   application of the e2e processing.  If so, the receiver MUST store
   the e2e protected portion and the CCI in order to be able to perform
   the further steps as described below.

   R3  The receiver SHALL next determine the e2e context as discussed in
       Section 4.4.2.  (In case the CCI was NOT used or NOT encrypted by
       the hbh transform, the receiver MAY determine the e2e context
       already in step R1.)

   R4  The receiver SHALL determine the e2e session encryption/
       authentication key(s) as described in Section 4.7.3 using the e2e
       master key and salt.

   R5  The receiver SHALL verify authentication and decrypt the e2e
       protected portion as specified by the e2e transform(s), see
       Section 4.7.2.  If the result of authentication is "FAILURE", the
       packet MUST be discarded from further processing and the event
       SHOULD be logged.  Note that there is no replay protection for
       the e2e context (see Section 6.5).

   R6  The receiver removes PUV, SSS, e2e MAC, and CCI as appropriate.
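
   The authentication check of step R5 and the field removal of step R6
   can be sketched as follows.  The sketch assumes that the CCI has
   already been stripped, that field lengths are multiples of 8 bits,
   and that the tag is the leftmost n_tag bits of HMAC-SHA1; the
   function name verify_e2e_portion is hypothetical.  Decryption of the
   returned payload is not shown.

```python
import hashlib
import hmac
from typing import Tuple


def verify_e2e_portion(portion: bytes, k_a: bytes, n_puv: int = 24,
                       n_sss: int = 0,
                       n_tag: int = 80) -> Tuple[bytes, int, int]:
    """Verify the e2e MAC, then split off the PUV and SSS fields.

    Returns (encrypted payload, PUV, SSS); raises ValueError on failure.
    """
    tag_len, sss_len, puv_len = n_tag // 8, n_sss // 8, n_puv // 8
    if len(portion) < tag_len + sss_len + puv_len:
        raise ValueError("e2e protected portion too short")
    cut = len(portion) - tag_len
    covered, tag = portion[:cut], portion[cut:]
    expected = hmac.new(k_a, covered, hashlib.sha1).digest()[:tag_len]
    if not hmac.compare_digest(tag, expected):
        # "FAILURE": the caller discards the packet and logs the event.
        raise ValueError("e2e authentication failed")
    cut = len(covered) - sss_len
    body, sss_bytes = covered[:cut], covered[cut:]
    cut = len(body) - puv_len
    enc_payload, puv_bytes = body[:cut], body[cut:]
    return (enc_payload, int.from_bytes(puv_bytes, "big"),
            int.from_bytes(sss_bytes, "big"))
```

   The constant-time comparison via hmac.compare_digest is a design
   choice of this sketch and is not mandated by the text above.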

4.5.3.1.  Replay Protection

   For reasons discussed in Appendix A, it is in general not meaningful
   or desirable to provide application independent replay protection for
   the e2e part.  Some of the identified use cases make this clear by
   having a requirement that the receiver should be able to jump back/
   forward in the e2e media stream.  See Section 6.5 for security
   considerations.

4.6.  Use of SRTCP with SRTP SaF

   SRTCP protection SHALL be provided hbh, conforming to [RFC3711], and
   SHALL NOT be provided e2e, as this covers most/all use cases
   currently identified.  Further RFCs may specify additional e2e
   functionality for SRTCP SaF.

   As noted, it may still be beneficial to forward information from some
   of the inbound RTCP messages (from S to M) in the outbound RTCP (from
   M to R).  Also note that it may in general not be possible for the
   SaF middlebox to reproduce RTCP reports accurately reflecting the
   ongoing SaF hbh session.  For instance, since the e2e encryption
   hides any possible RTP padding, there may be a discrepancy between
   sender's byte counts on the S-M and M-R links, respectively.  After
   decryption at R, however, the correct values will be possible to
   reconstruct.

4.7.  Cryptographic Transforms

   We define a set of SRTP SaF transforms.  Note that SaF middleboxes do
   not need to support any cryptographic transform outside what is
   already defined in [RFC3711].

4.7.1.  Default hbh Transforms

   The hbh protection may reuse any of the existing SRTP transforms such
   as those defined in the original specification [RFC3711], or,
   transforms that have been added later.  By default the NULL
   encryption algorithm, the HMAC-SHA1 authentication algorithm and the
   AES-CM pseudo-random function SHALL be used.

4.7.2.  Default e2e Transforms

   The sender SHALL first apply the e2e encryption transform and then
   the e2e authentication transform.

   The default e2e encryption transform SHALL be AES Counter Mode as
   specified in [RFC3711], Section 4.1.1, with the following
   modification.  Instead of forming the initialization vector as
   defined in [RFC3711], the IV SHALL be formed as:

             IV = (k_s * 2^16) XOR (SSS * 2^64) XOR (PUV * 2^16)

   where k_s is the session salting key (derived from the e2e master key
   and salt, see Section 4.7.3) and where SSS and PUV are the SSS/PUV
   fields from the packet.  The PUV is a counter, initially set to zero
   and then increasing by one (1) for each packet.  The default size of
   the PUV SHALL be 24 bits and the maximum allowed size for AES counter
   mode SHALL be 48 bits.  If the SSS field is not present, the value 0
   (zero) SHALL be used.  The default size of the SSS SHALL be zero (not
   present) and the maximum size for AES counter mode SHALL be 64 bits.
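
   The IV formula can be evaluated with plain integer arithmetic, as in
   the Python sketch below.  The function name e2e_iv is hypothetical,
   and k_s is assumed to be a 112-bit session salt as in [RFC3711].
   With the stated maximum sizes, PUV * 2^16 occupies bits 16-63 and
   SSS * 2^64 occupies bits 64-127, so the two fields never overlap.

```python
def e2e_iv(k_s: int, puv: int, sss: int = 0,
           n_puv: int = 24, n_sss: int = 0) -> bytes:
    """IV = (k_s * 2^16) XOR (SSS * 2^64) XOR (PUV * 2^16), 128 bits."""
    if not 0 <= n_puv <= 48 or not 0 <= n_sss <= 64:
        raise ValueError("field length exceeds the AES-CM maximum")
    if not 0 <= puv < (1 << n_puv) or not 0 <= sss < (1 << n_sss):
        raise ValueError("PUV or SSS does not fit its configured length")
    if not 0 <= k_s < (1 << 112):
        raise ValueError("session salt must fit in 112 bits")
    iv = (k_s << 16) ^ (sss << 64) ^ (puv << 16)
    return iv.to_bytes(16, "big")
```

   The low-order 16 bits of the IV are always zero; as in [RFC3711],
   they are left for the block counter of AES Counter Mode.  When n_sss
   is 0, the only accepted SSS value is 0, matching the rule that the
   value zero is used if the field is not present.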

   The key used SHALL be the session encryption key k_e (derived from
   the e2e master key and salt, see Section 4.7.3).

   The default e2e authentication transform SHALL be HMAC-SHA1 as
   defined in [RFC3711], Section 4.2.1, with the difference that it
   SHALL be applied to the e2e protected portion, excluding the e2e MAC
   field itself.  Note also that the e2e MAC SHALL NOT be applied to the
   CCI field.  The resulting MAC tag SHALL be inserted in the e2e MAC
   field.

   The key used SHALL be the session authentication key k_a (derived
   from the e2e master key and salt, see Section 4.7.3).

   The default e2e pseudo-random function SHALL be AES-CM as defined in
   [RFC3711], Section 4.3.3.

4.7.3.  Session Key Derivation

4.7.3.1.  Session Keys for hbh Processing

   For the hbh security processing, session key derivation SHALL be done
   exactly as in [RFC3711] using the hbh master key and hbh salt.

4.7.3.2.  Session Keys for e2e Processing

   For the e2e security processing the key derivation is also identical
   to [RFC3711] with the following exceptions:

   o  The e2e master key and e2e salt SHALL be used together with the
      defined labels of [RFC3711] for derivation of the different keys.

   o  The key derivation rate SHALL be zero.
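
   With a key derivation rate of zero, the input to the pseudo-random
   function depends only on the label and the master salt.  The Python
   sketch below computes that input block following the key derivation
   of [RFC3711], Section 4.3.1 (key_id = label || r, x = key_id XOR
   master_salt, input x * 2^16).  The function name is hypothetical, a
   112-bit master salt is assumed, and the salt used in the usage note
   is an arbitrary example value.  The subsequent AES-CM keystream
   generation under the e2e master key is not shown.

```python
LABEL_ENCRYPTION = 0x00       # derives the session encryption key k_e
LABEL_AUTHENTICATION = 0x01   # derives the session authentication key k_a
LABEL_SALT = 0x02             # derives the session salting key k_s


def kdf_input_block(master_salt: bytes, label: int) -> bytes:
    """Return the 128-bit PRF input for one label when kdr = 0."""
    if len(master_salt) != 14:
        raise ValueError("sketch assumes a 112-bit master salt")
    if not 0 <= label <= 0xFF:
        raise ValueError("label must fit in 8 bits")
    r = 0                              # index DIV kdr is 0 when kdr = 0
    key_id = (label << 48) | r         # label || r, with r being 48 bits
    x = key_id ^ int.from_bytes(master_salt, "big")
    return (x << 16).to_bytes(16, "big")
```

   For example, with master salt 0EC675AD498AFEEBB6960B3AABE6 and label
   0x01 the function returns 0EC675AD498AFEEAB6960B3AABE60000: only
   the octet aligned with the label changes (EB becomes EA).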


5.  SRTP SaF Default Parameters

   The default hbh parameters are identical to [RFC3711].

   The default e2e parameters for master and session key lengths are the
   same as in [RFC3711] with the differences in transform definition as
   defined above and the following additional exception.

   o  Replay window size: N/A (or 0).

   We also add the following additional default parameters:

   o  n_PUV: 24 bits.


   o  n_SSS:  0 bits (not used)

   o  n_CCI:  0 bits (not used)

5.1.  Adding Future e2e Transforms

   Adding transforms for the hbh protection SHALL follow the existing
   guidelines of [RFC3711].  Indeed, any current (or future, as far as
   we can see) transform specification for SRTP is applicable for usage
   with the hbh protection.

   To add an e2e transform, the accompanying specification MUST, besides
   specifying the cryptographic operations, define the format and usage
   of the PUV field and, if used, of the SSS field.  An authentication
   transform MUST define how the e2e MAC is computed and MUST NOT
   include the CCI field in the authentication coverage.

   It is STRONGLY RECOMMENDED that, when separate transforms are used
   for encryption and authentication, the sender SHALL first apply the
   e2e encryption transform and then the e2e authentication transform.
   When a combined (data encapsulation) transform is used, the order of
   processing is typically built in to the transform.


6.  Security Considerations

6.1.  General

   Though it may seem that there are quite a few differences between the
   cryptography and key management used in [RFC3711] and the
   corresponding functions defined here, the differences are actually
   smaller than one may think and the security considerations turn out
   to be essentially equivalent.

   As noted, a problem of SRTP in SaF applications is the transforms'
   dependence on the SSRC.  The SSRC is part of IV formation and crypto
   context identification in [RFC3711].

   In this specification three new in-band parameters, PUV, CCI, and
   SSS, are specified.  Note that CCI and SSS are used in exactly the
   same way the SSRC is used in [RFC3711]: context identification (CCI)
   and IV formation (SSS).  Basically, one can think of the CCI as the
   e2e context identifier and the SSS as a substitute for the SSRC.  The
   SSS (when used) is e2e protected.


6.2.  Keystream Reuse

   A main concern of [RFC3711] is to avoid keystream reuse.  This
   concern is present also here.

   The currently defined encryption transforms are additive stream
   ciphers which are sensitive to keystream reuse.  It is therefore
   RECOMMENDED that each session utilizes random, cryptographically
   independent e2e and hbh keys.

   When sender and receiver share an e2e key it may be convenient to
   reuse the key for several e2e sessions/messages via the SaF
   middlebox.  For the predefined e2e encryption transform such reuse
   will only be secure if the sender and receiver keep state to prohibit
   reuse of IVs.  Another situation when key reuse may be beneficial is
   if sender and receiver use the SaF middlebox in a "chat-like" fashion
   (with bi-directional communication using the same keys in both
   directions).  In this case there may be a risk that a message in one
   direction (e.g.  "A-to-B") reuses keystream of some message in the
   other direction ("B-to-A").  Again, with the default transform this
   REQUIRES that IVs in one direction are never reused in the opposite
   direction.

   Unique IVs MAY be assured by placing a requirement on the
   implementation of the sender to ensure that unique SSS values are
   used each time the same e2e master key is reused.  For the
   bidirectional case (as well as for the more general case where a
   group key is used as e2e master key), some out-of-band signaling that
   assures that end-points use distinct SSSs is, as mentioned, REQUIRED.
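
   The effect of keystream reuse can be seen in a few lines of Python.
   The keystream below is a fixed stand-in chosen for illustration; a
   real one would be produced by AES Counter Mode under the session key
   and IV.  The two plaintexts are likewise arbitrary examples.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings of equal length."""
    return bytes(x ^ y for x, y in zip(a, b))


# Stand-in keystream, reused because key and IV are reused.
keystream = bytes(range(16))
p1 = b"attack at dawn!!"
p2 = b"retreat at nine."
c1 = xor_bytes(p1, keystream)
c2 = xor_bytes(p2, keystream)

# The keystream cancels out: an eavesdropper holding only c1 and c2
# obtains p1 XOR p2 without knowing any key.
leak = xor_bytes(c1, c2)
```

   Anyone who additionally knows or guesses one plaintext recovers the
   other, since xor_bytes(leak, p1) equals p2.  Distinct SSS or PUV
   values change the IV and thereby avoid this situation.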

   The situation is essentially equivalent to that of SRTP.  As noted in
   the security considerations of [RFC3711], keys may be reused (with
   the predefined transforms) if (and only if) unique SSRC values can be
   guaranteed.  As noted, the SSS and CCI values defined here for SaF
   SRTP basically take the place of the SSRC in that they serve exactly
   the same two purposes: being part of the crypto context identifier
   and providing unique IVs.

   Due to the risks of misuse, reuse of master keys between sessions is,
   just as in [RFC3711], therefore NOT RECOMMENDED.

6.3.  Attacks on CCIs

   The CCI values are not e2e integrity protected as they only exist
   hbh.

   SaF middleboxes are semi-trusted which implies that they are assumed
   to (at least) forward data as requested by the sender/receiver.  A
   middlebox providing incorrect CCI mappings therefore falls outside
   the trust model.  Nevertheless, even if a SaF middlebox is malicious
   beyond our assumptions, there seem to be only two attacks that such
   a middlebox can launch, and these attacks are either detectable or
   have only DoS effects.  First, using incorrect CCI mappings, the
   middlebox may attempt to claim to the receiver that data comes from
   another sender.  This will be detected by the (RECOMMENDED) use of
   the e2e authentication.  Second, the middlebox could simply make
   data unintelligible for the receiver by providing "random"
   incorrect CCI mappings, causing the receiver either to be unable to
   process the e2e protected media or to process it using an incorrect
   context/keys, thereby producing "garbage" after decryption of the
   e2e part.

   An outsider (non-middlebox) may attempt to modify CCIs.  This would
   be a DoS attack which would be mitigated if hbh integrity is used as
   the CCI is then integrity protected on each hop where it might be
   exposed.  Modifications of the hbh integrity protected messages would
   still result in a DoS attack, since the messages would be dropped by
   the receiver.  However, the DoS effect is limited in that "garbage"
   does not even reach the e2e protection stage.  An attack where
   messages are simply blocked/dropped by the attacker would cause more
   or less the same effect.  Use of hbh integrity is RECOMMENDED as it
   also protects the SaF middleboxes from filling up storage space with
   junk.

6.4.  Authentication and Authorization

   For reasons already discussed, it is RECOMMENDED that middleboxes
   using this SaF specification authorize senders (typically involving
   authentication) before accepting messages to be stored/forwarded.
   Similarly, it is RECOMMENDED that the middleboxes authorize/
   authenticate the receiver before delivering data.  While the content
   is protected by keys supposedly only known to the receiver, this
   provides extra protection if the e2e keys have fallen into the wrong
   hands, and it also keeps the SaF middlebox from wasting resources on
   responding to spoofed requests.  E2e authentication between sender
   and receiver is achieved by applying authentication/integrity to the
   e2e protected portion and is also RECOMMENDED.

6.5.  Replay Protection

   Replay protection is provided on an hbh basis by use of a hbh
   transform including message authentication.  It is RECOMMENDED to use
   hbh message authentication as it protects from outsiders attempting
   to change the order of packets.

   Since some scenarios considered make it reasonable to expect that
   the receiver may wish to jump (fast-forward or rewind) in the e2e
   protected media flow, it is not meaningful to strictly enforce replay
   protection on an e2e basis.  Note however that our trust model
   assumes that the SaF middleboxes are trusted enough not to attempt to
   replay or reorder media unless the receiver so requests.

   It is however still possible (and RECOMMENDED) to provide e2e
   authentication of the packets in combination with inclusion of a
   sequence number in the PUV (as the default e2e transform does).  It
   then becomes infeasible even for the SaF middlebox to fake the
   relative association between a particular packet and its sequence
   number.  This means that the receiver will be able to detect a replay
   that occurs without the receiver actually having requested it.

6.6.  Key Management Considerations

   Key management is outside the scope of this specification which is an
   intentional design choice in order not to introduce any dependency on
   using a specific key management scheme.  Nevertheless, some
   considerations need to be highlighted and taken into account when
   deploying this specification in practice.

   To implement the targeted trust model, the main concern is that the
   e2e keys MUST be independent from the hbh keys.  In other words
   knowledge of any hbh key MUST NOT reveal non-trivial information
   about any e2e key.

   This can be achieved by ensuring that key management for hbh and e2e
   protection is carried out independently using fresh, random and
   independent keys each time.  This is the RECOMMENDED approach.

   Another alternative which may be attractive in some cases is to use
   the slightly weaker notion of cryptographic independence.  Here, the
   hbh keys MAY be derived from the e2e keys by applying a sufficiently
   strong pseudo-random function.
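
   One possible shape of such a derivation is sketched below in Python.
   HMAC-SHA256 is used here as the pseudo-random function purely as an
   assumption of the sketch; this specification does not name one.  The
   function name, the "SaF hbh key" prefix, and the per-hop label are
   likewise hypothetical.

```python
import hashlib
import hmac


def derive_hbh_master_key(e2e_master_key: bytes, hop_label: bytes,
                          length: int = 16) -> bytes:
    """Derive an hbh master key from the e2e master key via a PRF."""
    if not 0 < length <= 32:
        raise ValueError("length must be between 1 and 32 octets")
    data = b"SaF hbh key" + hop_label   # domain separation per hop
    return hmac.new(e2e_master_key, data, hashlib.sha256).digest()[:length]
```

   The derivation runs in one direction only: a middlebox holding the
   derived hbh key would have to invert the pseudo-random function to
   learn anything about the e2e key, which is the property required
   above.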

   Even if hbh keys are random and independent each time, it is still
   RECOMMENDED that e2e keys are not cached/reused (see above discussion
   on keystream reuse).

6.7.  Privacy

   In order for a SaF middlebox to deliver the correct media (produced
   with the correct e2e context) to the receiver, some SaF applications
   may choose to store information regarding the identity of the sender.
   Such a middlebox will be able to deduce that communication is taking
   place between the sender and the receiver.


   To enhance privacy, senders/receivers may use agreed pseudonyms or
   other similar Privacy Enhancing Techniques (PETs).  One such
   technique is to use (random) CIDs to identify media and to follow the
   recommendation and assign random CCIs on each hop where CCIs are
   used.  Complete anonymity may be in conflict with the requirement
   that the SaF middlebox needs protection from flooding by garbage or
   other forms of unwanted traffic.

   When hbh encryption is configured, additional protection against
   third-party traffic analysis is provided since the CCIs are
   encrypted.

6.8.  RTCP Considerations

   As specified, RTCP is only protected on a hbh basis.  This is
   motivated by the assumption that a SaF middlebox is indeed a true
   store-and-forward entity (as opposed to performing a more
   intelligent function).  The inbound and outbound RTP sessions are
   then different, and RTCP reports only on the current RTP session.
   As noted, though, it may still be useful to forward, e.g., sender
   reports to the receiver using hbh RTCP protection.


7.  Acknowledgements

   The authors would like to thank Magnus Westerlund for his support and
   valuable comments.  We are also grateful to Eric Rescorla for his
   feedback when reviewing version 00 of this draft.


8.  IANA Considerations

   To signal that the new transforms are used, each relevant key
   management protocol needs to register the new transforms including
   numbering scheme and syntax with IANA.


9.  References

9.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3711]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
              Norrman, "The Secure Real-time Transport Protocol (SRTP)",
              RFC 3711, March 2004.


9.2.  Informative References

   [RFC4383]  Baugher, M. and E. Carrara, "The Use of Timed Efficient
              Stream Loss-Tolerant Authentication (TESLA) in the Secure
              Real-time Transport Protocol (SRTP)", RFC 4383,
              February 2006.


Appendix A.  Use Cases

   In the use cases below, we map the entities to the trust model of
   Section 3.2 by indicating which entity corresponds to S, M, and R.

A.1.  Streaming Pre-encrypted Media

   A content creator (S) wants to distribute high value content to
   clients (R).  The content provider distributes the media via a
   streaming server (M) which should not have access to cleartext media,
   typically because it is not trusted by the content creator.

A.2.  Recording Encrypted Media at Home

   High value encrypted media (e.g., IPTV and radio) is broadcast in a
   network.  Only clients trusted by the content creator (S) have
   access to the encryption key.  A user (R) is recording the media on
   a Hard Disk Drive (M), but either does not yet have a license or has
   a license that does not allow cleartext copying.  The media is
   therefore stored in protected format on the HDD.

A.3.  Answering Machine

   Operators commonly provide an answering machine service to their
   customers.  In this case the communicating parties (S and R) may not
   wish to disclose the media to any other party, and hence want to
   apply encryption between each other.  The answering machine (M) acts
   as a SaF middlebox, which has to store encrypted data and retransmit
   it to the callee.

   In this use case it is also likely that several callers leave
   messages protected by different e2e keys.  As discussed in the SRTP
   SaF specification, the receiver and SaF middlebox may agree to use a
   single hbh context into which the different e2e contexts are
   multiplexed using the CCI.

A.4.  Media Rewind

   Common to the use cases above is the possible desire to be able to
   rewind or jump forward in the media stream.  For instance, a user may
   wish to listen once again to a message left in a voice mail without
   terminating and reinitiating the session with the SaF middlebox.


Appendix B.  Test Vector

   The parameters are chosen to be typical for a voice call.  A frame
   size of 32 bytes is used by AMR 12.2 (Adaptive Multi-Rate), and with
   20 ms speech frames, a PUV length of 24 bits corresponds to 93.2 h
   of speech.  A 32-bit MAC offers good integrity protection for a
   voice call.  The 16-bit SSS is not typical but is included to make
   the test vector more general.
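
   The 93.2 h figure can be checked with a short calculation.  The
   Python sketch below assumes one packet per 20 ms speech frame and a
   PUV that takes each of its possible values once; the constant names
   are illustrative.

```python
# Parameters taken from the test vector description.
N_PUV_BITS = 24   # length of the Packet Unique Value in bits
FRAME_MS = 20     # duration of one speech frame in milliseconds

# Assumption: one packet per frame, each PUV value used exactly once.
packets = 2 ** N_PUV_BITS
hours = packets * FRAME_MS / 1000 / 3600

print(f"{packets} packets, {hours:.1f} h")  # prints "16777216 packets, 93.2 h"
```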

   Encryption algorithm:                     AES-CM
   Authentication algorithm:                 HMAC-SHA-1
   Pseudo Random Function:                   AES-CM

   n_e (encr session key length):            128 bits
   n_a (auth session key length):            160 bits
   n_s (session salt key length):            112 bits
   n_PUV (Packet Unique Value length):        24 bits
   n_SSS (SRTP SaF Source length):            16 bits
   n_tag (Authentication tag length):         32 bits

   The values below are in hexadecimal.

   e2e Master key (128 bits)
   00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f

   e2e Master salt (112 bits)
   40 41 42 43 44 45 46 47 48 49 4a 4b 4c 4d

   e2e Session encryption key (128 bits)
   12 ed 05 3a f7 8c 9a f2 96 5c 64 26 f4 d1 56 23

   e2e Session authentication key (160 bits)
   73 0c 3c ac 1d 75 27 36 91 97 d4 ab c2 b4 6b 46
   cd e0 19 83

   e2e Session salting key (112 bits)
   eb 31 d1 cb af 09 68 cd 14 f2 2b be 35 18

   Packet Unique Value (24 bits)
   80 81 82

   SRTP SaF Source (16 bits)
   c0 c1


   Payload (256 bits)
   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

   e2e Protected Portion (328 bits)
   82 37 69 bd f8 9c f3 61 57 e4 3d 74 b7 e6 07 4b
   05 80 52 ec 7d 68 72 63 b2 e1 10 ae b9 7b 7c a0
   80 81 82 c0 c1 bd ab 1e f6
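
   As a consistency check, the protected portion above can be split
   into fields using the lengths listed in this appendix.  The field
   order used in the Python sketch below (encrypted payload, PUV, SSS,
   authentication tag) is inferred from the bytes of the vector itself,
   and the variable names are illustrative.  Because the payload is all
   zeros and AES-CM XORs a keystream onto the plaintext, the first 32
   bytes are also the keystream.

```python
# Field lengths in bytes, from the parameters of this test vector.
PAYLOAD_LEN, PUV_LEN, SSS_LEN, TAG_LEN = 32, 3, 2, 4

protected = bytes.fromhex(
    "82 37 69 bd f8 9c f3 61 57 e4 3d 74 b7 e6 07 4b"
    "05 80 52 ec 7d 68 72 63 b2 e1 10 ae b9 7b 7c a0"
    "80 81 82 c0 c1 bd ab 1e f6"
)

# 32 + 3 + 2 + 4 bytes = 41 bytes = 328 bits.
assert len(protected) == PAYLOAD_LEN + PUV_LEN + SSS_LEN + TAG_LEN
assert len(protected) * 8 == 328

# Assumed field order: ciphertext, then PUV, then SSS, then tag.
ciphertext = protected[:PAYLOAD_LEN]
trailer = protected[PAYLOAD_LEN:]
puv = trailer[:PUV_LEN]
sss = trailer[PUV_LEN:PUV_LEN + SSS_LEN]
tag = trailer[PUV_LEN + SSS_LEN:]

print(puv.hex(), sss.hex(), tag.hex())  # prints "808182 c0c1 bdab1ef6"
```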


Authors' Addresses

   Rolf Blom
   Ericsson Research
   SE-164 80 Stockholm
   Sweden

   Phone: +46 10 71 31 707
   Email: rolf.j.blom@ericsson.com


   Yi Cheng
   Ericsson Research
   SE-164 80 Stockholm
   Sweden

   Phone: +46 10 71 17 589
   Email: yi.cheng@ericsson.com


   Fredrik Lindholm
   Ericsson AB
   SE-164 80 Stockholm
   Sweden

   Phone: +46 10 71 31 705
   Email: fredrik.lindholm@ericsson.com


   John Mattsson
   Ericsson Research
   SE-164 80 Stockholm
   Sweden

   Phone: +46 10 71 43 501
   Email: john.mattsson@ericsson.com


   Mats Naslund
   Ericsson Research
   SE-164 80 Stockholm
   Sweden

   Phone: +46 10 71 33 739
   Email: mats.naslund@ericsson.com


   Karl Norrman
   Ericsson Research
   SE-164 80 Stockholm
   Sweden

   Phone: +46 10 71 44 502
   Email: karl.norrman@ericsson.com

