R. Mahy
Internet Draft                                            Cisco Systems
Document: draft-mahy-sip-signaled-digits-00.txt                Feb 2001
Expires: Aug, 2001


                         Signaled Digits in SIP


Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026 [RFC2026].

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as Internet-
   Drafts. Internet-Drafts are draft documents valid for a maximum of
   six months and may be updated, replaced, or obsoleted by other
   documents at any time. It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.


1. Abstract

   This document describes a way for interested SIP User Agents that
   are not a party to the media of a call or session to receive SIP
   event notifications when signaled digits or other specific
   telephony-related events are detected.  This is useful for a variety
   of applications that monitor calls for a specific event (e.g., a
   long pound, a special sequence of digits, or a fax signal) and--only
   then--take an active role in the monitored calls.


2. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and "OPTIONAL" in
   this document are to be interpreted as described in RFC-2119
   [RFC2119].

   Throughout this document, the author refers to "DTMF" and other
   "tones" as audio media.  Similar types of information conveyed as
   signaling are called "digits" or "signals".  This convention is
   consistent with RFC2833 [AVT], which provides a more detailed
   discussion of the issue.



 Mahy                     Expires: Aug 2001                          1
                         SIP Signaled Digits


3. Motivational text

   RFC2833 "AVT Tones" is widely acknowledged within the IP telephony
   community as the best way to transport telephony-related tones
   between end systems which already terminate media using [RTP].  This
   approach maintains synchronization of speech audio with tone audio,
   tolerates loss, provides event duration and volume information, and
   avoids detection delay.

   Some applications are interested in the telephony signals
   represented by these tones, but would not otherwise be a party to
   the audio media.  This proposal addresses the transport requirements
   of these signals in this context.  Synchronizing speech is a non-
   issue in these topologies, as there is no audio media with which to
   synchronize; SIP provides its own reliability mechanism to prevent
   loss; and since this proposal reuses the encoding specified in
   [AVT], volume and duration are preserved, and detection delay is
   minimized.

   For example, in some application scenarios, a user contacts an
   application, places a new call in the context of the application,
   and returns to the application after the new call is finished.
   Examples of such scenarios include calling card systems, voicemail
   or messaging systems that allow outgoing calls, and voice browsers
   or voice portals that allow outgoing calls.

   All of these applications require a way for the user to get back to
   the application if something has gone wrong with the outgoing call
   (ex: wrong number), or if the user changes his or her mind.  If the
   originating user is using a TDM telephone, or a simple IP endpoint,
   the application will typically expect a sequence of signaled digits
   (ex: a pound or hash (#) of long duration, three stars (*) in a row,
   etc.).


                    +-------------+
                    |             |
                    | Originating |
                    |    User     |
                    |             |
                    +-------------+
                     |         ^ ^
                               | |
              NOTIFY |     SIP | | RTP
                               | |
                     |         | |
                     v         v v
         +-------------+      +-------------+
         |             |      |             |
         | Waiting for |      | Target User |
         |   trigger   |      |  or Service |
         |             |      |             |
         +-------------+      +-------------+

   Below are several possible [SIP] topologies that would enable this
   type of behavior.  Most of these approaches fall into two
   categories: the application could receive DTMF media corresponding
   to the signaled digits, or it could receive the signaled digits
   using SIP.

   Below are three approaches to encoding this information as media.
   None of these approaches are very attractive.

   - The application could relay all the media itself. This wastes
   network resources and is inefficient for the application.

   - The application could set up a conference and INVITE itself to the
   conference.  This method requires setting up a complex set of call
   legs and wastes network and conferencing resources.

   - The application could request "forked-media" [Forked-Media] of
   just the RFC2833 media.  While this is the best media-related
   proposal, the method requires rather complex functionality in the
   "forking" UAs, requires [3pcc], and is problematic for firewalls
   because of the complexity of the [SDP].  Also, experience at
   interoperability tests showed that most current SDP implementations
   are much less robust than their SIP counterparts.

   This draft will summarize a few non-media approaches as well:

   - The application could expect to receive [INFO] messages containing
   a representation of the signaled digits.  There are a number of
   disadvantages to this method as well: a) it requires 3pcc, b) there
   is no way to turn the INFO messages on or off, c)  simultaneous use
   of AVT tones and INFO may cause double detection of events.  For
   these reasons, using INFO to carry signaled digits is now
   deprecated.

   - The application could ask the originating UA to execute a script
   (ex: in [Java]) or render a markup language (ex: [VoiceXML]) to
   watch for an event and transfer the call back to the application.
   This is a very elegant solution, but requires significant resources
   and implementation on each UA.

   - This proposal: the application SUBSCRIBEs [SUB/NTFY] to signaled
   digits on the originating UA, and then receives the signaled digits
   in corresponding NOTIFYs.  Although this requires wide deployment on
   UAs, it is fairly easy to implement and works with both 3pcc and
   fully distributed call control models.

   While this proposal only provides examples using signaled digits, it
   could be used to detect other telephony-related signals (for example
   FAX signals, or call progress signals).

   This proposal would also allow for a clean decomposition of some
   services into media and signaling components.  For example, below is
   a diagram of a VoiceXML browser split into media and non-media
   handling parts.


                    +-------------+
                    |             |
                    | VoiceXML    |
                    | Interpreter |
                    | (signaling) |
                    +-------------+
                      ^          ^
                      |          |
                  SIP |          | [RTSP]
                      |          |
                      |          |
                      v          v
         +-------------+        +-------------+
         |             |        |             |
         |  SIP UA     |   RTP  | RTSP Server |
         |             |<------>|   (media)   |
         |             |        |             |
         +-------------+        +-------------+


4. Formal Grammar

   The following syntax specification uses the augmented Backus-Naur
   Form (BNF) as described in RFC-2234 [BNF].

   This proposal adds a new Event type to the SUBSCRIBE/NOTIFY Event
   header.

        "Event" ":" SP "telephone-event" *( ";" tparams )

        tparams        = event-param / rate-param / duration-param
        event-param    = "events" "=" numbers *( "," numbers )
        rate-param     = "rate" "=" num
        duration-param = "duration" "=" num
        numbers        = range / num
        range          = num "-" num
        num            = 1*DIGIT
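
   As an illustration only, the grammar above could be parsed along
   the following lines.  This is a minimal sketch in Python, not part
   of the proposal; the function names parse_numbers and
   parse_event_header are hypothetical, and the sketch assumes the
   header name and the colon have already been stripped from the
   value.

```python
def parse_numbers(spec):
    """Expand an 'events' value such as '0-3,11' into a sorted list of ints."""
    events = set()
    for item in spec.split(","):
        item = item.strip()
        if "-" in item:
            # A range is written as num "-" num and is inclusive.
            low, high = (int(part) for part in item.split("-", 1))
            if low > high:
                raise ValueError("bad range: " + item)
            events.update(range(low, high + 1))
        else:
            events.add(int(item))
    return sorted(events)


def parse_event_header(value):
    """Parse the value of an Event header (the text after 'Event:')."""
    parts = [part.strip() for part in value.split(";")]
    if parts[0] != "telephone-event":
        raise ValueError("not a telephone-event subscription")
    params = {}
    for part in parts[1:]:
        name, _, raw = part.partition("=")
        name = name.strip()
        raw = raw.strip()
        if name == "events":
            params["events"] = parse_numbers(raw)
        elif name in ("rate", "duration"):
            params[name] = int(raw)
        else:
            # Hypothetical policy: reject parameters the grammar does not list.
            raise ValueError("unknown parameter: " + name)
    return params
```

   For example, parse_event_header("telephone-event;duration=2000")
   yields a dictionary holding only the duration parameter.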


5. Example of usage

   The example below shows a typical scenario used for calling cards.
   The Application acts as both an ordinary UA and as a 3pcc
   controller.


     Original         App          Target
       UA                           UA
        |              |             |
        |--INVITE----->|             |
        |<---200-------|             |
        |----ACK------>|             |
        |              |             |
        |<===RTP======>|             |
        |              |             |
        |  ..time..    |             |
        |              |             |
        |<--SUBSCRIBE--|             |
        |----200------>|             |
        |              |             |
        |              |--INVITE---->|
        |              |<---180------|
        |<--reINVITE---|<---200------|
        |----200------>|             |
        |<---ACK-------|----ACK----->|
        |              |             |
        |<=========RTP==============>|
        |              |             |
        |  ..time..    |             |
        |              |             |
        |---NOTIFY---->|             |
        |<---200-------|             |
        |              |             |
        |<--reINVITE---|----BYE----->|
        |-----200----->|<---200------|
        |<----ACK------|             |
        |              |             |
        |<====RTP=====>|             |
        |              |             |



   SUBSCRIBE sip:gateway.itsp.net SIP/2.0
   Call-Id: 100@gateway.itsp.net
   To: <sip:service@asp.com>
   From: <tel:+14085554000>;tag=abcd
   CSeq: 1 SUBSCRIBE
   Event: telephone-event;duration=2000
   Expires: 3600
   Content-Length: 0


   NOTIFY sip:
   Call-Id: 100@gateway.itsp.net
   To: <tel:+14085554000>;tag=abcd
   From: <sip:service@asp.com>;tag=efgh
   CSeq: 5 NOTIFY
   Event: telephone-event;rate=1000
   Content-Type: audio/telephone-event
   Content-Length: 4

   0x0B0F0800


    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     event     |E|R|  volume   |         duration              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   event=11, end=0, reserved=0, volume=15, duration=2048

   A "#" tone at -15 dBm0 volume has been in progress for 2.048 seconds
   (2048 units of duration at a sampling rate of 1000Hz).
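
   The field layout above can be unpacked mechanically.  The
   following Python sketch is illustrative only; the function name
   decode_telephone_event is hypothetical, and the sample bytes were
   chosen to encode the field values listed above (event=11, end=0,
   volume=15, duration=2048).  The rate is passed in explicitly
   because the body itself does not carry it.

```python
import struct


def decode_telephone_event(payload, rate_hz):
    """Decode a 4-byte telephone-event body into its fields."""
    if len(payload) != 4:
        raise ValueError("expected a 4-byte telephone-event payload")
    # Network byte order: 8-bit event, 8-bit flags/volume, 16-bit duration.
    event, flags, duration = struct.unpack("!BBH", payload)
    return {
        "event": event,
        "end": bool(flags & 0x80),         # E bit
        "reserved": (flags >> 6) & 0x01,   # R bit
        "volume": flags & 0x3F,            # 6-bit volume
        "duration": duration,              # in units of the rate
        "milliseconds": duration * 1000 // rate_hz,
    }


fields = decode_telephone_event(bytes.fromhex("0B0F0800"), rate_hz=1000)
```

   With a rate of 1000Hz, each duration unit is one millisecond, so
   the decoded duration of 2048 corresponds to 2048 milliseconds.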


6. Detailed Behavior

6.1 Behavior of application (Subscriber)

   First the application SUBSCRIBEs to the signaled digits.  It SHOULD
   only SUBSCRIBE to events if it is no longer a party to the media.
   Because it is impossible to perfectly synchronize the SUBSCRIBE and
   the reINVITE or transfer necessary for the application to divest
   itself of the media, it is sufficient for the application to ignore
   AVT media which it receives while SUBSCRIBEd to signaled digits.

   The application MAY specify an event mask.  The specified event mask
   MUST at least include the default events: 0-15.

   The application MUST NOT specify a rate.  The subscriber MUST accept
   any rate specified by the notifier.


   The subscriber MAY specify a "minimum" duration in milliseconds.
   The subscriber can be assured that it will not receive any
   notifications for events which have been in progress for less than
   this duration.  The subscriber MUST still accept "final" events (the
   end bit is set) with shorter durations.

   If the application receives a NOTIFY for an event in the range 0-15,
   it SHOULD verify that the volume parameter is less than or equal to
   36 (at least -36 dBm0).  It MAY accept a DTMF event with a volume as
   large as 55 (-55 dBm0).  For all other events, the volume MUST be
   zero.

   Most applications will only act either on interim events (end bit is
   zero), or on final events (end bit is one), but not both.  For
   example, an application which watches for "***" would look for the
   "*" event (10), the end bit equal to zero, and a duration greater
   than or equal to 40ms based on the current rate.  It could ignore
   all final events.
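
   The "***" example above might be implemented roughly as follows.
   This Python sketch is illustrative only: the names
   volume_acceptable and TripleStarDetector are hypothetical, the
   fields are assumed to have been decoded from the NOTIFY body
   already, and the choice to reset the count on any other interim
   event is an assumption rather than something this document
   specifies.

```python
STAR = 10      # event code for "*"
MIN_MS = 40    # minimum interim duration used in the example above


def volume_acceptable(event, volume, lenient=False):
    """Apply the volume checks described for the subscriber."""
    if 0 <= event <= 15:
        # DTMF events: up to 36 normally, up to 55 if the application is lenient.
        return volume <= (55 if lenient else 36)
    # All other events must carry a volume of zero.
    return volume == 0


class TripleStarDetector:
    """Counts consecutive interim "*" events and fires on the third."""

    def __init__(self):
        self.stars = 0

    def on_notify(self, event, end, volume, duration, rate_hz):
        """Return True when the third consecutive "*" has been seen."""
        if end:
            return False  # this application ignores all final events
        if not volume_acceptable(event, volume):
            return False  # assumption: drop the event without resetting
        duration_ms = duration * 1000 // rate_hz
        if event == STAR and duration_ms >= MIN_MS:
            self.stars += 1
        else:
            self.stars = 0  # assumption: anything else breaks the sequence
        if self.stars == 3:
            self.stars = 0
            return True
        return False
```

   Because the duration is converted using the rate supplied with
   each NOTIFY, the detector behaves the same whichever rate the
   notifier chooses.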



6.2 Behavior of the originating UA (Notifier)

   The notifier MUST send a NOTIFY when it detects either the end of a
   subscribed event, or the continuation of a subscribed event for a
   sufficient duration.  The notifier SHOULD NOT send events outside
   the subscribed event mask.  The default event mask is 0-15.

   The default rate for this application is 1000Hz. The duration
   parameter in a SUBSCRIBE indicates the number of milliseconds a
   signal must exist before it should be reported with NOTIFY.  The
   default duration is 40ms regardless of the sample rate.

   If the notifier detects that an event has begun and continued for at
   least the subscribed duration, it MUST send a NOTIFY for that event.
   The notifier SHOULD NOT wait for the end of the event.  If the
   notifier detects that an event has ended, it MUST send a NOTIFY for
   that event, even if that event previously generated a NOTIFY, and
   even if the event was shorter than the minimum duration requested.
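
   The notification rules above can be summarized as a small state
   machine kept per subscription.  The Python sketch below is
   illustrative only; the class name EventTracker and the dictionary
   it returns are hypothetical, and sending the interim NOTIFY only
   once per event is an assumption.  The defaults of 40ms and events
   0-15 are taken from the text above.

```python
class EventTracker:
    """Decides when a notifier should send a NOTIFY for one subscription."""

    def __init__(self, min_duration_ms=40, event_mask=range(0, 16)):
        self.min_duration_ms = min_duration_ms
        self.event_mask = set(event_mask)
        self.interim_sent = False

    def on_progress(self, event, elapsed_ms):
        """Called while an event is still in progress."""
        if event not in self.event_mask or self.interim_sent:
            return None
        if elapsed_ms >= self.min_duration_ms:
            # Do not wait for the end of the event.
            self.interim_sent = True
            return {"event": event, "end": False, "duration_ms": elapsed_ms}
        return None

    def on_end(self, event, elapsed_ms):
        """Called when the event ends; always reported if subscribed."""
        if event not in self.event_mask:
            return None
        self.interim_sent = False
        return {"event": event, "end": True, "duration_ms": elapsed_ms}
```

   A final event is reported even when an interim NOTIFY was already
   sent, and even when the event was shorter than the subscribed
   minimum duration.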

   In a NOTIFY, the notifier MAY specify any rate.  If so, the duration
   returned in the body of the NOTIFY MUST be in units of the
   specified rate.

   The notifier MUST NOT send the same event three times, as is done
   for AVT conveyed in RTP.  SIP provides its own redundancy mechanism,
   and without the timestamp header of the RTP packet available in SIP,
   there would be no way to determine if these were duplicate events.

   Note that multiple applications may subscribe to signaled digits
   (possibly with different parameters) for the same call
   simultaneously.  A practical example is a calling card call to a
   voicemail application during an outcall.  The calling card
   application may wait for a long pound, while the messaging system
   waits for a different sequence.




6.3 Simple Implementation on IP phones

   IP phones only generate DTMF for compatibility with the PSTN.  The
   concepts of volume and minimum duration in this context are
   irrelevant.  Therefore, a simple IP phone MAY a) only support events
   zero through eleven (most phones do not have keys for ABCD), b)
   always set the volume to zero, c) only use the default rate, and d)
   never send an event shorter than 40ms.  Long key presses (ex: 2
   seconds) MUST still be correctly detected and reported.

   Accept or refuse SUBSCRIBE messages according to local authorization
   policy.  For example, always accept messages from your SIP peer for
   an active call.  Ignore any parameters in the subscriptions.

   When key activity occurs, check if there are any subscriptions which
   correspond to the active "line".  If so, send a NOTIFY (to each
   subscriber for this call-id) once when a "DTMF keypad" key is
   depressed (set the duration to 40ms).  Also, send a NOTIFY with the
   end bit set, and the approximate duration of the keypress when the
   key is released.  In both cases, always use the default rate of
   1000Hz, and set the volume to zero.
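
   A simple phone could build the two NOTIFY bodies for a key press
   as sketched below.  This Python fragment is illustrative only; the
   names KEY_EVENTS and encode_key_event are hypothetical, the rate
   is passed in explicitly rather than assumed, and clamping very
   long key presses to the 16-bit duration field is an assumption.

```python
import struct

# Keypad keys mapped to event codes: digits 0-9, then "*" and "#".
KEY_EVENTS = {str(digit): digit for digit in range(10)}
KEY_EVENTS.update({"*": 10, "#": 11})


def encode_key_event(key, end, duration_ms, rate_hz):
    """Build the 4-byte body for a key press (end=False) or release (end=True)."""
    event = KEY_EVENTS[key]
    # Convert milliseconds to units of the rate; clamp to the 16-bit field.
    units = min(duration_ms * rate_hz // 1000, 0xFFFF)
    flags = 0x80 if end else 0x00  # volume is always zero for a simple phone
    return struct.pack("!BBH", event, flags, units)


# Key "#" pressed: one interim event reported with a 40ms duration.
pressed = encode_key_event("#", end=False, duration_ms=40, rate_hz=1000)
# Key "#" released after about two seconds: final event with the end bit set.
released = encode_key_event("#", end=True, duration_ms=2000, rate_hz=1000)
```

   The same body would be sent to each subscriber for the call, one
   NOTIFY per subscription.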

   This description does not intend to limit implementation to
   physical telephones with a "DTMF keypad".

6.4 Behavior of Proxy servers

   Proxy servers MUST be able to forward SUBSCRIBE and NOTIFY methods.


7. Security Considerations

   Signaled Digits may convey private information such as PINs, credit
   card numbers, or account numbers.  UAs SHOULD authenticate these
   subscriptions.  In addition, UAs are encouraged to encrypt this
   information using a suitable mechanism as available in SIP (e.g.
   [PGP]).




8. References


   [SIP] M. Handley, E. Schooler, and H. Schulzrinne, "SIP: Session
   Initiation Protocol", RFC2543, Internet Engineering Task Force,
   Nov 1998.

   [SDP] M. Handley and V. Jacobson, "SDP: session description
   protocol," Request for Comments 2327, Internet Engineering Task
   Force, April 1998.


   [AVT]  H. Schulzrinne, S. Petrack, "RTP Payload for DTMF Digits,
   Telephony Tones and Telephony Signals", RFC2833, Internet
   Engineering Task Force, May 2000.

   [RTP]  Schulzrinne, H., Casner, S., Frederick, R. and V. Jacobson,
   "RTP:  A Transport Protocol for Real-Time Applications", RFC 1889,
   January 1996.

   [SUB/NTFY] Adam Roach, "Event Notification in SIP", Internet Draft
   <draft-roach-sip-subscribe-notify-03.txt>, IETF; January 2001.
   Work in progress.

   [RTSP] H. Schulzrinne, A. Rao, and R. Lanphier, "Real time streaming
   protocol (RTSP)," RFC2326, Internet Engineering Task Force, Apr.
   1998.

   [3pcc] J. Rosenberg, J. Peterson, H. Schulzrinne, "Third Party Call
   Control in SIP", Internet Draft <draft-rosenberg-sip-3pcc-01.txt>,
   IETF; Nov. 2000.  Work in progress.


   [INFO] S. Donovan, "The SIP INFO method," Request for Comments 2976,
   Internet Engineering Task Force, Oct. 2000.

   [Java] J. Gosling, B. Joy, G. Steele, "The Java Language
   Specification," Addison Wesley, 1996.

   [VoiceXML] VoiceXML Forum, "Voice extensible markup language
   (voicexml) version 1.00," VoiceXML forum specification, VoiceXML
   Forum, Mar. 2000.

   [PGP] D. Atkins, W. Stallings, and P. Zimmermann, "PGP message
   exchange formats," Request for Comments 1991, Internet Engineering
   Task Force, Aug. 1996.

   [RFC2026] S. Bradner, "The Internet Standards Process -- Revision 3",
   RFC2026 (BCP), IETF, October 1996.

   [RFC2119] S. Bradner, "Key words for use in RFCs to indicate
   requirement levels," Request for Comments (Best Current Practice)
   2119, Internet Engineering Task Force, Mar. 1997.

   [BNF] D. Crocker, P. Overell, "Augmented BNF for Syntax
   Specifications: ABNF", RFC2234, IETF, Nov 1997.


10.  Acknowledgments

   Funding for the RFC Editor is currently provided by the Internet
   Society.



11. Author's Addresses

   Rohan Mahy
   Cisco Systems
   170 West Tasman Dr, MS: SJC-21/3
   Phone: +1 408 526 8570
   Email: rohan@cisco.com


Full Copyright Statement

   Copyright (C) The Internet Society (date). All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph
   are included on all such copies and derivative works. However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.