Network Working Group                                         J. Klensin
Internet-Draft                                          October 19, 2003
Expires: April 18, 2004


                        A Name Munging Protocol
                   draft-klensin-name-munging-00.txt

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that other
   groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on April 18, 2004.

Copyright Notice

   Copyright (C) The Internet Society (2003). All Rights Reserved.

Abstract

   As one works on internationalization issues for DNS, email, and other
   protocols, it becomes clear that the various encodings and
   transformations required, while not intrinsically difficult, can be
   an impediment to rapid conversion of applications to international
   form and to rapid prototyping of new applications.  This document
   proposes a new, lightweight protocol that applications can use to
   perform such conversions, rather than incorporating the needed
   tables and algorithms into each application.








Klensin                  Expires April 18, 2004                 [Page 1]

Internet-Draft          A Name Munging Protocol             October 2003


Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
   2.  The Protocol . . . . . . . . . . . . . . . . . . . . . . . . .  3
   2.1 Inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . .  3
   2.2 Initial List of Encodings  . . . . . . . . . . . . . . . . . .  4
   2.3 Outputs  . . . . . . . . . . . . . . . . . . . . . . . . . . .  4
   2.4 Reply codes  . . . . . . . . . . . . . . . . . . . . . . . . .  4
   3.  Examples . . . . . . . . . . . . . . . . . . . . . . . . . . .  5
   4.  Signed Messages and Business Arrangements  . . . . . . . . . .  5
   5.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . .  5
   6.  Security Considerations  . . . . . . . . . . . . . . . . . . .  6
   7.  Acknowledgements . . . . . . . . . . . . . . . . . . . . . . .  6
       Normative References . . . . . . . . . . . . . . . . . . . . .  6
       Informative References . . . . . . . . . . . . . . . . . . . .  6
       Author's Address . . . . . . . . . . . . . . . . . . . . . . .  7
       Intellectual Property and Copyright Statements . . . . . . . .  8

1. Introduction

   A variety of new and upcoming protocols, most, but not all, of them
   associated with internationalization, require that data be presented
   in, or mapped into, encoding forms that are specialized and largely
   unique to the Internet or those protocols.  The trend arguably
   started with the introduction of quoted-printable into MIME [RFC1341]
   and has continued to more recent DNS internationalization work
   [RFC3490] and developing efforts in internationalization of
   electronic mail [I-D.hoffman-imaa].  These encodings are at least
   complex enough that testing for interoperability and accuracy is
   perceived to be needed.  Even though they are not, intrinsically,
   very hard, the process of getting the needed code incorporated and
   tested may be sufficient to discourage or delay internationalization
   of some applications, including those that are built around short
   scripts.

   This document describes a protocol -- designed for use over either
   TCP or UDP -- that can be passed short strings for conversion from
   one encoding to another.  There are various samples, testbeds, and
   web pages today that can do some of these conversions, but they are
   not general (few of them handle more than one or two conversions),
   and they are not well suited to being called from application
   implementations (regardless of whether they can be used in testing).
   The core code in those samples and tests could presumably be adapted
   to support this protocol.

2. The Protocol

   The protocol is designed to be as simple as possible, following the
   general "send packet containing one line, get another line back"
   model used in finger [RFC1288] and whois [RFC0954].  For performance,
   it is designed to be used over either UDP or TCP, as meets the needs
   of the application.  The TCP variation on the above is, obviously,
   "open a connection, send a line, remote system sends a line back and
   closes the connection".   The lines are defined as follows:

2.1 Inputs

   The input line consists of
   o  a source-indication string
   o  an ASCII space (i.e., an octet containing hex 20)
   o  a target-indication string
   o  an ASCII space
   o  a bit count, expressed as an ASCII numeral
   o  an ASCII space
   o  the source string
   The indication strings are positive integers, registered with IANA
   and described in Section 2.2, below.  These integers, and the single
   ASCII space character that follows each one and the bit count, are
   protocol elements and are not intended to be internationalized.

   The source string is a simple string of bits whose length is given
   by the bit count (with the first bit counted as one).  It will
   normally be an integral number of octets, but some special encodings
   may not permit this; any bits beyond the bit count are ignored.  For
   convenience, the bit count may be specified as an ASCII asterisk
   ("*", an octet containing hex 2A), in which case the server will
   examine the string for the first pair of octets containing,
   respectively, hex 0D and 0A (the usual CRLF convention) and consider
   the string to terminate immediately before those octets.
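
   As a non-normative illustration, the input line described above
   might be assembled as in the following Python sketch.  The function
   name build_request and its parameters are hypothetical.  The
   trailing CRLF on lines that carry an explicit bit count is an
   assumption; this document only calls for CRLF when the "*" form is
   used.  Code 0 is accepted because Section 2.2 defines it.

```python
def build_request(source_code: int, target_code: int, payload: bytes,
                  use_star: bool = False) -> bytes:
    """Assemble one input line: "<source> <target> <bits> <payload>"."""
    if source_code < 0 or target_code < 0:
        raise ValueError("encoding codes must not be negative")
    if use_star:
        # With "*", the server scans for the first CRLF, so the payload
        # itself must not contain that pair of octets.
        if b"\r\n" in payload:
            raise ValueError("payload contains CRLF; use a bit count")
        count = b"*"
    else:
        # This sketch only sends whole octets, so the bit count is
        # always a multiple of eight.
        count = str(len(payload) * 8).encode("ascii")
    header = b" ".join([
        str(source_code).encode("ascii"),
        str(target_code).encode("ascii"),
        count,
    ])
    # Assumption: the line is CRLF-terminated in both forms.
    return header + b" " + payload + b"\r\n"
```

   For instance, build_request(5, 3, "é".encode("utf-8")) asks for a
   two-octet UTF-8 string to be converted to IDNA Punycode.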

2.2 Initial List of Encodings

   As discussed below, IANA is expected to set up a registry of encoding
   codes for use in this protocol.  That list is initially:
   0  Information and debugging option.  If 0 appears as the source
      indication, the rest of the input line is ignored and the server
      returns a reply code of "000 " followed by a blank-separated list
      of the encoding codes it recognizes.  If 0 appears as the target
      indication, the input is copied to the output unchanged and
      returned, also with a reply code of 000.
   1  UCS-4
   2  Unicode (UCS-2)
   3  IDNA Punycode
   4  The IMAA encoding scheme described in [I-D.hoffman-imaa]
   5  UTF-8
   6  ISO 8859-1
   7  Unicode written as a blank-separated list of four or more digit
      codes, with the codes in ASCII digits

2.3 Outputs

   The output consists of
   o  a three-digit (ASCII) reply code (codes listed below)
   o  an ASCII space
   o  a bit count
   o  a string
   The bit count and string are as described above, but the "*"
   convention will not be used.
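
   The following non-normative Python sketch shows one way a client
   might split an output line into its parts.  The function name
   parse_reply is hypothetical.  The sketch assumes that a single ASCII
   space separates the bit count from the string, as it does on input;
   the list above does not state this explicitly.

```python
from typing import Tuple


def parse_reply(line: bytes) -> Tuple[int, int, bytes]:
    """Split an output line into (reply code, bit count, string)."""
    code_part, sep, rest = line.partition(b" ")
    if sep != b" " or len(code_part) != 3 or not code_part.isdigit():
        raise ValueError("malformed reply code")
    # Assumption: one space separates the bit count from the string.
    count_part, _sep, data = rest.partition(b" ")
    if not count_part.isdigit():
        raise ValueError("malformed bit count")
    bits = int(count_part)
    octets = (bits + 7) // 8  # round up to whole octets
    if octets > len(data):
        raise ValueError("bit count exceeds length of line")
    # Anything after the counted octets (e.g. a trailing CRLF) is
    # dropped.  If bits is not a multiple of eight, the unused bits of
    # the final octet are left for the caller to mask.
    return int(code_part), bits, data[:octets]
```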

2.4 Reply codes

   The following reply codes are specified for use in this protocol. If,
   for some reason (presumably due to a new version of the protocol on
   the server), the three-digit code returned is not listed below, only
   the first digit should be examined.  A first digit of zero indicates
   that the string returned contains either the original string or a
   recoding of it; a first digit of 5 indicates that the recoding failed
   and the string is either zero-length or contains an explanation in
   ASCII characters.
   000 String translated
   001 String not translated
   500 Service not available to you
   501 Input encoding type not recognized
   502 Output encoding type not recognized
   503 Bit count exceeds length of line
   504 No translation available, i.e., the server recognizes the input
      encoding and the output encoding, but has no mapping between them.
   505 Translation failed or input string invalid, e.g., the input
      string was not a possible example of the input encoding specified.
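
   The first-digit rule above can be sketched as follows (Python,
   non-normative).  The names KNOWN_REPLIES and interpret_reply are
   hypothetical, and the descriptions for 504 and 505 are shortened.
   Raising an error for first digits other than 0 and 5 is a choice
   made for this sketch; this document assigns them no meaning.

```python
from typing import Tuple

KNOWN_REPLIES = {
    "000": "String translated",
    "001": "String not translated",
    "500": "Service not available to you",
    "501": "Input encoding type not recognized",
    "502": "Output encoding type not recognized",
    "503": "Bit count exceeds length of line",
    "504": "No translation available",
    "505": "Translation failed or input string invalid",
}


def interpret_reply(code: str) -> Tuple[bool, str]:
    """Return (ok, description) for a three-digit reply code."""
    if len(code) != 3 or not (code.isascii() and code.isdigit()):
        raise ValueError("reply code must be three ASCII digits")
    if code[0] == "0":
        ok = True    # string is the original or a recoding of it
    elif code[0] == "5":
        ok = False   # recoding failed
    else:
        raise ValueError("no meaning defined for this first digit")
    # Unlisted codes are classified by their first digit only.
    text = KNOWN_REPLIES.get(code, "unlisted code")
    return ok, text
```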

3. Examples

   ... to be supplied...
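
   Pending those examples, the exchange described in Section 2 might
   look like the following non-normative Python sketch.  The function
   names, the timeout, and the buffer sizes are hypothetical.  No port
   has been assigned to this protocol, so the caller must supply the
   host and port.

```python
import socket


def exchange_tcp(host: str, port: int, line: bytes,
                 timeout: float = 5.0) -> bytes:
    """Open a connection, send one line, read until the server closes."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(line)
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # the server has closed the connection
                break
            chunks.append(chunk)
    return b"".join(chunks)


def exchange_udp(host: str, port: int, line: bytes,
                 timeout: float = 5.0) -> bytes:
    """Send one datagram holding the line and wait for one reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(line, (host, port))
        reply, _addr = sock.recvfrom(65535)
    return reply
```

   The UDP variant accepts a datagram from any sender, which is the
   exposure discussed in Section 6; a production client would at least
   check the source address of the reply.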

4. Signed Messages and Business Arrangements

   In today's sometimes-hostile Internet environment, two questions
   immediately arise about a protocol that is designed to be this
   simple.  One is how one tells that the returned string is the
   intended one, i.e., that it came from the designated server and that
   someone is taking responsibility for that server's results.  The
   other is how to get someone to provide this service, especially if
   it is to be called from production-scale application protocols.
   Either or both requirements might be satisfied by sending
   digitally-signed strings.  In the input (business model) case, we
   might imagine a subscription service with registered users, with the
   digital signature used to authenticate the query as coming from a
   subscriber and/or to authorize billing.  In the output case, we
   might imagine a family of certified servers (using a certification
   process that lies outside this specification) able to sign the
   responses with a key the user or application would trust.  Both of
   these issues, and the protocol changes that would be required,
   should be examined in depth before this protocol is published.

   This specification does not cover identification and location of
   appropriate servers.

5. IANA Considerations

   IANA is requested to assign a port number to this protocol.  A
   registry of encoding type indicator strings is also required, with a
   sequential integer to be assigned to each type of encoding registered
   and the list in Section 2.2 used to initialize that registry.  IANA
   is requested to accept registrations only with contact information
   and a reference that defines the encoding involved, but, since there
   is no shortage of integers, checking and evaluation of such requests
   is not required except to the degree required to prevent denial of
   service attacks on IANA itself.

6. Security Considerations

   As mentioned in Section 4, there is an attack on this protocol,
   especially when it is used over UDP, in which a response is sent to
   the client application that contains an encoding of a different
   string than the one that was submitted.  If the client uses that
   string without inspection or review, it may act on a name other than
   the one intended.  Signed strings, as discussed above, might protect
   against that problem, but only if keys are properly protected and
   verified.

7. Acknowledgements

   The author would like to express appreciation to Patrik Faltstrom and
   Leslie Daigle, who made some suggestions at an early formative stage
   of this proposal and, in particular, pointed out the desirability of
   digitally signing the strings.

Normative References

Informative References

   [I-D.hoffman-imaa]
              Hoffman, P. and A. Costello, "Internationalizing Mail
              Addresses in Applications (IMAA)", draft-hoffman-imaa-02
              (work in progress), August 2003.

   [RFC0954]  Harrenstien, K., Stahl, M. and E. Feinler, "NICNAME/
              WHOIS", RFC 954, October 1985.

   [RFC1288]  Zimmerman, D., "The Finger User Information Protocol", RFC
              1288, December 1991.

   [RFC1341]  Borenstein, N. and N. Freed, "MIME (Multipurpose Internet
              Mail Extensions): Mechanisms for Specifying and Describing
              the Format of Internet Message Bodies", RFC 1341, June
              1992.

   [RFC3490]  Faltstrom, P., Hoffman, P. and A. Costello,
              "Internationalizing Domain Names in Applications (IDNA)",
              RFC 3490, March 2003.

Author's Address

   John C Klensin
   1770 Massachusetts Ave, #322
   Cambridge, MA  02140
   USA

   Phone: +1 617 491 5735
   EMail: john-ietf@jck.com


Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   intellectual property or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; neither does it represent that it
   has made any effort to identify any such rights. Information on the
   IETF's procedures with respect to rights in standards-track and
   standards-related documentation can be found in BCP-11. Copies of
   claims of rights made available for publication and any assurances of
   licenses to be made available, or the result of an attempt made to
   obtain a general license or permission for the use of such
   proprietary rights by implementors or users of this specification can
   be obtained from the IETF Secretariat.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights which may cover technology that may be required to practice
   this standard. Please address the information to the IETF Executive
   Director.


Full Copyright Statement

   Copyright (C) The Internet Society (2003). All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph are
   included on all such copies and derivative works. However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assignees.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.