DNS Extensions Working Group                                  G. Barwood
Internet-Draft                                                          
Intended status: Experimental                          17 September 2009
Expires: March 2010


                            DNS Transport     
               draft-barwood-dnsext-dns-transport-06

Status of this Memo
  
   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on March 18, 2010.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.

Abstract

   This document describes an experimental transport protocol for DNS.
   IP fragmentation is avoided, and blind spoofing, amplification
   attacks and other denial of service attacks are prevented.  Latency
   for a typical DNS query is a single round trip, after a setup
   exchange that establishes a long-term shared secret.  No per-client
   server state is required between transactions.  The protocol may
   have other applications.


Barwood                    Expires March 2010                   [Page 1]

Internet-Draft                DNS Transport               September 2009


Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3

   2.  Requirements . . . . . . . . . . . . . . . . . . . . . . . . .  4

     2.1 Fragmentation. . . . . . . . . . . . . . . . . . . . . . . .  4

     2.2 Spoofing . . . . . . . . . . . . . . . . . . . . . . . . . .  4
    
     2.3 Server state . . . . . . . . . . . . . . . . . . . . . . . .  4

     2.4 Amplification attacks  . . . . . . . . . . . . . . . . . . .  4

     2.5 Packet re-transmission . . . . . . . . . . . . . . . . . . .  4

     2.6 Performance  . . . . . . . . . . . . . . . . . . . . . . . .  4

   3.  Protocol . . . . . . . . . . . . . . . . . . . . . . . . . . .  5

     3.1 Overview   . . . . . . . . . . . . . . . . . . . . . . . . .  5

     3.2 Setup request  . . . . . . . . . . . . . . . . . . . . . . .  5

     3.3 Setup response . . . . . . . . . . . . . . . . . . . . . . .  5

     3.4 Initial request  . . . . . . . . . . . . . . . . . . . . . .  6

     3.5 Server response : single page  . . . . . . . . . . . . . . .  6

     3.6 Server response : multi page . . . . . . . . . . . . . . . .  7

     3.7 Follow-up request  . . . . . . . . . . . . . . . . . . . . .  8

     3.8 Error response . . . . . . . . . . . . . . . . . . . . . . .  9

     3.9 Congestion control . . . . . . . . . . . . . . . . . . . . .  9

   4.  Security Considerations  . . . . . . . . . . . . . . . . . . . 10

   5.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 10

   6.  Acknowledgments  . . . . . . . . . . . . . . . . . . . . . . . 10

   7.  Normative References . . . . . . . . . . . . . . . . . . . . . 10

   A.  Implementation of Cookies  . . . . . . . . . . . . . . . . . . 11

   B.  Anycast considerations . . . . . . . . . . . . . . . . . . . . 12

   Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 12




1.  Introduction

DNSSEC implies that DNS responses may be large, possibly larger than the
de facto ~1500 byte internet MTU. 

Large responses are a challenge for DNS transport. EDNS [RFC2671] was
introduced in 1999 to allow larger responses to be sent over UDP;
previously DNS/UDP was limited to 512 bytes.

EDNS is problematic for several reasons:

(1) It allows amplification attacks against third parties. DNS/UDP has
always been susceptible to these attacks, but EDNS has increased the
amplification factor by an order of magnitude.

(2) Responses larger than the path MTU must be fragmented. IP specifies
a means by which large packets are split into fragments and then
re-assembled. However, fragmented UDP responses are undesirable for
several reasons:

  o Fragments may be spoofed. The DNS ID and port number are only
  present in the first fragment, and the IP ID may be easy for an
  attacker to predict.

  o In practice fragmentation is not reliable, and large UDP packets may
  fail to be delivered.

  o If a single fragment is lost, the entire response must be re-sent.

  o Re-assembling fragments requires buffer resources, which opens
  up denial of service attacks.

Instead, it is possible to use TCP, but this is undesirable, as TCP 
imposes increased latency and significant server state that may be
vulnerable to denial of service attack. In addition, support for TCP
is not universal.

Nearly all current DNS traffic is carried by UDP with a maximum size of
512 bytes, and relying on TCP is a risk for the deployment of DNSSEC.

Therefore a new protocol to solve these problems is proposed.

2. Requirements

2.1 Fragmentation

As described in the introduction, fragmentation is undesirable.
However, fragmentation is unavoidable if the path MTU is too small.
Therefore, we require only that fragmentation does not occur provided
the actual path MTU is at least the MTU sent by the client.

2.2 Spoofing

Blind spoofing attacks must be prevented.

2.3 Server state

No per-client server state should be needed between transactions.

2.4 Amplification attacks

Amplification attacks against third parties must be prevented.

2.5 Packet re-transmission

Only those IP packets that are lost need to be re-transmitted.
This reduces problems due to network congestion.

2.6 Performance

Each transaction ( for moderate response sizes ) must be performed
in 1 RTT, after setup, provided that no IP packets are lost.

3. Protocol

3.1 Overview

Communication is in two stages. First a long-lived SERVERTOKEN is
acquired by the client, using a standard DNS lookup. Subsequent
queries are sent using a different port, and are protected by
the SERVERTOKEN.

Throughout, DNS Payload refers to a DNS Message [RFC1035], not 
including the 16-bit ID field. All numbers are unsigned integers,
with the first bit being the most significant.

3.2 Setup request

The client acquires a SERVERTOKEN for a given Server IP address by
sending a special question to the server, with

   QTYPE  = 0xFF4C (65356)
   QCLASS = IN
   QNAME  = <Client Secret>.DNS.TRANSPORT.LOCAL

where <Client Secret> is a secret label chosen to prevent spoofing
of the response.
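
As an illustration, the setup question could be built as in the
following Python sketch. The helper name setup_qname and the choice of
a 16 hex digit secret label are assumptions of this example; the
protocol only requires that the label be hard for an attacker to guess.

```python
import secrets

SETUP_QTYPE = 0xFF4C                  # QTYPE from section 3.2 (65356)
SETUP_SUFFIX = "DNS.TRANSPORT.LOCAL"  # fixed suffix from section 3.2


def setup_qname(secret_bytes: int = 8) -> str:
    """Return <Client Secret>.DNS.TRANSPORT.LOCAL with a random label.

    The label is hex text, two digits per random byte (assumed encoding).
    """
    label = secrets.token_hex(secret_bytes)
    return f"{label}.{SETUP_SUFFIX}"
```

The client would remember the label it sent, and accept a setup
response only if the question section echoes the same name.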

3.3 Setup response

The server returns a record with type 0xFF4C, which contains one or 
more strings, each a byte count followed by binary data (similar to
a TXT record). One of the strings has format : 

    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |       1       |           SPORT               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                          SERVERTOKEN                          |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

where :

SPORT         is a 16 bit UDP port number, to which requests are sent.

SERVERTOKEN   is a 32 bit value computed as a hash of the client IP
              Address and a long-term server secret. 
      
The TTL should be at least 1 day. The client associates SERVERTOKEN,
SPORT and the client IP address ( for multi-homed clients ) with the
Server IP address.
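
The following Python sketch shows one way a client might search the
record data for the string described above. It assumes that the string
is exactly 7 bytes long (the byte 1, SPORT, SERVERTOKEN) and that
other strings can be skipped; the function name is hypothetical.

```python
import struct
from typing import Optional, Tuple


def parse_setup_rdata(rdata: bytes) -> Optional[Tuple[int, int]]:
    """Return (SPORT, SERVERTOKEN), or None if no string matches."""
    pos = 0
    while pos < len(rdata):
        length = rdata[pos]                      # byte count of this string
        item = rdata[pos + 1:pos + 1 + length]
        if len(item) < length:
            return None                          # truncated record data
        # Assumed: the transport string is 7 bytes and starts with 1.
        if length == 7 and item[0] == 1:
            _, sport, token = struct.unpack("!BHI", item)
            return sport, token
        pos += 1 + length
    return None
```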

If no record is returned (normally with an NXDOMAIN error), or none
of the strings has the correct format, the server does not support
this protocol. This fact should be cached, using the TTL of any record
returned, or for at least 1 day if no record is returned.




3.4 Initial request

To make a DNS request, a UDP packet is sent to SPORT, with format:

    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                            QUERYID                            |
    |                                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                          SERVERTOKEN                          |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    \                             DATA                              \
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |              MTU              |     COUNT     |       1       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

where :

QUERYID        is a 64-bit value that identifies the request.

SERVERTOKEN    is a copy of SERVERTOKEN from the setup response.

DATA           is the DNS payload.

MTU            is a 16-bit number that limits the size of the IP packets
               used to send the response. Must be at least 576 bytes.

COUNT          is an 8-bit number that limits the number of pages the
               server will send.

Note: the various types of packet are distinguished by the last byte.
This is to allow header fields to be aligned on 32-bit boundaries.
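
A minimal Python sketch of packing an initial request follows. The
function names and the default values for MTU and COUNT are
assumptions chosen for illustration only.

```python
import struct


def dns_payload(message: bytes) -> bytes:
    """Strip the 16-bit ID from a DNS message (see section 3.1)."""
    return message[2:]


def build_initial_request(query_id: int, server_token: int, payload: bytes,
                          mtu: int = 1500, count: int = 4) -> bytes:
    """Pack QUERYID, SERVERTOKEN, DATA, MTU, COUNT and the type byte 1."""
    if mtu < 576:
        raise ValueError("MTU must be at least 576 bytes")
    header = struct.pack("!QI", query_id, server_token)   # 64 + 32 bits
    trailer = struct.pack("!HBB", mtu, count, 1)          # type is last
    return header + payload + trailer
```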

3.5 Server response : single page

The server checks SERVERTOKEN, and divides the DNS payload into equal
size pages, so that the size of each IP packet is not greater than
MTU.  Servers should use a smaller MTU if the path MTU is known to be
less than the MTU supplied by the client.
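
The division into pages might look like the Python sketch below. It
assumes IPv4 without options (20 bytes) plus the UDP header (8 bytes),
and counts the fixed fields of the response formats in sections 3.5
and 3.6. All pages have the same size except possibly the last. The
names are hypothetical.

```python
IP_UDP_OVERHEAD = 20 + 8                  # assumed: IPv4, no options, + UDP
SINGLE_OVERHEAD = 8 + 1                   # QUERYID + type byte
MULTI_OVERHEAD = 8 + 4 + 4 + 4 + 2 + 1    # QUERYID, TOTAL, COOKIE,
                                          # COUNT/PAGE, PAGESIZE, type


def split_into_pages(payload: bytes, mtu: int):
    """Return (page_size, pages) so that no IP packet exceeds mtu."""
    if mtu < 576:
        raise ValueError("MTU must be at least 576 bytes")
    room = mtu - IP_UDP_OVERHEAD
    if len(payload) <= room - SINGLE_OVERHEAD:
        return len(payload), [payload]          # fits in a single page
    page_size = room - MULTI_OVERHEAD
    pages = [payload[i:i + page_size]
             for i in range(0, len(payload), page_size)]
    return page_size, pages
```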

If there is only one page, the UDP response packet has format :
      
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                            QUERYID                            |
    |                                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    \                             DATA                              \
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |       2       |
    +-+-+-+-+-+-+-+-+

where :

QUERYID        is a copy of QUERYID from the request.

DATA           is the DNS payload.

The client uses DATA as the normal DNS response.

3.6 Server response : multi page

If there is more than one page, each UDP response packet has format :

    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                            QUERYID                            |
    |                                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                             TOTAL                             |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                            COOKIE                             |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |    COUNT      |                     PAGE                      | 
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    \                             DATA                              \  
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |           PAGESIZE            |       3       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

where :

QUERYID      is a copy of QUERYID from the request.

TOTAL        is the size of the complete DNS payload.

COOKIE       is used to request further pages ( see section 3.7 ).

COUNT        is the number of pages sent.

PAGE         is the 0-based number of this page.

DATA         is part of the DNS payload.

PAGESIZE     is the size into which the response has been divided.

The client allocates an assembly buffer of TOTAL bytes (if not already
allocated), and copies DATA into it at offset PAGE x PAGESIZE.
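
The assembly step can be sketched in Python as follows. The class
name and its attributes are assumptions of this example, PAGESIZE is
taken to be a 16-bit field as in section 3.7, and a real client would
also match QUERYID against its outstanding requests.

```python
import struct


class Assembly:
    """Reassembles one multi-page response (packet type 3)."""

    def __init__(self) -> None:
        self.buffer = None       # bytearray of TOTAL bytes, made lazily
        self.page_size = 0
        self.cookie = 0
        self.received = set()    # page numbers copied so far

    def add_packet(self, packet: bytes) -> None:
        if len(packet) < 23 or packet[-1] != 3:
            raise ValueError("not a multi-page response")
        _qid, total, cookie, count_page = struct.unpack("!QIII", packet[:20])
        (page_size,) = struct.unpack("!H", packet[-3:-1])
        page = count_page & 0xFFFFFF     # low 24 bits; top 8 are COUNT
        data = packet[20:-3]
        if self.buffer is None:
            self.buffer = bytearray(total)
            self.page_size, self.cookie = page_size, cookie
        offset = page * page_size        # PAGE x PAGESIZE
        if page_size == 0 or offset + len(data) > len(self.buffer):
            raise ValueError("page does not fit in the assembly buffer")
        self.buffer[offset:offset + len(data)] = data
        self.received.add(page)

    def missing_pages(self) -> list:
        """Pages not yet received; call after at least one add_packet."""
        pages = -(-len(self.buffer) // self.page_size)  # ceiling division
        return [p for p in range(pages) if p not in self.received]
```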















3.7 Follow-up request

If the client does not receive a page, either because of packet loss
or because the server did not send all pages, it sends a request with
format :

    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                            QUERYID                            |
    |                                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                            COOKIE                             |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                          SERVERTOKEN                          |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    \                            RANGES                             \
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    \                             DATA                              \
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |           PAGESIZE            |    NRANGE     |       4       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

where :

QUERYID      identifies the request.

COOKIE       is a copy of COOKIE from the server response.

SERVERTOKEN  is a copy of the SERVERTOKEN from the setup response.

RANGES       is an array of ranges. Each range is an 8 bit count
             field (the number of pages in the range) followed by a
             24-bit page number which specifies the first page in the
             range.

DATA         is a copy of DATA from the initial request.

PAGESIZE     is a copy of PAGESIZE from the server response.

NRANGE       is the number of ranges.

The server response is the same as in section 3.6. 

Once a client has received all pages, it processes the complete
assembled response as normal.
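
A Python sketch of building a follow-up request for a set of missing
pages is given below. Both function names are hypothetical, and
merging consecutive pages into a single range is a choice made by this
example rather than a requirement.

```python
import struct


def pages_to_ranges(pages):
    """Collapse sorted page numbers into (first_page, count) ranges."""
    ranges = []
    for p in pages:
        last = ranges[-1] if ranges else None
        if last and last[0] + last[1] == p and last[1] < 255:
            last[1] += 1                 # extend the current range
        else:
            ranges.append([p, 1])        # start a new range
    return [(first, count) for first, count in ranges]


def build_follow_up(query_id, cookie, server_token, missing, payload,
                    page_size):
    """Pack a follow-up request (type byte 4) for the missing pages."""
    ranges = pages_to_ranges(sorted(missing))
    if len(ranges) > 255:
        raise ValueError("NRANGE is an 8-bit field")
    body = struct.pack("!QII", query_id, cookie, server_token)
    for first, count in ranges:
        # 8-bit count followed by the 24-bit number of the first page
        body += struct.pack("!I", (count << 24) | first)
    body += payload                      # DATA from the initial request
    body += struct.pack("!HBB", page_size, len(ranges), 4)
    return body
```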













3.8 Error response

If the server encounters an error condition, it sends an error
response, with format :

    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                            QUERYID                            |
    |                                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |     ERRNUM    |       5       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

where :

QUERYID      is a copy of QUERYID from the request.

ERRNUM       encodes the error condition, with values

             0            Invalid SERVERTOKEN. The client should acquire
                          a new SERVERTOKEN and try again.

             1            Invalid COOKIE/PAGESIZE/PAGE. The client
                          should send a new Initial Request.
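
For illustration, a client might decode an error response as in the
Python sketch below. The function name and the wording of the
messages are assumptions of this example.

```python
import struct

ERROR_ACTIONS = {
    0: "invalid SERVERTOKEN: acquire a new SERVERTOKEN and try again",
    1: "invalid COOKIE/PAGESIZE/PAGE: send a new initial request",
}


def parse_error(packet: bytes):
    """Return (QUERYID, ERRNUM) for a type 5 packet, otherwise None."""
    if len(packet) != 10 or packet[-1] != 5:
        return None
    query_id, errnum, _type = struct.unpack("!QBB", packet)
    return query_id, errnum
```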

3.9 Congestion control

Clients should take into account estimated network performance when
requesting pages. Factors are :

  o The round trip time (RTT). 
  o The time to transmit 1 packet due to limited bandwidth (PT).

The total number of pages requested but not received (INFLIGHT)
should not be more than max( RTT / PT, 4 ) at any time.

For example, if RTT = 150 milli-seconds, the bandwidth is 300 kilobytes
per second and the packet size is 1500 bytes, then we have PT = 5 
milli-seconds and INFLIGHT = 30.
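
The calculation above can be written as the short Python sketch
below. Integer arithmetic is used to avoid rounding effects; the
function and parameter names are assumptions of this example.

```python
def inflight_limit(rtt_ms: int, bandwidth_bytes_per_s: int,
                   packet_bytes: int) -> int:
    """Return max(RTT / PT, 4), where PT = packet_bytes / bandwidth."""
    # RTT / PT = (rtt_ms / 1000) * bandwidth / packet_bytes
    ratio = (rtt_ms * bandwidth_bytes_per_s) // (1000 * packet_bytes)
    return max(ratio, 4)
```

With rtt_ms = 150, bandwidth_bytes_per_s = 300000 and packet_bytes =
1500 this gives 30, matching the example above.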

Clients must use a conservative value for INFLIGHT until proper
estimates for RTT and PT are available.

In practice for current DNS purposes, INFLIGHT can simply be set to 4,
and this should be sufficient to send the complete response in a
single round trip, assuming the MTU is 1500 bytes.

Servers may send a smaller number of pages than requested, for 
policy reasons, or if there is local congestion, etc. Ranges are 
processed in order, so the client can infer which pages have not
been sent. For example if in a follow-up request the client requests
ranges 2-3, 9-9, 4-6, and the server only sends 4 pages, the pages
sent are 2,3,9 and 4.
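
The in-order processing of ranges can be sketched in Python as
follows. Ranges are written here as (first, last) pairs to match the
example above; the function name is hypothetical.

```python
def pages_sent(ranges, limit):
    """List the pages a server sends when it stops after limit pages."""
    sent = []
    for first, last in ranges:           # ranges are processed in order
        for page in range(first, last + 1):
            if len(sent) == limit:
                return sent
            sent.append(page)
    return sent
```

Calling pages_sent([(2, 3), (9, 9), (4, 6)], 4) returns [2, 3, 9, 4],
so the client can infer that pages 5 and 6 were not sent.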




4.  Security Considerations

Fragmented responses are vulnerable to blind spoofing, therefore
fragmented responses should be avoided if possible.

A check should be made that the MTU is at least 584, to prevent an
attacker generating a large number of IP packets from a single request.

Secret values ( the long term server secret, the client secret, QUERYID
) should be generated so that an attacker cannot easily guess them, by 
using cryptographic hash functions and cryptographic random number
generators seeded from data that cannot be guessed by an attacker,
such as thermal noise or other random physical fluctuations.

The hash function used to compute SERVERTOKEN should be
cryptographically secure, although a relatively weak function may be
sufficient, since acquiring large numbers of input/output pairs in order
to deduce the long term server secret is not easy for an attacker.
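
One possible way to compute SERVERTOKEN is sketched below in Python.
The use of HMAC-SHA256 truncated to 32 bits is an assumption of this
example; this document does not mandate a particular hash function.
The function names are hypothetical.

```python
import hashlib
import hmac
import ipaddress
import struct


def server_token(client_ip: str, secret: bytes) -> int:
    """Hash the client IP address with the long-term server secret."""
    packed = ipaddress.ip_address(client_ip).packed   # 4 or 16 bytes
    digest = hmac.new(secret, packed, hashlib.sha256).digest()
    return struct.unpack("!I", digest[:4])[0]         # keep 32 bits


def token_is_valid(client_ip: str, secret: bytes, token: int) -> bool:
    """Recompute the token, so no per-client state is stored."""
    expected = struct.pack("!I", server_token(client_ip, secret))
    return hmac.compare_digest(expected,
                               struct.pack("!I", token & 0xFFFFFFFF))
```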

5.  IANA Considerations

None at present. If the protocol were to become a standard,
there could be new registries for :

   o Strings in setup response ( transport option codes ).

   o Operation codes ( 1 - 5 have been used )

   o Error codes

6.  Acknowledgments

Mark Andrews, Alex Bligh, Robert Elz, Alfred Hines, Douglas Otis,
Wouter Wijngaards and Nicholas Weaver were each instrumental in
creating and refining this specification.

7.  Normative References

[RFC1035]  Mockapetris, P., "Domain names - implementation and
           specification", STD 13, RFC 1035, November 1987.    

[RFC2671]  Vixie, P., "Extension Mechanisms for DNS (EDNS0)", RFC
           2671, August 1999.

Appendix A : Implementation of Cookies

To show how server state is avoided or limited, two possible approaches
to the implementation of cookies are shown. These are illustrative, and
actual implementations are of course free to take a different approach.

(1) The server maintains a DNS database version number, which is
incremented when the database is updated. COOKIE is simply the DNS 
database version number. The DNS database is structured so that old
queries may be replayed, with the database version number being supplied
as a parameter, or a COOKIE error is returned if the database is updated
while a transfer is in progress.

(2) The server maintains a list of recent multi-page responses:

COOKIE  DATA   ACCESSTIME
1       ....   10:25:11
2       ....   10:25:16
.....

If a response is multi-page, the list is checked to see if there is an
existing entry that can be used ( hashing techniques are used to make
the search efficient ).

Entries that have not been accessed for more than 5 seconds may be
deleted.
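
Approach (2) might be sketched in Python as below. The class and
method names, the use of a dictionary for the search, and the
injectable clock are all assumptions of this example.

```python
import time


class CookieCache:
    """Recent multi-page responses, keyed by COOKIE."""

    def __init__(self, max_idle: float = 5.0, clock=time.monotonic):
        self._max_idle = max_idle
        self._clock = clock
        self._entries = {}     # cookie -> (data, access time)
        self._by_data = {}     # data -> cookie, for reuse of entries
        self._next = 1

    def cookie_for(self, data: bytes) -> int:
        """Return the cookie of an existing entry, or create one."""
        cookie = self._by_data.get(data)
        if cookie is None:
            cookie = self._next
            self._next += 1
            self._by_data[data] = cookie
        self._entries[cookie] = (data, self._clock())
        return cookie

    def lookup(self, cookie: int):
        """Return the stored response, or None (a COOKIE error)."""
        entry = self._entries.get(cookie)
        if entry is None:
            return None
        self._entries[cookie] = (entry[0], self._clock())
        return entry[0]

    def expire(self) -> None:
        """Delete entries not accessed for more than max_idle seconds."""
        now = self._clock()
        for cookie, (data, accessed) in list(self._entries.items()):
            if now - accessed > self._max_idle:
                del self._entries[cookie]
                del self._by_data[data]
```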

Some care should be taken to ensure that on server restart, old cookie
values are not re-used. Periodically, a new range of cookies should be
issued, and the new allocation value recorded in permanent storage.
Alternatively, the server should wait 10 seconds after restarting before
issuing any cookies, or use a new long-term secret to generate
SERVERTOKENs.























Appendix B : Anycast considerations

Anycast DNS servers need to operate consistently.
There are (at least) two possibilities:

(a) Each server within the Anycast system issues distinct SERVERTOKENs.
If the Anycast routing changes, a SERVERTOKEN error occurs, and the
client restarts the query.

(b) Each server within the Anycast system has the same shared secret,
and thus issues the same SERVERTOKEN to each client. They also
issue identical responses to each other, assuming the zone version
is the same. The cookie is the zone version number. If the Anycast
routing changes and the new server does not have the required zone
version, a COOKIE error will result, and the client has to restart
the query. Such errors can be avoided by not serving a new zone
until all the Anycast servers have received a copy.

By incorporating the software version into the SERVERTOKEN, it should be
possible to smoothly update the system, effectively switching to
solution (a) while the software update is in progress.

Author's Address

   George Barwood
   33 Sandpiper Close
   Gloucester
   GL2 4LZ
   United Kingdom

   Phone: +44 452 722670
   EMail: george.barwood@blueyonder.co.uk