Network Working Group M. Kucherawy Internet-Draft June 7, 2014 Intended status: Experimental Expires: December 9, 2014 A List-safe Canonicalization for DomainKeys Identified Mail (DKIM) draft-kucherawy-dkim-list-canon-00 Abstract DomainKeys Identified Mail (DKIM) introduced a mechanism whereby a mail operator can affix a signature to a message that validates at the level of the signer's domain name. It specified two possible ways of converting the message body to a canonical form, one intolerant of changes and the other tolerant of simple changes to whitespace within the message body. The provided canonicalization schemes do not tolerate changes in a structured message such as conversion between transfer encodings or addition of new message parts. It is useful to have these capabilities to allow for transport through gateways, and also for transport through handlers (such as mailing list services) that might add content that would invalidate a signature generated using the existing canonicalization schemes. This document presents a mechanism for generating a canonicalization that can allows easy detection of modified content while still being valid for the content it originally signed. It also presents a use profile of DKIM that takes advantage of this capability. Status of This Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." This Internet-Draft will expire on December 9, 2014. Kucherawy Expires December 9, 2014 [Page 1] Internet-Draft DKIM List Canonicalization June 2014 Copyright Notice Copyright (c) 2014 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Table of Contents 1. Background . . . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . 3 3. The 'list' Canonicalization Description . . . . . . . . . . . . 3 3.1. Preparing Content . . . . . . . . . . . . . . . . . . . . . 4 4. 'The 'lh=' Signature Tag . . . . . . . . . . . . . . . . . . . 5 5. Use Profile . . . . . . . . . . . . . . . . . . . . . . . . . . 6 6. Security Considerations . . . . . . . . . . . . . . . . . . . . 6 6.1. Imported from DKIM . . . . . . . . . . . . . . . . . . . . 6 6.2. Added Content May Not Be Safe . . . . . . . . . . . . . . . 6 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 7 7.1. DKIM-Signature Canonicalization Body Registry . . . . . . . 7 7.2. DKIM-Signature Tag Specifications Registry . . . . . . . . 7 8. References . . . . . . . . . . . . . . . . . . . . . . . . . . 7 8.1. Normative References . . . . . . . . . . . . . . . . . . . 7 8.2. Informative References . . . . . . . . . . . . . . . . . . 7 Appendix A. Examples . . . . . . . . . . . . . . . . . . . . . . . 7 Appendix B. Acknowledgements . . . . . . . . . . . . . . . . . . . 8 Kucherawy Expires December 9, 2014 [Page 2] Internet-Draft DKIM List Canonicalization June 2014 1. Background DomainKeys Identified Mail [RFC6376] (DKIM) defines a mechanism whereby a verified domain name can be attached to a message, or portion of a message, using a cryptographic signature. It presents two possible schemes for converting the header block to a canonical form, and similarly two schemes for canonicalizing the body. In each case, one scheme permits no changes whatsoever, and the other permits limited changes restricted to areas such as whitespace munging, case changing, and header field wrapping. Some agents deliberately, but innocently, modify content in transit. A prime example of this is mailing lists, which might add a prefix to the Subject field of a message, add list-specific information to the header (in the form of new header fields), or append administrivia to the body of messages before they are re-mailed to the list subscribers. Use of mailing lists with respect to DKIM, and a discussion of related challenges, can be found in [RFC6377]. There is a desire to have DKIM signatures survive transit through lists. One way to do this is to make use of DKIM's "l=" tag which limits the portion of the body that is signed. This exposes an attack vector, however, since one can simply append any content to a partly-signed message and the signature will continue to verify. (See Section 8.2 of [RFC6376].) This document defines a new body canonicalization for DKIM that includes a partial signature for each message part in a message structured using Multipurpose Internet Mail Extensions (MIME; see [RFC2045]). This allows a clear delineation between the author- generated content (which would be signed by the author) and content added downstream (which would be signed by the other actor). A DKIM verifier can then determine whether the author-generated content is intact, and then identify and verify the content that was added later. The utility of this mechanism is predicated on the notion that agents that modify signed messages will do so in ways compatible with MIME. 2. Definitions Numerous terms used here, especially "Author", are defined in [RFC5598]. 3. The 'list' Canonicalization Description This section defines the 'list' body canonicalization algorithm. Kucherawy Expires December 9, 2014 [Page 3] Internet-Draft DKIM List Canonicalization June 2014 Put simply, the list canonicalization constructs a hash tree of the MIME structure of the message after each part has been decoded (for those with a Content-Transfer-Encoding field). The hash used is implied by the signature algorithm to be used (see the DKIM "a=" tag). Each of the hashes can be made a part of the signature to allow for more precise part validation, and identification of added content. 3.1. Preparing Content A message is prepared for canonicalization by applying the following steps in order: 1. Create an empty tree. Each node of the tree includes the following components: A. The MIME type and subtype of the part, expressed as would be found in a Content-Type header field, with no whitespace or comments; B. The unencoded content represented by the MIME part at this node; C. A series of octets that will contain a hash of the content; D. A series of zero or more pointers to other (child) nodes. 2. If the message is not encoded using MIME, insert a node at the root of the tree using a type/subtype of "text/plain" and the full body content. The hash is not initialized. 3. If the message is encoded using MIME, then the tree is populated in a way that mirrors the MIME structure of the message. In particular, the outermost MIME object will appear at the root node, and the only nodes that have children are those with a MIME type of "multipart". The hashes are not initialized. 4. For each leaf node, compute a hash of the content of that node. Store the hash in the node. 5. For each non-leaf node, if all of its child nodes now have computed hashes, concatenate the hashes (with order preserved), and compute and store a hash of the concatenation. 6. Repeat the previous step until all hashes in the tree have been populated. When this canonicalization is in use, the "bh=" tag will contain the Kucherawy Expires December 9, 2014 [Page 4] Internet-Draft DKIM List Canonicalization June 2014 hash stored at the root of the tree. The processes for signing and verification are otherwise unchanged. 4. 'The 'lh=' Signature Tag A signer can include an "lh=" tag, defined here, to make more than just the root hash information available to verifying agents. This permits identification of the specific part of the MIME structure that was modified, added or removed by an intermediary. The "lh=" tag is constructed by performing an in-order traversal of the canonicalization tree described in Section 3.1. At each node, each of the following is output, separated by a colon character (ASCII 0x3A): 1. A base64 expression of the hash at that node; 2. The MIME type of that node; 3. An integer expression of the number of children at that node. Between each node's output, a comma character (ASCII 0x2C) is output. Reconstruction of the MIME tree can be accomplished by the following steps: 1. Create a tree "T" containing a single empty node. 2. Create an empty node queue, "Q". 3. Create an information queue "I", containing the sequence of node information fields found in the "lh=" tag. 4. Select the root node of the tree. Call this node "N". 5. Extract the first batch of node information ("B") from the "lh=" tag. 6. Store the hash and MIME type from "B" into "N". 7. Enqueue the specified number of empty nodes into "Q", and attach them all as children of "N". 8. If "I" and "Q" are both empty, terminate. If one is empty and the other is not, an error has occurred. 9. Extract the next batch of node information from "I", as "B". Kucherawy Expires December 9, 2014 [Page 5] Internet-Draft DKIM List Canonicalization June 2014 10. Dequeue the next node from "Q", as "N". 11. Return to step 6. By comparing the hashes in and structure of this tree to those in the canonicalized tree, a receiver can identify parts of the tree (or entire subtrees) that have been modified. Parts not covered by the signature can also be identified. 5. Use Profile The intended use of this mechanism is to affix two DKIM signatures to a message. The first signature is added by the Author, and canonicalizes the original message in its entirety. The second signature is added by a modifying intermediary, such as a mailing list manager (MLM). When verifying, the Author signature on an unmodified message would pass verification. For a modified message, in the typical case, the verification step would observe that the Author signature failed but the intermediary's signature verified. When the "lh=" tag is present, it is possible to reconstruct the MIME structure of the signed message and compare it to that of the received message, including hashes of the content seen by each party. By comparing hash values at each node of the MIME structures, it is possible to determine in which MIME parts changes were made and/or new parts added or removed by the intermediary. The verifying agent can then determine whether those changes are acceptable before allowing the message to continue toward delivery. 6. Security Considerations 6.1. Imported from DKIM Section 8 of [RFC6376] discusses numerous security considerations relevant to DKIM. Of particular interest here is Section 8.2, which discusses concerns regarding signatures that sill verify in the presence of added message content. 6.2. Added Content May Not Be Safe When the use profile described in Section 3 is applied, it is important to note that the added content was not signed by the Author domain, but only by the domain of the intermediary. Operators that might grant preferential handling based on valid DKIM signatures from favorable domains; assuming that appended content in the presence of such signatures does not mean the appended content is necessarily safe. Kucherawy Expires December 9, 2014 [Page 6] Internet-Draft DKIM List Canonicalization June 2014 7. IANA Considerations 7.1. DKIM-Signature Canonicalization Body Registry IANA is requested to add the following entry to the DKIM-Signature Canonicalization Body Registry: Type: list Reference: [this document] Status: active 7.2. DKIM-Signature Tag Specifications Registry IANA is requested to add the following entry to the DKIM-Signature Tag Specifications Registry: Type: lh Reference: [this document] Status: active 8. References 8.1. Normative References [RFC2045] Freed, N. and N. Borenstein, "Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies", RFC 2045, November 1996. [RFC6376] Crocker, D., Hansen, T., and M. Kucherawy, "DomainKeys Identified Mail (DKIM) Signatures", STD 76, RFC 6376, September 2011. 8.2. Informative References [RFC5598] Crocker, D., "Internet Mail Architecture", RFC 5598, July 2009. [RFC6377] Kucherawy, M., "DomainKeys Identified Mail (DKIM) and Mailing Lists", BCP 167, RFC 6377, September 2011. Appendix A. Examples TODO: Show a few examples of the conversion between some different MIME structures, the hash tree, and the "lh=" tag's value. Kucherawy Expires December 9, 2014 [Page 7] Internet-Draft DKIM List Canonicalization June 2014 Appendix B. Acknowledgements The original idea was proposed by Ned Freed. The authors wish to acknowledge (names) for their comments during the development of this document. Author's Address Murray S. Kucherawy EMail: superuser@gmail.com Kucherawy Expires December 9, 2014 [Page 8]