Network Working Group H. Krawczyk Internet Draft Nov. 28, 1995 Expires May 28, 1996 Keyed-MD5 for Message Authentication draft-krawczyk-keyed-md5-01.txt Status of this Memo Distribution of this memo is unlimited. This document is an Internet-Draft. Internet Drafts are working documents of the Internet Engineering Task Force (IETF), its Areas, and its Working Groups. Note that other groups may also distribute working documents as Internet Drafts. Internet Drafts are draft documents valid for a maximum of six months, and may be updated, replaced, or obsoleted by other documents at any time. It is not appropriate to use Internet Drafts as reference material, or to cite them other than as a ``working draft'' or ``work in progress.'' To learn the current status of any Internet-Draft, please check the ``1id-abstracts.txt'' listing contained in the internet-drafts Shadow Directories on: ftp.is.co.za (Africa) nic.nordu.net (Europe) ds.internic.net (US East Coast) ftp.isi.edu (US West Coast) munnari.oz.au (Pacific Rim) Abstract This document describes a keyed-MD5 mechanism for use as a message authentication code (MAC) (or, integrity check value -- ICV). It is mainly intended for integrity verification of information transmitted over open networks (e.g., Internet) between parties that share a common secret key. The proposed mechanism combines the (key-less) MD5 hash function [RFC-1321] with a shared secret key. The description of keyed-MD5 in this document is independent of its use in any particular protocol. An analogous mechanism can be used in combination with other iterative hash functions, e.g., SHA. This draft presents a different keyed-MD5 function than the one presented in the previous version of this document (.00) and adopted into [RFC-1828]. The present function follows the same principles as the previous proposal (efficiency, code availability, etc.) but is backed by a significantly better security analysis. Krawczyk Expires May 28, 1996 [Page i] INTERNET-DRAFT Keyed-MD5 Nov. 1995 1. Introduction Rivest introduced MD5 [RFC-1321] as a cryptographic hash function. It was originally designed and intended as a collision-resistant function as required for the hashing of information prior to application of a signature function (e.g., RSA). However, due to its relatively good performance in software implementation and its free availability the same function was adapted to provide functionalities different than the one for which MD5 was originally designed. In particular, MD5 (as well as other cryptographic hash functions) was proposed to provide message authentication by combining it with a secret key (see [Tsu]). Only recently a close analysis of the security of these keyed MD5 modes has been initiated [BCK1,BCK2,KR,PV]. As part of the results of this analysis we provide here with a particular definition of keyed-MD5. The description of this transform in this document is independent of its use in any particular protocol. It is intended to serve as a common mechanism for the many protocols (especially, IETF protocols) that require integrity verification based on a shared secret key (e.g., IP Authentication Header [RFC-1826]). An analogous mechanism can be used in combination with other iterative hash functions, e.g., SHA [SHA]. The mechanism presented in this document replaces the one proposed in version 00 of this draft and which is now part of [RFC-1828]. The current proposal follows the same principles that guided the previous construction, that is: * it is based on MD5 * no change to the MD5 code required * no degradation of MD5 speed * simple key requirements * replaceability of MD5 by other cryptographic hash functions However, it improves on the previous proposal relative to its security analysis. The present is the first construction (and the only one we are aware of) that can be fully analyzed based on relatively weak assumptions on the underlying hash function (MD5). It only requires MD5 to be collision-resistant in a weak sense, and its compression function to be weakly unpredictable. These requirements are weaker than the ones needed for other common applications of MD5, e.g., as hash functions for digital signatures and as "randomizers" (for pseudorandom generation, key derivation, etc.). In particular, we only require that collisions are hard to find when the "initial vector" of the function is random and secret, and that the output of the compression function is unpredictable when applied to partially unknown strings. The analytical results that back this particular construction will be presented in [BCK2]. Krawczyk Expires May 28, 1996 [Page 1] INTERNET-DRAFT Keyed-MD5 Nov. 1995 2. Calculation The definition and reference implementation of MD5 appears in [RFC-1321]. Let `text' denote the data to which keyed-MD5 is to be applied and K be the message authentication secret key shared by the parties. The length of K is 16 bytes (corresponding to the output length of MD5). We define two fixed strings pad1 and pad2 as follows: pad1 = the byte 0x36 repeated 48 times pad2 = the byte 0x5C repeated 48 times. To compute keyed-MD5 over the data `text' we perform MD5(K,pad2,MD5(K,pad1,text)) Namely, (1) apply MD5 to the concatenation of the key K, pad1 and `text ' (i.e., K immediately followed by pad1 immediately followed by `text'); (2) apply MD5 to the concatenation of K, pad2 and the result in (1). 3. Keys The key for keyed-MD5 is fixed in this proposal to 16 bytes. (This is done for the sake of simplicity. Other lengths could be used. However, less than 16 bytes would decrease the security strength of the function, while longer than 16 byte keys would not significantly increase this strength.) Keys need to be chosen at random, or using a cryptographically strong pseudo-random generator seeded with a random seed. Keys need to be changed periodically, and as frequently as possible (e.g., usage of the same key to authenticate more than 1 GByte of information is not advisable). Note: when using other cryptographic hash functions the length of the key should be typically chosen as the length of the function output (e.g., 160 bits in the case of SHA). 4. Implementation Implementation of keyed-MD5 requires the implementation of MD5 (see [RFC-1321]) and the calculation described in Section 2. Krawczyk Expires May 28, 1996 [Page 2] INTERNET-DRAFT Keyed-MD5 Nov. 1995 Notice that this calculation does not require any change in the definition or code of MD5. However, if desired, a performance improvement can be achieved by a simple adaptation of the MD5 code. (Choosing or not to implement keyed-MD5 in this way is a decision of the local implementation and has no effect on inter-operability.) The idea is that the intermediate results of MD5 on the 64-byte blocks (K,pad1) and (K,pad2) can be precomputed only once at time of generation of the key K, or before its first use. These intermediate results (called "contexts", and stored under MD5's structure MD5-CTX) are then used to initialize the MD5 function each time that a message needs to be authenticated. (This involves setting the MD5-CTX variable to the stored value and applying MD5Update to the data.) This method saves the application of the MD5's compression function (MD5Update) on two 64-byte blocks, a savings which may be significant when authenticating short streams of data. We stress that the stored contexts need to be treated and protected as secret keys. 5. Security The security of the message authentication mechanism presented here depends on cryptographic properties of MD5: the resistance to collision finding (limited to the case where the initial value is secret and random, and where the output of the function is not explicitly available to the attacker), and the message authentication property of the compression function of MD5 when applied to single blocks (these blocks are partially unknown to an attacker as they contain the result of the internal MD5 computation and, in particular, cannot be fully chosen by the attacker). These properties, and actually stronger ones, are commonly assumed for MD5. While significant weaknesses of MD5 regarding these properties could be discovered in the future, such weaknesses would rend MD5 useless for most (probably, all) cryptographic applications, including alternative message authentication schemes based on this function. Two important properties of the application of MD5 here are: 1. This construction is independent of the particular details of MD5 and then the latter can be replaced by any other secure (iterative) cryptographic hash function. 2. Message authentication, as opposed to encryption, has a "transient" effect. A published breaking of a message authentication scheme would lead to the replacement of that scheme, but would have no adversarial effect on information authenticated in the past. This is in sharp contrast with encryption, where information encrypted today may suffer from exposure in the future if, and when, the encryption algorithm is broken. Krawczyk Expires May 28, 1996 [Page 3] INTERNET-DRAFT Keyed-MD5 Nov. 1995 The strongest attack known against the proposed scheme is based on the frequency of collisions for MD5 ("birthday attack") [BCK1,PV] for which the attacker needs to acquire the correct message authentication tags computed on about 2**64 known plaintexts (and with the _same_ secret key K!). This would require the processing of 2^70 bytes under MD5, an impossible task in any realistic scenario (this would take 250,000 years in a 1Gbit link, and without changing the secret key K all this time). This attack could become realistic only if serious flaws in the collision behavior of MD5 are discovered (e.g. collisions found after 2**30 messages or less). Such a discovery would determine the immediate replacement of MD5 (the effects of such failure would be far more severe for the traditional uses of MD5 in the context of digital signatures, public key certificates, etc., rather than in the message authentication scenario). Note: this attack needs to be strongly contrasted with regular collision attacks on MD5 where no secret key is involved and where 2^64 off-line operations suffice to find collisions. The latter are far more feasible [VW] than the attack described above, but are irrelevant to our construction of keyed-MD5. (In all of the above discussion 64 should be replaced by 80 if one uses a 160 bit hash function as SHA.) A correct implementation of the above construction, the choice of random (or cryptographically pseudorandom) keys, a secure key exchange mechanism, frequent key refreshments, and good secrecy protection of keys are all essential ingredients for the security of the integrity verification mechanism provided by keyed-MD5. Acknowledgments The author thanks Mihir Bellare and Ran Canetti for their collaboration in the design and analysis of keyed-MD5, and Burt Kaliski and Matt Robshaw for helpful discussions on these issues. Krawczyk Expires May 28, 1996 [Page 4] INTERNET-DRAFT Keyed-MD5 Nov. 1995 References [RFC-1826] R. Atkinson, "IP Authentication Header", RFC-1826, August 1995. [BCK1] M. Bellare, R. Canetti, and H. Krawczyk, "Cascaded Pseudorandomness and Its Concrete Security", submitted. Available from the author. [BCK2] M. Bellare, R. Canetti, and H. Krawczyk, "Keyed Hash Functions and Message Authentication", to be presented at the 1996 RSA Data Security Conference, San Francisco, Jan. 1996. [KR] B. Kaliski and M. Robshaw, "Message Authentication with MD5", RSA Labs' CryptoBytes, Vol. 1 No. 1, Spring 1995. (http://www.rsa.com/rsalabs/cryptobytes/) [RFC-1828] P. Metzger and W. Simpson, "IP Authentication using Keyed MD5", RFC-1828, August 1995. [PV] B. Prennel and P. VanOorschot, "Building fast MACs from hash functions", Advances in Cryptology -- CRYPTO'95 Proceedings, Lecture Notes in Computer Science, Springer-Verlag Vol.963, 1995, pp. 1-14. [RFC-1321] Ronald Rivest, "The MD5 Message-Digest Algorithm", RFC-1321, DDN Network Information Center, April 1992. [SHA] NIST, FIPS PUB 180: Secure Hash Standard, May 1993. [Tsu] G. Tsudik, "Message authentication with one-way hash functions", In Proceedings of Infocom~92, 1992. [VW] P. van Oorschot and M. Wiener, "Parallel Collision Search with Applications to Hash Functions and Discrete Logarithms", Proceedings of the 2nd ACM Conf. Computer and Communications Security, Fairfax, VA, November 1994. Author's Address Hugo Krawczyk IBM T.J. Watson Research Center P.O.Box 704 Yorktown Heights, NY 10598 hugo@watson.ibm.com Krawczyk Expires May 28, 1996 [Page 5]