Crypto Forum Research Group David A. McGrew Internet Draft Cisco Systems, Inc. Expires June, 2003 October, 2002 The Truncated Multi-Modular Hash Function (TMMH), Version Two Status of this Memo This document is an Internet Draft and is in full conformance with all provisions of Section 10 of RFC-2026. Internet Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and working groups. Note that other groups may also distribute working documents as Internet Drafts. Internet Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. 1. Abstract TMMH is a `universal' hash function suitable for message authentication in the Wegman-Carter paradigm. It is simple, quick, and especially appropriate for Digital Signal Processors and other processors with a fast multiply operation. This document specifies version two of TMMH, which eliminates the large storage requirement for hashing large messages that was present in version one. 2. The TMMH Specification. TMMH is a hash function that maps a key and a message to a hash value. TMMH uses sixteen bit unsigned integers as a basic data unit, and uses the multiply-and-accumulate operation as its core. TMMH can be used as a message authentication code, as described in Section 5. McGrew [Page 1] Internet Draft Truncated Multi-Modular Hash (TMMH) October, 2002 The key, message, and hash value are treated as a sequence of unsigned sixteen bit integers in network byte order. In the following, we call such an integer a word. The number of words in the hash value is denoted as TAG_WORDS. The number of words in the key is 35 + 6 * TAG_WORDS. The message can have any length between zero and 65,536 octets (32,768 words). The message may contain an odd number of octets, in which case the all-zero octet is appended to the message to align it on a word boundary. The number of octets in the message before this padding operation is denoted as MSG_LEN, and is used in the computations below. In the following, we define p to be equal to 2^16 + 1, and we use the symbol to * denote integer multiplication and the symbol +32 to denote integer addition modulo 2^32. TMMH uses several key-dependent internal data structures: the length multiplier array L, and an array of subkeys A. The length multiplier array L is an array of words, the ith element of which is denoted as L[i], with i ranging from zero to (TAG_WORDS - 1). A subkey is an array consisting of (TAG_WORDS + 7) words, and the ith element of the subkey S is denoted as S[i]. Five subkeys are used in TMMH; the subkey array is denoted as A, and the ith subkey is denoted as A[i], with i ranging from zero to 4. Figure 1. An illustration of how the arrays L and A are assigned from the words of the TMMH key K. In this example, TAG_WORDS is equal to two. Here K[i] denotes the ith word of the TMMH key (where i is a hexadecimal number). +----+----+----+----+----+----+----+----+----+----+----+--- Key |K[0]|K[1]|K[2]|K[3]|K[4]|K[5]|K[6]|K[7]|K[8]|K[9]|K[a]|... +----+----+----+----+----+----+----+----+----+----+----+--- +----+----+--------------------------------------------+--- Field |L[0]|L[1]| A[0] |... +----+----+--------------------------------------------+--- The length multiplier array L and the subkey array A are taken from the TMMH key K as follows. 1. The value L[i] is set to K[i], for i from zero to TAG_WORDS-1. 2. The value A[j][i] (the ith element of the jth subkey) is set to K[ TAG_WORDS + i + (TAG_WORDS + 7) * j ]. This process is illustrated in Figure 1. McGrew [Page 2] Internet Draft Truncated Multi-Modular Hash (TMMH) October, 2002 The function V(S, M) defined below maps a subkey S and a data string M to a 32-bit unsigned integer, and is defined as V(S,M) = S[1] * M[1] +32 S[2] * M[2] +32 ... +32 S[8] * M[8] with the convention that M is padded by appending null words if its length is less than eight. Here +32 denotes integer addition modulo 2^32. The length of the subkey K may be greater than eight, but the excess words are ignored by the function V. (The definition of V is such that the most significant words of the subkey may be ignored simplifies the exposition below.) The function U(S, M) is defined as U(S, M) = [ V(S, M) modulo p ] modulo 2^16, and it maps a subkey S and a data string M to a single word. The core of TMMH is a `compression' function C which maps a subkey value and an input string to an output string which is about eight times smaller than the input string. To compute C(S, D) for a given subkey value S and data string D of w words, do the following. 1) Divide up D into blocks of eight words each, padding the last block by appending null words until its length is a multiple of eight if needed: D = D[0] || D[1] || ... || D[ ceil(w/8) ] where D[i] is the ith block, || denotes concatenation, and ceil(x) denotes the largest integer not less than x. 2) Apply the function U to each block, using the subkey value S each time, then concatenate the outputs as follows: C(S, D) = U(S, D[0]) || U(S, D[1]) || ... ... || U(S, D[ ceil(w/8) ]. We next give a definition of TMMH that applies only when TAG_WORDS is one, as this case is simpler than the more general case. When TAG_WORDS is one, TMMH is computed using the following algorithm: set X to M and set i to zero while the number of words in X is greater than eight, do set X to C(A[i], X) increment i end while McGrew [Page 3] Internet Draft Truncated Multi-Modular Hash (TMMH) October, 2002 return [ [ [ L[0] * MSG_LEN ] +32 V(A[i], X) ] mod p ] mod 2^16 In the general case of TAG_WORDS > 1, the jth word of the TMMH tag is computed using the following algorithm: set X to M and set i to zero while the number of words in X is greater than eight, do set X to C((A[i] << j), X) increment i end while return [ [ [ L[j] * MSG_LEN ] +32 V((A[i] << j), X) ] mod p ] mod 2^16 where (x << j) denotes the word-wise left-shift of x, the result of which has null words in its j rightmost positions. For example, if x is the string { 5ea7 f27a c536 2192 11be ea35 db9d 63d6 fa8a } of words (in hexadecimal), then the value of (x << 1) is { f27a c536 2192 11be ea35 db9d 63d6 fa8a 0000 }. Note that the shift operation causes all of the words of a given subkey to be used by the function V. 2.1 Implementation Notes The reduction modulo p can be implemented without the explicit use of a division operation, as it is close to 2^16 (See Section 3.1 of [MMH] for details). The reduction modulo 2^16 can be implemented by truncating all but the lower sixteen bits of a higher-precision value. It is possible to implement TMMH in a scatter/gather approach or with an API consisting of initial, update, and final operations. One way to do this is to note that TMMH defines a tree of hash values, and to implement TMMH as a postorder traversal of that tree. 3. Test Vectors This section provides test vectors which can be used to test an implementation of TMMH. The key, message, and outputs are expressed as octet sequences, with each octet in hexadecimal. 3.1 Test Vector One TAG_WORDS: 2 key: { e627 6a01 5ea7 f27a c536 2192 11be ea35 McGrew [Page 4] Internet Draft Truncated Multi-Modular Hash (TMMH) October, 2002 db9d 63d6 fa8a fc45 e08b d216 ced2 7853 1a82 22f5 90fb 1c29 708e d06f 82c3 bee6 4f21 6f33 65c0 d211 c25e 9138 4fa3 7c1f 61ac 3489 2976 8c19 8252 ddbf cad3 c28f 68d6 58dd 504f 2bbf 0278 70b7 cfca } L: { e627 6a01 } A[0]: { 5ea7 f27a c536 2192 11be ea35 db9d 63d6 fa8a } A[1]: { fc45 e08b d216 ced2 7853 1a82 22f5 90fb 1c29 } A[2]: { 708e d06f 82c3 bee6 4f21 6f33 65c0 d211 c25e } A[3]: { 9138 4fa3 7c1f 61ac 3489 2976 8c19 8252 ddbf } A[4]: { cad3 c28f 68d6 58dd 504f 2bbf 0278 70b7 cfca } message: { 6015 f141 5ba1 29a0 f604 0d1c 02d9 aa8a 7931 } tag: { 8a82 4bb0 } 3.2 Test Vector Two TAG_WORDS: 2 key: { 2337 47e0 1564 671b 6f80 dcdd a6cc 5ff1 3e5d 88eb 612e 7c99 02e8 d8b2 77a5 09f9 f0bc 9997 1dc9 d478 396f 9602 8538 aa7f 16a0 a456 e77e 5262 a1dc 6b06 5e67 b2d5 74ee 5045 82c1 310a 28a5 bb49 f15a 3834 59c8 f3a 36f7 1b8c 953d bf74 2080 } message: { 5741 5220 4953 2050 4541 4345 2c20 4652 4545 444f 4d20 4953 2053 4c41 5645 5259 2c20 4947 4e4f 5241 4e43 4520 4953 2053 5452 454e 4754 4800 } tag: { 1d0f 7536 } N.B. This message is given by the 56-octet ASCII string "WAR IS PEACE, FREEDOM IS SLAVERY, IGNORANCE IS STRENGTH" in network byte order. 3.3 Test Vector Three TAG_WORDS: 2 key: { 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 } message: the word 0001 repeated 32,768 times tag: { 7fff 7fff } 4. Optimizations It is not necessary to store the entire TMMH key in memory if an McGrew [Page 5] Internet Draft Truncated Multi-Modular Hash (TMMH) October, 2002 upper bound on the length of the messages to be hashed is known. For example, if the messages of a particular protocol are no greater than 256 octets in length, then only the first three subkey values need to be stored; the other subkeys are not needed in the computation of TMMH for messages of that length. To describe this property, we define the parameter MAX_HASH_LENGTH, which denotes the maximum value which MSG_LEN may take. The number of subkeys which must be stored are described in terms of this parameter in Figure 2. Figure 2. The storage and the number of subkeys as a function of MAX_HASH_LENGTH. Subkeys MAX_HASH_LENGTH Storage (octets) --------------------------------------------- 1 16 14 + 4 * TAG_WORDS 2 128 28 + 6 * TAG_WORDS 3 1024 42 + 8 * TAG_WORDS 4 8192 56 + 10 * TAG_WORDS 5 65536 70 + 12 * TAG_WORDS 5. Using TMMH as a Message Authentication Code TMMH can be used to make a Wegman-Carter type message authentication code. To use TMMH to compute the authentication tag of a message, the TMMH hash value of that message is computed, then that value is combined with a pseudorandom string of TAG_LENGTH octets. The combining operation is word-wise addition modulo 2^16. The pseudoranom strings MUST NOT be correlated with each other nor with the messages they protect. A message/authentication tag pair is verified as follows. Compute a new authentication tag using the algorithm above, and compare it to the tag associated with the message. If the two are equal, then the message/tag pair is valid; otherwise, it is not. 6. Security Considerations TMMH is (2^(-11.1)^TAG_WORDS)-Delta Universal with respect to addition modulo 2^16. This property enables TMMH to be used as a message authentication code in the Wegman-Carter paradigm [WC81]. When used in this manner, the probability with which an adversary can forge a tag for a given message is no more than 2^(-11.1)^TAG_WORDS. This value is actually an upper bound; it is slightly less for shorter messages. McGrew [Page 6] Internet Draft Truncated Multi-Modular Hash (TMMH) October, 2002 These security claims for TMMH follow from the derivations in [MMH]. The interested reader should adapt Definition 2 of that document to TMMH as a starting point. The cyclic use of the words of the key is known as a Toeplitz construction, and is known to be secure (see Section 2.4 of [MMH] and the references therein). 7. History This document is based on the Internet Draft draft-mcgrew-saag-tmmh-01.txt, which was submitted to SAAG in November, 2001 and which expired in May, 2002. The second version of TMMH improves upon the initial version by enabling long messages to be hashed without requiring the key to be large. Version two essentially uses MMH recursively, and handles the variable message length in the same manner as version one. Changes to draft-mcgrew-saag-tmmh-01.txt from version 00 of that document include: * a correction of one of the test vector cases, * the addition of a section on using TMMH in a MAC, * some scattered corrections and clarifications, * removed references to other Internet Drafts. TMMH is an extension of the MMH hash function [MMH]. Unlike the function introduced in that seminal work, TMMH can deal efficiently and securely with variable-length messages, and is defined for both sixteen and thirty-two bit words. The PacketCable Security specification includes a MMH variant for RTP authentication [PC]. This variant can only be used for fixed message lengths, and is not interoperable with TMMH. To the best knowledge of the authors, this specification is not covered by any patents [H01]. 8. Acknowledgments Thanks are due to Mats Naslund, Karl Norrman, Doug Smith, Scott McGrew [Page 7] Internet Draft Truncated Multi-Modular Hash (TMMH) October, 2002 Fluhrer, and der Mouse for their comments and corrections. 9. Contact Information Questions and comments about this Internet Draft SHOULD be directed to: David A. "I'm not gonna pay a lot for this Presentation Layer" McGrew Cisco Systems, Inc. mcgrew@cisco.com The discussion list for the Crypto Forurm Research Group is: cfrg@ietf.org 11. References [B97] Bradner, S., Key words for use in RFCs to Indicate Requirement Levels, RFC 2119, March 1997. [WC81] M. Wegman and L. Carter, New hash functions and their use in authentication and set equality, J. of Computer and System Sciences, vol. 22, 1981. [H01] Halevi, S. Personal communication, May, 2001. [MMH] Halevi, S., and Krawczyk, H. MMH: Software Authentication in the Gbit/second rates, Fast Software Encryption Workshop, 1997. Also available online at http://www.research.ibm.com/people/s/shaih/pubs/. [PC] The PacketCable Security Specification, PKT-SP-SEC-I03-010626, Sections 9.8.1 and 9.8.2. Available online at http://www.packetcable.com/specifications.html [S92] Stinson, D. R. Universal hashing and authentication codes. Advances in Cryptology - CRYPTO '91, Lecture Notes in Computer Science 576 (1992), pp. 74-85. McGrew [Page 8]