W3C Digital Signature Initiative MD5 Message Digest 5 - Version 1.0


This document provides an overview of the MD5 (Message Digest 5) algorithm and details how an MD5 digest is encoded in a Resource Reference Information Extension to provide the digest of a referenced web resource.


Overview

The MD5 algorithm (Message Digest 5) is a cryptographic message digest algorithm.

MD5 was designed in 1991 by Ron Rivest, who is also the `R' in `RSA'. MD5 is described in RFC 1321, which includes C source code. It is basically MD4 with "safety-belts": it is slightly slower than MD4 but more secure. The algorithm consists of four distinct rounds, which have a slightly different design from that of MD4. Message-digest size, as well as padding requirements, remains the same. Den Boer and Bosselaers [B. den Boer and A. Bosselaers. Collisions for the compression function of MD5. In Advances in Cryptology - Eurocrypt '93, pages 293-304, Springer-Verlag, 1994.] have found pseudo-collisions for MD5 (see RSA FAQ Question 98), but no other cryptanalytic results were known at the time of writing.

The MD5 algorithm takes as input a message of arbitrary length and produces as output a 128-bit "fingerprint" or "message digest" of the input. It is conjectured that it is computationally infeasible to produce two messages having the same message digest, or to produce any message having a given prespecified target message digest. The MD5 algorithm is intended for digital signature applications, where a large file must be "compressed" in a secure manner before being encrypted with a private (secret) key under a public-key cryptosystem such as RSA or PGP.
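As a minimal sketch of the input/output behaviour described above, the following Python fragment computes a digest with the standard library's hashlib module. The helper name md5_digest is hypothetical and used only for illustration; the test string "abc" and its digest come from the test suite in RFC 1321.

```python
import hashlib


def md5_digest(data: bytes) -> bytes:
    """Return the raw 16-byte (128-bit) MD5 digest of `data`."""
    return hashlib.md5(data).digest()


digest = md5_digest(b"abc")
print(len(digest) * 8)  # 128
print(digest.hex())     # 900150983cd24fb0d6963f7d28e17f72
```

Whatever the length of the input, the digest is always 16 bytes long.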

For further information on MD5, see RFC 1321.


DSig 1.0 Encoding

The BNF below shows how an MD5 digest is encoded in a Resource Reference Information Extension.

resinfo-data      ::= '(' HashAlgoURL resource-hash hash-date*1 ')'
HashAlgoURL       ::= '"http://www.w3.org/PICS/DSig/MD5_1_0.html"'
resource-hash     ::= '"base64-string encoding of 128 bit MD5 message
                        digest of the information resource."'
hash-date         ::= quoted-ISO-date 
quoted-ISO-date   ::= '"'YYYY'.'MM'.'DD'T'hh':'mmStz'"'
     based on the ISO 8601:1988 date and time standard, restricted
     to the specific form described here:
     YYYY ::= four-digit year
     MM   ::= two-digit month (01=January, etc.)
     DD   ::= two-digit day of month (01 through 31)
     hh   ::= two digits of hour (00 through 23) (am/pm NOT allowed)
     mm   ::= two digits of minute (00 through 59)
     S    ::= sign of time zone offset from UTC ('+' or '-')
     tz   ::= four digit amount of offset from UTC
           (e.g., 1512 means 15 hours and 12 minutes)
     For example, "1994.11.05T08:15-0500" is a valid quoted-ISO-date
     denoting November 5, 1994, 8:15 am, US Eastern Standard Time
     Note: The ISO standard allows considerably greater
     flexibility than that described here.  PICS requires precisely
     the syntax described here -- neither the time nor the time zone may
     be omitted, none of the alternate formats are permitted, and
     the punctuation must be as specified here.
base64-string     ::= as defined in RFC-1521.
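
As an illustration of the restricted quoted-ISO-date syntax above, the following Python sketch formats and parses such dates. The names format_hash_date and parse_hash_date are hypothetical and not part of DSig or PICS, and the sketch assumes that time zone offsets are a whole number of minutes.

```python
import re
from datetime import datetime, timedelta, timezone

# "YYYY.MM.DDThh:mmStz", including the surrounding double quotes.
_QUOTED_ISO_DATE = re.compile(
    r'"(\d{4})\.(\d{2})\.(\d{2})T(\d{2}):(\d{2})([+-])(\d{2})(\d{2})"'
)


def format_hash_date(dt: datetime) -> str:
    """Format a time-zone-aware datetime as a quoted-ISO-date."""
    offset = dt.utcoffset()
    if offset is None:
        raise ValueError("the time zone may not be omitted")
    total_minutes = int(offset.total_seconds() // 60)
    sign = "+" if total_minutes >= 0 else "-"
    hours, minutes = divmod(abs(total_minutes), 60)
    return (
        f'"{dt.year:04d}.{dt.month:02d}.{dt.day:02d}'
        f'T{dt.hour:02d}:{dt.minute:02d}{sign}{hours:02d}{minutes:02d}"'
    )


def parse_hash_date(text: str) -> datetime:
    """Parse a quoted-ISO-date; raise ValueError for any other form."""
    match = _QUOTED_ISO_DATE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a quoted-ISO-date: {text!r}")
    year, month, day, hour, minute = (int(match.group(i)) for i in range(1, 6))
    offset = timedelta(hours=int(match.group(7)), minutes=int(match.group(8)))
    if match.group(6) == "-":
        offset = -offset
    # datetime() and timezone() reject out-of-range fields with ValueError.
    return datetime(year, month, day, hour, minute, tzinfo=timezone(offset))


parsed = parse_hash_date('"1994.11.05T08:15-0500"')
print(format_hash_date(parsed))  # "1994.11.05T08:15-0500"
```

Using a full-string match means that alternate ISO 8601 forms, such as dates with hyphens or without a time zone, are rejected rather than silently accepted.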

hash-date is optional: the signer may include zero or one date.
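
The following Python sketch shows one way a resinfo-data entry matching the grammar above might be assembled. The function name md5_resinfo_data is hypothetical, the whitespace between tokens is an assumption, and hash_date is assumed to be a string already in quoted-ISO-date form, including its double quotes.

```python
import base64
import hashlib
from typing import Optional

MD5_ALGO_URL = "http://www.w3.org/PICS/DSig/MD5_1_0.html"


def md5_resinfo_data(resource: bytes, hash_date: Optional[str] = None) -> str:
    """Build '( "HashAlgoURL" "resource-hash" ["hash-date"] )' for `resource`."""
    digest = hashlib.md5(resource).digest()             # 128-bit digest
    encoded = base64.b64encode(digest).decode("ascii")  # base64-string
    fields = [f'"{MD5_ALGO_URL}"', f'"{encoded}"']
    if hash_date is not None:                           # zero or one date
        fields.append(hash_date)
    return "( " + " ".join(fields) + " )"


print(md5_resinfo_data(b"abc"))
print(md5_resinfo_data(b"abc", '"1997.02.05T08:15-0500"'))
```

A 16-byte digest always encodes to a 24-character base64 string ending in "==".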

The following example shows a valid MD5 resinfo extension with two MD5 hashes of the referenced information resource: the first without a date, the second with a date.

   extension
     ( optional "http://www.w3.org/PICS/DSig/resinfo-1_0.html"
                ( "http://www.w3.org/PICS/DSig/MD5_1_0.html" "base64-hash" )
                ( "http://www.w3.org/PICS/DSig/MD5_1_0.html" "base64-hash" 
                  "1997.02.05T08:15-0500" ) )

Philip A. DesAutels, DSig Project Manager 8 Oct 1997