Network Working Group B. Hoehrmann Internet-Draft September 20, 2010 Intended status: Informational Expires: March 24, 2011 Parsing malformed HTTP dates draft-hoehrmann-date-parsing-00 Abstract The HTTP/1.1 specification encourages recipients of date values, as found in HTTP headers like Date and Last-Modified, to be robust in accepting date values that may have been sent by non-HTTP software. This memo defines an error-tolerant parsing algorithm based on the date formats permitted by the HTTP specification for this purpose. Status of This Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." This Internet-Draft will expire on March 24, 2011. Copyright Notice Copyright (c) 2010 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Hoehrmann Expires March 24, 2011 [Page 1] Internet-Draft Parsing malformed HTTP dates September 2010 1. Introduction HTTP/1.1 defines three different date formats for use in HTTP (see [RFC2616], section 3.3). Some deployed software generates values that do not strictly match any of the three formats, and the HTTP specification encourages implementations to be robust in accepting them. Differences include for instance whether numeric values are padded with spaces or with leading zeros, and which delimiters are used. This specification defines a grammar for HTTP date values that tolerates these minor differences to accomodate malformed formats that are known to occur relatively frequently in malformed values and are supported by widely deployed implementations. No effort is made to mirror a particular set of existing implementations or entirely different date formats. This specification does not update the HTTP specification; values that match the grammar in this document but not the requirements of the HTTP standard continue to be non-compliant. Implementations continue to conform (or not) to the HTTP specification whether or not they follow the requirements defined in this specification. 2. Terminology and Conformance A software module that interprets dates expressed as sequences of octets conforms to this specification if and only if it interprets dates that match the "recoverable" rule defined in this document according to this specification. 3. Tolerant HTTP date syntax The following ABNF [RFC5234] grammar corresponds to the HTTP-Date grammar with some minor modifications: the leading weekday is a free- form string separated from the remainder by a comma or white space, hyphen and space are valid separators between month, day, and year regardless of the exact format, an optional trailing time zone (a free form string for which no interpretation is defined in this document, though it is common to simply ignore it) is allowed for all variants, and sequences of white space and digits do not have to have the exact length specified in the HTTP specification. Hoehrmann Expires March 24, 2011 [Page 2] Internet-Draft Parsing malformed HTTP dates September 2010 recoverable = [ wkday wsc ] (dmyt / mdty) [ ws tz ] dmyt = day sep month sep year ws time ; rfc1123-ish mdty = month sep day ws time ws year ; asctime-ish time-of-day = hour ":" minute ":" second hour = 1*2digit minute = 1*2digit second = 1*2digit day = 1*2digit year = 2digit / 4digit ws = 1*sp wsc = ws / "," *sp wkday = %x00-1f / %x21-ff tz = *octet sep = 1*sp / "-" month = "Jan" / "Feb" / "Mar" / "Apr" / "May" / "Jun" / "Jul" / "Aug" / "Sep" / "Oct" / "Nov" / "Dec" ; TODO: These should perhaps be case-sensitive. sp = digit = octet = The interpretation of time-of-day, day, and month is the same as defined in the HTTP specification, with the clarification that where the HTTP specification requires more digits than are matched, the missing digits are assumed to be leading zeros; for year, the rules are the same as defined for the rfc850-date format. 4. IANA Considerations None. 5. Security Considerations The HTTP specification already encourages implementers to be robust in accepting malformed date values and there are no new security concerns with the particular error recovery rules defined in this document. 6. References [RFC2616] Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999. [RFC5234] Crocker, D. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", STD 68, RFC 5234, January 2008. Hoehrmann Expires March 24, 2011 [Page 3] Internet-Draft Parsing malformed HTTP dates September 2010 Author's Address Bjoern Hoehrmann Mittelstrasse 50 39114 Magdeburg Germany EMail: mailto:bjoern@hoehrmann.de URI: http://bjoern.hoehrmann.de Note: Please write "Bjoern Hoehrmann" with o-umlaut (U+00F6) wherever possible, e.g., as "Björn Höhrmann" in HTML and XML. Hoehrmann Expires March 24, 2011 [Page 4]