Internet Draft                                     Editor: Peter Gutmann
draft-ietf-pkix-certstore-http-01.txt             University of Auckland
December 4, 2001
Expires June 2002

               Internet X.509 Public Key Infrastructure
                        Operational Protocols:
                  Certificate Store Access via HTTP

Status of this memo

This document is an Internet-Draft and is in full conformance with all
provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task
Force (IETF), its areas, and its working groups. Note that other groups
may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Abstract

The protocol conventions described in this document satisfy some of the
operational requirements of the Internet Public Key Infrastructure
(PKI). This document specifies the conventions for using the Hypertext
Transfer Protocol (HTTP) as an interface mechanism to obtain
certificates and certificate revocation lists (CRLs) from PKI
repositories (although RFC 2585 covers fetching certificates via HTTP,
this merely mentions that certificates may be fetched from a static URL,
which doesn't provide a general-purpose interface to a certificate
store). Additional mechanisms addressing PKIX operational requirements
are specified in separate documents.

1. Introduction

This specification is part of a multi-part standard for the Internet
Public Key Infrastructure (PKI) using X.509 certificates and certificate
revocation lists (CRLs).
This document specifies the conventions for using the Hypertext Transfer
Protocol (HTTP) as an interface mechanism to obtain certificates and
certificate revocation lists (CRLs) from PKI repositories. Although RFC
2585 [RFC2585] covers fetching certificates via HTTP, this merely
mentions that certificates may be fetched from a static URL, which
doesn't provide any general-purpose interface capabilities to a
certificate store. The conventions described in this document allow HTTP
to be used as a general-purpose, transparent interface to any type of
certificate store ranging from flat files through to standard databases
such as Berkeley DB and relational databases, as well as traditional
X.500/LDAP directories. Typical applications would include use with
web-enabled relational databases (which most current databases are) or
simple key/data lookup mechanisms such as Berkeley DB and its various
descendants. Additional mechanisms addressing PKIX operational
requirements are specified in separate documents.

The key words "MUST", "MUST NOT", "REQUIRED", "SHOULD", "SHOULD NOT",
"RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
interpreted as described in [RFC2119].

This draft is being discussed on the "ietf-pkix" mailing list. To join
the list, send a message to with the single word "subscribe" in the body
of the message. Also, there is a Web site for the mailing list at .

2. HTTP Certificate Store Interface

The GET method is used in combination with a query URI to retrieve
certificates from the underlying certificate store [RFC2068]. The
parameters for the query URI are a certificate identifier consisting of
an attribute type and a value which specifies one or more certificates
to be returned from the query. The query URI may be specified in a
certificate AuthorityInfoAccess extension or configured at the client
(see section 3). Permitted attribute types and associated values are
described below.
Arbitrary-length binary values are converted into a search key by the
process described in section 2.1.

  Attribute   Value
  ---------   -----
  certHash    Search key derived from the SHA-1 hash of the certificate
              (sometimes called the certificate fingerprint).
  email       Email address contained in the certificate, typically as
              an rfc822Name attribute.
  iHash       Search key derived from the certificate's issuer DN as it
              appears in the certificate.
  iAndSHash   Search key derived from the certificate's
              issuerAndSerialNumber [RFC2630].
  name        CommonName contained in the certificate.
  sHash       Search key derived from the certificate's subject DN as it
              appears in the certificate.
  sKID        Search key derived from the certificate's
              subjectKeyIdentifier.

The full URI is formed by concatenating the query URI and the attribute
and value. Certificates are retrieved from one query URI (the
certificate URI) and CRLs from another query URI (the CRL URI). These
may or may not correspond to the same certificate store (the exact
interpretation is a local configuration issue). The form of the complete
URI is therefore:

  <query URI> '?' <attribute> '=' <value>

The query value MUST be encoded using the form-urlencoded media type
[RFC1866].

Certificate URIs MUST support retrieval by all of the above attribute
types. CRL URIs MUST support retrieval by the iHash and sKID attribute
types, which identify the issuer of the CRL.

If more than one certificate matches a query, the certificates MUST be
returned as a multipart response.

[Or a SEQUENCE OF Certificate? This has the advantage that it takes a
lot less code to parse, OTOH it may be harder to produce if what you're
using is a web-enabled RDBMS, which is what most of them are]

In some instances servers may return HTTP type 3xx redirection requests
to redirect queries to another server. Clients receiving this response
SHOULD use the returned URI to replace their existing one and resubmit
the query to the new server.

Other information such as naming conventions and MIME types is specified
in [RFC2585].
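
As a rough illustration of how a client might assemble the complete URI
described in this section, the following Python sketch appends the
attribute and a form-urlencoded value to a query URI. The function name
build_query_uri and the host certificates.example.com are hypothetical
and chosen only for this example; the attribute names are the ones
listed in the table above.

```python
from urllib.parse import urlencode

# Attribute types permitted in a query, as listed in section 2.
ATTRIBUTE_TYPES = frozenset(
    ["certHash", "email", "iHash", "iAndSHash", "name", "sHash", "sKID"]
)


def build_query_uri(query_uri: str, attribute: str, value: str) -> str:
    """Return query_uri + '?' + attribute + '=' + form-urlencoded value."""
    if attribute not in ATTRIBUTE_TYPES:
        raise ValueError("unsupported attribute type: " + attribute)
    # urlencode() applies form-urlencoding to the value, so '@' becomes
    # "%40" and the base64 characters '+' and '/' become "%2B" and "%2F".
    return query_uri + "?" + urlencode({attribute: value})


if __name__ == "__main__":
    # Hypothetical query URI, used only for illustration.
    print(build_query_uri("http://certificates.example.com/search.cgi",
                          "email", "foo@bar.com"))
```

Rejecting unknown attribute types before the request is sent is a choice
made for this sketch; the server still has to validate whatever it
receives.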

2.1 Converting Binary Blobs into Search Keys

Some of the fields which are used in queries are of arbitrary length and
contain binary data. Both of these properties make them unsuited for
direct use in HTTP queries. In order to make them usable, they are first
hashed down to a fixed-length 128-bit value and then base64-encoded:

  Step 1: Hash the key data using SHA-1 to produce a 160-bit value

  Step 2: Encode the first 128 bits of the hash value using
          base64-encoding to produce a 22-byte text-only value

The one exception to this process is the subjectKeyIdentifier, which is
already a hashed value. In this case it isn't hashed again, but only
base64-encoded as per step 2.

Certificate stores should verify that the base64-encoded values
submitted in requests contain only characters in the range 'a'-'z',
'A'-'Z', '0'-'9', '+', or '/'. Queries containing any other character
MUST be rejected (see the rationale following the examples, and the
security considerations in section 4, for more details on this
requirement).

2.2 Examples

To convert the subject DN C=NZ, O=... CN=John Smith into a search key:

Hash the DN, in the DER-encoded form it appears in the certificate, to
obtain:

  96 4C 70 C4 1E C9 08 E5 CA 45 25 10 D6 C8 28 3A 1A C1 DF E2

base64-encode the first 128 bits:

  96 4C 70 C4 1E C9 08 E5 CA 45 25 10 D6 C8 28 3A

to obtain:

  lkxwxB7JCOXKRSUQ1sgoOg

This is the search key to use in the query URI.

To fetch all certificates useful for sending encrypted email to
foo@bar.com:

  GET /search-cgi?email=foo%40bar.com HTTP/1.0

In this case "/search-cgi" is the abs_path portion of the query URI, and
the request is submitted to the server located at the net_loc portion of
the query URI. Note the encoding of the '@' symbol as per [RFC1866].
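
Returning to the search key conversion of section 2.1, the two steps can
be sketched in Python as shown below. This is a minimal illustration
rather than a reference implementation: the function names are
hypothetical, the trailing '=' padding is assumed to be dropped in order
to arrive at the 22-character form, and the validity check assumes the
standard base64 alphabet (letters, digits, '+' and '/').

```python
import base64
import hashlib
import re

# 22 characters drawn from the standard base64 alphabet (an assumption
# of this sketch, as noted in the lead-in).
_KEY_PATTERN = re.compile(r"[A-Za-z0-9+/]{22}")


def encode_hash(hash_value: bytes) -> str:
    """Step 2: base64-encode the first 128 bits of a hash value."""
    if len(hash_value) < 16:
        raise ValueError("hash value shorter than 128 bits")
    encoded = base64.b64encode(hash_value[:16]).decode("ascii")
    # 16 bytes encode to 22 characters plus "==" padding; drop the padding.
    return encoded.rstrip("=")


def search_key(key_data: bytes) -> str:
    """Steps 1 and 2: SHA-1 hash the key data, then encode the result."""
    return encode_hash(hashlib.sha1(key_data).digest())


def is_valid_search_key(value: str) -> bool:
    """Return True only if value looks like a well-formed search key."""
    return _KEY_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    # SHA-1 value taken from the subject DN example in section 2.2.
    dn_hash = bytes.fromhex("964C70C41EC908E5CA452510D6C8283A1AC1DFE2")
    print(encode_hash(dn_hash))
    print(is_valid_search_key("ABCD;DELETE FROM certificates"))
```

A subjectKeyIdentifier, which is already a hash, would be passed to
encode_hash() directly instead of search_key(); whether it is truncated
to 128 bits in the same way is an assumption of this sketch.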

To fetch the CA certificate which issued the email certificate:

  GET /search-cgi?iHash=<search key> HTTP/1.0

Alternatively, if chaining is by key identifier:

  GET /search-cgi?sKID=<search key> HTTP/1.0

To fetch other certificates belonging to the same user as the email
certificate:

  GET /search-cgi?sHash=<search key> HTTP/1.0

To fetch the CRL for the certificate:

  GET /search-cgi?iHash=<search key> HTTP/1.0

2.3 Rationale

The identifiers are taken from PKCS #15 [PKCS15], a standard which
covers (among other things) a transparent interface to a certificate
store. These identifiers have been field proven through having been in
common use for a number of years, typically via PKCS #11 [PKCS11].
Certificate stores and the identifiers which are required for typical
certificate lookup operations are analysed in some detail in [Gutmann].

Another possible identifier which has been suggested is an IP address or
DNS name, which will be required for web-enabled embedded devices. This
is necessary to allow, for example, a home automation controller to be
queried for certificates for the devices which it controls. Since this
value is regarded as the CN for the device, common practice is to use
this value for the CN in the same way that web server certificates set
the CN to the server's DNS name, so this option is already covered in a
widely-accepted manner.

The binary search key sizes are limited to 128 bits to save space in
search indexes, and because there's no advantage to be gained from using
the full 160 bits.

The base64-encoded form of the identifier must be carefully checked for
invalid characters, since allowing raw data through presents a security
risk. Consider for example a certificate store implemented using an
RDBMS in which the SQL query is built up as "SELECT certificate FROM
certificates WHERE iHash = " + <search key>. If <search key> is set to
"ABCD;DELETE FROM certificates" the results of the query will be quite
different from what was expected by the certificate store
administrators.
For this reason only valid base64 encodings should be allowed. The same
checking applies to queries by name or email address.

The query types have been specifically chosen to provide not just an
HTTP interface to LDAP but a general-purpose retrieval mechanism which
allows arbitrary certificate storage mechanisms (with a bias towards
simple key/data stores, which are deployed almost universally, whether
as ISAM, Berkeley DB, or an RDBMS) to be employed as back-ends.

Hashes are used in place of the full field for arbitrary-length fields,
such as ones containing DNs, to keep the length manageable. In addition
the use of the hashed form emphasizes the fact that searching for
structured name data isn't a supported feature, since this is a simple
interface to a key/data certificate store rather than an HTTP interface
to an X.500 directory. Users specifically requiring an HTTP interface to
X.500 may use technology such as HTTP LDAP gateways for this purpose.

The attributes are given shortened name forms (for example iAndSHash in
place of issuerAndSerialNumberHash) in order to keep the lengths
reasonable, or common name forms (for example email in place of
rfc822Name, rfc822Mailbox, emailAddress, mail, email, etc.) where
multiple name forms exist.

Certificate and CRL stores are allocated separate URIs because they may
be implemented using different mechanisms. A certificate store typically
contains large numbers of small items while a CRL store contains a very
small number of potentially large items. By providing independent URIs
it's possible to implement the two stores using mechanisms tailored to
the data they contain.

This access mechanism is similar to the PGP HKP protocol; however, the
latter is almost entirely undocumented and requires implementors to
reverse-engineer other implementations. Because of this lack of
standardisation, no attempt has been made to ensure interoperability or
compatibility with HKP-based servers.
One benefit which HKP brings is extensive implementation experience,
which indicates that this is a very workable solution to the problem of
a simple key/certificate retrieval mechanism. HKP servers have been
implemented using flat files, Berkeley DB, and various databases such as
Postgres and MySQL.

3. Locating HTTP Certificate Stores

In order to locate servers from which certificates may be retrieved,
relying parties can employ one or more of the following strategies:

  Information contained in the certificate
  Use of a "well-known" location
  Manual configuration of the client software

The intent of the various options provided here is to make the
certificate store access as transparent as possible, only requiring
manual user configuration as a last resort.

3.1 Information in the Certificate

In order to convey to relying parties a well-known point of information
access, CAs SHALL provide the capability to include the
AuthorityInfoAccess (AIA) extension [RFC2459] in certificates. The OID
value for the accessMethod is one of:

  id-ad-http-certs OBJECT IDENTIFIER ::= { id-ad 6 }
  id-ad-http-crls  OBJECT IDENTIFIER ::= { 1 3 6 1 4 1 3029 5 1 }

and the corresponding accessLocation is the query URI.

[It may be preferable to provide a third accessMethod for attribute
certificates. The id-ad-http-certs implicitly included these (that is,
it didn't try to exclude them and the access mechanism is identical to
the one for certificates), however it has been pointed out that it would
be useful to separate identity from attribute certificates]

This provides a CA with a convenient place to indicate where further
certificates may be found, for example for path construction. Note that
it doesn't mean that this service is limited to CAs only.

3.2 Use of a "well-known" Location

If no other location information is available, the certificate store
interface may be located at a "well-known" location constructed from the
service provider's domain name.
In the usual case the URI is constructed by prepending the type of
information to be retrieved, either "certificates." or "crls.", to the
domain name to obtain the net_loc portion of the URI and appending a
fixed abs_path portion "search.cgi". The URI form of the "well-known"
location is therefore:

  certificates.<domain name>/search.cgi
  crls.<domain name>/search.cgi

Service providers SHOULD use these URIs in preference to other
alternatives. For example if a CA with the domain kiwisign.com were to
make its certificates available via an HTTP certificate store interface,
the "well-known" query URIs for certificates and CRLs would be:

  certificates.kiwisign.com/search.cgi
  crls.kiwisign.com/search.cgi

A second case occurs when the service is being provided by web-enabled
embedded devices such as Universal Plug and Play devices [UPNP]. In this
case the device has a single, fixed net_loc (either an IP address or a
DNS name) and makes services available via an HTTP interface. The URI is
then constructed by appending a fixed abs_path portion
"certificates/search.cgi" for certificates and "crls/search.cgi" for
CRLs to the net_loc. The URI form of the "well-known" location is
therefore:

  <net_loc>/certificates/search.cgi
  <net_loc>/crls/search.cgi

Web-enabled devices SHOULD use these URIs in preference to other
alternatives. For example a home automation controller with IP address
192.168.1.1 (a control point in UPNP terminology) would make
certificates for devices such as HVAC controllers, lighting and
appliance controllers, and fire and physical intrusion detection devices
available as:

  192.168.1.1/certificates/search.cgi
  192.168.1.1/crls/search.cgi

A print server with DNS name "printspooler" would make certificates for
web-enabled printers which it communicates with available as:

  printspooler/certificates/search.cgi
  printspooler/crls/search.cgi

3.3 Manual Configuration of the Client Software

The accessLocation for the HTTP certificate/CRL store MAY be configured
locally at the client.
This can be used if no other information is available, or if it is
necessary to override other information.

3.4 Rationale

An AIA extension is used to indicate the location for the CRL store
interface rather than the CRLDistributionPoint (CRLDP) extension since
the two perform entirely different functions. A CRLDP contains "a
pointer to the current CRL", a fixed location containing a CRL for the
current certificate, while the AIA extension indicates "how to access CA
information and services for the issuer of the certificate in which the
extension appears", in this case the CRL store interface which provides
CRLs for any certificates issued by the CA. In addition CRLDP associates
other attribute information with a query which is incompatible with the
simple query mechanisms presented in this document.

The well-known location URI is designed to make hosting options as
flexible as possible. Locating the service at www.<domain name> would
generally require it to be handled by the provider's main web server,
while using a distinct server URI allows it to be handled as desired by
the provider. Although there will no doubt be servers which implement
the interface using Apache and Perl scripts, a more logical
implementation would consist of a simple network interface to a
key-and-value lookup mechanism such as Berkeley DB. The URI form
presented in section 3.2 allows for maximum flexibility, since it will
work with both web servers/CGI scripts and non-web-server-based network
front-ends for certificate stores.

Web-enabled (or more strictly HTTP-enabled) devices are intended to be
plug-and-play, with minimal (or no) user configuration necessary. The
"well-known" URI allows any known device (for example one discovered via
UPNP's Simple Service Discovery Protocol) to be queried for certificates
without requiring further user configuration.

4. Security Considerations

HTTP caching proxies are common on the Internet, and some proxies may
not check for the latest version of an object correctly.
[RFC2068] specifies that responses to query URLs should not be cached,
and most proxies and servers correctly implement the "Cache-Control:
no-cache" mechanism which can be used to override caching. However, in
the rare instance in which an HTTP request for a certificate or CRL goes
through a misconfigured or otherwise broken proxy, the proxy may return
an out-of-date response.

Care should be taken to ensure that only valid queries are fed through
to the backend used to retrieve certificates. Allowing an attacker to
submit arbitrary queries may allow them to manipulate the certificate
store in unexpected ways if the backend tries to interpret the query
contents. For example if a certificate store is implemented using an
RDBMS in which the SQL query is built up as "SELECT certificate FROM
certificates WHERE iHash = " + <search key> and <search key> is set to
"X;DELETE FROM certificates" the results of the query will be quite
different from what was expected by the certificate store administrator.
The same applies to queries by name and email address.

Alongside filtering of queries, the backend should be configured to
disable any form of update access via the web interface. For Berkeley DB
this restriction can be imposed by opening the certificate store in
read-only mode from the web interface. For relational databases, it can
be imposed through the SQL GRANT/REVOKE mechanism, for example "REVOKE
ALL ON certificates FROM webuser; GRANT SELECT ON certificates TO
webuser" will allow read-only access of the appropriate kind for the web
interface.

Author Address

Peter Gutmann
University of Auckland
Private Bag 92019
Auckland, New Zealand
pgut001@cs.auckland.ac.nz

References

Gutmann   A Reliable, Scalable General-purpose Certificate Store,
          P.Gutmann, Proceedings of the 16th Annual Computer Security
          Applications Conference, December 2000.

PKCS11    Cryptographic Token Interface Standard, RSA Laboratories,
          December 1999.

PKCS15    Cryptographic Token Information Syntax Standard, RSA
          Laboratories, June 2000.

RFC1866   Hypertext Markup Language - 2.0, T. Berners-Lee and
          D. Connolly, November 1995.

RFC2068   Hypertext Transfer Protocol -- HTTP/1.1, J. Gettys, J. Mogul,
          H. Frystyk, and T. Berners-Lee, January 1997.

RFC2119   Key Words for Use in RFCs to Indicate Requirement Levels,
          S.Bradner, March 1997.

RFC2459   Internet X.509 Public Key Infrastructure: Certificate and CRL
          Profile, R. Housley, W. Ford, W. Polk, and D. Solo, January
          1999.

RFC2585   Internet X.509 Public Key Infrastructure: Operational
          Protocols: FTP and HTTP, R. Housley and P. Hoffman, May 1999.

RFC2630   Cryptographic Message Syntax, R. Housley, June 1999.

UPNP      Universal Plug and Play Device Architecture, Version 1.0, UPnP
          Forum, 8 June 2000.

Full Copyright Statement

Copyright (C) The Internet Society 2001. All Rights Reserved.

This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it or
assist in its implementation may be prepared, copied, published and
distributed, in whole or in part, without restriction of any kind,
provided that the above copyright notice and this paragraph are included
on all such copies and derivative works. However, this document itself
may not be modified in any way, such as by removing the copyright notice
or references to the Internet Society or other Internet organizations,
except as needed for the purpose of developing Internet standards in
which case the procedures for copyrights defined in the Internet
Standards process must be followed, or as required to translate it into
languages other than English.

The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.