Network Working Group                                            M. Wahl
INTERNET-DRAFT                                       Critical Angle Inc.
                                                                T. Howes
                                           Netscape Communications Corp.
Expires in six months from                                 24 March 1997
Intended Category: Standards Track


                    Use of Language Codes in LDAPv3

1. Status of this Memo

This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas, and
its working groups. Note that other groups may also distribute working
documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress."

To learn the current status of any Internet-Draft, please check the
"1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
Directories on ds.internic.net (US East Coast), nic.nordu.net (Europe),
ftp.isi.edu (US West Coast), or munnari.oz.au (Pacific Rim).

2. Abstract

The Lightweight Directory Access Protocol [1] provides a means for
clients to interrogate and modify information stored in a distributed
directory system. The information in the directory is maintained as
attributes [2] of entries. Most of these attributes have syntaxes which
are human-readable strings, and it is desirable to be able to indicate
the natural language associated with attribute values.

This document describes how language codes [3] are carried in LDAP and
are to be interpreted by LDAP servers. All implementations must be
prepared to accept language codes in the LDAP protocols. Servers may or
may not be capable of storing attributes with language codes in the
directory.

3. Language Codes

Section 2 of RFC 1766 [3] describes the language code format which is
used in LDAP. Briefly, it is a string of ASCII alphabetic characters
and hyphens. Examples include "fr", "en-US" and "ja-JP".

Language codes are case insensitive.
For example, the language code "en-us" is the same as "EN-US" and
"en-Us".

One language code is a prefix of another if both codes are equal up to
the length of the first code. For example, the language code "en" is a
prefix of the language codes "en-us" and "EN-US".

Wahl, Howes                                                     [Page 1]

INTERNET-DRAFT      Use of Language Codes in LDAPv3           March 1997

Implementations must not otherwise interpret the structure of the code
when comparing two codes, but should treat them as simply strings of
characters. Client and server implementations must allow any arbitrary
string which follows the patterns given in RFC 1766 to be used as a
language code.

4. Use of Language Codes in LDAP

This section describes how LDAP implementations must interpret language
codes in performing operations.

In general, an attribute with a language code is to be treated as a
subtype of the attribute without a language code. If a server does not
support storing language codes with attribute values in the DIT, then
it must always treat an attribute with a language code as an
unrecognized attribute.

Clients may request the use of a particular language through the
preferredLanguage control. This control determines how the server
interprets attributes without an explicit language parameter. The
details of this interaction for specific operations are given below.

4.1. Attribute Description

An attribute consists of a type, a list of options for that type, and a
set of one or more values. In LDAP, the type and the options are
combined into the AttributeDescription, defined in section 4.1.4 of
[1]. This is represented as an attribute type name and a possibly-empty
list of options. One of these options associates a natural language
with values for that attribute.

  <language-option> ::= "lang-" <lang-code>

  <lang-code> ::= -- a code as defined in RFC 1766

There can be at most one language option present in an
AttributeDescription.
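As an illustration of sections 3 and 4.1, the following Python sketch
splits an AttributeDescription into its type and options, extracts the
language code, and compares codes as plain case-insensitive strings.
The function names are hypothetical and belong to no LDAP library. The
sketch assumes that options are separated by ";" and that the "lang-"
marker itself is matched without regard to case.

```python
def parse_attribute_description(description):
    """Split 'type;opt1;opt2' into (type, language code or None, other options).

    Raises ValueError if more than one "lang-" option is present.
    """
    attr_type, *options = description.split(";")
    lang = None
    others = []
    for option in options:
        # Assumption: the "lang-" marker is recognized case-insensitively.
        if option.lower().startswith("lang-"):
            if lang is not None:
                raise ValueError("at most one language option is allowed")
            lang = option[len("lang-"):]
        else:
            others.append(option)
    return attr_type, lang, others


def same_language(code_a, code_b):
    """Language codes are equal if they match ignoring case."""
    return code_a.lower() == code_b.lower()


def is_language_prefix(prefix, code):
    """True if both codes are equal up to the length of `prefix`.

    The codes are treated purely as character strings; their subtag
    structure is not interpreted.
    """
    return code.lower().startswith(prefix.lower())
```

For instance, parse_attribute_description("CN;lang-ja-JP-kanji") yields
the type "CN" with the language code "ja-JP-kanji", and
is_language_prefix("en", "EN-US") is true.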

The language code has no effect on the character set encoding for
string representations of DirectoryString syntax values; the UTF-8
representation of UniversalString (ISO 10646) is always used.

Examples of valid AttributeDescription:

  givenName;lang-en-US
  CN;lang-ja-JP-kanji
  CN;lang-ja-JP-romaji

In LDAP and in examples in this document, a directory attribute is
represented as an AttributeDescription with a list of values. Note that
the data may be stored in the LDAP server in a different
representation.

4.2. Preferred Language Control

The preferredLanguage control is always non-critical. Its value is a
language code as defined in RFC 1766 [3]. If this control is absent,
the default is that there is no preferred language for the client.

The OID of the control is "1.3.6.1.4.1.1466.20035".

It is recommended that clients should use the most general language
code which is suitable for their purpose. A language code with multiple
subtags may result in too much directory information being filtered out
of responses. In most cases, it is recommended that only the primary
language tag (such as "EN") should be provided.

If the server supports the storing of language codes with attribute
values in the DIT, then it must indicate that the OID given above is a
supported control in the supportedControl attribute of the root DSE.
Otherwise it must not indicate support for this control.

4.3. Distinguished Names and Relative Distinguished Names

No attribute description options are permitted in Distinguished Names
or Relative Distinguished Names. Thus language codes MUST NOT be used
in forming DNs.

4.4. Search Filter

A client may provide a language code in an AttributeDescription in a
search filter. If present, then only attribute values in the directory
which match the base attribute type or its subtype, the language code
and the assertion value match this filter.
Thus, for example, an equality filter of type "name;lang-en-US" with
assertion value "Billy Ray", applied to the following directory entry,
gives these results:

  objectclass: top                   DOES NOT MATCH (wrong type)
  objectclass: person                DOES NOT MATCH (wrong type)
  name;lang-EN-US: Billy Ray         MATCHES
  name;lang-EN-US: Billy Bob         DOES NOT MATCH (wrong value)
  CN;lang-EN-US;dynamic: Billy Ray   MATCHES
  CN;lang-en;dynamic: Billy Ray      DOES NOT MATCH (differing lang-)
  name: Billy Ray                    DOES NOT MATCH (no lang-)
  SN: Ray                            DOES NOT MATCH (wrong value)

(Note that "CN" and "SN" are subtypes of "name".)

If the server does not support storing language codes with attribute
values in the DIT, then any filter which includes a language code will
always fail to match, as it is an unrecognized attribute type (note
however that no error will be returned because of this).

If no language code is specified in the search filter, then only the
base attribute type and the assertion value need match the value in the
directory.

Thus, for example, an equality filter of type "name" with assertion
value "Billy Ray", applied to the following directory entry, gives
these results:

  objectclass: top                   DOES NOT MATCH (wrong type)
  objectclass: person                DOES NOT MATCH (wrong type)
  name;lang-EN-US: Billy Ray         MATCHES
  name;lang-EN-US: Billy Bob         DOES NOT MATCH (wrong value)
  CN;lang-EN-US;dynamic: Billy Ray   MATCHES
  CN;lang-en;dynamic: Billy Ray      MATCHES
  name: Billy Ray                    MATCHES
  SN: Ray                            DOES NOT MATCH (wrong value)

The preferredLanguage control has no effect on filtering.

4.5. Compare

A client may provide a language code in an AttributeDescription used in
a compare request AttributeValueAssertion. This is to be treated by
servers the same as the use of language codes in a search filter with
an equality match, as described in the previous section. If there is no
attribute in the entry with the same subtype and language code, the
noSuchAttributeType error must be returned.
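The compare rule above, together with the equality matching of section
4.4, can be sketched in Python as follows. This is only an illustration
under stated assumptions: the function names and the SUPERTYPES table
are hypothetical, an entry is modeled as a list of (description, value)
pairs, and values are compared by exact string equality, whereas a real
server would apply the matching rule of the attribute's syntax.
"compareFalse" is the usual LDAP compare result for a non-matching
value and is not named in this document.

```python
# Hypothetical subtype table for this sketch: CN and SN are subtypes of name.
SUPERTYPES = {"cn": "name", "sn": "name"}


def split_description(description):
    """Return (lowercased type, lowercased language code or None)."""
    attr_type, *options = description.split(";")
    lang = None
    for option in options:
        if option.lower().startswith("lang-"):
            lang = option[5:].lower()
    return attr_type.lower(), lang


def is_type_or_subtype(attr_type, base_type):
    """True if attr_type equals base_type or descends from it."""
    while attr_type is not None:
        if attr_type == base_type:
            return True
        attr_type = SUPERTYPES.get(attr_type)
    return False


def candidates(entry, asserted_description):
    """Yield stored (description, value) pairs the assertion applies to."""
    want_type, want_lang = split_description(asserted_description)
    for description, value in entry:
        have_type, have_lang = split_description(description)
        if not is_type_or_subtype(have_type, want_type):
            continue
        # A language code in the assertion must equal the stored code;
        # an assertion without one accepts any stored code.
        if want_lang is not None and have_lang != want_lang:
            continue
        yield description, value


def compare(entry, asserted_description, asserted_value):
    found = list(candidates(entry, asserted_description))
    if not found:
        return "noSuchAttributeType"
    if any(value == asserted_value for _, value in found):
        return "compareTrue"
    return "compareFalse"
```

Applied to the example entry of section 4.4, compare(entry,
"name;lang-en-US", "Billy Ray") gives "compareTrue", while an assertion
on "name;lang-fr" finds no candidate attribute and gives
"noSuchAttributeType".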
A server may return a language code as part of the matchedSubtype field
in the result.

Thus, for example, given a compare request of type "name" and assertion
value "Johann" against the following directory entry

  objectclass: top
  objectclass: person
  givenName;lang-de-DE: Johann
  CN: Johann Sibelius
  SN: Sibelius

the server must return compareTrue, and may set the matchedSubtype
field to be "givenName;lang-de-DE".

If the server does not support storing language codes with attribute
values in the DIT, then any comparison which includes a language code
will always fail to locate an attribute type, and noSuchAttributeType
must be returned.

The preferredLanguage control has no effect on comparing.

4.6. Requested Attributes in Search

Clients may provide language codes in AttributeDescription in the
requested attribute list in a search request.

If a language code is provided in an attribute description, then only
attribute values in a directory entry which have the same language code
as that provided may be returned. Thus if a client requests an
attribute "description;lang-en", the server must not return values of
an attribute "description" or "description;lang-fr".

Clients may provide in the attribute list multiple AttributeDescription
which have the same base attribute type but different options. For
example, a client may provide both "name;lang-en" and "name;lang-fr",
and this would permit an attribute with either language code to be
returned. Note there would be no need to provide both "name" and
"name;lang-en" since all subtypes of name would match "name".

If a server does not support storing language codes with attribute
values in the DIT, then any attribute descriptions in the list which
include language codes are to be ignored, just as if they were unknown
attribute types.
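The request-list rules above, for a client that has not set the
preferredLanguage control, can be sketched in Python as follows. The
function names, the `supertypes` mapping and the
`stores_language_codes` flag are hypothetical devices for this sketch;
an entry is modeled as a list of (description, value) pairs.

```python
def split_description(description):
    """Return (lowercased type, lowercased language code or None)."""
    attr_type, *options = description.split(";")
    langs = [o[5:].lower() for o in options if o.lower().startswith("lang-")]
    return attr_type.lower(), (langs[0] if langs else None)


def select_attributes(entry, requested, supertypes=None,
                      stores_language_codes=True):
    """Return the (description, value) pairs that the request list permits."""
    supertypes = supertypes or {}
    wanted = [split_description(r) for r in requested]
    if not stores_language_codes:
        # Requested descriptions carrying a language code are ignored,
        # as if they named unknown attribute types.
        wanted = [(t, lang) for t, lang in wanted if lang is None]

    def descends_from(attr_type, base_type):
        while attr_type is not None:
            if attr_type == base_type:
                return True
            attr_type = supertypes.get(attr_type)
        return False

    selected = []
    for description, value in entry:
        have_type, have_lang = split_description(description)
        for want_type, want_lang in wanted:
            # A request without a language code accepts every stored code;
            # a request with one accepts only the identical code.
            if descends_from(have_type, want_type) and want_lang in (None, have_lang):
                selected.append((description, value))
                break
    return selected
```

With this sketch, a request for "description;lang-en" selects only the
values stored under that exact description, while a request for
"description" selects the values of "description" and of every
language-coded variant of it.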

If a request is made specifying all attributes or an attribute is
requested without providing a language code, and the preferredLanguage
control has not been set, then all attribute values regardless of their
language code are returned.

For example, if the client has set no preferredLanguage control and
requests a "description" attribute, and a matching entry contains

  objectclass: top
  objectclass: organization
  O: Software GmbH
  description: software
  description;lang-en: software products
  description;lang-de: softwareprodukte
  postalAddress: Berlin 8001 Germany
  postalAddress;lang-de: Berlin 8001

The server will return:

  description: software
  description;lang-en: software products
  description;lang-de: softwareprodukte

If the client has set the preferredLanguage control, then attributes
are excluded from the result if either of the following is true:

- the attribute has a language code for which the preferredLanguage
  value is not a prefix, or

- the attribute does not have a language code, but there is another
  attribute of the same type or a subtype in the entry, which has a
  language code for which the preferredLanguage value is a prefix.

For example, if the client sets the preferredLanguage to "en" and
requests all attributes, then the following will be returned. The
"description;lang-de" and "postalAddress;lang-de" attributes are
excluded, since the language code in these attributes does not match
the preferredLanguage. The "description" attribute is excluded, since
the entry also contains "description;lang-en", an attribute of the same
type whose language code does match the preferredLanguage.

  objectclass: top
  objectclass: organization
  O: Software GmbH
  description;lang-en: software products
  postalAddress: Berlin 8001 Germany

If a server does not support storing language codes with attribute
values in the DIT, then it will ignore the preferredLanguage control.

4.7. Add Operation

Clients may provide language codes in AttributeDescription in
attributes of a new entry to be created, subject to the limitation that
the client must provide the attribute values used in the RDN without
any language code or any other option.

A client may provide multiple attributes with the same attribute type
and value, so long as each attribute has a different language code.

Servers which support storing language codes in the DIT must allow any
attribute with the DirectoryString syntax to have a language code
associated with it. Servers may allow language codes to be associated
with other attributes.

For example, the following is a legal request.

  objectclass: top
  objectclass: person
  objectclass: residentialPerson
  name: John Smith
  CN: John Smith
  CN;lang-en: John Smith
  SN: Smith
  streetAddress: 1 University Street
  streetAddress;lang-en: 1 University Street
  streetAddress;lang-fr: 1 rue University
  houseIdentifier;lang-fr: 9e etage

If a server does not support storing language codes with attribute
values in the DIT, then it must treat an AttributeDescription with a
language code as an unrecognized attribute. If the server forbids the
addition of unrecognized attributes then it must fail the add request
with the appropriate result code.

The preferredLanguage control has no effect on storing attributes in
the add operation.

4.8. Modify Operation

A client may provide a language code in an AttributeDescription as part
of a modification element in the modify operation.

Attribute types and language codes must match exactly against values
stored in the directory. For example, if the modification is a
"delete", then if the stored values to be deleted have a language code,
the language code must be provided in the modify operation, and if the
stored values to be deleted do not have a language code, then no
language code is to be provided.
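The exact-match rule for a "delete" modification can be sketched in
Python as follows. The function names are hypothetical, and an entry is
modeled as a dict from AttributeDescription to a list of values.
Raising LookupError merely stands in for a failed request, since the
text above does not name a result code. As a simplification, the sketch
compares the whole option list in order, ignoring case, with no subtype
or prefix matching.

```python
def description_key(description):
    """Case-insensitive key covering the type and every option."""
    return tuple(part.lower() for part in description.split(";"))


def delete_values(entry, description, values):
    """Return a copy of `entry` with `values` removed from `description`.

    The stored attribute must carry exactly the same language code
    (or none) as the description given in the modification.
    """
    wanted = description_key(description)
    for stored in entry:
        if description_key(stored) != wanted:
            continue
        missing = [v for v in values if v not in entry[stored]]
        if missing:
            raise LookupError(f"values not stored under {stored!r}: {missing}")
        updated = {k: list(v) for k, v in entry.items()}
        updated[stored] = [v for v in updated[stored] if v not in values]
        if not updated[stored]:
            del updated[stored]  # the last value was removed
        return updated
    raise LookupError(f"no attribute stored as {description!r}")
```

Deleting "1 rue University" therefore succeeds only when the
modification names "streetAddress;lang-fr"; naming plain
"streetAddress" or "streetAddress;lang-en" for that value fails.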

If the server does not support storing language codes with attribute
values in the DIT, then it must treat an AttributeDescription with a
language code as an unrecognized attribute, and must fail the request
with an appropriate result code.

The preferredLanguage control has no effect on performing this
operation.

4.9. Diagnostic Messages

If the server supports returning diagnostic messages in more than one
language, then if the preferredLanguage control has been set, it may
use the preferredLanguage to choose an appropriate message. If the
preferredLanguage is not recognized, the diagnostic messages must be
returned in the default language.

It is strongly recommended that in the default language for diagnostic
messages, only printable ASCII characters be used, as not all clients
will be able to display the full range of Unicode.

5. Security Considerations

Security issues are not discussed in this memo.

6. Bibliography

[1] M. Wahl, T. Howes, S. Kille, "Lightweight Directory Access Protocol
    (Version 3)", INTERNET DRAFT, October 1996.

[2] M. Wahl, A. Coulbeck, T. Howes, S. Kille, "Lightweight X.500
    Directory Access Protocol Standard and Pilot Attribute
    Definitions", October 1996.

[3] H. Alvestrand, "Tags for the Identification of Languages",
    RFC 1766, March 1995.

7. Authors' Addresses

Mark Wahl
Critical Angle Inc.
4815 W Braker Lane #502-385
Austin, TX 78759
USA

EMail: M.Wahl@critical-angle.com

Tim Howes
Netscape Communications Corp.
501 E. Middlefield Rd
Mountain View, CA 94043
USA

Phone: +1 415 937-3419
EMail: howes@netscape.com

Expires: September 24, 1997