MIMESGML Working Group                                     E. Levinson
Internet Draft: MIME/SGML                          ACCURATE Info. Sys.
                                                         July 12, 1995

Encapsulating SGML Documents Using the Multipart/Related Content-Type

This draft document is being circulated for comment. Please send your
comments to the authors or to the sgml-internet mail list. Archives of
the email discussions are available at
ftp://ftp.naggum.no:/pub/SGML-internet filed by date and time.

Status of this Memo

This document is an Internet Draft; Internet Drafts are working
documents of the Internet Engineering Task Force (IETF), its Areas,
and Working Groups. Note that other groups may also distribute working
documents as Internet Drafts.

Internet Drafts are draft documents valid for a maximum of six months.
They may be updated, replaced, or obsoleted by other documents at any
time. It is not appropriate to use Internet Drafts as reference
material or to cite them other than as a "working draft" or "work in
progress".

Please check the abstract listing in each Internet Draft directory for
the current status of this or any other Internet Draft.

Abstract

This document describes the encapsulation of a Standard Generalized
Markup Language (SGML) document within a MIME message. The document
may be represented in the message by some or all of its components.
The MIME message may also include auxiliary information to be used by
the recipient in processing the encapsulated SGML.

The RFC proposes new content sub-types of Text/SGML, Application/SGML,
and Application/SGML-notation, and a new header, Content-SGML-Entity.
This specification uses the proposed Multipart/Related Content-Type
[RFC-REL] and access-type=content-id [RFC-ACTI] specifications.
Multipart/Related provides the mechanism for treating the set of SGML
components as a single object, and access-type=content-id allows
several MIME entities to contain identical bodies without replicating
the body in each MIME entity.
Levinson               Expires January 31, 1996               [Page 1]
Internet Draft                   MIME-SGML

Table of Contents

      Abstract
      Table of Contents
1.    Introduction
1.1.  Terminology
1.2.  Standard Generalized Markup Language (SGML)
1.3.  SGML Document Interchange Format (SDIF)
2.    A Model for MIME/SGML
2.1.  The SGML Media-Types
2.1.1. Text/SGML
2.1.2. Application/SGML
2.1.3. Application/SGML-Notation
2.2.  SGML Content-Type Parameters
2.3.  ABNF Specification of the SGML Media-Types
2.4.  Data Entities
3.    The Content-SGML-Entity Header
4.    Encapsulating the SGML Entities
4.1.  The Multipart/Related Media-Type
4.2.  Examples
4.2.1. Implied SGML Declaration
4.2.2. An SGML Text Entity
4.2.3. A Notation Declaration
4.2.4. Script-based Notation
4.2.5. Active and Doctype Parameter Usage
4.2.6. Auxiliary Information
5.    SGML Entities Not Part of a Document
6.    SGML Document Interchange Format (SDIF)
7.    Security
8.    References
9.    Acknowledgements
10.   Author's Address

Appendices:
A.    An Extended Example
B.    Notes for Implementors
C.    ISO-10744 BCTF Values and Boot Attribute
C.1.  Bit Combination Transformation Format Values
C.2.  The Boot Attribute

0. Changes

Introductory and explanatory text has been changed and expanded to
eliminate any bias towards "complete" documents.

Added explanatory text covering auxiliary information, given by the
"start-info" Multipart/Related parameter. Included usage description
for the "type" and "start-info" parameters. Also added an example
using "start-info".

Added "active" parameter to the Content-SGML-Entity header and added
an example showing its use.

Added additional detail to a number of examples and two more examples.

Tightened up the text discussing SDIF.

1. Introduction

A need exists for the transfer of documents constructed using the
Standard Generalized Markup Language (SGML) [ISO-8879]. A document
transfer consists of a set of the document's components sufficient to
enable the receiver to process the document. Such processing might
consist of displaying the document or a portion of it, interacting
with a local application, or other appropriate action.

SGML documents consist of a set of inter-related components, or SGML
entities, whose structural relationship must be preserved
independently of the system on which the document exists. The
components and their relationships are often represented as files with
explicit internal references to other components (files). The
encapsulation described here permits such transfers using the
Multipurpose Internet Mail Extensions (MIME) specification [RFC-1521].

The goal of the MIME encapsulation of SGML is to permit the receiving
system to display (or process) the SGML document with minimal effort
and maximum flexibility. In particular, multiple parses of the SGML
document can be avoided by using information from SGML entity and
notation declarations. The Content-SGML-Entity header makes that
information available to the MIME user agent.

Sections 2 and 3 define the basic elements for labelling the SGML
entities. Section 4 describes the encapsulation of the document's
entities within a single Multipart MIME entity. The two sections
following that describe the handling of incomplete or unparsable
documents and the SGML Document Interchange Format (SDIF) [ISO-9069].

1.1. Terminology

Both SGML and MIME use the term "entity" to refer to their basic
components. Here the use of "entity" generally connotes an SGML
entity. For MIME entities, "body part" is used; in some contexts that
proves awkward and "MIME entity" is used instead. The context should
make such usage clear.

Two SGML terms, SGML Document and SGML Document Entity, are used in
this paper and the difference between them is significant. An SGML
Document [ISO-8879, 4.282] is the entire collection of objects or
entities that make up a document. Those objects include markup
definitions, text with SGML markup, plain text, image data, etc. An
SGML document entity [ibid., 4.283], on the other hand, is the
specific object with which an SGML system begins processing the SGML
document.

1.2. Standard Generalized Markup Language (SGML)

The Standard Generalized Markup Language (SGML) is used to encode
document structure; a rigorous description of it is left to
[ISO-8879]. The terms used in the present document attempt to be
consistent with SGML terminology and usage.

An SGML document exists as a collection of one or more entities;
entities are system-independent analogues of files. Those SGML
entities are mapped to storage objects or files and the mapping may be
one-to-one, many-to-one, or one-to-many. The SGML document refers to
the storage objects via entity declarations. The declarations may
define the name and type of the storage object or provide a name by
which an SGML system can map the declared entity to a storage object.

Preservation of the structure of references from one entity to
another, known in SGML as the entity structure, is key to the email
exchange of SGML documents.

Not all the entities that constitute an SGML document need to be
included in the document's MIME encapsulation. Communities or
individuals may often agree to include only a subset of the SGML
entities. For example, certain communities use standard SGML document
type definitions (DTDs); within those communities the DTD and other
similar SGML entities need not be included in the encapsulation. Other
communities may only want to encapsulate a minimal entity or set of
entities and include auxiliary information to enable the recipient to
retrieve any additional entities the recipient requires. The
description of such community agreements, auxiliary information, and
protocols for requesting additional entities are beyond the scope of
this RFC.

To enable the receiver to efficiently process the encapsulated SGML
document, the MIME message must carry detailed information for each
SGML entity in the Multipart/Related MIME body part. Additional
information about displaying the document or about the document's
entities may be included in the auxiliary information.

In the sender's environment the SGML entities may reference standard
names, called formal public identifiers, or specific local files, or
both. Further, some SGML entities may refer to other entities, for
example files containing text, images, or graphics. The identity and
content of each entity must be available to the recipients to enable
them to transform the sender's entity references into an equivalent
local reference and to instantiate the entities locally.

This document describes the MIME encapsulation of an SGML document
that preserves the entity structure and permits the recipient of the
encapsulated document to automatically instantiate it locally.

1.3. SGML Document Interchange Format (SDIF)

SDIF [ISO-9069] defines a data stream structure for exchanging one or
more SGML documents. A MIME entity conforming to this RFC is a
conforming SDIF data stream [N-1781].

2. A Model for MIME/SGML

Four issues must be addressed for the recipient's user agent to
display, or pass on to some process, the encapsulated SGML document.
The various document parts must be specified and entity references on
the sender's systems must be resolved to corresponding references on
the receiver's system. Similarly, notation declarations, that is,
references to processors for non-SGML data, must be resolved into
valid processes on the receiving system. An appropriate application,
called the unpacker, must be in control to present the MIME body parts
and the entity and notation information to the SGML processing
software. Finally, the MIME encapsulated SGML text entities must be
independent of the sender's system representation dependencies.

The first three issues are addressed in the following manner. Two MIME
media-types (content-types) are defined for SGML text, Text/SGML and
Application/SGML, and one for conveying process associations,
Application/SGML-notation. A new header, Content-SGML-Entity, provides
the entity description for the body part containing the entity.
Notation information, carried in an Application/SGML-notation body
part, associates notation declarations with MIME media-types. The SGML
document's entities, encapsulated within the same Multipart/Related
MIME entity, are the basis for the SGML MIME encapsulation.

Auxiliary information, to be used in processing the SGML document, may
be included in the MIME message. The information is referenced through
the Multipart/Related start-info parameter.

SGML defines an Entity Manager [ISO-8879, 4.123] that performs the
mapping between SGML entities and the local file system.
The specification of that mapping is system dependent. Consequently
each SGML entity shall be represented as one MIME entity.

2.1. The SGML Media-Types

There are two media-types for SGML parsable entities, Text/SGML and
Application/SGML. Both have the same optional parameters and produce
the same results for recipients with SGML capability. Text/SGML
provides a fallback for those without SGML capability.

Senders should base the choice between text and application
media-types on the entity's content. Text is suggested for entities
that would be meaningful to a human being without SGML processing.
Application/SGML is recommended for all others.

A third media-type, Application/SGML-notation, applies to non-SGML
data and provides the connection between an SGML declaration and a
MIME media-type.

2.1.1. Text/SGML

   MIME type name:          Text
   MIME subtype name:       SGML
   Required parameters:     none
   Optional parameters:     charset, SGML-bctf, SGML-boot
   Encoding considerations: may be encoded
   Security considerations: none
   Published specification: RFC-SGML
   Person and email address to contact for further information:
                            E. Levinson

The Text/SGML media-type can be employed when the contents of the SGML
entity are intended to be read by a human and are in a readily
comprehensible form. That is, the content can be easily discerned by
someone without SGML display software. Each record in the SGML entity,
delimited by record start (RS) and record end (RE) codes, must
correspond to a line in the Text/SGML body part.

SGML entities that do not meet the above requirements should use the
Application/SGML media-type.

2.1.2. Application/SGML

   MIME type name:          Application
   MIME subtype name:       SGML
   Required parameters:     none
   Optional parameters:     SGML-bctf, SGML-boot
   Encoding considerations: may be encoded
   Security considerations: none
   Published specification: RFC-SGML
   Person and email address to contact for further information:
                            E. Levinson

Use the Application/SGML media-type for SGML text entities that are
not appropriate for Text/SGML. When used, each record start (RS) and
record end (RE) character shall be explicitly represented by the bit
combination specified in the SGML declaration.

2.1.3. Application/SGML-Notation

   MIME type name:          Application
   MIME subtype name:       SGML-Notation
   Required parameters:     none
   Optional parameters:     none
   Encoding considerations: none
   Security considerations: none
   Published specification: RFC-SGML
   Person and email address to contact for further information:
                            E. Levinson

The Application/SGML-Notation media-type provides the connection
between the document's SGML notation declarations and MIME
media-types. The MIME entity must contain a Content-SGML-Entity
header. The body of the SGML-Notation MIME entity contains a
Content-Type header that specifies the media-type associated with the
name parameter of the Content-SGML-Entity header.

Some SGML notation declarations may correspond to a script for an
active media-type (e.g., safe-Tkl). In those cases a MIME entity with
the corresponding media-type should be used. That MIME entity shall
contain an appropriate Content-SGML-Entity header.

2.2. SGML Content-Type Parameters

The parameters for the Text and Application SGML subtypes are defined
below.

charset
      The charset parameter is defined in [RFC-1521]; the valid values
      and their meaning are registered by the Internet Assigned
      Numbers Authority (IANA) [RFC-1590]. The default charset value
      for all Text content-types is "us-ascii" [RFC-1521].

      The charset parameter is provided to permit non-SGML capable
      systems to provide reasonable behavior when Text/SGML defaults
      to Text/Plain. SGML capable systems will use the SGML-bctf
      parameter.

SGML-bctf
      The SGML-bctf (SGML bit combination transformation format)
      parameter describes the method used to transform the sequence of
      constant width binary numbers (called "bit combinations" in
      [ISO-8879, 4.24]) that constitute the entity into the octet
      stream contained in the MIME body part. Valid values for
      SGML-bctf are the BCTF notation names defined in Annex C of
      [ISO-10744]; they are reproduced for convenience in Appendix C.

SGML-boot
      The SGML-boot parameter value is the content-ID of a MIME body
      part (Application/Octet-stream) that satisfies the requirements
      of the boot attribute in [ISO-10744]. Appendix C contains a
      summary of those requirements.

2.3. ABNF Specification of the SGML Media-Types

   sgml-type := sgml-tora parm-list / sgml-nttn
   sgml-tora := ( "text" / "application" ) "/" "SGML"
   sgml-nttn := "application" "/" "SGML-notation"
   parm-list := *( ";" sgml-parm )
   sgml-parm := sgml-attr "=" value
   sgml-attr := "charset" / "SGML-bctf" / "SGML-boot"

Note: The sgml-tora, sgml-nttn, and sgml-attr strings are case
independent.

2.4. Data Entities

Data entities (those that contain data but may not be parsable as
SGML) shall be included as MIME body parts whose media-types reflect
the data content, i.e., Text/Plain, Image/JPEG, etc.

3. The Content-SGML-Entity Header

The Content-SGML-Entity (cse) header is required when encapsulating an
SGML document within a Multipart/Related MIME entity. The header
contains information from the SGML entity declaration corresponding to
the entity contained in the body of the body part. A cse header is not
required for an SGML entity that is not declared by any other entity
in the MIME message.

Data for an entity catalog, defined in [TR9401], can be generated by
the unpacker using the cse header data and the stored entity's local
file name.
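
As an illustration of the catalog generation just described, the
following Python sketch parses a Content-SGML-Entity header value and
emits one catalog line. It is not part of this specification: the
function names parse_cse and catalog_entry are hypothetical, the
parser keeps only the last value of a repeated parameter (such as
"active"), and only the PUBLIC and ENTITY entry forms that appear in
the catalog example of section 4.2.6 are produced.

```python
import re

# One "name=value" pair; the value is either a quoted string or a bare token.
_PARAM = re.compile(
    r'\s*([A-Za-z0-9-]+)\s*=\s*(?:"([^"]*)"|([^;"\s]+))\s*(?:;|$)'
)


def parse_cse(value):
    """Return the parameters of a Content-SGML-Entity header value as a dict.

    Assumption of this sketch: a repeated parameter overwrites the
    earlier value, and parameter names are folded to lower case.
    """
    params = {}
    for match in _PARAM.finditer(value):
        name = match.group(1).lower()
        quoted, bare = match.group(2), match.group(3)
        params[name] = quoted if quoted is not None else bare
    return params


def catalog_entry(params, local_path):
    """Build one catalog line for an entity stored at local_path.

    Returns None when the header carries neither a public identifier
    nor a named general entity, since this sketch handles only those
    two cases.
    """
    if "public-id" in params:
        return 'PUBLIC "%s" "%s"' % (params["public-id"], local_path)
    if params.get("decl-type") == "general" and params.get("name"):
        return 'ENTITY %s "%s"' % (params["name"], local_path)
    return None


if __name__ == "__main__":
    header = 'decl-type=general; name=chap3; system-id="chapt3.sgml"'
    print(catalog_entry(parse_cse(header), "/tmp/unpacked/chapt3.sgml"))
```

An unpacker would call catalog_entry once per stored body part and
write the non-empty results to its catalog file; where that file lives
is a local matter.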

When the same data is referred to by several SGML entity declarations,
the data need only be present in one MIME body part. Subsequent body
parts can use the Message/External-Body access-type=content-id
media-type [RFC-ACTI]. Each of those body parts must have its own
Content-SGML-Entity header.

The Content-SGML-Entity header is defined as follows.

   entity-header := "Content-SGML-Entity" ":"
                    "decl-type" "=" decl-type *( ";" cse-parm )

   decl-type := "doctype" / "linktype" / "general" / "notation" /
                "parameter" / "baseset" / "capacity" / "syntax"

   cse-parm := cse-attr "=" value /

   cse-attr := "active" / "doctype" / "linktype" / "name" /
               "notation-name" / "public-id" / "public-id-ver" /
               "system-id" / extension-token

   value := token / quoted-string          ; c.f. [RFC-1521]

   extension-token := ( "X-" / "x-" ) token
                                           ; no intervening white space

active
      Specifies the name of an active document type or link type. The
      parameter is used when the prolog contains more than one
      document or link type declaration. The parameter can occur
      multiple times, once for each active document or link type, and
      SGML entity parsing occurs with respect to the active document
      or link types.

decl-type
      A token specifying how the entity was declared. Within an SGML
      document or subdocument each entity type constitutes a unique
      name space. The possible values for decl-type are:

      doctype
            An entity containing an external DTD subset, declared by a
            doctype declaration; the name in this case would be the
            document type name.

      baseset
            An entity declared by a public identifier in a base
            character set [production 174, ISO-8879, 13.1.1.1].

      capacity
            An entity declared by a public identifier in a capacity
            set [180, ISO-8879, 13.2].

      general
            An entity declared in an entity declaration as a general
            entity.

      linktype
            An entity containing an external Link Process Definition
            subset, declared in a linktype declaration; the name
            parameter is the link type name.

      notation
            The header describes a notation declaration and, for
            Application/SGML-notation, the body of the MIME body part
            contains a content-type header giving the MIME content
            type corresponding to the notation name or, for other
            media-types, a description of the processing that the
            notation specifies.

      parameter
            An entity declared in an entity declaration as a parameter
            entity.

      syntax
            An entity declared by a public identifier in a public
            concrete syntax [183, ISO-8879, 13.4].

doctype
      A string specifying the document type name of the DTD subset in
      which the entity was declared, if the entity was declared in a
      DTD subset other than the base DTD subset. This parameter
      applies only to entities with a decl-type of "general",
      "notation", or "parameter".

extension-token
      A parameter not defined in this document and agreed upon by the
      parties using it, a group of consenting adults.

linktype
      A string specifying the link type name of the Link Process
      Definition (LPD) subset in which the entity was declared, if the
      entity was declared in an LPD subset. This parameter is required
      only for entities with a decl-type of "general" or "parameter".

name
      A string giving the name of the entity; it is omitted if the
      entity has no name.

notation-name
      The notation name of an external entity. Not valid if the
      decl-type is "notation". The value of this parameter corresponds
      to the value of a Content-SGML-Entity header name parameter.

public-id
      The public identifier in the entity's declaration.

public-id-ver
      The display version if a public text display version was not
      present in the public id. Use this parameter only if a device
      dependent display version was used.

system-id
      The system identifier in the entity's declaration.

MIME headers, including the cse header, only contain US-ASCII
characters. SGML entity declarations, on the other hand, will contain
characters from the SGML document's character set. Characters in that
set that are not US-ASCII should be represented as an SGML numeric
character reference in the reference concrete syntax (e.g., "&#nnn;",
where "nnn" is the integer code position of the character).

4. Encapsulating the SGML Entities

The SGML document's entities must be processed as a unit; handling the
MIME body parts individually is not sufficient. The MIME
Multipart/Related media-type provides the framework for handling the
SGML composite structure.

An SGML document can recursively contain subdocuments, each of which
has its own entity structure. The name spaces for SGML entities are
wholly contained within a subdocument. Consequently the entity names
specified on a Content-SGML-Entity header must exist in an environment
that preserves those name spaces. Recursively encapsulating each SGML
subdocument within a Multipart/Related MIME entity accomplishes that.
Thus a subdocument within the document occurs as its own
Multipart/Related entity within the document's Multipart/Related
entity. The recursive MIME Multipart structure directly mirrors the
recursive subdocument structure.

4.1. The Multipart/Related Media-Type

The Multipart/Related [RFC-REL] media-type contains a set of related
body parts, in this case entities in an SGML document, and its "start"
parameter names the body part within the MIME entity with which
processing starts, i.e. the SGML document entity. If there is no start
parameter then the first MIME entity in the Multipart/Related must be
the SGML document entity. The Multipart/Related "type" parameter shall
be the media-type of the SGML document entity, Text/SGML or
Application/SGML.
The "start-info" parameter can contain a list of one or more content
references which provide alternative sets of auxiliary information,
e.g. stylesheets. The unpacker shall accept the first such set of
information that it can use.

Below are sample excerpts of an encapsulated SGML document; Appendix A
contains an extended example.

4.2. Examples

The following examples point out some of the key features of the
MIME/SGML encapsulation. The examples cover a combined prolog and
instance with an implied SGML declaration, the use of the
Content-SGML-Entity header, Application/SGML-notation, the active
parameter, and auxiliary information.

In the examples below and in Appendix A all content references
[RFC-REF] are given by content-ID headers.

4.2.1. Implied SGML Declaration

Consider the following document instance which includes the SGML
prolog, but which implies the SGML declaration.

   ]> &chap1; &chap2; &chap3;

The value of the Multipart/Related MIME entity's start parameter is
the content-id of the MIME body part containing the document.

   MIME-Version: 1.0
   Content-Type: Multipart/Related; boundary=tiger-lily;
           start=""; type="application/SGML"

   --tiger-lily
   ...
   --tiger-lily
   Content-Type: Application/SGML
   Content-ID:

   ]> &chap1; &chap2; &chap3;
   --tiger-lily
   ...

4.2.2. An SGML Text Entity

The entity "chap3" would be a MIME body part such as

   --tiger-lily
   Content-Type: Text/SGML
   Content-SGML-Entity: decl-type=general; name=chap3;
           system-id="chapt3.sgml"

   This is chapter THREE ...
   --tiger-lily

Here, as in most other situations, the cse header describes the entity
contained in the body part.

4.2.3. A Notation Declaration

The notation declaration contained in the SGML prolog will be
represented as a separate body part.

   --tiger-lily
   Content-Type: Application/SGML-notation
   Content-SGML-Entity: decl-type=notation; name=jxz;
           system-id="/usr/local/bin/jxz";

   Content-type: Image/JPEG
   --tiger-lily

Note: It can be argued that a separate MIME body part associating an
SGML notation declaration name with a MIME media-type is redundant;
the association exists in the body part containing the actual data.
The content-type header gives the media-type and the
Content-SGML-Entity header gives the notation name. That, however,
does not suffice for entities that are not included directly. There
may be public entities that are not included in the encapsulation.

4.2.4. Script-based Notation

Consider a notation declaration in which the non-SGML data is
processed by a script interpreted by a local process. In this case we
use an x-safe-Tkl script.

   ...
   --tiger-lily
   Content-Type: Application/X-safe-Tkl
   Content-SGML-Entity: decl-type=notation; name=stkl;
           system-id="/usr/local/bin/safe-Tkl"

   [safe-Tkl script]
   --tiger-lily
   ...

4.2.5. Active and Doctype Parameter Usage

In the example below, two document type declarations are given, one of
which is active. Additionally, one other entity is included and is
labelled as to which document type contained the declaration.

   MIME-Version: 1.0
   Content-Type: Multipart/Related; start=; type=Text/SGML;
           boundary="tiger-lily"

   --tiger-lily
   Content-Type: Text/sgml
   Content-ID:
   Content-SGML-Entity: decl-type=doctype; active=usrgde

   ] > ] > ... &chap2; ...
   --tiger-lily
   --tiger-lily
   Content-Type: Text/sgml
   Content-SGML-Entity: name=chap2; decl-type=general;
           doctype=usrgde;

   This is Chapter 2 of a User Guide ...
   --tiger-lily--

4.2.6. Auxiliary Information

This example shows the inclusion of auxiliary information.
X-SGML-catalog is an invented media-type and does not imply a
recommendation or discouragement for using catalogs in an SGML
encapsulation. In this particular example the catalog has been
included inside the Multipart/Related MIME entity.

   MIME-Version: 1.0
   Content-Type: Multipart/Related; start=; type=Text/SGML;
           boundary="tiger-lily"

   --tiger-lily
   Content-Type: Text/SGML
   Content-ID:

   --tiger-lily
   Content-Type: Application/X-SGML-catalog; Version=TR9401
   Content-ID:

   ENTITY chap2 "/home/users/widget/chapt2.sgml"
   ENTITY chap3 "/home/users/widget/chapt3.sgml"
   PUBLIC "-//Acme//TEXT chapt1//EN"
           "/home/users/sgml/fpis/chapt1.sgml"
   PUBLIC "-//Acme//DTD Book//EN"
           "/home/users/sgml/dtds/book.dtd"
   PUBLIC "ISO 8879-1986//ENTITIES Added Math Symbols: Arrow Relations//EN"
           "/home/users/sgml/fpis/math-arrow.sgml"
   PUBLIC "ISO 8879-1986//ENTITIES Added Math Symbols: Binary Operators//EN"
           "/home/users/sgml/fpis/math-binops.sgml"
   --tiger-lily--

5. SGML Entities Not Part of a Document

Independent SGML and data entities included in MIME messages
constitute independent MIME body parts and are not included within a
Multipart/Related MIME entity. The SGML entities shall have the SGML
media-types appropriate to the data being sent; data entities shall
use the media-type corresponding to their notation declaration.

Content-SGML-Entity headers can be used with body parts that are not
included within any Multipart/Related MIME entity. This allows, for
example, an unpacker to add an entry to a catalog mapping the entity's
public identifier to the file in which it stored the entity.

6. SGML Document Interchange Format (SDIF)

A MIME encoding of SDIF [ISO-9069] is a conforming SDIF encoding
[N-1781] and consists of one or more MIME encapsulated SGML documents.
When more than one document is present the documents must be contained
in an appropriate Multipart MIME entity.

The following correspondence exists between MIME elements and SDIF
ones.

   SDIF element                 MIME element
   ---------------------------  ----------------------------------
   Data stream character set    SGML-boot parameter (see note)
   SDIF Name
      data-stream-name          Message-ID
      document-name             Content-ID
      explanatory comments      Content-Description
   Document descriptor          MIME body part indicated by
                                Multipart/Related
   Entity descriptor            MIME body part
   SDIF Identifier              Content-SGML-Entity

7. Security

SGML documents, like other compound documents, may contain entities
whose media-types present security concerns, e.g.
Application/PostScript. Further, SGML may contain explicit processing
instructions for a presentation or composition system; use of such
instructions presents concerns similar to those of
Application/PostScript.

The use of active media-types with Notation declarations can provide
an opportunity for the sender to execute a script or other code on the
recipient's machine. Unpacking software should alert the user when
such situations arise.

8. References

[ISO-8824]  ISO 8824, Information processing systems -- Open System
            Interconnection -- Specification of Abstract Syntax
            Notation One (ASN.1).

[ISO-8879]  ISO 8879:1986, Information processing -- Text and office
            systems -- Standard Generalized Markup Language (SGML).

[ISO-9069]  ISO 9069:1988, Information Processing -- SGML Support
            Facilities -- SGML Document Interchange Format (SDIF).

[ISO-10744] ISO/IEC 10744:1992, Information technology --
            Hypermedia/Time-based Structuring Language (HyTime) (as
            modified by First Proposed Technical Corrigendum, ISO/IEC
            JTC1/SC18 N5027).

[N-1781]    ISO/IEC JTC1/SC18/WG8 N1781, "Clarification of the
            Requirements for Encoding the SGML Document Interchange
            Format (SDIF, ISO 9069)".

[RFC-822]   Crocker, D., Standard for the Format of ARPA Internet Text
            Messages, August 1982, University of Delaware, RFC 822.

[RFC-1521]  N. Borenstein, N. Freed, "MIME (Multipurpose Internet Mail
            Extensions) Part One: Mechanisms for Specifying and
            Describing the Format of Internet Message Bodies",
            09/23/1993.

[RFC-1522]  K. Moore, "MIME (Multipurpose Internet Mail Extensions)
            Part Two: Message Header Extensions for Non-ASCII Text",
            09/23/1993.

[RFC-1590]  J. Postel, "Media Type Registration Procedure",
            03/02/1994.

[RFC-1642]  D. Goldsmith, M. Davis, "UTF-7, A Mail-Safe Transformation
            Format of UNICODE", 07/13/1994.

[RFC-REL]   H. Alvestrand, E. Levinson, "The MIME Multipart/Related
            Content-type", Internet Draft,
            draft-ietf-mimesgml-related-00.txt, working draft.

[RFC-ACTI]  E. Levinson, "The Message/External-Body Content-ID Access
            Type", Internet Draft, draft-ietf-mimesgml-cid-00.txt,
            working draft.

[TR9401]    SGML Open Consortium Technical Resolution 9401:1994,
            "Entity Management", 08/09/1994.

[US-ASCII]  Coded Character Set -- 7-Bit American Standard Code for
            Information Interchange, ANSI X3.4-1986.

9. Acknowledgements

The editor has borrowed freely from the suggestions of others and in
particular lifted text from James J. Clark and Charles F. Goldfarb
(Information Management Consulting) and benefitted from a number of
discussions with them. If any errors occurred in translating their
words into this text, rest assured that the misinterpretation was
mine.

The editor also acknowledges Terry Allen (O'Reilly & Associates,
Inc.), Harald T. Alvestrand (UniNett), Nathaniel Borenstein (First
Virtual Holdings Incorporated), Daniel W.
     Connolly (W3O), Steven DeRose (EBT), Roy Fielding (University
     of California, Irvine), Andy Gelsey (CSC), Paul Grosso
     (ArborText, Inc.), John Klensin (MCI), Einar Stefferud (Network
     Management Associates, Inc), Don Stinchfield (EBT), and Erik
     Naggum (Naggum Software) for their suggestions, explanations,
     and encouragement.  No errors or faults in this document can be
     ascribed to them; they are all mine.

10.  Author's Address

     Ed Levinson
     ELevinson@Accurate.com
     Accurate Information Systems, Inc.
     2 Industrial Way
     Eatontown, NJ 0772

APPENDIX A.  An Extended Example

     This example presents a variety of SGML entity declarations and
     the corresponding Content-SGML-Entity headers.

     MIME-Version: 1.0
     Content-Type: Multipart/Related; boundary=tiger-lily;
          start=""; type="application/SGML"

     --tiger-lily
     Content-Type: Application/SGML
     Content-ID:

     ]>
     &chap1;
     &chap2;
     &chap3;

     --tiger-lily
     Content-Type: Text/SGML
     Content-SGML-Entity: decl-type=general; name=chap1;
          public-id="-//Acme//TEXT chapt1//EN"

     This is chapter ONE ...

     --tiger-lily
     Content-Type: Text/SGML;
     Content-SGML-Entity: decl-type=general; name=chap2;

     This is chapter TWO ...

     --tiger-lily
     Content-Type: Text/SGML
     Content-SGML-Entity: decl-type=general; name=chap3;
          system-id="chapt3.sgml"

     This is chapter THREE ...

     --tiger-lily
     Content-Type: Application/SGML
     Content-SGML-Entity: decl-type=doctype; name=book;
          public-id="-//Acme//DTD Book//EN";
          system-id="/home/users/sgml/dtds/book.dtd"

     <!-- Acme Widget Company -->
     <!-- Instruction Book DTD -->
     &Isolat1;
     &ISOamsa;

     --tiger-lily
     Content-Type: image/jpeg
     Content-Transfer-Encoding: BASE64
     Content-SGML-Entity: decl-type=general; name=fig1;
          system-id="fig1.jxz"; notation-name=jxz

     [Base64 encoded binary image data]

     --tiger-lily--

APPENDIX B.  Notes for Implementors

     An SGML document is encapsulated with the sender's references
     to her local storage objects intact.  The receiving system's
     SGML Entity Manager may be able to translate those references
     to its local storage objects.  The recipient's storage objects
     must be provided by the MIME User Agent to the unpacker.

     Other SGML systems, not capable of translating the sender's
     references, must depend on the packer to parse the SGML
     document and replace the sender's references with valid local
     ones.

APPENDIX C.  ISO-10744 BCTF Values and Boot Attribute

C.1.  Bit Combination Transformation Format Values

     The following list of Bit Combination Transformation Format
     (BCTF) values is provided as a convenience.  The authoritative
     source is [ISO-10744].

     identity  Each bit combination is represented by a single
               octet; this BCTF can be used only for entities all of
               whose bit combinations have a value not exceeding
               255.

     fixed-2   Each bit combination is represented by exactly 2
               octets, with the more significant octet first; this
               BCTF can be used only for entities all of whose bit
               combinations have a value not exceeding 65535.
     fixed-3   Each bit combination is represented by exactly 3
               octets, with a more significant octet preceding any
               less significant octets; this BCTF can be used only
               for entities all of whose bit combinations have a
               value not exceeding 16777215.

     fixed-4   Each bit combination is represented by exactly 4
               octets, with a more significant octet preceding any
               less significant octets.

     utf-8     Each bit combination is represented by a variable
               number of octets according to UCS Transformation
               Format 8 defined in Annex P to be added by the first
               proposed draft amendment (PDAM 1) to ISO/IEC
               10646-1:1993.

     utf-7     Each bit combination is represented by a variable
               number of octets in the range 0 through 127 as
               described in [RFC-1642]; this BCTF can be used only
               for entities all of whose bit combinations have a
               value not exceeding 65535.

     euc-jp    Each bit combination is treated as a pair of octets,
               most significant octet first, encoding a character
               using the
               Extended_UNIX_Code_Fixed_Width_for_Japanese charset,
               and is transformed into the variable length sequence
               of octets that would encode that character using the
               Extended_UNIX_Code_Packed_Format_for_Japanese
               charset.

     sjis      Each bit combination is treated as a pair of octets,
               most significant octet first, encoding a character
               using the
               Extended_UNIX_Code_Fixed_Width_for_Japanese charset,
               and is transformed into the variable length sequence
               of octets that would encode that character using the
               Shift_JIS charset.

C.2.  The Boot Attribute

     The body part specified by the SGML-boot parameter contains a
     sequence of triplets of non-negative integers separated by
     white space.  The triplets correspond to the described
     character set portion [ISO-8879, 13.1.1.2] of the SGML
     declaration.
     SGML-boot provides the capability to identify the character set
     of the document's SGML declaration when it uses significant
     SGML characters [ibid., 4.298] in the SGML reference concrete
     syntax [ibid., 13.4] that have a character number [ibid., 4.44]
     in the document's character set that differs from us-ascii.
     The default value is "0 128 0", meaning that all characters are
     us-ascii.

     Notes:

     (1) The triplet has the following meaning.  Starting with
         character number dscn in the us-ascii character set,
         renumber noc characters starting at bscn and incrementing
         by one.  Thus "0 128 0" represents the identity mapping.

     (2) The document's declaration itself may also redefine the
         significant SGML characters; the boot attribute is intended
         to bootstrap the SGML system's parse of the declaration.
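
     As an informal illustration of note (1), the renumbering can be
     sketched in Python.  This is not part of the specification.  The
     function names parse_boot and boot_mapping are hypothetical, and
     the triplet order (dscn, noc, bscn) is an assumption taken from
     the order used in the described character set portion of an SGML
     declaration [ISO-8879, 13.1.1.2]; forms other than three
     integers are not handled.

```python
def parse_boot(text):
    """Split boot data into (dscn, noc, bscn) triplets.

    Assumption: each triplet is three whitespace-separated integers
    in the order dscn, noc, bscn.
    """
    numbers = [int(token) for token in text.split()]
    if len(numbers) % 3 != 0:
        raise ValueError("boot data must be a whole number of triplets")
    return [tuple(numbers[i:i + 3]) for i in range(0, len(numbers), 3)]


def boot_mapping(triplets):
    """Build the renumbering described in note (1).

    For each triplet, the noc characters starting at us-ascii
    character number dscn are renumbered starting at bscn,
    incrementing by one.
    """
    mapping = {}
    for dscn, noc, bscn in triplets:
        for offset in range(noc):
            mapping[dscn + offset] = bscn + offset
    return mapping
```

     Under these assumptions, boot_mapping(parse_boot("0 128 0"))
     maps every character number from 0 through 127 to itself, which
     is the identity mapping given as the default.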