MIMESGML Working Group                                       E. Levinson
Internet Draft: MIME/SGML                            ACCURATE Info. Sys.
                                                            June 1, 1995

     Encapsulating SGML Documents Using the Multipart/Related Content-Type

This draft document is being circulated for comment. Please send your comments to the authors or to the sgml-internet mail list. Archives of the email discussions are available at ftp://ftp.naggum.no:/pub/SGML-internet filed by date and time.

Status of this Memo

This document is an Internet Draft; Internet Drafts are working documents of the Internet Engineering Task Force (IETF), its Areas, and its Working Groups. Note that other groups may also distribute working documents as Internet Drafts.

Internet Drafts are draft documents valid for a maximum of six months. They may be updated, replaced, or obsoleted by other documents at any time. It is not appropriate to use Internet Drafts as reference material or to cite them other than as a "working draft" or "work in progress".

Please check the abstract listing in each Internet Draft directory for the current status of this or any other Internet Draft.

Abstract

This draft describes the encapsulation of a Standard Generalized Markup Language (SGML) document within a MIME message. It proposes new content sub-types of Text/SGML, Application/SGML, and Application/SGML-notation, and a new header, Content-SGML-Entity.

This specification uses the proposed Multipart/Related Content-Type [RFC-REL] and access-type=content-id [RFC-ACTI] specifications. Multipart/Related provides the mechanism for treating the entire document as a single object and access-type=content-id allows a single MIME entity to appear several times without replicating the body of that MIME entity.

Levinson December 1995 [Page 1] Internet Draft MIME-SGML

1. Introduction

A need exists for the transfer of documents constructed using the Standard Generalized Markup Language (SGML) [ISO-8879].
Those documents consist of a set of inter-related components whose structural relationship must be preserved independently of the system on which the document exists. The components and their relationships are often represented as files with explicit internal references to other components (files). The encapsulation described here permits such transfers using the Multipurpose Internet Mail Extensions (MIME) specification [RFC-1521].

The goal of the MIME encapsulation of SGML is to permit the receiving system to display (or process) the SGML document with minimal effort and maximum flexibility. In particular, multiple parses of the SGML document can be avoided by using the information from SGML entity and notation declarations. The Content-SGML-Entity header makes that information available.

Sections 2 and 3 define the basic elements for labelling the SGML entities. Section 4 describes the encapsulation of the document's entities within a single Multipart MIME entity. The two sections following that describe the handling of incomplete or unparsable documents and the SGML Document Interchange Format (SDIF) [ISO-9069].

1.1. Terminology

Both SGML and MIME use the term "entity" to refer to their basic components. Here the use of "entity" generally connotes an SGML entity. For MIME entities, "body part" is used; in some contexts that proves awkward and "MIME entity" is used instead. The context should make such usage clear.

Two SGML terms, SGML Document and SGML Document Entity, are used in this paper and the difference between them is significant. An SGML Document [ISO-8879, 4.282] is the entire collection of objects or entities that make up a document. Those objects include markup definitions, text with SGML markup, plain text, image data, etc. An SGML document entity [ibid., 4.283], on the other hand, is the specific object with which an SGML system begins processing the SGML document.

1.2. Standard Generalized Markup Language (SGML)

The Standard Generalized Markup Language (SGML) is used to encode document structure and layout. A rigorous description of SGML is left to [ISO-8879]. The terms used in the present document attempt to be consistent with SGML terminology and usage.

An SGML document exists as a collection of one or more entities; entities are system-independent analogues to files. Those SGML entities are mapped to storage objects or files and the mapping may be one-to-one, many-to-one, or one-to-many. The SGML document refers to the storage objects via entity declarations. The declarations may define the name and type of the storage object or provide a name by which an SGML system can map the declared entity to a storage object.

Preservation of the structure of references from one entity to another, known in SGML as the entity structure, is key to the email exchange of SGML documents. For a person or application to receive and display a complete SGML document, the mail message must carry a precise definition for each of the SGML entities. In the sender's environment the entities may reference standard names, called formal public identifiers, or specific local files, or both. Further, some SGML entities may refer to other entities, for example files containing text, images, or graphics. The identity and content of each entity must be available to the recipients to enable them to transform the sender's entity references into an equivalent local reference and to instantiate the entities locally.

This document describes the MIME encapsulation of an SGML document that preserves the entity structure and permits the recipient of the encapsulated document to automatically instantiate it locally.

1.3. SGML Document Interchange Format (SDIF)

SDIF [ISO-9069] defines a data stream structure for exchanging one or more SGML documents.
A multipart MIME message consisting of one or more documents as described below is a conforming SDIF data stream [N-1781].

2. A Model for MIME/SGML

Four issues must be addressed for the recipient's user agent to display a complete SGML document. The various document parts must be specified and entity references on the sender's system must be resolved to corresponding references on the receiver's system. Similarly, notation declarations, that is, references to processors for non-SGML data, must be resolved into valid processes on the receiving system. An appropriate application, called the unpacker, must be in control to present the MIME body parts and the entity and notation information to the display software. Finally, the MIME encapsulated SGML text entities must be independent of the sender's system representation dependencies.

The first three issues are addressed in the following manner. Two MIME media-types (content-types) are defined for SGML text, Text/SGML and Application/SGML, and one for conveying process associations, Application/SGML-notation. A new header, Content-SGML-Entity, provides the entity description for the body part containing the entity. Notation information, carried in an Application/SGML-notation body part, associates notation declarations with MIME media-types. The entities that constitute the SGML document are contained within the same Multipart/Related MIME entity. These elements form the basis for the SGML MIME encapsulation.

SGML defines an Entity Manager [ISO-8879, 4.123] that performs the mapping between SGML entities and the local file system. The specification of that mapping is system dependent. Consequently, each SGML entity shall be represented as one MIME entity.

2.1. The SGML Media-Types

There are two media-types for SGML parsable data entities, Text/SGML and Application/SGML.
Both have the same optional parameters and produce the same results for recipients with SGML capability. Text/SGML provides a fallback for those without SGML capability.

Senders should base the choice between text and application media-types on the entity's content. Text is suggested for entities that would be meaningful to a human being without SGML processing. Application/SGML is recommended for all others.

A third media-type, Application/SGML-notation, applies to non-SGML data and provides the connection between an SGML declaration and a MIME media-type.

2.1.1. Text/SGML

     MIME type name:            Text
     MIME subtype name:         SGML
     Required parameters:       none
     Optional parameters:       charset, SGML-bctf, SGML-boot
     Encoding considerations:   may be encoded
     Security considerations:   none
     Published specification:   RFC-SGML
     Person and email address to contact for further information:
                                E. Levinson

The Text/SGML media-type can be employed when the contents of the SGML entity are intended to be read by a human and are in a readily comprehensible form. That is, the content can be easily discerned by someone without SGML display software. Each record in the SGML entity, delimited by record start (RS) and record end (RE) codes, must correspond to a line in the Text/SGML body part.

SGML entities that do not meet the above requirements should use the Application/SGML media-type.

2.1.2. Application/SGML

     MIME type name:            Application
     MIME subtype name:         SGML
     Required parameters:       none
     Optional parameters:       SGML-bctf, SGML-boot
     Encoding considerations:   may be encoded
     Security considerations:   none
     Published specification:   RFC-SGML
     Person and email address to contact for further information:
                                E. Levinson

Use the Application/SGML media-type for SGML text entities that are not appropriate for Text/SGML. When used, each record start (RS) and record end (RE) character shall be explicitly represented by the bit combination specified in the SGML declaration.

2.1.3. Application/SGML-Notation

     MIME type name:            Application
     MIME subtype name:         SGML-Notation
     Required parameters:       none
     Optional parameters:       none
     Encoding considerations:   none
     Security considerations:   none
     Published specification:   RFC-SGML
     Person and email address to contact for further information:
                                E. Levinson

The Application/SGML-Notation media-type provides the connection between the document's SGML notation declarations and MIME media-types. The MIME entity must contain a Content-SGML-Entity header. The body of the SGML-Notation MIME entity contains a Content-Type header that specifies the media-type associated with the name parameter of the Content-SGML-Entity header.

Some SGML notation declarations may correspond to a script for an active media-type (e.g., safe-Tkl). In those cases a MIME entity with the corresponding media-type should be used. That MIME entity shall contain an appropriate Content-SGML-Entity header.

2.2. SGML Content-Type Parameters

The parameters for the Text and Application SGML subtypes are defined below.

charset
     The charset parameter is defined in [RFC-1521]; the valid values and their meanings are registered by the Internet Assigned Numbers Authority (IANA) [RFC-1590]. The default charset value for all Text content-types is "us-ascii" [RFC-1521]. The charset parameter is provided to permit non-SGML capable systems to provide reasonable behavior when Text/SGML defaults to Text/Plain. SGML capable systems will use the SGML-bctf parameter.

SGML-bctf
     The SGML-bctf (SGML bit combination transformation format) parameter describes the method used to transform the sequence of constant width binary numbers (called "bit combinations" in [ISO-8879, 4.24]) that constitute the entity into the octet stream contained in the MIME body part. Valid values for SGML-bctf are the BCTF notation names defined in Annex C of [ISO-10744]; they are reproduced for convenience in Appendix C.
SGML-boot
     The SGML-boot parameter value is the content-ID of a MIME body part (Application/Octet-stream) that satisfies the requirements of the boot attribute in [ISO-10744]. Appendix C contains a summary of those requirements.

2.3. ABNF Specification of the SGML Media-Types

     sgml-type := sgml-tora parm-list / sgml-nttn
     sgml-tora := ( "text" / "application" ) "/" "SGML"
     sgml-nttn := "application" "/" "SGML-notation"
     parm-list := *( ";" sgml-parm )
     sgml-parm := sgml-attr "=" value
     sgml-attr := "charset" / "SGML-bctf" / "SGML-boot"

Note: The sgml-tora, sgml-nttn, and sgml-attr strings are case independent.

2.4. Data Entities

Data entities (those that contain data that may not be parseable as SGML) shall be included as MIME body parts whose media-types reflect the data content, i.e., Text/Plain, Image/JPEG, etc.

3. The Content-SGML-Entity Header

The Content-SGML-Entity (cse) header is required when encapsulating an SGML document within a Multipart/Related MIME entity. The header contains information from the SGML entity declaration corresponding to the entity contained in the body of the body part. A cse header is not required for an SGML entity that is not declared by any other entity in the MIME message. A catalog, defined in [TR9401], can be generated by the unpacker using the cse header data.

When the same data is referred to by several SGML entity declarations, the data need be present in only one MIME body part. Subsequent body parts can use the Message/External-Body media-type with access-type=content-id [RFC-ACTI]. Each of those body parts must have its own Content-SGML-Entity header.

The Content-SGML-Entity header is defined as follows.

     entity-header   := "Content-SGML-Entity" ":"
                        "decl-type" "=" decl-type
                        *( ";" cse-parm )

     decl-type       := "doctype" / "linktype" / "general" /
                        "parameter" / "notation" / "baseset" /
                        "capacity" / "syntax"

     cse-parm        := cse-attr "=" value

     cse-attr        := "name" / "doctype" / "linktype" /
                        "public-id" / "public-id-ver" / "system-id" /
                        "notation-name" / "content-type" /
                        extension-token

     value           := token / quoted-string   ; cf. [RFC-1521]

     extension-token := ( "X-" / "x-" ) token   ; no intervening white space

decl-type
     A string specifying the entity declaration type. Decl-type is a token specifying how the entity was declared. Within an SGML document or subdocument each entity type constitutes a unique name space. The possible values for decl-type are:

     doctype
          An entity containing an external DTD subset, declared by a doctype declaration; the name in this case would be the document type name.

     baseset
          An entity declared by a public identifier in a base character set [production 174, ISO-8879, 13.1.1.1].

     capacity
          An entity declared by a public identifier in a capacity set [180, ISO-8879, 13.2].

     general
          An entity declared in an entity declaration as a general entity.

     linktype
          An entity containing an external Link Process Definition subset, declared in a linktype declaration; the name parameter is the link type name.

     notation
          The header describes a notation declaration and, for Application/SGML-notation, the body of the MIME body part contains a content-type header giving the MIME content type corresponding to the notation name or, for other media-types, a description of the processing that the notation specifies.

     parameter
          An entity declared in an entity declaration as a parameter entity.

     syntax
          An entity declared by a public identifier in a public concrete syntax [183, ISO-8879, 13.4].

doctype
     A string specifying the document type name of the DTD subset in which the entity was declared, if the entity was declared in a DTD subset other than the base DTD subset. This parameter applies only to entities with a decl-type of "general", "notation", or "parameter".

extension-token
     A parameter not defined in this document and agreed upon by the parties using it, a group of consenting adults.

linktype
     A string specifying the link type name of the Link Process Definition (LPD) subset in which the entity was declared, if the entity was declared in an LPD subset. This parameter is required only for entities with a decl-type of "general" or "parameter".

name
     A string giving the name of the entity; it is omitted if the entity has no name.

notation-name
     The notation name of an external entity. Not valid if the decl-type is "notation". The value of this parameter will be the value of the name parameter of the Content-SGML-Entity header that describes the notation declaration.

public-id
     The public identifier in the entity's declaration.

public-id-ver
     The display version if a public text display version was not present in the public id. Use this parameter only if a device dependent display version was used.

system-id
     The system identifier in the entity's declaration.

MIME headers, including the cse header, contain only US-ASCII characters. SGML entity declarations, on the other hand, will contain characters from the SGML document's character set. Characters in that set that are not US-ASCII should be represented as an SGML numeric character reference in the reference concrete syntax (e.g., "&#nnn;", where "nnn" is the integer code position of the character).

4. Encapsulating SGML Documents

SGML documents must be processed as a unit; handling the individual MIME body parts is not sufficient. The MIME Multipart/Related provides the framework for handling the SGML composite structure.
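
The Content-SGML-Entity syntax of Section 3, together with its numeric character reference convention for characters outside US-ASCII, can be sketched in Python as follows. This is an illustrative sketch only and not part of the specification. The function names (to_ascii, format_value, format_cse) are hypothetical, Python code points stand in for code positions in the document's character set, control characters are not checked, and public and system identifiers are always written as quoted-strings simply to match the examples in this draft.

```python
import re

# decl-type values described in Section 3.
DECL_TYPES = {"doctype", "linktype", "general", "parameter",
              "notation", "baseset", "capacity", "syntax"}

# Assumption: identifiers are always quoted, as in this draft's examples.
ALWAYS_QUOTED = {"public-id", "system-id"}

# Whitespace or an RFC 1521 tspecial means the value is not a bare token.
_NOT_TOKEN = re.compile(r'[\s()<>@,;:\\"/\[\]?=]')


def to_ascii(text):
    """Replace non-US-ASCII characters with "&#nnn;" references.

    Assumption: ord() gives the code position in the document's
    character set, which holds only when that set agrees with Unicode.
    """
    return "".join(ch if ord(ch) < 128 else "&#%d;" % ord(ch) for ch in text)


def format_value(attr, text):
    """Return text as a token, or as a quoted-string when required."""
    text = to_ascii(text)
    if attr in ALWAYS_QUOTED or text == "" or _NOT_TOKEN.search(text):
        return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'
    return text


def format_cse(decl_type, **params):
    """Build a Content-SGML-Entity header line.

    Keyword names use "_" where the header attribute uses "-",
    e.g. system_id for system-id.
    """
    if decl_type not in DECL_TYPES:
        raise ValueError("unknown decl-type: %r" % (decl_type,))
    parts = ["decl-type=" + decl_type]
    for key, text in params.items():
        attr = key.replace("_", "-")
        parts.append("%s=%s" % (attr, format_value(attr, text)))
    return "Content-SGML-Entity: " + "; ".join(parts)


print(format_cse("general", name="chap3", system_id="chapt3.sgm"))
```

The final line prints the same header used for the "chap3" entity in the examples of this draft. An entity name containing a non-US-ASCII character would come out quoted, because the ";" that ends the numeric character reference is not permitted in a bare token.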

An SGML document can recursively contain subdocuments, each of which has its own entity structure. The name spaces for SGML entities are wholly contained within a subdocument. Consequently the entity names specified on a Content-SGML-Entity header must exist in an environment that preserves those name spaces. Recursively encapsulating each SGML subdocument within a Multipart/Related MIME entity accomplishes that. Thus a subdocument within the document occurs as its own Multipart/Related entity within the document's Multipart/Related entity. The recursive MIME Multipart structure directly mirrors the recursive subdocument structure.

4.1. The Multipart/Related Media-Type

The Multipart/Related [RFC-REL] media-type contains a set of related body parts, in this case an SGML document, and its start parameter names the body part within the Multipart/Related MIME entity with which processing starts, the SGML document entity. That body part must contain a Content-ID header whose value corresponds to the one in the start parameter. If there is no start parameter then the first MIME entity in the Multipart/Related must be the SGML document entity.

Below are sample excerpts of an encapsulated SGML document. Appendix A contains the complete example.

4.2. Examples

The following examples point out some of the key features of the MIME/SGML encapsulation. The examples cover a combined prolog and instance with an implied SGML declaration, the use of the Content-SGML-Entity header, and Application/SGML-notation.

4.2.1. Implied SGML declaration

Consider the following document instance, which includes the SGML prolog but implies the SGML declaration.

     ]>
     &chap1;
     &chap2;
     &chap3;

The value of the Multipart/Related MIME entity's start parameter is the content-id of the MIME body part containing the document.

     MIME-Version: 1.0
     Content-Type: Multipart/Related; boundary=tiger-lily;
          start=""; type="application/SGML"

     --tiger-lily
     ...
     --tiger-lily
     Content-Type: Application/SGML
     Content-ID:

     ]>
     &chap1;
     &chap2;
     &chap3;
     --tiger-lily
     ...

4.2.2. An SGML Text Entity

The entity "chap3" would be a MIME body part such as

     --tiger-lily
     Content-Type: Text/SGML
     Content-SGML-Entity: decl-type=general; name=chap3;
          system-id="chapt3.sgm"

     This is chapter THREE ...
     --tiger-lily

Here, as in most other situations, the cse header describes the entity contained in the body part.

4.2.3. A Notation Declaration

The notation declaration contained in the SGML prolog will be represented as a separate body part.

     --tiger-lily
     Content-Type: Application/SGML-notation
     Content-SGML-Entity: decl-type=notation; name=jxz;
          system-id="/usr/local/bin/jxz"

     Content-Type: Image/JPEG
     --tiger-lily

Note: It can be argued that a separate MIME body part associating an SGML notation declaration name with a MIME media-type is redundant; the association exists in the body part containing the actual data. The content-type header gives the media-type and the Content-SGML-Entity header gives the notation name. That, however, does not suffice for entities that are not included directly. There may be public entities that are not included in the encapsulation.

4.2.4. Script-based Notation

Consider a notation declaration in which the non-SGML data is processed by a script interpreted by a local process. In this case we use an x-safe-Tkl script.

     ...
     --tiger-lily
     Content-Type: Application/X-safe-Tkl
     Content-SGML-Entity: decl-type=notation; name=stkl;
          system-id="/usr/local/bin/safe-Tkl"

     [safe-Tkl script]
     --tiger-lily
     ...

5. Partial or Incomplete Documents

Independent SGML and data entities included in MIME messages constitute independent MIME body parts and are not included within a Multipart/Related MIME entity. The SGML entities shall have the SGML media-types appropriate to the data being sent; data entities shall use the media-type corresponding to their notation declaration.

Content-SGML-Entity headers can be used with body parts that are not included within any Multipart/Related MIME entity. This allows, for example, an unpacker to add an entry to a catalog mapping the entity's public identifier to the file in which it stored the entity.

6. SGML Document Interchange Format (SDIF)

A MIME encoding of SDIF [ISO-9069] is a conforming SDIF encoding [N-1781] and consists of one or more MIME encapsulated SGML documents. When more than one document is present the documents must be contained in an appropriate Multipart MIME entity.

The following correspondence exists between MIME elements and SDIF ones.

     SDIF                          MIME

     Data stream character set     SGML-boot parameter (see note)
     SDIF Name
          data-stream-name         Message-ID
          document-name            Content-ID
     explanatory comments          Content-Description
     Document descriptor           MIME body part indicated by
                                   Multipart/Related
     Entity descriptor             MIME body part
     SDIF Identifier               Content-SGML-Entity

Note that the MIME encapsulation permits each document to have its own document character set.

7. Security

SGML documents, like other compound documents, may contain entities whose media-types present security concerns, e.g. Application/PostScript. Further, SGML documents may contain explicit processing instructions for a presentation or composition system; use of such instructions presents concerns similar to those of Application/PostScript.

The use of active media-types with Notation declarations can provide an opportunity for the sender to execute a script or other code on the recipient's machine. Unpacking software should alert the user when such situations arise.

8. References

[ISO-8824] ISO 8824, Information processing systems -- Open System Interconnection -- Specification of Abstract Syntax Notation One (ASN.1).

[ISO-8879] ISO 8879:1986, Information processing -- Text and office systems -- Standard Generalized Markup Language (SGML).

[ISO-9069] ISO 9069:1988, Information Processing -- SGML Support Facilities -- SGML Document Interchange Format (SDIF).

[ISO-10744] ISO/IEC 10744:1992, Information technology -- Hypermedia/Time-based Structuring Language (HyTime) (as modified by First Proposed Technical Corrigendum, ISO/IEC JTC1/SC18 N5027).

[N-1781] ISO/IEC JTC1/SC18/WG8 N1781, "Clarification of the Requirements for Encoding the SGML Document Interchange Format (SDIF, ISO 9069)".

[RFC-822] Crocker, D., "Standard for the Format of ARPA Internet Text Messages", RFC 822, University of Delaware, August 1982.

[RFC-1521] N. Borenstein, N. Freed, "MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies", RFC 1521, 09/23/1993.

[RFC-1522] K. Moore, "MIME (Multipurpose Internet Mail Extensions) Part Two: Message Header Extensions for Non-ASCII Text", RFC 1522, 09/23/1993.

[RFC-1590] J. Postel, "Media Type Registration Procedure", RFC 1590, 03/02/1994.

[RFC-1642] D. Goldsmith, M. Davis, "UTF-7, A Mail-Safe Transformation Format of UNICODE", RFC 1642, 07/13/1994.

[RFC-REL] H. Alvestrand, E. Levinson, "The MIME Multipart/Related Content-type", Internet Draft, draft-ietf-mimesgml-related-00.txt.

[RFC-ACTI] E. Levinson, "The Message/External-Body Content-ID Access Type", Internet Draft, draft-ietf-mimesgml-cid-00.txt.

[TR9401] SGML Open Consortium Technical Resolution 9401:1994, "Entity Management", 08/09/1994.

[US-ASCII] Coded Character Set -- 7-Bit American Standard Code for Information Interchange, ANSI X3.4-1986.

9. Acknowledgements

The editor has borrowed freely from the suggestions of others and in particular lifted text from James Clark and Charles F. Goldfarb (Information Management Consulting), and ideas from Roy Fielding (University of California, Irvine). If any errors occurred in translating their words into this text, rest assured that the misinterpretation was mine.

The editor also acknowledges Terry Allen (O'Reilly & Associates, Inc.), Harald T. Alvestrand (UniNett), Nathaniel Borenstein (First Virtual Holdings Incorporated), Daniel W. Connolly (W3O), Steven DeRose (EBT), Andy Gelsey (CSC), Paul Grosso (ArborText, Inc.), John Klensin (MCI), Einar Stefferud (Network Management Associates, Inc), and Erik Naggum (Naggum Software), for their suggestions, explanations, and encouragement. No errors or faults in this document can be ascribed to them; they're all mine.

10. Author's Address

     Ed Levinson
     ELevinson@Accurate.com
     Accurate Information Systems, Inc.
     2 Industrial Way
     Eatontown, NJ 0772

APPENDIX A. A Complete Example

     MIME-Version: 1.0
     Content-Type: Multipart/Related; boundary=tiger-lily;
          start=""; type="application/SGML"

     --tiger-lily
     Content-Type: Application/SGML
     Content-ID:

     ]>
     &chap1;
     &chap2;
     &chap3;
     --tiger-lily
     Content-Type: Text/SGML
     Content-SGML-Entity: decl-type=general; name=chap1;
          public-id="-//Acme//TEXT chapt1//EN"

     This is chapter ONE ...
     --tiger-lily
     Content-Type: Text/SGML
     Content-SGML-Entity: decl-type=general; name=chap2

     This is chapter TWO ...
     --tiger-lily
     Content-Type: Text/SGML
     Content-SGML-Entity: decl-type=general; name=chap3;
          system-id="chapt3.sgm"

     This is chapter THREE ...
     --tiger-lily
     Content-Type: Application/SGML
     Content-SGML-Entity: decl-type=doctype; name=book;
          public-id="-//Acme//DTD Book//EN";
          system-id="/home/users/sgml/dtds/book.dtd"

     <!-- Acme Widget Company -->
     <!-- Instruction Book DTD -->
     --tiger-lily
     Content-Type: image/jpeg
     Content-Transfer-Encoding: BASE64
     Content-SGML-Entity: decl-type=general; name=fig1;
          system-id="fig1.jxz"; notation-name=jxz

     [Base64 encoded binary image data]
     --tiger-lily--

APPENDIX B. Notes for Implementors

An SGML document is encapsulated with the sender's references to her local storage objects intact. The receiving system's SGML Entity Manager may be able to translate those references to its local storage objects. The recipient's storage objects must be provided by the MIME User Agent to the unpacker.

Other SGML systems, not capable of translating the sender's references, must depend on the unpacker to parse the SGML document and replace the sender's references with valid local ones.

APPENDIX C. ISO-10744 BCTF Values and Boot Attribute

C.1. Bit Combination Transformation Format Values

The following list of Bit Combination Transformation Format (BCTF) values is provided as a convenience. The authoritative source is [ISO-10744].

identity
     Each bit combination is represented by a single octet; this BCTF can be used only for entities all of whose bit combinations have a value not exceeding 255.

fixed-2
     Each bit combination is represented by exactly 2 octets, with the more significant octet first; this BCTF can be used only for entities all of whose bit combinations have a value not exceeding 65535.

fixed-3
     Each bit combination is represented by exactly 3 octets, with a more significant octet preceding any less significant octets; this BCTF can be used only for entities all of whose bit combinations have a value not exceeding 16777215.

fixed-4
     Each bit combination is represented by exactly 4 octets, with a more significant octet preceding any less significant octets.

utf-8
     Each bit combination is represented by a variable number of octets according to UCS Transformation Format 8 defined in Annex P to be added by the first proposed draft amendment (PDAM 1) to ISO/IEC 10646-1:1993.

utf-7
     Each bit combination is represented by a variable number of octets in the range 0 through 127 as described in [RFC-1642]; this BCTF can be used only for entities all of whose bit combinations have a value not exceeding 65535.

euc-jp
     Each bit combination is treated as a pair of octets, most significant octet first, encoding a character using the Extended_UNIX_Code_Fixed_Width_for_Japanese charset, and is transformed into the variable length sequence of octets that would encode that character using the Extended_UNIX_Code_Packed_Format_for_Japanese charset.

sjis
     Each bit combination is treated as a pair of octets, most significant octet first, encoding a character using the Extended_UNIX_Code_Fixed_Width_for_Japanese charset, and is transformed into the variable length sequence of octets that would encode that character using the Shift_JIS charset.

C.2. The Boot Attribute

The body part specified by the SGML-boot parameter contains a sequence of triplets of positive integers separated by white space. The triplets correspond to the described character set portion [ISO-8879, 13.1.1.2] of the SGML declaration.
SGML-boot provides the capability to identify the character set of the document's SGML declaration when it uses significant SGML characters [ibid., 4.298] in the SGML reference concrete syntax [ibid., 13.4] that have a character number [ibid., 4.44] in the document's character set that differs from us-ascii. The default value is "0 128 0": all characters are us-ascii.

Notes:

(1) The triplet has the following meaning. Starting with character number dscn in the us-ascii character set, renumber noc characters starting at bscn and incrementing by one. Thus "0 128 0" represents the identity mapping.

(2) The document's declaration itself may also redefine the significant SGML characters; the boot attribute is intended to bootstrap the SGML system's parse of the declaration.
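
As an illustration of the BCTF values listed in C.1, the following Python sketch transforms a sequence of integer bit combinations to and from an octet string for the identity, fixed-2, fixed-3, fixed-4, and utf-8 values. It is not taken from [ISO-10744]. The names encode_bctf and decode_bctf are hypothetical, the BCTF names are compared exactly as written in C.1, and Python's built-in UTF-8 codec is assumed to be an adequate stand-in for the UCS Transformation Format 8 cited there (it rejects surrogates and values above 1114111). The utf-7, euc-jp, and sjis values are left out.

```python
# Octets per bit combination for the fixed-width BCTF values in C.1.
FIXED_WIDTH = {"identity": 1, "fixed-2": 2, "fixed-3": 3, "fixed-4": 4}


def encode_bctf(bit_combinations, bctf):
    """Transform integer bit combinations into an octet string."""
    if bctf in FIXED_WIDTH:
        width = FIXED_WIDTH[bctf]
        try:
            # "big" places the more significant octet first.
            return b"".join(n.to_bytes(width, "big") for n in bit_combinations)
        except OverflowError:
            raise ValueError("bit combination out of range for " + bctf) from None
    if bctf == "utf-8":
        # Assumption: bit combinations are Unicode code points.
        return "".join(chr(n) for n in bit_combinations).encode("utf-8")
    raise NotImplementedError("BCTF not covered by this sketch: " + bctf)


def decode_bctf(octets, bctf):
    """Recover the list of integer bit combinations from an octet string."""
    if bctf in FIXED_WIDTH:
        width = FIXED_WIDTH[bctf]
        if len(octets) % width:
            raise ValueError("octet count is not a multiple of %d" % width)
        return [int.from_bytes(octets[i:i + width], "big")
                for i in range(0, len(octets), width)]
    if bctf == "utf-8":
        return [ord(ch) for ch in octets.decode("utf-8")]
    raise NotImplementedError("BCTF not covered by this sketch: " + bctf)
```

For example, encode_bctf([0x41, 0x3042], "fixed-2") yields the four octets 00 41 30 42, and decode_bctf applied to those octets returns the original two values. A value above 255 under "identity" raises ValueError, matching the range limit stated in C.1.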