Applications Area Working Group                               S. Leonard
Internet-Draft                                             Penango, Inc.
Intended Status: Informational                          October 17, 2014
Expires: April 20, 2015

                     The text/markdown Media Type
                  draft-ietf-appsawg-text-markdown-03

Abstract

   This document registers the text/markdown media type for use with
   Markdown, a family of plain text formatting syntaxes that optionally
   can be converted to formal markup languages such as HTML.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Leonard                   Exp. April 20, 2015                   [Page 1]
Internet-Draft       The text/markdown Media Type           October 2014

Table of Contents

   1. Introduction
      1.1. This Is Markdown! Or: Markup and Its Discontents
      1.2. Markdown Is About Writing and Editing
      1.3. RFC 2119
   2. Markdown Media Type Registration Application
   3. Optional Parameters
      3.1. syntax
      3.2. output-type
   4. Fragment Identifiers
      4.1. #t
      4.2. #o
      4.3. #l and #ldef
      4.4. Other Fragment Identifiers
   5. Example
   6. IANA Considerations
      6.1. Syntax Template
      6.2. Initial Registration
      6.3. Reserved Identifiers
      6.4. Standard of Review
      6.5. Provisional Registration
   7. Security Considerations
   8. References
      8.1. Normative References
      8.2. Informative References
   Appendix A. Change Log
   Author's Address

1. Introduction

1.1. This Is Markdown! Or: Markup and Its Discontents

   In computer systems, textual data is stored and processed using a
   continuum of techniques. On the one end is plain text: a linear
   sequence of characters in some character set (code), possibly
   interrupted by line breaks, page breaks, or other control characters.
   The repertoire of these control characters (a form of in-band
   signaling) is necessarily limited, and not particularly extensible.
   Because they are non-printing, these characters are also hard to
   enter with standard keyboards.

   Markup offers an alternative means to encode this signaling
   information by overloading certain characters with additional
   meanings. Therefore, markup languages allow for annotating a document
   in such a way that annotations are syntactically distinguishable from
   the printing information. Markup languages are (reasonably)
   well-specified and tend to follow (mostly) standardized syntax rules.
   Examples of formal markup languages include SGML, HTML, XML, and
   LaTeX. Standardized rules lead to interoperability between markup
   processors, but impose skill requirements on new users that lead to
   markup languages becoming less accessible to beginners. These rules
   also reify "validity": content that does not conform to the rules is
   treated differently (i.e., is rejected) than content that conforms.

   In contrast to formal markup languages, lightweight markup languages
   use simple syntaxes; they are designed to be easy for humans to enter
   and understand with basic text editors. Markdown, the subject of this
   document, began as an /informal/ plain text formatting syntax
   [MDSYNTAX] and Perl script HTML/XHTML processor [MARKDOWN] targeted
   at non-technical users using unspecialized tools, such as plain text
   e-mail clients. [MDSYNTAX] explicitly rejects the notion of validity:
   there is no such thing as "invalid" Markdown. If the Markdown content
   does not result in the "right" output (defined as output that the
   author wants, not output that adheres to some dictated system of
   rules), the expectation is that the author should continue
   experimenting by changing the content or the processor to achieve the
   desired output.

   Since its development in 2004 [MARKDOWN], a number of web- and
   Internet-facing applications have incorporated Markdown into their
   text entry systems, frequently with custom extensions. Markdown has
   thus evolved into a kind of Internet meme [INETMEME] as different
   communities encounter it and adapt the syntax for their specific use
   cases. Markdown now represents a family of related plain text
   formatting syntaxes that, while broadly compatible with humans
   [HUMANE], are intended to produce different kinds of outputs that
   push the boundaries of mutual intelligibility between software
   systems.

   To support identifying and conveying Markdown, this document defines
   a media type and parameters that indicate the author's intent on how
   to interpret the Markdown. This registration draws particular
   inspiration from text/troff [RFC4263], which is a plain text
   formatting syntax for typesetting based on tools from the 1960s
   ("RUNOFF") and 1970s ("nroff", et al.). In that sense, Markdown is a
   kind of troff for modern computing.

   A companion document [MDMTUSES] provides additional Markdown
   background and philosophy.

1.2. Markdown Is About Writing and Editing

      "HTML is a *publishing* format; Markdown is a *writing* format.
      Thus, Markdown's formatting syntax only addresses issues that can
      be conveyed in plain text." [MDSYNTAX]

   The paradigmatic use case for text/markdown is the Markdown editor:
   an application that presents Markdown content (which looks like an
   e-mail or other piece of plain text writing) alongside a published
   format, so that an author can see results instantaneously and can
   tweak his or her input in real-time.
   A significant number of Markdown editors have adopted "split-screen
   view" (or "live preview") technology that looks like Figure 1:

+----------------------------------------------------------------------+
| File  Edit  (Cloud Stuff)  (Fork Me on GitHub)  Help                 |
+----------------------------------------------------------------------+
| [ such-and-such identifier ]                     [ useful statistics]|
+----------------------------------++----------------------------------+
| (plain text, with                || (text/html, likely               |
|  syntax highlighting)            ||  rendered to screen)             |
|                                  ||                                  |
|# Introduction                    ||<h1>Introduction</h1>             |
|                                  ||                                  |
|## Markdown Is About Writing and  /|<h2>Markdown Is About Writing and |
/ Editing                          ||Editing</h2>                      |
|                                  ||                                  |
|> HTML is a *publishing* format;  ||<blockquote><p>HTML is a          |
|> Markdown is a *writing* format. || <em>publishing</em> format;      |
|> Thus, Markdown's formatting     || Markdown is a <em>writing</em>   |
|> syntax only addresses issues    || format. Thus, Markdown's         |
|> that can be conveyed in plain   <> formatting syntax only addresses |
|> text. [MDSYNTAX][]              || issues that can be conveyed in   |
|                                  || plain text. <a href="http://darin|
|The paradigmatic use case for     ||gfireball.net/projects/markdown/sy|
|`text/markdown` is the Markdown   ||ntax#html" title="Markdown: Syntax|
|editor: an application that       ||: HTML">MDSYNTAX</a></p>          |
|presents Markdown content         ||</blockquote>                     |
|...                               ||                                  |
|                                  ||<p>The paradigmatic use case for  |
|[MDSYNTAX]: http://daringfireball./| <code>text/markdown</code> is the|
/net/projects/markdown/syntax#html || Markdown editor: an application  |
|"Markdown: Syntax: HTML"          || that presents Markdown content   |
|                                  || ...</p>                          |
+----------------------------------++----------------------------------+

   LEGEND: "/" embedded in a vertical line represents a
   line-continuation marker, since a line break is not supposed to occur
   in that content.

          Figure 1: Markdown Split-Screen/Live Preview Editor

   Users on diverse platforms SHOULD be able to collaborate with their
   tools of choice, whether those tools are desktop-based (MarkdownPad,
   MultiMarkdown Composer), browser-based (Dillinger, Markable),
   integrated widgets (Discourse, GitHub), general-purpose editors
   (emacs, vi), or plain old "Notepad". Additionally, users SHOULD be
   able to identify particular areas of Markdown content when the
   Markdown becomes appreciably large (e.g., book chapters and
   Internet-Drafts--not just blog posts). Users SHOULD be able to use
   text/markdown to convey their works in progress, not just their
   finished products (for which full-blown markups ranging from
   text/html to application/pdf are appropriate). This registration
   facilitates interoperability between these Markdown editors by
   conveying the syntax of the particular Markdown variant and the
   desired output format.

1.3. RFC 2119

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2. Markdown Media Type Registration Application

   This section provides the media type registration application for the
   text/markdown media type (see [RFC6838], Section 5.6).

   Type name: text

   Subtype name: markdown

   Required parameters:

      charset: Per Section 4.2.1 of [RFC6838], charset is REQUIRED.
      There is no default value. [MDSYNTAX] clearly describes Markdown
      as a writing format; its syntax rules operate on characters
      (specifically, on punctuation) rather than code points.
      Neither [MDSYNTAX] nor many popular implementations at the time of
      this registration actually require or assume any particular
      encoding. Many Markdown processors will get along just fine by
      operating on character codes that lie in printable US-ASCII,
      blissfully oblivious to coded values outside of that range.

   Optional parameters:

      The following parameters reflect the author's intent regarding the
      content. A detailed specification can be found in Section 3.

      syntax: The Markdown-derivative syntax of the content, with
      optional version and named extensions. Default value: none
      (receiver's choice).

      output-type: The Content-Type (Internet media type) of the output,
      with optional parameters. Default value: "text/html".

   Encoding considerations: Text.

   Security considerations:

      Markdown interpreted as plain text is relatively harmless. A text
      editor need only display the text. The editor SHOULD take care to
      handle control characters appropriately, and to limit the effect
      of the Markdown to the text editing area itself; malicious
      Unicode-based Markdown could, for example, surreptitiously change
      the directionality of the text. An editor for normal text would
      already take these control characters into consideration, however.

      Markdown interpreted as a precursor to other formats, such as
      HTML, carries all of the security considerations of the target
      formats. For example, HTML can contain instructions to execute
      scripts, redirect the user to other webpages, download remote
      content, and upload personally identifiable information. Markdown
      also can contain islands of formal markup, such as HTML. These
      islands of formal markup may be passed as-is, transformed, or
      ignored (perhaps because the islands are conditional or
      incompatible) when the Markdown is processed.
      Since Markdown may have different interpretations depending on the
      tool and the environment, a better approach is to analyze (and
      sanitize or block) the output markup, rather than attempting to
      analyze the Markdown.

      Security provides a significant motivator for the output-type
      parameter. Most Markdown processors emit byte (octet) streams.
      Without a well-defined means for a Markdown processor to pass
      metadata onwards, it is perilous for post-processing to assume
      that the content is always HTML or XHTML. A processor might emit
      PostScript (application/postscript) content, for example, in which
      case an HTML sanitizer would fail to excise dangerous
      instructions.

   Interoperability considerations:

      Markdown syntaxes are designed to be broadly compatible with
      humans ("humane"), but not necessarily with each other. Therefore,
      syntax in one Markdown derivative may be ignored or treated
      differently in another derivative. The overall effect is a general
      degradation of the output, proportional to the quantity of
      syntax-specific Markdown used in the text. When it is desirable to
      reflect the author's intent in the output, stick with the syntax
      identified in the syntax parameter.

   Published specification: This specification; [MDSYNTAX].

   Applications that use this media type:

      Markdown conversion tools, Markdown WYSIWYG editors, and plain
      text editors and viewers; markup processor targets indirectly use
      Markdown (e.g., web browsers for Markdown converted to HTML).

   Fragment identifier considerations:

      Markdown content acts as a "bridge" between plain text and formal
      markup, so this specification permits fragment identifiers [[NB:
      used to be #i]] #t for the [[NB: used to be input]] source text
      and #o for the output content. The #l and #ldef fragment
      identifiers identify link references. A detailed specification can
      be found in Section 4.

   Additional information:

      Magic number(s): None

      File extension(s): .md, .markdown

      Macintosh file type code(s): TEXT. A uniform type identifier (UTI)
      of "net.daringfireball.markdown", which conforms to
      "public.plain-text", is RECOMMENDED [MDUTI]. Additionally,
      implementations SHOULD record syntax and output-type parameters
      along with the Markdown, such as in extended attributes; however,
      the exact manner of storage is a local matter.

   Person & email address to contact for further information:

      Sean Leonard

   Restrictions on usage: None.

   Author/Change controller: Sean Leonard

   Intended usage: COMMON

   Provisional registration? No

3. Optional Parameters

   The optional parameters "syntax" and "output-type" can be used by an
   author to indicate the author's intent regarding how the Markdown
   ought to be processed.

   All identifiers are case-sensitive; receivers MUST compare for exact
   equality. At the same time, identifiers MUST NOT be registered in the
   IANA registry (see Section 6) if another registration differs only in
   the casing, as these registrations may cause confusion.

   The following ABNF definitions are used in this section:

      EXTCHAR  =
      REXTCHAR =

                 Figure X: ABNF Used in This Section

   The discussion in this section presumes that the parameter values are
   discrete strings. When encoded in protocols such as MIME [RFC2045],
   however, the value strings MUST be escaped properly. [MDMTUSES]
   provides some strategies to preserve this information when it leaves
   the domain of IETF protocols.

3.1. syntax

   The syntax parameter indicates the Markdown-derivative syntax in
   which the author composed the content, without regard to any
   particular implementation. With reference to the "paradigmatic use
   case" (i.e., collaborative Markdown editing) in Section 1.2, the
   syntax parameter primarily affects the "left-hand" side of a Markdown
   editor. The entire parameter is case-sensitive.
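
   The two comparison rules stated at the start of Section 3 pull in
   opposite directions: receivers match identifiers exactly, while the
   registry refuses identifiers that differ from an existing entry only
   in casing. The following is a minimal sketch of both checks, not a
   normative algorithm. The in-memory sets, the function names, and the
   candidate identifier "Example" are assumptions made for illustration;
   only "Original" and the reserved identifiers of Section 6.3 come from
   this document.

```python
# Hypothetical in-memory view of the "Markdown Syntaxes" registry.
# Only "Original" is taken from this document (Section 6.2).
REGISTERED = {"Original"}

# Reserved identifiers from Section 6.3; no case variation of these
# may be registered either.
RESERVED = {"Standard", "Common", "Markdown"}


def is_known_syntax(identifier: str) -> bool:
    """Receiver-side check: exact, case-sensitive equality."""
    return identifier in REGISTERED


def may_register(candidate: str) -> bool:
    """Registry-side check: reject case-only collisions and reserved names."""
    taken = {name.casefold() for name in REGISTERED | RESERVED}
    return candidate.casefold() not in taken


print(is_known_syntax("Original"))   # exact match
print(is_known_syntax("original"))   # differs in case, so not a match
print(may_register("ORIGINAL"))      # collides with "Original"
print(may_register("Example"))       # hypothetical new identifier
```

   Folding case only on the registry side keeps receivers simple (a
   plain string comparison) while still preventing confusable
   registrations.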

   Syntaxes other than [MDSYNTAX] extend the original rules in some way.
   These extensions fall into broad categories: clarifying ambiguities
   in [MDSYNTAX], adding brand new features, repurposing [MDSYNTAX] for
   completely new use cases, and adding metadata or other structured
   data blocks. Occasionally new syntaxes directly contradict [MDSYNTAX]
   based on seasoned experience.

   A syntax identifier is composed of two or more characters excluding
   (Unicode) separators, control characters, the hyphen-minus "-",
   quotation marks """, and angle brackets "<" and ">"; however, ASCII
   characters alone SHOULD be used. To promote interoperability, only
   registered syntaxes are permissible. An IANA registry of syntaxes
   will be created as discussed in Section 6.

   When omitted, the default value is unspecified, which means that the
   syntax interpretation is up to the receiver. However, the receiver
   SHOULD NOT "guess" based on content-sniffing, as this methodology is
   error-prone. Generators SHOULD always specify a syntax, whether
   explicitly or by context in embedding protocols or formats. All
   implementations MUST support the syntax value "Original", with the
   meaning covered in Section 6. Generators MUST omit the syntax
   parameter rather than transmitting an empty string (""); the empty
   string is a syntax error per the ABNF below.

   The full ABNF of the syntax parameter is:

      syntax-param  = syntax-id [ "-" version ]
                      *( 1*WSP extension ) *WSP

      syntax-id     = 2*sid-char
      version       = 1*sid-char

      sid-char      = %d33 / %d35-44 / %d46-59 / %d61 / %d63-126
                      / REXTCHAR

      extension     = ext-name [ ":" ( ext-string / ext-uri ) ]
      ext-name      = 1*( %d33 / %d35-57 / %d59 / %d61 / %d63-126
                      / REXTCHAR )

      ext-string    = ext-quoted [ ext-string ] /
                      ( ext-safe-char / ">" )
                      *( ext-safe-char / "<" / ">" / ext-quoted )
      ext-safe-char = %d33 / %d35-59 / %d61 / %d63-126 / REXTCHAR
                      ; [[NB: Could be EXTCHAR ? depends on how we
                      ; feel about Unicode high-order separators]]
      ext-quoted    = DQUOTE *eqcontent DQUOTE
      ext-uri       = "<" URI-reference ">" ; from [RFC3986]
      eqcontent     = %d0-33 / %d35-127 / EXTCHAR / DQUOTE DQUOTE

                Figure X: ABNF of the syntax parameter

3.1.1. syntax version

   For better precision, an author MAY include the syntax version. The
   version is delimited from the syntax identifier with a hyphen-minus
   "-" and has the same repertoire as the syntax identifier. The version
   string itself is an opaque string of at least one character. Version
   strings (e.g., "2.0", "3.0.5") are registered and updated along with
   the syntax registration. Updates to syntax registrations SHOULD only
   add new versions when those new versions have a material difference
   on the interpretation of the Markdown content. If a syntax has a
   version "2014.10" and a version "2014.11", for example, but "2014.11"
   only fixes typos in the specification, the registration SHOULD NOT
   separately register the "2014.11" version.

   As with the syntax identifier, ASCII characters alone SHOULD be used
   in the version string.

   A receiver that recognizes the syntax but not the version MAY use any
   version of the syntax, preferably the latest version.

3.1.2. syntax extensions

   Some Markdown syntaxes are self-contained, with no options. However,
   others have optional rules or features that may be applied with
   discretion. For those syntax systems where optional rules are an
   integral feature, the author MAY indicate that those named extensions
   be applied in a whitespace-separated list. The syntax for extensions
   derives in significant part from pandoc [PANDOC]. All extensions for
   a particular syntax are to be registered as part of the syntax
   registration in Section 6.

   An extension identifier is composed of any sequence of characters
   excluding (Unicode) separators, control characters, the colon ":",
   quotation marks """, and angle brackets "<" and ">"; however,
   lowercase ASCII letters and the underscore "_" alone SHOULD be used,
   where the underscore SHOULD NOT be at the beginning or end.

   When present, an extension is "enabled", "enabled, with string", or
   "enabled, with URI". When absent, an extension is "disabled". An
   extension can have different semantics depending on whether a string
   or URI is supplied. For example, an extension "bullet" could specify
   whether and how to render bulleted lists. "Disabled" could mean
   "bulleted" lists do not have bullets; "enabled" could mean that the
   bullet is some default character; "enabled, with string" could mean
   that the string is used as the bullet; finally, "enabled, with URI"
   could mean that the image identified by URI is used as the bullet.

3.1.2.1. Enabled, with String

   According to the ABNF above, extensions are delimited by whitespace.
   Quotation marks are used to support zero-length strings, whitespace
   or quotation marks in a single string, or strings where the first
   character is "<". If a quotation mark appears anywhere in the string,
   the following text is considered quoted; two successive quotation
   marks "" within quoted text mean one quotation mark in the string. A
   single quotation mark ends the quoting. Generators MUST NOT generate
   unterminated quoted strings; however, parsers SHOULD treat an
   unterminated quoted string as if it were terminated.

   Because of this rule, quotation marks do not have to appear at the
   termini of a string; embedded quotation marks start (and end) quoting
   within a single argument. For example:

      a""b

   means:

      ab

   for the actual argument.
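
   The splitting and quoting rules above can be sketched as follows.
   This is an illustrative, lenient reading of the ABNF rather than a
   validating parser: it does not check character repertoires, it
   accepts an empty argument after ":", and the function names and the
   extension names in the usage lines ("bullet", "meta", "smart") are
   hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def split_tokens(value: str) -> List[str]:
    """Split on WSP (space, tab) that lies outside quoted text."""
    tokens, current, in_quotes = [], [], False
    for ch in value:
        if ch == '"':
            in_quotes = not in_quotes
            current.append(ch)
        elif ch in " \t" and not in_quotes:
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:  # an unterminated quote is treated as terminated
        tokens.append("".join(current))
    return tokens


def unquote(raw: str) -> str:
    """Remove quoting; a doubled quote inside quoted text yields one quote."""
    out, i, in_quotes = [], 0, False
    while i < len(raw):
        ch = raw[i]
        if ch == '"':
            if in_quotes and raw[i + 1:i + 2] == '"':
                out.append('"')
                i += 2
                continue
            in_quotes = not in_quotes
        else:
            out.append(ch)
        i += 1
    return "".join(out)


def parse_syntax_param(
    value: str,
) -> Tuple[str, Optional[str], Dict[str, Tuple[str, Optional[str]]]]:
    """Return (syntax-id, version or None, {ext-name: (state, argument)})."""
    tokens = split_tokens(value)
    if not tokens:
        raise ValueError("the syntax parameter must not be empty")
    syntax_id, _, version = tokens[0].partition("-")
    if len(syntax_id) < 2:
        raise ValueError("a syntax identifier needs two or more characters")
    extensions: Dict[str, Tuple[str, Optional[str]]] = {}
    for token in tokens[1:]:
        name, colon, arg = token.partition(":")
        if not colon:
            extensions[name] = ("enabled", None)
        elif arg.startswith("<") and arg.endswith(">"):
            extensions[name] = ("enabled, with URI", arg[1:-1])
        else:
            extensions[name] = ("enabled, with string", unquote(arg))
    return syntax_id, version or None, extensions


print(unquote('a""b'))
print(parse_syntax_param(
    'Original-2004 bullet:"* " meta:<cid:part1@example.com> smart'))
```

   Tokenizing first and unquoting afterwards keeps the two rules
   separate: whitespace only delimits extensions outside quotes, and
   the doubled-quote escape only matters once a single argument has
   been isolated.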

   In spite of this relaxed positioning rule, for human readability
   generators SHOULD quote the entire string in lieu of embedding quoted
   sub-strings.

3.1.2.2. Enabled, with URI

   Certain syntaxes can take supplementary content, such as metadata,
   from other resources. To support these workflows, an extension can
   use the URI delimiters "<" and ">" to signal a URI, such as a cid: or
   mid: URL [RFC2392] in the context of MIME messages. The URI MUST
   comply with [RFC3986], and MAY be a relative reference if the subject
   Markdown content has a base URI. The charset parameter specifies the
   character encoding that is relevant to the URI's semantics (to the
   extent that the URI needs it).

3.2. output-type

   The output-type parameter indicates the Internet media type (and
   parameters) of the output from the processor. With reference to the
   "paradigmatic use case" (i.e., collaborative Markdown editing) in
   Section 1.2, the output-type parameter primarily affects the
   "right-hand" side of a Markdown editor.

   When omitted, the default value is "text/html". Implementations
   SHOULD anticipate and support HTML (text/html) and XHTML
   (application/xhtml+xml) output, to the extent that a syntax targets
   those markup languages. The default value of text/html ought to be
   suitable for the majority of current purposes. However, Markdown is
   increasingly becoming integral to workflows where HTML is not the
   target output; examples range from TeX, to PDF, to OPML, and even to
   entire e-books (e.g., [PANDOC]). Anticipated output types for a
   particular syntax are to be registered as part of the syntax
   registration in Section 6.

3.2.1. Value Format and Semantics

   The value of output-type is an Internet media type with optional
   parameters. The syntax (including case sensitivity considerations) is
   the same as specified in [RFC2045] for the Content-Type header (with
   updates over time), namely:

      type "/" subtype *(";" parameter)
               ; Matching of media type and subtype
               ; is ALWAYS case-insensitive.

             Figure X: Content-Type ABNF (from [RFC2045])

   The Internet media type in the output-type parameter MUST be
   observed. Although arbitrary parameters may be passed along with the
   Internet media type, receivers are under no obligation to honor or
   interpret them in any particular way. For example, the parameter
   value "text/plain; format=flowed; charset=ISO-2022-JP" obligates the
   receiver to output text/plain (and to treat the output as plain text:
   no sneaking in or labeling the output as HTML!). In contrast, such a
   parameter value neither obligates the receiver to follow [RFC3676]
   (for flowed output) nor to output ISO-2022-JP Japanese character
   encoding (see [RFC1468]).

   The output-type parameter does not distinguish between fragment
   content and whole-document content. A Markdown processor MAY (and
   typically will) output HTML or XHTML fragment content, without
   preambles or postambles such as <!DOCTYPE>, <html>, <head>, </head>,
   <body>, </body>, or </html> elements. Receivers MUST be aware of this
   behavior and take appropriate precautions. Fragment vs.
   whole-document output considerations are appropriate for addressing
   in syntax specifications, either as part of the syntax or by a syntax
   extension.

3.2.2. text/markdown Special Value

   The author may specify the output-type "text/markdown", which has a
   special meaning. "text/markdown" means that the author does not want
   to invoke Markdown processing at all: the receiver SHOULD view the
   Markdown source as-is. This output-type is not the default because
   one generally assumes that Markdown is meant for composing rather
   than reading: readers expect to see the output format (or
   dual-display of the output and the Markdown). However, if authors are
   collaboratively editing a document or are discussing Markdown,
   "text/markdown" may make sense.
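
   A receiver's handling of output-type, as described in Sections 3.2
   through 3.2.2, can be sketched as follows. This is a minimal
   illustration under stated assumptions: the function names are
   hypothetical, parameters after the first ";" are deliberately
   ignored (receivers are under no obligation to honor them), and an
   absent value falls back to the default of "text/html".

```python
from typing import Optional

DEFAULT_OUTPUT_TYPE = "text/html"


def output_media_type(output_type: Optional[str]) -> str:
    """Return the lowercased type/subtype; parameters are ignored."""
    if not output_type:
        return DEFAULT_OUTPUT_TYPE
    # Media type and subtype match case-insensitively, so normalize case.
    return output_type.split(";", 1)[0].strip().lower()


def should_invoke_processor(output_type: Optional[str]) -> bool:
    """The special value text/markdown means: show the source as-is."""
    return output_media_type(output_type) != "text/markdown"


print(output_media_type("Text/Plain; format=flowed; charset=ISO-2022-JP"))
print(should_invoke_processor("text/markdown; syntax=Original"))
print(should_invoke_processor(None))
```

   Because the media type itself MUST be observed, a receiver built
   this way would select its sanitizer or viewer from the returned
   media type rather than assuming HTML.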

   Furthermore, "text/markdown" differs from "text/plain" in that
   "text/plain" encompasses a wide range of characters and formatting
   techniques (in Unicode, examples include bullet points, roman
   numerals, unambiguous line and paragraph separators, and interlinear
   annotation).

   While the optional parameter output-type may be used recursively (as
   a sneaky way to stash the author's follow-on or secondary intent),
   receivers are not obligated to recognize it; optional parameters
   internal to output-type MAY be ignored.

4. Fragment Identifiers

4.1. #t

   [[NB: This section used to say: The fragment #i refers to the content
   input into a Markdown processor, which for purposes of this fragment
   identifier, MUST be treated as plain text (text/plain).]]

   The fragment #t refers to the Markdown content treated as plain text
   (text/plain). A specific area of the text can be identified with a
   text/plain sub-fragment identifier (e.g., [RFC5147] or its
   successors) delimited by a second "#" character. For example:
   #t#line=10 identifies the eleventh line of Markdown input.
   Implementers should take heed that the "char" scheme counts by
   characters rather than octets (or, for that matter, code points);
   thus proper interpretation of the charset parameter is REQUIRED for
   interoperability of the "char" scheme. For example, "character" and
   "code point" are NOT synonymous in the Unicode Standard.

4.2. #o

   The fragment #o refers to the content output from a Markdown
   processor, which is governed by the output-type parameter. A specific
   area of the output can be identified with a sub-fragment identifier
   delimited by a second "#" character. The encoding and semantics of
   sub-fragment identifiers are also governed by the output-type
   parameter.

   Examples: when the output-type is text/html [RFC2854], #o#section6
   identifies the named anchor "section6", specified by the input, that
   the Markdown processor converts to the corresponding HTML anchor
   element.
   When the output-type is application/pdf [RFC3778], #o#page=6 causes
   the sixth page to open.

   When the output-type is "text/markdown" (regardless of parameters),
   the #o fragment identifier has no semantics; generators MUST use #t
   in lieu of #o.

4.3. #l and #ldef

   The fragment prefix #l refers to links by their link identifiers. The
   sub-component of this identifier is delimited by a second "#"
   character, followed by the encoded link identifier, optionally
   followed by a 1-based index number. Without the index number, the
   fragment refers to all such identified links. Example: #l#eS matches
   links such as "The rain in [Spain][ES]" and "The word [es][] means
   'is' in Spanish." #l#es#2 only matches the second instance of the
   "es" link identifier.

   The fragment prefix #ldef refers to link reference definitions. The
   sub-component of this identifier is delimited by a second "#"
   character, followed by the encoded link identifier. There is no index
   number; in the case of multiple link reference definitions, the last
   definition wins.

   Both the #l and #ldef REQUIRE that "#" characters be percent-encoded
   if they are part of the link identifier. The percent-encoding of
   other characters follows the regular rules of [RFC3986]. [MDSYNTAX]
   states that identifiers (or names) "may consist of letters, numbers,
   spaces, and punctuation--but they are NOT case sensitive." Characters
   outside of the URI character set SHALL be percent-encoded with the
   same encoding as the Markdown content. For maximum compatibility and
   readability, authors who intend to reference links in fragment
   identifiers SHOULD limit themselves to URI characters that do not
   require percent-encoding.

4.4. Other Fragment Identifiers

   Specific syntaxes may define additional fragment identifiers specific
   to the syntax.
   For example, a syntax that incorporates "header" information might
   consider #h to refer to the "header" part, and #b to refer to the
   "body" part.

5. Example

   The following is an example of Markdown as an e-mail attachment:

   MIME-Version: 1.0
   Content-Type: text/markdown; charset=UTF-8; syntax=Original;
    output-type="application/xhtml+xml"
   Content-Disposition: attachment; filename=readme.md

   Sample HTML 4 Markdown
   =============

   This is some sample Markdown. [Hooray!][foo]
   (Remember that link identifiers are not case-sensitive.)

   Bulleted Lists
   -------

   Here are some bulleted lists...

   * One Potato
   * Two Potato
   * Three Potato

   - One Tomato
   - Two Tomato
   - Three Tomato

   More Information
   -----------

   [.markdown, .md](http://daringfireball.net/projects/markdown/)
   has more information.

   [fOo]: http://example.com/loc 'Will Not Work with Markdown.pl-1.0.1'

6. IANA Considerations

   IANA is asked to register the media type text/markdown in the
   Standards tree using the application provided in Section 2 of this
   document.

   IANA is also asked to establish a subtype registry called "Markdown
   Syntaxes". Each entry in this registry shall consist of a syntax
   identifier and information about the syntax, as follows:

6.1. Syntax Template

   {if provisional} PROVISIONAL REGISTRATION EXPIRES
   [YYYY-MM-DD date format]

   Identifier: [Identifier]

   Description: [Concise, prose description of the syntax, with emphasis
      on its purpose and notable variations from [MDSYNTAX] or another
      syntax. If the syntax permits structured data, this fact ought to
      be included. Other Markdown syntaxes may be referenced by quoting
      their registered identifiers.]

   Documentation: [References to documentation.]

   Community of Use: [Concise, prose description of the community of
      use, such as "scholarly publications" or "screenwriting".
      "General" may be entered if the community encompasses general
      users of the Internet.]
   [[TODO: Users (screenwriters) or use cases (screenwriting)?]]

   [[NB: Should Versions: and Extensions: be {optional} and therefore omittable, or should they have "None." to indicate that no versions or extensions apply?]]

   Versions:
     {for each version}
     Identifier: [Identifier]
     Description: [Optional, concise, prose description of the version. "N/A" SHALL be used to indicate no description.]

   Extensions:
     {for each extension}
     Identifier: [Identifier]
     Syntax:
       {if Enabled} Enabled
       {if Enabled, with String} Enabled, with String: [prose description of what the string is (not what it does)]
       {if Enabled, with URI} Enabled, with URI: [prose description of what the URI is (not what it does)]
     Description: [Concise, prose description of the extension, i.e., what it does.]
     Documentation: [References to documentation.]

   Anticipated Output Types:
     {for each output-type}
     [media type]
     {optional} [prose description of parameter considerations]

   {optional}
   Additional Fragment Identifiers: [Prose description of additional fragment identifiers, sufficient for interoperability.]

   Responsible Parties:
     {for each party}
     ([type: individual, corporate, representative]) [Name] ...

   Currently Maintained? [Yes/No]

   {optional}
   Implementations:
     {for each implementation}
     Name: [Name]
     Version(s): [Significant version or versions that implement the syntax]
     Type: ["Processor" or some other type]
     References: ...
     Purpose: [Concise, prose description of the implementation.]

A responsible party can be an individual author or maintainer, a corporate author or maintainer (plus an individual contact), or a representative of a community of interest dedicated to the Markdown syntax.

The Versions, Extensions, Additional Fragment Identifiers, and Implementations sections are optional.

6.2.
Initial Registration

The registry shall have the following initial registration; implementations conforming to this document MUST handle this syntax. [MDMTUSES] provides additional exemplary syntaxes.

   Identifier: Original

   Description: Gruber's original Markdown syntax.

   Documentation: [MDSYNTAX]. For the "2004" version, the documentation is provided in HTML and in Markdown, as follows:

     syntax:
       Content-Type: text/html; charset=UTF-8
       Accessed at October 12, 2014 8:27 PM (-0700)
       38570 bytes
       SHA-256 hash:
       B2EC2A62 3257F164 FBC88AE8 C7E76F3F
       80F16845 105D9F3E 3E8CE25B 6F0CB33B

     syntax.text:
       Content-Type: text/plain; charset=UTF-8
       (actually text/markdown; syntax=Original; output-type="text/markdown")
       Accessed at October 12, 2014 8:27 PM (-0700)
       27784 bytes
       SHA-256 hash:
       01A6A07A F51838E1 8749454B 06D716BC
       B1BC0EAA A21B67B7 D6FB5A6B 4FFB5D5B

   Community of Use: General.

   Versions:
     Identifier: 2004
     Description: [MDSYNTAX] as it (is rumored to have) existed since December 14, 2004, corresponding to Markdown.pl 1.0.1. The version "2004" SHOULD NOT be specified until further notice; it is only documented for completeness (in case Gruber revises the syntax with material contradictions).

   Anticipated Output Types:
     text/html
     application/xhtml+xml

   Responsible Parties:
     (individual) John Gruber

   Currently Maintained? No

   Implementations:
     Name: Markdown.pl
     Version(s): 1.0.1, 1.0.2b8
     Type: Processor
     References: [MARKDOWN]
     Purpose: Converts Markdown to HTML or XHTML circa 2004. The argument "--html4tags" causes HTML output.

6.3. Reserved Identifiers

The registry SHALL have the following identifiers RESERVED. No one is allowed to register them (or any case variations of them).

   Standard
   Common
   Markdown

6.4. Standard of Review

Registrations are made on a First-Come, First-Served [RFC5226] basis by anyone with a need to interoperate.
While documentation is required, any level of documentation is sufficient; thus, neither Specification Required nor Expert Review is warranted. The checks prescribed by this section can be performed automatically.

Syntax, version, and extension identifiers MUST comply with the syntaxes specified in this document. Additionally, an identifier MUST NOT differ from other registered identifiers merely by case. Identifiers MUST conform to [[TODO: PRECIS? STRINGPREP?]]. The purpose of this requirement is to eliminate confusingly similar identifiers, placing the burden on the registration process rather than on syntax parameter parsers.

All references (including contact information) MUST be verified as functional at the time of the registration.

If a registration is being updated, either the contact information MUST match the prior registration and be verified, or the prior registrant MUST confirm that the updating registrant has authority to update the registration. As a special "escape valve", registrations can be updated with IETF Review [RFC5226]. [[NB: Two purposes: 1) to deal with "harmful" registrations (stale references are not a sufficient justification); 2) to deal with registrations that are IETF registrations, like RFC-related Markdown (but this could be handled by listing the IETF as the contact organization, right?).]] All fields may be updated except the syntax identifier, which is permanent: not even case may be changed.

6.5. Provisional Registration

Any registrant may make a provisional registration to reserve a syntax identifier. Provisional registrations include the ALL-CAPS legend as shown in Section 6.1. All fields are optional except for the syntax identifier and contact information. Provisional registrations expire after three months, after which time the syntax identifier may be reused.

7.
Security Considerations

See the Security considerations entry in Section 2.

8. References

8.1. Normative References

[MARKDOWN] Gruber, J., "Daring Fireball: Markdown", December 2004, .

[MDSYNTAX] Gruber, J., "Daring Fireball: Markdown Syntax Documentation", December 2004, .

[MDUTI] Gruber, J., "Daring Fireball: Uniform Type Identifier for Markdown", August 2011, .

[RFC2045] Freed, N. and N. Borenstein, "Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies", RFC 2045, November 1996.

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC2854] Connolly, D. and L. Masinter, "The 'text/html' Media Type", RFC 2854, June 2000.

[RFC3778] Taft, E., Pravetz, J., Zilles, S., and L. Masinter, "The application/pdf Media Type", RFC 3778, May 2004.

[RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform Resource Identifier (URI): Generic Syntax", STD 66, RFC 3986, January 2005.

[RFC5147] Wilde, E. and M. Duerst, "URI Fragment Identifiers for the text/plain Media Type", RFC 5147, April 2008.

[RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA Considerations Section in RFCs", RFC 5226, May 2008.

[RFC5322] Resnick, P., Ed., "Internet Message Format", RFC 5322, October 2008.

[RFC6838] Freed, N., Klensin, J., and T. Hansen, "Media Type Specifications and Registration Procedures", BCP 13, RFC 6838, January 2013.

8.2. Informative References

[HUMANE] Atwood, J., "Is HTML a Humane Markup Language?", May 2008, .

[INETMEME] Solon, O., "Richard Dawkins on the internet's hijacking of the word 'meme'", June 2013, , .

[MDMTUSES] Leonard, S., "text/markdown Use Cases", draft-seantek-text-markdown-use-cases-00 (work in progress), October 2014.

[PANDOC] MacFarlane, J., "Pandoc", 2014, .

[RAILFROG] Railfrog Team, "Railfrog", April 2009, .

[RFC1468] Murai, J., Crispin, M., and E. van der Poel, "Japanese Character Encoding for Internet Messages", RFC 1468, June 1993.

[RFC2392] Levinson, E., "Content-ID and Message-ID Uniform Resource Locators", RFC 2392, August 1998.

[RFC3676] Gellens, R., "The Text/Plain Format and DelSp Parameters", RFC 3676, February 2004.

[RFC4263] Lilly, B., "Media Subtype Registration for Media Type text/troff", RFC 4263, January 2006.

[FOUNTAIN] Maschwitz, S. and J. August, "Fountain | A markup language for screenwriting.", 2014, .

[FTSYNTAX] Maschwitz, S. and J. August, "Syntax - Fountain | A markup language for screenwriting.", 1.1, March 2014, .

Appendix A. Change Log

This draft is a continuation of draft-ietf-appsawg-text-markdown-02.txt. These technical changes were made:

1. Proposed that the document be split into two documents: the main document (which is normative), and a second document. The second document (draft-seantek-text-markdown-use-cases-00) [MDMTUSES] provides additional background information, suggestions for preserving metadata, registration templates for common Markdown syntaxes, and examples for common Markdown syntaxes. RFC 2119 key words are not included in draft-seantek-text-markdown-use-cases because its content is not normative (or at least less normative than the main document).

2. De-emphasized Unicode (and UTF-8 encoding) after close consideration of the original [MDSYNTAX] and the various extensions to Markdown proposed in the intervening time. "CommonMark", for example, places stronger emphasis on Unicode (and UTF-8).

3. Deleted processor parameter.

4. Renamed flavor parameter to syntax parameter.

5. Renamed "rules" to "extensions" in the syntax parameter.

6. Parameterized "extensions" so that it can have a string or a URI.

7. Simplified the syntax parameter (compared to draft-02, in any event) with fewer exceptional cases in the ABNF.

8. Rewrote significant parts of the output-type parameter, and gave text/markdown additional explanation.

9. Rewrote the introduction so that it is much shorter.

10. Moved the example towards the end.

11. Added Fragment Identifier Considerations.

12. Consolidated the Security Considerations into the registration template.

13. Rewrote the IANA Considerations section so that it only creates one new registry.

14. Redefined the flavors registry (now called the Markdown Syntaxes registry).

15. Rewrote the "Original" syntax registration to conform to the new registration template.

16. Added a discussion and example of the Paradigmatic Use Case (Markdown Editors).

Author's Address

   Sean Leonard
   Penango, Inc.
   5900 Wilshire Boulevard
   21st Floor
   Los Angeles, CA 90036
   USA

   EMail: dev+ietf@seantek.com
   URI: http://www.penango.com/