Applications Area Working Group S. Leonard
Internet-Draft Penango, Inc.
Intended Status: Informational October 17, 2014
Expires: April 20, 2015
The text/markdown Media Type
draft-ietf-appsawg-text-markdown-03
Abstract
This document registers the text/markdown media type for use with
Markdown, a family of plain text formatting syntaxes that optionally
can be converted to formal markup languages such as HTML.
Status of this Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
Copyright Notice
Copyright (c) 2014 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Leonard Exp. April 20, 2015 [Page 1]
Internet-Draft The text/markdown Media Type October 2014
Table of Contents
1. Introduction
1.1. This Is Markdown! Or: Markup and Its Discontents
1.2. Markdown Is About Writing and Editing
1.3. RFC 2119
2. Markdown Media Type Registration Application
3. Optional Parameters
3.1. syntax
3.2. output-type
4. Fragment Identifiers
4.1. #t
4.2. #o
4.3. #l and #ldef
4.4. Other Fragment Identifiers
5. Example
6. IANA Considerations
6.1. Syntax Template
6.2. Initial Registration
6.3. Reserved Identifiers
6.4. Standard of Review
6.5. Provisional Registration
7. Security Considerations
8. References
8.1. Normative References
8.2. Informative References
Appendix A. Change Log
Author's Address
1. Introduction
1.1. This Is Markdown! Or: Markup and Its Discontents
In computer systems, textual data is stored and processed using a
continuum of techniques. On the one end is plain text: a linear
sequence of characters in some character set (code), possibly
interrupted by line breaks, page breaks, or other control characters.
The repertoire of these control characters (a form of in-band
signaling) is necessarily limited, and not particularly extensible.
Because they are non-printing, these characters are also hard to
enter with standard keyboards.
Markup offers an alternative means to encode this signaling
information by overloading certain characters with additional
meanings. Therefore, markup languages allow for annotating a document
in such a way that annotations are syntactically distinguishable from
the printing information. Markup languages are (reasonably) well-
specified and tend to follow (mostly) standardized syntax rules.
Examples of formal markup languages include SGML, HTML, XML, and
LaTeX. Standardized rules lead to interoperability between markup
processors, but impose skill requirements on new users that lead to
markup languages becoming less accessible to beginners. These rules
also reify "validity": content that does not conform to the rules is
treated differently (i.e., is rejected) than content that conforms.
In contrast to formal markup languages, lightweight markup languages
use simple syntaxes; they are designed to be easy for humans to enter
and understand with basic text editors. Markdown, the subject of this
document, began as an /informal/ plain text formatting syntax
[MDSYNTAX] and Perl script HTML/XHTML processor [MARKDOWN] targeted
at non-technical users using unspecialized tools, such as plain text
e-mail clients. [MDSYNTAX] explicitly rejects the notion of validity:
there is no such thing as "invalid" Markdown. If the Markdown content
does not result in the "right" output (defined as output that the
author wants, not output that adheres to some dictated system of
rules), the expectation is that the author should continue
experimenting by changing the content or the processor to achieve the
desired output.
Since its development in 2004 [MARKDOWN], a number of web- and
Internet-facing applications have incorporated Markdown into their
text entry systems, frequently with custom extensions. Markdown has
thus evolved into a kind of Internet meme [INETMEME] as different
communities encounter it and adapt the syntax for their specific use
cases. Markdown now represents a family of related plain text
formatting syntaxes that, while broadly compatible with humans
[HUMANE], are intended to produce different kinds of outputs that
push the boundaries of mutual intelligibility between software
systems.
To support identifying and conveying Markdown, this document defines
a media type and parameters that indicate the author's intent on how
to interpret the Markdown. This registration draws particular
inspiration from text/troff [RFC4263], which is a plain text
formatting syntax for typesetting based on tools from the 1960s
("RUNOFF") and 1970s ("nroff", et al.). In that sense, Markdown is a
kind of troff for modern computing. A companion document [MDMTUSES]
provides additional Markdown background and philosophy.
1.2. Markdown Is About Writing and Editing
"HTML is a *publishing* format; Markdown is a *writing* format.
Thus, Markdown's formatting syntax only addresses issues
that can be conveyed in plain text." [MDSYNTAX]
The paradigmatic use case for text/markdown is the Markdown editor:
an application that presents Markdown content (which looks like an e-
mail or other piece of plain text writing) alongside a published
format, so that an author can see results instantaneously and can
tweak his or her input in real time. A significant number of Markdown
editors have adopted "split-screen view" (or "live preview")
technology that looks like Figure 1:
+----------------------------------------------------------------------+
| File  Edit  (Cloud Stuff)  (Fork Me on GitHub)  Help                 |
+----------------------------------------------------------------------+
| [ such-and-such identifier ]         [ useful statistics]            |
+----------------------------------++----------------------------------+
| (plain text, with                || (text/html, likely               |
|  syntax highlighting)            ||  rendered to screen)             |
|                                  ||                                  |
|# Introduction                    ||Introduction                      |
|                                  ||                                  |
|## Markdown Is About Writing and  /|Markdown Is About Writing and     |
/ Editing                          ||Editing                           |
|                                  ||                                  |
|> HTML is a *publishing* format;  ||HTML is a                         |
|> Markdown is a *writing* format. || publishing format;               |
|> Thus, Markdown's formatting     || Markdown is a writing            |
|> syntax only addresses issues    || format. Thus, Markdown's         |
|> that can be conveyed in plain   <> formatting syntax only addresses |
|> text. [MDSYNTAX][]              || issues that can be conveyed in   |
|                                  || plain text. MDSYNTAX             |
|presents Markdown content         ||                                  |
|...                               ||                                  |
|                                  ||The paradigmatic use case for     |
|[MDSYNTAX]: http://daringfireball./| text/markdown is the             |
/net/projects/markdown/syntax#html || Markdown editor: an application  |
|"Markdown: Syntax: HTML"          || that presents Markdown content   |
|                                  || ...                              |
+----------------------------------++----------------------------------+
LEGEND: "/" embedded in a vertical line represents a line-continuation
marker, since a line break is not supposed to occur in that content.
Figure 1: Markdown Split-Screen/Live Preview Editor
Users on diverse platforms SHOULD be able to collaborate with their
tools of choice, whether those tools are desktop-based (MarkdownPad,
MultiMarkdown Composer), browser-based (Dillinger, Markable), integrated
widgets (Discourse, GitHub), general-purpose editors (emacs, vi), or
plain old "Notepad". Additionally, users SHOULD be able to identify
particular areas of Markdown content when the Markdown becomes
appreciably large (e.g., book chapters and Internet-Drafts--not just
blog posts). Users SHOULD be able to use text/markdown to convey their
works in progress, not just their finished products (for which full-
blown markups ranging from text/html to application/pdf are
appropriate). This registration facilitates interoperability between
these Markdown editors by conveying the syntax of the particular
Markdown variant and the desired output format.
1.3. RFC 2119
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
2. Markdown Media Type Registration Application
This section provides the media type registration application for the
text/markdown media type (see [RFC6838], Section 5.6).
Type name: text
Subtype name: markdown
Required parameters:
charset: Per Section 4.2.1 of [RFC6838], charset is REQUIRED. There
is no default value. [MDSYNTAX] clearly describes Markdown as a
writing format; its syntax rules operate on characters
(specifically, on punctuation) rather than code points. Neither
[MDSYNTAX] nor many popular implementations at the time of this
registration actually require or assume any particular encoding.
Many Markdown processors will get along just fine by operating on
character codes that lie in printable US-ASCII, blissfully
oblivious to coded values outside of that range.
Optional parameters:
The following parameters reflect the author's intent regarding the
content. A detailed specification can be found in Section 3.
syntax: The Markdown-derivative syntax of the content, with
optional version and named extensions. Default value: none
(receiver's choice).
output-type: The Content-Type (Internet media type) of the output,
with optional parameters. Default value: "text/html".
Encoding considerations: Text.
Security considerations:
Markdown interpreted as plain text is relatively harmless. A text
editor need only display the text. The editor SHOULD take care to
handle control characters appropriately, and to limit the effect of
the Markdown to the text editing area itself; malicious Unicode-
based Markdown could, for example, surreptitiously change the
directionality of the text. An editor for normal text would already
take these control characters into consideration, however.
Markdown interpreted as a precursor to other formats, such as HTML,
carries all of the security considerations of the target formats.
For example, HTML can contain instructions to execute scripts,
redirect the user to other webpages, download remote content, and
upload personally identifiable information. Markdown also can
contain islands of formal markup, such as HTML. These islands of
formal markup may be passed as-is, transformed, or ignored (perhaps
because the islands are conditional or incompatible) when the
Markdown is processed. Since Markdown may have different
interpretations depending on the tool and the environment, a better
approach is to analyze (and sanitize or block) the output markup,
rather than attempting to analyze the Markdown.
Security provides a significant motivator for the output-type
parameter. Most Markdown processors emit byte (octet) streams.
Without a well-defined means for a Markdown processor to pass
metadata onwards, it is perilous for post-processing to assume that
the content is always HTML or XHTML. A processor might emit
PostScript (application/postscript) content, for example, in which
case an HTML sanitizer would fail to excise dangerous instructions.
Interoperability considerations:
Markdown syntaxes are designed to be broadly compatible with humans
("humane"), but not necessarily with each other. Therefore, syntax
in one Markdown derivative may be ignored or treated differently in
another derivative. The overall effect is a general degradation of
the output, proportional to the quantity of syntax-specific
Markdown used in the text. When it is desirable to reflect the
author's intent in the output, stick with the syntax identified in
the syntax parameter.
Published specification: This specification; [MDSYNTAX].
Applications that use this media type:
Markdown conversion tools, Markdown WYSIWYG editors, and plain text
editors and viewers; markup processor targets indirectly use
Markdown (e.g., web browsers for Markdown converted to HTML).
Fragment identifier considerations:
Markdown content acts as a "bridge" between plain text and formal
markup, so this specification permits fragment identifiers [[NB:
used to be #i]] #t for the [[NB: used to be input]] source text and
#o for the output content. The #l and #ldef fragment identifiers
identify link references. A detailed specification can be found in
Section 4.
Additional information:
Magic number(s): None
File extension(s): .md, .markdown
Macintosh file type code(s):
TEXT. A uniform type identifier (UTI) of
"net.daringfireball.markdown", which conforms to "public.plain-
text", is RECOMMENDED [MDUTI]. Additionally, implementations
SHOULD record syntax and output-type parameters along with the
Markdown, such as in extended attributes; however, the exact
manner of storage is a local matter.
Person & email address to contact for further information:
Sean Leonard
Restrictions on usage: None.
Author/Change controller: Sean Leonard
Intended usage: COMMON
Provisional registration? No
3. Optional Parameters
The optional parameters "syntax" and "output-type" can be used by an
author to indicate the author's intent regarding how the Markdown
ought to be processed.
All identifiers are case-sensitive; receivers MUST compare for exact
equality. At the same time, identifiers MUST NOT be registered in the
IANA registry (see Section 6) if another registration differs only in
the casing, as these registrations may cause confusion.
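
As an illustration of the two rules above, the following Python sketch
compares identifiers for exact equality and flags a candidate
registration that differs from an existing one only in casing. The
function names and the sample registry contents are hypothetical, and
the use of str.casefold() to decide what "differs only in the casing"
means is an assumption; this document does not define a case-folding
algorithm.

```python
from typing import List


def identifiers_match(received: str, registered: str) -> bool:
    """Receivers compare identifiers for exact (case-sensitive) equality."""
    return received == registered


def casing_conflicts(candidate: str, registry: List[str]) -> List[str]:
    """Return registered identifiers that differ from candidate only in casing."""
    folded = candidate.casefold()
    return [r for r in registry if r != candidate and r.casefold() == folded]


# Hypothetical registry contents, for illustration only.
registry = ["Original", "ExampleSyntax"]
print(identifiers_match("original", "Original"))  # False
print(casing_conflicts("original", registry))     # ['Original']
print(casing_conflicts("Original", registry))     # []
```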
The following ABNF definitions are used in this section:
EXTCHAR =
REXTCHAR =
Figure X: ABNF Used in This Section
The discussion in this section presumes that the parameter values are
discrete strings. When encoded in protocols such as MIME [RFC2045],
however, the value strings MUST be escaped properly. [MDMTUSES]
provides some strategies to preserve this information when it leaves
the domain of IETF protocols.
3.1. syntax
The syntax parameter indicates the Markdown-derivative syntax in
which the author composed the content, without regard to any
particular implementation. With reference to the "paradigmatic use
case" (i.e., collaborative Markdown editing) in Section 1.2, the
syntax parameter primarily affects the "left-hand" side of a Markdown
editor. The entire parameter is case-sensitive.
Syntaxes other than [MDSYNTAX] extend the original rules in some way.
These extensions fall into broad categories: clarifying ambiguities
in [MDSYNTAX], adding brand new features, repurposing [MDSYNTAX] for
completely new use cases, and adding metadata or other structured
data blocks. Occasionally new syntaxes directly contradict [MDSYNTAX]
based on seasoned experience.
A syntax identifier is composed of two or more characters excluding
(Unicode) separators, control characters, the hyphen-minus "-",
quotation marks """, and angle brackets "<" and ">"; however, ASCII
characters alone SHOULD be used. To promote interoperability, only
registered syntaxes are permissible. An IANA registry of syntaxes
will be created as discussed in Section 6.
When omitted, the default value is unspecified, which means that the
syntax interpretation is up to the receiver. However, the receiver
SHOULD NOT "guess" based on content-sniffing, as this methodology is
error-prone. Generators SHOULD always specify a syntax, whether
explicitly or by context in embedding protocols or formats. All
implementations MUST support the syntax value "Original", with the
meaning covered in Section 6. Generators MUST omit the syntax
parameter rather than transmitting an empty string (""); the empty
string is a syntax error per the ABNF below. The full ABNF of the
syntax parameter is:
syntax-param = syntax-id [ "-" version ]
*( 1*WSP extension ) *WSP
syntax-id = 2*sid-char
version = 1*sid-char
sid-char = %d33 / %d35-44 / %d46-59 / %d61 /
%d63-126 / REXTCHAR
extension = ext-name [ ":" ( ext-string / ext-uri ) ]
ext-name = 1*( %d33 / %d35-57 / %d59 / %d61 /
%d63-126 / REXTCHAR )
ext-string = ext-quoted [ ext-string ] /
( ext-safe-char / ">" )
*( ext-safe-char / "<" / ">" / ext-quoted )
ext-safe-char = %d33 / %d35-59 / %d61 / %d63-126 / REXTCHAR
; [[NB: Could be EXTCHAR ? depends on how we feel about Unicode
; high-order separators]]
ext-quoted = DQUOTE *eqcontent DQUOTE
ext-uri = "<" URI-reference ">" ; from [RFC3986]
eqcontent = %d0-33 / %d35-127 / EXTCHAR / DQUOTE DQUOTE
Figure X: ABNF of the syntax parameter
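
The leading part of the ABNF (syntax-id with an optional version) can
be checked with a short regular expression. The Python sketch below is
illustrative only: the function name is hypothetical, it covers just
the ASCII portion of sid-char (REXTCHAR is left out, on the assumption
that identifiers follow the ASCII-only recommendation), and it ignores
the extensions that may follow the first token.

```python
import re
from typing import Optional, Tuple

# ASCII portion of sid-char: %d33 / %d35-44 / %d46-59 / %d61 / %d63-126.
# REXTCHAR is deliberately omitted (assumption: ASCII-only identifiers).
_SID = r"[!#-,.-;=?-~]"
_HEAD = re.compile(rf"({_SID}{{2,}})(?:-({_SID}+))?")


def parse_syntax_head(param: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (syntax-id, version or None), or None if the head is invalid."""
    # Splitting on any whitespace run is slightly more lenient than WSP.
    tokens = param.split(None, 1)
    if not tokens:
        return None  # an empty string is a syntax error per the ABNF
    match = _HEAD.fullmatch(tokens[0])
    if match is None:
        return None
    return match.group(1), match.group(2)


print(parse_syntax_head("Original"))        # ('Original', None)
print(parse_syntax_head("Original-1.0.1"))  # ('Original', '1.0.1')
```

Because sid-char excludes the hyphen-minus, the first "-" in the head
token unambiguously separates the identifier from the version.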
3.1.1. syntax version
For better precision, an author MAY include the syntax version. The
version is delimited from the syntax identifier with a hyphen-minus
"-" and has the same repertoire as the syntax identifier. The version
string itself is an opaque string of at least one character. Version
strings (e.g., "2.0", "3.0.5") are registered and updated along with
the syntax registration. Updates to syntax registrations SHOULD only
add new versions when those new versions have a material difference
on the interpretation of the Markdown content. If a syntax has a
version "2014.10" and a version "2014.11", for example, but "2014.11"
only fixes typos in the specification, the registration SHOULD NOT
separately register the "2014.11" version. As with the syntax
identifier, ASCII characters alone SHOULD be used in version strings.
A receiver that recognizes the syntax but not the version MAY use any
version of the syntax, preferably the latest version.
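
One way a receiver might apply this fallback is sketched below in
Python. The table of known versions and the function name are
hypothetical. Because version strings are opaque, the sketch assumes
that "latest" means the most recently registered entry and does not
attempt to order the strings themselves.

```python
from typing import Dict, List, Optional

# Hypothetical local view of the registry: syntax identifier mapped to its
# versions in registration order (oldest first). Not real registry data.
KNOWN_VERSIONS: Dict[str, List[str]] = {
    "ExampleSyntax": ["2014.10", "2015.01"],
}


def resolve_version(syntax_id: str, version: Optional[str]) -> Optional[str]:
    """Pick the version to process with, or None if nothing applies."""
    versions = KNOWN_VERSIONS.get(syntax_id)
    if not versions:
        return None      # unrecognized syntax, or syntax without versions
    if version in versions:
        return version   # exact, case-sensitive match
    return versions[-1]  # unrecognized or omitted version: use the latest


print(resolve_version("ExampleSyntax", "2014.10"))  # 2014.10
print(resolve_version("ExampleSyntax", "2014.11"))  # 2015.01
```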
3.1.2. syntax extensions
Some Markdown syntaxes are self-contained, with no options. However,
others have optional rules or features that may be applied with
discretion. For those syntax systems where optional rules are an
integral feature, the author MAY indicate that those named extensions
be applied in a whitespace-separated list. The syntax for extensions
derives in significant part from pandoc [PANDOC].
All extensions for a particular syntax are to be registered as part
of the syntax registration in Section 6.
An extension identifier is composed of any sequence of characters
excluding (Unicode) separators, control characters, the colon ":",
quotation marks """, and angle brackets "<" and ">"; however,
lowercase ASCII letters and the underscore "_" alone SHOULD be used,
where the underscore SHOULD NOT be at the beginning or end.
When present, an extension is "enabled", "enabled, with string", or
"enabled, with URI". When absent, an extension is "disabled". An
extension can have different semantics depending on whether a string
or URI is supplied. For example, an extension "bullet" could specify
whether and how to render bulleted lists. "Disabled" could mean
"bulleted" lists do not have bullets; "enabled" could mean that the
bullet is some default character; "enabled, with string" could mean
that the string is used as the bullet; finally, "enabled, with URI"
could mean that the image identified by URI is used as the bullet.
3.1.2.1. Enabled, with String
According to the ABNF above, extensions are delimited by whitespace.
Quotation marks are used to support zero-length strings, whitespace
or quotation marks in a single string, or strings where the first
character is "<". If a quotation mark appears anywhere in the string,
the following text is considered quoted; two successive quotation
marks "" within quoted text mean one quotation mark in the string. A
single quotation mark ends the quoting. Generators MUST NOT generate
unterminated quoted strings; however, parsers SHOULD treat an
unterminated quoted string as if it were terminated. Because of this
rule, quotation marks do not have to appear at the termini of a
string; embedded quotation marks start (and end) quoting within a
single argument. For example:
a""b
means:
ab
for the actual argument. In spite of this relaxed positioning rule,
for human readability generators SHOULD quote the entire string in
lieu of embedding quoted sub-strings.
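
The quoting rules above can be sketched as a small state machine. The
following Python function is an illustration, not a normative parser;
its name is hypothetical, and it assumes the caller has already
isolated the string argument of a single extension.

```python
def unquote_ext_string(raw: str) -> str:
    """Resolve quotation marks in one extension string argument."""
    out = []
    in_quotes = False
    i = 0
    while i < len(raw):
        ch = raw[i]
        if ch == '"':
            # Two successive quotation marks inside quoted text mean one
            # literal quotation mark.
            if in_quotes and i + 1 < len(raw) and raw[i + 1] == '"':
                out.append('"')
                i += 2
                continue
            # Otherwise a quotation mark starts or ends quoting.
            in_quotes = not in_quotes
        else:
            out.append(ch)
        i += 1
    # An unterminated quoted string is treated as if it were terminated.
    return "".join(out)


print(unquote_ext_string('a""b'))    # ab
print(unquote_ext_string('"a""b"'))  # a"b
```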
3.1.2.2. Enabled, with URI
Certain syntaxes can take supplementary content, such as metadata,
from other resources. To support these workflows, an extension can
use the URI delimiters "<" and ">" to signal a URI, such as a cid: or
mid: URL [RFC2392] in the context of MIME messages. The URI MUST
comply with [RFC3986], and MAY be a relative reference if the subject
Markdown content has a base URI. The charset parameter specifies the
character encoding that is relevant to the URI's semantics (to the
extent that the URI needs it).
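
The three "enabled" states can be told apart from the shape of a single
extension token. The Python sketch below is illustrative; the function
name is hypothetical, "bullet" is the hypothetical extension used as an
example in Section 3.1.2, and the token is assumed to have already been
separated from its neighbors. A string argument is returned as written,
with any quotation marks still in place.

```python
from typing import Optional, Tuple


def classify_extension(token: str) -> Tuple[str, str, Optional[str]]:
    """Return (ext-name, state, argument) for one extension token."""
    # ext-name cannot contain ":", so the first colon ends the name.
    name, colon, arg = token.partition(":")
    if not colon:
        return name, "enabled", None
    # An unquoted string argument cannot start with "<", so a leading "<"
    # paired with a trailing ">" signals a URI.
    if arg.startswith("<") and arg.endswith(">"):
        return name, "enabled, with URI", arg[1:-1]
    return name, "enabled, with string", arg


print(classify_extension("bullet"))
# ('bullet', 'enabled', None)
print(classify_extension("bullet:<cid:img1@example.com>"))
# ('bullet', 'enabled, with URI', 'cid:img1@example.com')
```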
3.2. output-type
The output-type parameter indicates the Internet media type (and
parameters) of the output from the processor. With reference to the
"paradigmatic use case" (i.e., collaborative Markdown editing) in
Section 1.2, the output-type parameter primarily affects the "right-
hand" side of a Markdown editor.
When omitted, the default value is "text/html". Implementations
SHOULD anticipate and support HTML (text/html) and XHTML
(application/xhtml+xml) output, to the extent that a syntax targets
those markup languages.
The default value of text/html ought to be suitable for the majority
of current purposes. However, Markdown is increasingly becoming
integral to workflows where HTML is not the target output; examples
range from TeX, to PDF, to OPML, and even to entire e-books (e.g.,
[PANDOC]). Anticipated output types for a particular syntax are to be
registered as part of the syntax registration in Section 6.
3.2.1. Value Format and Semantics
The value of output-type is an Internet media type with optional
parameters. The syntax (including case sensitivity considerations) is
the same as specified in [RFC2045] for the Content-Type header (with
updates over time), namely:
type "/" subtype *(";" parameter)
; Matching of media type and subtype
; is ALWAYS case-insensitive.
Figure X: Content-Type ABNF (from [RFC2045])
The Internet media type in the output-type parameter MUST be
observed.
Although arbitrary parameters may be passed along with the Internet
media type, receivers are under no obligation to honor or interpret
them in any particular way. For example, the parameter value
"text/plain; format=flowed; charset=ISO-2022-JP" obligates the
receiver to output text/plain (and to treat the output as plain text:
no sneaking in or labeling the output as HTML!). In contrast, such a
parameter value neither obligates the receiver to follow [RFC3676]
(for flowed output) nor to output ISO-2022-JP Japanese character
encoding (see [RFC1468]).
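
A receiver can separate the binding media type from the advisory
parameters with an ordinary Content-Type parser. The Python sketch
below uses the standard library's email.message.Message for this; the
function name is hypothetical, and falling back to "text/html" for an
omitted value reflects the default stated in Section 3.2. Note that
this library reports "text/plain" for a malformed value, which a real
receiver would want to detect separately.

```python
from email.message import Message
from typing import Dict, Optional, Tuple


def parse_output_type(value: Optional[str]) -> Tuple[str, Dict[str, str]]:
    """Return (media type in lowercase, parameters) for an output-type value."""
    msg = Message()
    msg["Content-Type"] = value if value else "text/html"
    # Type and subtype match case-insensitively; the library lowercases them.
    media_type = msg.get_content_type()
    # The first entry of get_params() is the media type itself; skip it.
    params = {name: val for name, val in msg.get_params()[1:]}
    return media_type, params


media_type, params = parse_output_type(
    "text/plain; format=flowed; charset=ISO-2022-JP"
)
print(media_type)  # text/plain (MUST be observed)
print(params)      # advisory only: receivers need not honor these
```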
The output-type parameter does not distinguish between fragment
content and whole-document content. A Markdown processor MAY (and
typically will) output HTML or XHTML fragment content, without
preambles or postambles such as <!DOCTYPE>, <html>, <head>, </head>,
<body>, </body>, or </html> elements. Receivers MUST be aware of this
behavior and take appropriate precautions. Fragment vs. whole-
document output considerations are appropriate for addressing in
syntax specifications, either as part of the syntax or by a syntax
extension.
3.2.2. text/markdown Special Value
The author may specify the output-type "text/markdown", which has a
special meaning. "text/markdown" means that the author does not want
to invoke Markdown processing at all: the receiver SHOULD view the
Markdown source as-is.
This output-type is not the default because one generally assumes
that Markdown is meant for composing rather than reading: readers
expect to see the output format (or dual-display of the output and
the Markdown). However, if authors are collaboratively editing a
document or are discussing Markdown, "text/markdown" may make sense.
Furthermore, "text/markdown" differs from "text/plain" in that
"text/plain" encompasses a wide range of characters and formatting
techniques (in Unicode, examples include bullet points, roman
numerals, unambiguous line and paragraph separators, and interlinear
annotation). While the optional parameter output-type may be used
recursively (as a sneaky way to stash the author's follow-on or
secondary intent), receivers are not obligated to recognize it;
optional parameters internal to output-type MAY be ignored.
4. Fragment Identifiers
4.1. #t
[[NB: This section used to say: The fragment #i refers to the content
input into a Markdown processor, which for purposes of this fragment
identifier, MUST be treated as plain text (text/plain).]]
The fragment #t refers to the Markdown content treated as plain text
(text/plain). A specific area of the text can be identified with a
text/plain sub-fragment identifier (e.g., [RFC5147] or its
successors) delimited by a second "#" character. For example:
#t#line=10 identifies the eleventh line of Markdown input.
Implementers should take heed that the "char" scheme counts by
characters rather than octets (or, for that matter, code points);
thus proper interpretation of the charset parameter is REQUIRED for
interoperability of the "char" scheme. For example, "character" and
"code point" are NOT synonymous in the Unicode Standard.
4.2. #o
The fragment #o refers to the content output from a Markdown
processor, which is governed by the output-type parameter. A specific
area of the output can be identified with a sub-fragment identifier
delimited by a second "#" character. The encoding and semantics of
sub-fragment identifiers are also governed by the output-type
parameter. Examples: when the output-type is text/html [RFC2854],
#o#section6 identifies the named anchor "section6" specified by the
input that the Markdown processor converts to
<a name="section6">...</a>. When the output-type is application/pdf
[RFC3778], #o#page=6 causes the sixth page to open.
When the output-type is "text/markdown" (regardless of parameters),
the #o fragment identifier has no semantics; generators MUST use #t
in lieu of #o.
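
The two-level structure of #t and #o can be handled by splitting at the
second "#". The Python sketch below is an illustration with a
hypothetical function name. Rejecting #o when the output-type is
text/markdown is a design choice of this sketch; the text above only
places a requirement on generators.

```python
from typing import Optional, Tuple


def split_fragment(fragment: str,
                   output_type: str = "text/html") -> Tuple[str, Optional[str]]:
    """Split '#t#line=10' into ('t', 'line=10'); sub-fragment may be None."""
    body = fragment[1:] if fragment.startswith("#") else fragment
    prefix, _, sub = body.partition("#")
    media_type = output_type.split(";")[0].strip().lower()
    if prefix == "o" and media_type == "text/markdown":
        # #o has no semantics here; generators are expected to use #t.
        raise ValueError("#o is meaningless when output-type is text/markdown")
    return prefix, (sub or None)


print(split_fragment("#t#line=10"))                    # ('t', 'line=10')
print(split_fragment("#o#page=6", "application/pdf"))  # ('o', 'page=6')
```

Interpreting the sub-fragment is left to the handler for text/plain (for
#t) or for the output media type (for #o).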
4.3. #l and #ldef
The fragment prefix #l refers to links by their link identifiers. The
sub-component of this identifier is delimited by a second "#"
character, followed by the encoded link identifier, optionally
followed by a 1-based index number. Without the index number, the
fragment refers to all such identified links. Example: #l#eS matches
links such as "The rain in [Spain][ES]" and "The word [es][] means
'is' in Spanish." #l#es#2 only matches the second instance of the
"es" link identifier.
The fragment prefix #ldef refers to link reference definitions. The
sub-component of this identifier is delimited by a second "#"
character, followed by the encoded link identifier. There is no index
number; in the case of multiple link reference definitions, the last
definition wins.
In both #l and #ldef, "#" characters that are part of the link
identifier are REQUIRED to be percent-encoded. The percent-encoding of
other characters follows the regular rules of [RFC3986]. [MDSYNTAX]
states that identifiers (or names) "may consist of letters, numbers,
spaces, and punctuation--but they are NOT case sensitive." Characters
outside of the URI character set SHALL be percent-encoded with the
same encoding as the Markdown content. For maximum compatibility and
readability, authors who intend to reference links in fragment
identifiers SHOULD limit themselves to URI characters that do not
require percent-encoding.
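
The #l and #ldef rules can be sketched as follows in Python. The
function name is hypothetical. The sketch assumes that str.casefold()
is an acceptable way to compare link identifiers without regard to
case, and that the caller supplies the charset of the Markdown content
for percent-decoding.

```python
from typing import Optional, Tuple
from urllib.parse import unquote


def parse_link_fragment(fragment: str,
                        charset: str = "utf-8") -> Tuple[str, str, Optional[int]]:
    """Return (prefix, case-folded link identifier, 1-based index or None)."""
    body = fragment[1:] if fragment.startswith("#") else fragment
    # "#" inside a link identifier is percent-encoded, so splitting is safe.
    parts = body.split("#")
    if parts[0] not in ("l", "ldef") or len(parts) < 2:
        raise ValueError("not a #l or #ldef fragment")
    # Link identifiers are not case sensitive; fold before comparing.
    link_id = unquote(parts[1], encoding=charset).casefold()
    index = None
    if parts[0] == "l" and len(parts) > 2:
        index = int(parts[2])  # #ldef takes no index number
    return parts[0], link_id, index


print(parse_link_fragment("#l#eS"))    # ('l', 'es', None)
print(parse_link_fragment("#l#es#2"))  # ('l', 'es', 2)
```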
4.4. Other Fragment Identifiers
Specific syntaxes may define additional fragment identifiers specific
to the syntax. For example, a syntax that incorporates "header"
information might consider #h to refer to the "header" part, and #b
to refer to the "body" part.
5. Example
The following is an example of Markdown as an e-mail attachment:
MIME-Version: 1.0
Content-Type: text/markdown; charset=UTF-8; syntax=Original;
output-type="application/xhtml+xml"
Content-Disposition: attachment; filename=readme.md
Sample HTML 4 Markdown
=============
This is some sample Markdown. [Hooray!][foo]
(Remember that link identifiers are not case-sensitive.)
Bulleted Lists
-------
Here are some bulleted lists...
* One Potato
* Two Potato
* Three Potato
- One Tomato
- Two Tomato
- Three Tomato
More Information
-----------
[.markdown, .md](http://daringfireball.net/projects/markdown/)
has more information.
[fOo]: http://example.com/loc 'Will Not Work with Markdown.pl-1.0.1'
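
A receiver could recover the parameters from headers like those in the
example with a standard MIME parser. The Python sketch below uses the
standard library's email package; the function name is hypothetical,
the sample message is an abbreviated copy of the example with the
Content-Type header written on one line, and supplying "text/html" when
output-type is absent reflects the default given in Section 3.2.

```python
from email import message_from_string
from typing import Dict, Optional

RAW = (
    "MIME-Version: 1.0\n"
    "Content-Type: text/markdown; charset=UTF-8; syntax=Original; "
    'output-type="application/xhtml+xml"\n'
    "Content-Disposition: attachment; filename=readme.md\n"
    "\n"
    "Sample HTML 4 Markdown\n"
    "=============\n"
)


def markdown_parameters(raw_message: str) -> Dict[str, Optional[str]]:
    """Extract charset, syntax, and output-type from a text/markdown entity."""
    msg = message_from_string(raw_message)
    if msg.get_content_type() != "text/markdown":
        raise ValueError("not a text/markdown entity")
    return {
        "charset": msg.get_content_charset(),  # lowercased by the library
        "syntax": msg.get_param("syntax"),     # None means receiver's choice
        "output-type": msg.get_param("output-type", "text/html"),
    }


print(markdown_parameters(RAW))
```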
6. IANA Considerations
IANA is asked to register the media type text/markdown in the
Standards tree using the application provided in Section 2 of this
document.
IANA is also asked to establish a subtype registry called "Markdown
Syntaxes". Each entry in this registry shall consist of a syntax
identifier and information about the syntax, as follows:
6.1. Syntax Template
{if provisional}
PROVISIONAL REGISTRATION EXPIRES [YYYY-MM-DD date format]
Identifier: [Identifier]
Description: [Concise, prose description of the syntax, with
emphasis on its purpose and notable variations
from [MDSYNTAX] or another syntax. If the syntax
permits structured data, this fact ought to be
included. Other Markdown syntaxes may be referenced
by quoting their registered identifiers.]
Documentation: [References to documentation.]
Community of Use: [Concise, prose description of the
community of use, such as
"scholarly publications" or "screenwriting".
"General" may be entered if the community
encompasses general users of the Internet.]
[[TODO: Users (screenwriters) or use cases
(screenwriting)?]]
[[NB: Should Versions: and Extensions: be {optional} and
therefore omittable, or should they have "None." to
indicate that no versions or extensions apply?]]
Versions:
{for each version}
Identifier: [Identifier]
Description: [Optional, concise, prose description of the
version. "N/A" SHALL be used to indicate no description.]
Extensions:
{for each extension}
Identifier: [Identifier]
Syntax:
{if Enabled}
Enabled
{if Enabled, with String}
Enabled, with String: [prose description of what the
string is (not what it does)]
{if Enabled, with URI}
Enabled, with URI: [prose description of what the URI
is (not what it does)]
Description: [Concise, prose description of the extension,
i.e., what it does.]
Documentation: [References to documentation.]
Anticipated Output Types:
{for each output-type}
[media type]
{optional} [prose description of parameter considerations]
{optional}
Additional Fragment Identifiers:
[Prose description of additional fragment identifiers,
sufficient for interoperability.]
Responsible Parties:
{for each party}
([type: individual, corporate, representative])
[Name] ...
Currently Maintained? [Yes/No]
{optional}
Implementations:
{for each implementation}
Name: [Name]
Version(s): [Significant version or versions that
implement the syntax]
Type: ["Processor" or some other type]
References: ...
Purpose: [Concise, prose description of the implementation.]
A responsible party can be an individual author or maintainer, a
corporate author or maintainer (plus an individual contact), or a
representative of a community of interest dedicated to the Markdown
syntax.
The Versions, Extensions, Additional Fragment Identifiers, and
Implementations sections are optional.
6.2. Initial Registration
The registry shall have the following initial registration;
implementations conforming to this document MUST handle this syntax.
[MDMTUSES] provides additional exemplary syntaxes.
Identifier: Original
Description: Gruber's original Markdown syntax.
Documentation:
[MDSYNTAX]. For the "2004" version, the documentation is
provided in HTML and in Markdown, as follows:
syntax: Content-Type: text/html; charset=UTF-8
Accessed on October 12, 2014 at 8:27 PM (-0700)
38570 bytes
SHA-256 hash: B2EC2A62 3257F164 FBC88AE8 C7E76F3F
80F16845 105D9F3E 3E8CE25B 6F0CB33B
syntax.text: Content-Type: text/plain; charset=UTF-8
(actually text/markdown;
syntax=Original;
output-type="text/markdown")
Accessed on October 12, 2014 at 8:27 PM (-0700)
27784 bytes
SHA-256 hash: 01A6A07A F51838E1 8749454B 06D716BC
B1BC0EAA A21B67B7 D6FB5A6B 4FFB5D5B
Community of Use: General.
Versions:
Identifier: 2004
Description: [MDSYNTAX] as it (is rumored to have) existed
since December 14, 2004, corresponding to
Markdown.pl 1.0.1. The version "2004" SHOULD NOT
be specified until further notice; it is only
documented for completeness (in case Gruber
revises the syntax with material contradictions).
Anticipated Output Types:
text/html
application/xhtml+xml
Responsible Parties:
(individual) John Gruber
Currently Maintained? No
Implementations:
Name: Markdown.pl
Version(s): 1.0.1, 1.0.2b8
Type: Processor
References: [MARKDOWN]
Purpose: Converts Markdown to HTML or XHTML circa 2004.
The argument "--html4tags" causes HTML output.
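The SHA-256 hashes in the Documentation field of the registration
above are printed as eight groups of eight uppercase hexadecimal
digits. The following Python sketch shows one way to produce and
compare a hash in that layout. The function names are assumptions
made for this sketch, and retrieving the documentation bytes is
deliberately left out.

```python
import hashlib

# Hash of the "syntax" (HTML) documentation, copied from the
# registration above.
REGISTERED_SYNTAX_HASH = (
    "B2EC2A62 3257F164 FBC88AE8 C7E76F3F "
    "80F16845 105D9F3E 3E8CE25B 6F0CB33B"
)


def grouped_sha256(data: bytes, group: int = 8) -> str:
    """Format the SHA-256 of data as space-separated uppercase hex groups."""
    digest = hashlib.sha256(data).hexdigest().upper()
    return " ".join(digest[i:i + group] for i in range(0, len(digest), group))


def matches_registered_hash(data: bytes, registered: str) -> bool:
    """Compare data against a hash as printed in a registration.

    Whitespace and letter case in the registered value are ignored.
    """
    normalized = "".join(registered.split()).lower()
    return hashlib.sha256(data).hexdigest() == normalized


# Illustrative input only; a real check would pass the retrieved
# documentation bytes instead.
sample = b"placeholder bytes, not the real documentation"
print(grouped_sha256(sample))
print(matches_registered_hash(sample, REGISTERED_SYNTAX_HASH))
```

A caller would pass the exact bytes of the retrieved document, whose
length can also be checked against the byte count recorded in the
registration.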
6.3. Reserved Identifiers
The registry SHALL have the following identifiers RESERVED. No one is
allowed to register them (or any case variations of them).
Standard
Common
Markdown
6.4. Standard of Review
Registrations are made on a First-Come, First-Served [RFC5226] basis
by anyone with a need to interoperate. While documentation is
required, any level of documentation is sufficient; thus, neither
Specification Required nor Expert Review is warranted. The checks
prescribed by this section can be performed automatically.
Syntax, version, and extension identifiers MUST comply with the
syntaxes specified in this document. Additionally, the identifier
MUST NOT differ from other registered identifiers merely by case.
Identifiers MUST conform to [[TODO: PRECIS? STRINGPREP?]]. The
purpose of this requirement is to eliminate confusingly similar
identifiers, placing the burden on the registration process rather
than on syntax parameter parsers.
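Because these checks can be automated, the case-variation rule and
the reserved identifiers of Section 6.3 can be sketched in Python as
follows. This is an illustration under stated assumptions: the
function name is hypothetical, str.casefold() stands in for whatever
string preparation profile is eventually chosen, and conformance to
the identifier syntax defined elsewhere in this document is not
checked here.

```python
from typing import Iterable, Optional

# Reserved identifiers from Section 6.3.
RESERVED_IDENTIFIERS = ("Standard", "Common", "Markdown")


def registration_conflict(candidate: str,
                          registered: Iterable[str]) -> Optional[str]:
    """Return a reason the candidate cannot be registered, or None.

    Assumption: case-insensitive comparison uses str.casefold(); the
    document leaves the exact string preparation rules open.
    """
    key = candidate.casefold()
    for name in RESERVED_IDENTIFIERS:
        if name.casefold() == key:
            return f"{candidate!r} matches reserved identifier {name!r}"
    for name in registered:
        if name.casefold() == key:
            return f"{candidate!r} matches registered identifier {name!r}"
    return None


# "Original" is the initial registration; "ExampleSyntax" is a
# hypothetical identifier used only for illustration.
existing = ["Original"]
for attempt in ("ORIGINAL", "markdown", "ExampleSyntax"):
    print(attempt, "->", registration_conflict(attempt, existing))
```

Performing the comparison at registration time keeps syntax
parameter parsers simple, which is the stated purpose of the
requirement.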
All references (including contact information) MUST be verified as
functional at the time of the registration.
If a registration is being updated, either the contact information
MUST match the prior registration and be verified, or the prior
registrant MUST confirm that the updating registrant has authority to
update the registration. As a special "escape valve", registrations
can be updated with IETF Review [RFC5226]. [[NB: Two purposes: 1) to
deal with "harmful" registrations (stale references are not a
sufficient justification); 2) to deal with registrations that are
IETF registrations, like RFC-related Markdown (but this could be
handled by listing the IETF as the contact organization, right?).]]
All fields may be updated except the syntax identifier, which is
permanent: not even case may be changed.
6.5. Provisional Registration
Any registrant may make a provisional registration to reserve a
syntax identifier. Provisional registrations include the ALL-CAPS
legend as shown in Section 6.1. All fields are optional except for
the syntax identifier and contact information. Provisional
registrations expire after three months, after which time the syntax
identifier may be reused.
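The expiration date shown in the provisional legend can be computed
mechanically. The following Python sketch assumes that "three
months" means three calendar months, with the day clamped to the
last day of a shorter month; this document does not specify that
interpretation, and the function names are hypothetical.

```python
import calendar
from datetime import date


def provisional_expiry(registered_on: date, months: int = 3) -> date:
    """Add calendar months to a date, clamping the day if needed.

    Assumption: "three months" is read as calendar months, so a
    registration made on November 30 expires on the last day of
    February.
    """
    index = registered_on.month - 1 + months
    year = registered_on.year + index // 12
    month = index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(registered_on.day, last_day))


def expiry_legend(registered_on: date) -> str:
    """Build the ALL-CAPS legend from the template in Section 6.1."""
    expires = provisional_expiry(registered_on).isoformat()
    return f"PROVISIONAL REGISTRATION EXPIRES {expires}"


print(expiry_legend(date(2014, 10, 17)))
```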
7. Security Considerations
See the Security considerations entry in Section 2.
8. References
8.1. Normative References
[MARKDOWN] Gruber, J., "Daring Fireball: Markdown", December 2004,
.
[MDSYNTAX] Gruber, J., "Daring Fireball: Markdown Syntax
Documentation", December 2004,
.
[MDUTI] Gruber, J., "Daring Fireball: Uniform Type Identifier for
Markdown", August 2011,
.
[RFC2045] Freed, N. and N. Borenstein, "Multipurpose Internet Mail
Extensions (MIME) Part One: Format of Internet Message
Bodies", RFC 2045, November 1996.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC2854] Connolly, D. and L. Masinter, "The 'text/html' Media
Type", RFC 2854, June 2000.
[RFC3778] Taft, E., Pravetz, J., Zilles, S., and L. Masinter, "The
application/pdf Media Type", RFC 3778, May 2004.
[RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
Resource Identifier (URI): Generic Syntax", STD 66, RFC
3986, January 2005.
[RFC5147] Wilde, E. and M. Duerst, "URI Fragment Identifiers for the
text/plain Media Type", RFC 5147, April 2008.
[RFC5226] Narten, T., and H. Alvestrand, "Guidelines for Writing an
IANA Considerations Section in RFCs", RFC 5226, May 2008.
[RFC5322] Resnick, P., Ed., "Internet Message Format", RFC 5322,
October 2008.
[RFC6838] Freed, N., Klensin, J., and T. Hansen, "Media Type
Specifications and Registration Procedures", BCP 13, RFC
6838, January 2013.
8.2. Informative References
[HUMANE] Atwood, J., "Is HTML a Humane Markup Language?", May 2008,
.
[INETMEME] Solon, O., "Richard Dawkins on the internet's hijacking of
the word 'meme'", June 2013,
, .
[MDMTUSES] Leonard, S., "text/markdown Use Cases", draft-seantek-
text-markdown-use-cases-00 (work in progress), October
2014.
[PANDOC] MacFarlane, J., "Pandoc", 2014,
.
[RAILFROG] Railfrog Team, "Railfrog", April 2009,
.
[RFC1468] Murai, J., Crispin, M., and E. van der Poel, "Japanese
Character Encoding for Internet Messages", RFC 1468, June
1993.
[RFC2392] Levinson, E., "Content-ID and Message-ID Uniform Resource
Locators", RFC 2392, August 1998.
[RFC3676] Gellens, R., "The Text/Plain Format and DelSp Parameters",
RFC 3676, February 2004.
[RFC4263] Lilly, B., "Media Subtype Registration for Media Type
text/troff", RFC 4263, January 2006.
[FOUNTAIN] Maschwitz, S. and J. August, "Fountain | A markup language
for screenwriting.", 2014, .
[FTSYNTAX] Maschwitz, S. and J. August, "Syntax - Fountain | A markup
language for screenwriting.", 1.1, March 2014,
.
Appendix A. Change Log
This draft is a continuation of draft-ietf-appsawg-text-markdown-
02.txt. These technical changes were made:
1. Proposed that the document be split into two documents: the
main document (which is normative), and a second document. The
second document (draft-seantek-text-markdown-use-cases-00)
[MDMTUSES] provides additional background information,
suggestions for preserving metadata, registration templates
for common Markdown syntaxes, and examples for common Markdown
syntaxes. RFC 2119 key words are not included in draft-
seantek-text-markdown-use-cases because its content is not
normative (or, at least, is less normative than the main
document).
2. De-emphasized Unicode (and UTF-8 encoding) after close
consideration of the original [MDSYNTAX], and the various
proposed extensions to Markdown in the intervening time.
"CommonMark", for example, places stronger emphasis on
Unicode (and UTF-8).
3. Deleted processor parameter.
4. Renamed flavor parameter to syntax parameter.
5. Renamed "rules" to "extensions" in the syntax parameter.
6. Parameterized "extensions" so that it can have a string or a
URI.
7. Simplified the syntax parameter (compared to draft-02, in any
event) with fewer exceptional cases in the ABNF.
8. Rewrote significant parts of the output-type parameter, and
gave text/markdown additional explanation.
9. Rewrote the introduction so that it is much shorter.
10. Moved the example towards the end.
11. Added Fragment Identifier Considerations.
12. Consolidated the Security Considerations into the registration
template.
13. Rewrote the IANA Considerations section so that it only
creates one new registry.
14. Redefined the flavors registry (now called the Markdown
Syntaxes registry).
15. Rewrote the "Original" syntax registration to conform to the
new registration template.
16. Added a discussion and example of the Paradigmatic Use Case
(Markdown Editors).
Author's Address
Sean Leonard
Penango, Inc.
5900 Wilshire Boulevard
21st Floor
Los Angeles, CA 90036
USA
EMail: dev+ietf@seantek.com
URI: http://www.penango.com/