Network Working Group                                  P. Faltstrom, Ed.
Internet-Draft                                                     Cisco
Intended status: Standards Track                            May 21, 2007
Expires: November 22, 2007


                     The Unicode Codepoints and IDN
                 draft-faltstrom-idnabis-tables-02.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on November 22, 2007.

Copyright Notice

   Copyright (C) The IETF Trust (2007).

Abstract

   This document specifies rules for deciding whether a codepoint,
   considered in isolation, is a candidate for inclusion in an
   Internationalized Domain Name.

   It is part of the specification of IDNA200X.

Faltstrom               Expires November 22, 2007               [Page 1]

Internet-Draft             Unicode Codepoints                   May 2007

Table of Contents

   1.  Introduction
   2.  The rules used
     2.1.  Rule A - Classes of characters
     2.2.  Rule B - Normalization
     2.3.  Rule C - Casefolding
     2.4.  Rule D - Ignorables
     2.5.  Rule E - Obsolete scripts
     2.6.  Rule F - Blocks of characters
     2.7.  Rule G - ASCII
     2.8.  Rule H - Stable scripts
   3.  Calculation of the derived property
   4.  Codepoints
     4.1.  Codepoints in Unicode Character Database (UCD) format
   5.  IANA Considerations
   6.  Security Considerations
   7.  Contributors
   8.  Acknowledgements
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Author's Address
   Intellectual Property and Copyright Statements

1.  Introduction

   RFC 4690 [RFC4690] suggests an inclusion-based approach for
   selecting the codepoints from The Unicode Standard [Unicode5] that
   should be included in the list of codepoints that may be used in
   Internationalized Domain Names.

   Specifically, RFC 4690 [RFC4690] says the following:

      The IAB has concluded that there is a consensus within the
      broader community that lists of code points should be specified
      by the use of an inclusion-based mechanism (i.e., identifying the
      characters that are permitted), rather than by excluding a small
      number of characters from the total Unicode set as Stringprep
      [RFC3454] and Nameprep [RFC3491] do today.  That conclusion
      should be reviewed by the IETF community and action taken as
      appropriate.

   This document reviews the collections of codepoints in Unicode by
   looking at various properties of the codepoints, and defines a
   derived property that identifies groups of characters:

   o  Those that should clearly be included in IDNs

   o  Those that should clearly not be included in IDNs

   o  Those where no final determination can be made at this time

   More than two categories are needed because complex trade-offs are
   involved and because, in some cases, sufficient information is not
   yet available.

   This document is based on Unicode 5.0, rather than the earlier
   Unicode 3.2, in order to take advantage of the expanded character
   repertoire and better definitions in the newer version.  The
   mechanisms described here allow the value of the property to be
   determined for characters added after Unicode 5.0, so they should be
   suitable for newer revisions of Unicode, as long as the properties
   on which they are based remain stable.

   Some combinations of allowed codepoints are not advisable for use in
   IDNs, due to rules specific to a script or class of characters;
   these rules are described in other documents.

   Some codepoints need to be allowed in exceptional circumstances, but
   should be excluded in all other cases; these rules are also
   described in other documents.  The most notable of these are the
   ZERO WIDTH JOINER (U+200D) and ZERO WIDTH NON-JOINER (U+200C).

   This document is part of a series that, together, constitute a
   preliminary proposal for updating the IDNA standards to resolve
   issues uncovered in recent years, cover a broader range of scripts,
   and provide for migration to newer versions of Unicode.  See
   [IDNA-issues] for a broader discussion.

2.  The rules used

   The derived property gets its value from a series of rules.  The
   rules are based on core properties defined by the Unicode Standard
   as well as other data about the codepoint, such as the script and
   block to which the codepoint belongs.
   Only one of the rules (Rule G) sets the value based on the explicit
   codepoint being tested; it exists for compatibility with
   "traditional DNS", where uppercase letters A-Z are allowed.

   For each rule, this document specifies whether the rule increases or
   decreases the value of the property (that is, the likelihood that
   the codepoint is included in a U-label) and how the rule is
   calculated (where in the Unicode Standard the data for the rule is
   found).  Some rules carry additional comments and notes about the
   rule or the data used to calculate it.

   The data in the Unicode Standard often uses aliases.  As far as
   possible, this document uses the spelled-out terms (for example,
   Lowercase_Letter) rather than the alias (Ll), although the rules
   themselves use the shorthand version.

2.1.  Rule A - Classes of characters

   generalCategory(cp) is in {Ll, Lu, Lo, Lm, Mn, Mc, Nd}

   The rule is intended to include characters commonly used in
   identifiers, and will not be changed.

   The generalCategory of a codepoint is found in UnicodeData.txt in
   the third column.  The mapping between the alias (for example, Ll)
   and the name of the general category is found in
   PropertyValueAliases.txt under the heading General_Category.

   The categories used in this rule are:

   o  Ll - Lowercase_Letter

   o  Lu - Uppercase_Letter

   o  Lo - Other_Letter

   o  Lm - Modifier_Letter

   o  Mn - Nonspacing_Mark

   o  Mc - Spacing_Mark

   o  Nd - Decimal_Number

2.2.  Rule B - Normalization

   NFKC(cp) != cp

   The rule is intended to exclude all characters that change under
   normalization, and will not be changed.

   The normalization algorithm NFKC is defined as compatibility
   decomposition (NFKD) followed by canonical composition.
   Decomposition mappings are found in UnicodeData.txt in the sixth
   column.  That column holds the mapping itself and, for a
   compatibility (as opposed to canonical) mapping, also the
   decomposition type (within angle brackets).
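
   The Rule B test can be sketched as follows.  This is only a minimal
   illustration, not part of the specification: it assumes Python's
   standard unicodedata module, whose data reflects the Unicode version
   shipped with the interpreter rather than Unicode 5.0, and the
   function names rule_b and nfkc_categories are hypothetical.

```python
import unicodedata


def rule_b(cp: int) -> bool:
    """Return True when NFKC normalization changes the single codepoint cp."""
    ch = chr(cp)
    return unicodedata.normalize("NFKC", ch) != ch


def nfkc_categories(cp: int) -> list:
    """Return the General Category of each codepoint in the NFKC form of cp."""
    normalized = unicodedata.normalize("NFKC", chr(cp))
    return [unicodedata.category(c) for c in normalized]


if __name__ == "__main__":
    # U+0061 and U+00E0 are stable under NFKC; U+00AA and U+0140 are not.
    for cp in (0x0061, 0x00E0, 0x00AA, 0x0140):
        print(f"U+{cp:04X} rule_b={rule_b(cp)} "
              f"nfkc_categories={nfkc_categories(cp)}")
```

   The second helper shows why a decomposed sequence may meet other
   rules differently from the original codepoint: each codepoint of the
   normalized form has its own General Category.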
   An exclusion table for composition exists in
   CompositionExclusions.txt.  Singleton decompositions are never
   recomposed.  Hangul is decomposed and composed according to an
   algorithm specified in the Unicode Standard.

   It has been discussed whether NFC should be used instead of NFKC.

   Codepoints known to have issues with normalization include U+0140
   (LATIN SMALL LETTER L WITH MIDDLE DOT), which belongs to General
   Category Ll, while its normalized form is U+006C U+00B7 (LATIN SMALL
   LETTER L followed by MIDDLE DOT), where U+00B7 is of General
   Category Po (Other_Punctuation).  This implies that if this rule
   excludes a codepoint and the decomposed sequence of codepoints is
   used instead, other rules that look at the General Category might
   reject one or more codepoints in the decomposed sequence.

2.3.  Rule C - Casefolding

   casefold(cp) != cp

   The rule is intended to exclude all characters that change when
   folded to lower case, and will not be changed.

   Case folding rules can in general be found in UnicodeData.txt in the
   14th column.  A table with special cases can be found in
   SpecialCasing.txt.

   The SpecialCasing.txt file contains both unconditional casing rules
   and conditional mappings.  Only unconditional mappings are included
   in the rule.  Whether a mapping is conditional can be seen in the
   condition field of SpecialCasing.txt, the fifth field, which follows
   the code, lowercase, titlecase, and uppercase fields.  If that field
   is empty, the mapping is unconditional.

   Codepoints known to have difficult case folding include U+00DF
   (LATIN SMALL LETTER SHARP S) and U+0130 (LATIN CAPITAL LETTER I WITH
   DOT ABOVE).

2.4.  Rule D - Ignorables

   property(cp) is in {Other_Default_Ignorable_Code_Point,
   Noncharacter_Code_Point}

   The rule is intended to exclude ignorable characters, and will not
   be changed.

   The properties of a codepoint are listed in PropList.txt.
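
   As an illustration of Rule D, the following sketch reads lines in
   the PropList.txt layout ("range ; property # comment") and tests a
   codepoint against the two properties named by the rule.  It is a
   minimal sketch under stated assumptions: SAMPLE is a small
   illustrative excerpt and not the full file, and the names
   parse_ranges and rule_d are hypothetical.

```python
from typing import List, Set, Tuple

# The two properties named by Rule D.
RULE_D_PROPERTIES = {
    "Other_Default_Ignorable_Code_Point",
    "Noncharacter_Code_Point",
}

# Illustrative excerpt in the PropList.txt layout (assumption: a real run
# would read the complete file for the Unicode version in use).
SAMPLE = """\
0009..000D    ; White_Space # Cc   [5] <control-0009>..<control-000D>
034F          ; Other_Default_Ignorable_Code_Point # Mn COMBINING GRAPHEME JOINER
115F..1160    ; Other_Default_Ignorable_Code_Point # Lo [2] HANGUL FILLERS
FDD0..FDEF    ; Noncharacter_Code_Point # Cn  [32] noncharacters
"""


def parse_ranges(text: str, wanted: Set[str]) -> List[Tuple[int, int]]:
    """Collect (first, last) codepoint ranges whose property is in wanted."""
    ranges = []
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()  # drop the trailing comment
        if not data:
            continue
        span, prop = (field.strip() for field in data.split(";")[:2])
        if prop not in wanted:
            continue
        first, _, last = span.partition("..")
        ranges.append((int(first, 16), int(last or first, 16)))
    return ranges


def rule_d(cp: int, ranges: List[Tuple[int, int]]) -> bool:
    """Return True when cp falls inside any of the collected ranges."""
    return any(lo <= cp <= hi for lo, hi in ranges)


RANGES = parse_ranges(SAMPLE, RULE_D_PROPERTIES)
```

   Keeping the parser separate from the membership test means the same
   range list could be built once and reused for every codepoint.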
   Note that there are also derived properties in
   DerivedCoreProperties.txt; these are not used.  In particular, it
   has been discussed whether Default_Ignorable_Code_Point should be
   used, but because it is a derived property, it is not.  Its
   definition (see DerivedCoreProperties.txt) is:

      Other_Default_Ignorable_Code_Point + Cf + Cc + Cs
      + Noncharacters - White_Space
      - FFF9..FFFB (Annotation Characters)

   where Noncharacters is a property that exists only in NamesList.txt.
   That property seems to be similar to Noncharacter_Code_Point as
   defined in PropList.txt, but the two properties are not aliases.

2.5.  Rule E - Obsolete scripts

   script(cp) in {Cuneiform, Ugaritic, Old_Persian, Gothic, Old_Italic,
   Cypriot, Linear_B, Phoenician, Kharoshthi, Phags_Pa, Glagolitic,
   Shavian, Deseret, Osmanya, Ogham}

   The rule is intended to exclude obsolete scripts.  Scripts may be
   added to or removed from the list in the future.

   The script to which a codepoint belongs is listed in Scripts.txt.
   Note that aliases for scripts can be found in
   PropertyValueAliases.txt; for example, Xsux is an alias for the
   script Cuneiform.

2.6.  Rule F - Blocks of characters

   block(cp) in {Combining_Diacritical_Marks_for_Symbols,
   Musical_Symbols, Ancient_Greek_Musical_Notation}

   The rule is intended to exclude characters not useful for
   identifiers.  Blocks may be added to or removed from the list in the
   future.

   The block to which a codepoint belongs is listed in Blocks.txt.

2.7.  Rule G - ASCII

   cp is in [-A-Z0-9]

   The rule is intended to make sure that the ASCII characters
   permitted in non-IDNA hostnames are allowed.  It will not be
   changed.

   The characters in this rule are defined by ASCII.  The rule says
   that anything that was part of the LDH (Letter, Digit, Hyphen)
   definition is also allowed in a U-label.  The lowercase letters a-z
   are not listed explicitly because they already receive the value
   ALWAYS through Rules A and H.

2.8.  Rule H - Stable scripts

   script(cp) in {Latin, Greek, Cyrillic}

   The rule is intended to show which scripts have encodings that are
   stable enough for use in IDNs.  New scripts may be added in the
   future, possibly with additional documentation for whole-label rules
   relevant to that script, but no script will be removed.  The list
   has been seeded with the script that subsumes ASCII plus the two
   scripts most proximal to it, and it is expected that other scripts
   will be added.

   These scripts are stable enough that the characters placed into the
   NEVER category by these rules cannot possibly be candidates for
   reassignment to the ALWAYS category.

3.  Calculation of the derived property

   The possible values of the property are:

   o  ALWAYS

   o  MAYBE YES

   o  MAYBE NOT

   o  NEVER

   The algorithm to calculate the value of the derived property is as
   follows:

   o  If rule G is true, the value is ALWAYS.  Processing terminates.

   o  If rule E or F is true, the value is MAYBE NOT.  Processing
      terminates.

   o  If rule H is true, and at least one of rules B, C or D is true,
      the value is NEVER.

   o  If rule H is false, and at least one of rules B, C or D is true,
      the value is MAYBE NOT.

   o  If rule H is true, and rule A is false, the value is NEVER.

   o  If rule H is false, and rule A is false, the value is MAYBE NOT.

   o  If rule H is true, rule A is true, and all of rules B, C and D
      are false, the value is ALWAYS.

   o  If rule H is false, rule A is true, and all of rules B, C and D
      are false, the value is MAYBE YES.

   The reasoning behind treating Latin, Greek and Cyrillic differently
   from other scripts in rules A, B, C and D is that we feel confident
   that we understand these scripts well enough to be sure that the
   characters placed into the NEVER category by these rules cannot
   possibly be candidates for reassignment to the ALWAYS category.  As
   our confidence grows with regard to other scripts, we expect that
   they will be treated the same way.

4.  Codepoints

   Applying these rules to the codepoints 0x0000 to 0xFFFF gives the
   following result.  See also [codepoints] for a listing of all
   individual codepoints, per script.

4.1.  Codepoints in Unicode Character Database (UCD) format

   0000..002C ; MAYBE NOT # <control>..COMMA
   002D ; ALWAYS # HYPHEN-MINUS
   002E..002F ; MAYBE NOT # FULL STOP..SOLIDUS
   0030..0039 ; ALWAYS # DIGIT ZERO..DIGIT NINE
   003A..0040 ; MAYBE NOT # COLON..COMMERCIAL AT
   0041..005A ; ALWAYS # LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
   005B..0060 ; MAYBE NOT # LEFT SQUARE BRACKET..GRAVE ACCENT
   0061..007A ; ALWAYS # LATIN SMALL LETTER A..LATIN SMALL LETTER Z
   007B..00A9 ; MAYBE NOT # LEFT CURLY BRACKET..COPYRIGHT SIGN
   00AA ; NEVER # FEMININE ORDINAL INDICATOR
   00AB..00B9 ; MAYBE NOT # LEFT-POINTING DOUBLE ANGLE QUOTATION MARK..SUPE
   00BA ; NEVER # MASCULINE ORDINAL INDICATOR
   00BB..00BF ; MAYBE NOT # RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK..INV
   00C0..00D6 ; NEVER # LATIN CAPITAL LETTER A WITH GRAVE..LATIN CAPITA
   00D7 ; MAYBE NOT # MULTIPLICATION SIGN
   00D8..00DF ; NEVER # LATIN CAPITAL LETTER O WITH STROKE..LATIN SMALL
   00E0..00F6 ; ALWAYS # LATIN SMALL LETTER A WITH GRAVE..LATIN SMALL LE
   00F7 ; MAYBE NOT # DIVISION SIGN
   00F8..00FF ; ALWAYS # LATIN SMALL LETTER O WITH STROKE..LATIN SMALL L
   0100 ; NEVER # LATIN CAPITAL LETTER A WITH MACRON
   0101 ; ALWAYS # LATIN SMALL LETTER A WITH MACRON
   0102 ; NEVER # LATIN CAPITAL LETTER A WITH BREVE
   0103 ; ALWAYS # LATIN SMALL LETTER A WITH BREVE
   0104 ; NEVER # LATIN CAPITAL LETTER A WITH OGONEK
   0105 ; ALWAYS # LATIN SMALL LETTER A WITH OGONEK
   0106 ; NEVER # LATIN CAPITAL LETTER C WITH ACUTE
   0107 ; ALWAYS # LATIN SMALL LETTER C WITH ACUTE
   0108 ; NEVER # LATIN CAPITAL LETTER C WITH CIRCUMFLEX
   0109 ; ALWAYS # LATIN SMALL LETTER C WITH CIRCUMFLEX
   010A ; NEVER # LATIN CAPITAL LETTER C WITH DOT ABOVE
   010B ; ALWAYS # LATIN SMALL LETTER C WITH DOT ABOVE
   010C ; NEVER # LATIN CAPITAL LETTER C WITH CARON
   010D ; ALWAYS # LATIN SMALL LETTER C WITH CARON
   010E ; NEVER # LATIN CAPITAL LETTER D WITH CARON
   010F ; ALWAYS # LATIN SMALL LETTER D WITH CARON
   0110 ; NEVER # LATIN CAPITAL LETTER D WITH STROKE
   0111 ; ALWAYS # LATIN SMALL LETTER D WITH STROKE
   0112 ; NEVER # LATIN CAPITAL LETTER E WITH MACRON
   0113 ; ALWAYS # LATIN SMALL LETTER E WITH MACRON
   0114 ; NEVER # LATIN CAPITAL LETTER E WITH BREVE
   0115 ; ALWAYS # LATIN SMALL LETTER E WITH BREVE
   0116 ; NEVER # LATIN CAPITAL LETTER E WITH DOT ABOVE
   0117 ; ALWAYS # LATIN SMALL LETTER E WITH DOT ABOVE
   0118 ; NEVER # LATIN CAPITAL LETTER E WITH OGONEK
   0119 ; ALWAYS # LATIN SMALL LETTER E WITH OGONEK
   011A ; NEVER # LATIN CAPITAL LETTER E WITH CARON
   011B ; ALWAYS # LATIN SMALL LETTER E WITH CARON
   011C ; NEVER # LATIN CAPITAL LETTER G WITH CIRCUMFLEX
   011D ; ALWAYS # LATIN SMALL LETTER G WITH CIRCUMFLEX
   011E ; NEVER # LATIN CAPITAL LETTER G WITH BREVE
   011F ; ALWAYS # LATIN SMALL LETTER G WITH BREVE
   0120 ; NEVER # LATIN CAPITAL LETTER G WITH DOT ABOVE
   0121 ; ALWAYS # LATIN SMALL LETTER G WITH DOT ABOVE
   0122 ; NEVER # LATIN CAPITAL LETTER G WITH CEDILLA
   0123 ; ALWAYS # LATIN SMALL LETTER G WITH CEDILLA
   0124 ; NEVER # LATIN CAPITAL LETTER H WITH CIRCUMFLEX
   0125 ; ALWAYS # LATIN SMALL LETTER H WITH CIRCUMFLEX
   0126 ; NEVER # LATIN CAPITAL LETTER H WITH STROKE
   0127 ; ALWAYS # LATIN SMALL LETTER H WITH STROKE
   0128 ; NEVER # LATIN CAPITAL LETTER I WITH TILDE
   0129 ; ALWAYS # LATIN SMALL LETTER I WITH TILDE
   012A ; NEVER # LATIN CAPITAL LETTER I WITH MACRON
   012B ; ALWAYS # LATIN SMALL LETTER I WITH MACRON
   012C ; NEVER # LATIN CAPITAL LETTER I WITH BREVE
   012D ; ALWAYS # LATIN SMALL LETTER I WITH BREVE
   012E ; NEVER # LATIN CAPITAL LETTER I WITH OGONEK
   012F..0131 ; ALWAYS # LATIN SMALL LETTER I WITH OGONEK..LATIN SMALL L
   0132..0134 ; NEVER # LATIN CAPITAL LIGATURE IJ..LATIN CAPITAL LETTER
   0135 ; ALWAYS # LATIN SMALL LETTER J WITH CIRCUMFLEX
   0136 ; NEVER # LATIN CAPITAL LETTER K WITH CEDILLA
   0137..0138 ; ALWAYS # LATIN SMALL LETTER K WITH CEDILLA..LATIN SMALL
   0139 ; NEVER # LATIN CAPITAL LETTER L WITH ACUTE
   013A ; ALWAYS # LATIN SMALL LETTER L WITH ACUTE
   013B ; NEVER # LATIN CAPITAL LETTER L WITH CEDILLA
   013C ; ALWAYS # LATIN SMALL LETTER L WITH CEDILLA
   013D ; NEVER # LATIN CAPITAL LETTER L WITH CARON
   013E ; ALWAYS # LATIN SMALL LETTER L WITH CARON
   013F..0141 ; NEVER # LATIN CAPITAL LETTER L WITH MIDDLE DOT..LATIN C
   0142 ; ALWAYS # LATIN SMALL LETTER L WITH STROKE
   0143 ; NEVER # LATIN CAPITAL LETTER N WITH ACUTE
   0144 ; ALWAYS # LATIN SMALL LETTER N WITH ACUTE
   0145 ; NEVER # LATIN CAPITAL LETTER N WITH CEDILLA
   0146 ; ALWAYS # LATIN SMALL LETTER N WITH CEDILLA
   0147 ; NEVER # LATIN CAPITAL LETTER N WITH CARON
   0148 ; ALWAYS # LATIN SMALL LETTER N WITH CARON
   0149..014A ; NEVER # LATIN SMALL LETTER N PRECEDED BY APOSTROPHE..LA
   014B ; ALWAYS # LATIN SMALL LETTER ENG
   014C ; NEVER # LATIN CAPITAL LETTER O WITH MACRON
   014D ; ALWAYS # LATIN SMALL LETTER O WITH MACRON
   014E ; NEVER # LATIN CAPITAL LETTER O WITH BREVE
   014F ; ALWAYS # LATIN SMALL LETTER O WITH BREVE
   0150 ; NEVER # LATIN CAPITAL LETTER O WITH DOUBLE ACUTE
   0151 ; ALWAYS # LATIN SMALL LETTER O WITH DOUBLE ACUTE
   0152 ; NEVER # LATIN CAPITAL LIGATURE OE
   0153 ; ALWAYS # LATIN SMALL LIGATURE OE
   0154 ; NEVER # LATIN CAPITAL LETTER R WITH ACUTE
   0155 ; ALWAYS # LATIN SMALL LETTER R WITH ACUTE
   0156 ; NEVER # LATIN CAPITAL LETTER R WITH CEDILLA
   0157 ; ALWAYS # LATIN SMALL LETTER R WITH CEDILLA
   0158 ; NEVER # LATIN CAPITAL LETTER R WITH CARON
   0159 ; ALWAYS # LATIN SMALL LETTER R WITH CARON
   015A ; NEVER # LATIN CAPITAL LETTER S WITH ACUTE
   015B ; ALWAYS # LATIN SMALL LETTER S WITH ACUTE
   015C ; NEVER # LATIN CAPITAL LETTER S WITH CIRCUMFLEX
   015D ; ALWAYS # LATIN SMALL LETTER S WITH CIRCUMFLEX
   015E ; NEVER # LATIN CAPITAL LETTER S WITH CEDILLA
   015F ; ALWAYS # LATIN SMALL LETTER S WITH CEDILLA
   0160 ; NEVER # LATIN CAPITAL LETTER S WITH CARON
   0161 ; ALWAYS # LATIN SMALL LETTER S WITH CARON
   0162 ; NEVER # LATIN CAPITAL LETTER T WITH CEDILLA
   0163 ; ALWAYS # LATIN SMALL LETTER T WITH CEDILLA
   0164 ; NEVER # LATIN CAPITAL LETTER T WITH CARON
   0165 ; ALWAYS # LATIN SMALL LETTER T WITH CARON
   0166 ; NEVER # LATIN CAPITAL LETTER T WITH STROKE
   0167 ; ALWAYS # LATIN SMALL LETTER T WITH STROKE
   0168 ; NEVER # LATIN CAPITAL LETTER U WITH TILDE
   0169 ; ALWAYS # LATIN SMALL LETTER U WITH TILDE
   016A ; NEVER # LATIN CAPITAL LETTER U WITH MACRON
   016B ; ALWAYS # LATIN SMALL LETTER U WITH MACRON
   016C ; NEVER # LATIN CAPITAL LETTER U WITH BREVE
   016D ; ALWAYS # LATIN SMALL LETTER U WITH BREVE
   016E ; NEVER # LATIN CAPITAL LETTER U WITH RING ABOVE
   016F ; ALWAYS # LATIN SMALL LETTER U WITH RING ABOVE
   0170 ; NEVER # LATIN CAPITAL LETTER U WITH DOUBLE ACUTE
   0171 ; ALWAYS # LATIN SMALL LETTER U WITH DOUBLE ACUTE
   0172 ; NEVER # LATIN CAPITAL LETTER U WITH OGONEK
   0173 ; ALWAYS # LATIN SMALL LETTER U WITH OGONEK
   0174 ; NEVER # LATIN CAPITAL LETTER W WITH CIRCUMFLEX
   0175 ; ALWAYS # LATIN SMALL LETTER W WITH CIRCUMFLEX
   0176 ; NEVER # LATIN CAPITAL LETTER Y WITH CIRCUMFLEX
   0177 ; ALWAYS # LATIN SMALL LETTER Y WITH CIRCUMFLEX
   0178..0179 ; NEVER # LATIN CAPITAL LETTER Y WITH DIAERESIS..LATIN CA
   017A ; ALWAYS # LATIN SMALL LETTER Z WITH ACUTE
   017B ; NEVER # LATIN CAPITAL LETTER Z WITH DOT ABOVE
   017C ; ALWAYS # LATIN SMALL LETTER Z WITH DOT ABOVE
   017D ; NEVER # LATIN CAPITAL LETTER Z WITH CARON
   017E ; ALWAYS # LATIN SMALL LETTER Z WITH CARON
   017F ; NEVER # LATIN SMALL LETTER LONG S
   0180 ; ALWAYS # LATIN SMALL LETTER B WITH STROKE
   0181..0182 ; NEVER # LATIN CAPITAL LETTER B WITH HOOK..LATIN CAPITAL
   0183 ; ALWAYS # LATIN SMALL LETTER B WITH TOPBAR
   0184 ; NEVER # LATIN CAPITAL LETTER TONE SIX
   0185 ; ALWAYS # LATIN SMALL LETTER TONE SIX
   0186..0187 ; NEVER # LATIN CAPITAL LETTER OPEN O..LATIN CAPITAL LETT
   0188 ; ALWAYS # LATIN SMALL LETTER C WITH HOOK
   0189..018B ; NEVER # LATIN CAPITAL LETTER AFRICAN D..LATIN CAPITAL L
   018C..018D ; ALWAYS # LATIN SMALL LETTER D WITH TOPBAR..LATIN SMALL L
   018E..0191 ; NEVER # LATIN CAPITAL LETTER REVERSED E..LATIN CAPITAL
   0192 ; ALWAYS # LATIN SMALL LETTER F WITH HOOK
   0193..0194 ; NEVER # LATIN CAPITAL LETTER G WITH HOOK..LATIN CAPITAL
   0195 ; ALWAYS # LATIN SMALL LETTER HV
   0196..0198 ; NEVER # LATIN CAPITAL LETTER IOTA..LATIN CAPITAL LETTER
   0199..019B ; ALWAYS # LATIN SMALL LETTER K WITH HOOK..LATIN SMALL LET
   019C..019D ; NEVER # LATIN CAPITAL LETTER TURNED M..LATIN CAPITAL LE
   019E ; ALWAYS # LATIN SMALL LETTER N WITH LONG RIGHT LEG
   019F..01A0 ; NEVER # LATIN CAPITAL LETTER O WITH MIDDLE TILDE..LATIN
   01A1 ; ALWAYS # LATIN SMALL LETTER O WITH HORN
   01A2 ; NEVER # LATIN CAPITAL LETTER OI
   01A3 ; ALWAYS # LATIN SMALL LETTER OI
   01A4 ; NEVER # LATIN CAPITAL LETTER P WITH HOOK
   01A5 ; ALWAYS # LATIN SMALL LETTER P WITH HOOK
   01A6..01A7 ; NEVER # LATIN LETTER YR..LATIN CAPITAL LETTER TONE TWO
   01A8 ; ALWAYS # LATIN SMALL LETTER TONE TWO
   01A9 ; NEVER # LATIN CAPITAL LETTER ESH
   01AA..01AB ; ALWAYS # LATIN LETTER REVERSED ESH LOOP..LATIN SMALL LET
   01AC ; NEVER # LATIN CAPITAL LETTER T WITH HOOK
   01AD ; ALWAYS # LATIN SMALL LETTER T WITH HOOK
   01AE..01AF ; NEVER # LATIN CAPITAL LETTER T WITH RETROFLEX HOOK..LAT
   01B0 ; ALWAYS # LATIN SMALL LETTER U WITH HORN
   01B1..01B3 ; NEVER # LATIN CAPITAL LETTER UPSILON..LATIN CAPITAL LET
   01B4 ; ALWAYS # LATIN SMALL LETTER Y WITH HOOK
   01B5 ; NEVER # LATIN CAPITAL LETTER Z WITH STROKE
   01B6 ; ALWAYS # LATIN SMALL LETTER Z WITH STROKE
   01B7..01B8 ; NEVER # LATIN CAPITAL LETTER EZH..LATIN CAPITAL LETTER
   01B9..01BB ; ALWAYS # LATIN SMALL LETTER EZH REVERSED..LATIN LETTER T
   01BC ; NEVER # LATIN CAPITAL LETTER TONE FIVE
   01BD..01C3 ; ALWAYS # LATIN SMALL LETTER TONE FIVE..LATIN LETTER RETR
   01C4..01CD ; NEVER # LATIN CAPITAL LETTER DZ WITH CARON..LATIN CAPIT
   01CE ; ALWAYS # LATIN SMALL LETTER A WITH CARON
   01CF ; NEVER # LATIN CAPITAL LETTER I WITH CARON
   01D0 ; ALWAYS # LATIN SMALL LETTER I WITH CARON
   01D1 ; NEVER # LATIN CAPITAL LETTER O WITH CARON
   01D2 ; ALWAYS # LATIN SMALL LETTER O WITH CARON
   01D3 ; NEVER # LATIN CAPITAL LETTER U WITH CARON
   01D4 ; ALWAYS # LATIN SMALL LETTER U WITH CARON
   01D5..01DC ; NEVER # LATIN CAPITAL LETTER U WITH DIAERESIS AND MACRO
   01DD ; ALWAYS # LATIN SMALL LETTER TURNED E
   01DE..01E2 ; NEVER # LATIN CAPITAL LETTER A WITH DIAERESIS AND MACRO
   01E3 ; ALWAYS # LATIN SMALL LETTER AE WITH MACRON
   01E4 ; NEVER # LATIN CAPITAL LETTER G WITH STROKE
   01E5 ; ALWAYS # LATIN SMALL LETTER G WITH STROKE
   01E6 ; NEVER # LATIN CAPITAL LETTER G WITH CARON
   01E7 ; ALWAYS # LATIN SMALL LETTER G WITH CARON
   01E8 ; NEVER # LATIN CAPITAL LETTER K WITH CARON
   01E9 ; ALWAYS # LATIN SMALL LETTER K WITH CARON
   01EA ; NEVER # LATIN CAPITAL LETTER O WITH OGONEK
   01EB ; ALWAYS # LATIN SMALL LETTER O WITH OGONEK
   01EC..01EE ; NEVER # LATIN CAPITAL LETTER O WITH OGONEK AND MACRON..
   01EF ; ALWAYS # LATIN SMALL LETTER EZH WITH CARON
   01F0..01F4 ; NEVER # LATIN SMALL LETTER J WITH CARON..LATIN CAPITAL
   01F5 ; ALWAYS # LATIN SMALL LETTER G WITH ACUTE
   01F6..01F8 ; NEVER # LATIN CAPITAL LETTER HWAIR..LATIN CAPITAL LETTE
   01F9 ; ALWAYS # LATIN SMALL LETTER N WITH GRAVE
   01FA..01FC ; NEVER # LATIN CAPITAL LETTER A WITH RING ABOVE AND ACUT
   01FD ; ALWAYS # LATIN SMALL LETTER AE WITH ACUTE
   01FE ; NEVER # LATIN CAPITAL LETTER O WITH STROKE AND ACUTE
   01FF ; ALWAYS # LATIN SMALL LETTER O WITH STROKE AND ACUTE
   0200 ; NEVER # LATIN CAPITAL LETTER A WITH DOUBLE GRAVE
   0201 ; ALWAYS # LATIN SMALL LETTER A WITH DOUBLE GRAVE
   0202 ; NEVER # LATIN CAPITAL LETTER A WITH INVERTED BREVE
   0203 ; ALWAYS # LATIN SMALL LETTER A WITH INVERTED BREVE
   0204 ; NEVER # LATIN CAPITAL LETTER E WITH DOUBLE GRAVE
   0205 ; ALWAYS # LATIN SMALL LETTER E WITH DOUBLE GRAVE
   0206 ; NEVER # LATIN CAPITAL LETTER E WITH INVERTED BREVE
   0207 ; ALWAYS # LATIN SMALL LETTER E WITH INVERTED BREVE
   0208 ; NEVER # LATIN CAPITAL LETTER I WITH DOUBLE GRAVE
   0209 ; ALWAYS # LATIN SMALL LETTER I WITH DOUBLE GRAVE
   020A ; NEVER # LATIN CAPITAL LETTER I WITH INVERTED BREVE
   020B ; ALWAYS # LATIN SMALL LETTER I WITH INVERTED BREVE
   020C ; NEVER # LATIN CAPITAL LETTER O WITH DOUBLE GRAVE
   020D ; ALWAYS # LATIN SMALL LETTER O WITH DOUBLE GRAVE
   020E ; NEVER # LATIN CAPITAL LETTER O WITH INVERTED BREVE
   020F ; ALWAYS # LATIN SMALL LETTER O WITH INVERTED BREVE
   0210 ; NEVER # LATIN CAPITAL LETTER R WITH DOUBLE GRAVE
   0211 ; ALWAYS # LATIN SMALL LETTER R WITH DOUBLE GRAVE
   0212 ; NEVER # LATIN CAPITAL LETTER R WITH INVERTED BREVE
   0213 ; ALWAYS # LATIN SMALL LETTER R WITH INVERTED BREVE
   0214 ; NEVER # LATIN CAPITAL LETTER U WITH DOUBLE GRAVE
   0215 ; ALWAYS # LATIN SMALL LETTER U WITH DOUBLE GRAVE
   0216 ; NEVER # LATIN CAPITAL LETTER U WITH INVERTED BREVE
   0217 ; ALWAYS # LATIN SMALL LETTER U WITH INVERTED BREVE
   0218 ; NEVER # LATIN CAPITAL LETTER S WITH COMMA BELOW
   0219 ; ALWAYS # LATIN SMALL LETTER S WITH COMMA BELOW
   021A ; NEVER # LATIN CAPITAL LETTER T WITH COMMA BELOW
   021B ; ALWAYS # LATIN SMALL LETTER T WITH COMMA BELOW
   021C ; NEVER # LATIN CAPITAL LETTER YOGH
   021D ; ALWAYS # LATIN SMALL LETTER YOGH
   021E ; NEVER # LATIN CAPITAL LETTER H WITH CARON
   021F ; ALWAYS # LATIN SMALL LETTER H WITH CARON
   0220 ; NEVER # LATIN CAPITAL LETTER N WITH LONG RIGHT LEG
   0221 ; ALWAYS # LATIN SMALL LETTER D WITH CURL
   0222 ; NEVER # LATIN CAPITAL LETTER OU
   0223 ; ALWAYS # LATIN SMALL LETTER OU
   0224 ; NEVER # LATIN CAPITAL LETTER Z WITH HOOK
   0225 ; ALWAYS # LATIN SMALL LETTER Z WITH HOOK
   0226 ; NEVER # LATIN CAPITAL LETTER A WITH DOT ABOVE
   0227 ; ALWAYS # LATIN SMALL LETTER A WITH DOT ABOVE
   0228 ; NEVER # LATIN CAPITAL LETTER E WITH CEDILLA
   0229 ; ALWAYS # LATIN SMALL LETTER E WITH CEDILLA
   022A..022E ; NEVER # LATIN CAPITAL LETTER O WITH DIAERESIS AND MACRO
   022F ; ALWAYS # LATIN SMALL LETTER O WITH DOT ABOVE
   0230..0232 ; NEVER # LATIN CAPITAL LETTER O WITH DOT ABOVE AND MACRO
   0233..0239 ; ALWAYS # LATIN SMALL LETTER Y WITH MACRON..LATIN SMALL L
   023A..023B ; NEVER # LATIN CAPITAL LETTER A WITH STROKE..LATIN CAPIT
   023C ; ALWAYS # LATIN SMALL LETTER C WITH STROKE
   023D..023E ; NEVER # LATIN CAPITAL LETTER L WITH BAR..LATIN CAPITAL
   023F..0240 ; ALWAYS # LATIN SMALL LETTER S WITH SWASH TAIL..LATIN SMA
   0241 ; NEVER # LATIN CAPITAL LETTER GLOTTAL STOP
   0242 ; ALWAYS # LATIN SMALL LETTER GLOTTAL STOP
   0243..0246 ; NEVER # LATIN CAPITAL LETTER B WITH STROKE..LATIN CAPIT
   0247 ; ALWAYS # LATIN SMALL LETTER E WITH STROKE
   0248 ; NEVER # LATIN CAPITAL LETTER J WITH STROKE
   0249 ; ALWAYS # LATIN SMALL LETTER J WITH STROKE
   024A ; NEVER # LATIN CAPITAL LETTER SMALL Q WITH HOOK TAIL
   024B ; ALWAYS # LATIN SMALL LETTER Q WITH HOOK TAIL
   024C ; NEVER # LATIN CAPITAL LETTER R WITH STROKE
   024D ; ALWAYS # LATIN SMALL LETTER R WITH STROKE
   024E ; NEVER # LATIN CAPITAL LETTER Y WITH STROKE
   024F..02AF ; ALWAYS # LATIN SMALL LETTER Y WITH STROKE..LATIN SMALL L
   02B0..02B8 ; NEVER # MODIFIER LETTER SMALL H..MODIFIER LETTER SMALL
   02B9..02C1 ; MAYBE YES # MODIFIER LETTER PRIME..MODIFIER LETTER REVERSED
   02C2..02C5 ; MAYBE NOT # MODIFIER LETTER LEFT ARROWHEAD..MODIFIER LETTER
   02C6..02D1 ; MAYBE YES # MODIFIER LETTER CIRCUMFLEX ACCENT..MODIFIER LET
   02D2..02DF ; MAYBE NOT # MODIFIER LETTER CENTRED RIGHT HALF RING..MODIFI
   02E0..02E4 ; NEVER # MODIFIER LETTER SMALL GAMMA..MODIFIER LETTER SM
   02E5..02ED ; MAYBE NOT # MODIFIER LETTER EXTRA-HIGH TONE BAR..MODIFIER L
   02EE ; MAYBE YES # MODIFIER LETTER DOUBLE APOSTROPHE
   02EF..02FF ; MAYBE NOT # MODIFIER LETTER LOW DOWN ARROWHEAD..MODIFIER LE
   0300..033F ; MAYBE YES # COMBINING GRAVE ACCENT..COMBINING DOUBLE OVERLI
   0340..0341 ; MAYBE NOT # COMBINING GRAVE TONE MARK..COMBINING ACUTE TONE
   0342 ; MAYBE YES # COMBINING GREEK PERISPOMENI
   0343..0344 ; MAYBE NOT # COMBINING GREEK KORONIS..COMBINING GREEK DIALYT
   0345..034E ; MAYBE YES # COMBINING GREEK YPOGEGRAMMENI..COMBINING UPWARD
   034F ; MAYBE NOT # COMBINING GRAPHEME JOINER
   0350..036F ; MAYBE YES # COMBINING RIGHT ARROWHEAD ABOVE..COMBINING LATI
   0370..0373 ; MAYBE NOT # <reserved>..<reserved>
   0374..0375 ; NEVER # GREEK NUMERAL SIGN..GREEK LOWER NUMERAL SIGN
   0376..0379 ; MAYBE NOT # <reserved>..<reserved>
   037A..037D ; ALWAYS # GREEK YPOGEGRAMMENI..GREEK SMALL REVERSED DOTTE
   037E..0383 ; MAYBE NOT # GREEK QUESTION MARK..<reserved>
   0384..0386 ; NEVER # GREEK TONOS..GREEK CAPITAL LETTER ALPHA WITH TO
   0387 ; MAYBE NOT # GREEK ANO TELEIA
   0388..038A ; NEVER # GREEK CAPITAL LETTER EPSILON WITH TONOS..GREEK
   038B ; MAYBE NOT # <reserved>
   038C ; NEVER # GREEK CAPITAL LETTER OMICRON WITH TONOS
   038D ; MAYBE NOT # <reserved>
   038E..03A1 ; NEVER # GREEK CAPITAL LETTER UPSILON WITH TONOS..GREEK
   03A2 ; MAYBE NOT # <reserved>
   03A3..03AB ; NEVER # GREEK CAPITAL LETTER SIGMA..GREEK CAPITAL LETTE
   03AC..03AF ; ALWAYS # GREEK SMALL LETTER ALPHA WITH TONOS..GREEK SMAL
   03B0 ; NEVER # GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND T
   03B1..03CE ; ALWAYS # GREEK SMALL LETTER ALPHA..GREEK SMALL LETTER OM
   03CF ; MAYBE NOT # <reserved>
   03D0..03D6 ; NEVER # GREEK BETA SYMBOL..GREEK PI SYMBOL
   03D7 ; ALWAYS # GREEK KAI SYMBOL
   03D8 ; NEVER # GREEK LETTER ARCHAIC KOPPA
   03D9 ; ALWAYS # GREEK SMALL LETTER ARCHAIC KOPPA
   03DA ; NEVER # GREEK LETTER STIGMA
   03DB ; ALWAYS # GREEK SMALL LETTER STIGMA
   03DC ; NEVER # GREEK LETTER DIGAMMA
   03DD ; ALWAYS # GREEK SMALL LETTER DIGAMMA
   03DE ; NEVER # GREEK LETTER KOPPA
   03DF ; ALWAYS # GREEK SMALL LETTER KOPPA
   03E0 ; NEVER # GREEK LETTER SAMPI
   03E1 ; ALWAYS # GREEK SMALL LETTER SAMPI
   03E2 ; MAYBE NOT # COPTIC CAPITAL LETTER SHEI
   03E3 ; MAYBE YES # COPTIC SMALL LETTER SHEI
   03E4 ; MAYBE NOT # COPTIC CAPITAL LETTER FEI
   03E5 ; MAYBE YES # COPTIC SMALL LETTER FEI
   03E6 ; MAYBE NOT # COPTIC CAPITAL LETTER KHEI
   03E7 ; MAYBE YES # COPTIC SMALL LETTER KHEI
   03E8 ; MAYBE NOT # COPTIC CAPITAL LETTER HORI
   03E9 ; MAYBE YES # COPTIC SMALL LETTER HORI
   03EA ; MAYBE NOT # COPTIC CAPITAL LETTER GANGIA
   03EB ; MAYBE YES # COPTIC SMALL LETTER GANGIA
   03EC ; MAYBE NOT # COPTIC CAPITAL LETTER SHIMA
   03ED ; MAYBE YES # COPTIC SMALL LETTER SHIMA
   03EE ; MAYBE NOT # COPTIC CAPITAL LETTER DEI
   03EF ; MAYBE YES # COPTIC SMALL LETTER DEI
   03F0..03F2 ; NEVER # GREEK KAPPA SYMBOL..GREEK LUNATE SIGMA SYMBOL
   03F3 ; ALWAYS # GREEK LETTER YOT
   03F4..03F7 ; NEVER # GREEK CAPITAL THETA SYMBOL..GREEK CAPITAL LETTE
   03F8 ; ALWAYS # GREEK SMALL LETTER SHO
   03F9..03FA ; NEVER # GREEK CAPITAL LUNATE SIGMA SYMBOL..GREEK CAPITA
   03FB..03FC ; ALWAYS # GREEK SMALL LETTER SAN..GREEK RHO WITH STROKE S
   03FD..042F ; NEVER # GREEK CAPITAL REVERSED LUNATE SIGMA SYMBOL..CYR
   0430..045F ; ALWAYS # CYRILLIC SMALL LETTER A..CYRILLIC SMALL LETTER
   0460 ; NEVER # CYRILLIC CAPITAL LETTER OMEGA
   0461 ; ALWAYS # CYRILLIC SMALL LETTER OMEGA
   0462 ; NEVER # CYRILLIC CAPITAL LETTER YAT
   0463 ; ALWAYS # CYRILLIC SMALL LETTER YAT
   0464 ; NEVER # CYRILLIC CAPITAL LETTER IOTIFIED E
   0465 ; ALWAYS # CYRILLIC SMALL LETTER IOTIFIED E
   0466 ; NEVER # CYRILLIC CAPITAL LETTER LITTLE YUS
   0467 ; ALWAYS # CYRILLIC SMALL LETTER LITTLE YUS
   0468 ; NEVER # CYRILLIC CAPITAL LETTER IOTIFIED LITTLE YUS
   0469 ; ALWAYS # CYRILLIC SMALL LETTER IOTIFIED LITTLE YUS
   046A ; NEVER # CYRILLIC CAPITAL LETTER BIG YUS
   046B ; ALWAYS # CYRILLIC SMALL LETTER BIG YUS
   046C ; NEVER # CYRILLIC CAPITAL LETTER IOTIFIED BIG YUS
   046D ; ALWAYS # CYRILLIC SMALL LETTER IOTIFIED BIG YUS
   046E ; NEVER # CYRILLIC CAPITAL LETTER KSI
   046F ; ALWAYS # CYRILLIC SMALL LETTER KSI
   0470 ; NEVER # CYRILLIC CAPITAL LETTER PSI
   0471 ; ALWAYS # CYRILLIC SMALL LETTER PSI
   0472 ; NEVER # CYRILLIC CAPITAL LETTER FITA
   0473 ; ALWAYS # CYRILLIC SMALL LETTER FITA
   0474 ; NEVER # CYRILLIC CAPITAL LETTER IZHITSA
   0475 ; ALWAYS # CYRILLIC SMALL LETTER IZHITSA
   0476 ; NEVER # CYRILLIC CAPITAL LETTER IZHITSA WITH DOUBLE GRA
   0477 ; ALWAYS # CYRILLIC SMALL LETTER IZHITSA WITH DOUBLE GRAVE
   0478 ; NEVER # CYRILLIC CAPITAL LETTER UK
   0479 ; ALWAYS # CYRILLIC SMALL LETTER UK
   047A ; NEVER # CYRILLIC CAPITAL LETTER ROUND OMEGA
   047B ; ALWAYS # CYRILLIC SMALL LETTER ROUND OMEGA
   047C ; NEVER # CYRILLIC CAPITAL LETTER OMEGA WITH TITLO
   047D ; ALWAYS # CYRILLIC SMALL LETTER OMEGA WITH TITLO
   047E ; NEVER # CYRILLIC CAPITAL LETTER OT
   047F ; ALWAYS # CYRILLIC SMALL LETTER OT
   0480 ; NEVER # CYRILLIC CAPITAL LETTER KOPPA
   0481 ; ALWAYS # CYRILLIC SMALL LETTER KOPPA
   0482 ; NEVER # CYRILLIC THOUSANDS SIGN
   0483..0486 ; ALWAYS # COMBINING CYRILLIC TITLO..COMBINING CYRILLIC PS
   0487 ; MAYBE NOT # <reserved>
   0488..048A ; NEVER # COMBINING CYRILLIC HUNDRED THOUSANDS SIGN..CYRI
   048B ; ALWAYS # CYRILLIC SMALL LETTER SHORT I WITH TAIL
   048C ; NEVER # CYRILLIC CAPITAL LETTER SEMISOFT SIGN
   048D ; ALWAYS # CYRILLIC SMALL LETTER SEMISOFT SIGN
   048E ; NEVER # CYRILLIC CAPITAL LETTER ER WITH TICK
   048F ; ALWAYS # CYRILLIC SMALL LETTER ER WITH TICK
   0490 ; NEVER # CYRILLIC CAPITAL LETTER GHE WITH UPTURN
   0491 ; ALWAYS # CYRILLIC SMALL LETTER GHE WITH UPTURN
   0492 ; NEVER # CYRILLIC CAPITAL LETTER GHE WITH STROKE
   0493 ; ALWAYS # CYRILLIC SMALL LETTER GHE WITH STROKE
   0494 ; NEVER # CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
   0495 ; ALWAYS # CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
   0496 ; NEVER # CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
   0497 ; ALWAYS # CYRILLIC SMALL LETTER ZHE WITH DESCENDER
   0498 ; NEVER # CYRILLIC CAPITAL LETTER ZE WITH DESCENDER
   0499 ; ALWAYS # CYRILLIC SMALL LETTER ZE WITH DESCENDER
   049A ; NEVER # CYRILLIC CAPITAL LETTER KA WITH DESCENDER
   049B ; ALWAYS # CYRILLIC SMALL LETTER KA WITH DESCENDER
   049C ; NEVER # CYRILLIC CAPITAL LETTER KA WITH VERTICAL STROKE
   049D ; ALWAYS # CYRILLIC SMALL LETTER KA WITH VERTICAL STROKE
   049E ; NEVER # CYRILLIC CAPITAL LETTER KA WITH STROKE
   049F ; ALWAYS # CYRILLIC SMALL LETTER KA WITH STROKE
   04A0 ; NEVER # CYRILLIC CAPITAL LETTER BASHKIR KA
   04A1 ; ALWAYS # CYRILLIC SMALL LETTER BASHKIR KA
   04A2 ; NEVER # CYRILLIC CAPITAL LETTER EN WITH DESCENDER
   04A3 ; ALWAYS # CYRILLIC SMALL LETTER EN WITH DESCENDER
   04A4 ; NEVER # CYRILLIC CAPITAL LIGATURE EN GHE
   04A5 ; ALWAYS # CYRILLIC SMALL LIGATURE EN GHE
   04A6 ; NEVER # CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
   04A7 ; ALWAYS # CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
   04A8 ; NEVER # CYRILLIC CAPITAL LETTER ABKHASIAN HA
   04A9 ; ALWAYS # CYRILLIC SMALL LETTER ABKHASIAN HA
   04AA ; NEVER # CYRILLIC CAPITAL LETTER ES WITH DESCENDER
   04AB ; ALWAYS # CYRILLIC SMALL LETTER ES WITH DESCENDER
   04AC ; NEVER # CYRILLIC CAPITAL LETTER TE WITH DESCENDER
   04AD ; ALWAYS # CYRILLIC SMALL LETTER TE WITH DESCENDER
   04AE ; NEVER # CYRILLIC CAPITAL LETTER STRAIGHT U
   04AF ; ALWAYS # CYRILLIC SMALL LETTER STRAIGHT U
   04B0 ; NEVER # CYRILLIC CAPITAL LETTER STRAIGHT U WITH STROKE
   04B1 ; ALWAYS # CYRILLIC SMALL LETTER STRAIGHT U WITH STROKE
   04B2 ; NEVER # CYRILLIC CAPITAL LETTER HA WITH DESCENDER
   04B3 ; ALWAYS # CYRILLIC SMALL LETTER HA WITH DESCENDER
   04B4 ; NEVER # CYRILLIC CAPITAL LIGATURE TE TSE
   04B5 ; ALWAYS # CYRILLIC SMALL LIGATURE TE TSE
   04B6 ; NEVER # CYRILLIC CAPITAL LETTER CHE WITH DESCENDER
   04B7 ; ALWAYS # CYRILLIC SMALL LETTER CHE WITH DESCENDER
   04B8 ; NEVER # CYRILLIC CAPITAL LETTER CHE WITH VERTICAL STROK
   04B9 ; ALWAYS # CYRILLIC SMALL LETTER CHE WITH VERTICAL STROKE
   04BA ; NEVER # CYRILLIC CAPITAL LETTER SHHA
   04BB ; ALWAYS # CYRILLIC SMALL LETTER SHHA
   04BC ; NEVER # CYRILLIC CAPITAL LETTER ABKHASIAN CHE
   04BD ; ALWAYS # CYRILLIC SMALL LETTER ABKHASIAN CHE
   04BE ; NEVER # CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESC
   04BF ; ALWAYS # CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCEN
   04C0..04C1 ; NEVER # CYRILLIC LETTER PALOCHKA..CYRILLIC CAPITAL LETT
   04C2 ; ALWAYS # CYRILLIC SMALL LETTER ZHE WITH BREVE
   04C3 ; NEVER # CYRILLIC CAPITAL LETTER KA WITH HOOK
   04C4 ; ALWAYS # CYRILLIC SMALL LETTER KA WITH HOOK
   04C5 ; NEVER # CYRILLIC CAPITAL LETTER EL WITH TAIL
   04C6 ; ALWAYS # CYRILLIC SMALL
LETTER EL WITH TAIL 04C7 ; NEVER # CYRILLIC CAPITAL LETTER EN WITH HOOK 04C8 ; ALWAYS # CYRILLIC SMALL LETTER EN WITH HOOK 04C9 ; NEVER # CYRILLIC CAPITAL LETTER EN WITH TAIL 04CA ; ALWAYS # CYRILLIC SMALL LETTER EN WITH TAIL 04CB ; NEVER # CYRILLIC CAPITAL LETTER KHAKASSIAN CHE 04CC ; ALWAYS # CYRILLIC SMALL LETTER KHAKASSIAN CHE 04CD ; NEVER # CYRILLIC CAPITAL LETTER EM WITH TAIL 04CE..04CF ; ALWAYS # CYRILLIC SMALL LETTER EM WITH TAIL..CYRILLIC SM 04D0 ; NEVER # CYRILLIC CAPITAL LETTER A WITH BREVE 04D1 ; ALWAYS # CYRILLIC SMALL LETTER A WITH BREVE 04D2 ; NEVER # CYRILLIC CAPITAL LETTER A WITH DIAERESIS 04D3 ; ALWAYS # CYRILLIC SMALL LETTER A WITH DIAERESIS 04D4 ; NEVER # CYRILLIC CAPITAL LIGATURE A IE 04D5 ; ALWAYS # CYRILLIC SMALL LIGATURE A IE 04D6 ; NEVER # CYRILLIC CAPITAL LETTER IE WITH BREVE 04D7 ; ALWAYS # CYRILLIC SMALL LETTER IE WITH BREVE 04D8 ; NEVER # CYRILLIC CAPITAL LETTER SCHWA 04D9 ; ALWAYS # CYRILLIC SMALL LETTER SCHWA 04DA ; NEVER # CYRILLIC CAPITAL LETTER SCHWA WITH DIAERESIS 04DB ; ALWAYS # CYRILLIC SMALL LETTER SCHWA WITH DIAERESIS 04DC ; NEVER # CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS 04DD ; ALWAYS # CYRILLIC SMALL LETTER ZHE WITH DIAERESIS Faltstrom Expires November 22, 2007 [Page 17] Internet-Draft Unicode Codepoints May 2007 04DE ; NEVER # CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS 04DF ; ALWAYS # CYRILLIC SMALL LETTER ZE WITH DIAERESIS 04E0 ; NEVER # CYRILLIC CAPITAL LETTER ABKHASIAN DZE 04E1 ; ALWAYS # CYRILLIC SMALL LETTER ABKHASIAN DZE 04E2 ; NEVER # CYRILLIC CAPITAL LETTER I WITH MACRON 04E3 ; ALWAYS # CYRILLIC SMALL LETTER I WITH MACRON 04E4 ; NEVER # CYRILLIC CAPITAL LETTER I WITH DIAERESIS 04E5 ; ALWAYS # CYRILLIC SMALL LETTER I WITH DIAERESIS 04E6 ; NEVER # CYRILLIC CAPITAL LETTER O WITH DIAERESIS 04E7 ; ALWAYS # CYRILLIC SMALL LETTER O WITH DIAERESIS 04E8 ; NEVER # CYRILLIC CAPITAL LETTER BARRED O 04E9 ; ALWAYS # CYRILLIC SMALL LETTER BARRED O 04EA ; NEVER # CYRILLIC CAPITAL LETTER BARRED O WITH DIAERESIS 04EB ; ALWAYS # 
CYRILLIC SMALL LETTER BARRED O WITH DIAERESIS 04EC ; NEVER # CYRILLIC CAPITAL LETTER E WITH DIAERESIS 04ED ; ALWAYS # CYRILLIC SMALL LETTER E WITH DIAERESIS 04EE ; NEVER # CYRILLIC CAPITAL LETTER U WITH MACRON 04EF ; ALWAYS # CYRILLIC SMALL LETTER U WITH MACRON 04F0 ; NEVER # CYRILLIC CAPITAL LETTER U WITH DIAERESIS 04F1 ; ALWAYS # CYRILLIC SMALL LETTER U WITH DIAERESIS 04F2 ; NEVER # CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE 04F3 ; ALWAYS # CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE 04F4 ; NEVER # CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS 04F5 ; ALWAYS # CYRILLIC SMALL LETTER CHE WITH DIAERESIS 04F6 ; NEVER # CYRILLIC CAPITAL LETTER GHE WITH DESCENDER 04F7 ; ALWAYS # CYRILLIC SMALL LETTER GHE WITH DESCENDER 04F8 ; NEVER # CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS 04F9 ; ALWAYS # CYRILLIC SMALL LETTER YERU WITH DIAERESIS 04FA ; NEVER # CYRILLIC CAPITAL LETTER GHE WITH STROKE AND HOO 04FB ; ALWAYS # CYRILLIC SMALL LETTER GHE WITH STROKE AND HOOK 04FC ; NEVER # CYRILLIC CAPITAL LETTER HA WITH HOOK 04FD ; ALWAYS # CYRILLIC SMALL LETTER HA WITH HOOK 04FE ; NEVER # CYRILLIC CAPITAL LETTER HA WITH STROKE 04FF ; ALWAYS # CYRILLIC SMALL LETTER HA WITH STROKE 0500 ; NEVER # CYRILLIC CAPITAL LETTER KOMI DE 0501 ; ALWAYS # CYRILLIC SMALL LETTER KOMI DE 0502 ; NEVER # CYRILLIC CAPITAL LETTER KOMI DJE 0503 ; ALWAYS # CYRILLIC SMALL LETTER KOMI DJE 0504 ; NEVER # CYRILLIC CAPITAL LETTER KOMI ZJE 0505 ; ALWAYS # CYRILLIC SMALL LETTER KOMI ZJE 0506 ; NEVER # CYRILLIC CAPITAL LETTER KOMI DZJE 0507 ; ALWAYS # CYRILLIC SMALL LETTER KOMI DZJE 0508 ; NEVER # CYRILLIC CAPITAL LETTER KOMI LJE 0509 ; ALWAYS # CYRILLIC SMALL LETTER KOMI LJE 050A ; NEVER # CYRILLIC CAPITAL LETTER KOMI NJE 050B ; ALWAYS # CYRILLIC SMALL LETTER KOMI NJE 050C ; NEVER # CYRILLIC CAPITAL LETTER KOMI SJE 050D ; ALWAYS # CYRILLIC SMALL LETTER KOMI SJE Faltstrom Expires November 22, 2007 [Page 18] Internet-Draft Unicode Codepoints May 2007 050E ; NEVER # CYRILLIC CAPITAL LETTER KOMI TJE 050F ; ALWAYS # CYRILLIC 
SMALL LETTER KOMI TJE 0510 ; NEVER # CYRILLIC CAPITAL LETTER REVERSED ZE 0511 ; ALWAYS # CYRILLIC SMALL LETTER REVERSED ZE 0512 ; NEVER # CYRILLIC CAPITAL LETTER EL WITH HOOK 0513 ; ALWAYS # CYRILLIC SMALL LETTER EL WITH HOOK 0514..0558 ; MAYBE NOT # .. 0559 ; MAYBE YES # ARMENIAN MODIFIER LETTER LEFT HALF RING 055A..0560 ; MAYBE NOT # ARMENIAN APOSTROPHE.. 0561..0586 ; MAYBE YES # ARMENIAN SMALL LETTER AYB..ARMENIAN SMALL LETTE 0587..0590 ; MAYBE NOT # ARMENIAN SMALL LIGATURE ECH YIWN.. 0591..05BD ; MAYBE YES # HEBREW ACCENT ETNAHTA..HEBREW POINT METEG 05BE ; MAYBE NOT # HEBREW PUNCTUATION MAQAF 05BF ; MAYBE YES # HEBREW POINT RAFE 05C0 ; MAYBE NOT # HEBREW PUNCTUATION PASEQ 05C1..05C2 ; MAYBE YES # HEBREW POINT SHIN DOT..HEBREW POINT SIN DOT 05C3 ; MAYBE NOT # HEBREW PUNCTUATION SOF PASUQ 05C4..05C5 ; MAYBE YES # HEBREW MARK UPPER DOT..HEBREW MARK LOWER DOT 05C6 ; MAYBE NOT # HEBREW PUNCTUATION NUN HAFUKHA 05C7 ; MAYBE YES # HEBREW POINT QAMATS QATAN 05C8..05CF ; MAYBE NOT # .. 05D0..05EA ; MAYBE YES # HEBREW LETTER ALEF..HEBREW LETTER TAV 05EB..05EF ; MAYBE NOT # .. 05F0..05F2 ; MAYBE YES # HEBREW LIGATURE YIDDISH DOUBLE VAV..HEBREW LIGA 05F3..060F ; MAYBE NOT # HEBREW PUNCTUATION GERESH..ARABIC SIGN MISRA 0610..0615 ; MAYBE YES # ARABIC SIGN SALLALLAHOU ALAYHE WASSALLAM..ARABI 0616..0620 ; MAYBE NOT # .. 0621..063A ; MAYBE YES # ARABIC LETTER HAMZA..ARABIC LETTER GHAIN 063B..063F ; MAYBE NOT # .. 
0640..065E ; MAYBE YES # ARABIC TATWEEL..ARABIC FATHA WITH TWO DOTS
065F ; MAYBE NOT #
0660..0669 ; MAYBE YES # ARABIC-INDIC DIGIT ZERO..ARABIC-INDIC DIGIT NIN
066A..066D ; MAYBE NOT # ARABIC PERCENT SIGN..ARABIC FIVE POINTED STAR
066E..0674 ; MAYBE YES # ARABIC LETTER DOTLESS BEH..ARABIC LETTER HIGH H
0675..0678 ; MAYBE NOT # ARABIC LETTER HIGH HAMZA ALEF..ARABIC LETTER HI
0679..06D3 ; MAYBE YES # ARABIC LETTER TTEH..ARABIC LETTER YEH BARREE WI
06D4 ; MAYBE NOT # ARABIC FULL STOP
06D5..06DC ; MAYBE YES # ARABIC LETTER AE..ARABIC SMALL HIGH SEEN
06DD..06DE ; MAYBE NOT # ARABIC END OF AYAH..ARABIC START OF RUB EL HIZB
06DF..06E8 ; MAYBE YES # ARABIC SMALL HIGH ROUNDED ZERO..ARABIC SMALL HI
06E9 ; MAYBE NOT # ARABIC PLACE OF SAJDAH
06EA..06FC ; MAYBE YES # ARABIC EMPTY CENTRE LOW STOP..ARABIC LETTER GHA
06FD..06FE ; MAYBE NOT # ARABIC SIGN SINDHI AMPERSAND..ARABIC SIGN SINDH
06FF ; MAYBE YES # ARABIC LETTER HEH WITH INVERTED V
0700..070F ; MAYBE NOT # SYRIAC END OF PARAGRAPH..SYRIAC ABBREVIATION MA
0710..074A ; MAYBE YES # SYRIAC LETTER ALAPH..SYRIAC BARREKH
074B..074C ; MAYBE NOT # ..
074D..076D ; MAYBE YES # SYRIAC LETTER SOGDIAN ZHAIN..ARABIC LETTER SEEN
076E..077F ; MAYBE NOT # ..
0780..07B1 ; MAYBE YES # THAANA LETTER HAA..THAANA LETTER NAA
07B2..07BF ; MAYBE NOT # ..
07C0..07F5 ; MAYBE YES # NKO DIGIT ZERO..NKO LOW TONE APOSTROPHE
07F6..07F9 ; MAYBE NOT # NKO SYMBOL OO DENNEN..NKO EXCLAMATION MARK
07FA ; MAYBE YES # NKO LAJANYALAN
07FB..0900 ; MAYBE NOT # ..
0901..0939 ; MAYBE YES # DEVANAGARI SIGN CANDRABINDU..DEVANAGARI LETTER
093A..093B ; MAYBE NOT # ..
093C..094D ; MAYBE YES # DEVANAGARI SIGN NUKTA..DEVANAGARI SIGN VIRAMA
094E..094F ; MAYBE NOT # ..
0950..0954 ; MAYBE YES # DEVANAGARI OM..DEVANAGARI ACUTE ACCENT
0955..095F ; MAYBE NOT # ..DEVANAGARI LETTER YYA
0960..0963 ; MAYBE YES # DEVANAGARI LETTER VOCALIC RR..DEVANAGARI VOWEL
0964..0965 ; MAYBE NOT # DEVANAGARI DANDA..DEVANAGARI DOUBLE DANDA
0966..096F ; MAYBE YES # DEVANAGARI DIGIT ZERO..DEVANAGARI DIGIT NINE
0970..097A ; MAYBE NOT # DEVANAGARI ABBREVIATION SIGN..
097B..097F ; MAYBE YES # DEVANAGARI LETTER GGA..DEVANAGARI LETTER BBA
0980 ; MAYBE NOT #
0981..0983 ; MAYBE YES # BENGALI SIGN CANDRABINDU..BENGALI SIGN VISARGA
0984 ; MAYBE NOT #
0985..098C ; MAYBE YES # BENGALI LETTER A..BENGALI LETTER VOCALIC L
098D..098E ; MAYBE NOT # ..
098F..0990 ; MAYBE YES # BENGALI LETTER E..BENGALI LETTER AI
0991..0992 ; MAYBE NOT # ..
0993..09A8 ; MAYBE YES # BENGALI LETTER O..BENGALI LETTER NA
09A9 ; MAYBE NOT #
09AA..09B0 ; MAYBE YES # BENGALI LETTER PA..BENGALI LETTER RA
09B1 ; MAYBE NOT #
09B2 ; MAYBE YES # BENGALI LETTER LA
09B3..09B5 ; MAYBE NOT # ..
09B6..09B9 ; MAYBE YES # BENGALI LETTER SHA..BENGALI LETTER HA
09BA..09BB ; MAYBE NOT # ..
09BC..09C4 ; MAYBE YES # BENGALI SIGN NUKTA..BENGALI VOWEL SIGN VOCALIC
09C5..09C6 ; MAYBE NOT # ..
09C7..09C8 ; MAYBE YES # BENGALI VOWEL SIGN E..BENGALI VOWEL SIGN AI
09C9..09CC ; MAYBE NOT # ..BENGALI VOWEL SIGN AU
09CD..09CE ; MAYBE YES # BENGALI SIGN VIRAMA..BENGALI LETTER KHANDA TA
09CF..09D6 ; MAYBE NOT # ..
09D7 ; MAYBE YES # BENGALI AU LENGTH MARK
09D8..09DF ; MAYBE NOT # ..BENGALI LETTER YYA
09E0..09E3 ; MAYBE YES # BENGALI LETTER VOCALIC RR..BENGALI VOWEL SIGN V
09E4..09E5 ; MAYBE NOT # ..
09E6..09F1 ; MAYBE YES # BENGALI DIGIT ZERO..BENGALI LETTER RA WITH LOWE
09F2..0A00 ; MAYBE NOT # BENGALI RUPEE MARK..
0A01..0A03 ; MAYBE YES # GURMUKHI SIGN ADAK BINDI..GURMUKHI SIGN VISARGA
0A04 ; MAYBE NOT #
0A05..0A0A ; MAYBE YES # GURMUKHI LETTER A..GURMUKHI LETTER UU
0A0B..0A0E ; MAYBE NOT # ..
0A0F..0A10 ; MAYBE YES # GURMUKHI LETTER EE..GURMUKHI LETTER AI
0A11..0A12 ; MAYBE NOT # ..
0A13..0A28 ; MAYBE YES # GURMUKHI LETTER OO..GURMUKHI LETTER NA
0A29 ; MAYBE NOT #
0A2A..0A30 ; MAYBE YES # GURMUKHI LETTER PA..GURMUKHI LETTER RA
0A31 ; MAYBE NOT #
0A32 ; MAYBE YES # GURMUKHI LETTER LA
0A33..0A34 ; MAYBE NOT # GURMUKHI LETTER LLA..
0A35 ; MAYBE YES # GURMUKHI LETTER VA
0A36..0A37 ; MAYBE NOT # GURMUKHI LETTER SHA..
0A38..0A39 ; MAYBE YES # GURMUKHI LETTER SA..GURMUKHI LETTER HA
0A3A..0A3B ; MAYBE NOT # ..
0A3C ; MAYBE YES # GURMUKHI SIGN NUKTA
0A3D ; MAYBE NOT #
0A3E..0A42 ; MAYBE YES # GURMUKHI VOWEL SIGN AA..GURMUKHI VOWEL SIGN UU
0A43..0A46 ; MAYBE NOT # ..
0A47..0A48 ; MAYBE YES # GURMUKHI VOWEL SIGN EE..GURMUKHI VOWEL SIGN AI
0A49..0A4A ; MAYBE NOT # ..
0A4B..0A4D ; MAYBE YES # GURMUKHI VOWEL SIGN OO..GURMUKHI SIGN VIRAMA
0A4E..0A5B ; MAYBE NOT # ..GURMUKHI LETTER ZA
0A5C ; MAYBE YES # GURMUKHI LETTER RRA
0A5D..0A65 ; MAYBE NOT # ..
0A66..0A74 ; MAYBE YES # GURMUKHI DIGIT ZERO..GURMUKHI EK ONKAR
0A75..0A80 ; MAYBE NOT # ..
0A81..0A83 ; MAYBE YES # GUJARATI SIGN CANDRABINDU..GUJARATI SIGN VISARG
0A84 ; MAYBE NOT #
0A85..0A8D ; MAYBE YES # GUJARATI LETTER A..GUJARATI VOWEL CANDRA E
0A8E ; MAYBE NOT #
0A8F..0A91 ; MAYBE YES # GUJARATI LETTER E..GUJARATI VOWEL CANDRA O
0A92 ; MAYBE NOT #
0A93..0AA8 ; MAYBE YES # GUJARATI LETTER O..GUJARATI LETTER NA
0AA9 ; MAYBE NOT #
0AAA..0AB0 ; MAYBE YES # GUJARATI LETTER PA..GUJARATI LETTER RA
0AB1 ; MAYBE NOT #
0AB2..0AB3 ; MAYBE YES # GUJARATI LETTER LA..GUJARATI LETTER LLA
0AB4 ; MAYBE NOT #
0AB5..0AB9 ; MAYBE YES # GUJARATI LETTER VA..GUJARATI LETTER HA
0ABA..0ABB ; MAYBE NOT # ..
0ABC..0AC5 ; MAYBE YES # GUJARATI SIGN NUKTA..GUJARATI VOWEL SIGN CANDRA
0AC6 ; MAYBE NOT #
0AC7..0AC9 ; MAYBE YES # GUJARATI VOWEL SIGN E..GUJARATI VOWEL SIGN CAND
0ACA ; MAYBE NOT #
0ACB..0ACD ; MAYBE YES # GUJARATI VOWEL SIGN O..GUJARATI SIGN VIRAMA
0ACE..0ACF ; MAYBE NOT # ..
0AD0 ; MAYBE YES # GUJARATI OM
0AD1..0ADF ; MAYBE NOT # ..
0AE0..0AE3 ; MAYBE YES # GUJARATI LETTER VOCALIC RR..GUJARATI VOWEL SIGN
0AE4..0AE5 ; MAYBE NOT # ..
0AE6..0AEF ; MAYBE YES # GUJARATI DIGIT ZERO..GUJARATI DIGIT NINE
0AF0..0B00 ; MAYBE NOT # ..
0B01..0B03 ; MAYBE YES # ORIYA SIGN CANDRABINDU..ORIYA SIGN VISARGA
0B04 ; MAYBE NOT #
0B05..0B0C ; MAYBE YES # ORIYA LETTER A..ORIYA LETTER VOCALIC L
0B0D..0B0E ; MAYBE NOT # ..
0B0F..0B10 ; MAYBE YES # ORIYA LETTER E..ORIYA LETTER AI
0B11..0B12 ; MAYBE NOT # ..
0B13..0B28 ; MAYBE YES # ORIYA LETTER O..ORIYA LETTER NA
0B29 ; MAYBE NOT #
0B2A..0B30 ; MAYBE YES # ORIYA LETTER PA..ORIYA LETTER RA
0B31 ; MAYBE NOT #
0B32..0B33 ; MAYBE YES # ORIYA LETTER LA..ORIYA LETTER LLA
0B34 ; MAYBE NOT #
0B35..0B39 ; MAYBE YES # ORIYA LETTER VA..ORIYA LETTER HA
0B3A..0B3B ; MAYBE NOT # ..
0B3C..0B43 ; MAYBE YES # ORIYA SIGN NUKTA..ORIYA VOWEL SIGN VOCALIC R
0B44..0B46 ; MAYBE NOT # ..
0B47 ; MAYBE YES # ORIYA VOWEL SIGN E
0B48..0B4C ; MAYBE NOT # ORIYA VOWEL SIGN AI..ORIYA VOWEL SIGN AU
0B4D ; MAYBE YES # ORIYA SIGN VIRAMA
0B4E..0B55 ; MAYBE NOT # ..
0B56..0B57 ; MAYBE YES # ORIYA AI LENGTH MARK..ORIYA AU LENGTH MARK
0B58..0B5E ; MAYBE NOT # ..
0B5F..0B61 ; MAYBE YES # ORIYA LETTER YYA..ORIYA LETTER VOCALIC LL
0B62..0B65 ; MAYBE NOT # ..
0B66..0B6F ; MAYBE YES # ORIYA DIGIT ZERO..ORIYA DIGIT NINE
0B70 ; MAYBE NOT # ORIYA ISSHAR
0B71 ; MAYBE YES # ORIYA LETTER WA
0B72..0B81 ; MAYBE NOT # ..
0B82..0B83 ; MAYBE YES # TAMIL SIGN ANUSVARA..TAMIL SIGN VISARGA
0B84 ; MAYBE NOT #
0B85..0B8A ; MAYBE YES # TAMIL LETTER A..TAMIL LETTER UU
0B8B..0B8D ; MAYBE NOT # ..
0B8E..0B90 ; MAYBE YES # TAMIL LETTER E..TAMIL LETTER AI
0B91 ; MAYBE NOT #
0B92..0B93 ; MAYBE YES # TAMIL LETTER O..TAMIL LETTER OO
0B94 ; MAYBE NOT # TAMIL LETTER AU
0B95 ; MAYBE YES # TAMIL LETTER KA
0B96..0B98 ; MAYBE NOT # ..
0B99..0B9A ; MAYBE YES # TAMIL LETTER NGA..TAMIL LETTER CA
0B9B ; MAYBE NOT #
0B9C ; MAYBE YES # TAMIL LETTER JA
0B9D ; MAYBE NOT #
0B9E..0B9F ; MAYBE YES # TAMIL LETTER NYA..TAMIL LETTER TTA
0BA0..0BA2 ; MAYBE NOT # ..
0BA3..0BA4 ; MAYBE YES # TAMIL LETTER NNA..TAMIL LETTER TA
0BA5..0BA7 ; MAYBE NOT # ..
0BA8..0BAA ; MAYBE YES # TAMIL LETTER NA..TAMIL LETTER PA
0BAB..0BAD ; MAYBE NOT # ..
0BAE..0BB9 ; MAYBE YES # TAMIL LETTER MA..TAMIL LETTER HA
0BBA..0BBD ; MAYBE NOT # ..
0BBE..0BC2 ; MAYBE YES # TAMIL VOWEL SIGN AA..TAMIL VOWEL SIGN UU
0BC3..0BC5 ; MAYBE NOT # ..
0BC6..0BC8 ; MAYBE YES # TAMIL VOWEL SIGN E..TAMIL VOWEL SIGN AI
0BC9..0BCC ; MAYBE NOT # ..TAMIL VOWEL SIGN AU
0BCD ; MAYBE YES # TAMIL SIGN VIRAMA
0BCE..0BD6 ; MAYBE NOT # ..
0BD7 ; MAYBE YES # TAMIL AU LENGTH MARK
0BD8..0BE5 ; MAYBE NOT # ..
0BE6..0BEF ; MAYBE YES # TAMIL DIGIT ZERO..TAMIL DIGIT NINE
0BF0..0C00 ; MAYBE NOT # TAMIL NUMBER TEN..
0C01..0C03 ; MAYBE YES # TELUGU SIGN CANDRABINDU..TELUGU SIGN VISARGA
0C04 ; MAYBE NOT #
0C05..0C0C ; MAYBE YES # TELUGU LETTER A..TELUGU LETTER VOCALIC L
0C0D ; MAYBE NOT #
0C0E..0C10 ; MAYBE YES # TELUGU LETTER E..TELUGU LETTER AI
0C11 ; MAYBE NOT #
0C12..0C28 ; MAYBE YES # TELUGU LETTER O..TELUGU LETTER NA
0C29 ; MAYBE NOT #
0C2A..0C33 ; MAYBE YES # TELUGU LETTER PA..TELUGU LETTER LLA
0C34 ; MAYBE NOT #
0C35..0C39 ; MAYBE YES # TELUGU LETTER VA..TELUGU LETTER HA
0C3A..0C3D ; MAYBE NOT # ..
0C3E..0C44 ; MAYBE YES # TELUGU VOWEL SIGN AA..TELUGU VOWEL SIGN VOCALIC
0C45 ; MAYBE NOT #
0C46..0C48 ; MAYBE YES # TELUGU VOWEL SIGN E..TELUGU VOWEL SIGN AI
0C49 ; MAYBE NOT #
0C4A..0C4D ; MAYBE YES # TELUGU VOWEL SIGN O..TELUGU SIGN VIRAMA
0C4E..0C54 ; MAYBE NOT # ..
0C55..0C56 ; MAYBE YES # TELUGU LENGTH MARK..TELUGU AI LENGTH MARK
0C57..0C5F ; MAYBE NOT # ..
0C60..0C61 ; MAYBE YES # TELUGU LETTER VOCALIC RR..TELUGU LETTER VOCALIC
0C62..0C65 ; MAYBE NOT # ..
0C66..0C6F ; MAYBE YES # TELUGU DIGIT ZERO..TELUGU DIGIT NINE
0C70..0C81 ; MAYBE NOT # ..
0C82..0C83 ; MAYBE YES # KANNADA SIGN ANUSVARA..KANNADA SIGN VISARGA
0C84 ; MAYBE NOT #
0C85..0C8C ; MAYBE YES # KANNADA LETTER A..KANNADA LETTER VOCALIC L
0C8D ; MAYBE NOT #
0C8E..0C90 ; MAYBE YES # KANNADA LETTER E..KANNADA LETTER AI
0C91 ; MAYBE NOT #
0C92..0CA8 ; MAYBE YES # KANNADA LETTER O..KANNADA LETTER NA
0CA9 ; MAYBE NOT #
0CAA..0CB3 ; MAYBE YES # KANNADA LETTER PA..KANNADA LETTER LLA
0CB4 ; MAYBE NOT #
0CB5..0CB9 ; MAYBE YES # KANNADA LETTER VA..KANNADA LETTER HA
0CBA..0CBB ; MAYBE NOT # ..
0CBC..0CBF ; MAYBE YES # KANNADA SIGN NUKTA..KANNADA VOWEL SIGN I
0CC0 ; MAYBE NOT # KANNADA VOWEL SIGN II
0CC1..0CC4 ; MAYBE YES # KANNADA VOWEL SIGN U..KANNADA VOWEL SIGN VOCALI
0CC5 ; MAYBE NOT #
0CC6 ; MAYBE YES # KANNADA VOWEL SIGN E
0CC7..0CCB ; MAYBE NOT # KANNADA VOWEL SIGN EE..KANNADA VOWEL SIGN OO
0CCC..0CCD ; MAYBE YES # KANNADA VOWEL SIGN AU..KANNADA SIGN VIRAMA
0CCE..0CD4 ; MAYBE NOT # ..
0CD5..0CD6 ; MAYBE YES # KANNADA LENGTH MARK..KANNADA AI LENGTH MARK
0CD7..0CDD ; MAYBE NOT # ..
0CDE ; MAYBE YES # KANNADA LETTER FA
0CDF ; MAYBE NOT #
0CE0..0CE3 ; MAYBE YES # KANNADA LETTER VOCALIC RR..KANNADA VOWEL SIGN V
0CE4..0CE5 ; MAYBE NOT # ..
0CE6..0CEF ; MAYBE YES # KANNADA DIGIT ZERO..KANNADA DIGIT NINE
0CF0..0D01 ; MAYBE NOT # ..
0D02..0D03 ; MAYBE YES # MALAYALAM SIGN ANUSVARA..MALAYALAM SIGN VISARGA
0D04 ; MAYBE NOT #
0D05..0D0C ; MAYBE YES # MALAYALAM LETTER A..MALAYALAM LETTER VOCALIC L
0D0D ; MAYBE NOT #
0D0E..0D10 ; MAYBE YES # MALAYALAM LETTER E..MALAYALAM LETTER AI
0D11 ; MAYBE NOT #
0D12..0D28 ; MAYBE YES # MALAYALAM LETTER O..MALAYALAM LETTER NA
0D29 ; MAYBE NOT #
0D2A..0D39 ; MAYBE YES # MALAYALAM LETTER PA..MALAYALAM LETTER HA
0D3A..0D3D ; MAYBE NOT # ..
0D3E..0D43 ; MAYBE YES # MALAYALAM VOWEL SIGN AA..MALAYALAM VOWEL SIGN V
0D44..0D45 ; MAYBE NOT # ..
0D46..0D48 ; MAYBE YES # MALAYALAM VOWEL SIGN E..MALAYALAM VOWEL SIGN AI
0D49..0D4C ; MAYBE NOT # ..MALAYALAM VOWEL SIGN AU
0D4D ; MAYBE YES # MALAYALAM SIGN VIRAMA
0D4E..0D56 ; MAYBE NOT # ..
0D57 ; MAYBE YES # MALAYALAM AU LENGTH MARK
0D58..0D5F ; MAYBE NOT # ..
0D60..0D61 ; MAYBE YES # MALAYALAM LETTER VOCALIC RR..MALAYALAM LETTER V
0D62..0D65 ; MAYBE NOT # ..
0D66..0D6F ; MAYBE YES # MALAYALAM DIGIT ZERO..MALAYALAM DIGIT NINE
0D70..0D81 ; MAYBE NOT # ..
0D82..0D83 ; MAYBE YES # SINHALA SIGN ANUSVARAYA..SINHALA SIGN VISARGAYA
0D84 ; MAYBE NOT #
0D85..0D96 ; MAYBE YES # SINHALA LETTER AYANNA..SINHALA LETTER AUYANNA
0D97..0D99 ; MAYBE NOT # ..
0D9A..0DB1 ; MAYBE YES # SINHALA LETTER ALPAPRAANA KAYANNA..SINHALA LETT
0DB2 ; MAYBE NOT #
0DB3..0DBB ; MAYBE YES # SINHALA LETTER SANYAKA DAYANNA..SINHALA LETTER
0DBC ; MAYBE NOT #
0DBD ; MAYBE YES # SINHALA LETTER DANTAJA LAYANNA
0DBE..0DBF ; MAYBE NOT # ..
0DC0..0DC6 ; MAYBE YES # SINHALA LETTER VAYANNA..SINHALA LETTER FAYANNA
0DC7..0DC9 ; MAYBE NOT # ..
0DCA ; MAYBE YES # SINHALA SIGN AL-LAKUNA
0DCB..0DCE ; MAYBE NOT # ..
0DCF..0DD4 ; MAYBE YES # SINHALA VOWEL SIGN AELA-PILLA..SINHALA VOWEL SI
0DD5 ; MAYBE NOT #
0DD6 ; MAYBE YES # SINHALA VOWEL SIGN DIGA PAA-PILLA
0DD7 ; MAYBE NOT #
0DD8..0DDB ; MAYBE YES # SINHALA VOWEL SIGN GAETTA-PILLA..SINHALA VOWEL
0DDC..0DDE ; MAYBE NOT # SINHALA VOWEL SIGN KOMBUVA HAA AELA-PILLA..SINH
0DDF ; MAYBE YES # SINHALA VOWEL SIGN GAYANUKITTA
0DE0..0DF1 ; MAYBE NOT # ..
0DF2..0DF3 ; MAYBE YES # SINHALA VOWEL SIGN DIGA GAETTA-PILLA..SINHALA V
0DF4..0E00 ; MAYBE NOT # SINHALA PUNCTUATION KUNDDALIYA..
0E01..0E32 ; MAYBE YES # THAI CHARACTER KO KAI..THAI CHARACTER SARA AA
0E33 ; MAYBE NOT # THAI CHARACTER SARA AM
0E34..0E3A ; MAYBE YES # THAI CHARACTER SARA I..THAI CHARACTER PHINTHU
0E3B..0E3F ; MAYBE NOT # ..THAI CURRENCY SYMBOL BAHT
0E40..0E4E ; MAYBE YES # THAI CHARACTER SARA E..THAI CHARACTER YAMAKKAN
0E4F ; MAYBE NOT # THAI CHARACTER FONGMAN
0E50..0E59 ; MAYBE YES # THAI DIGIT ZERO..THAI DIGIT NINE
0E5A..0E80 ; MAYBE NOT # THAI CHARACTER ANGKHANKHU..
0E81..0E82 ; MAYBE YES # LAO LETTER KO..LAO LETTER KHO SUNG
0E83 ; MAYBE NOT #
0E84 ; MAYBE YES # LAO LETTER KHO TAM
0E85..0E86 ; MAYBE NOT # ..
0E87..0E88 ; MAYBE YES # LAO LETTER NGO..LAO LETTER CO
0E89 ; MAYBE NOT #
0E8A ; MAYBE YES # LAO LETTER SO TAM
0E8B..0E8C ; MAYBE NOT # ..
0E8D ; MAYBE YES # LAO LETTER NYO
0E8E..0E93 ; MAYBE NOT # ..
0E94..0E97 ; MAYBE YES # LAO LETTER DO..LAO LETTER THO TAM
0E98 ; MAYBE NOT #
0E99..0E9F ; MAYBE YES # LAO LETTER NO..LAO LETTER FO SUNG
0EA0 ; MAYBE NOT #
0EA1..0EA3 ; MAYBE YES # LAO LETTER MO..LAO LETTER LO LING
0EA4 ; MAYBE NOT #
0EA5 ; MAYBE YES # LAO LETTER LO LOOT
0EA6 ; MAYBE NOT #
0EA7 ; MAYBE YES # LAO LETTER WO
0EA8..0EA9 ; MAYBE NOT # ..
0EAA..0EAB ; MAYBE YES # LAO LETTER SO SUNG..LAO LETTER HO SUNG
0EAC ; MAYBE NOT #
0EAD..0EB2 ; MAYBE YES # LAO LETTER O..LAO VOWEL SIGN AA
0EB3 ; MAYBE NOT # LAO VOWEL SIGN AM
0EB4..0EB9 ; MAYBE YES # LAO VOWEL SIGN I..LAO VOWEL SIGN UU
0EBA ; MAYBE NOT #
0EBB..0EBD ; MAYBE YES # LAO VOWEL SIGN MAI KON..LAO SEMIVOWEL SIGN NYO
0EBE..0EBF ; MAYBE NOT # ..
0EC0..0EC4 ; MAYBE YES # LAO VOWEL SIGN E..LAO VOWEL SIGN AI
0EC5 ; MAYBE NOT #
0EC6 ; MAYBE YES # LAO KO LA
0EC7 ; MAYBE NOT #
0EC8..0ECD ; MAYBE YES # LAO TONE MAI EK..LAO NIGGAHITA
0ECE..0ECF ; MAYBE NOT # ..
0ED0..0ED9 ; MAYBE YES # LAO DIGIT ZERO..LAO DIGIT NINE
0EDA..0EFF ; MAYBE NOT # ..
0F00 ; MAYBE YES # TIBETAN SYLLABLE OM
0F01..0F17 ; MAYBE NOT # TIBETAN MARK GTER YIG MGO TRUNCATED A..TIBETAN
0F18..0F19 ; MAYBE YES # TIBETAN ASTROLOGICAL SIGN -KHYUD PA..TIBETAN AS
0F1A..0F1F ; MAYBE NOT # TIBETAN SIGN RDEL DKAR GCIG..TIBETAN SIGN RDEL
0F20..0F29 ; MAYBE YES # TIBETAN DIGIT ZERO..TIBETAN DIGIT NINE
0F2A..0F34 ; MAYBE NOT # TIBETAN DIGIT HALF ONE..TIBETAN MARK BSDUS RTAG
0F35 ; MAYBE YES # TIBETAN MARK NGAS BZUNG NYI ZLA
0F36 ; MAYBE NOT # TIBETAN MARK CARET -DZUD RTAGS BZHI MIG CAN
0F37 ; MAYBE YES # TIBETAN MARK NGAS BZUNG SGOR RTAGS
0F38 ; MAYBE NOT # TIBETAN MARK CHE MGO
0F39 ; MAYBE YES # TIBETAN MARK TSA -PHRU
0F3A..0F3D ; MAYBE NOT # TIBETAN MARK GUG RTAGS GYON..TIBETAN MARK ANG K
0F3E..0F42 ; MAYBE YES # TIBETAN SIGN YAR TSHES..TIBETAN LETTER GA
0F43 ; MAYBE NOT # TIBETAN LETTER GHA
0F44..0F47 ; MAYBE YES # TIBETAN LETTER NGA..TIBETAN LETTER JA
0F48 ; MAYBE NOT #
0F49..0F4C ; MAYBE YES # TIBETAN LETTER NYA..TIBETAN LETTER DDA
0F4D ; MAYBE NOT # TIBETAN LETTER DDHA
0F4E..0F51 ; MAYBE YES # TIBETAN LETTER NNA..TIBETAN LETTER DA
0F52 ; MAYBE NOT # TIBETAN LETTER DHA
0F53..0F56 ; MAYBE YES # TIBETAN LETTER NA..TIBETAN LETTER BA
0F57 ; MAYBE NOT # TIBETAN LETTER BHA
0F58..0F5B ; MAYBE YES # TIBETAN LETTER MA..TIBETAN LETTER DZA
0F5C ; MAYBE NOT # TIBETAN LETTER DZHA
0F5D..0F68 ; MAYBE YES # TIBETAN LETTER WA..TIBETAN LETTER A
0F69 ; MAYBE NOT # TIBETAN LETTER KSSA
0F6A ; MAYBE YES # TIBETAN LETTER FIXED-FORM RA
0F6B..0F70 ; MAYBE NOT # ..
0F71..0F72 ; MAYBE YES # TIBETAN VOWEL SIGN AA..TIBETAN VOWEL SIGN I
0F73 ; MAYBE NOT # TIBETAN VOWEL SIGN II
0F74 ; MAYBE YES # TIBETAN VOWEL SIGN U
0F75..0F79 ; MAYBE NOT # TIBETAN VOWEL SIGN UU..TIBETAN VOWEL SIGN VOCAL
0F7A..0F80 ; MAYBE YES # TIBETAN VOWEL SIGN E..TIBETAN VOWEL SIGN REVERS
0F81 ; MAYBE NOT # TIBETAN VOWEL SIGN REVERSED II
0F82..0F84 ; MAYBE YES # TIBETAN SIGN NYI ZLA NAA DA..TIBETAN MARK HALAN
0F85 ; MAYBE NOT # TIBETAN MARK PALUTA
0F86..0F8B ; MAYBE YES # TIBETAN SIGN LCI RTAGS..TIBETAN SIGN GRU MED RG
0F8C..0F8F ; MAYBE NOT # ..
0F90..0F92 ; MAYBE YES # TIBETAN SUBJOINED LETTER KA..TIBETAN SUBJOINED
0F93 ; MAYBE NOT # TIBETAN SUBJOINED LETTER GHA
0F94..0F97 ; MAYBE YES # TIBETAN SUBJOINED LETTER NGA..TIBETAN SUBJOINED
0F98 ; MAYBE NOT #
0F99..0F9C ; MAYBE YES # TIBETAN SUBJOINED LETTER NYA..TIBETAN SUBJOINED
0F9D ; MAYBE NOT # TIBETAN SUBJOINED LETTER DDHA
0F9E..0FA1 ; MAYBE YES # TIBETAN SUBJOINED LETTER NNA..TIBETAN SUBJOINED
0FA2 ; MAYBE NOT # TIBETAN SUBJOINED LETTER DHA
0FA3..0FA6 ; MAYBE YES # TIBETAN SUBJOINED LETTER NA..TIBETAN SUBJOINED
0FA7 ; MAYBE NOT # TIBETAN SUBJOINED LETTER BHA
0FA8..0FAB ; MAYBE YES # TIBETAN SUBJOINED LETTER MA..TIBETAN SUBJOINED
0FAC ; MAYBE NOT # TIBETAN SUBJOINED LETTER DZHA
0FAD..0FB8 ; MAYBE YES # TIBETAN SUBJOINED LETTER WA..TIBETAN SUBJOINED
0FB9 ; MAYBE NOT # TIBETAN SUBJOINED LETTER KSSA
0FBA..0FBC ; MAYBE YES # TIBETAN SUBJOINED LETTER FIXED-FORM WA..TIBETAN
0FBD..0FC5 ; MAYBE NOT # ..TIBETAN SYMBOL RDO RJE
0FC6 ; MAYBE YES # TIBETAN SYMBOL PADMA GDAN
0FC7..0FFF ; MAYBE NOT # TIBETAN SYMBOL RDO RJE RGYA GRAM..
1000..1021 ; MAYBE YES # MYANMAR LETTER KA..MYANMAR LETTER A
1022 ; MAYBE NOT #
1023..1025 ; MAYBE YES # MYANMAR LETTER I..MYANMAR LETTER U
1026 ; MAYBE NOT # MYANMAR LETTER UU
1027 ; MAYBE YES # MYANMAR LETTER E
1028 ; MAYBE NOT #
1029..102A ; MAYBE YES # MYANMAR LETTER O..MYANMAR LETTER AU
102B ; MAYBE NOT #
102C..1032 ; MAYBE YES # MYANMAR VOWEL SIGN AA..MYANMAR VOWEL SIGN AI
1033..1035 ; MAYBE NOT # ..
1036..1039 ; MAYBE YES # MYANMAR SIGN ANUSVARA..MYANMAR SIGN VIRAMA
103A..103F ; MAYBE NOT # ..
1040..1049 ; MAYBE YES # MYANMAR DIGIT ZERO..MYANMAR DIGIT NINE
104A..104F ; MAYBE NOT # MYANMAR SIGN LITTLE SECTION..MYANMAR SYMBOL GEN
1050..1059 ; MAYBE YES # MYANMAR LETTER SHA..MYANMAR VOWEL SIGN VOCALIC
105A..10CF ; MAYBE NOT # ..
10D0..10FA ; MAYBE YES # GEORGIAN LETTER AN..GEORGIAN LETTER AIN
10FB..10FF ; MAYBE NOT # GEORGIAN PARAGRAPH SEPARATOR..
1100..1159 ; MAYBE YES # HANGUL CHOSEONG KIYEOK..HANGUL CHOSEONG YEORINH
115A..1160 ; MAYBE NOT # ..HANGUL JUNGSEONG FILLER
1161..11A2 ; MAYBE YES # HANGUL JUNGSEONG A..HANGUL JUNGSEONG SSANGARAEA
11A3..11A7 ; MAYBE NOT # ..
11A8..11F9 ; MAYBE YES # HANGUL JONGSEONG KIYEOK..HANGUL JONGSEONG YEORI
11FA..11FF ; MAYBE NOT # ..
1200..1248 ; MAYBE YES # ETHIOPIC SYLLABLE HA..ETHIOPIC SYLLABLE QWA
1249 ; MAYBE NOT #
124A..124D ; MAYBE YES # ETHIOPIC SYLLABLE QWI..ETHIOPIC SYLLABLE QWE
124E..124F ; MAYBE NOT # ..
1250..1256 ; MAYBE YES # ETHIOPIC SYLLABLE QHA..ETHIOPIC SYLLABLE QHO
1257 ; MAYBE NOT #
1258 ; MAYBE YES # ETHIOPIC SYLLABLE QHWA
1259 ; MAYBE NOT #
125A..125D ; MAYBE YES # ETHIOPIC SYLLABLE QHWI..ETHIOPIC SYLLABLE QHWE
125E..125F ; MAYBE NOT # ..
1260..1288 ; MAYBE YES # ETHIOPIC SYLLABLE BA..ETHIOPIC SYLLABLE XWA
1289 ; MAYBE NOT #
128A..128D ; MAYBE YES # ETHIOPIC SYLLABLE XWI..ETHIOPIC SYLLABLE XWE
128E..128F ; MAYBE NOT # ..
1290..12B0 ; MAYBE YES # ETHIOPIC SYLLABLE NA..ETHIOPIC SYLLABLE KWA
12B1 ; MAYBE NOT #
12B2..12B5 ; MAYBE YES # ETHIOPIC SYLLABLE KWI..ETHIOPIC SYLLABLE KWE
12B6..12B7 ; MAYBE NOT # ..
12B8..12BE ; MAYBE YES # ETHIOPIC SYLLABLE KXA..ETHIOPIC SYLLABLE KXO
12BF ; MAYBE NOT #
12C0 ; MAYBE YES # ETHIOPIC SYLLABLE KXWA
12C1 ; MAYBE NOT #
12C2..12C5 ; MAYBE YES # ETHIOPIC SYLLABLE KXWI..ETHIOPIC SYLLABLE KXWE
12C6..12C7 ; MAYBE NOT # ..
12C8..12D6 ; MAYBE YES # ETHIOPIC SYLLABLE WA..ETHIOPIC SYLLABLE PHARYNG
12D7 ; MAYBE NOT #
12D8..1310 ; MAYBE YES # ETHIOPIC SYLLABLE ZA..ETHIOPIC SYLLABLE GWA
1311 ; MAYBE NOT #
1312..1315 ; MAYBE YES # ETHIOPIC SYLLABLE GWI..ETHIOPIC SYLLABLE GWE
1316..1317 ; MAYBE NOT # ..
1318..135A ; MAYBE YES # ETHIOPIC SYLLABLE GGA..ETHIOPIC SYLLABLE FYA
135B..135E ; MAYBE NOT # ..
135F ; MAYBE YES # ETHIOPIC COMBINING GEMINATION MARK
1360..137F ; MAYBE NOT # ETHIOPIC SECTION MARK..
1380..138F ; MAYBE YES # ETHIOPIC SYLLABLE SEBATBEIT MWA..ETHIOPIC SYLLA
1390..139F ; MAYBE NOT # ETHIOPIC TONAL MARK YIZET..
13A0..13F4 ; MAYBE YES # CHEROKEE LETTER A..CHEROKEE LETTER YV
13F5..1400 ; MAYBE NOT # ..
1401..166C ; MAYBE YES # CANADIAN SYLLABICS E..CANADIAN SYLLABICS CARRIE
166D..166E ; MAYBE NOT # CANADIAN SYLLABICS CHI SIGN..CANADIAN SYLLABICS
166F..1676 ; MAYBE YES # CANADIAN SYLLABICS QAI..CANADIAN SYLLABICS NNGA
1677..169F ; MAYBE NOT # ..
16A0..16EA ; MAYBE YES # RUNIC LETTER FEHU FEOH FE F..RUNIC LETTER X
16EB..16FF ; MAYBE NOT # RUNIC SINGLE PUNCTUATION..
1700..170C ; MAYBE YES # TAGALOG LETTER A..TAGALOG LETTER YA
170D ; MAYBE NOT #
170E..1714 ; MAYBE YES # TAGALOG LETTER LA..TAGALOG SIGN VIRAMA
1715..171F ; MAYBE NOT # ..
1720..1734 ; MAYBE YES # HANUNOO LETTER A..HANUNOO SIGN PAMUDPOD
1735..173F ; MAYBE NOT # PHILIPPINE SINGLE PUNCTUATION..
1740..1753 ; MAYBE YES # BUHID LETTER A..BUHID VOWEL SIGN U
1754..175F ; MAYBE NOT # ..
1760..176C ; MAYBE YES # TAGBANWA LETTER A..TAGBANWA LETTER YA
176D ; MAYBE NOT #
176E..1770 ; MAYBE YES # TAGBANWA LETTER LA..TAGBANWA LETTER SA
1771 ; MAYBE NOT #
1772..1773 ; MAYBE YES # TAGBANWA VOWEL SIGN I..TAGBANWA VOWEL SIGN U
1774..177F ; MAYBE NOT # ..
1780..17B3 ; MAYBE YES # KHMER LETTER KA..KHMER INDEPENDENT VOWEL QAU
17B4..17B5 ; MAYBE NOT # KHMER VOWEL INHERENT AQ..KHMER VOWEL INHERENT A
17B6..17D3 ; MAYBE YES # KHMER VOWEL SIGN AA..KHMER SIGN BATHAMASAT
17D4..17D6 ; MAYBE NOT # KHMER SIGN KHAN..KHMER SIGN CAMNUC PII KUUH
17D7 ; MAYBE YES # KHMER SIGN LEK TOO
17D8..17DB ; MAYBE NOT # KHMER SIGN BEYYAL..KHMER CURRENCY SYMBOL RIEL
17DC..17DD ; MAYBE YES # KHMER SIGN AVAKRAHASANYA..KHMER SIGN ATTHACAN
17DE..17DF ; MAYBE NOT # ..
17E0..17E9 ; MAYBE YES # KHMER DIGIT ZERO..KHMER DIGIT NINE
17EA..180A ; MAYBE NOT # ..MONGOLIAN NIRUGU
180B..180D ; MAYBE YES # MONGOLIAN FREE VARIATION SELECTOR ONE..MONGOLIA
180E..180F ; MAYBE NOT # MONGOLIAN VOWEL SEPARATOR..
1810..1819 ; MAYBE YES # MONGOLIAN DIGIT ZERO..MONGOLIAN DIGIT NINE
181A..181F ; MAYBE NOT # ..
1820..1877 ; MAYBE YES # MONGOLIAN LETTER A..MONGOLIAN LETTER MANCHU ZHA
1878..187F ; MAYBE NOT # ..
1880..18A9 ; MAYBE YES # MONGOLIAN LETTER ALI GALI ANUSVARA ONE..MONGOLI
18AA..18FF ; MAYBE NOT # ..
1900..191C ; MAYBE YES # LIMBU VOWEL-CARRIER LETTER..LIMBU LETTER HA
191D..191F ; MAYBE NOT # ..
1920..192B ; MAYBE YES # LIMBU VOWEL SIGN A..LIMBU SUBJOINED LETTER WA
192C..192F ; MAYBE NOT # ..
1930..193B ; MAYBE YES # LIMBU SMALL LETTER KA..LIMBU SIGN SA-I
193C..1945 ; MAYBE NOT # ..LIMBU QUESTION MARK
1946..196D ; MAYBE YES # LIMBU DIGIT ZERO..TAI LE LETTER AI
196E..196F ; MAYBE NOT # ..
1970..1974 ; MAYBE YES # TAI LE LETTER TONE-2..TAI LE LETTER TONE-6
1975..197F ; MAYBE NOT # ..
1980..19A9 ; MAYBE YES # NEW TAI LUE LETTER HIGH QA..NEW TAI LUE LETTER
19AA..19AF ; MAYBE NOT # ..
19B0..19C9 ; MAYBE YES # NEW TAI LUE VOWEL SIGN VOWEL SHORTENER..NEW TAI
19CA..19CF ; MAYBE NOT # ..
19D0..19D9 ; MAYBE YES # NEW TAI LUE DIGIT ZERO..NEW TAI LUE DIGIT NINE
19DA..19FF ; MAYBE NOT # ..KHMER SYMBOL DAP-PRAM ROC
1A00..1A1B ; MAYBE YES # BUGINESE LETTER KA..BUGINESE VOWEL SIGN AE
1A1C..1AFF ; MAYBE NOT # ..
1B00..1B05 ; MAYBE YES # BALINESE SIGN ULU RICEM..BALINESE LETTER AKARA
1B06 ; MAYBE NOT # BALINESE LETTER AKARA TEDUNG
1B07 ; MAYBE YES # BALINESE LETTER IKARA
1B08 ; MAYBE NOT # BALINESE LETTER IKARA TEDUNG
1B09 ; MAYBE YES # BALINESE LETTER UKARA
1B0A ; MAYBE NOT # BALINESE LETTER UKARA TEDUNG
1B0B ; MAYBE YES # BALINESE LETTER RA REPA
1B0C ; MAYBE NOT # BALINESE LETTER RA REPA TEDUNG
1B0D ; MAYBE YES # BALINESE LETTER LA LENGA
1B0E ; MAYBE NOT # BALINESE LETTER LA LENGA TEDUNG
1B0F..1B11 ; MAYBE YES # BALINESE LETTER EKARA..BALINESE LETTER OKARA
1B12 ; MAYBE NOT # BALINESE LETTER OKARA TEDUNG
1B13..1B3A ; MAYBE YES # BALINESE LETTER KA..BALINESE VOWEL SIGN RA REPA
1B3B ; MAYBE NOT # BALINESE VOWEL SIGN RA REPA TEDUNG
1B3C ; MAYBE YES # BALINESE VOWEL SIGN LA LENGA
1B3D ; MAYBE NOT # BALINESE VOWEL SIGN LA LENGA TEDUNG
1B3E..1B3F ; MAYBE YES # BALINESE VOWEL SIGN TALING..BALINESE VOWEL SIGN
1B40..1B41 ; MAYBE NOT # BALINESE VOWEL SIGN TALING TEDUNG..BALINESE VOW
1B42 ; MAYBE YES # BALINESE VOWEL SIGN PEPET
1B43 ; MAYBE NOT # BALINESE VOWEL SIGN PEPET TEDUNG
1B44..1B4B ; MAYBE YES # BALINESE ADEG ADEG..BALINESE LETTER ASYURA SASA
1B4C..1B4F ; MAYBE NOT # ..
1B50..1B59 ; MAYBE YES # BALINESE DIGIT ZERO..BALINESE DIGIT NINE
1B5A..1B6A ; MAYBE NOT # BALINESE PANTI..BALINESE MUSICAL SYMBOL DANG GE
1B6B..1B73 ; MAYBE YES # BALINESE MUSICAL SYMBOL COMBINING TEGEH..BALINE
1B74..1CFF ; MAYBE NOT # BALINESE MUSICAL SYMBOL RIGHT-HAND OPEN DUG....
1DFE..1DFF ; MAYBE YES # COMBINING LEFT ARROWHEAD ABOVE..COMBINING RIGHT
1E00 ; NEVER # LATIN CAPITAL LETTER A WITH RING BELOW
1E01 ; ALWAYS # LATIN SMALL LETTER A WITH RING BELOW
1E02 ; NEVER # LATIN CAPITAL LETTER B WITH DOT ABOVE
1E03 ; ALWAYS # LATIN SMALL LETTER B WITH DOT ABOVE
1E04 ; NEVER # LATIN CAPITAL LETTER B WITH DOT BELOW
1E05 ; ALWAYS # LATIN SMALL LETTER B WITH DOT BELOW
1E06 ; NEVER # LATIN CAPITAL LETTER B WITH LINE BELOW
1E07 ; ALWAYS # LATIN SMALL LETTER B WITH LINE BELOW
1E08..1E0A ; NEVER # LATIN CAPITAL LETTER C WITH CEDILLA AND ACUTE..
1E0B ; ALWAYS # LATIN SMALL LETTER D WITH DOT ABOVE
1E0C ; NEVER # LATIN CAPITAL LETTER D WITH DOT BELOW
1E0D ; ALWAYS # LATIN SMALL LETTER D WITH DOT BELOW
1E0E ; NEVER # LATIN CAPITAL LETTER D WITH LINE BELOW
1E0F ; ALWAYS # LATIN SMALL LETTER D WITH LINE BELOW
1E10 ; NEVER # LATIN CAPITAL LETTER D WITH CEDILLA
1E11 ; ALWAYS # LATIN SMALL LETTER D WITH CEDILLA
1E12 ; NEVER # LATIN CAPITAL LETTER D WITH CIRCUMFLEX BELOW
1E13 ; ALWAYS # LATIN SMALL LETTER D WITH CIRCUMFLEX BELOW
1E14..1E18 ; NEVER # LATIN CAPITAL LETTER E WITH MACRON AND GRAVE..L
1E19 ; ALWAYS # LATIN SMALL LETTER E WITH CIRCUMFLEX BELOW
1E1A ; NEVER # LATIN CAPITAL LETTER E WITH TILDE BELOW
1E1B ; ALWAYS # LATIN SMALL LETTER E WITH TILDE BELOW
1E1C..1E1E ; NEVER # LATIN CAPITAL LETTER E WITH CEDILLA AND BREVE..
1E1F ; ALWAYS # LATIN SMALL LETTER F WITH DOT ABOVE
1E20 ; NEVER # LATIN CAPITAL LETTER G WITH MACRON
1E21 ; ALWAYS # LATIN SMALL LETTER G WITH MACRON
1E22 ; NEVER # LATIN CAPITAL LETTER H WITH DOT ABOVE
1E23 ; ALWAYS # LATIN SMALL LETTER H WITH DOT ABOVE
1E24 ; NEVER # LATIN CAPITAL LETTER H WITH DOT BELOW
1E25 ; ALWAYS # LATIN SMALL LETTER H WITH DOT BELOW
1E26 ; NEVER # LATIN CAPITAL LETTER H WITH DIAERESIS
1E27 ; ALWAYS # LATIN SMALL LETTER H WITH DIAERESIS
1E28 ; NEVER # LATIN CAPITAL LETTER H WITH CEDILLA
1E29 ; ALWAYS # LATIN SMALL LETTER H WITH CEDILLA
1E2A ; NEVER # LATIN CAPITAL LETTER H WITH BREVE BELOW
1E2B ; ALWAYS # LATIN SMALL LETTER H WITH BREVE BELOW
1E2C ; NEVER # LATIN CAPITAL LETTER I WITH TILDE BELOW
1E2D ; ALWAYS # LATIN SMALL LETTER I WITH TILDE BELOW
1E2E..1E30 ; NEVER # LATIN CAPITAL LETTER I WITH DIAERESIS AND ACUTE
1E31 ; ALWAYS # LATIN SMALL LETTER K WITH ACUTE
1E32 ; NEVER # LATIN CAPITAL LETTER K WITH DOT BELOW
1E33 ; ALWAYS # LATIN SMALL LETTER K WITH DOT BELOW
1E34 ; NEVER # LATIN CAPITAL LETTER K WITH LINE BELOW
1E35 ; ALWAYS # LATIN SMALL LETTER K WITH LINE BELOW
1E36 ; NEVER # LATIN CAPITAL LETTER L WITH DOT BELOW
1E37 ; ALWAYS # LATIN SMALL LETTER L WITH DOT BELOW
1E38..1E3A ; NEVER # LATIN CAPITAL LETTER L WITH DOT BELOW AND MACRO
1E3B ; ALWAYS # LATIN SMALL LETTER L WITH LINE BELOW
1E3C ; NEVER # LATIN CAPITAL LETTER L WITH CIRCUMFLEX BELOW
1E3D ; ALWAYS # LATIN SMALL LETTER L WITH CIRCUMFLEX BELOW
1E3E ; NEVER # LATIN CAPITAL LETTER M WITH ACUTE
1E3F ; ALWAYS # LATIN SMALL LETTER M WITH ACUTE
1E40 ; NEVER # LATIN CAPITAL LETTER M WITH DOT ABOVE
1E41 ; ALWAYS # LATIN SMALL LETTER M WITH DOT ABOVE
1E42 ; NEVER # LATIN CAPITAL LETTER M WITH DOT BELOW
1E43 ; ALWAYS # LATIN SMALL LETTER M WITH DOT BELOW
1E44 ; NEVER # LATIN CAPITAL LETTER N WITH DOT ABOVE
1E45 ; ALWAYS # LATIN SMALL LETTER N WITH DOT ABOVE
1E46 ; NEVER # LATIN CAPITAL LETTER N WITH DOT BELOW
1E47 ; ALWAYS # LATIN SMALL LETTER N WITH DOT BELOW
1E48 ; NEVER # LATIN CAPITAL LETTER N WITH LINE BELOW
1E49 ; ALWAYS # LATIN SMALL LETTER N WITH LINE BELOW
1E4A ; NEVER # LATIN CAPITAL LETTER N WITH CIRCUMFLEX BELOW
1E4B ; ALWAYS # LATIN SMALL LETTER N WITH CIRCUMFLEX BELOW
1E4C..1E54 ; NEVER # LATIN CAPITAL LETTER O WITH TILDE AND ACUTE..LA
1E55 ; ALWAYS # LATIN SMALL LETTER P WITH ACUTE
1E56 ; NEVER # LATIN CAPITAL LETTER P WITH DOT ABOVE
1E57 ; ALWAYS # LATIN SMALL LETTER P WITH DOT ABOVE
1E58 ; NEVER # LATIN CAPITAL LETTER R WITH DOT ABOVE
1E59 ; ALWAYS # LATIN SMALL LETTER R WITH DOT ABOVE
1E5A ; NEVER # LATIN CAPITAL LETTER R WITH DOT BELOW
1E5B ; ALWAYS # LATIN SMALL LETTER R WITH DOT BELOW
1E5C..1E5E ; NEVER # LATIN CAPITAL LETTER R WITH DOT BELOW AND MACRO
1E5F ; ALWAYS # LATIN SMALL LETTER R WITH LINE BELOW
1E60 ; NEVER # LATIN CAPITAL LETTER S WITH DOT ABOVE
1E61 ; ALWAYS # LATIN SMALL LETTER S WITH DOT ABOVE
1E62 ; NEVER # LATIN CAPITAL LETTER S WITH DOT BELOW
1E63 ; ALWAYS # LATIN SMALL LETTER S WITH DOT BELOW
1E64..1E6A ; NEVER # LATIN CAPITAL LETTER S WITH ACUTE AND DOT ABOVE
1E6B ; ALWAYS # LATIN SMALL LETTER T WITH DOT ABOVE
1E6C ; NEVER # LATIN CAPITAL LETTER T WITH DOT BELOW
1E6D ; ALWAYS # LATIN SMALL LETTER T WITH DOT BELOW
1E6E ; NEVER # LATIN CAPITAL LETTER T WITH LINE BELOW
1E6F ; ALWAYS # LATIN SMALL LETTER T WITH LINE BELOW
1E70 ; NEVER # LATIN CAPITAL LETTER T WITH CIRCUMFLEX BELOW
1E71 ; ALWAYS # LATIN SMALL LETTER T WITH CIRCUMFLEX BELOW
1E72 ; NEVER # LATIN CAPITAL LETTER U WITH DIAERESIS BELOW
1E73 ; ALWAYS # LATIN SMALL LETTER U WITH DIAERESIS BELOW
1E74 ; NEVER # LATIN CAPITAL LETTER U WITH TILDE BELOW
1E75 ; ALWAYS # LATIN SMALL LETTER U WITH TILDE BELOW
1E76 ; NEVER # LATIN CAPITAL LETTER U WITH CIRCUMFLEX BELOW
1E77 ; ALWAYS # LATIN SMALL LETTER U WITH CIRCUMFLEX BELOW
1E78..1E7C ; NEVER # LATIN CAPITAL LETTER U WITH TILDE AND ACUTE..LA
1E7D ; ALWAYS # LATIN SMALL LETTER V WITH TILDE
1E7E ; NEVER # LATIN CAPITAL LETTER V WITH DOT BELOW
1E7F ; ALWAYS # LATIN SMALL LETTER V WITH DOT BELOW
1E80 ; NEVER # LATIN CAPITAL LETTER W WITH GRAVE
1E81 ; ALWAYS # LATIN SMALL LETTER W WITH GRAVE
1E82 ; NEVER # LATIN CAPITAL LETTER W WITH ACUTE
1E83 ; ALWAYS # LATIN SMALL LETTER W WITH ACUTE
1E84 ; NEVER # LATIN CAPITAL LETTER W WITH DIAERESIS
1E85 ; ALWAYS # LATIN SMALL LETTER W WITH DIAERESIS
1E86 ; NEVER # LATIN CAPITAL LETTER W WITH DOT ABOVE
1E87 ; ALWAYS # LATIN SMALL LETTER W WITH DOT ABOVE
1E88 ; NEVER # LATIN CAPITAL LETTER W WITH DOT BELOW
1E89 ; ALWAYS # LATIN SMALL LETTER W WITH DOT BELOW
1E8A ; NEVER # LATIN CAPITAL LETTER X WITH DOT ABOVE
1E8B ; ALWAYS # LATIN SMALL LETTER X WITH DOT ABOVE
1E8C ; NEVER # LATIN CAPITAL LETTER X WITH DIAERESIS
1E8D ; ALWAYS # LATIN SMALL LETTER X WITH DIAERESIS
1E8E ; NEVER # LATIN CAPITAL LETTER Y WITH DOT ABOVE
1E8F ; ALWAYS # LATIN SMALL LETTER Y WITH DOT ABOVE
1E90 ; NEVER # LATIN CAPITAL LETTER Z WITH CIRCUMFLEX
1E91 ; ALWAYS # LATIN SMALL LETTER Z WITH CIRCUMFLEX
1E92 ; NEVER # LATIN CAPITAL LETTER Z WITH DOT BELOW
1E93 ; ALWAYS # LATIN SMALL LETTER Z WITH DOT BELOW
1E94 ; NEVER # LATIN CAPITAL LETTER Z WITH LINE BELOW
1E95 ; ALWAYS # LATIN SMALL LETTER Z WITH LINE BELOW
1E96..1E9B ; NEVER # LATIN SMALL LETTER H WITH LINE BELOW..LATIN SMA
1E9C..1E9F ; MAYBE NOT # ..
1EA0 ; NEVER # LATIN CAPITAL LETTER A WITH DOT BELOW
1EA1 ; ALWAYS # LATIN SMALL LETTER A WITH DOT BELOW
1EA2 ; NEVER # LATIN CAPITAL LETTER A WITH HOOK ABOVE
1EA3 ; ALWAYS # LATIN SMALL LETTER A WITH HOOK ABOVE
1EA4..1EB8 ; NEVER # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND ACUT
1EB9 ; ALWAYS # LATIN SMALL LETTER E WITH DOT BELOW
1EBA ; NEVER # LATIN CAPITAL LETTER E WITH HOOK ABOVE
1EBB ; ALWAYS # LATIN SMALL LETTER E WITH HOOK ABOVE
1EBC ; NEVER # LATIN CAPITAL LETTER E WITH TILDE
1EBD ; ALWAYS # LATIN SMALL LETTER E WITH TILDE
1EBE..1EC8 ; NEVER # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND ACUT
1EC9 ; ALWAYS # LATIN SMALL LETTER I WITH HOOK ABOVE
1ECA ; NEVER # LATIN CAPITAL LETTER I WITH DOT BELOW
1ECB ; ALWAYS # LATIN SMALL LETTER I WITH DOT BELOW
1ECC ; NEVER # LATIN CAPITAL LETTER O WITH DOT BELOW
1ECD ; ALWAYS # LATIN SMALL LETTER O WITH DOT BELOW
1ECE ; NEVER # LATIN CAPITAL LETTER O WITH HOOK ABOVE
1ECF ; ALWAYS # LATIN SMALL LETTER O WITH HOOK ABOVE
1ED0..1EE4 ; NEVER # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND ACUT
1EE5 ; ALWAYS # LATIN SMALL LETTER U WITH DOT BELOW
1EE6 ; NEVER # LATIN CAPITAL LETTER U WITH HOOK ABOVE
1EE7 ; ALWAYS # LATIN SMALL LETTER U WITH HOOK ABOVE
1EE8..1EF2 ; NEVER # LATIN CAPITAL LETTER U WITH HORN AND ACUTE..LAT
1EF3 ; ALWAYS # LATIN SMALL LETTER Y WITH GRAVE
1EF4 ; NEVER # LATIN CAPITAL LETTER Y WITH DOT BELOW
1EF5 ; ALWAYS # LATIN SMALL LETTER Y WITH DOT BELOW
1EF6 ; NEVER # LATIN CAPITAL LETTER Y WITH HOOK ABOVE
1EF7 ; ALWAYS # LATIN SMALL LETTER Y WITH HOOK ABOVE
1EF8 ; NEVER # LATIN CAPITAL LETTER Y WITH TILDE
1EF9 ; ALWAYS # LATIN SMALL LETTER Y WITH TILDE
1EFA..1EFF ; MAYBE NOT # ..
1F00..1F01 ; ALWAYS # GREEK SMALL LETTER ALPHA WITH PSILI..GREEK SMAL
1F02..1F0F ; NEVER # GREEK SMALL LETTER ALPHA WITH PSILI AND VARIA..
1F10..1F11 ; ALWAYS # GREEK SMALL LETTER EPSILON WITH PSILI..GREEK SM
1F12..1F15 ; NEVER # GREEK SMALL LETTER EPSILON WITH PSILI AND VARIA
1F16..1F17 ; MAYBE NOT # ..
1F18..1F1D ; NEVER # GREEK CAPITAL LETTER EPSILON WITH PSILI..GREEK
1F1E..1F1F ; MAYBE NOT # ..
1F20..1F21 ; ALWAYS # GREEK SMALL LETTER ETA WITH PSILI..GREEK SMALL
1F22..1F2F ; NEVER # GREEK SMALL LETTER ETA WITH PSILI AND VARIA..GR
1F30..1F31 ; ALWAYS # GREEK SMALL LETTER IOTA WITH PSILI..GREEK SMALL
1F32..1F3F ; NEVER # GREEK SMALL LETTER IOTA WITH PSILI AND VARIA..G
1F40..1F41 ; ALWAYS # GREEK SMALL LETTER OMICRON WITH PSILI..GREEK SM
1F42..1F45 ; NEVER # GREEK SMALL LETTER OMICRON WITH PSILI AND VARIA
1F46..1F47 ; MAYBE NOT # ..
1F48..1F4D ; NEVER # GREEK CAPITAL LETTER OMICRON WITH PSILI..GREEK
1F4E..1F4F ; MAYBE NOT # ..
1F50 ; NEVER # GREEK SMALL LETTER UPSILON WITH PSILI
1F51 ; ALWAYS # GREEK SMALL LETTER UPSILON WITH DASIA
1F52..1F57 ; NEVER # GREEK SMALL LETTER UPSILON WITH PSILI AND VARIA
1F58 ; MAYBE NOT #
1F59 ; NEVER # GREEK CAPITAL LETTER UPSILON WITH DASIA
1F5A ; MAYBE NOT #
1F5B ; NEVER # GREEK CAPITAL LETTER UPSILON WITH DASIA AND VAR
1F5C ; MAYBE NOT #
1F5D ; NEVER # GREEK CAPITAL LETTER UPSILON WITH DASIA AND OXI
1F5E ; MAYBE NOT #
1F5F ; NEVER # GREEK CAPITAL LETTER UPSILON WITH DASIA AND PER
1F60..1F61 ; ALWAYS # GREEK SMALL LETTER OMEGA WITH PSILI..GREEK SMAL
1F62..1F6F ; NEVER # GREEK SMALL LETTER OMEGA WITH PSILI AND VARIA..
1F70 ; ALWAYS # GREEK SMALL LETTER ALPHA WITH VARIA
1F71 ; NEVER # GREEK SMALL LETTER ALPHA WITH OXIA
1F72 ; ALWAYS # GREEK SMALL LETTER EPSILON WITH VARIA
1F73 ; NEVER # GREEK SMALL LETTER EPSILON WITH OXIA
1F74 ; ALWAYS # GREEK SMALL LETTER ETA WITH VARIA
1F75 ; NEVER # GREEK SMALL LETTER ETA WITH OXIA
1F76 ; ALWAYS # GREEK SMALL LETTER IOTA WITH VARIA
1F77 ; NEVER # GREEK SMALL LETTER IOTA WITH OXIA
1F78 ; ALWAYS # GREEK SMALL LETTER OMICRON WITH VARIA
1F79 ; NEVER # GREEK SMALL LETTER OMICRON WITH OXIA
1F7A ; ALWAYS # GREEK SMALL LETTER UPSILON WITH VARIA
1F7B ; NEVER # GREEK SMALL LETTER UPSILON WITH OXIA
1F7C ; ALWAYS # GREEK SMALL LETTER OMEGA WITH VARIA
1F7D ; NEVER # GREEK SMALL LETTER OMEGA WITH OXIA
1F7E..1F7F ; MAYBE NOT # ..
1F80..1FAF ; NEVER # GREEK SMALL LETTER ALPHA WITH PSILI AND YPOGEGR
1FB0..1FB1 ; ALWAYS # GREEK SMALL LETTER ALPHA WITH VRACHY..GREEK SMA
1FB2..1FB4 ; NEVER # GREEK SMALL LETTER ALPHA WITH VARIA AND YPOGEGR
1FB5 ; MAYBE NOT #
1FB6..1FC4 ; NEVER # GREEK SMALL LETTER ALPHA WITH PERISPOMENI..GREE
1FC5 ; MAYBE NOT #
1FC6..1FCF ; NEVER # GREEK SMALL LETTER ETA WITH PERISPOMENI..GREEK
1FD0..1FD1 ; ALWAYS # GREEK SMALL LETTER IOTA WITH VRACHY..GREEK SMAL
1FD2..1FD3 ; NEVER # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND VARI
1FD4..1FD5 ; MAYBE NOT # ..
1FD6..1FDB ; NEVER # GREEK SMALL LETTER IOTA WITH PERISPOMENI..GREEK
1FDC ; MAYBE NOT #
1FDD..1FDF ; NEVER # GREEK DASIA AND VARIA..GREEK DASIA AND PERISPOM
1FE0..1FE1 ; ALWAYS # GREEK SMALL LETTER UPSILON WITH VRACHY..GREEK S
1FE2..1FE4 ; NEVER # GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND V
1FE5 ; ALWAYS # GREEK SMALL LETTER RHO WITH DASIA
1FE6..1FEF ; NEVER # GREEK SMALL LETTER UPSILON WITH PERISPOMENI..GR
1FF0..1FF1 ; MAYBE NOT # ..
1FF2..1FF4 ; NEVER # GREEK SMALL LETTER OMEGA WITH VARIA AND YPOGEGR
1FF5 ; MAYBE NOT #
1FF6..1FFE ; NEVER # GREEK SMALL LETTER OMEGA WITH PERISPOMENI..GREE
1FFF..2070 ; MAYBE NOT # ..SUPERSCRIPT ZERO
2071 ; NEVER # SUPERSCRIPT LATIN SMALL LETTER I
2072..207E ; MAYBE NOT # ..SUPERSCRIPT RIGHT PARENTHESIS
207F ; NEVER # SUPERSCRIPT LATIN SMALL LETTER N
2080..208F ; MAYBE NOT # SUBSCRIPT ZERO..
2090..2094 ; NEVER # LATIN SUBSCRIPT SMALL LETTER A..LATIN SUBSCRIPT
2095..2125 ; MAYBE NOT # ..OUNCE SIGN
2126 ; NEVER # OHM SIGN
2127..2129 ; MAYBE NOT # INVERTED OHM SIGN..TURNED GREEK SMALL LETTER IO
212A..212B ; NEVER # KELVIN SIGN..ANGSTROM SIGN
212C..2131 ; MAYBE NOT # SCRIPT CAPITAL B..SCRIPT CAPITAL F
2132 ; NEVER # TURNED CAPITAL F
2133..214D ; MAYBE NOT # SCRIPT CAPITAL M..AKTIESELSKAB
214E ; ALWAYS # TURNED SMALL F
214F..2183 ; MAYBE NOT # ..ROMAN NUMERAL REVERSED ONE HUNDRED
2184 ; ALWAYS # LATIN SMALL LETTER REVERSED C
2185..2C5F ; MAYBE NOT # ..
2C60 ; NEVER # LATIN CAPITAL LETTER L WITH DOUBLE BAR
2C61 ; ALWAYS # LATIN SMALL LETTER L WITH DOUBLE BAR
2C62..2C64 ; NEVER # LATIN CAPITAL LETTER L WITH MIDDLE TILDE..LATIN
2C65..2C66 ; ALWAYS # LATIN SMALL LETTER A WITH STROKE..LATIN SMALL L
2C67 ; NEVER # LATIN CAPITAL LETTER H WITH DESCENDER
2C68 ; ALWAYS # LATIN SMALL LETTER H WITH DESCENDER
2C69 ; NEVER # LATIN CAPITAL LETTER K WITH DESCENDER
2C6A ; ALWAYS # LATIN SMALL LETTER K WITH DESCENDER
2C6B ; NEVER # LATIN CAPITAL LETTER Z WITH DESCENDER
2C6C ; ALWAYS # LATIN SMALL LETTER Z WITH DESCENDER
2C6D..2C73 ; MAYBE NOT # ..
2C74 ; ALWAYS # LATIN SMALL LETTER V WITH CURL
2C75 ; NEVER # LATIN CAPITAL LETTER HALF H
2C76..2C77 ; ALWAYS # LATIN SMALL LETTER HALF H..LATIN SMALL LETTER T
2C78..2C80 ; MAYBE NOT # ..COPTIC CAPITAL LETTER ALFA
2C81 ; MAYBE YES # COPTIC SMALL LETTER ALFA
2C82 ; MAYBE NOT # COPTIC CAPITAL LETTER VIDA
2C83 ; MAYBE YES # COPTIC SMALL LETTER VIDA
2C84 ; MAYBE NOT # COPTIC CAPITAL LETTER GAMMA
2C85 ; MAYBE YES # COPTIC SMALL LETTER GAMMA
2C86 ; MAYBE NOT # COPTIC CAPITAL LETTER DALDA
2C87 ; MAYBE YES # COPTIC SMALL LETTER DALDA
2C88 ; MAYBE NOT # COPTIC CAPITAL LETTER EIE
2C89 ; MAYBE YES # COPTIC SMALL LETTER EIE
2C8A ; MAYBE NOT # COPTIC CAPITAL LETTER SOU
2C8B ; MAYBE YES # COPTIC SMALL LETTER SOU
2C8C ; MAYBE NOT # COPTIC CAPITAL LETTER ZATA
2C8D ; MAYBE YES # COPTIC SMALL LETTER ZATA
2C8E ; MAYBE NOT # COPTIC CAPITAL LETTER HATE
2C8F ; MAYBE YES # COPTIC SMALL LETTER HATE
2C90 ; MAYBE NOT # COPTIC CAPITAL LETTER THETHE
2C91 ; MAYBE YES # COPTIC SMALL LETTER THETHE
2C92 ; MAYBE NOT # COPTIC CAPITAL LETTER IAUDA
2C93 ; MAYBE YES # COPTIC SMALL LETTER IAUDA
2C94 ; MAYBE NOT # COPTIC CAPITAL LETTER KAPA
2C95 ; MAYBE YES # COPTIC SMALL LETTER KAPA
2C96 ; MAYBE NOT # COPTIC CAPITAL LETTER LAULA
2C97 ; MAYBE YES # COPTIC SMALL LETTER LAULA
2C98 ; MAYBE NOT # COPTIC CAPITAL LETTER MI
2C99 ; MAYBE YES # COPTIC SMALL LETTER MI
2C9A ; MAYBE NOT # COPTIC CAPITAL LETTER NI
2C9B ; MAYBE YES # COPTIC SMALL LETTER NI
2C9C ; MAYBE NOT # COPTIC CAPITAL LETTER KSI
2C9D ; MAYBE YES # COPTIC SMALL LETTER KSI
2C9E ; MAYBE NOT # COPTIC CAPITAL LETTER O
2C9F ; MAYBE YES # COPTIC SMALL LETTER O
2CA0 ; MAYBE NOT # COPTIC CAPITAL LETTER PI
2CA1 ; MAYBE YES # COPTIC SMALL LETTER PI
2CA2 ; MAYBE NOT # COPTIC CAPITAL LETTER RO
2CA3 ; MAYBE YES # COPTIC SMALL LETTER RO
2CA4 ; MAYBE NOT # COPTIC CAPITAL LETTER SIMA
2CA5 ; MAYBE YES # COPTIC SMALL LETTER SIMA
2CA6 ; MAYBE NOT # COPTIC CAPITAL LETTER TAU
2CA7 ; MAYBE YES # COPTIC SMALL LETTER TAU
2CA8 ; MAYBE NOT # COPTIC CAPITAL LETTER UA
2CA9 ; MAYBE YES # COPTIC SMALL LETTER UA
2CAA ; MAYBE NOT # COPTIC CAPITAL LETTER FI
2CAB ; MAYBE YES # COPTIC SMALL LETTER FI
2CAC ; MAYBE NOT # COPTIC CAPITAL LETTER KHI
2CAD ; MAYBE YES # COPTIC SMALL LETTER KHI
2CAE ; MAYBE NOT # COPTIC CAPITAL LETTER PSI
2CAF ; MAYBE YES # COPTIC SMALL LETTER PSI
2CB0 ; MAYBE NOT # COPTIC CAPITAL LETTER OOU
2CB1 ; MAYBE YES # COPTIC SMALL LETTER OOU
2CB2 ; MAYBE NOT # COPTIC CAPITAL LETTER DIALECT-P ALEF
2CB3 ; MAYBE YES # COPTIC SMALL LETTER DIALECT-P ALEF
2CB4 ; MAYBE NOT # COPTIC CAPITAL LETTER OLD COPTIC AIN
2CB5 ; MAYBE YES # COPTIC SMALL LETTER OLD COPTIC AIN
2CB6 ; MAYBE NOT # COPTIC CAPITAL LETTER CRYPTOGRAMMIC EIE
2CB7 ; MAYBE YES # COPTIC SMALL LETTER CRYPTOGRAMMIC EIE
2CB8 ; MAYBE NOT # COPTIC CAPITAL LETTER DIALECT-P KAPA
2CB9 ; MAYBE YES # COPTIC SMALL LETTER DIALECT-P KAPA
2CBA ; MAYBE NOT # COPTIC CAPITAL LETTER DIALECT-P NI
2CBB ; MAYBE YES # COPTIC SMALL LETTER DIALECT-P NI
2CBC ; MAYBE NOT # COPTIC CAPITAL LETTER CRYPTOGRAMMIC NI
2CBD ; MAYBE YES # COPTIC SMALL LETTER CRYPTOGRAMMIC NI
2CBE ; MAYBE NOT # COPTIC CAPITAL LETTER OLD COPTIC OOU
2CBF ; MAYBE YES # COPTIC SMALL LETTER OLD COPTIC OOU
2CC0 ; MAYBE NOT # COPTIC CAPITAL LETTER SAMPI
2CC1 ; MAYBE YES # COPTIC SMALL LETTER SAMPI
2CC2 ; MAYBE NOT # COPTIC CAPITAL LETTER CROSSED SHEI
2CC3 ; MAYBE YES # COPTIC SMALL LETTER CROSSED SHEI
2CC4 ; MAYBE NOT # COPTIC CAPITAL LETTER OLD COPTIC SHEI
2CC5 ; MAYBE YES # COPTIC SMALL LETTER OLD COPTIC SHEI
2CC6 ; MAYBE NOT # COPTIC CAPITAL LETTER OLD COPTIC ESH
2CC7 ; MAYBE YES # COPTIC SMALL LETTER OLD COPTIC ESH
2CC8 ; MAYBE NOT # COPTIC CAPITAL LETTER AKHMIMIC KHEI
2CC9 ; MAYBE YES # COPTIC SMALL LETTER AKHMIMIC KHEI
2CCA ; MAYBE NOT # COPTIC CAPITAL LETTER DIALECT-P HORI
2CCB ; MAYBE YES # COPTIC SMALL LETTER DIALECT-P HORI
2CCC ; MAYBE NOT # COPTIC CAPITAL LETTER OLD COPTIC HORI
2CCD ; MAYBE YES # COPTIC SMALL LETTER OLD COPTIC HORI
2CCE ; MAYBE NOT # COPTIC CAPITAL LETTER OLD COPTIC HA
2CCF ; MAYBE YES # COPTIC SMALL LETTER OLD COPTIC HA
2CD0 ; MAYBE NOT # COPTIC CAPITAL LETTER L-SHAPED HA
2CD1 ; MAYBE YES # COPTIC SMALL LETTER L-SHAPED HA
2CD2 ; MAYBE NOT # COPTIC CAPITAL LETTER OLD COPTIC HEI
2CD3 ; MAYBE YES # COPTIC SMALL LETTER OLD COPTIC HEI
2CD4 ; MAYBE NOT # COPTIC CAPITAL LETTER OLD COPTIC HAT
2CD5 ; MAYBE YES # COPTIC SMALL LETTER OLD COPTIC HAT
2CD6 ; MAYBE NOT # COPTIC CAPITAL LETTER OLD COPTIC GANGIA
2CD7 ; MAYBE YES # COPTIC SMALL LETTER OLD COPTIC GANGIA
2CD8 ; MAYBE NOT # COPTIC CAPITAL LETTER OLD COPTIC DJA
2CD9 ; MAYBE YES # COPTIC SMALL LETTER OLD COPTIC DJA
2CDA ; MAYBE NOT # COPTIC CAPITAL LETTER OLD COPTIC SHIMA
2CDB ; MAYBE YES # COPTIC SMALL LETTER OLD COPTIC SHIMA
2CDC ; MAYBE NOT # COPTIC CAPITAL LETTER OLD NUBIAN SHIMA
2CDD ; MAYBE YES # COPTIC SMALL LETTER OLD NUBIAN SHIMA
2CDE ; MAYBE NOT # COPTIC CAPITAL LETTER OLD NUBIAN NGI
2CDF ; MAYBE YES # COPTIC SMALL LETTER OLD NUBIAN NGI
2CE0 ; MAYBE NOT # COPTIC CAPITAL LETTER OLD NUBIAN NYI
2CE1 ; MAYBE YES # COPTIC SMALL LETTER OLD NUBIAN NYI
2CE2 ; MAYBE NOT # COPTIC CAPITAL LETTER OLD NUBIAN WAU
2CE3..2CE4 ; MAYBE YES # COPTIC SMALL LETTER OLD NUBIAN WAU..COPTIC SYMB
2CE5..2CFF ; MAYBE NOT # COPTIC SYMBOL MI RO..COPTIC MORPHOLOGICAL DIVID
2D00..2D25 ; MAYBE YES # GEORGIAN SMALL LETTER AN..GEORGIAN SMALL LETTER
2D26..2D2F ; MAYBE NOT # ..
2D30..2D65 ; MAYBE YES # TIFINAGH LETTER YA..TIFINAGH LETTER YAZZ
2D66..2D7F ; MAYBE NOT # ..
2D80..2D96 ; MAYBE YES # ETHIOPIC SYLLABLE LOA..ETHIOPIC SYLLABLE GGWE
2D97..2D9F ; MAYBE NOT # ..
2DA0..2DA6 ; MAYBE YES # ETHIOPIC SYLLABLE SSA..ETHIOPIC SYLLABLE SSO
2DA7 ; MAYBE NOT #
2DA8..2DAE ; MAYBE YES # ETHIOPIC SYLLABLE CCA..ETHIOPIC SYLLABLE CCO
2DAF ; MAYBE NOT #
2DB0..2DB6 ; MAYBE YES # ETHIOPIC SYLLABLE ZZA..ETHIOPIC SYLLABLE ZZO
2DB7 ; MAYBE NOT #
2DB8..2DBE ; MAYBE YES # ETHIOPIC SYLLABLE CCHA..ETHIOPIC SYLLABLE CCHO
2DBF ; MAYBE NOT #
2DC0..2DC6 ; MAYBE YES # ETHIOPIC SYLLABLE QYA..ETHIOPIC SYLLABLE QYO
2DC7 ; MAYBE NOT #
2DC8..2DCE ; MAYBE YES # ETHIOPIC SYLLABLE KYA..ETHIOPIC SYLLABLE KYO
2DCF ; MAYBE NOT #
2DD0..2DD6 ; MAYBE YES # ETHIOPIC SYLLABLE XYA..ETHIOPIC SYLLABLE XYO
2DD7 ; MAYBE NOT #
2DD8..2DDE ; MAYBE YES # ETHIOPIC SYLLABLE GYA..ETHIOPIC SYLLABLE GYO
2DDF..3004 ; MAYBE NOT # ..JAPANESE INDUSTRIAL STANDARD SYMBOL
3005..3006 ; MAYBE YES # IDEOGRAPHIC ITERATION MARK..IDEOGRAPHIC CLOSING
3007..3029 ; MAYBE NOT # IDEOGRAPHIC NUMBER ZERO..HANGZHOU NUMERAL NINE
302A..302F ; MAYBE YES # IDEOGRAPHIC LEVEL TONE MARK..HANGUL DOUBLE DOT
3030 ; MAYBE NOT # WAVY DASH
3031..3035 ; MAYBE YES # VERTICAL KANA REPEAT MARK..VERTICAL KANA REPEAT
3036..303A ; MAYBE NOT # CIRCLED POSTAL MARK..HANGZHOU NUMERAL THIRTY
303B..303C ; MAYBE YES # VERTICAL IDEOGRAPHIC ITERATION MARK..MASU MARK
303D..3040 ; MAYBE NOT # PART ALTERNATION MARK..
3041..3096 ; MAYBE YES # HIRAGANA LETTER SMALL A..HIRAGANA LETTER SMALL
3097..3098 ; MAYBE NOT # ..
3099..309A ; MAYBE YES # COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK..
309B..309C ; MAYBE NOT # KATAKANA-HIRAGANA VOICED SOUND MARK..KATAKANA-H
309D..309E ; MAYBE YES # HIRAGANA ITERATION MARK..HIRAGANA VOICED ITERAT
309F..30A0 ; MAYBE NOT # HIRAGANA DIGRAPH YORI..KATAKANA-HIRAGANA DOUBLE
30A1..30FA ; MAYBE YES # KATAKANA LETTER SMALL A..KATAKANA LETTER VO
30FB ; MAYBE NOT # KATAKANA MIDDLE DOT
30FC..30FE ; MAYBE YES # KATAKANA-HIRAGANA PROLONGED SOUND MARK..KATAKAN
30FF..3104 ; MAYBE NOT # KATAKANA DIGRAPH KOTO..
3105..312C ; MAYBE YES # BOPOMOFO LETTER B..BOPOMOFO LETTER GN
312D..319F ; MAYBE NOT # ..IDEOGRAPHIC ANNOTATION MAN MARK
31A0..31B7 ; MAYBE YES # BOPOMOFO LETTER BU..BOPOMOFO FINAL LETTER H
31B8..31EF ; MAYBE NOT # ..
31F0..31FF ; MAYBE YES # KATAKANA LETTER SMALL KU..KATAKANA LETTER SMALL
3200..33FF ; MAYBE NOT # PARENTHESIZED HANGUL KIYEOK..SQUARE GAL
3400..4DB5 ; MAYBE YES # ....HEXAGRAM FOR BEFORE COMPLETION
4E00..9FBB ; MAYBE YES # ..
9FBC..9FFF ; MAYBE NOT # ..
A000..A48C ; MAYBE YES # YI SYLLABLE IT..YI SYLLABLE YYR
A48D..A716 ; MAYBE NOT # ..MODIFIER LETTER EXTRA-LOW LEFT-STEM
A717..A71A ; MAYBE YES # MODIFIER LETTER DOT VERTICAL BAR..MODIFIER LETT
A71B..A7FF ; MAYBE NOT # ..
A800..A827 ; MAYBE YES # SYLOTI NAGRI LETTER A..SYLOTI NAGRI VOWEL SIGN
A828..ABFF ; MAYBE NOT # SYLOTI NAGRI POETRY MARK-1..
AC00..D7A3 ; MAYBE YES # ....CJK COMPATIBILITY IDEOGRAPH-FA0D
FA0E..FA0F ; MAYBE YES # CJK COMPATIBILITY IDEOGRAPH-FA0E..CJK COMPATIBI
FA10 ; MAYBE NOT # CJK COMPATIBILITY IDEOGRAPH-FA10
FA11 ; MAYBE YES # CJK COMPATIBILITY IDEOGRAPH-FA11
FA12 ; MAYBE NOT # CJK COMPATIBILITY IDEOGRAPH-FA12
FA13..FA14 ; MAYBE YES # CJK COMPATIBILITY IDEOGRAPH-FA13..CJK COMPATIBI
FA15..FA1E ; MAYBE NOT # CJK COMPATIBILITY IDEOGRAPH-FA15..CJK COMPATIBI
FA1F ; MAYBE YES # CJK COMPATIBILITY IDEOGRAPH-FA1F
FA20 ; MAYBE NOT # CJK COMPATIBILITY IDEOGRAPH-FA20
FA21 ; MAYBE YES # CJK COMPATIBILITY IDEOGRAPH-FA21
FA22 ; MAYBE NOT # CJK COMPATIBILITY IDEOGRAPH-FA22
FA23..FA24 ; MAYBE YES # CJK COMPATIBILITY IDEOGRAPH-FA23..CJK COMPATIBI
FA25..FA26 ; MAYBE NOT # CJK COMPATIBILITY IDEOGRAPH-FA25..CJK COMPATIBI
FA27..FA29 ; MAYBE YES # CJK COMPATIBILITY IDEOGRAPH-FA27..CJK COMPATIBI
FA2A..FAFF ; MAYBE NOT # CJK COMPATIBILITY IDEOGRAPH-FA2A..
FB00..FB06 ; NEVER # LATIN SMALL LIGATURE FF..LATIN SMALL LIGATURE S
FB07..FB1D ; MAYBE NOT # ..HEBREW LETTER YOD WITH HIRIQ
FB1E ; MAYBE YES # HEBREW POINT JUDEO-SPANISH VARIKA
FB1F..FC5A ; MAYBE NOT # HEBREW LIGATURE YIDDISH YOD YOD PATAH..ARABIC L
FC5B..FC5C ; MAYBE YES # ARABIC LIGATURE THAL WITH SUPERSCRIPT ALEF ISOL
FC5D ; MAYBE NOT # ARABIC LIGATURE ALEF MAKSURA WITH SUPERSCRIPT A
FC5E..FC63 ; MAYBE YES # ARABIC LIGATURE SHADDA WITH DAMMATAN ISOLATED F
FC64..FC8F ; MAYBE NOT # ARABIC LIGATURE YEH WITH HAMZA ABOVE WITH REH F
FC90 ; MAYBE YES # ARABIC LIGATURE ALEF MAKSURA WITH SUPERSCRIPT A
FC91..FCD8 ; MAYBE NOT # ARABIC LIGATURE YEH WITH REH FINAL FORM..ARABIC
FCD9 ; MAYBE YES # ARABIC LIGATURE HEH WITH SUPERSCRIPT ALEF INITI
FCDA..FCF1 ; MAYBE NOT # ARABIC LIGATURE YEH WITH JEEM INITIAL FORM..ARA
FCF2..FCF4 ; MAYBE YES # ARABIC LIGATURE SHADDA WITH FATHA MEDIAL FORM..
FCF5..FD3C ; MAYBE NOT # ARABIC LIGATURE TAH WITH ALEF MAKSURA ISOLATED
FD3D ; MAYBE YES # ARABIC LIGATURE ALEF WITH FATHATAN ISOLATED FOR
FD3E..FDFF ; MAYBE NOT # ORNATE LEFT PARENTHESIS..
FE00..FE0F ; MAYBE YES # VARIATION SELECTOR-1..VARIATION SELECTOR-16
FE10..FE1F ; MAYBE NOT # PRESENTATION FORM FOR VERTICAL COMMA....
FE70..FE74 ; MAYBE YES # ARABIC FATHATAN ISOLATED FORM..ARABIC KASRATAN
FE75 ; MAYBE NOT #
FE76..FE7F ; MAYBE YES # ARABIC FATHA ISOLATED FORM..ARABIC SUKUN MEDIAL
FE80..FF20 ; MAYBE NOT # ARABIC LETTER HAMZA ISOLATED FORM..FULLWIDTH CO
FF21..FF3A ; NEVER # FULLWIDTH LATIN CAPITAL LETTER A..FULLWIDTH LAT
FF3B..FF40 ; MAYBE NOT # FULLWIDTH LEFT SQUARE BRACKET..FULLWIDTH GRAVE
FF41..FF5A ; NEVER # FULLWIDTH LATIN SMALL LETTER A..FULLWIDTH LATIN
FF5B..FFFE ; MAYBE NOT # FULLWIDTH LEFT CURLY BRACKET..

5.  IANA Considerations

   ...To be supplied.  This work will ultimately require registries of
   characters that are acceptable for use in IDNs.

6.  Security Considerations

   The security issues associated with this work are discussed in
   [IDNA-issues].

7.  Contributors

   While the listed editor held the pen, this document represents the
   joint work and conclusions of an ad hoc design team.  In addition to
   the editor, the team consisted of Harald Alvestrand, Tina Dam, Cary
   Karp, and John Klensin.

8.  Acknowledgements

9.  References

9.1.  Normative References

   [RFC4690]  Klensin, J., Faltstrom, P., and C. Karp, "Review and
              Recommendations for Internationalized Domain Names
              (IDNs)", RFC 4690, September 2006.

   [Unicode5]
              The Unicode Consortium, "The Unicode Standard, Version
              5.0", Boston, MA, Addison-Wesley ISBN 0-321-48091-0,
              2007.

   [idnabis]  Klensin, J., "Proposed Issues and Changes for IDNA - An
              Overview", Work in progress draft-klensin-..., October
              2006.

9.2.  Informative References

   [IDNA-bidi]
              Alvestrand, H., Ed. and C. Karp, "An IDNA problem in
              right-to-left scripts", October 2006.

   [IDNA-issues]
              Klensin, J., Ed., "Proposed Issues and Changes for IDNA -
              An Overview", October 2006.

   [RFC1035]  Mockapetris, P., "Domain names - implementation and
              specification", STD 13, RFC 1035, November 1987.

   [RFC3454]  Hoffman, P. and M. Blanchet, "Preparation of
              Internationalized Strings ("stringprep")", RFC 3454,
              December 2002.

   [RFC3491]  Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep
              Profile for Internationalized Domain Names (IDN)",
              RFC 3491, March 2003.

   [RFC4713]  Lee, X., Mao, W., Chen, E., Hsu, N., and J. Klensin,
              "Registration and Administration Recommendations for
              Chinese Domain Names", RFC 4713, October 2006.

   [codepoints]
              Faltstrom, P., "Codepoint classification in IDNAbis",
              February 2007.

Author's Address

   Patrik Faltstrom (editor)
   Cisco Systems

   Email: paf@cisco.com

Full Copyright Statement

   Copyright (C) The IETF Trust (2007).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Acknowledgment

   Funding for the RFC Editor function is provided by the IETF
   Administrative Support Activity (IASA).