emoji4unicode

(project)/data/emoji4unicode.xml

Root Element: <emoji4unicode>

No attributes.

Contains zero or more <category> elements.

<category>

Sub-element of the <emoji4unicode> root element.

Attributes:

    • name="1. Nature" — Required.

    • in_proposal="yes" or "no" — Optional, defaults to "yes". If "yes", the symbols in this category are part of the encoding proposal and will be shown in the UTC HTML chart. Running gen_html.py with --only_in_proposal filters out symbols whose in_proposal value is not "yes".

Contains zero or more <subcategory> elements.

<subcategory>

Sub-element of the <category> element.

Attributes:

    • name="Weather" — Required.

    • in_proposal="yes" or "no" — Optional, defaults to the parent category's in_proposal value. See the <category> element for details.

Contains zero or more <e> elements.

<e>

Sub-element of the <subcategory> element. An <e> element defines the data and mappings for an Emoji symbol.

A symbol is proposed for new encoding if it has in_proposal="yes" (directly or inherited) and if its unicode attribute is missing.
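The rule above, combined with the in_proposal inheritance from category to subcategory to symbol, can be sketched in Python as follows. This is only an illustration, not the project's own code: the function name proposed_ids and the sample data (apart from the attribute names and the BLACK SUN WITH RAYS entry) are made up for the example.

```python
import xml.etree.ElementTree as ET

# Made-up sample data; only the element and attribute names come from
# the format described in this document.
SAMPLE = """\
<emoji4unicode>
  <category name="1. Nature">
    <subcategory name="Weather">
      <e id="000" name="BLACK SUN WITH RAYS" unicode="2600"/>
      <e id="001" name="HYPOTHETICAL SYMBOL ONE" glyphRefID="5"/>
    </subcategory>
    <subcategory name="Other" in_proposal="no">
      <e id="002" name="HYPOTHETICAL SYMBOL TWO"/>
      <e id="003" name="HYPOTHETICAL SYMBOL THREE" in_proposal="yes" glyphRefID="6"/>
    </subcategory>
  </category>
</emoji4unicode>
"""


def proposed_ids(root):
    """Return the ids of <e> elements that are proposed for new encoding."""
    result = []
    for category in root.findall("category"):
        # <category> defaults to in_proposal="yes".
        cat_flag = category.get("in_proposal", "yes")
        for sub in category.findall("subcategory"):
            # <subcategory> inherits from its parent category.
            sub_flag = sub.get("in_proposal", cat_flag)
            for e in sub.findall("e"):
                # <e> inherits from its parent subcategory.
                e_flag = e.get("in_proposal", sub_flag)
                if e_flag == "yes" and e.get("unicode") is None:
                    result.append(e.get("id"))
    return result


print(proposed_ids(ET.fromstring(SAMPLE)))  # prints ['001', '003']
```

Symbol 000 is excluded because it has a unicode attribute, and symbol 002 because it inherits in_proposal="no" from its subcategory.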

Attributes:

    • id="000" — Required. This is the identifier of the Emoji symbol for the purpose of the encoding proposal. The id is unique and stable.

    • glyphRefID="5" — Optional; required if the symbol is proposed for new encoding. This is the glyph number in the proposal font. The glyphRefID is unique and stable.

    • name="BLACK SUN WITH RAYS" — Required. If the symbol is unified with an existing character (that is, there is a unicode attribute), then the name must match that character's name. Otherwise, this is the proposed name for the proposed symbol character, and it must be different from any other Unicode character name.

    • in_proposal="yes" or "no" — Optional, defaults to the parent subcategory's in_proposal value. See the <category> element for details.

    • unicode="2600" — Optional. If present, then the symbol is unified with an existing Unicode character (e.g., unicode="2600"), or a sequence of characters (e.g., unicode="0031+20E3"). Unicode characters are represented by their code points, written as 4 to 6 hex digits and separated by '+'. The unified character or sequence is used as the symbol representation in the chart.

        ◦ The code point may be preceded by a '*', which indicates that it is for an upcoming character, that is, a character which is expected to be part of Unicode 5.2/AMD6. The code point and name of an upcoming character are preliminary.

        ◦ The code point may be preceded by a '+', which indicates that this is the code point proposed for Unicode 6.0/AMD8.

        ◦ If the unicode attribute is missing, then the proposed code point is one higher than that of the previous <e> element that is proposed for new encoding.

        ◦ If the unicode attribute value is only "+", then the proposed code point is one higher than that of the previous <e> element that is proposed for new encoding and whose proposed code point is at least U+1F300.

    • img_from="docomo" — Optional. If present, and if the symbol is not unified with an existing character, then the image for the specified carrier's equivalent symbol is used as the symbol representation in the chart.

    • text_repr="J-Sky1" — Optional. If present, and if both the unicode attribute and the img_from attribute are absent, then this attribute's text value is used as the symbol representation in the chart.

    • text_fallback="[霧]" — Optional. If present, then this attribute's text value is shown in the chart's table cell for each carrier for which there is no mapping specified. When converting an Emoji symbol to an encoding that does not support it, the text_fallback value may be substituted.

    • docomo="E63E" or docomo=">E63E+E63F" — Optional. Defines a mapping between this symbol and the DoCoMo Emoji symbols, defined by the carrier's 4-hex-digit Unicode PUA code points. This can be a single code point to indicate a round-trip mapping, or one or more code points preceded by a '>' and separated by '+' for a fallback mapping (one-way from symbol to DoCoMo). (Note: The '>' may be represented by '&gt;' in the XML file.)

    • kddi="E488" — Optional. Defines a mapping between this symbol and the KDDI Emoji symbols. See the docomo attribute for more details.

    • softbank="E04A" — Optional. Defines a mapping between this symbol and the SoftBank Emoji symbols. See the docomo attribute for more details.

    • google="FE000" — Optional. Defines a mapping between this symbol and the Google Emoji symbols. See the docomo attribute for more details. (Google uses 5-hex-digit Unicode PUA code points.)
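The value syntax of the unicode attribute and of the carrier mapping attributes (docomo, kddi, softbank, google) can be parsed as in the following Python sketch. The function names and the status labels "existing", "upcoming" and "proposed" are invented for this illustration. It also assumes that a '*' or '+' prefix appears only once, at the start of the attribute value, which the description above does not state explicitly.

```python
import re

# A code point is written as 4 to 6 hex digits.
_CODE_POINT = re.compile(r"[0-9A-Fa-f]{4,6}")


def _parse_code_points(text):
    """Parse 'XXXX+YYYY+...' into a list of integer code points."""
    code_points = []
    for part in text.split("+"):
        if not _CODE_POINT.fullmatch(part):
            raise ValueError(f"bad code point {part!r}")
        code_points.append(int(part, 16))
    return code_points


def parse_unicode_attr(value):
    """Return (status, code_points) for a unicode attribute value.

    status is "existing", "upcoming" ('*' prefix) or "proposed" ('+' prefix).
    code_points is empty for the bare "+" form, where the code point
    is derived from the previous proposed symbol.
    """
    if value == "+":
        return "proposed", []
    status = "existing"
    if value.startswith("*"):
        status, value = "upcoming", value[1:]
    elif value.startswith("+"):
        status, value = "proposed", value[1:]
    return status, _parse_code_points(value)


def parse_carrier_mapping(value):
    """Return (round_trip, code_points) for a carrier mapping attribute.

    A leading '>' marks a one-way fallback mapping. An XML parser has
    already turned '&gt;' into '>' by the time the value is read.
    """
    if value.startswith(">"):
        return False, _parse_code_points(value[1:])
    code_points = _parse_code_points(value)
    if len(code_points) != 1:
        raise ValueError("a round-trip mapping is a single code point")
    return True, code_points
```

For example, parse_carrier_mapping(">E63E+E63F") yields a fallback mapping to two code points, while parse_carrier_mapping("E63E") yields a round-trip mapping to one.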

Contains zero or more <ann> elements.

Contains at most one <desc> element.

Contains at most one <design> element.
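The order in which the unicode, img_from and text_repr attributes determine the chart representation of a symbol can be summarized by the short Python sketch below. The function name and the returned labels are made up for this example. The sketch does not special-case unicode values with a '*' or '+' prefix, because the description above does not spell out how those are shown in the chart.

```python
def chart_representation(attrs):
    """Pick the chart representation from a dict of <e> attributes.

    Returns a (kind, value) pair, following the documented precedence:
    unicode first, then img_from, then text_repr.
    """
    if "unicode" in attrs:
        return ("unicode", attrs["unicode"])
    if "img_from" in attrs:
        return ("image", attrs["img_from"])
    if "text_repr" in attrs:
        return ("text", attrs["text_repr"])
    # None of the three attributes is present.
    return ("none", None)


print(chart_representation({"id": "000", "unicode": "2600", "img_from": "docomo"}))
print(chart_representation({"id": "001", "text_repr": "J-Sky1"}))
```

With xml.etree.ElementTree, the attrib dictionary of an <e> element can be passed in directly.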

<ann>

Sub-element of the <e> element. Contains text for a Unicode character annotation. Each <ann> element holds one annotation line.

Each line must follow the CHAR_ENTRY syntax defined in http://www.unicode.org/Public/UNIDATA/NamesList.html.

For examples see http://www.unicode.org/Public/UNIDATA/NamesList.txt.

No attributes.

Contains no elements.

<desc>

Sub-element of the <e> element. Contains a text description that is copied into the charts.

No attributes.

Contains no elements.

<design>

Sub-element of the <e> element. Contains instruction text for the font and glyph design, which is copied into the charts.

No attributes.

Contains no elements.