Background
The proposal to encode Emoji symbols as Unicode characters covers the Emoji symbols that are in widespread use by DoCoMo, KDDI and Softbank on their mobile phone networks. These symbols are encoded in carrier-specific versions of Shift-JIS (as User-Defined Characters) and, in the case of KDDI, in a carrier-specific version of ISO-2022-JP. Mapping tables between these character sets are in use in the industry, with both roundtrip and fallback mappings. These symbols are also supported in web mail services by Yahoo! Mail and Google Mail. (Yahoo! Mail currently supports a subset.) (The original proposal also included nine symbols defined by Google, but they were withdrawn from later versions.)
We are taking the following factors into consideration in developing the proposal:
- Source separation rule: If a single carrier separates two characters (anywhere in the character set, including standard JIS codes), then we mapped them to two separate Unicode characters. (This is a hard and fast rule.)
- Reuse: We mapped to existing Unicode symbols where appropriate.
- Separating generic symbols: If
Unicode had a set of related symbols, but no one character in the set
was as generic as in the Emoji symbol sets, then we encoded a new
character. For example, the Emoji sets do not distinguish between
waxing and waning crescent moons.
- Colors and Animation: We encoded symbols as characters, abstracting away from colors and animation. We only distinguished by nominal color or animation for the source separation rule. (See Character Names below.)
- Existing cross-mapping tables: We followed the tables mentioned above as much as possible, but we tentatively disunified in some cases where the visual images were very different and not semantically associated. For example:
- We disunified the 'M' symbol for Metro from the
Metro train image. The 'M' symbol would have translation problems.
(This is similar to the problems with the international currency symbol
and the proposal for a "generic decimal separator".)
- On the other hand, we unified the sets of Zodiac symbols, even though the images shown by carriers vary widely. This is because they clearly belong to a cohesive set which corresponds across carriers.
- Least-marked common symbol: For a set of symbols which each could map to an existing Unicode code point, we chose the symbol that was shared among the most carriers (according to the cross-mapping tables) and had the least-marked form.
| KDDI | | Unicode | | Softbank |
| x | ↔ | X | → | y |
| x | ← | Y | ↔ | y |
| x | ↔ | y | | |
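The source separation rule above lends itself to a mechanical check. The following is a minimal Python sketch, assuming the proposed mappings are available as a dictionary keyed by carrier and carrier code. The sample codes, the target names, and the function name are placeholders for illustration and are not taken from the emoji4unicode data or tools.

```python
from collections import defaultdict

# Hypothetical sample data: (carrier, carrier_code) -> proposed Unicode name.
# The codes and names are placeholders, not real carrier data.
PROPOSED = {
    ("kddi", "K1"): "SYMBOL A",
    ("kddi", "K2"): "SYMBOL B",
    ("softbank", "S1"): "SYMBOL A",
    ("softbank", "S2"): "SYMBOL A",  # one carrier, two codes, one target
}


def source_separation_violations(mapping):
    """Return {(carrier, target): [codes]} for targets that merge a carrier's codes."""
    grouped = defaultdict(list)
    for (carrier, code), target in mapping.items():
        grouped[(carrier, target)].append(code)
    # A violation is any target that receives more than one code from the same carrier.
    return {key: sorted(codes) for key, codes in grouped.items() if len(codes) > 1}


violations = source_separation_violations(PROPOSED)
print(violations)
```

In this sample, the two Softbank codes that share one target are reported, while the same target shared across different carriers is not, since unifying across carriers is allowed.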
Character Names
Proposed character names are typically based on the glosses of the carrier symbols or the visual appearance. Based on the consensus from discussions in the UTC, we used the following guidelines:
- Follow the analogies of existing Unicode character names where possible
- In particular, use "BLACK" for "filled" and "WHITE" for "hollow".
- Exclude color and animation details from proposed character names except where necessary for distinction.
- For cases where color is the only source distinction, the convention is to map to BLACK and WHITE where there are two choices; to BLACK, WHITE, and CHECKERED where there are three; and to BLACK, WHITE, CHECKERED, and STRIPED where there are four.
- Chart annotations will be added to indicate the preferred representations on color devices.
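The color-only naming convention above can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function name and the base name "EXAMPLE SYMBOL" are hypothetical, and the guideline does not say which source color receives which qualifier, so the sketch only produces the set of names in the order the guideline lists them.

```python
# Qualifiers in the order given by the naming guideline.
COLOR_QUALIFIERS = ["BLACK", "WHITE", "CHECKERED", "STRIPED"]


def qualified_names(base_name, variant_count):
    """Build names for `variant_count` symbols that differ only by color."""
    if not 2 <= variant_count <= len(COLOR_QUALIFIERS):
        raise ValueError("the guideline covers two to four color variants")
    # Take the first N qualifiers and prefix each to the base name.
    return [f"{qualifier} {base_name}" for qualifier in COLOR_QUALIFIERS[:variant_count]]


names = qualified_names("EXAMPLE SYMBOL", 3)
print(names)
```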
Documents
See the latest charts:
- Chart of symbols that are unified or proposed for new encoding
- Chart with all symbols and all data
- Chart legend
2009-04-27: L2/09-153 Emoji Ad-Hoc Meeting Report (=N3636)
2009-04-10: L2/09-139 Response to Concerns Raised in N3607 About Encoding Emoji Characters (=N3614)
2009-04-06: L2/09-114 Towards an encoding of symbol characters used as emoji (=N3607) (Everson & Stötzner, Irish and German NB response to the Emoji proposal)
2009-03-05: L2/09-025R2 Proposal for Encoding Emoji Symbols (=N3582)
2009-02-06: L2/09-026R Emoji Symbols Proposed for New Encoding (=N3583)
2009-02-06: L2/09-078 Emoji Sources (=N3585)
2009-01-30: L2/09-025 Proposal for Encoding Emoji Symbols
2009-01-30: L2/09-026 Emoji Symbols Proposed for New Encoding
2009-01-30: L2/09-027 Emoji Symbols: Background Data
2008-08-13: L2/08-323 Scripts Subcommittee Draft Notes and Recommendations to UTC #116
2008-08-12: L2/08-314 Emoticon Core Set - working proposal
2008-08-12: (HTML) Table for Working Draft Proposal for Encoding Emoji Symbols
2008-08-12: L2/08-315 Emoji Symbols: Open Issues
2008-08-12: L2/08-309 Emoji Encoding Proposal: Progress Report
2008-08-11: L2/08-305 Some suggestions about the encoding of national flags as requested by the Emoji proposal (L2/08-081)
2008-07-17: (Doc) Feedback on the Updated Emoji Encoding Proposal (=L2/08-081) [L2/08-106 plus additional feedback from UTC #114]
2008-02-05: L2/08-106 Feedback on the Updated Emoji Encoding Proposal (=L2/08-081) [email feedback before UTC #114]
2008-01-30: L2/08-081 Working Draft Proposal (2) for Encoding Emoji Symbols
2008-01-30: L2/08-080 Emoji Proposal Data (PDF snapshot); Zip file of HTML + images HERE; Temporarily hosted live HERE
2007-08-03: L2/07-257 Working Draft Proposal for Encoding Emoji Symbols (Associated tables in ZIP file)
Resources
Google Data and Tools
Google uses Private Use mappings to represent Emoji ("picture character") symbols in Unicode text. The emoji4unicode project makes these mappings available. This project also provides data and tools that can be used in the development of the encoding proposal. The tools are Python scripts that provide for consistency checks, reports on the data, and chart generation.
This page and its subpages contain the project documentation. Generated HTML charts are available at http://www.unicode.org/~scherer/emoji4unicode/
See the project announcements (blog posts) in English and Japanese.
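Because the symbols are represented with Private Use code points, text that carries them can be spotted by general category. The following is a minimal Python sketch using only the standard `unicodedata` module. It is not one of the project's scripts, the function name is hypothetical, and the sample code point is a placeholder chosen from a Private Use range rather than a documented mapping.

```python
import unicodedata


def private_use_code_points(text):
    """Return the code points in `text` whose general category is Co (private use)."""
    return [ord(ch) for ch in text if unicodedata.category(ch) == "Co"]


# Placeholder sample: U+FE000 lies in Supplementary Private Use Area-A.
sample = "Hi \U000FE000!"
found = private_use_code_points(sample)
print([f"U+{cp:04X}" for cp in found])
```

A check like this only shows that a code point is private use. Interpreting it as a particular Emoji symbol still requires the project's mapping data.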
DoCoMo
- English:
- http://www.nttdocomo.co.jp/english/service/imode/make/content/pictograph/basic/index.html
- http://www.nttdocomo.co.jp/english/service/imode/make/content/pictograph/extention/index.html
- Japanese:
KDDI
SoftBank
- http://creation.mb.softbank.jp/web/web_pic_about.html
- http://www2.developers.softbankmobile.co.jp/dp/tool_dl/download.php?docid=120&companyid=
Conversion Tables
- http://www.nttdocomo.co.jp/binary/pdf/service/imode/mail/imode_mail/emoji_convert/pictogram.pdf (DoCoMo -> KDDI and Softbank)
- http://broadband.mb.softbank.jp/service/3G/mail/pictogram/convert.pdf (Softbank -> DoCoMo and KDDI)
- http://mb.softbank.jp/mb/service/3G/mail/pictogram/list.html (SoftBank -> Disney, KDDI, DoCoMo, and emobile)
- http://www.au.kddi.com/email/emoji/taiohyo/index.html (KDDI -> DoCoMo and Softbank)
Additional non-carrier references
- http://trialgoods.com/emoji/?career=i&page=all (DoCoMo)
- http://trialgoods.com/emoji/?career=au&page=all (KDDI)
- http://cgi.wap2.jp/emoji/ezweb/?act=list (KDDI)
- http://trialgoods.com/emoji/?career=sb&page=all (Softbank)
Related
See also:
- Japanese TV Symbols
- WAP Pictogram Specification approved Version 1.1 -- part of OMA Browsing V2.3 Enabler Specification
- RIS 506 "Music CD Shift-JIS" [need link]
- WingDings fonts [need link]
Arle Lommel sent the following to the emoji4unicode group on 2008-12-27: (Highlighting by Markus Scherer; link to the complete email)
You may have seen this on Unicore, but if not, I have done a comparison of the emoji repertoire with the emoticons used in chat or bulletin board systems from seven major vendors in this area (Skype, Microsoft, Yahoo, America Online, Google, vBulletin, and phpBB). You can download the results from here:
http://dl.getdropbox.com/u/223919/emoticons.pdf
[...]