Unicode and UTF-8
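Everybody is familiar with the ASCII table and how an ASCII character is encoded inside a computer: as one octet. ASCII defines only 128 characters, so each code fits in 7 bits and the top bit of the octet is left at zero.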
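As a rough illustration of the one-octet claim, here is a minimal Python 3 sketch. The helper name `octets` is hypothetical, chosen only for this example. It encodes the letter "A" as ASCII and as UTF-8, and then encodes U+3093 (the hiragana character that appears percent-encoded as %E3%82%93 in one of the links below) as UTF-8, to show that a non-ASCII character needs more than one octet.

```python
# `octets` is a hypothetical helper name, used only for this illustration.
def octets(text, encoding):
    """Return the encoded bytes of `text` as two-digit uppercase hex strings."""
    return [f"{b:02X}" for b in text.encode(encoding)]

ascii_a = octets("A", "ascii")      # an ASCII character occupies one octet
utf8_a = octets("A", "utf-8")       # UTF-8 keeps that same single octet for ASCII
utf8_n = octets("\u3093", "utf-8")  # U+3093 (hiragana "n") takes three octets

print(ascii_a, utf8_a, utf8_n)
# prints: ['41'] ['41'] ['E3', '82', '93']
```

The ASCII and UTF-8 results for "A" are identical, which is why plain ASCII text is also valid UTF-8.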
http://www.joelonsoftware.com/articles/Unicode.html
http://www.ltg.ed.ac.uk/~richard/utf-8.cgi?input=%E3%82%93&mode=char
http://www.sljfaq.org/afaq/encodings.html#encodings-Character-set-vs-encoding
http://en.wikipedia.org/wiki/List_of_Unicode_characters
http://en.wikipedia.org/wiki/UTF-8
http://www.cl.cam.ac.uk/~mgk25/unicode.html#linux