Locale Enhancement Project Home

Scope of this project

The Java Locale class has fallen out of date and needs to be enhanced so that language and locale identifiers can be represented without loss of data. Relatively small changes to Locale can bring it up to modern standards and avoid significant problems for companies using Java. The Java community should agree to extend the Locale model to add the features of BCP 47 and CLDR.

Background

Many years ago, the internal structure for Locale was modeled after IETF RFC 1766, which was the industry standard for the representation of languages and locales at the time. But the industry has moved on since then.

1. RFC 1766 has long been obsolete. It has been superseded by IETF BCP 47, which makes a number of important additions needed for the representation of languages. BCP 47 is now the standard used and required by HTML, XML, HTTP, and many other specifications and programs. Among other features, BCP 47 provides the following:
  1. Script codes needed for distinctions among languages that use different writing systems, such as Chinese simplified vs traditional script, or Uzbek in Arabic vs Latin script.
  2. Three-letter base language codes needed to represent such languages as Filipino (fil), the official language of the Philippines. (Three letters are needed to cover the more than 8,000 world languages.)
  3. Three-digit region codes for important variants used in IT such as Latin American Spanish ("es_419").
The absence of these features from Locale is already causing significant implementation problems. For example, when a J2EE Servlet container implementation parses a language tag from an Accept-Language HTTP header and creates a Locale instance, it has nowhere to store a script code in the new Locale object, so that information is lost.
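
The difference between the three-field model and a BCP 47 aware one can be sketched in Java. This is only an illustration, and it assumes a JDK 7 or later runtime, where java.util.Locale provides forLanguageTag, getScript, and toLanguageTag; the class name LanguageTagSketch and its helper methods are hypothetical names chosen for this example.

```java
import java.util.Locale;

// Hypothetical helper class, for illustration only.
final class LanguageTagSketch {

    // Script subtag of a BCP 47 tag, or "" when the tag has none.
    static String scriptOf(String tag) {
        return Locale.forLanguageTag(tag).getScript();
    }

    // Base language subtag; may be a three-letter code such as "fil".
    static String languageOf(String tag) {
        return Locale.forLanguageTag(tag).getLanguage();
    }

    // Region subtag; may be a three-digit code such as "419".
    static String regionOf(String tag) {
        return Locale.forLanguageTag(tag).getCountry();
    }

    // The older (language, country) constructor has no slot for a script.
    // It is deprecated on recent JDKs, hence the suppression.
    @SuppressWarnings("deprecation")
    static String legacyTag(String language, String country) {
        return new Locale(language, country).toLanguageTag();
    }

    public static void main(String[] args) {
        System.out.println(scriptOf("zh-Hant-TW"));   // Hant
        System.out.println(scriptOf("uz-Arab"));      // Arab
        System.out.println(languageOf("fil-PH"));     // fil
        System.out.println(regionOf("es-419"));       // 419
        System.out.println(legacyTag("zh", "TW"));    // zh-TW, script is absent
    }
}
```

A container that builds its Locale from language and country alone ends up with the last form, which is the data loss described above.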

2. BCP 47 Unicode extensions are needed. The Unicode Consortium launched the CLDR (Common Locale Data Repository) project years ago to construct and maintain the standard repository of locale data. Sun Java 6 is also a consumer of CLDR. The Unicode locale model extends BCP 47 with keywords and codes needed in IT. These are required to properly represent locale variants used in industry, such as dictionary vs phonebook sort orders for German.
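
As a rough sketch of what such a keyword looks like in practice, the tag "de-DE-u-co-phonebk" carries the Unicode extension key "co" (collation) with the type "phonebk". The example assumes a JDK 7 or later runtime, where java.util.Locale offers getUnicodeLocaleType; the class UnicodeExtensionSketch and its methods are hypothetical names used only here.

```java
import java.util.Locale;

// Hypothetical helper class, for illustration only.
final class UnicodeExtensionSketch {

    // Type of the Unicode "co" (collation) keyword, or null when absent.
    static String collationType(String tag) {
        return Locale.forLanguageTag(tag).getUnicodeLocaleType("co");
    }

    // Parses a tag and serializes it again, to show the keyword survives.
    static String roundTrip(String tag) {
        return Locale.forLanguageTag(tag).toLanguageTag();
    }

    public static void main(String[] args) {
        // German with phonebook sort order requested via the "u" extension.
        System.out.println(collationType("de-DE-u-co-phonebk")); // phonebk
        // Plain German carries no collation keyword.
        System.out.println(collationType("de-DE"));              // null
        System.out.println(roundTrip("de-DE-u-co-phonebk"));     // de-DE-u-co-phonebk
    }
}
```

Whether a given collator honors the keyword depends on the locale data available at runtime; the sketch only shows that the identifier itself can carry the distinction.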

Project Information

Documentation

Comments