2011-08-15
- Working group organization
- Close this group, merge efforts into i18n group?
- Proposed items for OpenJDK 8
- From Yoshito
- Locale SPI override [major]
- Provides a mechanism for 3rd-party service implementations to override Java's locale support
- Context dependent month names in DateFormat / DateFormatSymbols [major]
- e.g. Russian - "15 августа 2011 г." (year-month-date) vs. "Август" (month only)
- Default locale through system property [major]
- For example, user.locale=<bcp47 tag> to specify the default Locale. If absent, use user.language, user.script, user.country, user.variant.
- Default locale lookup override / Default ResourceBundle.Control override [medium]
- Allow Java users to override the default locale lookup behavior for i18n services
- Default ResourceBundle.Control (for example, NoFallbackControl by default)
- Listener for default Locale / TimeZone change [minor]
- Get notification from the JRE when the default Locale/TimeZone changes
- TimeZone API returning ZONE_OFFSET and DST_OFFSET [minor]
- Otherwise, you cannot actually implement a Calendar subclass properly
- String#equalsIgnoreCase issue
- Definitely a bug
- Who's going to take a look? i18n group?
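The "default locale through system property" item above could be sketched as follows. This is only an illustration under stated assumptions: `user.locale` is the property name proposed in these notes, not an existing JDK property, and `DefaultLocaleSketch.resolve` is a hypothetical helper. The fallback uses the JDK's `user.language`, `user.script`, `user.country` and `user.variant` properties.

```java
import java.util.Locale;
import java.util.Properties;

final class DefaultLocaleSketch {
    /** Hypothetical helper: resolves a default locale from the given properties. */
    static Locale resolve(Properties props) {
        // "user.locale" is the name proposed in the notes (an assumption, not a JDK property).
        String tag = props.getProperty("user.locale");
        if (tag != null && !tag.isEmpty()) {
            return Locale.forLanguageTag(tag);
        }
        // Fallback: assemble the locale from the individual fields.
        // Locale.Builder throws IllformedLocaleException for ill-formed values.
        return new Locale.Builder()
                .setLanguage(props.getProperty("user.language", ""))
                .setScript(props.getProperty("user.script", ""))
                .setRegion(props.getProperty("user.country", ""))
                .setVariant(props.getProperty("user.variant", ""))
                .build();
    }

    public static void main(String[] args) {
        Properties p = new Properties();
        p.setProperty("user.locale", "sr-Latn-RS");
        System.out.println(resolve(p).toLanguageTag()); // prints "sr-Latn-RS"
    }
}
```

A real implementation would presumably read `System.getProperties()` once at startup; passing a `Properties` object here just keeps the sketch easy to exercise.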
2010-09-13
- Naoto pushed locale changes to the client repository.
- Wrap up
- default script/extensions - Naoto will work on this along with the user/system locale separation task
2010-08-30
- Naoto's code review comments
> - Instead of String.equals(), use of "==" for comparing locale elements (language/script/country/variant), as they are all intern'ed.
I think you're talking about BaseLocale.equals. It actually does not matter whether we use String#equals() or ==, because it does not create multiple BaseLocale instances with same language/script/country/variant. So technically, BaseLocale.equals() can be simply - return (this == obj). I wanted to make the implementation more robust - even we change the implementation, BaseLocale.equals() continue to work properly. But if you feel this strongly, it's OK to update the code to use "==" (although it does not have any performance improvement with it). For LocaleExtensions.equals - it is not intern'd - and we probably do not need to do so, because non-empty extensions are not commonly used.
Leave it, because no performance merit is expected with ==.
> - Remove JDKIMPL check. Especially the code block for !JDKIMPL that won't be executed.
OK - will do.
> Locale.java
> - private Locale(lang,scrpt,ctry,vrnt) constructor is redundant. Its implementation can just be moved to Locale(lang,ctry,vrnt) with script = "".
OK - will do.
> - Do we need two caches, one in BaseLocale and the other in Locale which in most cases a duplicate (excl. EMPTY_EXTENSIONS)? Can we get rid of the redundancy?
The idea was - When Locale is created by factory methods (Builder / forLanguageTag), we do not want to create multiple Locale instances representing the same locale. Thus, we have Locale cache - However, we cannot get rid of Locales created by the constructors. For them - we just try to avoid multiple BaseLocale instances representing the same locale are created. Thus, we also do BaseLocale level caching. The overhead is - one extra map structure. But we do not create redundant BaseLocale objects, so it should not be so bad.
This is OK.
> - toLowerCase() can be removed with the replacement of AsciiUtils.toLowerString()
I think it's not a good idea because String.toLowerCase() depends on a locale.
- It introduces cyclic dependency
- Locale sensitivity is harmful
- AsciiUtil.toLowerString() is the lightest possible implementation for this.
Actually, Naoto pointed out we do not need another toLowerCase (private static method) in Locale. Agreed to remove the existing toLowerCase, use AsciiUtil.toLowerString instead.
> - hashCode() comment looks wrong with the change.
Can you tell me which comment you're talking about? BTW, I realized that Locale.hashCode should probably use XOR, not OR. I'll update this.
We probably should keep the transient field to store calculated hashcode.
> ResourceBundle.java
> - no need to import ResourceBundle.Control
Right. I'll remove the import.
> Extension.java
> - Should implement a proper hashCode() and cache the value, as in BaseLocale
OK. It is not necessary, but if that is the convention, I'll do so.
Yoshito to review this again. We may intern LocaleExtensions' String representation.
> InternalLocaleBuilder.java
> - No need to define LOCALESEP, as it is a duplicate of BaseLocale.SEP
OK. I'll update.
> LocaleSyntaxException.java
> - Do we really need this? Can InternalLocaleBuilder just throw IllformedLocaleException?
Actually, it is not necessary. But it makes us much easier to share the implementation between JDK and ICU. Is this a problem?
OK to keep it. It's pure implementation code, no public APIs involved. We can change this later if we really want.
> StringTokenIterator.java
> - Need GPL2 copyright header
Sorry, I missed it. I'll update.
> (not in the list)
> make/java/java/FILES_java.gmk
> - should contain new files.
OK, I'll update.
2010-08-23
- Unit Test question
- make jdk_text output example
- Naoto thinks there is a quick workaround - if it works, he will send the info to project members
- Naoto will also ask other Java folks how to resolve the issue.
--------------------------------------------------
TEST: java/text/Collator/Bug5047314.java
JDK under test: (/home/yoshito/JavaLocale/locale-enhancement/build/linux-i586/j2sdk-image)
openjdk version "1.7.0-internal"
OpenJDK Runtime Environment (build 1.7.0-internal-yoshito_2010_08_23_12_13-b00)
Java HotSpot(TM) Client VM (build 19.0-b05, mixed mode)

ACTION: build -- Failed. Compilation failed: Compilation failed
REASON: Named class compiled on demand
TIME: 0.055 seconds
messages:
command: build Bug5047314
reason: Named class compiled on demand
elapsed time (seconds): 0.055

ACTION: compile -- Failed. Compilation failed: Compilation failed
REASON: .class file out of date or does not exist
TIME: 0.053 seconds
messages:
command: compile /home/yoshito/JavaLocale/locale-enhancement/test/java/text/Collator/Bug5047314.java
reason: .class file out of date or does not exist
elapsed time (seconds): 0.053
direct:
/home/yoshito/JavaLocale/locale-enhancement/test/java/text/Collator/Bug5047314.java:35: cannot access BaseLocale
    private static Collator colLao = Collator.getInstance(new Locale("lo"));
    ^
class file for sun.util.locale.BaseLocale not found
1 error
- Code review
- Yoshito to generate webrev
- Internal code review done by Aug 26 (Thu)
- Others
- user.script / user.extensions? - not now. Need another API change proposal. Naoto is planning to separate display locale from system locale and this small change might be done at the same time.
- localization for script/region names - Yoshito to provide name for Hans/Hant in Chinese locale data.
- zh_Hans/zh_Hant in available locale list? - Naoto will check with others to see if this makes sense. If we think we need this later, then we can do it as a regular bug fix.
2010-08-16
- Minor doc updates were done last week. The tip documents are final.
- Serialization doc, doc link fix
- Minor doc updates
- Proposal status
- Code completion target - end of this week
- Copyright comment
2010-08-09
- IllformedLocaleException
- serialization
- Remaining work
- LocaleServiceProvider implementation
- sun.util.LocaleServiceProviderPool
- Can we use the default implementation of ResourceBundle.Control?
- test coverage
- Calendar/NumberFormat to use ca-japanese / nu-thai
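As a rough illustration of the last item, the sketch below shows how a service could read the requested calendar or numbering type from the Unicode locale extension. `Locale.forLanguageTag` and `Locale.getUnicodeLocaleType` are the `java.util.Locale` methods as they shipped in Java 7; the `requested` helper and the class name are hypothetical. Whether `Calendar` and `NumberFormat` honor the extension depends on the JDK in use, so the `main` output for those two calls is deliberately not stated.

```java
import java.text.NumberFormat;
import java.util.Calendar;
import java.util.Locale;

final class ExtensionAwareServices {
    /** Hypothetical helper: the type requested for a Unicode locale key, or "" if none. */
    static String requested(Locale locale, String key) {
        String type = locale.getUnicodeLocaleType(key);
        return type == null ? "" : type;
    }

    public static void main(String[] args) {
        Locale ja = Locale.forLanguageTag("ja-JP-u-ca-japanese");
        Locale th = Locale.forLanguageTag("th-TH-u-nu-thai");
        System.out.println(requested(ja, "ca")); // prints "japanese"
        System.out.println(requested(th, "nu")); // prints "thai"
        // What the services do with the request is JDK-dependent.
        System.out.println(Calendar.getInstance(ja).getClass().getSimpleName());
        System.out.println(NumberFormat.getInstance(th).format(2010));
    }
}
```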
2010-07-19
- Norwegian candidate locale update
- http://sites.google.com/site/openjdklocale/design-notes/resource-bundle-lookup-order
- Proposed update for Nynorsk
- no_NO_NY is mapped to "nn_NO" at the beginning
- generate standard ordering down to "nn"
- append "no_NO_NY", "no_NO", "no" after "nn"
- apply step 2 & 3 for request with language "nn"
- Builder#setLocale with no_NO_NY
- Changed to "nn_NO" internally
- Consistent with toLanguageTag
- Resource/service lookup should be compatible
- ja_JP_JP / th_TH_TH - still open
- Constructor new Locale(language, country, variant) appends u-ca-japanese / u-nu-thai
- toLanguageTag "ja-JP-u-ca-japanese-x-lvariant-JP"
- forLanguageTag creates Locale("ja", "JP", "JP") for "ja-JP-u-ca-japanese-x-lvariant-JP"? Or for "ja-JP-x-lvariant-JP"?
- Builder#setLocale(new Locale("ja", "JP", "JP")) -> variant JP is dropped off
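The proposed Nynorsk steps above can be sketched with plain strings. This is a minimal illustration of the ordering described in these notes, not the JDK's `ResourceBundle.Control` code; `NynorskCandidates.candidates` is a hypothetical helper, and the empty string stands for the root bundle.

```java
import java.util.ArrayList;
import java.util.List;

final class NynorskCandidates {
    /** Returns candidate bundle suffixes; "" stands for the root bundle. */
    static List<String> candidates(String language, String country, String variant) {
        String lang = language;
        String vrnt = variant;
        // Step 1: no_NO_NY is mapped to nn_NO at the beginning.
        if (lang.equals("no") && country.equals("NO") && vrnt.equals("NY")) {
            lang = "nn";
            vrnt = "";
        }
        List<String> out = new ArrayList<>();
        // Step 2: standard ordering down to the bare language.
        if (!country.isEmpty() && !vrnt.isEmpty()) {
            out.add(lang + "_" + country + "_" + vrnt);
        }
        if (!country.isEmpty()) {
            out.add(lang + "_" + country);
        }
        out.add(lang);
        // Step 3: the legacy Norwegian names are appended after "nn".
        if (lang.equals("nn")) {
            out.add("no_NO_NY");
            out.add("no_NO");
            out.add("no");
        }
        out.add(""); // root
        return out;
    }

    public static void main(String[] args) {
        // prints "[nn_NO, nn, no_NO_NY, no_NO, no, ]"
        System.out.println(candidates("no", "NO", "NY"));
    }
}
```

Because steps 2 and 3 also run when the request language is already "nn", a request for nn_NO yields the same list as no_NO_NY.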
- Naoto's review comments
- http://icu-project.org/~srl/tmp/blend.html
- java.util.IllformedLocaleException (class description): What is "a value" here? It could mean the argument passed to setXXXX(tag) is ill-formed, or that the locale passed to setLocale() contains some ill-formed fields. More description (with the definition of "ill-formed") would be helpful.
- java.util.Locale (class description):
- "non-conforming locales" should be something like "non-BCP47-conforming", otherwise some developers would take it as invalid Java locales.
- "Compatibility" section in the Locale class description: "original behavior" should be replaced with something like "behavior prior to 1.7".
- The term 'The new BCP47 APIs' needs to be clearly defined (assuming builders and factory methods). BTW, we should not use "new" here, as things get old eventually.
- Locale.toLanguageTag(): "ASCII characters" should be replaced with letters from 'a' to 'z', 'A' to 'Z', or digits. Or define 'alphanumerics' here, which is also being used.
- Locale.toLanguageTag()/forLanguageTag(): Add a note about no roundtrip guarantee.
- Locale.getISOXXX(): Do we really need to provide links to on-line listing of codes? They are fragile and easily get broken. We could instead just refer to the ISO website for users that need more information.
- ResourceBundle.getBundle(String, Locale, ClassLoader):
- In the fallback example, "if country is an empty string, the second and the fourth candidate bundle names are omitted". Should that "fourth" be "fifth"?
- As to the "parent chain", "iterates over the rest of the candidate bundle names generated from either the specified locale or the default locale, then the base name alone at the end." Is this correct? I thought that parent chain is obtained just by chopping each part off.
- I'd expect there'd be special fallback cases for Chinese/Norwegian.
- Compatibility Note for CCC
- Locale#toString
- When script/extensions are available, extra subtags are added to the result
- But the extra portion looks like a variant to the existing implementation, so the impact should be minimal
- ja_JP_JP / th_TH_TH produce a different string
- Equality of two Locales
- no longer done by 3 fields - language/country/variant
- equality should be checked by equals
- ResourceBundle
- Produces different candidate list - Chinese / Norwegian
- Norwegian bundle may mix up
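The toString and equality notes above can be seen with `java.util.Locale` as it shipped in Java 7. In the sketch below, `sameLegacyFields` is a hypothetical helper that mimics an old-style three-field comparison, to show why equality should be checked with `equals`.

```java
import java.util.Locale;

final class ToStringCompat {
    /** Hypothetical legacy-style check that only looks at language/country/variant. */
    static boolean sameLegacyFields(Locale a, Locale b) {
        return a.getLanguage().equals(b.getLanguage())
                && a.getCountry().equals(b.getCountry())
                && a.getVariant().equals(b.getVariant());
    }

    public static void main(String[] args) {
        Locale plain = Locale.forLanguageTag("zh-CN");
        Locale hans = Locale.forLanguageTag("zh-Hans-CN");
        System.out.println(plain);                         // prints "zh_CN"
        System.out.println(hans);                          // prints "zh_CN_#Hans"
        System.out.println(sameLegacyFields(plain, hans)); // prints "true"
        System.out.println(plain.equals(hans));            // prints "false"
    }
}
```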
2010-07-12
- Remaining review
- ResourceBundle lookup behavior review - See this doc
- Schedule
- July 14 (Wed) to finish API doc updates (Doug) - other folks can review them on Thu/Fri
- July 18 (Sun) to complete the implementation - Locale/ResourceBundle - ready for API review
- July 19 (Mon) - final review before updating CCC - conference call evening
2010-07-02
- Unicode Locale Extension
- attribute and typeless keyword
- support in Java i18n service classes
- ja_JP_JP and th_TH_TH
- others
- Builder API Review - See this doc
- left over from last meeting
- field removal - null or empty string
- Locale API Review - See this doc
2010-06-28
- Current Status
- Unicode Locale Extension (-u-) APIs
- Yoshito is starting to feel we may remove the APIs for accessing Unicode Locale Extensions, for the following reasons
- -u- extension "attribute" is out of picture for now
- typeless keyword does not fit well with the current API proposal
- Users can still get/set -u- extension value
- Will make final conclusion on next call - July 2, 2010.
- Builder API Review - See this doc
- See comments in green background in the document above.
- Work Items for this week
- u extension attribute/typeless keyword
- Review new APIs/API changes in Locale class on July 2nd (next call).
2009-11-03
- Unicode locale extension draft
- unicode-extension = "u" 0*("-" attribute) 1*("-" keyword)
- attribute = 3*8alphanum
- keyword = key 0*("-" type)
- key = 2alphanum
- type = 3*8alphanum
- two items we may need to find out what to do
- attribute (not used currently, but reserved for future extensions)
- valueless keyword (not sure if we allow this or not. at this moment, not used)
- Yoshito: We want to align the API design to the syntax, even if it is just reserved for future extensions. Otherwise, we need an API/behavior change when such reserved syntax is actually used.
- Yoshito: We do not want to make any changes in the proposal until the spec becomes stable.
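The draft grammar above translates fairly directly into a regular expression. The following is a minimal sketch of a well-formedness check for that draft only (the grammar was still in flux at the time); the class and method names are hypothetical. It accepts the reserved attribute and valueless-keyword forms, since the draft syntax permits them.

```java
import java.util.regex.Pattern;

final class UnicodeExtensionSyntax {
    private static final String ATTRIBUTE = "[a-z0-9]{3,8}"; // attribute = 3*8alphanum
    private static final String KEY = "[a-z0-9]{2}";         // key = 2alphanum
    private static final String TYPE = "[a-z0-9]{3,8}";      // type = 3*8alphanum

    // unicode-extension = "u" 0*("-" attribute) 1*("-" keyword)
    // keyword = key 0*("-" type)
    private static final Pattern EXTENSION = Pattern.compile(
            "u(?:-" + ATTRIBUTE + ")*(?:-" + KEY + "(?:-" + TYPE + ")*)+",
            Pattern.CASE_INSENSITIVE);

    /** Returns true if the whole string matches the draft unicode-extension syntax. */
    static boolean isWellFormed(String extension) {
        return EXTENSION.matcher(extension).matches();
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("u-ca-japanese-cu-jpy")); // prints "true"
        System.out.println(isWellFormed("u-ca"));                 // prints "true" (valueless keyword)
        System.out.println(isWellFormed("u-japanese"));           // prints "false" (no keyword)
    }
}
```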
- Builder#setLocale with ill-formed Locales supported by Java 6
- Q: what do we want if no_NO_NY / ja_JP_JP / th_TH_TH is passed in?
- Option 1: Accept it as is in lenient variant mode, otherwise, throws an exception
- Option 2: Accept it as is in lenient variant mode, map to a semantically equivalent well-formed tag by default (no_NO_NY -> nn-NO, ja_JP_JP -> ja-JP-u-ca-japanese,...)
- Steven/Yoshito think Option 2 is preferred
- forLanguageTag("ja-JP-u-ca-japanese") should return a Locale with variant "JP"
- no_NO_NY is transformed to nn_NO in the strict mode, but ResourceBundle/Locale service lookup will support the fallback between no_NO_NY and nn_NO.
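For reference, `java.util.Locale` as it shipped in Java 7 behaves along the lines of Option 2 for these three legacy locales. The sketch below only exercises documented `Locale` and `Locale.Builder` behavior; the class and the `viaBuilder` helper are hypothetical. It does not check the forLanguageTag("ja-JP-u-ca-japanese") wish listed above.

```java
import java.util.Locale;

final class LegacyLocaleTags {
    /** Hypothetical helper: passes a locale through Builder#setLocale and returns its tag. */
    static String viaBuilder(Locale locale) {
        return new Locale.Builder().setLocale(locale).build().toLanguageTag();
    }

    @SuppressWarnings("deprecation") // the constructor is the only way to create these legacy locales
    public static void main(String[] args) {
        Locale jaJPJP = new Locale("ja", "JP", "JP");
        Locale noNONY = new Locale("no", "NO", "NY");
        System.out.println(jaJPJP.toLanguageTag()); // prints "ja-JP-u-ca-japanese-x-lvariant-JP"
        System.out.println(viaBuilder(jaJPJP));     // prints "ja-JP-u-ca-japanese"
        System.out.println(viaBuilder(noNONY));     // prints "nn-NO"
    }
}
```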
2009-10-20
- Extension as a class or an interface
- Extension is a pair of key and value, or just value?
- UnicodeLocaleExtension extends Extension
- UnicodeLocaleExtension - getKey() to return Unicode locale key / getType() to return Unicode locale type
- Locale#getExtension(char key) to return Extension?
- Typical use case
- UnicodeLocaleExtension uext = (UnicodeLocaleExtension)locale.getExtension('u');
- String calType = uext.getType("ca");
- Some folks are worrying about future extension registration
- Returning new Extension subtype for a new extension - is this a breaking change?
- Do we need "extension factory" registration mechanism?
- Impact in Builder
- Current proposal
- Builder setExtension(char key, String value)
- Builder setLDMLExtensionValue(String key, String value)
- Should we change them to
- Builder setExtension(Extension ext)
- and.. ExtensionBuilder to set extension value?
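For comparison with the object-based design discussed above, the API that eventually shipped in Java 7 kept extensions as plain strings: `Locale.Builder#setExtension(char, String)` sets a value and `Locale#getExtension(char)` returns a `String`, with `getUnicodeLocaleType` as a convenience for the 'u' extension. A minimal sketch (class and method names are hypothetical):

```java
import java.util.Locale;

final class ExtensionAccess {
    /** Builds ja-JP with the Unicode locale extension "ca-japanese". */
    static Locale japaneseCalendarLocale() {
        return new Locale.Builder()
                .setLanguage("ja")
                .setRegion("JP")
                .setExtension(Locale.UNICODE_LOCALE_EXTENSION, "ca-japanese")
                .build();
    }

    public static void main(String[] args) {
        Locale locale = japaneseCalendarLocale();
        System.out.println(locale.getExtension('u'));          // prints "ca-japanese"
        System.out.println(locale.getUnicodeLocaleType("ca")); // prints "japanese"
    }
}
```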
2009-09-22
- The proposal was submitted
- strong typing for script, such as enum?
- getAvailableLocales to return a list or iterator?
- class doc to explain toLanguageTag/forLanguageTag is the recommended way to convert Locale to/from String?
- getISO3XXX
2009-09-08
- No syntactical restrictions on variant field for backward compatibility
- Revisit toString() topic
- Revised API doc review
- No normalization for no_NO_NY in Locale construction
2009-07-14
- Variant casing
- Case sensitive for now. What you set is what you get - new Locale("ja", "JP", "Yoshito").getVariant() returns "Yoshito"
- ja_JP_Yoshito is not equal to ja_JP_YOSHITO
- All examples in the API doc and actual variants used by JDK are upper case letters.
- Multi-segment variant
- Java 6 API doc says - "The variant argument is a vendor or browser-specific code. For example, use WIN for Windows, MAC for Macintosh, and POSIX for POSIX. Where there are two variants, separate them with an underscore, and put the most important one first. For example, a Traditional Spanish collation might construct a locale with parameters for language, country and variant as: "es", "ES", "Traditional_WIN"."
- Why "put the most important one first"? This description seems to expect that right-to-left truncation may happen during lookup, but it does not.
- ResourceBundleControl implementation
- getCandidateLocales
- base logic - when script field exists, truncate variant, country in this order, then insert locales with variant/country without script before language only locale.
- language-script-country-variant -> language-script-country -> language-script -> language-country-variant -> language-country -> language -> ROOT
- special cases
- Chinese: supply script first, then follow the base logic above.
- Question -
- Hans for CN - others? SG
- Hant for TW - others? HK, MO
- Norwegian:
- input no_NO_NY: nn_NO -> nn -> no_NO_NY -> no_NO -> no
- Hebrew, Yiddish and Indonesian
- Out of standard framework - implemented as a special logic in getBundleImpl
- When a locale taken from getCandidateLocales has language "iw", "ji" or "in", generate 2 bundle names - one with valid language code first (for example "he"), then one with deprecated language code (for example, "iw").
- Just document the behavior.
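The base logic for getCandidateLocales described above (before any of the Chinese, Norwegian or Hebrew/Yiddish/Indonesian special cases) can be sketched with plain strings. This is an illustration of the ordering in these notes, not the JDK implementation; `CandidateBaseLogic` and its methods are hypothetical, and the empty string stands for ROOT.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class CandidateBaseLogic {
    /** Joins the non-empty parts with '_'. */
    private static String join(String... parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            if (!part.isEmpty()) {
                if (sb.length() > 0) {
                    sb.append('_');
                }
                sb.append(part);
            }
        }
        return sb.toString();
    }

    /** Returns candidate names in lookup order; "" stands for ROOT. */
    static List<String> candidates(String lang, String script, String country, String variant) {
        Set<String> out = new LinkedHashSet<>(); // keeps order, drops duplicates
        if (!script.isEmpty()) {
            // With script: truncate variant, then country.
            out.add(join(lang, script, country, variant));
            out.add(join(lang, script, country));
            out.add(join(lang, script));
        }
        // Then the script-less names, before the language-only name.
        out.add(join(lang, country, variant));
        out.add(join(lang, country));
        out.add(lang);
        out.add(""); // ROOT
        return new ArrayList<>(out);
    }

    public static void main(String[] args) {
        // prints "[sr_Latn_RS_POSIX, sr_Latn_RS, sr_Latn, sr_RS_POSIX, sr_RS, sr, ]"
        System.out.println(candidates("sr", "Latn", "RS", "POSIX"));
    }
}
```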
- Code update: language tag to use the canonical form in BCP47
- script in title case -> Latn, Hans
- region in upper case -> US, CN
- variant in lower case
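The casing rules for language, script and region can be observed with `Locale.forLanguageTag` and `Locale.toLanguageTag` as they later shipped. The sketch below covers only those three fields; variant casing is left out, and `canonicalize` is a hypothetical helper.

```java
import java.util.Locale;

final class CanonicalCasing {
    /** Parses a BCP47 tag and re-emits it with canonical casing. */
    static String canonicalize(String tag) {
        return Locale.forLanguageTag(tag).toLanguageTag();
    }

    public static void main(String[] args) {
        System.out.println(canonicalize("ZH-hans-cn")); // prints "zh-Hans-CN"
        System.out.println(canonicalize("EN-latn-us")); // prints "en-Latn-US"
    }
}
```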
2009-06-16
- With candidate chain - he and iw won't co-exist, thus, inheritance chain like he_IL, iw_IL, he, iw won't be a problem
- To make the inheritance chain consistent, we may always start with "he" for lookup even if "iw" is given
- We may apply this for zh_Hans/Hant case - for example, for given locale "zh_CN", the candidate chain would be always "zh_Hans_CN", "zh_CN", "zh_Hans", "zh". The same chain is constructed when zh_Hans is supplied.
- For Norwegian case - not sure.. does it work?
- Agreement - If we clearly write down all the exceptional cases precisely, this approach should be fine.
2009-05-26
2009-05-12
2009-04-14
- Still waiting for the JDK7 Feature proposal template
- Naoto to check with Mark about the status
- Translation coverage
- JDK7 supported locales
- For now, we assume the set is same with Sun JDK6
- no keyword display names/types?
- Set of codes included in -
- getISOLanguages
- getISOCountries
- Keep them as is (of course, add new codes if the base standard - ISO639.1/ISO3166.1 are updated)
- Do not add 3 digit UN territory code into getISOCountries
- Clearly state that Locale does not limit valid code by the sets returned by these methods
- getISOScripts?
- If we do not have a method supporting valid language/region code list for BCP47, it is not worth adding such method now. Revisit after JDK7.
- Default locale customization
- user.script
- extensions
- Existing system properties such as user.language, user.region are Sun's implementation, not a part of API/Java specification
- Agreed to add system properties to allow Java user to supply script/extensions
- extensions value is in BCP47 extension format, such as u-ca-japanese
2009-03-31
- JDK7 Feature proposal
- Waiting for the feature proposal template coming out
- Shooting for M4 (June -)?
- Locale service lookup and extensions
- Comments from Masayoshi/Naoto
- User supplied Locale service provider implementation to override Java's default locale service implementation?
- Not discussed in the last call
- Limiting capability of LSP framework
- What can we do for this? What do we need to continue to protect?
- Resource/service lookup ordering
- Not discussed in the last call
- Please provide comments on this design note
- Yoshito to update ResourceBundle/LocaleServiceProvider implementation based on the current design proposal by next bi-weekly meeting. -> not yet ready
2009-03-17
- Are we ready for submitting JDK7 API change proposal?
- We assume all open API design discussions are settled.
- Basically, attendees are happy with the current proposed API set (we may want just one more - see Locale service lookup and extensions below)
- OpenJDK procedure questions
- Yoshito posted a question to jdk7-dev list and also sent a note to Mark Reinhold, but still did not receive any responses.
- Anyone from Sun can help?
- Locale service lookup and extensions
- Discussed in the ML
- The service lookup is done only by base locale
- The service invocation includes extensions
- We may want to have Locale#getBaseLocale() or Locale#hasSameBaseLocale(Locale). Yoshito will look into the LSP lookup implementation.
- User supplied Locale service provider implementation to override Java's default locale service implementation?
- Limiting capability of LSP framework
- What can we do for this? What do we need to continue to protect?
- Resource/service lookup ordering
- Please provide comments on this design note
- Yoshito to update ResourceBundle/LocaleServiceProvider implementation based on the current design proposal by next bi-weekly meeting.
2009-03-10
- Should we normalize language field to empty string when language tag "und" is used in Locale constructors/Builder?
- We want BCP47 tag "und" to be mapped to Locale.ROOT.
- To be consistent with above, Locale.ROOT and new Locale("und") should be same.
- Agreement
- We decided not to make such canonicalization in the Locale constructor at all (other than existing one)
- Both Locale.ROOT and new Locale("und") are transformed to BCP47 tag "und". BCP47 tag "und" is always converted back to Locale.ROOT.
- We'll explain -
- Use of ISO 3-letter language code which has 2-letter version is illegal in API doc, but the implementation does not check this.
- Locale "und" is different from Locale.ROOT, but both are converted to BCP47 tag "und".
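The agreement above matches how `java.util.Locale` behaves in the API that shipped. A small sketch, with a hypothetical `roundTripsToRoot` helper:

```java
import java.util.Locale;

final class UndAndRoot {
    /** True if converting to a BCP47 tag and back yields Locale.ROOT. */
    static boolean roundTripsToRoot(Locale locale) {
        return Locale.forLanguageTag(locale.toLanguageTag()).equals(Locale.ROOT);
    }

    @SuppressWarnings("deprecation") // constructor used on purpose: no normalization happens there
    public static void main(String[] args) {
        Locale und = new Locale("und");
        System.out.println(Locale.ROOT.toLanguageTag()); // prints "und"
        System.out.println(und.toLanguageTag());         // prints "und"
        System.out.println(und.equals(Locale.ROOT));     // prints "false"
        System.out.println(roundTripsToRoot(und));       // prints "true"
    }
}
```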
2009-03-03
- The status of Unicode Locale Identifier syntax
- CLDR committee members agreed to use the BCP47 syntax
- Allocate 'u' for Unicode Locale Keywords
- Updating the specification in CLDR 1.7
- Mark to write the internet draft for the singleton allocation as soon as the current RFC4646bis/4645bis are published as new RFCs (updating BCP47)
- Each key/type is represented by a pair of single extension subtags
- e.g. "ca-japanese-cu-jpy"
- key names are sorted in alphabetical order
- case insensitive (normalized to lower case letters)
- CLDR project to define short key names - always length of 2
- collation -> "co"
- calendar -> "ca"
- numbers -> "nu"
- CLDR project to define shorter type names if the current type names do not satisfy the BCP47 extension subtag requirement
- Always 3 to 8 characters
- e.g. collation type "phonebook" -> "phonebk"
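The ordering and case rules above (keys sorted alphabetically, everything normalized to lower case) are reflected in `Locale.Builder#setUnicodeLocaleKeyword` as it later shipped. In the sketch below, `canonicalExtension` is a hypothetical helper that takes key/type pairs in any order and case.

```java
import java.util.Locale;

final class KeywordOrdering {
    /** Builds the canonical 'u' extension string from alternating key, type arguments. */
    static String canonicalExtension(String... keyTypePairs) {
        Locale.Builder builder = new Locale.Builder().setLanguage("en");
        for (int i = 0; i + 1 < keyTypePairs.length; i += 2) {
            builder.setUnicodeLocaleKeyword(keyTypePairs[i], keyTypePairs[i + 1]);
        }
        return builder.build().getExtension(Locale.UNICODE_LOCALE_EXTENSION);
    }

    public static void main(String[] args) {
        // Keys given out of order and in mixed case.
        System.out.println(canonicalExtension("NU", "Thai", "ca", "japanese"));
        // prints "ca-japanese-nu-thai"
    }
}
```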
- The initial API implementation is committed to the OpenJDK Locale Enhancement repository
- Locale to store fields by BaseLocale and LocaleExtensions
- Accessors for new fields - script, extension (and locale keywords) and private use
- getDisplayScript (with English name data), one API addition in LocaleNameServiceProvider
- forLanguageTag/toLanguageTag
- Open issues in the current implementation
- Where to normalize language code
- Deprecated mappings - "he" -> "iw"
- 3 letter code to 2 letter code - "eng" -> "en"
- Grandfathered locales with no mappings in forLanguageTag
- extlang handling
- Locale display name - with script display name
- LocaleBuilder#create() and createStrict()
- Strict language tag parsing
- Canonical equivalence - new method - isEqualTo(Locale)?
- Locale "iw" vs "he"
- Locale "ja_JP_JP" vs "ja_JP_u_calendar
- Next steps
- Define ResourceBundle suffix string with script
- ResourceBundle.Control#getCandidateList()
- Do we want to have the implementation in Locale? If so, should we make it public?
- Other APIs to be exposed?
- API review
- Yoshito/Doug come up with all proposed JDK7 APIs by March 10.
- Collect feedback for currently proposed APIs - make necessary updates by March 10
- Schedule an extra API review call on March 10 including APIs proposed in this week.
2009-02-17
- The latest proposed API is posted here - Yoshito will send the link to the ML for review
- Doug and Yoshito will provide the working implementation for these proposed APIs by March 3. See below.
- wrap up toString() problem
- No changes at all. toString() will still return a string without script/extensions
- API review 2 - builder APIs
- Walked through proposed APIs
- Doug suggested changing get() -> create()
- Initial implementation for these proposed APIs by March 3 (next call)
- Future milestones
- March 3 - The initial proposed API set is ready for evaluation
- March 17 - API freeze for JDK7.
- March 18 - Submit API change request to CCC for JDK7.
- April 1 ? - feature freeze
- Yoshito will update Unicode Locale Identifier syntax changes proposal status (@key1=type1;key2=type2... -> BCP47 format)
2009-02-02
- API review 1
- public String getScript() Agreed
- Returns the script code for this locale.
- If this locale does not have a script, an empty string "" is returned.
- e.g. "Hans", "Cyrl", ...
- public String getDisplayScript(Locale inLocale) Agreed
- Returns a name for the locale's script that is appropriate for display to the user in the specified locale.
- If localized display name is not available, the script code itself is returned.
- e.g. "Simplified Chinese", "キリル文字"
- public String getDisplayScript() Agreed
- Returns a name for the locale's script that is appropriate for display to the user in the default locale.
- If localized display name is not available, the script code itself is returned.
- public Iterator<String> getKeywords()
- Returns an iterator over keyword names for this locale.
- If no keywords are available in this locale, null is returned.
- A BCP47 extension is interpreted as a keyword. For example, for the Locale representing BCP47 language tag "en-US-a-xyz", "a" is a keyword name and "xyz" is its value.
- A BCP47 private use is also interpreted as a keyword. For example, for the Locale representing BCP47 language tag "en-US-x-jdk", "x" is a keyword name and "jdk" is its value.
- Alternative suggestion for JDK7 - store extension/private-use in a single String
- Will continue to discuss this in ML - will decide what to do in the next conf call.
- public String getKeywordValue(String keywordName)
- Returns the value for the specified keyword name.
- If the specified keyword name is not available, an empty string "" is returned.
- Same as above.
- public Locale getBaseLocale()
- Returns a Locale without keywords (LDML keywords/extensions/privateuse)
- Probably still makes sense even if we do not support keyword name/value pairs explicitly - will decide what to do in the next conf call.
- Locale identifier and conversions
- toString() stability issue
- Do not want to introduce the script field because of backward compatibility concerns.
- JDK6 API Doc says - "programmatic name", but it's probably not true even now. When language and country are empty, this method returns an empty string even if a variant is available.
- API proposal
- public enum IDType{JAVA, BCP47, LDML}
- public static Locale createLocale(String id, IDType type)
- public String toString(IDType type)
- Next meeting
- Conclude keyword/extension/private-use storage/accessor
- Review for Builder APIs
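The agreed getScript() contract (script code, or an empty string when the locale has no script) matches `Locale.getScript()` as it later shipped. A minimal sketch, with a hypothetical `describeScript` helper; the localized text returned by getDisplayScript depends on the locale data installed, so its output is not stated here.

```java
import java.util.Locale;

final class ScriptAccessors {
    /** Returns the script code, or "(none)" when getScript() yields the empty string. */
    static String describeScript(Locale locale) {
        String script = locale.getScript();
        return script.isEmpty() ? "(none)" : script;
    }

    public static void main(String[] args) {
        Locale zhHans = Locale.forLanguageTag("zh-Hans-CN");
        System.out.println(describeScript(zhHans));                   // prints "Hans"
        System.out.println(describeScript(Locale.US));                // prints "(none)"
        System.out.println(zhHans.getDisplayScript(Locale.ENGLISH));  // data-dependent display name
    }
}
```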
2009-01-20
- Project members
- Doug Felt (Google)
- Mark Davis (Google)
- Masayoshi Okutsu (Sun)
- Naoto Sato (Sun)
- Yuka (Sun)
- Steven Loomis (ICU Project)
- Yoshito Umaoka (ICU Project)
- Goals
- Schedule / Milestones
- Tasks
- nail down the design in this order
- Representation of new locale elements - script/keywords
- Construction - LocaleBuilder/Factory
- Locale lookup order
- Hebrew/Indonesian
- SPI / ResourceBundle.Control - getCandidateLocales
- Test code