The latest ULI segmentation exception data has been posted in SVN: http://unicode.org/uli/trac/browser/trunk/abbrs
- Reference to CLDR date/month and other necessary symbols
- Available in JSON (json-cooked) format. The XLS files contain the input data including exception type and frequency.
The demo has also been updated to reflect the changes above. Try it out yourself at http://demo.icu-project.org/icu-bin/icusegments
Things to try:
- Compare the ULI and non-ULI versions of the English sample text. Note the breaks after "Mr." in the non-ULI version.
- For German ULI, try the string "Im Okt. München war kalt." (roughly, "In October, Munich was cold."). Without ULI, the sentence breaks after the abbreviation "Okt." (for Oktober). With ULI and CLDR data, "Okt." is treated as an exception and no break occurs there.
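The effect of an exception list on sentence breaking can be illustrated with a minimal sketch. This is not the ICU or ULI implementation: the function name `split_sentences`, the tiny `EXCEPTIONS` table, and the naive break rule (sentence-ending punctuation, whitespace, then an uppercase letter) are all assumptions made for illustration. The real exception data is the set posted in the repository above.

```python
import re

# Hypothetical, hand-picked exception lists for illustration only.
# The actual ULI data is much larger and is not reproduced here.
EXCEPTIONS = {
    "en": frozenset({"Mr.", "Mrs.", "Dr."}),
    "de": frozenset({"Okt.", "Nr.", "Dr."}),
}

# A candidate break is sentence-ending punctuation followed by whitespace.
_CANDIDATE = re.compile(r"[.!?]\s+")


def split_sentences(text, exceptions=frozenset()):
    """Split text into sentences, suppressing breaks after listed abbreviations."""
    sentences = []
    start = 0
    for match in _CANDIDATE.finditer(text):
        end = match.end()
        # Naive rule: only break when the next character is uppercase.
        if end >= len(text) or not text[end].isupper():
            continue
        # The word that ends at the candidate break, including its punctuation.
        last_word = text[start:match.start() + 1].split()[-1]
        if last_word in exceptions:
            continue  # the abbreviation is an exception: do not break here
        sentences.append(text[start:end].strip())
        start = end
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


sample = "Im Okt. München war kalt."
print(split_sentences(sample))                    # without exceptions
print(split_sentences(sample, EXCEPTIONS["de"]))  # with the German exceptions
```

Without the exception list, the sketch splits the sample into "Im Okt." and "München war kalt."; with the German list it returns the sample as a single sentence, which mirrors the behavior described for the demo.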
A public mailing list has been created for discussion of Unicode, localization, and interoperability as on Unicode.org. This mailing list is intended for broad-based conversations on these topics and is not limited to members of the Unicode Consortium.
To subscribe to uli-users, follow the directions here. The process is managed under the regular Unicode process, as described here.
We are pleased to announce that Localization World is organizing a one-day workshop on Unicode, including an introduction with Richard Ishida and three additional sessions. This will take place on the preconference day, June 4, 2012, in Paris. Richard is an experienced presenter at Unicode conferences and is well known for his clear and effective presentations.
The Unicode Consortium’s goal is to enable people around the world to use computers in any language. The Consortium is involved in core internationalization specifications at the heart of all modern software, such as the Unicode Standard for character encoding. The Consortium’s involvement in localization is a key extension of this work. The Unicode Consortium maintains and extends the Common Locale Data Repository (CLDR), and in 2011 established the Unicode Localization Interoperability Technical Committee to improve the interoperability of localization data interchange.
For more information, including the program of the June LocalizationWorld Conference, please see http://www.localizationworld.com/lwparis2012/program.php.
Ulrich Henes, Donna Parrish and Daniel Goldschmidt, chair, vice-chairs, Localization World Conference Program Committee
Helena Chapman, chair, Unicode Localization Interoperability Technical Committee
The Unicode Consortium recently announced a new technical committee focusing on standards for data interoperability of critical localization-related assets, such as language segmentation, translation source strings, translated strings, and translation memories (http://www.unicode.org/press/pr-uli.html). The Unicode Localization Interoperability (ULI) Technical Committee kick-off meeting has been scheduled for Friday, June 10, 2011, at 11 am Eastern Time. It will be conducted by phone. Meeting material will be available at http://uli.unicode.org/home/uli-documents
There will be information about ULI, the initial focus area, next steps, and how you can be involved in this effort to help mature localization industry standards to meet your business needs. To receive the kick-off invitation, please use the reporting form at http://www.unicode.org/reporting.html#form to indicate your interest in ULI and provide your contact information.
For additional information about ULI, see http://uli.unicode.org
Dear Unicode Community,
As businesses move into broader markets, making offerings and services available in those markets becomes vital to the success and survival of any organization. According to a 2009 European Union study, the language industry’s compound annual growth rate was estimated at a minimum of 10% over the next few years, resulting in an approximate value of 16.5 billion to 20 billion € in 2015. Unicode's stated mission is to "enable people around the world to use computers in any language". Technical standards development forms the foundation of the interoperability needed to achieve that mission. Ultimately, the benefits of information and technology should be available at the end-user level in the friendliest interface possible. To that end, the committee aims to:
1. Gather requirements for core and extension of the specified standards in the areas of text segmentation and content memory,
2. Establish core specification scope, extension and implementations to improve the usefulness of existing standards and profiles for interoperability,
3. Provide consistent interpretation of the specification, extension and profiles.
With the recent announcement of the Unicode Localization Interoperability (ULI) Technical Committee, I would like to invite you to the ULI kick-off to help shape the scope and plan for this TC. To receive the kick-off invitation, please report your interest in ULI at http://www.unicode.org/reporting.html#form with your contact information.
Thank you. I look forward to speaking with you.
The Unicode Consortium announces a new technical committee, the Unicode Localization Interoperability Technical Committee. Localization of software information is a key part of the adoption of most software offerings in many countries. The purpose of the new committee is to ensure interoperable data interchange for critical localization-related assets, such as language segmentation, translation source strings, translated strings, and translation memories.
The initial focus of the Unicode Localization Interoperability (ULI) Technical Committee is on the improved interoperability of translation memories in the TMX format, segmentation rules that use the SRX format, and translation source strings and resulting translated strings that use the XLIFF format.
The ULI Technical Committee will establish profiles of use for TMX, SRX, and XLIFF. The committee will develop and publish specifications that document specific usage conventions that can be shared for interoperability. This will improve data interchange through more consistent implementations and will enhance the usefulness of these three standards.
For information on how to join the ULI effort and get involved in its work, contact the Unicode Consortium using the contact form (see http://www.unicode.org/reporting.html) and ask about the ULI.
To become a voting participant in the work of the ULI committee, join Unicode in one of the three voting categories of membership: Full, Institutional, or Supporting. See http://www.unicode.org/consortium/join.html
For more details about the ULI, see: http://unicode.org/uli/