The Unicode Localization Interoperability Technical Committee (ULI) works to ensure interoperable data interchange of critical localization-related assets, including:
- Translation memory: A translation memory system stores words or phrases that have been translated previously. The use of translation memory ensures the consistency of translated content, accelerates translation, and reduces the cost of repeated translation requests.
- Segmentation rules: Segmentation rules define the way to segment text for translation or other text processing. The rules are used in conjunction with translation memory to create memory segments or identify matches within the source content of existing translation memories.
- Translation source strings and their translations: Translation source is natural language text, typically with markup, that will be translated into another language. The translated strings are the results of translating the source strings while preserving the markup.
- Word count: Best practices for counting words in the context of translation interchange.
Whether a translation request is completed by human or machine, these assets play a vital role in the overall translation process. Interoperable interchange of these assets reduces errors, lowers costs, and improves throughput.
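As a rough illustration of how segmentation rules work, the sketch below splits text into sentence-like segments using an ordered list of break and no-break rules. It is loosely modeled on the idea of rule-based segmentation and is not an SRX parser; the rule list, the abbreviation set, and the `segment` function are all hypothetical names chosen for this example.

```python
import re

# Hypothetical rules: (is_break, pattern before the position, pattern after it).
# Rules are checked in order; the first one that matches at a position decides.
RULES = [
    (False, r"\b(?:Mr|Mrs|Dr|etc)\.", r"\s"),  # no break after common abbreviations
    (True, r"[.?!]", r"\s"),                   # break after sentence-final punctuation
]


def segment(text, rules=RULES):
    """Split text into segments at positions where a break rule applies."""
    compiled = [
        (is_break, re.compile(before + r"\Z"), re.compile(after))
        for is_break, before, after in rules
    ]
    segments, start = [], 0
    for pos in range(1, len(text)):
        for is_break, before, after in compiled:
            # The "before" pattern must end exactly at pos; "after" must start there.
            if before.search(text[:pos]) and after.match(text, pos):
                if is_break:
                    segments.append(text[start:pos])
                    start = pos
                break  # first matching rule wins, whether break or no-break
    segments.append(text[start:])
    # Trim surrounding whitespace and drop empty segments.
    return [s.strip() for s in segments if s.strip()]


print(segment("Dr. Smith arrived. He was late."))
# ['Dr. Smith arrived.', 'He was late.']
```

Two tools that apply different rule sets to the same source text produce different segments, which is one reason shared segmentation rules matter for matching against existing translation memories.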
Charter and Scope
Please see the ULI Charter page for more details.
Profiles of Use
The primary focus of the ULI Technical Committee will be to establish profiles of use for XLIFF, TMX, and SRX. The committee will develop and publish specifications that document specific usage conventions that can be shared for interoperability. This will improve data exchange through more consistent implementations and enhance the usefulness of these three standards.
Extensions to Established Standards
The secondary focus of the ULI Technical Committee will be to gather requirements for future extensions to XLIFF, TMX, and SRX. The ULI committee will develop reference implementations, as necessary, to demonstrate the feasibility of any proposals for future standardization.
One of the challenges of translation interoperability is objectively measuring the difficulty of a particular translation workload. A common metric is the word count. However, methods for counting words vary across systems and languages. Some examples: Thai is written without space characters between words, as are Japanese and Chinese. Should numbers be included or not? Are Mongolian suffixes considered separate words or not?
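To make the ambiguity concrete, here is a minimal sketch comparing two naive counting conventions. Both functions are hypothetical and are not taken from any ULI recommendation; they only show how the same text can yield different counts depending on the convention chosen.

```python
def count_whitespace_words(text: str) -> int:
    """Count tokens separated by whitespace."""
    return len(text.split())


def count_words_excluding_numbers(text: str) -> int:
    """Count whitespace-separated tokens that contain at least one letter."""
    return sum(1 for token in text.split() if any(ch.isalpha() for ch in token))


english = "Version 2 ships in 2024"
thai = "ภาษาไทยไม่มีช่องว่างระหว่างคำ"  # a Thai sentence written without spaces

print(count_whitespace_words(english))         # 5
print(count_words_excluding_numbers(english))  # 3
print(count_whitespace_words(thai))            # 1
```

The Thai sentence contains several words, yet whitespace splitting reports one. Counting it properly needs language-aware word breaking, which is the kind of question a shared word count convention has to settle.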
The ULI-TC is hosting the development of a future Unicode Technical Note. You may follow and contribute to the discussion on this GitHub page.
These documents are archived for historical purposes and do not specify a Unicode standard. They are already publicly available online elsewhere and are hosted on the Unicode ULI site only as a convenience.
For information on how to join the ULI and get involved in its work, contact the Unicode Consortium through the contact form and ask about the ULI.
To become a voting participant in the work of the ULI committee, join Unicode in one of the three voting categories of membership: Full, Institutional, or Supporting. Learn about the benefits of joining.
The officers of the ULI will establish the meeting schedule. Meetings are to be conducted by conference call to enable broad participation by members of the industry.
Outside of formal meetings, much of the technical work of the Unicode Localization Interoperability Technical Committee is conducted in email discussions held on the distribution list of ULI members (uli). Informal discussions of technical issues are also held on public Unicode email distribution lists.
The current Technical Committee Officers are:
- Chair: Steven R Loomis (IBM)
- Vice Chair (Interim): Yoshito Umaoka (IBM)