Information Hub for Linguists

This page was last modified on: June 11, 2018.

The pages listed to the left provide guidelines for translation of CLDR strings. For an overview of the tools, please read the Survey Tool Guide before starting.

Current Survey Tool stage: Submission

The survey tool is currently open for general data submission.  Refer to the Survey Tool stages section below for expectations for contributors during each phase. For the schedule, please refer to Milestone Schedule in the left navigation.

There are changes to the Survey Tool, as outlined in the What's new in this release section below. Please read this entire document before getting started. The change to the import of old votes at first log-in is particularly important to know about before you log in for the first time.

CLDR v33.1 and v34 releases

Your contributions in this year's contribution cycle will result in two releases of CLDR.
  • CLDR v33.1 is now in the data resolution stage. See the description of Resolution phase below.  
  • CLDR v34 will be released in the fall and will include all data collected in this year's contribution cycle. Once you finish your contribution to the new Emoji, you can move directly to other data; you DO NOT need to wait for the CLDR v33.1 data freeze. The CLDR TCs will handle the data included in v33.1 with no disruption to you.

Data stability

Please be mindful of data stability by carefully reviewing previously Approved data. When a value is clearly incorrect, it should be changed; but for data stability, don't change a field if it is already acceptable (even if not optimal). When you have evidence that a variant is much better than the existing Approved data and is in customary use, use the Forum to start a discussion and gain consensus before changing Approved values.

What's new in this release

Note: the ticket numbers are included in brackets (such as [#11056]) for those who are interested in the details.
  1. Import of old votes is automatically handled. All your votes matching the latest Approved data will be imported automatically upon first log-in. If you expected to see your old votes but do not see them after your first log-in, file a ticket.
    • If you have voted previously, upon log-in, you will see a message showing the number of your votes that matched the currently winning votes that have been auto-imported. 

    • You can still import old votes that do not match the Approved data. Go to Setting (gear icon), under My Votes, then Import Old votes. You will need to review and select each of the losing items for import. Select All is not an option.
  2. Browser support for Survey Tool now includes the latest versions of Edge, Safari, Chrome, and Firefox. Please report issues with the latest versions of any of the supported browsers. [#10396]
  3. Emoji
    1. Finding Emoji 11: The new emoji entries for version 11 use identifiers starting with "E11:" (see image). You can do a “Find on page” to move quickly from one to the next. You can also see the Emoji 11 Unicode chart for a full list. [#10997]
    2. Emoji Images: The new emoji will not show properly in your browser. However, an image will show up in the right-hand panel.
    3. Keyword voting: The calculation of the winning set of keywords is now different. Previously, given the following choices, #1 would win. Now, the fact that #2 is a subset of #3 gives it a larger weight in voting, and #2 will win.
      1. {small} : 4 votes
      2. {big | large} : 3 votes
      3. {big | large | grand} : 3 votes
    4. Keyword de-duplication: If one keyword phrase is covered by other keyword phrases, then it will be removed. For example, the set {big bad wolf | big | bad | wolf} becomes {bad | big | wolf}. This will happen automatically as you enter values.
      1. Note that the items in the set are also automatically alphabetized: {big | bad | wolf} becomes {bad | big | wolf}.
    5. Names included in keywords: The emoji names will get included as keywords automatically. You won't see this happen as you enter them since the name may change before the release is resolved. So this change is done later in data resolution, after the names are final. [#10537]
      1. You do not need to enter the Emoji names as Keywords explicitly — but don't bother removing them if present (since that might artificially introduce voting conflicts).
      2. Example: 
        • Name: fox face
        • Keywords: {face | fox}
        • Final keyword in the released XML: {face | fox | fox face}
    6. Make sure to read the Translation Guides below for more on emoji.
  4. Priority Items: In a sublocale like French (Canada) [fr_CA], the dashboard values for Missing and Losing more accurately reflect what needs to be done. [#9505]
  5. Long Date formats: We have found that many languages misused "dd" instead of "d". Please revisit your decision for Long date formats to determine correct use of "d" vs. "dd". [#10018]
  6. Language-Specific Issues: Feedback on specific language data will be posted in language forums.
    1. For German language only: For the purpose of CLDR data, we will be using the English terms "AM/PM" as the data. CLDR provides the flexibility to use the 12-hour format even for languages that strictly use the 24-hour format. In the case of German, the CLDR TCs have decided to use the English words. [#10789]
    2. For Odia language only: Do not use diacritics in transliterations. [#11044]
    3. Please participate actively in Forum postings for language-specific data feedback from CLDR users, and postings by CLDR committee members. See Survey Tool Guide.
  7. Old forum posts: Forum posts from previous contribution cycles are now available as one thread, and the version each posting came from is identified along with its date and time stamp.
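
The emoji keyword rules described above (subset-weighted voting, de-duplication with alphabetizing, and adding the name at resolution) can be illustrated with a small sketch. This is not the Survey Tool's implementation: the function names are hypothetical, and the exact scoring and "covered" rules are assumptions chosen only to reproduce the examples given in this document.

```python
from typing import Dict, FrozenSet, Iterable, List


def winning_set(votes: Dict[FrozenSet[str], int]) -> FrozenSet[str]:
    """Pick the winning keyword set.

    Assumption: a candidate's weight is its own votes plus the votes of
    every candidate that contains it (its supersets). Tie-breaking is not
    described in the source; max() simply keeps the first best candidate.
    """
    def weight(candidate: FrozenSet[str]) -> int:
        return sum(n for other, n in votes.items() if candidate <= other)

    return max(votes, key=weight)


def normalize(keywords: Iterable[str]) -> List[str]:
    """De-duplicate and alphabetize a keyword set.

    Assumption: a multi-word phrase counts as "covered" when each of its
    words is already present as a single-word keyword.
    """
    phrases = {k.strip() for k in keywords if k.strip()}
    singles = {p for p in phrases if " " not in p}
    kept = {
        p for p in phrases
        if " " not in p or not all(w in singles for w in p.split())
    }
    return sorted(kept)


def release_keywords(name: str, keywords: Iterable[str]) -> List[str]:
    """Add the final emoji name to the keywords, as done at data resolution.

    This step runs after de-duplication, so the name is kept even if its
    individual words are already keywords.
    """
    return sorted(set(keywords) | {name})


votes = {
    frozenset({"small"}): 4,
    frozenset({"big", "large"}): 3,
    frozenset({"big", "large", "grand"}): 3,
}
print(sorted(winning_set(votes)))                          # ['big', 'large']
print(normalize(["big bad wolf", "big", "bad", "wolf"]))   # ['bad', 'big', 'wolf']
print(release_keywords("fox face", ["face", "fox"]))       # ['face', 'fox', 'fox face']
```

Under these assumptions, {big | large} collects 3 + 3 = 6 and beats {small} with 4, which matches the voting example above.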

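For the Long Date formats item above, the difference between "d" and "dd" is that "d" prints the day of the month with as few digits as needed, while "dd" always pads it to two digits. The sketch below shows only that difference; `format_day` is a hypothetical helper, not part of any CLDR tool, and it leaves every other pattern field unexpanded.

```python
import datetime
import re


def format_day(pattern: str, date: datetime.date) -> str:
    """Expand only the day-of-month field ('d' or 'dd') in a date pattern.

    Quoted literal text such as 'de' is left untouched, and all other
    fields (MMMM, y, ...) are deliberately not expanded in this sketch.
    """
    def expand(match: "re.Match[str]") -> str:
        token = match.group(0)
        if token.startswith("'"):
            return token  # quoted literal: copy as-is
        return str(date.day).zfill(len(token))

    return re.sub(r"'[^']*'|d+", expand, pattern)


june_5 = datetime.date(2018, 6, 5)
print(format_day("d MMMM y", june_5))    # 5 MMMM y
print(format_day("dd MMMM y", june_5))   # 05 MMMM y
```

If a leading zero on single-digit days looks unnatural in a long date in your language, "d" is probably the intended choice.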

Translation Guides

  1. The translation guides for date/time patterns and names have been updated, focusing especially on the need to synchronize different name forms such as format and standalone with the patterns that use them, and different ways of utilizing the format and standalone name forms.
  2. Timezone names and Territory names often share the same term. A list of overlapping data between Timezone and Territory names is available in this public spreadsheet. Use this spreadsheet as a reference when working on Timezone names, and keep Timezone names consistent with Territory names where the same term appears in both.
  3. Many of the emoji names are constructed. The emoji parts that are used to construct and add on to other emoji are found under Characters in the Survey Tool, under Component, People and Category. Please review these carefully!
    1. Component contains special emoji whose names are used for emoji with hair colors and skin tones.
      1. Carefully check all of these. Some languages still have older terms for the skin tones that won't mean anything to users. Use understandable terms like “light skin” instead of numbered levels like “peau 1”.
      2. CLDR doesn't have gender agreement for nouns, so please try to choose the grammatical forms that work the best. For example, in some languages there will be an adjective for “light skinned” that would need to agree with the noun (man or woman). It may work to make noun phrases instead, e.g. “light skin” or “bald head”.
    2. People contains three values which are used to construct emoji.
      1. All of these have examples marked by an ⓔ in the English Column.
      2. Hover over the ⓔ to see how some sample constructed emoji would look in English.
      3. Hover over each translated term (Winning, Others) to see how some sample constructed emoji look in your language.
    3. Category contains 3 terms like “flag” (used in constructing flag names). These 3 terms are also marked with ⓔ, so make sure to review each of the examples in English and your language.

Known Issues

Please review this list before getting started to avoid creating duplicate tickets. This list will be updated as fixes are made available in production. If you hit a problem, please file a ticket.
  • The Venezuela currency will change on June 4 from VEF to VES. VES will show as "(old)" in the header until after June 4. At that point VEF will show as "(old)", and VES will be unmarked.
  • Ticket 11133: The automatic import does not import your old votes when the winning vote is for the inherited value. Workaround: Ignore these fields or import them from the Import my votes page.

Resolved Issues

    Issues previously listed under Known Issues that have been resolved:
    • Ticket 11143: Emoji keyword fields with more than 7 keywords throw errors. Workaround: For the time being, please ignore these error messages.

    Survey Tool Stages 

    Shakedown 

    The Survey Tool is live and all data that you enter will be saved and used. You can start work, but there may be additional fixes during this period, so the tool may be taken down for updates more frequently than after we exit Shakedown. During Shakedown, your participation in looking for issues with the Survey Tool is essential. If you find any problems in the tool, please file a ticket.

    Submission

    In the submission phase, please focus on getting all Missing items entered.
    If you are working in a sublocale (such as fr_CA), wait until the main locale (fr) has completed submission. See voting for inheritance vs. hard votes in the Survey Tool Guide.

    Vetting

    All contributors are encouraged to move their focus to the Dashboard view and to unanswered items in the Forums. Consider others' opinions by reviewing the Disputed and Losing items. See guidelines for handling Disputed and Losing.
    Also, review the items that are Flagged for TC and provide comments if you have information that should be considered.  
    To see the Flagged items, go to the Gear dropdown, under Forum see Flagged items.

    Resolution

    The vetting is done, and further work is being done by the CLDR committee to resolve problems. You should periodically take a couple of minutes to check your Forums to see if there are any questions about language-specific items that came up.