Updating Population, GDP, Literacy

Updated 2016-02-08 by J. C. Emmons

Instructions are based on the Firefox browser.

Load the World DataBank

The World DataBank is at (http://databank.worldbank.org/data/views/variableselection/selectvariables.aspx?source=world-development-indicators). If the page has been moved, try to get to it by doing the following:

  1. Go to http://worldbank.org
  2. Click "Data" at top
  3. Click "Data Catalog"
  4. Click "World Development Indicators"
  5. Click on the blue "DATABANK" link in the "".  It should be at or near the top.
Once you are there, generate a file by using the following steps.  There are 3 collapsible sections: "COUNTRIES", "SERIES", and "TIME".
  • Countries
    • Expand the "Countries" section, click the "Countries" tab, and then click the "Select All" button on the left. You do NOT want the aggregates here, just the countries.  There were 214 countries on the list when these instructions were written.
  • Series
    • Expand the "Series" section.
    • Select "Population, total"
    • Select "GNI, PPP (current international $)"
  • Time
    • Select all years starting at 2000 up to the latest available year.  The latest as of this writing was "2014".  Be careful here, because sometimes it will list a year as being available, but there will be no real data there, which messes up our tooling.
      • If the latest year is greater than the latest year in the current tools/java/org/unicode/cldr/util/data/external/world_bank_data.csv, then add the new one(s) to the WBLine enum in org/unicode/cldr/tool/AddPopulationData.java.

  • Click the "Download Options" link in the upper right.  A small "Download options" box will appear.  Select "CSV" and click "Download".  Instruct your browser to save the file.
  • You will receive a ZIP file named "Data_Extract_From_World_Development_Indicators".  Unpack this zip file.  It will contain two files.  The larger file (about 100kb) contains the actual data we are interested in.  The smaller file is just a field definitions file that we don't care about.

  • The data file should be of the form:
Afghanistan    AFG    GNI, PPP (current international $)    NY.GNP.MKTP.PP.CD    22092804549    23943938506    ..
Afghanistan    AFG    Population, total    SP.POP.TOTL    ..    ..    ..
...

  1. Rename it to world_bank_data.csv and save it in org/unicode/cldr/util/data/external/
  2. Diff the old version vs. the current.
  3. If the format changes, you'll have to modify AddPopulationData.WBLine to have the right order and contents. Often this is just adding an extra year field.
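
The warning above about years that are listed but carry no real data can be checked before running the tools. The following is a minimal sketch in Python, not part of the CLDR tooling. It assumes the download is comma-separated with a header row, that the first four columns identify the country and series, that the remaining columns are years, and that missing values appear as ".." as in the sample above. The function name and the header labels in the sample are hypothetical.

```python
import csv
import io


def empty_year_columns(csv_text, id_columns=4, missing=("..", "")):
    """Return the headers of year columns that hold no real data in any row.

    Assumption: the first `id_columns` columns are identifiers and every
    later column is a year.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return []
    header, body = rows[0], rows[1:]
    empty = []
    for index in range(id_columns, len(header)):
        values = [row[index].strip() for row in body if index < len(row)]
        if all(value in missing for value in values):
            empty.append(header[index])
    return empty


# Header labels are illustrative; the data values come from the sample above.
sample = (
    "Country Name,Country Code,Series Name,Series Code,2013,2014\n"
    'Afghanistan,AFG,"GNI, PPP (current international $)",NY.GNP.MKTP.PP.CD,22092804549,..\n'
    'Afghanistan,AFG,"Population, total",SP.POP.TOTL,..,..\n'
)
print(empty_year_columns(sample))  # ['2014']
```

Any year reported by this check is a candidate to leave out of the "Time" selection.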

Load UN Literacy Data

  1. Go to http://unstats.un.org/unsd/demographic/products/socind/default.htm
  2. Click on "Education"
  3. Click on "Table 4a - Literacy"
  4. Download data - save as temporary file
  5. Open in Excel or OpenOffice - save as data/external/un_literacy.csv (Windows Comma Separated)
  6. Diff the old version vs. the current.
  7. If the format changes, you'll have to modify the loadUnLiteracy() method in org/unicode/cldr/tool/AddPopulationData.java
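
The "diff the old version vs. the current" step can be done with any diff tool. As one option, here is a minimal Python sketch using the standard difflib module. The function name and the sample contents are hypothetical placeholders, not real literacy data.

```python
import difflib


def diff_versions(old_text, new_text, name="un_literacy.csv"):
    """Return a unified diff between the old and new file contents.

    An empty string means the two versions are identical.
    """
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(
            old_lines, new_lines, fromfile="old/" + name, tofile="new/" + name
        )
    )


# Placeholder contents: an added column shows up as a changed header line.
old = "a,b\n1,2\n"
new = "a,b,c\n1,2,3\n"
print(diff_versions(old, new))
```

A changed first line in the diff is a hint that the column layout moved and that the loader may need updating.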

Load CIA Factbook

  1. Go to: https://www.cia.gov/library/publications/the-world-factbook/index.html
  2. Go to the "References" tab, and click on "Guide to Country Comparisons"
  3. Expand "People and Society" and click on "Population"
    1. Right Click on "Download Data", Save Link As... Call it org/unicode/cldr/util/data/external/factbook_population.txt
  4. Back up a page, then Expand "Economy" and click on "GDP (purchasing power parity)"
    1. Right Click on "Download Data", Save Link As... Call it org/unicode/cldr/util/data/external/factbook_gdp_ppp.txt
  5. Click on the "References" tab at the top,  and click on "Guide to Country Profiles"
    1. Expand "People and Society" and click on "Literacy"
    2. It will take you to a listing of countries. Click on "Country Comparison to the World".
    3. Right Click on "Download Data", Save Link As... Call it org/unicode/cldr/util/data/external/factbook_literacy.txt
  6. Diff the old version vs. the current.
  7. If the format changes, you'll have to modify the loadFactbookLiteracy() method in org/unicode/cldr/tool/AddPopulationData.java

Convert the data

  1. If you saw any different country names above, you'll need to edit external/alternate_country_names.txt to add them.
  2. Run "AddPopulationData -DADD_POP=true" and look for errors.
  3. Once done, run the ConvertLanguageData tool as described on the Update Language Script Info page.
  4. Once everything looks ok, check everything in to SVN.
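
Spotting changed country names by eye (step 1) is error-prone. The following is a minimal Python sketch, not part of the CLDR tooling, that compares the names in the old and new copies of a CSV file. It assumes a header row and the country name in the first column. The function names and the "Exampleland" entries are hypothetical.

```python
import csv
import io


def country_names(csv_text, name_column=0):
    """Collect the non-empty values of the name column, skipping the header."""
    rows = csv.reader(io.StringIO(csv_text))
    next(rows, None)  # skip the header row
    return {
        row[name_column].strip()
        for row in rows
        if len(row) > name_column and row[name_column].strip()
    }


def changed_names(old_text, new_text):
    """Return (names only in the new file, names only in the old file)."""
    old, new = country_names(old_text), country_names(new_text)
    return sorted(new - old), sorted(old - new)


# Made-up rows for illustration only.
old = "Country Name,Country Code\nAfghanistan,AFG\nExampleland,XXA\n"
new = "Country Name,Country Code\nAfghanistan,AFG\nExample Land,XXA\n"
print(changed_names(old, new))  # (['Example Land'], ['Exampleland'])
```

Names that appear only in the new file are the ones to consider adding to external/alternate_country_names.txt.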