Anthropometrics
Project CHILD includes World Health Organization (WHO) height-for-age and BMI-for-age z-scores as well as raw height and weight. The reference population for the WHO Child Growth Standards consisted of healthy children from around the world.
Age-Standardized Child Development Scores
Child ability is often age dependent, so standardizing assessments of child development by age can help with comparisons. Although some researchers use the raw score as the outcome and control for age in regressions, an impact measured in points is hard to interpret, especially when the standard deviation of points also changes with age. Interpretation is particularly challenging in comparative studies, where different surveys use different tests to measure the same domain of development. In these cases, raw scores also have very different meanings across data sets. Thus I recommend using the standardized measure, rather than the raw score, as the outcome.
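To illustrate why an impact in raw points is hard to compare across ages, here is a minimal sketch with made-up numbers. The function name and the values are hypothetical and are not taken from any Project CHILD survey; the only point is that the same gain in points corresponds to different gains in standard deviation units when the spread of scores differs by age.

```python
def effect_in_sd_units(effect_in_points, sd_of_points):
    """Express an effect measured in raw points in standard deviation units."""
    return effect_in_points / sd_of_points

# Hypothetical: the same 5-point gain at two ages whose raw scores
# have different spreads (SD of 5 points versus SD of 20 points).
younger = effect_in_sd_units(5, 5)
older = effect_in_sd_units(5, 20)

print(younger, older)  # 1.0 0.25
```

A 5-point gain is a full standard deviation in the first group but only a quarter of one in the second, which is why the standardized measure is easier to interpret.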
Internationally standardized benchmarks in child development have only recently been established and are not yet widely implemented. Thus a variety of child development tests have been applied in these surveys. Some tests currently in use provide a standardization scheme based on a reference population. However, the reference population is usually from a WEIRD (Western, educated, industrialized, rich, and democratic) country and may not be appropriate for other contexts. Additionally, child development may progress differently in different contexts. Thus Project CHILD standardizes scores within each country, using all data from the relevant survey round; this approach requires that scores be interpreted relative to the country's own distribution rather than to some external reference population.
A number of child development assessments vary the questions given based on the child's age (the Ages & Stages Questionnaire, for example). Thus the standardization of these scores should maintain these age groupings to obtain comparability. The score of a 3-month-old child should not be standardized together with the scores of 4-month-old children, because the questions the parent answers are different.
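The idea of standardizing within a country, survey round, and age grouping can be sketched as follows. This is only an illustration, not the procedure implemented in the project's do files or in stndzxage: the function name, the field names (country, survey_round, age_band, raw_score), and the data are all hypothetical, and the use of the population standard deviation is an assumption.

```python
from collections import defaultdict
from statistics import mean, pstdev

def standardize_within_groups(records,
                              group_keys=("country", "survey_round", "age_band")):
    """Return copies of the records with a z-score computed within each group."""
    # Collect the raw scores belonging to each (country, round, age band) cell.
    groups = defaultdict(list)
    for rec in records:
        groups[tuple(rec[k] for k in group_keys)].append(rec["raw_score"])

    # Mean and (population) standard deviation per cell; an assumption here.
    stats = {key: (mean(vals), pstdev(vals)) for key, vals in groups.items()}

    out = []
    for rec in records:
        mu, sd = stats[tuple(rec[k] for k in group_keys)]
        # A cell with no spread (for example, a single child) gets no z-score.
        z = (rec["raw_score"] - mu) / sd if sd > 0 else None
        out.append({**rec, "z_score": z})
    return out

# Hypothetical data: two age bands that answer different questionnaires.
records = [
    {"country": "A", "survey_round": 1, "age_band": "3m", "raw_score": 10},
    {"country": "A", "survey_round": 1, "age_band": "3m", "raw_score": 20},
    {"country": "A", "survey_round": 1, "age_band": "3m", "raw_score": 30},
    {"country": "A", "survey_round": 1, "age_band": "4m", "raw_score": 40},
    {"country": "A", "survey_round": 1, "age_band": "4m", "raw_score": 60},
]
scored = standardize_within_groups(records)
```

Because each age band is standardized separately, a raw score of 40 on the 4-month questionnaire is below that band's average even though it exceeds every raw score in the 3-month band.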
Other tests have a continuous structure, building in difficulty as the child grows. Although there may be an age-based benchmark for where to start testing the child, if the child gets a question wrong, the psychologist returns to an easier question to determine the child's level of ability.
This harmonization project already takes all these factors into account and provides age-standardized scores for the child development tests included in each survey. More detail about the standardization procedures can be found in the do files and in the Stata user-written program stndzxage.
Sample Sizes
Some surveys are cohort studies, following the same children over time. Attrition occurs, but no children are added after the first round, although a child may miss a middle round and be recovered later. Other surveys are household surveys, which include all children in the same household. In these surveys, children born since the previous round are added, so sample sizes can increase over time. Young Lives Peru is, in a sense, a hybrid, where a cohort of children is followed with limited data on the next younger sibling. (Project CHILD does not yet incorporate the sibling data.) Chile's ELPI also grows its sample by adding new cohorts every round. Data on sample sizes is available here.
Unlike measures of anthropometrics and adult intellect, the measures of child development vary by age. Tests given to a three-year-old often have very different questions from tests given to a six-year-old. Thus, sample sizes for each test can be quite different from the overall sample size. If a test is given over a limited age span, the time between survey rounds influences how many children have repeated scores. Data on sample size and age spans is available here.
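The interaction between a test's age span and the gap between rounds can be illustrated with a small sketch. The function names, ages, and age span below are hypothetical, and the sketch ignores attrition and newly added children; it only counts children who fall inside the test's age span in both of two rounds.

```python
def has_repeated_score(age_at_first_round, years_between_rounds, min_age, max_age):
    """True if the child is inside the test's age span in both rounds."""
    age_at_second_round = age_at_first_round + years_between_rounds
    in_first = min_age <= age_at_first_round <= max_age
    in_second = min_age <= age_at_second_round <= max_age
    return in_first and in_second

def count_repeated(ages, years_between_rounds, min_age, max_age):
    """Count children who are eligible for the test in both rounds."""
    return sum(
        has_repeated_score(age, years_between_rounds, min_age, max_age)
        for age in ages
    )

# Hypothetical: a test given to children aged 3 to 6 (in years).
ages_round1 = [3, 4, 5, 6]
short_gap = count_repeated(ages_round1, 1, 3, 6)  # rounds one year apart
long_gap = count_repeated(ages_round1, 4, 3, 6)   # rounds four years apart

print(short_gap, long_gap)  # 3 0
```

With a one-year gap, three of the four children are still within the age span in the second round; with a four-year gap, none are, so no child has a repeated score on that test.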