SociO!

Socio-economic ontology: a backbone of socio-economic data interoperability

Soonho Kim*, Marie-Angélique Laporte **, Elizabeth Arnaud **, Medha Devare*, and Gideon Kruseman***

* International Food Policy Research Institute (IFPRI)

** Bioversity International

*** Centro Internacional de Mejoramiento de Maíz y Trigo (CIMMYT)

Acknowledgement

This ontology is the socio-economic ontology developed by CGIAR and partners. The socio-economic ontology team gratefully acknowledges the Platform for Big Data in Agriculture (https://bigdata.cgiar.org) and the CGIAR Research Program on Policies, Institutions, and Markets (http://pim.cgiar.org/) for their financial support in implementing this ontology.

INTRODUCTION

Fig. 1 BFO schema

The CGIAR research program, Platform for Big Data in Agriculture, aims to share data across the 15 CGIAR centers and with their partners by reducing barriers to information access and reuse. This would help researchers, farmers, and policy makers make reliable, data-driven decisions. Given the large number of socio-economic surveys carried out by CGIAR, socio-economic data is one of the crucial data types for the Big Data platform. Under the Community of Practice on socio-economic data, we focus on creating a socio-economic ontology that improves data collection and interoperability by using the same concepts in common modules, which can be used in socio-economic surveys conducted at different CGIAR centers.

OBJECTIVES AND SCOPE

This project will deliver a prototype of a socio-economic ontology named “SociO”. As a starting point, the scope of the ontology would cover all concepts and relationships used in the common modules named “100 standard questions”. Terms in each question would be linked to the URIs of concepts in the SociO ontology and would be included in the metadata of the data sets. Concepts of the SociO ontology would be mapped to existing, widely used agricultural ontologies such as AGROVOC, which provides local translations and synonyms.
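As a minimal sketch of how the terms in a question could be linked to concept URIs inside a data set's metadata: the base URI `http://example.org/socio/`, the concept names, the question identifier, and the record layout below are all placeholders chosen for illustration, not actual SociO or AGROVOC identifiers.

```python
import json

# Hypothetical base URI; the real SociO namespace is not specified in this document.
SOCIO_BASE = "http://example.org/socio/"

# Hypothetical term-to-concept lookup for terms that appear in standard questions.
TERM_TO_CONCEPT = {
    "household": SOCIO_BASE + "Household",
    "household size": SOCIO_BASE + "HouseholdSize",
}


def annotate_question(question_id, text, terms):
    """Return a metadata record linking each known term to its concept URI."""
    links = []
    unmapped = []
    for term in terms:
        uri = TERM_TO_CONCEPT.get(term.lower())
        if uri is None:
            # Terms without a concept are reported so they can be proposed for addition.
            unmapped.append(term)
        else:
            links.append({"term": term, "concept": uri})
    return {
        "question": question_id,
        "text": text,
        "concepts": links,
        "unmapped": unmapped,
    }


record = annotate_question(
    "Q001",
    "How many members does the household have?",
    ["household", "household size", "member"],
)
print(json.dumps(record, indent=2))
```

Keeping the unmapped terms in the record is one possible way to surface candidate concepts that the ontology does not yet cover.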

The combined use of standardized questions and a common ontology will facilitate quality control of surveys. It can also facilitate data curation through standard data consistency and error-checking procedures.
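A consistency check of this kind might look like the sketch below. The field names (`household_type`, `household_size`, `children_under_5`) and the two-value code list are assumptions made for illustration; the actual fields and categories would come from the standard questions and the ontology.

```python
# Hypothetical code list for a categorical field; the categories are illustrative only.
HOUSEHOLD_TYPES = {"male-headed", "female-headed"}


def check_record(rec):
    """Return a list of consistency problems found in one survey record."""
    problems = []
    # Categorical answers must come from the shared code list.
    if rec.get("household_type") not in HOUSEHOLD_TYPES:
        problems.append("household_type not in code list")
    size = rec.get("household_size")
    if not isinstance(size, int) or size < 1:
        problems.append("household_size must be a positive integer")
    elif rec.get("children_under_5", 0) > size:
        # Cross-field check: a subgroup cannot be larger than the household.
        problems.append("children_under_5 exceeds household_size")
    return problems


print(check_record({"household_type": "other", "household_size": 2, "children_under_5": 3}))
```

Because every survey uses the same field definitions, one set of checks like this could in principle be reused across data sets from different centers.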

Moreover, we work closely with other working groups inside the Community of Practice on socio-economic data and with CGIAR researchers who deal with socio-economic data, and we are creating a systematic mechanism to collect their feedback and apply it to the ontology.

METHODOLOGY

The SociO ontology would not be built from scratch but would reuse concepts from existing term lists, thesauri, classification systems, and ontologies such as the World Bank document repository, the Crop Ontology, the Agronomy Ontology, and others. The SociO ontology would add new concepts if needed and coordinate those new concepts with the existing agricultural ontology community (i.e. the Global Agricultural Concept Scheme). If we need to add new concepts and relationships, they would be created using a mixed top-down and bottom-up methodology [2]. Ontologies evolve over time, which means that we continuously add new concepts and relationships, edit them, and map them to existing ontologies. In addition, if the “100 standard questions” are updated, the ontology needs to support those changes.

The basis for the ontology is the understanding that a number of basic general concepts exist that are useful for defining broad categories of socio-economic data. We map a variety of related concepts to these broad concepts, which may take the form of a tree structure. At the outer reaches of the branches of the tree are the actual data fields.

SociO aims to address three major components of this framework:

  1. Provide a clear definition of the tree structure related to the data concepts. Each term in the ontology will need a validated definition.
  2. Provide a basic classification for key non-numerical data fields.
  3. Provide a set of useful structural metadata fields related to the data defined through the ontology.
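The tree structure described above can be sketched as follows. This is an illustrative model only: the `Concept` class, the concept labels used as tree nodes, and the definition texts are assumptions, not the SociO implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A node in the concept tree; leaves correspond to actual data fields."""
    label: str
    definition: str = ""
    children: list = field(default_factory=list)

    def add(self, child):
        self.children.append(child)
        return child


def leaf_paths(node, prefix=()):
    """Yield the path of labels from the root to every leaf (data field)."""
    path = prefix + (node.label,)
    if not node.children:
        yield path
    for child in node.children:
        yield from leaf_paths(child, path)


def missing_definitions(node):
    """List the labels of concepts that still lack a definition."""
    missing = [] if node.definition else [node.label]
    for child in node.children:
        missing.extend(missing_definitions(child))
    return missing


# Placeholder definitions, written only to make the example run.
root = Concept("Socio-economic data", "Broad category covering all modules.")
demo = root.add(Concept("Demographics", "Characteristics of household composition."))
demo.add(Concept("Household type"))
demo.add(Concept("Household size", "Number of household members."))

for path in leaf_paths(root):
    print(" > ".join(path))
print("Missing definitions:", missing_definitions(root))
```

A helper such as `missing_definitions` reflects the requirement that each term needs a validated definition; here it only checks that some definition text is present.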

EXPECTED RESEARCH OUTPUTS AND OUTCOMES

In addition to generating the SociO ontology, this project provides systematic tools to control versioning and to collect feedback from a variety of groups, including researchers, field managers, data managers, research coordinators, data curators, and data scientists.

For the medium/long term, this project would contribute to Sustainable Development Goal 10 (Reduce inequality within and among countries) by providing local translations of the concepts and relationships in the ontology, and to Sustainable Development Goal 17 (Strengthen the means of implementation and revitalize the global partnership for sustainable development) by improving data interoperability and applying the FAIR principles.


The SociO! ontology server is implemented through a GitHub repository:

CONCEPTS

Tuesday June 5, 2018

Fig 2. Linking survey to the BFO


Fig 3. modeling


Prototype implementation using "demographics" module

Aspect | Indicators
Demographics | Household type; household size (number of members and in terms of Male Adult Equivalent); number of children below 2 and 5 years
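To make the demographics indicators concrete, the sketch below derives them from a household roster. The roster layout (age in years, sex) and especially the adult-equivalent weights are assumptions for illustration; they are not an official Male Adult Equivalent scale and would need to be replaced by whatever scale the surveys adopt.

```python
def adult_equivalent_weight(age, sex):
    """Hypothetical weights for illustration; not an official equivalence scale."""
    if age < 5:
        return 0.4
    if age < 15:
        return 0.7
    return 1.0 if sex == "M" else 0.8


def demographic_indicators(members):
    """Compute demographics indicators from a list of (age_in_years, sex) tuples."""
    return {
        "household_size": len(members),
        "household_size_mae": round(
            sum(adult_equivalent_weight(age, sex) for age, sex in members), 2
        ),
        "children_under_2": sum(1 for age, _ in members if age < 2),
        "children_under_5": sum(1 for age, _ in members if age < 5),
    }


# Example roster: two adults, one school-age child, one infant.
household = [(38, "M"), (35, "F"), (7, "F"), (1, "M")]
indicators = demographic_indicators(household)
print(indicators)
```

Household type is omitted here because it is a categorical field that would be drawn from a classification rather than computed.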

Fig 4a. OWL implementation


Fig 4b. OWL implementation (cont.)


Fig 5a. Indicator (variables)


Fig 5b. Indicator (variables, cont.)


Fig 6a. Type


Fig 6b. Type (cont.)