The list of gendered data sets is an output of the mini grant for the Gender Platform.
IFPRI researchers provided a list of 61 gendered data sets. The team searched the IFPRI Dataverse and found 57 data sets: 47 were in the original list and 10 were data sets the researchers were not aware of. The remaining 14 data sets from the original list were not found because of inadequate tagging.
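The comparison between the researcher-provided list and the Dataverse search results comes down to set arithmetic. Below is a minimal sketch in Python, assuming each data set has a unique identifier. The identifiers and the function name compare_inventories are hypothetical placeholders, not actual IFPRI records or tooling.

```python
def compare_inventories(provided, found):
    """Split two collections of data set identifiers into three groups."""
    provided, found = set(provided), set(found)
    return {
        "matched": provided & found,     # listed by researchers and found by the search
        "missing": provided - found,     # listed but not found (e.g. inadequate tagging)
        "unexpected": found - provided,  # found by the search but not listed
    }


# Hypothetical identifiers, sized only to mirror the counts reported above.
provided = {f"ds-{i}" for i in range(61)}    # 61 data sets listed by researchers
found = {f"ds-{i}" for i in range(14, 71)}   # 57 data sets returned by the search

result = compare_inventories(provided, found)
print({name: len(ids) for name, ids in result.items()})
# → {'matched': 47, 'missing': 14, 'unexpected': 10}
```

The "missing" group is the one to review for tagging problems, and the "unexpected" group is the one to check against the definition of a gender data set.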
The Community of Practice on Socio-Economic Data supported the Gender Platform through a mini grant in 2018. In this seed project the focus was on the findability of gender data or research data with strong gender components. Key outputs were:
a. An inventory of gender datasets
b. A list of metadata fields that support gender researchers in accessing data
Combining the lists provided by IFPRI and CCAFS with what the data team of the Gender Platform retrieved from a search, there is now an initial version of the inventory. A similar approach will be applied to the Dataverses of the other centers to expand the inventory.
The initial investigation of the metadata reveals that the keyword “gender” is a very strong predictor for finding a gender dataset. This is not surprising, but: 1) this inventory is not properly published anywhere, so making it available increases findability and visibility, and 2) “gender” alone is not enough to find everything. An immediate, simple guideline is therefore to recommend that all gender researchers use the keyword “gender” (or, if we prefer, “gender research”) when describing the datasets they post.
We also found that the term “sex-disaggregated data” is not used in the keywords describing those datasets. Here we could either give similar advice, or consider whether it would be worth requesting a core metadata field on this, to be filled in for any social science dataset.
During the data harmonization workshop organized by the Community of Practice on Socio-Economic Data in December 2018 in Rome, it became clear that we need to distinguish between:
· Gender-related Keywords/Metadata-fields that Gender researchers should be consistent about
· Gender-related Keywords/Metadata-fields that Gender researchers are interested in knowing about any dataset
The first point is very study-specific and there is probably little we can prescribe, except for harmonization.
For the second point, as this is as general as possible, the advice would have to be something minimal (like the sex-disaggregated data specification on the unit of observation), but it can be extremely powerful in unlocking potential.
A conclusion, also reached in discussion with the CoP Social and Economic Data from the Big Data Platform, is that enriching keywords is doable and desirable, while altering metadata schemes is not. The advice is therefore to compile a good list of important gender-specific keywords.
The Dataverse search on gender datasets returned datasets that were not initially provided by IFPRI researchers. We need to understand whether they qualify as gender datasets. For those provided but not found, the simple solution would be to add the keyword “gender” to the keywords controlled vocabulary.
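Once a list of gender-specific keywords exists, checking which datasets lack any of them can be automated. Below is a minimal sketch in Python, assuming the keywords of each dataset have already been exported into a dictionary. The keyword list, the dataset titles and the function names are illustrative assumptions, not the agreed vocabulary or real records.

```python
# Assumed starting vocabulary; the real list is still to be compiled.
GENDER_KEYWORDS = {"gender", "gender research", "sex-disaggregated data"}


def has_gender_keyword(keywords, vocabulary=GENDER_KEYWORDS):
    """Return True if any keyword matches the vocabulary, ignoring case and padding."""
    normalized = {keyword.strip().lower() for keyword in keywords}
    return not normalized.isdisjoint(vocabulary)


def untagged(datasets, vocabulary=GENDER_KEYWORDS):
    """List the titles of datasets that carry no keyword from the vocabulary."""
    return [
        title
        for title, keywords in datasets.items()
        if not has_gender_keyword(keywords, vocabulary)
    ]


# Hypothetical metadata export: title -> keywords entered by the depositor.
datasets = {
    "Household survey A": ["Gender", "nutrition"],
    "Household survey B": ["agriculture", "assets"],
    "Panel study C": [" sex-disaggregated data "],
}

print(untagged(datasets))
# → ['Household survey B']
```

The match here is exact after lowercasing and trimming, so a keyword such as "gender equality" would not count unless it is added to the vocabulary. That is a deliberate simplification in this sketch.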
For more information please contact Marcelo Tyszler <m.tyszler@kit.nl>