Information for potential data donors


If you are willing to participate in our study, you probably find yourself in one of the following situations:

A. You have already collected data and you want to share it with us.

B. You don't have data yet but you are willing to collect it and share it with us.


A. You already have data

If you already have data, there are four different options:


  1. You have already shared your data in a scientific repository: great, we only need to access your data!


  2. You haven’t shared your data in a scientific repository yet but you are willing to do so: This is our preferred option!

Sharing data in a scientific repository can be good for you and your study: by making the effort of uploading your data only once, you can reach many potential collaborators and gain citations. Moreover, you contribute to making science open to everybody. *



  3. You don’t want to upload your data to a scientific repository but you are willing to create a license: A license is an official permission granted by the owner of some Work (the “Licensor”) to other people (the “Licensee”), and it governs how the Licensee is allowed to use the Licensor’s Work.


This also means you only have to establish the rules for potential collaborators and citations once; you can then re-use the same license with others.

When you create your license, you make the rules, so you can specify whatever you feel is best. You probably want to look at some examples:

You can also use open-source licenses, but you need to know that they allow for re-distribution (i.e., the people who get their hands on your corpus can re-distribute it). Just in case, we made a list of the most commonly used ones, which you can find here.



  4. None of the above-mentioned options works for you: In this case, we can sign a bilateral agreement. This is convenient if we want to agree on something very specific (e.g., an exchange in which we fund your re-consenting the families who participated, in return for the possibility of re-using the data). Please note that in the agreement you will always keep ownership of the data, and that we will agree on an appropriate scope for intellectual property. At ExELang, we are specifically interested in relating children's experiences with how they talk, so we will ask for the ability to analyze and publish about this one point. If that covers your own research goals, we will discuss how to frame intellectual property so that it is ideal for both of us.

*If you are not familiar with online scientific repositories, you can find a table here that can help you decide which repository works best for you. The table is based on our experience with the different repositories. Do ask us if you have other questions!


B. You don’t have data yet


If you haven't collected data yet, we'll typically need to sign a bilateral agreement. This is ideal because it lets us agree on several things:

  • you'll always keep ownership (and main responsibility) over the data,

  • what you get from us can include expertise hours, equipment, data processing, and funding for logistics,

  • what we would like to get is the possibility to re-analyze your data within the scope of ExELang.


Please note that in the agreement, we'll agree on an appropriate scope for intellectual property. At ExELang, we are specifically interested in relating children's experiences with how they talk, so we will ask for the ability to analyze and publish about this one point. If that covers your own research goals, we will discuss how to frame intellectual property so that it is ideal for both of us.


Here is an example of a bilateral agreement.


If this sounds complicated, you can always just start collecting data on your own, and eventually get back to us when you have your data -- in which case, you can look at your options under "A. You already have data" above.


To start thinking about what collecting data involves, you can look at these four steps:


  1. You'll probably need approval from an IRB; you can start by reading Cychosz et al. 2020, pp. 8+.

  2. You also need to look into the legal aspects of collecting and processing these data. For Europeans, this will mean applying GDPR. In the USA, HIPAA may apply if your data include protected health information, and there may also be state-specific laws to consider.

  3. If you don't have hardware, you can read section 3.3 (Collecting data) in Casillas & Cristia (2019, Collabra), which explains the options.

  4. If you don't have software, you can read the section “Considerations and best practices in research with longform audio recordings” in Cychosz et al. (2020, Behavior Research Methods), which explains the options. (Note that you may not need to do these analyses yourself, since this is one of the things that you could gain from allowing us to re-use your data.)