We really want to have an impact with this project, and thus we expect to make bold decisions, some of which may later prove wrong. To make sure we learn from our mistakes, we commit ourselves to a transparent procedure of self-evaluation every 6 months. The ExELang project is scheduled to start on November 1, 2021, but we have already made some mistakes and learned many lessons. Check in again in late April 2022 for the results of our next self-evaluation!
What we were aiming to do: In this early phase of the project, we need to build our team. We are budgeted to hire a part-time project manager, a full-time data manager, and several interns.
What we did: We want to keep our current manager. For the data manager position, we created a job ad in March 2021, and revised it in June, August, and December. For the internships, we wrote short descriptions that we posted on the project's site. We circulated all of these opportunities on social media.
How we fell short: In March 2021, the data manager ad attracted candidates who did not have the necessary experience. Since we made the ad more precise, it has attracted few candidates, and none of them have the right mix of technical and people skills. The internships attracted some appropriate candidates. One major issue, though, is that candidates were overwhelmingly male (100% for the data manager position, 75% for the internship positions).
How we are going to improve: We reached out to engineering schools to try to "hire from the source", but most of them told us that only 10-15% of their students are female, so we are starting from a heavily skewed pool of graduates in France. We have the impression that direct contact with potential candidates may be helpful: We gave a talk at an internship fair that resulted in a much more balanced set of applications (50% female). For the data manager ad, although we worked on the wording so that it is not sexist, candidates are still exclusively male, so we need to find alternative strategies. We'll be trying to do this via social media.
What we were aiming to do: The goal of ExELang is to describe early language acquisition everywhere on the basis of long-form recordings, but most publications using this technique come from researchers based in a small number of countries (as documented in Cychosz & Cristia, 2021).
What we did: We thought this was due in part to the technique not being straightforward to use, so we created some educational materials (including this video and this book). We believe the technique is, at least economically, within reach of many more researchers than those using it today.
How we fell short: Knowing about a technique is not enough to feel like one can use it. To begin with, researchers may be hesitant to propose long-form recordings to their ethics review board and/or their participating population. Moreover, it seems very unlikely that access to the technique alone will be enough: for instance, studies based on observations recorded in a diary should be economically and technically within reach of researchers everywhere, but even those studies are typically published by researchers in Western Europe and North America. It seems rather naïve to ignore the many pressures against contributions by researchers from more diverse geographic backgrounds!
How we are going to improve: We are taking inspiration from research on gender diversity that discusses a "leaky pipeline": the idea that, at the outset, all people have equal abilities, but there are differential pressures that promote some people over others. For instance, a prevalence of White male scientists in public media (including TV shows and films) creates stereotypes that reduce the likelihood that people who do not match that demographic will enter an academic career; and the same stereotypes affect their evaluation by so-called peers later on. We have made an exhaustive list of stages in the pipeline and sources for the "leaks", as well as potential data that could help us estimate the effect of those different leaks, here.
What we were aiming to do: The goal of ExELang is to describe early language acquisition everywhere, but we realized that most of the data available for this project came from a small number of researchers and child populations. We wanted to diversify the populations represented in long-form recording research.
What we did: Our first approach was to stimulate data collection and sharing by systematically contacting authors of published Randomized Controlled Trials. By looking at systematic reviews we found 114 papers, then traced the email addresses of the first and last authors (95% found). We emailed every one of them, but only heard back from 9% of the initial sample. We offered to meet, and 2.6% of the initial sample agreed. These meetings were inspiring: We met people who were motivated and doing relevant research. We feel they were happy to meet because they were getting useful information from us too, about whether long-form recordings were right for their research project, and how to best use this technique.
How we fell short: We are afraid that this path will lead us to perpetuate the biases in the literature. We looked at 2 systematic reviews of long-form recordings and saw that 81.2% of authors are based in the USA and 11.6% in Europe. Also, the people who replied to us, and met with us, were all based in the USA or Europe.
How we are going to improve: We realized we needed to do something very different! That is why we were excited about the initiative to run a summer/winter school, in which we got to teach about long-form recordings. The video is available here, and we also distilled many "frequently asked questions" into this book. We hope this will enable researchers in a broader set of contexts to use the long-form recordings technique.
When a researcher approaches a community with a fully fledged proposal for a research project, it is too late for the community to contribute, and thus an important aspect of community participation has been lost. Does the same happen when we approach communities with a specific method in mind (i.e., long-form recordings)? Or, since the only thing that is "determined" in this case is how data are collected, is it possible to co-create research projects that maintain the community's autonomy and their right to shape the research?
What is the exhaustive list of potential dual uses of long-form recordings? Which ones can have positive impacts and how? Which ones can have negative impacts and how?