Task 1: Datathon on Language Resource Creation for Equality, Diversity and Inclusivity (EDI)
Supporting diversity and promoting inclusion remain major challenges in many areas, including language technology. Bias in a dataset propagates to the systems developed using that dataset [1]. Bias can be described as systematic, unfair discrimination against an individual or a group of individuals in favour of others [2]. Bias may be introduced into the data in the following three ways:
Pre-existing Bias: Any bias rooted in the institutions, practices and attitudes of society
Technical Bias: Any bias that originates from technical constraints and decisions
Emergent Bias: Any bias that occurs when a system designed for one context is applied in another
In this datathon, we propose bringing researchers together to create datasets that are more inclusive with respect not only to gender but also to race, sexuality, disability and other dimensions. Participants will be asked to create language resources or improve existing datasets to address EDI in their native language. The datasets may target socio-pragmatics, morphology, syntax and other areas.
The datathon task is to create a new dataset or to remediate bias in an existing one. Participants are expected to submit the dataset along with a data statement and a paper describing the dataset. Sample data statements are available here:
https://sites.google.com/uw.edu/data-statements-for-nlp/sample-data-statements?authuser=0
Resources will be evaluated in terms of resource quality and the EDI factors considered.
Evaluation criteria
Resource quality
EDI factors considered
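A data statement of the kind requested above can be drafted from a simple skeleton. The sketch below is only an illustration, not an official template of this datathon: the section names follow the schema proposed by Bender and Friedman [1], while the class name `DataStatement` and the helper `missing_sections` are hypothetical names introduced here.

```python
from dataclasses import dataclass, fields


@dataclass
class DataStatement:
    """Skeleton with the section headings proposed in Bender and Friedman [1].

    Each field holds free text. This class is an illustrative assumption,
    not a format required by the organizers.
    """
    curation_rationale: str = ""
    language_variety: str = ""
    speaker_demographic: str = ""
    annotator_demographic: str = ""
    speech_situation: str = ""
    text_characteristics: str = ""
    recording_quality: str = ""
    other: str = ""
    provenance_appendix: str = ""


def missing_sections(statement: DataStatement) -> list[str]:
    """Return the names of sections that are still empty."""
    return [
        f.name
        for f in fields(statement)
        if not getattr(statement, f.name).strip()
    ]


if __name__ == "__main__":
    draft = DataStatement(
        curation_rationale="Comments collected to study EDI-related language.",
        recording_quality="N/A (written text)",
    )
    # Lists the sections that still need to be written before submission.
    print(missing_sections(draft))
```

Sections that do not apply to a resource (for example, recording quality for written text) are better marked "N/A" than left empty, so that readers can tell an omission from a deliberate decision.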
Paper submission
Each team participating in the datathon is expected to submit a short or long paper along with a data statement. The paper should explain the data collection process and the tools used to collect the resource. The methodology should be documented in enough detail that readers and other researchers can replicate the work from the system description paper. Submit the paper, data statement and data to lt-edi@insight-centre.org and priya.rani@insight-centre.org. The deadline is the same as the workshop deadline.
Organizers
Bharathi Raja Chakravarthi, Insight SFI Research Centre for Data Analytics, Data Science Institute, National University of Ireland Galway
Ruba Priyadharshini, ULTRA Arts and Science College, Madurai, India
Theodorus Fransen, Insight SFI Research Centre for Data Analytics, Data Science Institute, National University of Ireland Galway
Kalika Bali, Microsoft Research India
John P. McCrae, Insight SFI Research Centre for Data Analytics, Data Science Institute, National University of Ireland Galway
Paul Buitelaar, Insight SFI Research Centre for Data Analytics, Data Science Institute, National University of Ireland Galway
Manel Zarrouk, Institut Galilée @ University Sorbonne North Paris
Student Volunteers
Priya Rani, PhD Student, Insight SFI Research Centre for Data Analytics, Data Science Institute, National University of Ireland Galway
Koustava Goswami, PhD Student, Insight SFI Research Centre for Data Analytics, Data Science Institute, National University of Ireland Galway
Shardul Suryawanshi, PhD Student, Insight SFI Research Centre for Data Analytics, Data Science Institute, National University of Ireland Galway
References:
[1] Emily M. Bender and Batya Friedman. 2018. Data statements for natural language processing: Toward mitigating system bias and enabling better science. Transactions of the Association for Computational Linguistics, 6:587–604.
[2] Batya Friedman and Helen Nissenbaum. 1996. Bias in computer systems. ACM Transactions on Information Systems (TOIS), 14(3):330–347.
Task 2: Hope Speech Detection for Equality, Diversity and Inclusion
Health professionals consider hope significant for the well-being, recuperation and restoration of human life. Hope speech reflects the belief that one can discover pathways to one's desired objectives and become motivated to use those pathways [1-5]. Our work aims to change the prevalent way of thinking by moving away from a preoccupation with discrimination, loneliness or the worst things in life, and towards building confidence, support and good qualities based on comments by individuals. The goal of this task is to identify whether or not a comment contains hope speech. A comment/post may contain more than one sentence, but on average a comment in the corpus is one sentence long. Each comment/post is annotated at the comment/post level. The dataset is also class-imbalanced, reflecting real-world scenarios.
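Because the dataset is class-imbalanced, it can help to inspect the label distribution before training. The following is a minimal sketch, not part of the official task kit: the label strings `"hope"` and `"not_hope"` and both function names are hypothetical, and inverse-frequency weighting is only one common way to compensate for imbalance.

```python
from collections import Counter


def label_distribution(labels):
    """Return the fraction of examples carrying each label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def inverse_frequency_weights(labels):
    """Return per-class weights proportional to 1 / class frequency.

    The weights are scaled so that a perfectly balanced dataset
    would give every class a weight of 1.0.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    n_classes = len(counts)
    return {
        label: total / (n_classes * count)
        for label, count in counts.items()
    }


if __name__ == "__main__":
    # Toy labels for illustration only; the real label names may differ.
    toy_labels = ["hope"] + ["not_hope"] * 3
    print(label_distribution(toy_labels))
    print(inverse_frequency_weights(toy_labels))
```

Such weights can be passed to a classifier's loss function so that errors on the rarer class count for more; whether this helps depends on the model and on the evaluation metric used.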
Participants will be provided with training, development and test datasets in English, Tamil and Malayalam. To download the data and participate, go to CodaLab and click the “Participate” tab.
To the best of our knowledge, this is the first shared task on Hope Speech Detection.
Codalab link: https://competitions.codalab.org/competitions/27653
Organizers
Bharathi Raja Chakravarthi, Insight SFI Research Centre for Data Analytics, Data Science Institute, National University of Ireland Galway
Vigneshwaran Muralidaran, School of Computer Science and Informatics, Cardiff University, United Kingdom
References:
[1] Harvey Milk. 1997. The hope speech. We are everywhere: A historical source book of gay and lesbian politics, pages 51–53.
[2] Edward C. Chang. 1998. Hope, problem-solving ability, and coping in a college student population: Some implications for theory and practice. Journal of Clinical Psychology, 54(7):953–962.
[3] Carolyn M. Youssef and Fred Luthans. 2007. Positive organizational behavior in the workplace: The impact of hope, optimism, and resilience. Journal of Management, 33(5):774–800.
[4] Rob Cover. 2013. Queer youth resilience: Critiquing the discourse of hope and hopelessness in LGBT suicide representation. M/C Journal, 16(5).
[5] C. R. Snyder, C. Harris, J. R. Anderson, S. A. Holleran, L. M. Irving, S. T. Sigmon, et al. 1991. The will and the ways: Development and validation of an individual-differences measure of hope. Journal of Personality and Social Psychology, 60:570–585.
[6] https://www.aclweb.org/anthology/2020.peoples-1.5/