Frequently Asked Questions

In the interest of transparency, and inspired by the paper by Carla Parra Escartín, Teresa Lynn, Joss Moorkens, and Jane Dunne, we would like to lay out some information about the Shared Task, participation, equity, and the evaluation criteria.

  1. Who can participate in the DISRPT 2021 Shared Task?

Anyone can participate. In the interest of transparency, however, results from teams that include any of the organizers or annotators of the included datasets will be marked as such once the results are published. If one of the authors of your system is an organizer, an annotator of one or more of the datasets, or both, you must say so in a footnote in the camera-ready version of your paper.


  2. Is there a registration fee, a participation fee, or anything else required to take part?

Participation itself is free of charge and anyone can submit a system; however, you must register for the CODI 2021 workshop in order to submit a paper and have your system evaluated. You can also send an e-mail to disrpt2021@googlegroups.com and we will add you to the distribution list for shared task updates. (If you cannot reach that address, please send a note to Chloé Braud [Chloe.Braud@irit.fr] or Janet Liu [yl879@georgetown.edu] and we will add you from our end.) We also recommend subscribing to the shared task repository on GitHub at https://github.com/disrpt/sharedtask2021


  3. How are systems submitted? Do I have to submit a paper too?

We believe that evaluation and analysis are very important, and therefore we require all systems to be accompanied by a paper using the EMNLP template. All papers will be submitted to the CODI workshop and marked as Shared Task papers. Accepted papers will be published in the Shared Task section of the proceedings of the workshop.

During paper submission, authors will be asked to provide a link to their system, including all necessary resources that are not trivially available (there is no need to provide pre-trained models available from huggingface, for example). All systems must include code to retrain the system from scratch, so that evaluators can test aspects of the system’s performance and reproduce the reported scores, as well as a detailed README file explaining how to train the system. Systems which cannot be run during the evaluation phase will not be accepted.

Please also make sure to set random seeds to keep performance as reproducible as possible!
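
For example, a minimal seed-setting helper might look like the sketch below. It assumes a PyTorch-based system; adapt it to whichever framework and libraries you actually use.

```python
# Minimal seed-setting sketch, assuming a PyTorch-based system.
import os
import random

import numpy as np
import torch


def set_seed(seed: int = 42) -> None:
    """Fix the common sources of randomness so runs are repeatable."""
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Trades some speed for deterministic cuDNN behaviour.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


set_seed(42)
```

Recording the seed (and other hyperparameters) in your README also makes it easier for evaluators to reproduce your reported scores.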


  4. Is there an overall winner to the task, and if so, what are the evaluation criteria?

There are five overall rankings which will be published at the end of the shared task:


  • Discourse Unit Segmentation - from tokenized text (.tok for non-pdtb style corpora)

  • Discourse Unit Segmentation - from treebanked data (.conllu for non-pdtb style corpora)

  • Connective Detection - from tokenized text (.tok for pdtb style corpora)

  • Connective Detection - from treebanked data (.conllu for pdtb style corpora)

  • Relation Classification - from treebanked data (.rels for all corpora)

The overall system ranking in each category will be determined by the macro-average score across treebanks, where each treebank’s score is the micro-averaged metric for that treebank.

For segmentation, the micro-averaged positive-class F-score is used as the per-treebank metric, while for relation classification simple accuracy is used. In all cases, the official Shared Task scorers, available from the GitHub repository, will be used.
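
As a small illustration, the overall score in a category is simply the unweighted (macro) average of the per-treebank scores; the corpus names and numbers below are invented for the example.

```python
# Illustrative computation of an overall category score:
# per-treebank micro-averaged scores (made-up values) are macro-averaged.
per_treebank_scores = {
    "corpus_a": 0.91,
    "corpus_b": 0.87,
    "corpus_c": 0.78,
}

macro_average = sum(per_treebank_scores.values()) / len(per_treebank_scores)
print(f"Overall category score: {macro_average:.4f}")  # 0.8533
```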


  5. Can I use the data from the dev partition as training data as well?

No. Training on the dev partition is not allowed: the final scores used in the overall ranking must come from a model trained solely on the training data. You may still train on train+dev as an experiment and report the resulting scores in your paper, but such results will not be considered or reported as the official scores of your system in the overall ranking.


  6. I have a negative result (a system which under-performs published results). Can I submit it?

Yes. We believe that negative results can move the field forward, especially when they are accompanied by insightful analysis of why a certain approach does not work. However, negative-result papers are not guaranteed to be accepted, and will be reviewed based on the contribution their analyses can make to the field.


  7. Can I opt to exclude or anonymize my results in the overall ranking, for example if I have a negative result?

Yes. You may ask for your results to either not appear at all in the overall ranking, or to appear anonymously.


  8. What should system outputs look like?

System outputs should follow exactly the same format as the gold standard data for each file type. The official scorer expects this format, so if the scorer runs on your output and reports the score you expect, your output format is fine.
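
As a rough sanity check before submission, you can compare the shape of your prediction file against the corresponding gold (or dev) file. The sketch below only assumes that predictions mirror the gold file line for line with the same column counts; the file names are placeholders.

```python
# Rough pre-submission sanity check (illustrative only): predictions should
# mirror the gold file line for line, differing only in the predicted labels.
def check_shape(gold_path: str, pred_path: str) -> None:
    with open(gold_path, encoding="utf-8") as g, open(pred_path, encoding="utf-8") as p:
        gold_lines = g.read().splitlines()
        pred_lines = p.read().splitlines()
    assert len(gold_lines) == len(pred_lines), (
        f"line count mismatch: {len(gold_lines)} gold vs. {len(pred_lines)} predicted"
    )
    for i, (gl, pl) in enumerate(zip(gold_lines, pred_lines), start=1):
        if gl.startswith("#") or not gl.strip():
            # Comment and blank lines should be carried over unchanged.
            assert gl == pl, f"line {i}: comment/blank line differs"
        else:
            assert len(gl.split("\t")) == len(pl.split("\t")), f"line {i}: column count differs"


check_shape("gold_dev_file.tok", "predicted_dev_file.tok")  # placeholder file names
```

The official scorer remains the authoritative check; this only catches obvious formatting problems early.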


  9. What do I do if I don’t have access to LDC resources?

We are aware that not all teams have LDC subscriptions. In the interest of promoting equity regardless of access and funding status, we will evaluate submitted systems on the closed LDC datasets on behalf of authors who cannot test their systems on these datasets themselves, and we will report the scores to authors so that they can add them to their papers for the camera-ready version.


  10. I don’t have access to cloud computing/GPUs - how can I compete?

We believe that equity is an important part of the shared task, and while we cannot make computing resources available to participants, we are considering reporting a score for the best non-neural system in each category (depending on whether such systems are submitted).


  11. I’d like to know more about the corpora used in the shared task. Where can I find such resources?

You can find the relevant resources (e.g. papers and annotation manuals) in the README.md file in each data directory of the GitHub repository.