The Shared Task in Evaluating Accuracy focuses on techniques for evaluating the factual accuracy of texts produced by data-to-text systems. We welcome submissions of both automatic metrics and human evaluation protocols for this task.
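Submissions are scored by comparing the errors they flag against gold-standard human annotations. A minimal sketch of that comparison, assuming errors are identified by document and token position (the representation and function below are illustrative, not the task's actual submission format):

```python
def score(submitted, gold):
    """Compute precision and recall of detected factual errors.

    Each error is a (doc_id, token_position) tuple; a detection
    counts as correct if it exactly matches a gold annotation.
    """
    submitted, gold = set(submitted), set(gold)
    true_positives = len(submitted & gold)
    precision = true_positives / len(submitted) if submitted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall

# Example: two of three submitted detections match the gold annotations,
# and one of three gold errors is missed.
p, r = score(
    submitted=[("game1", 5), ("game1", 12), ("game2", 3)],
    gold=[("game1", 5), ("game2", 3), ("game2", 9)],
)
print(p, r)
```

In practice a matching criterion looser than exact position (e.g. overlapping spans) may be preferable, but the precision/recall structure is the same.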
Craig Thomson and Ehud Reiter. 2021. Generation Challenges: Results of the Accuracy Evaluation Shared Task. In Proceedings of the 14th International Conference on Natural Language Generation, pages 240–248, Aberdeen, Scotland, UK. Association for Computational Linguistics.
Code and data for the shared task are available at https://github.com/ehudreiter/accuracysharedtask
Nicolas Garneau and Luc Lamontagne. 2021. Shared Task in Evaluating Accuracy: Leveraging Pre-Annotations in the Validation Process. In Proceedings of the 14th International Conference on Natural Language Generation, pages 266–270, Aberdeen, Scotland, UK. Association for Computational Linguistics.
Zdeněk Kasner, Simon Mille, and Ondřej Dušek. 2021. Text-in-Context: Token-Level Error Detection for Table-to-Text Generation. In Proceedings of the 14th International Conference on Natural Language Generation, pages 259–265, Aberdeen, Scotland, UK. Association for Computational Linguistics.
Tadashi Nomoto. 2021. Grounding NBA Matchup Summaries. In Proceedings of the 14th International Conference on Natural Language Generation, pages 276–281, Aberdeen, Scotland, UK. Association for Computational Linguistics.
Rayhane Rezgui, Mohammed Saeed, and Paolo Papotti. 2021. Automatic Verification of Data Summaries. In Proceedings of the 14th International Conference on Natural Language Generation, pages 271–275, Aberdeen, Scotland, UK. Association for Computational Linguistics.