October 16, 2022

TextGraphs is a workshop series promoting the synergies between methods of Graph Theory and Natural Language Processing.

Welcome to the web page of the TextGraphs-16 workshop, held at COLING 2022 (https://coling2022.org) on October 16, 2022 in Gyeongju, Republic of Korea. Essential information about the event is given below.

Workshop Description

For the past sixteen years, the workshops in the TextGraphs series have published and promoted work on the synergy between the fields of Graph Theory (GT) and Natural Language Processing (NLP). The mix between the two started small, with graph-theoretical frameworks providing efficient and elegant solutions for NLP applications. Graph-based solutions initially focused on single-document part-of-speech tagging, word sense disambiguation, and semantic role labeling. They became progressively larger to include ontology learning and information extraction from large text collections. Nowadays, graph-based solutions also target Web-scale applications such as information propagation in social networks, rumor proliferation, e-reputation, multiple entity detection, language dynamics learning, and future events prediction, to name a few.

We encourage the description of novel NLP problems or applications that have emerged in recent years and that can be enhanced with existing and new graph-based methods. The sixteenth edition of the TextGraphs workshop aims to extend the focus on graph-based representations to (1) integration, joint training, and use of transformer-based models for graphs and text (such as Graph-BERT and BERT), and (2) domain-specific natural language inference. Related to the former point, we would like to advance the state of the art in natural language understanding facilitated by large-scale language models like GPT-3 and by linguistic relationships represented with graph neural networks. Related to the latter point, we are interested in addressing a challenging task contributing to mathematical proof discovery. Furthermore, we encourage research on applications of graph-based methods to knowledge graphs that links them to related NLP problems and applications.

TextGraphs-16 invites submissions on (but not limited to) the following topics:

  • Graph-based and graph-supported machine learning methods: Graph embeddings and their combinations with text embeddings; Graph-based and graph-supported deep learning (e.g., graph-based recurrent and recursive networks); Probabilistic graphical models and structure learning methods

  • Graph-based methods for Information Retrieval and Extraction: Graph-based methods for word sense disambiguation; Graph-based strategies for semantic relation identification; Encoding semantic distances in graphs; Graph-based techniques for text summarization, simplification, and paraphrasing; Graph-based techniques for document navigation and visualization

  • New graph-based methods for NLP applications: Random walk methods in graphs; Semi-supervised graph-based methods

  • Graph-based methods for applications on social networks

  • Graph-based methods for NLP and Semantic Web: Representation learning methods for knowledge graphs; Using graph-based methods to populate ontologies from textual data

Workshop Program

Time zone is local to the host conference located in Gyeongju, Republic of Korea.

To attend the workshop remotely, use the following link: https://underline.io/events/360/sessions?eventSessionId=13202

Workshop proceedings are available in the ACL Anthology: https://aclanthology.org/volumes/2022.textgraphs-1/

Invited talk

Prof. Dr. Animesh Mukherjee, IIT Kharagpur

NLP4SE: Text-cum-graph based approaches to tackle software engineering issues in large repositories

Large software repositories form a highly valuable networked artifact, usually in the form of a collection of packages, their developers, dependencies among them, and bug reports. This "social network of code" is rarely studied by social network researchers. In the last five years we have been exploring this space extensively. In this talk, I shall introduce new problems that are well-motivated in the software engineering community but not closely studied by NLP/network science researchers. The first is to identify packages that are most likely to be troubled by bugs in the immediate future, thereby demanding the greatest attention. The second constitutes the task of predicting the severity of bugs in the near future. The third is to recommend developers to packages for the next development cycle. We use text cum graph-based approaches to develop solutions to each of these problems. In this process, we curate and release large volumes of data including the long-term history of 20 releases of Ubuntu, growing to over 25,000 packages with their dependency links, maintained by over 3,800 developers, with over 280k bug reports.

Important Dates for Workshop

  • Papers Due: July 17, 2022 (Sunday), extended from July 11

  • Notification of Acceptance: August 28, 2022 (Sunday), moved from August 22

  • Camera-ready papers due: September 11, 2022 (Sunday), moved from September 5

  • Workshop date: October 16, 2022

Shared Task

We invite participation in the 1st Shared Task on Natural Language Premise Selection associated with the 16th Workshop on Graph-Based Natural Language Processing (TextGraphs 2022).

The task proposed this year is Natural Language Premise Selection (NLPS) (Ferreira et al., 2020a), inspired by the field of automated theorem proving. NLPS takes as input a mathematical statement written in natural language and outputs a set of relevant sentences (premises) that could help an end user find a proof of that statement. The premises are composed of supporting definitions and propositions that can act as explanations for the proof process.
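As a rough illustration of the input and output of the task, the sketch below ranks candidate premises against a statement by TF-IDF cosine similarity. It is not the shared task's official baseline or evaluation code: the function names (`tokenize`, `rank_premises`), the toy premises, and the choice of TF-IDF are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_premises(statement, premises, top_k=2):
    """Return the top_k premises most similar to the statement.

    Hypothetical baseline: TF-IDF vectors compared by cosine similarity.
    """
    docs = [tokenize(p) for p in premises]
    n_docs = len(docs)
    # Document frequency: number of premises that contain each token.
    df = Counter(tok for doc in docs for tok in set(doc))

    def vectorize(tokens):
        tf = Counter(tokens)
        # Smoothed IDF keeps weights positive, also for unseen tokens.
        return {
            tok: count * (math.log((1 + n_docs) / (1 + df[tok])) + 1.0)
            for tok, count in tf.items()
        }

    def cosine(u, v):
        dot = sum(w * v.get(tok, 0.0) for tok, w in u.items())
        norm_u = math.sqrt(sum(w * w for w in u.values()))
        norm_v = math.sqrt(sum(w * w for w in v.values()))
        return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

    query = vectorize(tokenize(statement))
    scored = [(cosine(query, vectorize(doc)), p) for doc, p in zip(docs, premises)]
    # Highest similarity first; ties keep the original premise order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored[:top_k]]


# Toy data invented for this example (not taken from the shared task).
premises = [
    "A prime number is a natural number greater than 1 with exactly two positive divisors.",
    "An even number is an integer that is divisible by 2.",
    "A graph is bipartite if its vertices can be split into two independent sets.",
]
statement = "Every even number greater than 2 is not a prime number."

selected = rank_premises(statement, premises, top_k=2)
for premise in selected:
    print(premise)
```

A lexical ranker like this only captures word overlap. Graph-based approaches of the kind the workshop targets would additionally model links between statements and premises, which this sketch does not attempt.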

Important Dates for Shared Task

  • Training and development data release: June 30, 2022

  • Test data release; Evaluation start: July 15, 2022

  • Evaluation end: August 27, 2022

  • System description paper deadline: August 29, 2022

  • Author notifications: September 2, 2022

  • Camera-ready papers: September 11, 2022

Shared task participation is not mandatory for participation in the workshop.


Submission

  • We invite submissions of up to eight (8) pages, plus bibliography, for long papers and up to four (4) pages, plus bibliography, for short papers.

  • The COLING 2022 templates must be used; these are provided in LaTeX and also Microsoft Word format. Submissions will only be accepted in PDF format. Download the Word and LaTeX templates here: https://coling2022.org/Cpapers.

  • Submit papers by the end of the deadline day (timezone is UTC-12) via our Softconf Submission Site: https://www.softconf.com/coling2022/TextGraphs-16/


Please direct all questions and inquiries to our official e-mail address (textgraphsOC@gmail.com) or contact any of the organizers via their individual e-mail addresses. You can also join us on Facebook: https://www.facebook.com/groups/900711756665369.

Previous Editions

TextGraphs-16 builds on the success of the previous fifteen workshops, organized in association with international NLP conferences such as ACL, COLING, and EMNLP: http://textgraphs.org.

  • TextGraphs-15 was organized online at NAACL-HLT in June 2021 and attracted 478 people. We organized a shared task on many-hop multi-hop inference for explanation regeneration that attracted four teams from around the world, after which we additionally accepted three system description papers and included one report by the shared task organizers.

  • TextGraphs-14 was organized online at COLING in December 2020. That year's shared task on multi-hop explanation regeneration attracted nine teams from around the world, leading to the additional acceptance of four system description papers and one report by the shared task organizers. The host conference did not count registrations.

  • TextGraphs-13 was held at EMNLP-IJCNLP in Hong Kong in November 2019 and attracted 186 people. For the first time, we organized a shared task on multi-hop explanation regeneration, which received submissions from five teams from around the world and expanded our proceedings by five participation report papers and one shared task description paper.

Program Committee

  • Amir Bakarov, Behavox

  • Flavio Massimiliano Cecchini, Università Cattolica del Sacro Cuore

  • Mikhail Chernoskutov, Krasovskii Institute of Mathematics and Mechanics

  • Hejie Cui, Emory University

  • Stefano Faralli, University of Rome Sapienza

  • Michael Flor, Educational Testing Service

  • Natalia Grabar, CNRS STL UMR8163, Université de Lille

  • Aayushee Gupta, International Institute of Information Technology, Bangalore

  • Rima Hazra, Indian Institute of Technology, Kharagpur

  • Dmitry Ilvovsky, National Research University Higher School of Economics

  • Rohith Gowtham Kodali, ambientone

  • Andrey Kutuzov, University of Oslo

  • Ping Li, Southwest Petroleum University

  • Suman Kalyan Maity, MIT

  • Gabor Melli, Sony PlayStation

  • Enrique Noriega-Atala, The University of Arizona

  • Damien Nouvel, Inalco (ERTIM)

  • Jan Wira Gotama Putra, SmartNews, Inc.

  • Leonardo F. R. Ribeiro, TU Darmstadt

  • Minoru Sasaki, Ibaraki University

  • Viktor Schlegel, University of Manchester

  • Mark Steedman, University of Edinburgh

  • Mihai Surdeanu, University of Arizona

  • Adrian Ulges, RheinMain University of Applied Sciences

  • Vaibhav Vaibhav, Apple

  • Mariana Vargas Vieyra, Inria Lille Nord Europe

  • Xiang Zhao, National University of Defense Technology


Organizers

  • Dmitry Ustalov, Ph.D., Toloka

  • Yanjun Gao, Ph.D., University of Wisconsin-Madison

  • Abhik Jana, Ph.D., University of Hamburg

  • Prof. Thien Huu Nguyen, Ph.D., University of Oregon

  • Prof. Gerald Penn, Ph.D., University of Toronto

  • Arti Ramesh, Ph.D., Educational Testing Service

  • Prof. Alexander Panchenko, Ph.D., Skoltech

  • Mokanarangan Thayaparan, University of Manchester & Idiap Research Institute

  • Marco Valentino, University of Manchester & Idiap Research Institute

Image source: https://en.wikipedia.org/wiki/Higman–Sims_graph