Deep Learning Inside Out (DeeLIO):

Knowledge Extraction and Integration

Workshop Proposal

Deep Learning Inside Out (DeeLIO): The First Workshop on Knowledge Extraction and Integration for Deep Learning Architectures

Workshop Description

Deep learning methods have opened up a new era in NLP, providing the community with extremely powerful tools and language representations, and reaching impressive performance in numerous tasks. After the initial enthusiasm stirred by this success, the community started looking inside the box to understand what is encoded there, but also outside of the same (neural) box, seeking other potentially useful sources of language-related information. Several fundamental questions arise, depending on the direction of information exchange. What kind of knowledge do neural models capture about language and the real world? How is this knowledge encoded in the resulting representations, and how can it be effectively extracted to enrich external knowledge repositories and linguistic data? In a similar vein, but going the other way around, an open question is whether feeding in external knowledge can enhance the learned representations and extend neural models’ understanding capabilities, and how end-to-end (distributional) training can be combined with the discrete knowledge coded in external repositories.

Another important challenge in efficiently exploiting external resources in deep learning is their limited coverage. Available resources are often incomplete even in resource-rich languages like English, but scarcity becomes critical in the cross-lingual setting. Open questions concern how to enrich external resources in resource-lean languages through cross-lingual transfer or joint multilingual modeling, how to combine external knowledge with data-hungry architectures in order to address the needs of cross-lingual NLP and deep learning, and how to exploit knowledge from resource-richer languages to boost deep learning representations and applications for resource-leaner ones. From another perspective, the amount of readily available data to train deep learning methods is smaller in such languages in the first place; therefore, it remains to be seen if leveraging external linguistic knowledge can be even more beneficial for these languages.

Deep Learning Inside Out (DeeLIO), the first workshop on knowledge extraction and integration for deep learning architectures, aims to bring together the interpretation, extraction and integration lines of research, and cover the area in between. It will explore the introduction of external knowledge in deep learning models and representations, the types of linguistic and real-world knowledge neural nets encode, the extent to which this knowledge can be used for building resources, and whether re-integrating it into the models is beneficial compared to using external hand-crafted knowledge. Furthermore, the proposed workshop has a strong focus on structurally diverse languages with varying semantic-syntactic properties (going well beyond English) and on low-data regimes. The workshop also aims to inspire novel variation-aware transfer learning and multilingual solutions that use knowledge from resource-rich languages (both knowledge extracted from large models and knowledge from readily available repositories in the source language) to inform deep learning architectures where external repositories are scarce or missing. Unlike BlackboxNLP and other related initiatives, the DeeLIO workshop focuses on “deeper” lexico-semantic knowledge that can be recovered from or integrated into deep learning methods across a variety of languages.

Key Topics

Topics of interest include, but are not limited to:

    • introduction of external knowledge in neural networks (in the form of semantic specialization of embeddings, retrofitting, joint modeling, or other)
    • exploration of the types of linguistic and extra-linguistic knowledge that neural models, architectures and representations encode (similar to BERTology initiatives), but with a focus on extracting and using this knowledge in practice rather than only interpreting it: i.e., to enrich incomplete external repositories and/or to transfer the knowledge to resource-leaner target languages
    • analyzing and understanding the limitations of the knowledge that is acquired by current neural models
    • probing and analyzing different types of hand-crafted knowledge that can enhance “blind” distributional models. Which type of knowledge (external or internally encoded) is more beneficial? Can these two sources of knowledge complement each other?
    • usage of deep learning models for the development and enrichment of lexico-semantic knowledge resources
    • usage of (semi-)automatically compiled resources and their (re)integration into deep learning models
    • using external knowledge in resource-lean languages through transfer techniques or joint multilingual modeling

Submission Guidelines

  • TBD

Important Dates

  • TBD

All deadlines are at 23:59 UTC-10.