Accepted Papers
The DeepLo 2022 proceedings are now available.
Introducing QuBERT: A Large Monolingual Corpus and BERT Model for Southern Quechua
Rodolfo Zevallos, John Ortega, William Chen, Richard Castro, Núria Bel, Cesar Toshio, Renzo Venturas, Hilario Aradiel and Nelsi Melgarejo
Improving Distantly Supervised Document-Level Relation Extraction Through Natural Language Inference
Clara Vania, Grace Lee and Andrea Pierleoni
IDANI: Inference-time Domain Adaptation via Neuron-level Interventions
Omer Antverg, Eyal Ben-David and Yonatan Belinkov
Generating unlabelled data for a tri-training approach in a low resourced NER task
Hugo Boulanger, Thomas Lavergne and Sophie Rosset
ANTS: A Framework for Retrieval of Text Segments in Unstructured Documents
Brian Chivers, Mason P. Jiang, Wonhee Lee, Amy Ng, Natalya I. Rapstine and Alex Storer
Cross-TOP: Zero-Shot Cross-Schema Task-Oriented Parsing
Melanie A. Rubino, Nicolas Guenon des Mesnards, Uday Shah, Nanjiang Jiang, Weiqi Sun and Konstantine Arkoudas
Help from the Neighbors: Estonian Dialect Normalization Using a Finnish Dialect Generator
Mika Hämäläinen, Khalid Alnajjar and Tuuli Tuisk
Exploring diversity in back translation for low-resource machine translation
Laurie Burchell, Alexandra Birch and Kenneth Heafield
Punctuation Restoration in Spanish Customer Support Transcripts using Transfer Learning
Xiliang Zhu, Shayna Gardiner, David Rossouw, Tere Roldán and Simon Corston-Oliver
[🏆 Spotlight Paper] Pre-training Data Quality and Quantity for a Low-Resource Language: New Corpus and BERT Models for Maltese
Kurt Micallef, Albert Gatt, Marc Tanti, Lonneke van der Plas and Claudia Borg
Event Extractor with Only a Few Examples
Pengfei Yu, Zixuan Zhang, Clare Voss, Jonathan May and Heng Ji
Task Transfer and Domain Adaptation for Zero-Shot Question Answering
Xiang Pan, Alex Sheng, David Shimshoni, Aditya Singhal, Sara Rosenthal and Avirup Sil
Let the Model Decide its Curriculum for Multitask Learning
Neeraj Varshney, Swaroop Mishra and Chitta Baral
AfriTeVA: Extending “Small Data” Pretraining Approaches to Sequence-to-Sequence Models
Odunayo Jude Ogundepo, Akintunde Oladipo, Mofetoluwa Adeyemi, Kelechi Ogueji and Jimmy Lin
Few-shot Learning for Sumerian Named Entity Recognition
Guanghai Wang, Yudong Liu and James Hearne
Deep Learning-Based Morphological Segmentation for Indigenous Languages: A Study Case on Innu-Aimun
Ngoc Tan Le, Antoine Cadotte, Mathieu Boivin, Fatiha Sadat and Jimena Terraza
[🏆 Spotlight Paper] Clean or Annotate: How to Spend a Limited Data Collection Budget
Derek Chen, Zhou Yu and Samuel R. Bowman
Unsupervised Knowledge Graph Generation Using Semantic Similarity Matching
Lixian Liu, Amin Omidvar, Zongyang Ma, Ameeta Agrawal and Aijun An
FarFetched: Entity-centric Reasoning and Claim Validation for the Greek Language based on Textually Represented Environments
Dimitris Papadopoulos, Katerina Metropoulou, Nikolaos Papadakis and Nikolaos Matsatsinis
Alternative non-BERT model choices for the textual classification in low-resource languages and environments
Syed Mustavi Maheen, Moshiur Rahman Faisal, Md. Rafakat Rahman and Md. Shahriar Karim
Generating Complement Data for Aspect Term Extraction with GPT-2
Amir Pouran Ben Veyseh, Franck Dernoncourt, Bonan Min and Thien Huu Nguyen
How to Translate Your Samples and Choose Your Shots? Analyzing Translate-train & Few-shot Cross-lingual Transfer
Iman Jundi and Gabriella Lapesa
Unified NMT models for the Indian subcontinent, transcending script-barriers
Gokul N.C.