PrivateNLP 2021

Third Workshop on Privacy in Natural Language Processing

Colocated with NAACL 2021, June 11, 2021, Virtual, Worldwide


Privacy-preserving data analysis has become essential in the age of Machine Learning (ML), where access to vast amounts of data can provide gains over tuned algorithms. A large proportion of user-contributed data comes from natural language, e.g., text transcriptions from voice assistants.

It is therefore important to curate NLP datasets while preserving the privacy of the users whose data is collected, and train ML models that only retain non-identifying user data.

The workshop aims to bring together practitioners and researchers from academia and industry to discuss the challenges and approaches to designing, building, verifying, and testing privacy-preserving systems in the context of Natural Language Processing.

Information about the workshop's topics of interest can be found in the Call for Papers.

Key Dates

  • Submission Deadline: March 22, 2021 (11:59pm UTC-12; extended from March 15, 2021)

  • Acceptance Notification: April 15, 2021

  • Camera-ready versions: April 26, 2021

  • Workshop: June 11, 2021

Invited Speakers

Travis Breaux (Carnegie Mellon University)

Adam Dziedzic (Vector Institute and The University of Toronto)


Venue: Virtual

Date: June 11, 2021

Timezone: PST – Pacific Standard Time

08:00 - 08:10 Welcome

    • Sepideh Ghanavati

08:10 - 09:10 Invited Talk

    • Adam Dziedzic (Vector Institute)

09:10 - 09:30 Break

    • Morning break

09:30 - 09:50 Research Paper

    • An Investigation towards Differentially Private Sequence Tagging in a Federated Framework

    • Abhik Jana and Chris Biemann

09:50 - 10:10 Research Paper

    • A Privacy-Preserving Approach to Extraction of Personal Information through Automatic Annotation and Federated Learning

    • Rajitha Hathurusinghe, Isar Nejadgholi and Miodrag Bolic

10:10 - 10:30 Research Paper

    • Understanding Unintended Memorization in Language Models Under Federated Learning

    • Om Dipakbhai Thakkar, Swaroop Ramaswamy, Rajiv Mathews and Francoise Beaufays

10:30 - 10:45 Break

    • Short break

10:45 - 11:45 Invited Talk

    • Travis Breaux (Carnegie Mellon University)

11:45 - 12:30 Break

    • Lunch break

12:30 - 12:50 Research Paper

    • Learning and Evaluating a Differentially Private Pre-trained Language Model

    • Shlomo Hoory, Amir Feder, Avichai Tendler, Alon Cohen, Sofia Erell, Itay Laish, Hootan Nakhost, Uri Stemmer, Ayelet Benjamini, Avinatan Hassidim and Yossi Matias

12:50 - 13:10 Research Paper

    • Anonymisation Models for Text Data: State of the art, Challenges and Future Directions

    • Pierre Lison, Ildikó Pilán, David Sánchez, Montserrat Batet and Lilja Øvrelid

13:10 - 13:20 Break

    • Short break

13:20 - 13:40 Research Paper

    • Using Confidential Data for Domain Adaptation of Neural Machine Translation

    • Sohyung Kim, Arianna Bisazza and Fatih Turkmen

13:40 - 14:00 Research Paper

    • Private Text Classification with Convolutional Neural Networks

    • Samuel Adams, David Melanson and Martine De Cock

14:00 - 14:20 Research Paper

    • On a Utilitarian Approach to Privacy Preserving Text Generation

    • Zekun Xu, Abhinav Aggarwal, Oluwaseyi Feyisetan and Nathanael Teissier

14:20 - 14:50 Community discussion / Informal panel

    • Patricia Thaine

14:50 - 15:00 Closing remarks

    • Oluwaseyi Feyisetan

Previous Workshops

PrivateNLP at WSDM 2020

PrivateNLP at EMNLP 2020


For questions/queries regarding the workshop or submission: