PrivateNLP 2020

First Workshop on Privacy in Natural Language Processing

Colocated with WSDM 2020, Feb 7, 2020, Houston, Texas


Privacy-preserving data analysis has become essential in the age of Machine Learning (ML), where access to vast amounts of data can provide gains over tuned algorithms. A large proportion of user-contributed data comes from natural language, e.g., text transcriptions from voice assistants.

It is therefore important to curate NLP datasets while preserving the privacy of the users whose data is collected, and to train ML models that retain only non-identifying user data.

The workshop aims to bring together practitioners and researchers from academia and industry to discuss the challenges of, and approaches to, designing, building, verifying, and testing privacy-preserving systems in the context of Natural Language Processing.

The topics to be covered in the workshop include:

  • Personal information detection: classifying whether text contains the author’s personal information,

  • Privacy-preserving text analysis: how differential privacy and homomorphic encryption can be used to preserve user privacy when integrated in word embeddings and calculations that use them,

  • Privacy enhancing technologies and AI: discussing the state of research in privacy-preserving AI and the kinds of technologies that can be integrated into AI in order to preserve privacy.


Venue: Hyatt Regency Houston/Galleria | Room: Regency D (Level 2)

08:45 Welcome

    • Oluwaseyi Feyisetan

09:00 Keynote

10:00 Coffee break

10:30 Invited Talk

11:15 Invited Talk

12:00 Lunch

13:00 Research Paper

13:30 Research Paper

14:00 Research Paper

14:30 Research Paper

15:00 Coffee break

15:30 Panel discussion

16:00 Closing remarks

Invited Speakers

Tom Diethe (Amazon, UK) - Keynote Speaker

Tom Diethe is an Applied Science Manager in Amazon Research, Cambridge UK. Tom is also an Honorary Research Fellow at the University of Bristol. Tom was formerly a Research Fellow for the "SPHERE" Interdisciplinary Research Collaboration, which is designing a platform for eHealth in a smart-home context. This platform is currently being deployed into homes throughout Bristol.

Tom specializes in probabilistic methods for machine learning, applications to digital healthcare, and privacy enhancing technologies. He has a Ph.D. in Machine Learning applied to multivariate signal processing from UCL, and was employed by Microsoft Research Cambridge where he co-authored a book titled "Model-Based Machine Learning." He also has significant industrial experience, with positions at QinetiQ and the British Medical Journal. He is a fellow of the Royal Statistical Society and a member of the IEEE Signal Processing Society.

Oluwaseyi Feyisetan (Amazon, USA)

Oluwaseyi Feyisetan is an Applied Scientist at Amazon Alexa where he works on Differential Privacy and Privacy Auditing mechanisms within the context of Natural Language Processing. He holds two pending patents with Amazon on preserving privacy in NLP systems. He completed his PhD at the University of Southampton in the UK and has published in top tier conferences and journals on crowdsourcing, homomorphic encryption, and privacy in the context of Active Learning and NLP. He has served as a reviewer at top NLP conferences including ACL and EMNLP. He is the lead organizer of the Workshop on Privacy and Natural Language Processing (PrivateNLP) at WSDM with an upcoming event scheduled for EMNLP. Prior to working at Amazon in the US, he spent 7 years in the UK where he worked at different startups and institutions focusing on regulatory compliance, machine learning and NLP within the finance sector, most recently at the Bank of America.

Patricia Thaine (University of Toronto, Canada)

Patricia Thaine is a PhD Candidate at the Department of Computer Science (University of Toronto) doing research on Privacy-Preserving Natural Language Processing, with a special focus on Applied Cryptography. She also does research on computational methods for lost language decipherment. Patricia is a recipient of the NSERC Postgraduate Scholarship, the RBC Graduate Fellowship, the Beatrice ‘Trixie’ Worsley Graduate Scholarship in Computer Science, and the Ontario Graduate Scholarship. She has eight years of research and software development experience, including at the McGill Language Development Lab, the University of Toronto’s Computational Linguistics Lab, the University of Toronto’s Department of Linguistics, and the Public Health Agency of Canada. She is the Co-Founder and CEO of Private AI, the former President of the Computer Science Graduate Student Union at the University of Toronto, and a member of the Board of Directors of Equity Showcase, one of Canada’s oldest not-for-profit charitable organizations.

Key Dates

Information about the workshop's topics of interest can be found in the Call for Papers.

We have a flyer. Help us publicize the workshop by sharing it with colleagues.

  • Submission Deadline: December 15, 2019 (extended from December 7, 2019)

  • Acceptance Notification: January 4, 2020

  • Poster / extended abstract deadline: January 10, 2020

  • Camera-ready versions: January 20, 2020

  • Workshop: February 7, 2020