PrivateNLP@WSDM 2020
First Workshop on Privacy in Natural Language Processing
Colocated with WSDM 2020, Feb 7, 2020, Houston, Texas
Overview
Privacy-preserving data analysis has become essential in the age of Machine Learning (ML), where access to vast amounts of data can provide gains over tuned algorithms. A large proportion of user-contributed data comes from natural language, e.g., text transcriptions from voice assistants.
It is therefore important to curate NLP datasets while preserving the privacy of the users whose data is collected, and train ML models that only retain non-identifying user data.
The workshop aims to bring together practitioners and researchers from academia and industry to discuss the challenges and approaches to designing, building, verifying, and testing privacy-preserving systems in the context of Natural Language Processing.
The topics to be covered in the workshop include:
Personal information detection: classifying whether text contains the author’s personal information,
Privacy-preserving text analysis: how differential privacy and homomorphic encryption can be used to preserve user privacy when integrated in word embeddings and calculations that use them,
Privacy-enhancing technologies and AI: discussing the state of research in privacy-preserving AI and the kinds of technologies that can be integrated into AI in order to preserve privacy.
Agenda
Venue: Hyatt Regency Houston/Galleria | Room: Regency D (Level 2)
08:45 Welcome
Oluwaseyi Feyisetan
09:00 Keynote
Tom Diethe (Amazon UK)
10:00 Coffee break
10:30 Invited Talk
Perfectly Privacy-Preserving AI: What is it and how do we achieve it?
Patricia Thaine (University of Toronto)
11:15 Invited Talk
Oluwaseyi Feyisetan (Amazon US)
12:00 Lunch
13:00 Research Paper
Privacy-Aware Personalized Entity Representations for Improved User Understanding
Levi Melnick (Microsoft)
13:30 Research Paper
Classification of Encrypted Word Embeddings using Recurrent Neural Networks
Robert Podschwadt (Georgia State University)
14:00 Research Paper
A K M Nuhil Mehdy (Boise State University)
14:30 Research Paper
Vijayanta Jain (University of Maine)
15:00 Coffee break
15:30 Panel discussion
16:00 Closing remarks
Invited Speakers
Tom Diethe (Amazon, UK) - Keynote Speaker
Tom Diethe is an Applied Science Manager in Amazon Research, Cambridge UK. Tom is also an Honorary Research Fellow at the University of Bristol. Tom was formerly a Research Fellow for the "SPHERE" Interdisciplinary Research Collaboration, which is designing a platform for eHealth in a smart-home context. This platform is currently being deployed into homes throughout Bristol.
Tom specializes in probabilistic methods for machine learning, applications to digital healthcare, and privacy-enhancing technologies. He has a Ph.D. in Machine Learning applied to multivariate signal processing from UCL, and was employed by Microsoft Research Cambridge where he co-authored a book titled "Model-Based Machine Learning." He also has significant industrial experience, with positions at QinetiQ and the British Medical Journal. He is a fellow of the Royal Statistical Society and a member of the IEEE Signal Processing Society.
Oluwaseyi Feyisetan (Amazon, USA)
Oluwaseyi Feyisetan is an Applied Scientist at Amazon Alexa where he works on Differential Privacy and Privacy Auditing mechanisms within the context of Natural Language Processing. He holds two pending patents with Amazon on preserving privacy in NLP systems. He completed his PhD at the University of Southampton in the UK and has published in top-tier conferences and journals on crowdsourcing, homomorphic encryption, and privacy in the context of Active Learning and NLP. He has served as a reviewer at top NLP conferences including ACL and EMNLP. He is the lead organizer of the Workshop on Privacy and Natural Language Processing (PrivateNLP) at WSDM with an upcoming event scheduled for EMNLP. Prior to working at Amazon in the US, he spent seven years in the UK where he worked at different startups and institutions focusing on regulatory compliance, machine learning, and NLP within the finance sector, most recently at the Bank of America.
Patricia Thaine (University of Toronto, Canada)
Patricia Thaine is a PhD Candidate at the Department of Computer Science (University of Toronto) doing research on Privacy-Preserving Natural Language Processing, with a special focus on Applied Cryptography. She also does research on computational methods for lost language decipherment. Patricia is a recipient of the NSERC Postgraduate Scholarship, the RBC Graduate Fellowship, the Beatrice ‘Trixie’ Worsley Graduate Scholarship in Computer Science, and the Ontario Graduate Scholarship. She has eight years of research and software development experience, including at the McGill Language Development Lab, the University of Toronto’s Computational Linguistics Lab, the University of Toronto’s Department of Linguistics, and the Public Health Agency of Canada. She is the Co-Founder and CEO of Private AI, the former President of the Computer Science Graduate Student Union at the University of Toronto, and a member of the Board of Directors of Equity Showcase, one of Canada’s oldest not-for-profit charitable organizations.
Key Dates
Information about the workshop's topics of interest can be found in the Call for Papers.
We have a flyer; help us publicize the workshop by sharing it with colleagues.
Submission deadline: December 15, 2019 (originally December 7, 2019)
Acceptance notification: January 4, 2020
Poster / extended abstract deadline: January 10, 2020
Camera-ready versions: January 20, 2020
Workshop: February 7, 2020
Contact
privatenlp-wsdm@googlegroups.com