Short Bio/CV

Zdravo! Ciao! Hola! Здравствуйте! Welkom! Welcome! Bienvenue! Willkommen! 

I am a Senior Research Associate at the University of Cambridge, where I have been working on the LEXICAL project (Lexical Acquisition Across Languages, 2015-2020) since October 2015, together with Anna Korhonen (the PI of the project) and Roi Reichart.

I hold a PhD in computer science from KU Leuven (awarded summa cum laude with congratulations of the board of examiners).

I am very interested in natural language processing, human language understanding, machine learning theory and applications, and information retrieval, mostly in multilingual/cross-lingual and multi-modal settings. This includes (but is not limited to) bilingual lexicon extraction and cross-lingual semantic modeling, cross-lingual and multilingual information retrieval, distributional semantics, cross-lingual text mining and knowledge transfer, language grounding and cognitive modeling of language, lexical acquisition, text representation learning, latent topic models, probabilistic modeling of text data, terminology mining and alignment, machine translation, unsupervised techniques for languages with scarce resources, multi-idiomatic and multi-modal information search and retrieval, and multi-modal and visually/perceptually-enhanced semantics.

Feel free to contact me if your research lies within these or related areas!

In the meantime, also feel free to take a look at my CV, longer bio, and publications.

For reasons beyond my own comprehension, I have created a Twitter account: @licwu

I am also a Senior Scientist at PolyAI, a young and extremely talented startup company working on cutting-edge Conversational AI technology.

Breaking News

December 2018: I got tired of posting news here, so I'll just make updates to the publications list and CV from now on.

September 2018: I gave an invited talk at the FoTran workshop in Helsinki.

August 2018: Two long papers accepted for EMNLP 2018 in Brussels (with Edoardo, Daniela, Goran, Nikola, Roi, Anna). We will also present our TACL paper there.

August 2018: I gave a course (a series of five 90-minute lectures) on the topic of word vector space specialisation at ESSLLI 2018 in Sofia.

July 2018: I gave an invited talk on data-driven dialogue state tracking using word embeddings at the Conversational AI Summer School in Moscow. The slides are here.

June 2018: Geert's paper on bilingual lexicon induction for biomedicine using deep neural nets accepted in BMC Bioinformatics.

June 2018: I co-lectured a tutorial on deep learning for conversational AI with the PolyAI crew at NAACL 2018 in New Orleans. The slides are here.

May 2018: TACL paper on (improving) language modeling across 50 languages to appear soon (congratulations to Daniela and other authors).

April 2018: 6 papers (4 long papers and 2 short papers) accepted for ACL 2018 in Melbourne (with so many amazing collaborators).

April 2018: Short paper on fully unsupervised CLIR accepted for SIGIR 2018 in Ann Arbor (with the Mannheim crew: Robert Litschko, Goran, and Simone Paolo Ponzetto).

February 2018: Two first-author long papers and one short paper accepted for NAACL 2018 in New Orleans (with Nikola x2; with Anna and Goran; with Goran).

February 2018: Together with the PolyAI crew, I will co-lecture a tutorial on deep learning for conversational AI at NAACL 2018 in New Orleans.

January 2018: Billy's article on word similarity datasets for biomedicine accepted in the BMC Bioinformatics journal (with Billy, Sampo, and Anna).

October 2017: A survey paper on cross-lingual word embedding models is on arXiv (with Sebastian and Anders).

September 2017: Nikola and I will teach a course on vector space specialisation at ESSLLI 2018 in Sofia.

September 2017: Olga's paper on translatability of VerbNets accepted for the Language Resources and Evaluation journal (with lots of people involved)

September 2017: I gave a talk on cross-lingual embeddings @CPH NLP Meetup in Copenhagen and @Apple in Cambridge.

September 2017: Anders and I gave a tutorial on cross-lingual word representations at EMNLP 2017. The slides are here.

September 2017: I got promoted to Senior Research Associate!

July 2017: Paper on (evaluation of) graded lexical entailment accepted in Computational Linguistics! Check the HyperLex dataset and the non-final paper version.

July 2017: Long paper on cross-lingual induction and transfer of verb classes accepted for EMNLP 2017! (with Nikola and Anna)

April 2017 - (in progress): I am (co-)writing a book for Morgan & Claypool on cross-lingual representation learning with Manaal Faruqui (Google) and Anders Søgaard (UCPH)

Old(er) News

May 2017: TACL paper on semantic specialisation using the Attract-Repel model accepted! (with Nikola and many more co-authors) Code, vectors, DST evaluation sets: available online!

May 2017: Long paper on decoding sentiment from distributed sentence representations accepted for *SEM 2017! (with Edoardo and Anna)

April 2017: Nikola, Taher, and I gave a tutorial on vector space specialisation at EACL 2017. The slides are available here.

March 2017: Nikola and I gave a talk at Bar-Ilan.

March 2017: Long paper on "morph-fitting" accepted for ACL 2017! (with Nikola, Roi, Anna, Diarmuid Ó Séaghdha, and Steve Young) Code and vectors available online!

February 2017: Short paper on syntactically informed cross-lingual word representations accepted for EACL 2017! (single-author)

January 2017: A tutorial on cross-lingual word representations with Manaal Faruqui and Anders Søgaard accepted for EMNLP 2017! (Sept. 7 2017)

January 2017: A tutorial on vector space specialisation with Nikola Mrkšić and Taher Pilehvar accepted for EACL 2017! (April 3 2017)

December 2016: Two long papers accepted for EACL 2017! (with Douwe and Anna; with Geert and Sien) Our script for evaluating word representation learning models on the word association task will be available soon.

September 2016: I gave a talk at the South England NLP Meetup, organised by the UCL Machine Reading Group (Sept. 22nd)

August 2016: Our HyperLex dataset targeting graded lexical entailment is finally online! Check the dataset and the accompanying paper! (with Daniela, Douwe, Felix, and Anna)

July 2016: Long paper accepted for EMNLP 2016! (with Daniela Gerz, Felix Hill, Roi Reichart, and Anna Korhonen). Verb similarity dataset available here!

June 2016: Information Sciences article accepted! (with Susana Zoghbi and Sien Moens)

May 2016: Long paper accepted for ACL 2016! (with Anna Korhonen)

April 2016: Two short papers accepted for ACL 2016! (with Anna Korhonen; with Douwe Kiela, Stephen Clark, and Sien Moens)