Evelyne Tzoukermann, Ph.D.

evelyne.tzoukermann at gmail.com


  • Natural Language Processing: analysis infrastructure including morphological processing, part-of-speech tagging, and lexical semantic relations for multilingual corpus analysis
  • Machine translation and speech-to-speech machine translation
  • Information Extraction from text and multimedia
  • Summarization from text and multimedia
  • Information Retrieval from text and multimedia
  • Text-to-Speech Synthesis



I have been a Senior Researcher and Manager of a Natural Language Processing research group at StreamSage, part of Comcast, since 2003. My current research applies natural language processing techniques to high-quality information extraction and retrieval over video input. More specifically, effective retrieval over video requires creative combinations of technologies to triangulate on the retrieval task: natural language processing over speech and text, combined with video information to locate characters, objects, and actions in multimedia content.

Prior to joining StreamSage, I was a member of the technical research staff at Bell Laboratories (first AT&T and then Lucent Technologies). I developed new approaches to automatic morphological processing using finite-state machines and to part-of-speech tagging based on linguistic knowledge, statistical information, and weighted finite-state transducers. I also designed and developed text-to-speech systems for Continental and Canadian French, the components of which include an acoustic inventory, a duration module, an intonation module, and grapheme-to-phoneme conversion. My research on information extraction and summarization focused on extracting the gist of email content for presentation to users.

Patents cover: information indexing for retrieval; identification of segment boundaries within audio, video, and multimedia items; segmentation specific to TV programs; organization of segments of media to determine relevance to a query. 


SELECTED PUBLICATIONS – Book Chapters, Journal Articles

  • Tzoukermann, Evelyne, Geetu Ambwani, Amit Bagga, Leslie Chipman, Tony Davis, Ryan Farrell, David Houghton, Oliver Jojic, Jan Neumann, Robert Rubinoff, Bageshree Shevade, and Hongzhong Zhou, “Semantic Multimedia Extraction using Audio and Video”, in Multimedia Information Extraction, Editor:  Mark Maybury, to appear.
  • Tzoukermann, Evelyne, Judith Klavans, and Tomek Strzalkowski, “Information Retrieval and Natural Language Processing”, in Handbook of Computational Linguistics, Editor: Ruslan Mitkov, Oxford University Press, England, 2002.
  • Jing, Hongyan and Evelyne Tzoukermann, “Determining Semantic Equivalence of Terms in Information Retrieval”, in Recent Advances in Computational Terminology, eds. D. Bourigault, Ch. Jacquemin, and M-C. L'Homme, Natural Language Processing, John Benjamins, 2001.
  • Tzoukermann, Evelyne and Dragomir R. Radev, “Use of Weighted Finite State Transducers in Part of Speech Tagging”, in Extended Finite State Models of Language, ed. Andras Kornai, Cambridge University Press, 1999.
  • Jacquemin, Christian and Evelyne Tzoukermann, “NLP for Term Variant Extraction: Synergy between Morphology, Lexicon and Syntax”, in Natural Language Information Retrieval, ed. Tomek Strzalkowski, Kluwer, Boston, MA, 1999.
  • Tzoukermann, Evelyne, Dragomir R. Radev, and William A. Gale, “Tagging French Without Lexical Probabilities”, in Natural Language Processing using Very Large Corpora, eds. S. Armstrong, S. Manzi, K. Church, P. Isabelle, E. Tzoukermann, and D. Yarowsky, Kluwer, 1999.
  • Sproat, Richard, Bernd Möbius, Kazuaki Maeda, and Evelyne Tzoukermann, “Multilingual Text Analysis”, in Multilingual Text-to-Speech Synthesis: The Bell Labs Approach, ed. Richard Sproat, Kluwer, Boston, MA, 1997.
  • Klavans, Judith and Evelyne Tzoukermann, “Combining Corpus and Machine-Readable Dictionary Data for Building Bilingual Lexicons”, in The Machine Translation Journal, Kluwer, 1996:10(3-4): 185-218.
  • Klavans, Judith and Evelyne Tzoukermann, “Morphology”, Encyclopedia of Artificial Intelligence, ed. S. Shapiro, John Wiley and Sons, New York, 1992.

PUBLICATIONS - Refereed Conference Proceedings

  • Tzoukermann, E., Anthony Davis, David Houghton, Philip Rennert, Robert Rubinoff, Tim Sibley, and Goldee Udani, “Knowledge Discovery via Content Indexing of Multimedia and Text”, 2005 International Conference on Intelligence Analysis, Mitre, McLean, Virginia, 2005.
  • Davis, A., Rennert, P., Rubinoff, R., Sibley, T., and Tzoukermann, E., “Retrieving what’s relevant in audio and video: statistics and linguistics in combination”, Proceedings of RIAO, Avignon, France, 2004.
  • Tzoukermann, Evelyne, Smaranda Muresan, and Judith L. Klavans, “GIST-IT: Combining Linguistic and Machine Learning Techniques for Email Summarization”, Human Language Technology and Knowledge Management (HLT/KM), ACL/EACL Conference, Toulouse, France, 2001.
  • Tzoukermann, Evelyne, Lucie Ménard, and Marise Ouellet, “Canadian French Text-To-Speech Synthesis: Modeling an Optimal Set of Realizations for Dialect Markers”, Eurospeech, Budapest, Hungary, 1999.
  • Jing, Hongyan and Evelyne Tzoukermann, “Information Retrieval Based on Context Distance and Morphology”, Twenty Second International Conference on Research and Development in Information Retrieval, Association for Computing Machinery, SIGIR'99, Berkeley, 1999.
  • Tzoukermann, Evelyne, Judith Klavans and Christian Jacquemin, “Effective Use of Natural Language Processing Techniques for Automatic Conflation of Multi-Word Terms: The Role of Derivational Morphology, Part of Speech Tagging, and Shallow Parsing”, Twentieth International Conference on Research and Development in Information Retrieval, Association for Computing Machinery, SIGIR'97, Philadelphia, 1997.
  • Tzoukermann, Evelyne and Judith Klavans, “Determining Concatenative Units for Speech Synthesis”, The Journal of the Acoustical Society of America, Boston, Massachusetts, 1994.
  • Klavans, Judith and Evelyne Tzoukermann, “The Use of Machine-Readable Dictionaries in Text to Speech”, Proceedings of the 15th International Conference on Computational Linguistics: COLING, Kyoto, Japan, 1994.
  • Tzoukermann, Evelyne, “Issues in Text-to-Speech for French”, Proceedings of the 15th International Conference on Computational Linguistics: COLING, Kyoto, Japan, 1994.
  • Tzoukermann, Evelyne and Judith Klavans, “Inducing Concatenative Units from Machine Readable Dictionaries and Corpora for Speech Synthesis”, Proceedings of the 3rd International Conference on Spoken Language Processing, Yokohama, Japan, 1994.
  • Tzoukermann, Evelyne, “Text-to-Speech for French”, Proceedings of the ESCA Workshop on Speech Synthesis, Mohonk, USA, 1994.
  • Tzoukermann, Evelyne, Roberto Pieraccini, and Zakhar Gorelov, “Natural Language Processing in the CHRONUS System”, Proceedings of the 2nd International Conference on Spoken Language Processing, Banff, Canada, 1992.
  • Pieraccini, Roberto, Evelyne Tzoukermann, Zakhar Gorelov, Jean-Luc Gauvain, Esther Levin, Chin-Hui Lee, and Jay G. Wilpon, “A Speech Understanding System Based on Statistical Representation of Semantics”, Proceedings of ICASSP, San Francisco, 1992.
  • Tzoukermann, Evelyne and Mark Liberman, “A Finite-State Morphological Processor For Spanish”, Proceedings of the 13th International Conference on Computational Linguistics: COLING, Helsinki, Finland, 1990.
  • Tzoukermann, Evelyne and Roy Byrd, “The application of a morphological analyzer to on-line French dictionaries”, Proceedings of the Third International Congress of the European Association for Lexicography, Budapest, Hungary, 1988.



PATENTS

  • Tzoukermann, Evelyne and Christian Jacquemin: Methods and Apparatus for Information Indexing and Retrieval as well as Query Expansion using Morpho-Syntactic Analysis. Patent #: 6101492, Filed: 8/8/2000.



EDITED VOLUMES

  • Book: “Natural Language Processing Using Very Large Corpora” for Kluwer, S. Armstrong, K. W. Church, P. Isabelle, S. Manzi, E. Tzoukermann, and D. Yarowsky, editors, 1999.
  • Journal: Traitement automatique des Langues (Hermès) - Special Issue on Text-to-Speech Synthesis Systems, Christophe d'Alessandro and Evelyne Tzoukermann, editors, Volume 42, No. 1, 2001.



EDUCATION

Fulbright Scholar, Brown University, Providence, R.I.

Computational linguistics plus full university fellowship.

Ph.D., Computer Science and Linguistics, Sorbonne-Paris III, CERTAL (Research and Development Center for Natural Language Processing), INALCO (Institut National des Langues et Civilisations Orientales). Summa cum laude. Dissertation: “Morphology and Automatic Generation of French Verbs: Implementation of a Dialogue Interface”.

M.A., Linguistics, University of Paris III, Sorbonne Nouvelle.

(Diplôme d'Etudes Approfondies) in Applied Linguistics.

Diplôme, University of Paris III, Sorbonne Nouvelle.

French as a Foreign Language (Diplôme de Didactique des Langues).

M.A., Russian Language and Literature, University of Paris IV, Sorbonne Nouvelle.




HONORS AND AWARDS

Fulbright fellowship from Brown University. Post-doctoral research with Professor Henry Kučera.

Research fellowship for doctoral studies (Doctorat du 3ème cycle) from the French Ministry of Research and Technology (competitive grant given to the top 1% of doctoral students).                                           

Grant toward MA studies (D.E.A.) from the French Ministry of National Education (competitive grant).