Homepage of Cheng-Zhi Zhang

Cross-language Information Processing & Retrieval

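        Information on the Internet is becoming increasingly multilingual, so cross-language information processing and retrieval is important for strengthening scientific, technological, economic, and cultural exchange among countries. Cross-language information processing and retrieval belongs to the research area of Multilingual Information Access, which mainly involves information science, artificial intelligence, and other related fields, as shown in Fig. 1. Our group focuses on Cross-language Information Processing of multilingual text (mainly Chinese and English), including cross-language keyword extraction, cross-language text classification, cross-language text clustering, and Cross-language Information Retrieval.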
 
 

Fig. 1. Fields related to Multilingual Information Access. [See: Douglas W. Oard, IRAL99]

     
         Links:
   
  • Cross-Language Information Retrieval Resources (By Doug Oard)
  • Cross-Lingual Text Classification (By Jie Tang)
  • LREC 2008 Workshop on Comparable Corpora
  • CLIA2007 workshop
  • CLIA2008 workshop
  • CLIP2007
  • CLIP2006
  • CLIP2005
  • MMIES-2
  • MMIES-1
  • RANLP-2007 Workshop on Acquisition and Management of Multilingual Lexicons
  • NIPS 2006 Workshop Machine Learning for Multilingual Information Access
  • EACL 2006 Workshop on Cross-language Knowledge Induction
  • Eurolan2005 (Workshop on Cross-Language Knowledge Induction)
  • ACL 2005 Workshop on Building and Using Parallel Texts
  • HLT/NAACL 2003 Workshop on Building and Using Parallel Texts
  • Special Topic Section on Multilingual Information Systems (JASIST, 2006, 57(5))
  • Multi-lingual Language Processing
  • Machine Translation Archive
  • Sentence Alignment and Word Alignment: Projects, Papers, Evaluation, etc.
  • Automatic Summarization of Multiple (Multilingual) Documents
  • Language Technology at the JRC 
  • JRC Workshop on EUROVOC
  • Top 10 Languages
  • Mining Multilingual Documents
  • CINDOR
  • 语料天涯
  • The International Corpus of English (ICE)
  • ACL SIGWAC
  • Multilingual Glossary of technical and popular medical terms in nine European Languages
  • MultiLingual Computing
  • Language Grid
  • EDR Electronic Dictionary
  • Multilingual Information Management: Current Levels and Future Abilities
  • Language Technology at JRC
  • NLP Group at UNED
  • Web Knowledge Discovery Lab at Sinica
  • JULIE Lab
  • KLE: Knowledge & Language Engineering
  • Natural Language Processing Lab @ Linkoping University
  • Wikipedia Laboratory
  • Ralf  Steinberger
  • Christopher C. Yang
  • Chih-Ping Wei
  • Pascale Fung
  • Chung-Hsing Yeh
  • HSIN-HSI CHEN
  • Eduard Hovy
  • Bruno Pouliquen
  • Jian-Yun Nie 
  • Woosung Kim
  • Nicola CANCEDDA
  • Saif Mohammad
  • Philip Resnik
  • Pierre Zweigenbaum
  • Carol Peters
  • Martin Volk
  • Kalervo Jarvelin
  • Ke Ping
  • Key-Sun Choi
  • Nigel Collier
  • Keita Tsuji
  • Viktor Pekar
  • Diana Inkpen
  • Gayo Diallo
  • Catherine Roussey
  • Paul Rayson
  • Willy Vandeweghe
  • Wolfgang Teubert
  • Tuomas Talvensaari
  • Dragos Stefan Munteanu
  • Fatiha SADAT
  • Jiangping Chen
  • Jason S. Chang
  • Bing Zhao
  • Akiko Aizawa
  • Andrea Mulloni
  • Yanjun Ma

             Our Projects:
    • National Natural Science Foundation (No. 70903032): Multilingual Documents Clustering Based on Comparable Corpus (2010-2012), PI
    • National Key Project of Scientific and Technical Supporting Programs funded by Ministry of Science & Technology of China: Information Service System of Scientific and Technical Documents: Key Techniques and Application Demonstration  (No. 2006BAH03B02, 2006BAH03B04) (2006-2009)
             Our Publications:
    • Zhang Chengzhi. Extracting Chinese-English Bilingual Core Terminology from Parallel Classified Corpora in Special Domain. In: Proceedings of Workshop on Natural Language Processing and Ontology Engineering (NLPOE 2009) in conjunction with Conference on Web Intelligence (WI/IAT-09). Milan, Italy, 2009: 271-274. [PPT]
    • Zhang Chengzhi, Wang Huilin. Survey on Multilingual Document Clustering. New Technology of Library and Information Service, 2009, (6): 31-36. (in Chinese with English abstract)
    • Wu Dan, He Daqing, Wang Huilin, Shi Chongde, Zhang Chengzhi. Does Query Length Matter? A Comparison of Query Expansion Methods in English-Chinese Cross-Language Information Retrieval. Journal of Computational Information Systems, 2008, 4(3): 1213-1222.