iNEWS'07 - "Improving Non English Web Searching"

SIGIR 2007 Workshop

Fotis Lazarinis - Technological Educational Institute Mesolonghi, Greece
Jesús Vilares Ferro - Universidade da Coruña, Spain
John Tait - Information Retrieval Facility, Austria


Over 60% of the online population are non-English speakers, and it is probable that the number of non-English speakers is growing faster than the number of English speakers. Recent studies have shown that non-English queries and unclassifiable queries have nearly tripled since 1997. Most search engines were originally engineered for English. They do not take full account of inflectional semantics or, for example, of diacritics or the use of capitals.
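To make the point about diacritics and capitals concrete, the following is a minimal sketch of the kind of normalization a search engine could apply before comparing a query with indexed text. It is an illustration only and does not describe how any particular search engine works; the function name fold_for_matching is hypothetical, and the approach (Unicode decomposition, removal of combining marks, case folding) is one common option among several.

```python
import unicodedata


def fold_for_matching(text: str) -> str:
    """Return a case- and diacritic-insensitive key for `text`.

    Hypothetical helper for illustration; not taken from any real engine.
    """
    # Decompose accented characters into base letter + combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks (Unicode category Mn), keeping base letters.
    stripped = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    # casefold() is a more aggressive lower(); it also maps the Greek
    # final sigma to the regular sigma, so word-final forms compare equal.
    return unicodedata.normalize("NFC", stripped.casefold())


if __name__ == "__main__":
    # A query typed in capitals without accents and the accented,
    # mixed-case form found in a document reduce to the same key.
    print(fold_for_matching("ΜΕΣΟΛΟΓΓΙ") == fold_for_matching("Μεσολόγγι"))
    print(fold_for_matching("Coruña"))
```

Without a step of this kind, an exact string comparison treats the two Greek spellings above as different terms. Folding is also lossy: in some languages a diacritic distinguishes two different words, so removing it can reduce precision while raising recall.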

The main conclusion from the literature is that searching with non-English and non-Latin-based queries is less successful and requires additional user effort to achieve acceptable recall and precision. Furthermore, international search engines (such as Yahoo! and Google) are relatively weak at handling monolingual non-English queries.

New tools and resources are needed to support researchers in non-English retrieval. New methodologies are needed to help identify problems in existing search engines. New teaching strategies should be developed to help users formulate their queries more efficiently.