Research Projects

Construction of Gender in Primary Literacy Resources: A Study of Readers for Hong Kong Early Learners (General Research Fund, Research Grants Council, 1/2018 - 12/2019, Co-Investigator)
  • The present study arises from the belief that early school readers play a major role in children’s socialization and gender development. Given that schooling is an important medium for reproducing cultures and perpetuating existing mechanisms of domination, the study aims to examine how males and females are represented in early readers.
  • A total of 180 readers for Key Stage 1 (Primary 1-3) will be selected for the analysis. The texts will be examined at the macro and micro levels to investigate how the experiential, relational and expressive values associated with the two genders are manifested.
  • The value of the present study is that it will help heighten public awareness of the power of book authors to position the two genders through various means, and advance our knowledge of language and gender in Asian regions, a field that has not yet been adequately explored.

Research Capacity Building in Linguistics and Language Studies (Central Reserve Fund, Education University of Hong Kong, 1/2017 - 12/2019, Co-Investigator)

  • Research initiatives are developed for capacity building in linguistics and language studies. Two strategic areas, (1) multilingual learning in Hong Kong and (2) corpus-based research in education and language use, have been identified for strengthening the groundwork and for developing regional and international research networks.

A Digital Language Museum on Cantonese (Knowledge Transfer Fund, Faculty of Humanities, 1/2015 - 6/2015)

  • This KT project focuses on “Hong Kong cultural heritage”. It plans to convert some of the findings of the PI’s current ECS project on the development of Cantonese into a “digital language museum” that highlights some interesting but little-noticed linguistic changes in Cantonese over the past 60 years.
  • The museum will take the form of a website, and the language items to be displayed include vocabulary and grammatical structures. Some of these features will be displayed together with selected video segments from Cantonese movies. The language museum is an initiative to document and preserve the Cantonese language and to introduce the public to the relevant socio-cultural issues of old Hong Kong.
  • This proposed KT project is also a timely response to the First Inventory of Intangible Cultural Heritage of Hong Kong released by the Hong Kong SAR Government in mid-June 2014.

Linguistic Analysis of Mid-20th Century Hong Kong Cantonese by Constructing an Annotated Spoken Corpus (Early Career Scheme, Research Grants Council, 12/2013 - 11/2015)
  • This proposed project takes a real-time, corpus-based approach to studying the development of Cantonese over the past six decades. An annotated corpus of mid-20th century Hong Kong Cantonese is constructed with spoken data drawn from the dialogs of 50 Cantonese movies (generally known as 粵語長片) produced in Hong Kong between the 1950s and the 1970s.
  • Compared with the 19th century Cantonese materials, which are mostly written works such as language manuals, dictionaries or translations of the Bible, the proposed corpus draws on authentic and natural spoken data that can largely reflect and represent the language actually used in the society of the period concerned. The real-time nature of the corpus data could also allow us to uncover linguistic usages or features unnoticed in previous Cantonese studies, especially those no longer in use in contemporary Cantonese.
  • The dialogs of the selected movies are transcribed in Chinese characters, then annotated with parts of speech, Cantonese pronunciations, and non-linguistic attributes such as the genders and names of speakers. It is estimated that the corpus will consist of about half a million Chinese characters, and the processed and indexed data will be searchable through a user-friendly online search engine.

Development of a Multilingual and Multi-modal English-Mandarin-Cantonese-Japanese Parallel Corpus and an Online Parallel Concordancing Platform for Comparative Linguistic Studies (Departmental team research grant, 4/2013 - 4/2014)
  • In this project, we plan to develop research agendas in a number of directions: compilation of a multilingual and multimodal English-Mandarin-Cantonese-Japanese parallel corpus for research and pedagogical purposes; development of a parallel concordancing program that lets users search the multilingual parallel corpus and automatically produces parallel concordance examples and other useful corpus statistics; development of an online platform that hosts the multilingual parallel corpus and the parallel concordancing program, allowing researchers, language teachers and learners to fully explore the corpus data; comparative linguistic studies based on the multilingual parallel corpus data (Mandarin-English, Mandarin-Cantonese, English-Cantonese, English-Japanese, Japanese-Mandarin, and Japanese-Cantonese); and translation studies based on the multilingual parallel corpus data.

A preliminary linguistic analysis of mid-20th Century Cantonese from a corpus-based approach (HKIEd Internal Research Grant, 2/2013 - 2/2014)
  • This project aims to undertake a preliminary study of mid-20th century Hong Kong Cantonese by means of corpus linguistic analysis. The corpus data is based on the transcribed dialogues of 20 Hong Kong Cantonese movies (generally known as 粵語長片), compiled in a previous project supported by an HKIEd internal research grant in 2011-2012. This project will focus on lexical items and usages that differ from those of modern Cantonese.

Promotion of Corpus Linguistics Research at the Department of Linguistics & Modern Language Studies (LML) (KT Research Fund, Faculty of Humanities, 3/2013-6/2013)
  • This proposed KT project has three major objectives: (1) to introduce and showcase the various linguistic corpora and corpus linguistic tools developed at LML, and their potential value in text-based humanities research; (2) to introduce the target participants to the methodology and challenges of corpus-based research, especially on the Chinese language; (3) to enhance the professional development of the target participants by raising their sensitivity to and awareness of authentic and natural language materials in humanities research.

A Typological and Sociolinguistic Study of the Gelong Language Spoken in Western Hainan (General Research Fund, Research Grants Council, 1/10/2011 - 31/3/2014)
  • Hainan Province (海南省) has long been a place where speakers of different language varieties have interacted. This project proposes to investigate the unusual Gelong (村語/哥龍話) language now spoken in Dongfang city (東方市), western Hainan, from typological and sociolinguistic perspectives. The proposed project will (1) document Gelong in order to give an account of the inter-relationships among Li, Chinese and other language varieties spoken in Dongfang city; (2) conduct a sociolinguistic survey to map the broader sociolinguistic landscape, covering linguistic demography and, more specifically, language shift/drift and language loss in the Gelong speech community; (3) undertake detailed fieldwork (with speakers of different generations) on some salient syntactic structures to study ongoing typological change in Gelong.

Spoken Corpus Construction and Linguistic Analysis of Mid-20th Century Cantonese (HKIEd Internal Research Grant, 1/2011-1/2012)
  • This project proposes to construct a Cantonese spoken corpus consisting of transcribed and annotated data from selected Cantonese movies (generally known as 粵語長片) produced between 1950 and 1970. The corpus will supply additional spoken data for close examination of linguistic features and developments, as well as new and ongoing linguistic changes, in mid-20th century Cantonese. It will provide a different perspective for studying Cantonese, its changes, and the factors contributing to these changes over the past 50 years.

A Quantitative and Qualitative Comparison of Word Formation in Modern Standard Chinese & Early Modern Chinese (CERG, Research Grants Council, 1/1/2009-31/12/2012)
  • This project proposes a rigorous and innovative comparison of Modern Standard Chinese (MSC) and Early Modern Chinese (EMC), particularly with respect to word formation and lexical development, drawing on the qualitative analysis practised by linguists and on quantitative measures established in information sciences. While MSC evolved from EMC and Classical Chinese, and all three share a largely common set of several thousand Chinese characters, it is usual for someone educated in MSC to have only limited comprehension of EMC and Classical Chinese. We thus ask how such a largely common base could give rise to a vast linguistic difference over time, and how lexical, syntactic, and semantic evolution has changed the way Chinese characters carry information content. To this end, besides describing such changes qualitatively, it would also be useful to characterize them by objective quantitative indices, which would allow comparison of the language at one point in time or across several points in history. We thus aim to compare EMC and MSC primarily from the lexical perspective, with respect to issues and differences in word formation and related grammatical changes in the linguistic properties of lexical items.

A Comparative Study of 10 Peripheral Yue Dialects: Contribution to Chinese Linguistics (CERG, Research Grants Council, 1/2008-6/2010)
  • Recent research on Cantonese, a major dialect group of China, has drawn attention to some uncommon and salient features of linguistic variation compared with other Chinese dialects. These features also raise questions about the history and development of Cantonese and hence of the Chinese language. The present project has identified 10 little-documented Yue dialects peripheral to the Pearl River Delta hubs, and proposes to examine at least 6 such uncommon linguistic features to achieve a better understanding of: (1) a fuller range of Yue dialects found in the north and west of the Pearl River Delta and in Guangxi, as well as newly discovered Yue dialects in Hainan; (2) the linguistic history of Cantonese in the light of possibly competing contributions from internal change and from its extensive contact with non-Sinitic languages, and the contribution this could make to understanding the historical development of the Chinese language; and (3) the general nature of linguistic development, which may be influenced by external factors such as language contact and by internal factors such as contiguous development within the language, and the interaction of such factors.