Submission Deadline: April 20, 2022 (Extended until June 20 - CLOSED)
The Chinese University of Hong Kong
INVITED SPEAKER
Harnessing the power of social media in linguistic analysis: A diachronic and sociolinguistic study of Philippine English(es) using the Twitter Corpus of Philippine Englishes
Invited Lecture
Research on contemporary Philippine English remains relatively scarce and inadequate in comparison to research on other varieties such as American English and Singapore English, partly because of the lack of large-scale, organized, publicly available data sets that allow comprehensive and in-depth investigations of the variety. Responding to this demand, I introduce the Twitter Corpus of Philippine Englishes (TCOPE) – a 135-million-word corpus created from roughly 27 million tweets sampled from 29 major cities in the Philippine archipelago. In the first part of the talk, I provide an overview: I discuss the considerations that went into TCOPE’s design, the compilation procedure, the format, and how interested individuals can access the corpus. Then, I illustrate the utility of the corpus by showcasing how it can be used to examine the linguistic features of Philippine English as well as the relationship between these features and socio-temporal factors (e.g., ethno-geographic region, time, age, sex), focusing on four documented Philippine English features: (1) the use of the irregular past tense morpheme -t, (2) double comparatives, (3) subjunctive were in subordinate counterfactual clauses, and (4) the phrasal verb base from. My initial explorations confirm patterns observed in previous research but go further to show the multifaceted and dynamic nature of Philippine English, providing empirical support for the theory that Philippine English is at the final stage of Schneider’s dynamic model. Because of its large size, sampling distribution, and availability in different corpus formats, TCOPE can be used to investigate features in ‘general’ contemporary Philippine English as well as different types of variation, particularly diachronic and ethno-geographic variation – a feat that might not be possible with other Philippine English corpora.
In combination with other existing corpora, TCOPE has the potential to broaden horizons in the diachronic and sociolinguistic study of Philippine English(es).
Bionote
Wilkinson Daniel Wong GONZALES is a linguist specializing in language variation and change, language contact, and language documentation in multilingual contexts. After receiving a Ph.D. in Linguistics and earning post-graduate certificates in Data Science and Cognitive Science at the University of Michigan in Ann Arbor, he moved to Hong Kong to join the Department of English at The Chinese University of Hong Kong, where he is an Assistant Professor of Applied English Linguistics. As a linguist, Wil is particularly interested in sociolinguistics in the Philippines and in wider East Asia. He employs corpus-based, experimental, ethnographic, and computational techniques on diverse datasets, including natural speech data and social media data. He works on Sino-Philippine languages (e.g., Lánnang-uè) and other East Asian linguistic varieties, such as Colloquial Singapore English or 'Singlish' and Philippine English(es). As a data scientist, he has collaborated with and led two cross-functional teams – the Alzheimer’s Disease Machine Learning team at the Institute for Healthcare Policy & Innovation at the University of Michigan and the Natural Language Processing team of Digital Alpha Technologies, headquartered in New York. He develops corpus-making and language analysis programs using machine learning and computational methods. A list of his publications and tools can be found on his website: www.wdwgonzales.com.