My research focuses mainly on finance, in areas such as Behavioral Finance, Asset Pricing, and Corporate Finance. Aside from finance, I have a special interest in machine learning, particularly Natural Language Processing for textual analysis.
In particular, my research first examines how social media sentiment is reflected in security prices. Second, drawing on behavioral finance, I study how users' characteristics affect the ability of their sentiment to predict future stock returns. Third, I examine how firm opacity affects users' ability to read firm fundamentals.
Another line of research relates to asset pricing, where I provide a possible explanation for the flatness of the CAPM Security Market Line. This flatness is in fact time-varying, driven by some macro factor other than analyst disagreement and short-selling constraints.
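For readers less familiar with the term, the textbook relation behind the Security Market Line can be written as below. The notation is an assumption introduced here for illustration and is not taken from the paper: R_i is the return on asset i, R_f the risk-free rate, R_m the market return, and beta_i the asset's market beta. The second line is the usual cross-sectional regression form, where the CAPM predicts gamma_0 = 0 and gamma_1 equal to the market risk premium; a "flat" line refers to an estimated gamma_1 that is smaller than that premium.

```latex
\mathbb{E}[R_i] - R_f = \beta_i \left( \mathbb{E}[R_m] - R_f \right)

\bar{R}_i - R_f = \gamma_0 + \gamma_1 \beta_i + \varepsilon_i,
\qquad \text{flat SML: } \hat{\gamma}_1 < \mathbb{E}[R_m] - R_f
```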
In machine learning, various methods have been proposed for analyzing textual documents. The growing field of Natural Language Processing interests me the most because it does not rely on a predefined corpus but uses a statistical model to "learn" words and sentence composition. Because statistical machine learning models do not depend on pre-defined rules (such as grammar or sentence composition) to understand words, they are essentially black boxes.
To better understand how machines interpret language, I recently helped set up a benchmark for exactly that purpose, called TempoWiC, with Cardiff NLP. The main idea is that a given word can carry different meanings in different time periods, and that its meaning also depends on the context. For example, the meaning of "virus" in 2020 is very different from its meaning in, say, 2015. The benchmark essentially asks an NLP model whether the two meanings are the same or not.
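The shape of the task can be sketched with a toy example. The code below is only an illustration under stated assumptions: it is not the TempoWiC baseline or any model from the paper, and the function names, the window size, and the threshold are all hypothetical. It compares the words surrounding a target word in two texts and guesses "same meaning" when the contexts overlap enough; a real system would more plausibly compare contextual embeddings from a language model.

```python
import re


def context_words(text, target, window=5):
    """Collect lowercase words within `window` tokens of each `target` occurrence."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    context = set()
    for i, tok in enumerate(tokens):
        if tok == target:
            lo, hi = max(0, i - window), i + window + 1
            context.update(tokens[lo:i] + tokens[i + 1:hi])
    return context


def same_meaning(text_a, text_b, target, threshold=0.2):
    """Toy decision rule (an assumption): Jaccard overlap of the two contexts."""
    a = context_words(text_a, target)
    b = context_words(text_b, target)
    if not a or not b:
        return False  # target missing from one text, so no evidence of a shared sense
    jaccard = len(a & b) / len(a | b)
    return jaccard >= threshold


if __name__ == "__main__":
    # Invented example sentences, not taken from the benchmark data.
    older = "my laptop caught a virus after I opened that email attachment"
    newer = "schools closed again because the virus is spreading in the city"
    print(same_meaning(older, newer, "virus"))
```

The output of `same_meaning` is a single yes/no answer per pair of texts, which mirrors the binary question the benchmark poses. Word overlap is a deliberately crude stand-in here; it is sensitive to common function words and says nothing about how well actual models perform.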
"TempoWiC: An Evaluation Benchmark for Detecting Meaning Shift in Social Media"
with Daniel Loureiro, Aminette D'Souza, Areej Nasser Muhajab, Isabella A. White, Luis Espinosa Anke, Leonardo Neves, Francesco Barbieri and Jose Camacho-Collados
[arXiv]
Abstract
Language evolves over time, and word meaning changes accordingly. This is especially true in social media, since its dynamic nature leads to faster semantic shifts, making it challenging for NLP models to deal with new content and trends. However, the number of datasets and models that specifically address the dynamic nature of these social platforms is scarce. To bridge this gap, we present TempoWiC, a new benchmark especially aimed at accelerating research in social media-based meaning shift. Our results show that TempoWiC is a challenging benchmark, even for recently-released language models specialized in social media.
Conference: EMNLP 2022 EvoNLP Shared Task: Temporal Meaning Shift
Job Market Paper
"Social-Media Sentiment, Limited attention, and Stock return" with Woon Sau Leung and Woon Wong
[SSRN]
Abstract
Using tweets from StockTwits and machine-learning techniques in classifying them, we find that social-media sentiment predicts positively and significantly future stock returns, and, importantly, such positive predictability decreases when the number of stocks users tweeted about increases. Such return predictability likely stems from users’ ability in forecasting future earnings. Additional tests reveal that the reduced predictability due to stock coverage is significant only for firms that are complex, opaque, and thus hard-to-analyze. Together, our evidence suggests that stock analysis by users following many stocks is inferior due to attention and time constraints.
Workshops and Conferences: Cardiff Economics PhD Workshop 2019, Welsh Postgraduate Research Conference 2019, Cardiff-Newcastle-Xiamen Conference 2019
Media Coverage:
"The Power of Social Media in Predicting Stock Returns", Institutional Investor, January 2, 2020.
"Social Media Sentiment, Limited Attention & Stock Returns", personal blog of Ted Merz (Head of Bloomberg global news product), January 21, 2020.
Other papers
"Corporate Culture and Firm Value: Evidence from Crisis" with Yiwei Fang, Franco Fiordelisi, Iftekhar Hasan, and Woon Sau Leung [SSRN]
"Female Equity Analysts and Corporate Environmental and Social Performance" with Kai Li, Feng Mai, Chelsea Yang, and Tengfei Zhang [SSRN]
"Gender, Competition, and Performance: International Evidence" with Kai Li, Qiyuan Peng, and Rui Shuen [SSRN]