Jinrui Yang

Hi, this is Jinrui, welcome to my corner!

I am a final-year PhD candidate at the School of Computing and Information Systems, University of Melbourne, advised by Trevor Cohn and Tim Baldwin (MBZUAI).

My interests lie broadly across different domains, including applied artificial intelligence and computational social science. I have always dreamed of making the world a little bit better with technology and education. ❤

Before my PhD, I received my Master of Computer Science degree from the University of California, Santa Cruz, and worked as a research intern with Dan Roth at the University of Pennsylvania in 2020/21 and with Nanyun (Violet) Peng at the University of Southern California in 2019. Prior to my academic pursuits overseas, I graduated with an honors Bachelor's degree from Sichuan University🐼.

[My CV]

News 🚀

[2025-10-19] Excited to share that our FEIT Diversity & Inclusion grant is approved!

We’ll host a 3-day AU-US workshop empowering women in tech across academia, industry, and entrepreneurship. The first event has now taken place! Check 👉 here

[2025-08-25] I will give an oral talk to present our work at ICNLSP 2025, see you in Denmark!

[2024-11-01] I will present our work at EMNLP 2024, see you in Miami!

[2024-08-05] Thrilled to receive the 2024 FEIT Visiting Fellow Award and funding. Thanks to FEIT for the solid support for my visit to the US!

[2024-04-26] Thrilled to receive the 2024 Robert Bage Memorial Scholarship! Thank you to Mrs. Edward Bage for her generous support, and remembering Captain Edward Frederick Robert Bage with sincerity🕯.

[2024-01-09] I will start working at UC Davis as a visiting research scholar from summer '24. Hello again, CA!

[2023-10-20] I will present our work at EMNLP 2023, see you in Singapore!

[2022-11-16] I will give an oral talk to present our work at EMNLP 2022, see you in Abu Dhabi!

EDUCATION

The University of Melbourne

Ph.D. in Engineering and IT Feb. 2022 - Present

University of California, Santa Cruz (UCSC)

M.S. in Computer Science Sept. 2018 - Mar. 2020

Sichuan University

B.S. with Honors in Electrical Engineering Sept. 2011 - Jun. 2015

EXPERIENCE

University of California, Davis, LUKA NLP Lab

Visiting Researcher Sept. 2024 - Present | California, United States

◦ Worked on a Multimodal Large Language Model project

◦ Applied coherent Chain-of-Thought techniques to the multi-image VQA task to improve performance and robustness

University of Melbourne, NLP Lab & Social and Political Sciences Group

Research Assistant; Advisors: Prof. Leah Ruppanner and Dr. Lea Frermann Apr. 2022 - Oct. 2022 | Melbourne, Australia

◦ Worked on the Gender Bias in CVs project

◦ Applied machine learning and NLP techniques to social science research

◦ One paper accepted at EMNLP CSS 2022

University of Pennsylvania, NLP Cognitive Compute Lab

Research Intern; Advisor: Prof. Dan Roth May 2020 - May 2021 | Philadelphia, United States

◦ Worked on a zero-shot text classification project

◦ Independently developed the demo, built on machine learning models

◦ Submitted two conference papers to ACL 2021 and EMNLP 2021

◦ One paper accepted at the NAACL 2022 demo track

University of Southern California, NLP PLUS lab

Research Intern; Advisor: Prof. Nanyun (Violet) Peng June 2019 - Oct. 2019 | Los Angeles, United States

◦ Worked on creative text generation and story generation research

◦ Worked on one conference paper, submitted to ACL 2020

UCSC Natural Language and Dialogue Systems Lab

Graduate Student Researcher; Advisors: Prof. Marilyn Walker and Prof. Jeffrey Flanigan Jan. 2019 - Mar. 2020 | Santa Cruz, United States

◦ Worked on Neural Question Generation research

◦ Worked on two conference papers (submitted to EMNLP 2019) and one Master's thesis

UCSC Natural Language Understanding Lab

Graduate Student Researcher; Advisor: Prof. Snigdha Chaturvedi Jan. 2019 - Jul. 2019 | Santa Cruz, United States

◦ Collaborated with Ph.D. students on Story Generation and Opinion Summarization research

◦ Participated in Ph.D. group meetings and paper clinics

Le Wagon (global coding bootcamp)

Teaching Assistant Apr. 2018 - Jun. 2018 | Chengdu, China

◦ Completed 360 hours of full-stack coursework and wrote over 10,000 lines of code in 12 weeks

◦ Developed an application, FleaMarket, using Ruby on Rails and JavaScript with 3 teammates in 10 days

SELECTED PROJECTS

Benchmarking LLMs' Gender Bias and Political Leaning in European Parliament

We introduce EuroParlVote, a novel benchmark for evaluating large language models (LLMs) in politically sensitive contexts. It links European Parliament debate speeches to roll-call vote outcomes and includes rich demographic metadata for each Member. We evaluate state-of-the-art LLMs on two tasks—gender classification and vote prediction—revealing consistent patterns of gender bias and political leaning.

Paper; Code; Data; Demo

Language Bias in Multilingual Information Retrieval: The Nature of the Beast and Mitigation Methods

This paper is based on the assumption that queries in different languages, but with identical semantics, should yield equivalent ranking lists when retrieving from the same multilingual documents. We evaluate the degree of fairness using both traditional retrieval methods and a DPR neural ranker based on mBERT and XLM-R. Additionally, we introduce ‘LaKDA’, a novel loss designed to mitigate language biases in neural MLIR approaches.

Paper; Code; Data

Multi-EuP: The Multilingual European Parliament Dataset for Analysis of Bias in IR

We present Multi-EuP, a new multilingual benchmark dataset comprising 22K multilingual documents collected from the European Parliament, spanning 24 languages. This dataset is designed to investigate fairness in multilingual information retrieval (IR), supporting analysis of both language and demographic bias in a ranking context. We also conduct a preliminary experiment on language bias caused by the choice of tokenization strategy.

Paper; Code; Data

Professional Presentation and Projected Power: A Case Study of Implicit Gender Info in CVs

We introduce a data set of 1.8K authentic, English-language CVs from the US, covering 16 occupations, allowing us to partially control for the confound of occupation-specific gender base rates. We find that (1) women use more verbs evoking impressions of low power; and (2) classifiers capture gender signal even after data balancing and removal of pronouns and named entities, and this holds for both transformer-based and linear classifiers.

Paper; Data Available Upon Request

Towards Open-Domain Topic Classification

We introduce an open-domain topic classification system that accepts a user-defined taxonomy in real time. Users can classify a text snippet with respect to any candidate labels they want, and get an instant response from our web interface. To obtain such flexibility, we build the backend model in a zero-shot way.

Paper; Code; Demo; Video

The Opposite QA task: Natural Language Generation from QA-SRL Annotations

QA-SRL is a Question-Answer driven Semantic Role Labeling annotation schema, where question-answer pairs are used to represent predicate-argument structure. We present an opposite QA task. For example, given a QA-SRL annotation that includes two QA pairs about the predicate flew, (1) Q: Who flew somewhere? A: Trump and (2) Q: Where did someone fly? A: Russia, our system could generate a sentence like Trump flew to Russia.

Paper; Code

SELECTED PUBLICATIONS (* denotes equal contribution)

Jinrui Yang, Xudong Han and Timothy Baldwin. Demographics and Democracy: Benchmarking LLMs' Gender Bias and Political Leaning in European Parliament. (ICNLSP 2025) Try here for demo!
Jinrui Yang, Fan Jiang and Timothy Baldwin. Language Bias in Multilingual Information Retrieval: The Nature of the Beast and Mitigation Methods. (EMNLP MRL 2024)
Jinrui Yang, Timothy Baldwin and Trevor Cohn. Multi-EuP: The Multilingual European Parliament Dataset for Analysis of Bias in Information Retrieval. (EMNLP MRL 2023)
Jinrui Yang*, Sheilla Njoto*, Marc Cheong, Leah Ruppanner and Lea Frermann. Professional Presentation and Projected Power: A Case Study of Implicit Gender Information in English CVs. (EMNLP CSS 2022)
Hantian Ding, Jinrui Yang, Yuqian Deng, Hongming Zhang, and Dan Roth. Towards Open-Domain Topic Classification. (NAACL Demo 2022) Try here for demo!

SELECTED AWARDS

Received 3 outstanding internship awards at Tesla, Uber and SIEMENS during undergraduate studies
Obtained 2 Chinese national patents during undergraduate studies
Graduated with honors from Sichuan University
Won 1st prize at Angel Hackathon 2018, Shanghai, China
Won 1st prize at Unleash Hackathon 2018, Chengdu, China

Me in Life

Outside of my academic and professional commitments, I am a big fan of snowboarding and love carving through fresh powder on the slopes. I also enjoy spending time with my beloved cat, who always accompanies me on my adventures.

Winter Life

(left) 2022 Chongli, Beijing - (right) 2023 Niseko, Japan

Pet Life

Kingbo (进宝) has traveled to many places; a brave and lazy boy!

Volunteer Life

For 8 years, I have volunteered as a teacher at One-School, an NGO that aims to provide equal opportunities for students from low-income rural families.

We have supported 10,858 students and raised approximately $4M in sponsorship. Click the video to learn more 👉

I am very privileged to work with these students, and firmly believe that education is the best form of charity and that everyone deserves it. That is one of the reasons why I plan to work in academia.

(photos copyrighted to One-School)