A.I. Authorship Analysis

The A3 (A.I. Authorship Analysis) project @ PIKE of Penn State University, USA, investigates authorship-related issues in the generation, detection, perturbation, and obfuscation of AI-generated language, such as LLM-generated texts. In particular, we aim to answer research questions such as the following:

What characteristics, if any, distinguish LLM-generated texts from human-written texts?
How can we build efficient and effective Turing Testers to differentiate LLM-generated texts from human-written texts?
Is it possible to embed effective, robust, and undetectable watermarks in LLM-generated texts?
How can texts be obfuscated to disguise their true authorship?
What are innovative applications and scenarios where a true understanding of AI authorship can benefit users?

Team

Current

Mahjabin Naher, PhD student @ Penn State
Nafis Tripto, PhD student @ Penn State
Saranya Venkatraman, PhD student @ Penn State
Dongwon Lee, Faculty @ Penn State

Alumni

Jooyoung Lee, PhD student @ Penn State --> Applied Scientist @ Amazon
Adaku Uchendu, PhD student @ Penn State --> Technical Staff @ MIT Lincoln Lab
Thai Le, PhD student @ Penn State --> Assistant Professor @ Indiana Univ.
Eric Xing, undergraduate summer intern @ Western Kentucky Univ. --> PhD student @ WashU
Ziyao Wang, undergraduate summer intern @ Wuhan Univ. --> PhD student @ UMD
Jialin Shao, undergraduate summer intern @ Beijing Univ. of Technology --> MS student @ UIUC

Publications

2024

Authorship Obfuscation in Multilingual Machine-Generated Text Detection

Dominik Macko, Robert Moro, Adaku Uchendu, Ivan Srba, Jason Lucas, Michiharu Yamashita, Nafis Irtiza Tripto, Dongwon Lee, Jakub Simko, Maria Bielikova

Findings of Conf. on Empirical Methods in Natural Language Processing (EMNLP-Findings), Miami, FL, November 2024

Fakes of Varying Shades: How Warning Affects Human Perception and Engagement Regarding LLM Hallucinations

Mahjabin Naher, Haeseung Seo, Eun-Ju Lee, Aiping Xiong, Dongwon Lee

Conf. on Language Modeling (COLM), Philadelphia, PA, October 2024

TopFormer: Topology-Aware Authorship Attribution of Deepfake Texts with Diverse Writing Styles

Adaku Uchendu, Thai Le, Dongwon Lee

European Conf. on Artificial Intelligence (ECAI), Santiago de Compostela, Spain, October 2024

A Ship of Theseus: Curious Cases of Paraphrasing in LLM-Generated Texts

Nafis Irtiza Tripto, Saranya Venkatraman, Dominik Macko, Robert Moro, Ivan Srba, Adaku Uchendu, Thai Le, Dongwon Lee

62nd Annual Meeting of the Asso. for Comp. Linguistics (ACL), Bangkok, Thailand, August 2024

Catch Me If You GPT: Tutorial on Deepfake Texts

Adaku Uchendu, Saranya Venkatraman, Thai Le, Dongwon Lee

Annual Conf. of the North American Chapter of the Asso. for Comp. Linguistics (NAACL), Mexico City, Mexico, June 2024 (Tutorial)

GPT-who: An Information Density-based Machine-Generated Text Detector

Saranya Venkatraman, Adaku Uchendu, Dongwon Lee

Annual Conf. of the North American Chapter of the Asso. for Comp. Linguistics (NAACL-Findings), Mexico City, Mexico, June 2024

ALISON: Fast and Effective Stylometric Authorship Obfuscation

Eric Xing, Saranya Venkatraman, Thai Le, Dongwon Lee

38th AAAI Conf. on Artificial Intelligence (AAAI), Vancouver, Canada, February 2024

2023

MULTITuDE: Large-Scale Multilingual Machine-Generated Text Detection Benchmark

Dominik Macko, Robert Moro, Adaku Uchendu, Jason Lucas, Michiharu Yamashita, Matus Pikuliak, Ivan Srba, Thai Le, Dongwon Lee, Jakub Simko, Maria Bielikova

Conf. on Empirical Methods in Natural Language Processing (EMNLP), Singapore, December 2023

UPTON: Preventing Authorship Leakage from Public Text Release via Data Poisoning

Ziyao Wang, Thai Le, Dongwon Lee

Findings of Conf. on Empirical Methods in Natural Language Processing (EMNLP-Findings), Singapore, December 2023

HANSEN: Human and AI Spoken Text Benchmark for Authorship Analysis

Nafis Irtiza Tripto, Adaku Uchendu, Thai Le, Mattia Setzu, Fosca Giannotti, Dongwon Lee

Findings of Conf. on Empirical Methods in Natural Language Processing (EMNLP-Findings), Singapore, December 2023

Does Human Collaboration Enhance the Accuracy of Identifying LLM-Generated Deepfake Texts?

Adaku Uchendu, Jooyoung Lee, Hua Shen, Thai Le, Ting-Hao Huang, Dongwon Lee

11th AAAI Conf. on Human Computation and Crowdsourcing (HCOMP), Delft, Netherlands, November 2023

Do Language Models Plagiarize?

Jooyoung Lee, Thai Le, Jinghui Chen, Dongwon Lee

The ACM Web Conference (WWW), Austin, TX, April 2023

Catch Me If You GAN: Generation, Detection, and Obfuscation of Deepfake Texts

Adaku Uchendu, Thai Le, Dongwon Lee

The ACM Web Conference (WWW), Austin, TX, April 2023 (Tutorial)

Attribution and Obfuscation of Neural Text Authorship: A Data Mining Perspective

Adaku Uchendu, Thai Le, Dongwon Lee

SIGKDD Explorations, Vol. 25, No. 1, pp. 1-18, June 2023

2022

SHIELD: Defending Textual Neural Networks against Multiple Black-Box Adversarial Attacks with Stochastic Multi-Expert Patcher

Thai Le, Noseong Park, Dongwon Lee

60th Annual Meeting of the Asso. for Comp. Linguistics (ACL), Dublin, Ireland, May 2022

Perturbations in the Wild: Leveraging Human-Written Text Perturbations for Realistic Adversarial Attack and Defense

Thai Le, Jooyoung Lee, Kevin Yen, Yifan Hu, Dongwon Lee

Findings of 60th Annual Meeting of the Asso. for Comp. Linguistics (ACL-Findings), Dublin, Ireland, May 2022

2021

TuringBench: A Benchmark Environment for Turing Test in the Age of Neural Text Generation

Adaku Uchendu, Zeyu Ma, Thai Le, Rui Zhang, Dongwon Lee

Findings of Conf. on Empirical Methods in Natural Language Processing (EMNLP-Findings), November 2021

2020

Authorship Attribution for Neural Text Generation

Adaku Uchendu, Thai Le, Kai Shu, Dongwon Lee

Conf. on Empirical Methods in Natural Language Processing (EMNLP), November 2020

2019

Characterizing Man-made vs. Machine-made Chatbot Dialogs

Adaku Uchendu, Jeffrey Cao, Qiaozhi Wang, Bo Luo, Dongwon Lee

Int'l Conf. on Truth and Trust Online (TTO), London, UK, October 2019

A Reverse Turing Test for Detecting Machine-Made Texts

Jialin Shao, Adaku Uchendu, Dongwon Lee

11th Int'l ACM Web Science Conf. (WebSci), Boston, MA, July 2019
