Weak Links in LinkedIn: Enhancing Fake Profile Detection in the Age of LLMs
Overview
About the project
Recent advances in large language models (LLMs) have enabled malicious actors to generate highly realistic fake profiles on professional platforms such as LinkedIn. These AI-generated profiles bypass both manual inspection and traditional machine learning detection tools.
Our research investigates the vulnerability of existing detection systems and proposes a countermeasure: LLM-assisted adversarial training. This approach significantly improves model robustness, even against sophisticated synthetic profiles generated by state-of-the-art LLMs like GPT-4.
Motivation
LinkedIn has over 1.15 billion users who rely on the platform for professional engagement and credibility.
Fake profiles are used for phishing, scams, misinformation, and recruitment fraud.
LLM-generated profiles are now nearly indistinguishable from real ones—even to trained humans and AI systems.
Key Findings
– Existing detectors catch manually created fakes (False Accept Rate: 6–7%)
– But fail on GPT-generated ones (False Accept Rate: 42–52%)
We propose GPT-assisted adversarial training, which reduces the False Accept Rate to 1–7% without increasing the False Reject Rate.
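The core idea is to expose the detector to LLM-generated fakes during training, alongside legitimate and manually created profiles. Below is a minimal sketch of that idea, assuming pre-computed feature matrices for the three profile groups; the function and variable names are illustrative placeholders, not the project's actual code.

```python
# Minimal sketch of GPT-assisted adversarial training via data augmentation.
# X_real, X_manual_fake, X_gpt_fake are assumed to be pre-computed feature matrices.
import numpy as np
from xgboost import XGBClassifier

def adversarially_augmented_training(X_real, X_manual_fake, X_gpt_fake):
    """Train a detector on legitimate, manually created, and GPT-generated fake profiles."""
    X = np.vstack([X_real, X_manual_fake, X_gpt_fake])
    y = np.concatenate([
        np.zeros(len(X_real)),          # 0 = legitimate profile
        np.ones(len(X_manual_fake)),    # 1 = fake (manually created)
        np.ones(len(X_gpt_fake)),       # 1 = fake (LLM-generated, adversarial examples)
    ])
    clf = XGBClassifier(n_estimators=300, max_depth=6, eval_metric="logloss")
    clf.fit(X, y)
    return clf
```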
Ablation results:
– Best: combined numerical features and textual embeddings
– Next: numerical features only
– Worst: textual embeddings only
Even GPT-4 Turbo and human reviewers struggle to identify these synthetic profiles, so robust automated detectors remain essential.
Highlights of This Project
Introduced 600 GPT-4-generated fake profiles for adversarial training and evaluation.
Built hybrid detection systems using both textual embeddings and numerical profile features.
Employed advanced classifiers, including XGBoost and CatBoost, with calibrated probability outputs (see the sketch after this list).
Conducted extensive benchmarking against GPT-4 and human evaluators.
Released a fully open-source dataset, codebase, and evaluation toolkit for reproducibility.
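The sketch below illustrates how such a hybrid, calibrated detector could be assembled. The embedding model, the numerical feature names, and the profile field names are assumptions made for illustration and are not taken from the project's codebase.

```python
# Illustrative sketch: hybrid textual + numerical features with calibrated probabilities.
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.calibration import CalibratedClassifierCV
from xgboost import XGBClassifier

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed text encoder

def hybrid_features(profiles):
    """Concatenate textual embeddings with numerical profile features."""
    texts = [p["summary"] for p in profiles]        # free-text section of each profile
    text_emb = encoder.encode(texts)                # (n, d) sentence embeddings
    numeric = np.array([
        [p["num_connections"],                      # example numerical features;
         p["num_skills"],                           # field names are hypothetical
         p["account_age_days"]]
        for p in profiles
    ])
    return np.hstack([text_emb, numeric])

def train_calibrated_detector(X_train, y_train):
    """Wrap XGBoost in isotonic calibration so scores behave like probabilities."""
    base = XGBClassifier(n_estimators=300, eval_metric="logloss")
    clf = CalibratedClassifierCV(base, method="isotonic", cv=5)
    clf.fit(X_train, y_train)
    return clf
```

Calibrating the classifier's scores makes probability thresholds, and hence the False Accept / False Reject trade-off, easier to compare across feature configurations such as those in the ablation above.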
Who Should Read This?
This research is relevant to:
Researchers in AI safety, security, and social computing
LinkedIn policy teams and social platform defenders
Engineers building LLM detection and adversarial robustness tools
Anyone interested in misinformation, digital trust, and human-AI interactions
Navigation Preview
Datasets → How we collected and constructed legitimate, manually created fake, GPT-3.5-generated, and GPT-4-generated profiles
Implementation → Embedding models, feature design, classifiers, and adversarial training setup
Discussion → Detailed results, ablation studies, and benchmarking against humans and GPT-4
Conclusion → What we learned and where this research is headed next
The presentation slides (PowerPoint) can be accessed using this link.