Weak Links in LinkedIn: Enhancing Fake Profile Detection in the Age of LLMs
Overview
About the project
Recent advances in large language models (LLMs) have enabled malicious actors to generate highly realistic fake profiles on professional platforms such as LinkedIn. These AI-generated profiles bypass both manual inspection and traditional machine learning detection tools.
Our research investigates the vulnerability of existing detection systems and proposes a countermeasure: LLM-assisted adversarial training. This approach significantly improves model robustness, even against sophisticated synthetic profiles generated by state-of-the-art LLMs like GPT-4.
Motivation
LinkedIn has over 1.15 billion users who rely on the platform for professional engagement and credibility.
Fake profiles are used for phishing, scams, misinformation, and recruitment fraud.
LLM-generated profiles are now nearly indistinguishable from real ones—even to trained humans and AI systems.
Key Findings
– Existing detectors catch manually created fakes (False Accept Rate: 6–7%)
– But fail on GPT-generated ones (False Accept Rate: 42–52%)
We propose GPT-assisted adversarial training, which reduces the False Accept Rate to 1–7% without increasing the False Reject Rate.
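The core idea is to expose the detector to LLM-generated fakes during training, alongside legitimate and manually created profiles. Below is a minimal sketch of that idea, assuming pre-computed feature matrices for the three profile groups; the function and variable names are illustrative placeholders, not the project's actual code.

```python
# Minimal sketch of GPT-assisted adversarial training via data augmentation.
# X_real, X_manual_fake, X_gpt_fake are assumed to be pre-computed feature matrices.
import numpy as np
from xgboost import XGBClassifier

def adversarially_augmented_training(X_real, X_manual_fake, X_gpt_fake):
    """Train a detector on legitimate, manually created, and GPT-generated fake profiles."""
    X = np.vstack([X_real, X_manual_fake, X_gpt_fake])
    y = np.concatenate([
        np.zeros(len(X_real)),          # 0 = legitimate profile
        np.ones(len(X_manual_fake)),    # 1 = fake (manually created)
        np.ones(len(X_gpt_fake)),       # 1 = fake (LLM-generated, adversarial examples)
    ])
    clf = XGBClassifier(n_estimators=300, max_depth=6, eval_metric="logloss")
    clf.fit(X, y)
    return clf
```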
Ablation results:
– Best: combined numerical features and textual embeddings
– Next: numerical features only
– Worst: textual embeddings only
Even GPT-4 Turbo and human reviewers struggle to identify these synthetic profiles, so robust automated detectors remain essential.
Highlights of This Project
Introduced 600 GPT-4-generated fake profiles for adversarial training and evaluation.
Built hybrid detection systems using both textual embeddings and numerical profile features.
Employed advanced classifiers, including XGBoost and CatBoost, with calibrated probability outputs (see the sketch after this list).
Conducted extensive benchmarking against GPT-4 and human evaluators.
Released a fully open-source dataset, codebase, and evaluation toolkit for reproducibility.
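The sketch below illustrates how such a hybrid, calibrated detector could be assembled. The embedding model, the numerical feature names, and the profile field names are assumptions made for illustration and are not taken from the project's codebase.

```python
# Illustrative sketch: hybrid textual + numerical features with calibrated probabilities.
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.calibration import CalibratedClassifierCV
from xgboost import XGBClassifier

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed text encoder

def hybrid_features(profiles):
    """Concatenate textual embeddings with numerical profile features."""
    texts = [p["summary"] for p in profiles]        # free-text section of each profile
    text_emb = encoder.encode(texts)                # (n, d) sentence embeddings
    numeric = np.array([
        [p["num_connections"],                      # example numerical features;
         p["num_skills"],                           # field names are hypothetical
         p["account_age_days"]]
        for p in profiles
    ])
    return np.hstack([text_emb, numeric])

def train_calibrated_detector(X_train, y_train):
    """Wrap XGBoost in isotonic calibration so scores behave like probabilities."""
    base = XGBClassifier(n_estimators=300, eval_metric="logloss")
    clf = CalibratedClassifierCV(base, method="isotonic", cv=5)
    clf.fit(X_train, y_train)
    return clf
```

Calibrating the classifier's scores makes probability thresholds, and hence the False Accept / False Reject trade-off, easier to compare across feature configurations such as those in the ablation above.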
Who Should Read This?
This research is relevant to:
Researchers in AI safety, security, and social computing
LinkedIn policy teams and social platform defenders
Engineers building LLM detection and adversarial robustness tools
Anyone interested in misinformation, digital trust, and human-AI interactions
Navigation Preview
Datasets → How we collected and constructed legitimate, manually created fake, GPT-3.5-generated, and GPT-4-generated profiles
Implementation → Embedding models, feature design, classifiers, and adversarial training setup
Discussion → Detailed results, ablation studies, and benchmarking against humans and GPT-4
Conclusion → What we learned and where this research is headed next
The presentation slides (PowerPoint) can be accessed using this link.