Results and discussion

1. Baseline Performance on Manual Fakes

To establish a clean reference point, we first evaluated profile detection performance in a traditional setting with no LLM-generated content present. The training and test sets consisted only of:

Using Flair embeddings with an XGBoost classifier, we obtained the following performance: