5. Feature Insights & Calibration
We conducted ablation studies to understand the relative contribution of each feature type and tested whether better calibration improves robustness to LLM-generated attacks.
Ablation: Feature Type vs. Robustness
STE Text Embeddings Only: High performance on manually created fakes (F1 ≈ 96%), but F1 dropped to 57–81% on LLM-generated fakes.
Numerical Features Only: Moderate baseline performance (F1 ≈ 95%), but more resilient to LLM attacks (F1 ≈ 78–80%).
Combined (Text + Numeric): Highest performance across all test cases, retaining F1 > 97% even under combined attack scenarios.
These results suggest that while LLMs are strong at mimicking human writing, they struggle to emulate platform-level behavioral signals (e.g., skill counts, endorsement ratios, connection networks). Numerical features captured those inconsistencies.
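To make the combined setup concrete, the sketch below pairs off-the-shelf sentence embeddings with the kinds of numeric signals listed above and trains a single classifier on their concatenation. The encoder, profile field names, and classifier are illustrative assumptions standing in for the STE embeddings and feature pipeline, not a reproduction of them.

```python
# A hedged sketch of the text + numeric feature combination; the encoder,
# profile field names, and classifier are assumptions, not the exact pipeline.
import numpy as np
from sentence_transformers import SentenceTransformer  # stand-in for STE
from sklearn.ensemble import GradientBoostingClassifier

encoder = SentenceTransformer("all-MiniLM-L6-v2")

def build_features(profiles):
    """Concatenate text embeddings with platform-level behavioral signals."""
    # "summary", "skill_count", etc. are hypothetical profile fields.
    text_emb = encoder.encode([p["summary"] for p in profiles])
    numeric = np.array(
        [[p["skill_count"], p["endorsement_ratio"], p["connection_count"]]
         for p in profiles],
        dtype=float,
    )
    # Behavioral columns sit alongside the embedding dimensions, so the
    # classifier can exploit inconsistencies an LLM-written bio cannot hide.
    return np.hstack([text_emb, numeric])

# clf = GradientBoostingClassifier().fit(build_features(train_profiles), labels)
```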
Calibration Effects
We evaluated model calibration using the Brier score, i.e., the mean squared difference between predicted probabilities and actual outcomes (lower is better). Well-calibrated models made sharper, more reliable predictions and suffered lower false-accept rates.
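The metric itself is a one-liner with scikit-learn; the labels and probabilities below are toy values for illustration only.

```python
# Computing the Brier score with scikit-learn's brier_score_loss.
from sklearn.metrics import brier_score_loss

y_true = [1, 0, 1, 1, 0]            # 1 = fake profile, 0 = legitimate
p_fake = [0.9, 0.2, 0.7, 0.6, 0.1]  # model's predicted P(fake)

# Mean squared difference between predicted probability and outcome;
# 0 is perfect, and lower values indicate better calibration.
print(brier_score_loss(y_true, p_fake))
```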
Calibration complements feature design: even when the feature set is strong, a poorly calibrated model can overtrust its predictions. Our results show that both matter.
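As a minimal, self-contained illustration of post-hoc calibration, the sketch below applies Platt scaling via scikit-learn's CalibratedClassifierCV and compares Brier scores before and after. The synthetic data and gradient-boosted base model are assumptions that stand in for our profile features and detector; the sketch shows the mechanics, not our exact procedure.

```python
# Post-hoc calibration via Platt scaling; synthetic data stands in for
# the profile features, so this illustrates the mechanics only.
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import brier_score_loss
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

base = GradientBoostingClassifier().fit(X_tr, y_tr)

# CalibratedClassifierCV refits clones of the base model on CV folds and
# learns a sigmoid mapping from raw scores to calibrated probabilities.
cal = CalibratedClassifierCV(GradientBoostingClassifier(),
                             method="sigmoid", cv=5)
cal.fit(X_tr, y_tr)

p_raw = base.predict_proba(X_te)[:, 1]
p_cal = cal.predict_proba(X_te)[:, 1]
print(f"Brier (raw):        {brier_score_loss(y_te, p_raw):.4f}")
print(f"Brier (calibrated): {brier_score_loss(y_te, p_cal):.4f}")
```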