Phase 1 trained and compared 24 experimental configurations crossing 4 model types, 3 feature selection strategies, and 2 resampling conditions.
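The count of 24 follows from a full factorial crossing (4 × 3 × 2). A minimal sketch of enumerating such a grid is given below; the labels are hypothetical placeholders, since this summary states only how many options each factor has.

```python
from itertools import product

# Hypothetical placeholder labels: only the counts (4, 3, 2) come from the text.
MODELS = ["model_1", "model_2", "model_3", "model_4"]
FEATURE_SELECTION = ["strategy_1", "strategy_2", "strategy_3"]
RESAMPLING = ["resampled", "not_resampled"]


def build_grid(models, selectors, resamplers):
    """Return every (model, feature selection, resampling) combination."""
    return [
        {"model": m, "feature_selection": f, "resampling": r}
        for m, f, r in product(models, selectors, resamplers)
    ]


grid = build_grid(MODELS, FEATURE_SELECTION, RESAMPLING)
print(len(grid))
```

Each dictionary in `grid` would then drive one training run, so every model sees every feature set under both resampling conditions.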
Phase 2 applied two XAI methods, SHAP and LIME, exclusively to the best-performing model, generating compliance documentation that maps directly to EU AI Act obligations.
Amount was log-transformed, and Amount and Time were z-score normalised. Zero missing values were confirmed across 284,807 records.
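The two preprocessing steps can be sketched in a few lines of standard-library Python. Two points are assumptions rather than details from the study: `log1p` (log(1 + x)) is used so that zero amounts stay finite, and the z-score is applied to the already log-transformed Amount. The sample values are made up for illustration.

```python
import math
from statistics import fmean, pstdev


def log_transform(values):
    """Apply log(1 + x); log1p is an assumption that keeps zero amounts finite."""
    return [math.log1p(v) for v in values]


def z_score(values):
    """Centre to mean 0 and scale to unit (population) standard deviation."""
    mu = fmean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("cannot z-score a constant column")
    return [(v - mu) / sigma for v in values]


# Toy values for illustration only, not taken from the dataset.
amounts = [0.0, 2.69, 149.62, 378.66, 1200.0]
times = [0.0, 1.0, 2.0, 7.0, 10.0]

amount_scaled = z_score(log_transform(amounts))  # assumed order: log first, then scale
time_scaled = z_score(times)
```

In a real pipeline the mean and standard deviation would normally be estimated on the training split only and reused on the test split, to avoid leaking test statistics.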
Synthetic Minority Over-sampling Technique (SMOTE) corrected the 578:1 class imbalance, expanding the training data to 454,902 balanced records.
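A balanced set of 454,902 records implies 227,451 records per class. The interpolation idea behind this kind of over-sampling can be sketched as follows. This is a simplified pure-Python illustration, not the implementation used in the study: the function name, the toy data, and the choice to pick base samples at random are all assumptions.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points, each lying on the segment between a
    minority sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Other minority indices, sorted by Euclidean distance to the base point.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data for illustration only.
majority = [[float(x), float(x % 7)] for x in range(40)]
minority = [[100.0, 1.0], [101.0, 2.0], [102.0, 1.5], [103.0, 2.5]]

new_points = smote(minority, n_new=len(majority) - len(minority), k=3)
balanced_minority = minority + new_points
```

Because every synthetic point is a convex combination of two real minority samples, it stays inside the region the minority class already occupies. Resampling would be applied to the training split only, so that the test set keeps the original class ratio.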
Random Forest and XGBoost served as Level-1 base learners, with Logistic Regression as the Level-2 meta-learner, stacked via 5-fold cross-validation.
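One common way to use 5-fold cross-validation in a two-level ensemble is to train the meta-learner on out-of-fold predictions from the base learners. Whether the study did exactly this is an assumption. The sketch below shows only how the Level-2 input features would be built; the `nearest_mean_learner` stand-ins are hypothetical toy learners used in place of Random Forest and XGBoost, and the data is made up.

```python
def k_fold_indices(n, k=5):
    """Split range(n) into k contiguous folds of near-equal size."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def out_of_fold_predictions(fit, X, y, k=5):
    """For each fold, fit on the other k-1 folds and predict the held-out rows.

    `fit(X_train, y_train)` must return a function mapping one row to a score.
    """
    oof = [None] * len(X)
    for held_out in k_fold_indices(len(X), k):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        for i in held_out:
            oof[i] = predict(X[i])
    return oof


def nearest_mean_learner(feature):
    """Hypothetical toy base learner: predict the class whose mean is closer."""
    def fit(X_train, y_train):
        pos = [r[feature] for r, t in zip(X_train, y_train) if t == 1]
        neg = [r[feature] for r, t in zip(X_train, y_train) if t == 0]
        mu_pos, mu_neg = sum(pos) / len(pos), sum(neg) / len(neg)
        return lambda row: (
            1.0 if abs(row[feature] - mu_pos) < abs(row[feature] - mu_neg) else 0.0
        )
    return fit


# Toy, cleanly separable data for illustration only.
y = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
X = [[t * 2.0 + 0.1 * i, -t * 1.0 + 0.05 * i] for i, t in enumerate(y)]

oof_a = out_of_fold_predictions(nearest_mean_learner(0), X, y, k=5)
oof_b = out_of_fold_predictions(nearest_mean_learner(1), X, y, k=5)

# Each row of meta_features is what the Level-2 meta-learner would train on.
meta_features = list(zip(oof_a, oof_b))
```

Since every prediction in `meta_features` comes from a model that never saw that row during fitting, the meta-learner is not trained on over-optimistic base-learner outputs.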
SHAP global attribution ranked features across 500 test instances. LIME constructed locally linear surrogates for individual fraud cases.
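A global ranking of this kind is typically obtained by averaging the absolute per-instance attributions for each feature. The sketch below computes exact Shapley values by brute-force enumeration for a toy linear model, which is feasible only for a handful of features; it is not the SHAP library and not the study's model. The baseline vector, the toy model, and the three sample instances are assumptions for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values for one instance; features outside a coalition
    are replaced by their baseline value."""
    n = len(x)

    def value(coalition):
        return model([x[i] if i in coalition else baseline[i] for i in range(n)])

    phi = []
    for i in range(n):
        rest = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(rest, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def global_ranking(model, instances, baseline):
    """Rank feature indices by mean absolute Shapley value across instances."""
    n = len(baseline)
    mean_abs = [0.0] * n
    for x in instances:
        for i, p in enumerate(shapley_values(model, x, baseline)):
            mean_abs[i] += abs(p) / len(instances)
    order = sorted(range(n), key=lambda i: mean_abs[i], reverse=True)
    return order, mean_abs


def toy_model(row):
    """Hypothetical linear scorer; the third feature has no effect."""
    return 3.0 * row[0] - 1.0 * row[1] + 0.0 * row[2]


instances = [[1.0, 2.0, 5.0], [2.0, 0.0, 1.0], [0.0, 1.0, 3.0]]
baseline = [0.0, 0.0, 0.0]

order, mean_abs = global_ranking(toy_model, instances, baseline)
```

For each instance the attributions sum to the difference between the model output and the baseline output, which is the property that makes per-instance values comparable before they are averaged into a global ranking.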
This design enabled the first direct empirical comparison of attribution-based vs. correlation-based feature selection under identical model configurations.