We challenge the technical assumption of reference-based phishing detectors and demonstrate that relying solely on a predefined reference list may not be adequate to cover constantly evolving phishing webpages.
We propose DynaPhish, a systematic remedy (with reference expansion and behavioral invariants) for any reference-based phishing detector, addressing the inherent limitations of such detectors in deployment.
We create and release DynaPD, the largest dynamic phishing dataset, which includes 6344 live phishing webpages. This dataset provides a replicable environment for studying phishing behaviors, which can aid the community in developing new phishing detection solutions and conducting further empirical studies.
Our extensive experiments in both closed-world and open-world environments show that DynaPhish is effective and practical, significantly improving the recall of state-of-the-art phishing detectors with minimal impact on precision and efficiency.
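The limitation of a fixed reference list described above can be illustrated with a toy sketch. This is not DynaPhish's implementation: the `ReferenceList` class, the `classify` function, and the brand and domain names are all hypothetical, and the real step of verifying a new brand before adding it to the list is outside the scope of this example. It only shows that a page imitating a brand missing from the list gets no verdict until the list is expanded.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class ReferenceList:
    # Hypothetical structure: brand name -> domains considered legitimate for it.
    brands: Dict[str, Set[str]] = field(default_factory=dict)

    def expand(self, brand: str, domain: str) -> None:
        """Add a (brand, domain) pair; assumes the pair was verified elsewhere."""
        self.brands.setdefault(brand, set()).add(domain)


def classify(page_brand: str, page_domain: str, refs: ReferenceList) -> str:
    """Toy reference-based check: a page showing a known brand on a domain
    that is not registered for that brand is flagged as phishing."""
    if page_brand not in refs.brands:
        return "unknown"  # brand not in the reference list: no verdict possible
    if page_domain in refs.brands[page_brand]:
        return "benign"
    return "phishing"


# Placeholder brands and domains, chosen only for illustration.
refs = ReferenceList({"acme": {"acme.example"}})

# The imitated brand is not in the list, so the page goes undetected.
before = classify("examplebank", "examplebank-login.test", refs)

# After expanding the list with the brand's legitimate domain,
# the same page is flagged.
refs.expand("examplebank", "examplebank.example")
after = classify("examplebank", "examplebank-login.test", refs)

print(before, after)
```

In this sketch, recall improves only because the list grows; pages of brands already in the list are classified the same way before and after the expansion.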
RQ1, RQ2, RQ4
DynaPD: we release 30 sampled phishing kits:
https://drive.google.com/file/d/1uFpy17PH3utB-9eXvxjIM1I2MKDTPWlQ/view
Setup instructions:
https://drive.google.com/drive/folders/1L8QDGs49PDLxFDhKhW-vnUaMZoRkGd6h?usp=sharing
6.3K Benign dataset:
https://drive.google.com/file/d/1AaAJO8CN7RNx7D_fPRTMlwgRYZ2iL2oC/view?usp=share_link
Enlarged target list after running on DynaPD (6.3K phishing kits) and the 6.3K benign dataset:
https://drive.google.com/file/d/1FWkBpEWHjlyaH5IJnxyRNEC6LMRpX16O/view?usp=share_link
RQ3
Enlarged target list after running on the Phishpedia benchmark:
RQ4
Submission button locator database, train and test:
https://drive.google.com/drive/folders/1b3qJnyq1vgYZ8EKvxCCKQO_e7OpgB9_k?usp=sharing
RQ5
Phishing webpages reported by DynaPhishIntention in the field study:
https://drive.google.com/drive/folders/11lYmKhTVqsQUXZwUDMFVQ9uGkgkTKS_G?usp=sharing
Enlarged target list after one month of field study:
https://drive.google.com/file/d/1UKykkUTr8xIIYbaAU1h245R07RyM8gIw/view?usp=share_link
To release