Our Failure Cases in the Closed-World Benchmark Datasets

False Positives on CSDMC Ham Dataset

False Negatives on Nazario Dataset

False Negatives on PhishPot Dataset

False Positives on CSDMC Ham Dataset

FP Reason 1: Genuinely Suspicious Emails

A promotional email for EFF, but from spamassassin.taint.org

A promotional email for Yahoo Finance, but from spamassassin.taint.org

A promotional email for tesco.ie, but from reply.12hs.com

A promotional email for an O'Reilly book, but from dogma.slashnull.org
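
The four cases above share one pattern: the content promotes one brand while the sender domain belongs to someone else. A minimal sketch of that comparison follows. It is an illustration only, not the detection method evaluated on this page. The function names, the local parts of the addresses, and the brand domains passed in (other than tesco.ie) are assumptions made for the example.

```python
from email.utils import parseaddr


def sender_domain(from_header: str) -> str:
    """Return the lower-cased domain of the address in a From header."""
    _, addr = parseaddr(from_header)
    # rpartition yields an empty string when no address could be parsed
    return addr.rpartition("@")[2].lower()


def domain_matches_brand(from_header: str, brand_domain: str) -> bool:
    """True if the sender domain is the brand domain or one of its subdomains."""
    domain = sender_domain(from_header)
    brand = brand_domain.lower()
    return domain == brand or domain.endswith("." + brand)


# Hypothetical From headers; only the domains come from the cases above.
print(domain_matches_brand("Tesco <offers@reply.12hs.com>", "tesco.ie"))
print(domain_matches_brand("EFF <news@spamassassin.taint.org>", "eff.org"))
```

Both calls report a mismatch, which is why these ham emails look suspicious. A check of this kind cannot help when the sender address itself is spoofed to the real brand domain, because the domains then agree.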

False Negatives on Nazario Dataset

FN Reason 1: Outside our threat model (sender address spoofing)

The sender address is spoofed to be the real amazon.com

The sender address is spoofed to be the real psecu.com

The sender address is spoofed to be the real dhl.com

The sender address is spoofed to be the real standardbank.co.za

The sender address is spoofed to be the real wetransfer.com

FN Reason 2: Ambiguous identity

False Negatives on PhishPot Dataset

FN Reason 1: Unknown identity

FN Reason 2: Actions are visually salient but not lexically prominent

The call-to-action is "Staking Round 1"

The call-to-action is "Free Mint"

The call-to-action is "Your name came up..."

The call-to-action is "Surprise in your inbox..."

FN Reason 3: Outside our threat model (other types of scam / spam)

Generic spam, promotional email

Generic spam, promotional email

Generic spam, promotional email

Nigerian prince scam imitating Warren Buffett; we lack the actual email for Warren Buffett, so the identity cannot be verified

Nigerian prince scam imitating Wood Forest; we lack the actual email for Wood Forest, so the identity cannot be verified
