Web Interaction Model

WebInteraction System Design

In our WebInteraction system, the classes are organized into a three-level hierarchy.

Level 1
- PhishIntentionWrapper: A wrapper for PhishIntention
- SubmissionButtonLocator: An object detector for submission buttons
- OpenMMOCR: A wrapper that calls the OCR library from OpenMMLab
Level 2
- StateClass: Checks the state of the website, e.g. empty page, error page, etc.
- StateAction: Performs the CRP transition proposed by PhishIntention
- Form: Implements the core logic, which covers input detection, input type matching, button detection, button cleansing, form filling, and form submission
  - Input detection: Since there are only a limited number of ways to implement a fillable input field in HTML, we locate all <input>, <textarea>, and <search> tags in the HTML document
  - Input type matching: To decide which type of credential a particular input is asking for, we use 2 layers of matching
    - 1) The simplest way is to look for keywords in the HTML attributes
    - 2) If layer 1 is bypassed by HTML code obfuscation, we use OCR to read the text surrounding the input area
    - In total, we keep 29 matching rules for 29 input types, including email, first name, last name, username, userid, name prefix, password, phone area, phone, month, day, year, birthday, age, file upload, zipcode, city, country, state, street, building number, address, ssn, company name, credit card number, credit card ccv, and credit card expiration date. We keep such a comprehensive list to avoid interaction failures caused by inputs that are not filled in the required format.
  - Button detection: The submission button detector is an object detector trained on 1495 images
  - Button cleansing: We discard "registration" buttons, because clicking a registration button will always proceed to the next page.
  - Form filling: Fill all inputs with values in the required formats
  - Form submission: Click the most probable submission button
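
The Form steps above can be sketched roughly as follows. This is only a minimal illustration using Python's standard library, not the project's actual code: the rule table `KEYWORD_RULES`, the attribute list `HINT_ATTRS`, the word list `REGISTRATION_WORDS`, and all function names are hypothetical, and the rule table is a small made-up subset of the 29 rules. The sketch also assumes that each button detection comes with a text label and a confidence score, which the text above does not state.

```python
from html.parser import HTMLParser

# Hypothetical layer-1 rules: input type -> keywords to look for in attributes.
# Only a small illustrative subset; the real rule set is not shown in the text.
KEYWORD_RULES = {
    "email": ("email", "e-mail"),
    "password": ("password", "passwd", "pwd"),
    "username": ("username", "login"),
    "phone": ("phone", "mobile", "tel"),
}

# Tags treated as fillable inputs, as listed under "Input detection".
FILLABLE_TAGS = {"input", "textarea", "search"}

# Assumption: these attributes are the ones that carry type hints.
HINT_ATTRS = ("type", "name", "id", "placeholder", "aria-label", "autocomplete")

# Assumption: labels containing these words mark a registration button.
REGISTRATION_WORDS = ("register", "sign up", "create account")


class InputCollector(HTMLParser):
    """Collects (tag, attributes) for every fillable tag in a document."""

    def __init__(self):
        super().__init__()
        self.inputs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag in FILLABLE_TAGS:
            self.inputs.append((tag, {k: (v or "") for k, v in attrs}))


def detect_inputs(html):
    """Input detection: return all fillable tags with their attributes."""
    collector = InputCollector()
    collector.feed(html)
    return collector.inputs


def match_input_type(attrs):
    """Layer 1: keyword lookup in HTML attributes.

    Returns the first matching input type, or None, in which case a caller
    would fall back to layer 2 (OCR on the text around the input).
    """
    haystack = " ".join(attrs.get(a, "") for a in HINT_ATTRS).lower()
    for input_type, keywords in KEYWORD_RULES.items():
        if any(keyword in haystack for keyword in keywords):
            return input_type
    return None


def pick_submit_button(detections):
    """Button cleansing plus selection.

    detections: list of (label_text, confidence) pairs.
    Drops registration buttons, then returns the most confident remaining
    detection, or None if nothing is left.
    """
    kept = [
        d for d in detections
        if not any(word in d[0].lower() for word in REGISTRATION_WORDS)
    ]
    return max(kept, key=lambda d: d[1], default=None)
```

For example, calling `match_input_type` on each entry returned by `detect_inputs(page_html)` gives the layer-1 guess per input, and any `None` result marks an input that would need the OCR fallback. Substring matching is deliberately crude here (for instance, "tel" also matches "hotel"), so a real rule set would need stricter patterns.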
Level 3
- Web Interaction model: This is a more detailed illustration of Algorithm 3
