•H-1B visa petitions whose decisions were made in FY 2018.
•The dataset consists of 654,348 rows and 52 columns.
•The target variable is case status, which takes four distinct values: Certified, Certified-Withdrawn, Denied, and Withdrawn. Since our focus is on denials, we treat Denied as the event and the other three categories as non-events.
•The dataset includes a range of employer and employee characteristics. It is highly imbalanced: 98.7% of the petitions were certified or withdrawn and only 1.3% were denied.
•Source: https://www.uscis.gov/tools/reports-studies/h-1b-employer-data-hub-files
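The event/non-event encoding and the imbalance check described above can be sketched as follows. This is a minimal illustration, not the pipeline actually used: the function names and the five-row sample are hypothetical, and only the four case-status labels come from the description above.

```python
from collections import Counter

# The one status treated as the event; the other three are non-events.
EVENT_LABEL = "Denied"


def to_binary_target(case_statuses):
    """Map each case status to 1 (denied, the event) or 0 (any other status)."""
    return [
        1 if status.strip().lower() == EVENT_LABEL.lower() else 0
        for status in case_statuses
    ]


def class_shares(binary_target):
    """Return the fraction of non-events (0) and events (1) in the target."""
    total = len(binary_target)
    if total == 0:
        return {0: 0.0, 1: 0.0}
    counts = Counter(binary_target)
    return {label: counts.get(label, 0) / total for label in (0, 1)}


# Toy sample for illustration only; these are not rows from the real dataset.
sample = ["Certified", "Denied", "Certified-Withdrawn", "Withdrawn", "Certified"]
target = to_binary_target(sample)
shares = class_shares(target)
print(target, shares)
```

Run on the full case-status column, the same check would be expected to show an event share of roughly 1.3%, matching the imbalance noted above.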