We analyzed bias among demographic subgroups in the imbalanced MIMIC III and SEER clinical datasets across several prediction tasks (in-hospital mortality and decompensation prediction on MIMIC III; breast and lung cancer survivability prediction on SEER). We designed a double prioritized (DP) technique to reduce this bias and improve prediction accuracy for minority subgroups.
Supervisor: Dr. Olivera Kotevska
[June 2021 - August 2021]
Analyzed local differential privacy (LDP) frameworks on IoT-based streaming data. Conducted a comparative analysis of LDP frameworks, including distribution-based, randomized response-based, and count sketch-based approaches.
Supervisor: Dr. Daphne Yao
[August 2018 - ongoing]
We are developing a security measure to detect vulnerable applications, many of which are released every day by inexperienced third-party developers. Installing these applications can leak users' personal data, because such developers often fail to write secure code, leaving the software vulnerable to attack. We aim to develop user-friendly software for Android developers, and we are also identifying the cases that can make an application vulnerable.
Supervisor: Dr. Tanzima Hashem
[January 2017 - July 2018]
We proposed a differentially private approach to frequent itemset mining that discovers frequent itemsets without revealing users' sensitive personal data. We demonstrated the practicality of our system through extensive experiments on real human genome datasets.
Supervisor: Dr. Tanzima Hashem
[August 2014 - September 2017]
We proposed a novel secret-sharing approach for computing the probability that a patient will develop a specific disease without revealing sensitive data (the patient's genome, the name of the disease, the medical center's disease markers, and the query result, i.e., the likelihood of the disease) to dishonest adversaries, while also ensuring authentication of the genomic data. We demonstrated the practicality of our system through extensive experiments on real human genome datasets.