I carried out this project under the guidance of Dr. Supriya M. It explores the privacy risks that arise from the increasing use of big data and introduces a new concept called “Victimization”, which highlights how individuals can become vulnerable when large and diverse datasets are collected and analyzed. To address this challenge, the work evaluates federated learning, in which models are trained locally where the data resides and only model updates are shared, as a privacy-preserving alternative to traditional centralized data processing. The study shows that federated learning can reduce these risks while still delivering strong model performance, making it a promising approach for safer and more responsible use of big data.
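To make the federated learning idea concrete, the following is a minimal sketch of the aggregation step commonly known as federated averaging (FedAvg): each participant trains on its own data, and a server combines the resulting parameters weighted by local sample count. It is written in plain Python rather than PyTorch so that it is self-contained, and it is an illustration only, not the project's actual implementation. The function name `federated_average`, the two-client setup, and the parameter values are all assumptions made for the example.

```python
from typing import List, Sequence


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Combine client parameter vectors, weighting each by its sample count.

    Hypothetical helper for illustration; raw records never leave the clients,
    only their locally trained parameters are passed in here.
    """
    if len(client_weights) != len(client_sizes):
        raise ValueError("one sample count is required per client")
    if not client_weights:
        raise ValueError("at least one client is required")

    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, n_samples in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share the same parameter shape")
        for i, w in enumerate(weights):
            # Each client contributes in proportion to its share of the data.
            averaged[i] += w * n_samples / total
    return averaged


if __name__ == "__main__":
    # Assumed scenario: two clients holding 9999 and 768 samples.
    # The parameter values below are made up for the example.
    global_weights = federated_average(
        client_weights=[[0.2, -0.1], [0.6, 0.3]],
        client_sizes=[9999, 768],
    )
    print(global_weights)
```

Weighting by sample count means a client with far more records has proportionally more influence on the shared model, which is one design choice among several; an unweighted mean is an alternative when client sizes should not dominate.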
PyTorch Framework
Two datasets were used:
Dataset 1: collected from Kaggle
Includes features such as age, gender, body mass index (BMI), hypertension, heart disease, smoking history, HbA1c level, and blood glucose level
It consists of 9999 data samples
Dataset 2: collected from the National Institute of Diabetes and Digestive and Kidney Diseases
Includes features such as Pregnancies, Glucose, Blood Pressure, Skin Thickness, Insulin, BMI, Diabetes Pedigree Function, and Age
It consists of 768 data samples