Helping hospitals detect stroke risk early through data mining
This project successfully used machine learning to predict stroke risk and identify patients likely to have a stroke, especially those who may not show obvious symptoms. Logistic Regression was the most effective model for detecting actual stroke cases, making it suitable for early medical screening.
The model helped highlight top high-risk individuals based on health and lifestyle patterns like age, glucose levels, and BMI—supporting early intervention even in patients without known heart conditions.
Finally, the model was able to group patients into low, medium, and high-risk levels, helping healthcare providers plan more targeted prevention and care efforts.
Use More Health Data
Adding features like cholesterol, diet, exercise, or medication history could improve accuracy.
Track Changes Over Time
Using patient history (like changes in glucose or blood pressure) can help spot warning signs earlier.
Combine Models
Merging the strengths of different models (e.g., Logistic Regression + XGBoost) may give better results.
Reduce False Alarms
The model catches most stroke cases, but sometimes gives too many alerts. Future work should focus on improving precision without missing real cases.
Make It Real-Time
Build a system that lets doctors enter patient data and instantly see stroke risk—through apps or hospital software.