By Raymond Yu
Published 12/23/2025
One important warning sign for heart disease is cardiomegaly, or an enlarged heart, which can often be spotted through chest X-rays. Traditionally, doctors and radiologists interpret these images by hand, which takes time and expertise. This project explores how artificial intelligence (AI) can help by automatically detecting cardiomegaly from X-ray images.
The program, called CardiomegalyClassification, uses a deep learning model built with MobileNet, a neural network originally trained to recognize everyday images. By retraining it to focus on medical X-rays (a process called transfer learning), the system learns to identify signs of heart enlargement accurately and efficiently. The model was trained using a dataset of labeled X-rays and evaluated by accuracy, precision, and AUC scores.
This project shows that AI can support faster, more consistent, and more accessible medical screening.
Cardiomegaly is often a visible sign of underlying cardiovascular disease such as high blood pressure, cardiomyopathy, or heart failure. Detecting it early can help doctors treat these conditions before they become dangerous. However, reading chest X-rays manually requires professional expertise and can be time-consuming, especially in hospitals with many patients or limited resources.
Artificial intelligence (AI) offers a way to help. Deep learning systems can analyze medical images and recognize patterns that may be less noticeable to the human eye. By training these models on labeled data, AI can learn to distinguish between healthy and abnormal scans in seconds.
The goal of this project is to create an open-source AI tool that detects cardiomegaly automatically from chest X-rays. The system is designed as a demonstration for students of how AI can be applied to make healthcare screening faster and more efficient.
Data and Preparation
The dataset contained chest X-ray images labeled as either normal or cardiomegaly. To prepare the images for AI training, each one was resized to the same input size and normalized so that pixel brightness values were consistent.
Because real medical datasets can be small or uneven, data augmentation was used to create variety. This technique makes slight modifications to each image (rotating, flipping, or zooming) to help the model learn general patterns instead of memorizing specific images. This step reduces overfitting and makes the AI more reliable on new data.
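A minimal sketch of this preparation step in Keras is shown below. The specific augmentation settings (±10° rotation, 10% zoom, horizontal flips) and the 224×224 input size are illustrative assumptions, not values confirmed by the repository; the random array stands in for real X-ray images.

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation pipeline: rescale pixel values into [0, 1] for consistent
# brightness, and apply small random rotations, zooms, and flips so the
# model sees slight variations of each X-ray instead of memorizing them.
datagen = ImageDataGenerator(
    rescale=1.0 / 255.0,
    rotation_range=10,    # rotate up to +/-10 degrees (assumed value)
    zoom_range=0.1,       # zoom in/out by up to 10% (assumed value)
    horizontal_flip=True,
)

# Stand-in for a batch of chest X-rays already resized to 224x224 RGB.
images = np.random.randint(0, 256, size=(8, 224, 224, 3)).astype("float32")
labels = np.array([1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0])  # 1.0 = cardiomegaly

# Draw one augmented batch; pixel values are now normalized to [0, 1].
batch_images, batch_labels = next(datagen.flow(images, labels, batch_size=8))
print(batch_images.shape)  # (8, 224, 224, 3)
```

Each pass over the data yields a slightly different version of every image, which is what lets a small dataset train a model without severe overfitting.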
Model Design
The core of the project is the MobileNet model, which was designed for efficiency and pre-trained on everyday images. Instead of building a deep learning model from scratch, the project uses transfer learning—a method that starts from a model already trained on millions of images and fine-tunes it for the new task of detecting cardiomegaly.
Starting from a pretrained model helps the system learn quickly and requires far less data than a brand-new neural network would. The Global Average Pooling layer in the CNN helps prevent overfitting (the memorization of details specific to the training images rather than the general patterns of cardiomegaly), improving real-world performance.
The model’s structure includes:
MobileNet backbone: extracts visual features from X-rays, such as edges and shapes.
Global Average Pooling layer: simplifies the features and reduces overfitting by averaging each feature map (a grid of values, each corresponding to a region of the image) into a single value, which reduces the number of parameters.
Dense layer (512 neurons, ReLU activation): acts as the main decision-making layer of the CNN and helps the model learn relationships between features and heart size.
Output layer (Sigmoid activation): produces a probability between 0 and 1, showing how likely the image indicates cardiomegaly.
X-ray Images
Positive cases are labeled 1.0, while negative cases are labeled 0.0.
MobileNet
This portion of the code sets the base model parameters and adds the necessary layers for a CNN.
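The architecture described above could be sketched in Keras as follows. This is an illustrative reconstruction from the description, not the repository's exact code; freezing the backbone (`base.trainable = False`) and the 224×224 input size are assumptions.

```python
import tensorflow as tf

def build_model(weights="imagenet"):
    """MobileNet backbone -> Global Average Pooling -> Dense(512, ReLU)
    -> single sigmoid output giving the probability of cardiomegaly."""
    base = tf.keras.applications.MobileNet(
        weights=weights,        # "imagenet" = start from pretrained features
        include_top=False,      # drop the original 1000-class classifier head
        input_shape=(224, 224, 3),
    )
    base.trainable = False      # freeze pretrained layers (an assumption here)

    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),        # one value per feature map
        tf.keras.layers.Dense(512, activation="relu"),   # decision-making layer
        tf.keras.layers.Dense(1, activation="sigmoid"),  # P(cardiomegaly) in [0, 1]
    ])
    return model
```

Calling `build_model()` with `weights="imagenet"` downloads the pretrained weights on first use; passing `weights=None` builds the same architecture with random initialization.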
Training and Evaluation
The model was trained for multiple epochs (iterations through the entire dataset) with a batch size of 32 (weights are updated every 32 images) using the Adam (Adaptive Moment Estimation) optimizer and a starting learning rate of 1 × 10⁻⁴. Adam is an optimization algorithm that adjusts the learning rate for each parameter during training, helping the model learn faster while preventing overly large updates that would "overshoot" its adjustments. Binary cross-entropy, a standard loss function for binary (yes/no) classification problems, was used to measure the difference between the predicted probabilities and the true labels.
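This training configuration might look like the following sketch. A tiny stand-in classifier replaces the full MobileNet model so the example runs quickly on random data; only the optimizer, learning rate, loss, and batch size reflect the description above.

```python
import numpy as np
import tensorflow as tf

# Tiny stand-in classifier (the real project uses the MobileNet model);
# it exists only so the training configuration can run end to end.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224, 224, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),  # starting LR 1e-4
    loss="binary_crossentropy",   # standard loss for yes/no classification
    metrics=["accuracy", tf.keras.metrics.AUC(name="auc")],
)

# Random stand-in data; the project trains on the labeled X-ray dataset.
x = np.random.rand(64, 224, 224, 3).astype("float32")
y = np.random.randint(0, 2, size=(64,)).astype("float32")

# Weights are updated once per 32-image batch; real training would use
# many more epochs than this single demonstration pass.
history = model.fit(x, y, batch_size=32, epochs=1, verbose=0)
```

With 64 images and a batch size of 32, each epoch performs exactly two weight updates; Keras records the loss and metrics per epoch in `history.history`.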
The model’s performance was evaluated using:
Accuracy: overall percentage of correct predictions.
Precision and Recall: precision is the fraction of flagged cases that truly have cardiomegaly (managing false positives); recall is the fraction of true cases the model catches (managing false negatives).
F1-Score: a combined measure of precision and recall.
ROC Curve and Area Under the Curve (AUC): how clearly the model separates normal and enlarged hearts.
Confusion Matrix Values: the counts of true positives, true negatives, false positives, and false negatives.
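The metrics above can all be computed with scikit-learn, as in this sketch. The probabilities and labels here are made up for illustration; in the project they would come from the trained model and the test set.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score, confusion_matrix)

# Hypothetical model outputs: true labels and predicted probabilities.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_prob = np.array([0.9, 0.2, 0.7, 0.4, 0.1, 0.6, 0.8, 0.3])
y_pred = (y_prob >= 0.5).astype(int)  # threshold probabilities at 0.5

accuracy = accuracy_score(y_true, y_pred)
precision = precision_score(y_true, y_pred)  # fraction of flagged cases that are real
recall = recall_score(y_true, y_pred)        # fraction of real cases that are caught
f1 = f1_score(y_true, y_pred)                # harmonic mean of precision and recall
auc = roc_auc_score(y_true, y_prob)          # AUC uses raw probabilities, not labels
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

print(f"acc={accuracy:.2f} prec={precision:.2f} rec={recall:.2f} "
      f"f1={f1:.2f} auc={auc:.3f} tp={tp} tn={tn} fp={fp} fn={fn}")
```

Note that accuracy, precision, recall, F1, and the confusion matrix depend on the chosen 0.5 threshold, while AUC summarizes performance across all possible thresholds.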
Overall, CardiomegalyClassification demonstrates how AI-supported screening can improve the efficiency and accessibility of healthcare by lessening the workload of medical professionals and allowing more patients to be screened in the same amount of time. If the training data were sampled to reduce bias and the parameters and overall architecture were further optimized, a similar model could potentially become reliable enough to assist in actual healthcare settings and make early detection easier.
Future ideas or improvements could include:
Adding explainable AI tools, like heatmaps that show which parts of the image influenced the AI’s decision.
Exploring other models like EfficientNet and evaluating their advantages, disadvantages, and performance.
Expanding the dataset to include diverse cases while minimizing bias.
GitHub Link: