Detecting Lung Illnesses Using Machine Learning

By: Abdullah Shahid, Nick Yount, Darren Whilite

Describing the Problem

Introduction

For many professionals without specialized training, distinguishing an X-ray of healthy lungs from one of unhealthy lungs is a real challenge that requires attention to the most minute detail. We set out to help with this challenge using machine learning. Computers can compare images at the pixel level, picking up differences too small for humans to notice. With this ability, we hope to train a computer to separate healthy lungs from unhealthy lungs by comparing them to images of other lungs.

Try to tell the difference between these images: Which one is Covid-19/Pneumonia/Normal?




The Problem

We wanted to find out whether machine learning could tell the difference between lungs with pneumonia and healthy lungs from X-ray images alone. We took this further by testing whether lung X-rays could also reveal if a patient was infected with COVID-19.

Secondly, we wanted to see whether transfer learning or a model built from scratch would work better.


Approach to CNN Model



A brief intro to CNNs

Convolutional Neural Networks (CNNs) are built from layers, and each convolutional layer has a set of filters. With these filters the network extracts features from the images. The resulting 3D tensors are then flattened into vectors, which the final layers of the model train on.

How does it apply to detecting diseases in lungs?

Using convolutional neural networks, we can extract features from X-ray images. For example, in the demonstration below, Conv2d() extracts features from the image and produces a tensor of outputs, and max pooling then keeps the maximum value in each region. This cycle repeats for as many layers as the model has. At the end, the flatten and dense layers turn the 3D tensors into vectors, and the Sigmoid function produces the predictions that the model outputs. We decided on a CNN because it is one of the most popular and effective machine learning approaches for large image datasets.
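To make the sequence of operations concrete, here is a minimal NumPy sketch of one convolution, one max-pooling step, a flatten, and a sigmoid output. It is only an illustration: the 6x6 random image, the edge-detecting filter, and the random dense weights are assumptions made for this example, and the function names are hypothetical rather than taken from our model.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide one filter over a 2D image (no padding, stride 1)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2d(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    h = feature_map.shape[0] // size * size
    w = feature_map.shape[1] // size * size
    windows = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    return windows.max(axis=(1, 3))

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
image = rng.random((6, 6))                  # stand-in for a tiny grayscale X-ray
kernel = np.array([[1.0, 0.0, -1.0]] * 3)   # simple vertical-edge filter

features = conv2d_valid(image, kernel)      # shape (4, 4)
pooled = max_pool2d(features)               # shape (2, 2)
flat = pooled.flatten()                     # shape (4,)
weights = rng.normal(size=flat.shape)       # untrained dense-layer weights
probability = sigmoid(flat @ weights)       # a value between 0 and 1
```

In a real CNN the filter values and dense weights are learned during training, and each layer applies many filters rather than one.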


Our Approach to CNN

We used the Keras library to create our CNN model from scratch, with Conv2d() for convolution and maxpool2d() for max pooling. Our model also used flatten and dense layers, with a Sigmoid function at the end to output a probability. For our loss function we used binary cross entropy, which measures the performance of a model whose output is a probability between 0 and 1.

Additionally, we doubled the number of filters with each successive convolutional layer, so that more features would be extracted at every layer.

The same model architecture was used for both the pneumonia and COVID-19 models.
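As a rough illustration of what binary cross entropy measures, the sketch below computes it by hand for a few made-up labels and predicted probabilities. The numbers are invented for the example and are not results from our model. Keras computes this loss internally, so this code is only meant to show the formula.

```python
import math

def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross entropy over a batch of labels and sigmoid outputs."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)   # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

labels = [1, 0, 1, 0]                     # 1 = infected, 0 = normal (made up)
confident_right = [0.9, 0.1, 0.8, 0.2]    # predictions close to the labels
confident_wrong = [0.1, 0.9, 0.2, 0.8]    # predictions far from the labels

low_loss = binary_cross_entropy(labels, confident_right)
high_loss = binary_cross_entropy(labels, confident_wrong)
```

Predictions close to the true labels give a small loss, while confident wrong predictions are penalized heavily, which is why the loss falls as the model improves.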
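The description above can be sketched in Keras roughly as follows. This is not our exact model: the input size of 150x150 grayscale, the starting value of 32 filters, the three convolutional blocks, the 3x3 filter size, and the Adam optimizer are all assumptions made for the example, and filter_schedule and build_cnn are hypothetical helper names. Only the overall pattern comes from the text: convolution and max pooling with doubling filters, then flatten, a dense layer with Sigmoid, and binary cross entropy.

```python
def filter_schedule(first_filters=32, num_blocks=3):
    """Number of filters per convolutional block, doubling each time."""
    return [first_filters * 2 ** i for i in range(num_blocks)]

def build_cnn(input_shape=(150, 150, 1), first_filters=32, num_blocks=3):
    """Build a small binary-classification CNN (illustrative sketch only)."""
    # Imported here so filter_schedule() can be used without TensorFlow installed.
    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential()
    model.add(keras.Input(shape=input_shape))
    for filters in filter_schedule(first_filters, num_blocks):
        model.add(layers.Conv2D(filters, (3, 3), activation="relu"))
        model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Flatten())
    model.add(layers.Dense(1, activation="sigmoid"))   # probability of the positive class
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```

Calling build_cnn().summary() would print a layer-by-layer table similar in spirit to the model summary shown in this section.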

Our model:

Model Summary

Data Augmentation

A major setback we experienced in the project was not finding enough X-ray data for Covid-19. Initially, we thought there was no way around this, so we focused on perfecting our model. Along the way, however, we discovered data augmentation. We looked into various techniques, such as flipping the image and adding noise. After much experimentation, we settled on flipping and rotating the images, because adding noise led to low accuracy and high loss. We obtained the augmentation code online; it augments the data and saves the results directly to the source folders. The code includes a function that lets you specify the number of augmented images needed. This was useful for us, as we needed a large amount of data to offset the imbalances.

An example of data augmentation can be seen in the image below:

The program flips and rotates the images, and in some cases we added a line of code to allow a slight zoom into the image.
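The idea can be sketched with NumPy as shown below. This is a simplified stand-in, not the code we downloaded: it only flips horizontally and rotates in 90-degree steps, it leaves out the zoom, and it keeps the images in memory instead of saving them to folders. The names augment and augment_to_count are hypothetical.

```python
import numpy as np

def augment(image, rng):
    """Return a randomly flipped and rotated copy of a 2D image array."""
    out = image
    if rng.random() < 0.5:
        out = np.fliplr(out)                        # horizontal flip
    out = np.rot90(out, k=int(rng.integers(0, 4)))  # rotate by 0, 90, 180 or 270 degrees
    return out.copy()

def augment_to_count(images, target_count, seed=0):
    """Add augmented copies until the list holds target_count images."""
    rng = np.random.default_rng(seed)
    result = list(images)
    while len(result) < target_count:
        source = images[int(rng.integers(0, len(images)))]
        result.append(augment(source, rng))
    return result

# Two tiny stand-in "images", grown to a set of ten.
originals = [np.arange(16.0).reshape(4, 4), np.ones((4, 4))]
balanced = augment_to_count(originals, target_count=10)
```

The target_count argument plays the same role as the setting in the original code that controls how many images are produced, which is what allows a small class to be topped up to match a larger one.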


The Results for CNN Model

Pneumonia vs Normal

Our model performed very well in this case, with a small number of false positives and false negatives on the validation set. We split the data so that the validation set contained 624 images in total. We trained the model for 9 epochs and received favorable results, with 92 percent accuracy. We did not increase the number of epochs because we were concerned about overfitting, and we did not think it was necessary since each epoch had a large number of steps. We also tried 20 epochs, a common choice, but the accuracy fluctuated and the final accuracy was lower. We believe this is most likely because of data imbalance and our small dataset.

Below we have included a confusion matrix for the evaluation, which shows 16 false positives and 37 false negatives. We consider these results good, since the error counts are low and reasonable given the size of the training dataset.
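As a quick consistency check, the accuracy can be recomputed from the figures above: 624 validation images with 16 false positives and 37 false negatives. The helper name accuracy_from_errors is hypothetical and exists only for this example.

```python
def accuracy_from_errors(total, false_positives, false_negatives):
    """Accuracy = correctly classified images / all images."""
    correct = total - false_positives - false_negatives
    return correct / total

pneumonia_accuracy = accuracy_from_errors(total=624,
                                          false_positives=16,
                                          false_negatives=37)
print(f"{pneumonia_accuracy:.1%}")   # prints 91.5%
```

That is 571 correct out of 624, or about 91.5%, which rounds to the 92 percent accuracy reported for this model.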

After the confusion matrix we can see the training loss and accuracy graphs.

The training accuracy and loss changed sharply across the epochs. The rate of change for both was very high: the loss fell towards 0 relatively quickly, and the model accuracy rose relatively quickly as well. We believe that with more data and more epochs we would see a steadier decrease in loss and increase in accuracy.

Covid-19 and Normal

For the COVID-19 and normal model we received similar results and high accuracy rates. We again augmented the testing and validation datasets, bringing them up to 624 images. Doing this allowed us to better compare the results of the two models. We trained this model for 9 epochs and received a 90% accuracy rate. We did experiment with more epochs, but again this led to lower accuracy, so we stopped at 9. We believe this could have been caused by the data imbalance.

A confusion matrix is included below. Again the results were positive, with only 11 false positives, although there were 47 false negatives. We are happy with the results given the limited size of our datasets.

Along with the confusion matrix there are the loss and accuracy graphs.

Our loss and accuracy graphs were similar to the pneumonia example, with the curves changing drastically across the epochs. The accuracy went towards 1.00 and the loss went towards 0, so the model was training as we wanted.

Transfer Learning Model

What is transfer learning and how is it used?

Transfer learning means taking a pretrained model and fitting new data to it so that it performs according to your needs. For example, one of the most popular pretrained models is ResNet-50, which comes with its layers already built and has been trained on a large image database called ImageNet. As a result, it can already predict what certain images show. We used the ResNet-50 model for our program. We started by giving the model some example images and seeing how it performed without any training.
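A common way to set this up in PyTorch is sketched below, using the torchvision version of ResNet-50. It is an illustration under stated assumptions rather than our exact code: freezing the pretrained layers, using two output classes, and the helper name build_transfer_model are all choices made for this example, and the weights argument shown requires a reasonably recent torchvision release.

```python
def build_transfer_model(num_classes=2, pretrained=True):
    """Load ResNet-50 and replace its final layer for a new task (a sketch)."""
    # Imported here so this file can be read and imported without PyTorch installed.
    import torch.nn as nn
    from torchvision import models

    weights = models.ResNet50_Weights.DEFAULT if pretrained else None
    model = models.resnet50(weights=weights)

    # Assumption: freeze the pretrained layers so only the new head is trained.
    for param in model.parameters():
        param.requires_grad = False

    # The ImageNet head predicts 1000 classes; swap in one sized for our classes.
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    return model
```

Replacing only the final layer keeps the features learned from ImageNet and lets a small dataset adjust just the last step of the prediction.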

The Process

Preliminary Predictions

Any pretrained CNN can give predictions on data before it is trained further, so we tried this with our pneumonia dataset. We gave it 12 images; it identified all of the normal X-ray images as normal but failed to identify any pneumonia cases. We concluded that the image database the model was trained on included X-ray images of normal lungs, and so it was only successful at identifying normal lungs. We therefore had to train the model so it could predict both infected and non-infected cases. We did not try preliminary predictions with the COVID-19 dataset, since we expected similar results.

Training for New Data

We trained the pretrained model differently from the traditional way: instead of multiple epochs, we used only 1 epoch and evaluated the model at regular step intervals. We found through research that this would be the best approach since we had a limited amount of data. We therefore set a condition that the model would be evaluated every 5 steps. Additionally, for both models we added the condition that training should stop once the model reached 90% accuracy. We did this because the accuracy of our models only fell after reaching 90% if we kept training them.
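The control flow of this training scheme can be sketched in plain Python as follows. The function train_one_epoch and its train_step and evaluate callbacks are hypothetical names for this example; in practice they would wrap the PyTorch training step and validation pass. The toy stand-ins at the bottom simply pretend that accuracy rises by two percentage points per step.

```python
def train_one_epoch(batches, train_step, evaluate,
                    eval_every=5, target_accuracy=0.90):
    """Run a single pass over the data, checking accuracy every few steps.

    Returns (steps_taken, history), where history is a list of
    (step, accuracy) pairs recorded at each evaluation.
    """
    history = []
    steps = 0
    for batch in batches:
        train_step(batch)
        steps += 1
        if steps % eval_every == 0:
            accuracy = evaluate()
            history.append((steps, accuracy))
            if accuracy >= target_accuracy:
                break   # stop as soon as the target is reached
    return steps, history

# Toy stand-ins: each training step gets 2 more images out of 100 right.
state = {"correct": 50}

def fake_train_step(batch):
    state["correct"] += 2

def fake_evaluate():
    return state["correct"] / 100

steps, history = train_one_epoch(range(1000), fake_train_step, fake_evaluate)
```

In this toy run the loop evaluates at steps 5, 10, 15 and 20 and stops at step 20, when the simulated accuracy first reaches 90%, without using the remaining batches.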

How did we test the model?

We could not find a way to create confusion matrices for this model, so we evaluated it the same way as we did for the preliminary predictions. We used the show predictions function again, and the program displayed the predictions of the trained model.

Pneumonia and Normal

The accuracy of the pneumonia and normal model was 89%, and it took 840 steps to reach this accuracy. The model did not reach a higher accuracy, so we decided to stop here despite our 90% accuracy condition. We tested the model and received around 6 false positives. The model did not perform as well as we had expected, but the results were still within the margin of error we deemed acceptable. We believe the lower accuracy was due to the data imbalance between the normal and pneumonia datasets.

Covid-19 and Normal

With this model we saw a much higher accuracy rate, around 98%. This was surprising, since we had expected a lower accuracy rate given the much smaller dataset and the fact that we did not augment any of our data for either model. When we reviewed the predictions, we did not find any false positives or false negatives, the exact opposite of what we had expected. Of all our models, this one performed the best. However, we suspect that overfitting might be occurring, since we reached such a high accuracy.

Conclusions

  • Transfer learning was more efficient and accurate, and we observed fewer false positives in the test data. Additionally, it took much less time to train, and it reported evaluations every 5 steps, which showed the model's progress more usefully than the loss and accuracy alone.

  • Large datasets of images are necessary. We realized that for CNNs in the medical imaging field to be efficient and accurate, they must have large datasets to train on, since augmentation can only be used to an extent.

  • Data imbalance leads to less accurate models. An imbalance between the classes lowers the accuracy, because the model ends up without enough data to train on for the smaller class.

  • There is potential for machine learning in the medical imaging field if more data is collected and better models are constructed.

Future Steps

  • The main way CNN-based projects like this one can be improved is with larger datasets. In the future I aim to gather much more data and observe how well the model does then.

  • More fine-tuning for the transfer learning model. The transfer learning model could be tuned further, and it would be interesting to see how different ResNet models train on the data and what results they give.

  • Creating a web app where users can upload images of their lung X-rays for a comprehensive check and receive a result showing what the model detects.

  • Include Lung Cancer detection


Sources and Acknowledgements

Data sources:

https://www.kaggle.com/tawsifurrahman/covid19-radiography-database

https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia

CNN Model Guide: https://www.youtube.com/watch?v=-5MrYeGb1iI

https://github.com/AndyyZhu/PneumoniaDetectionModel

Data Augmentation Code: https://gist.github.com/tomahim/9ef72befd43f5c106e592425453cb6ae

Model Used: ResNet-50 from PyTorch

Acknowledgements: We would like to acknowledge Professor Donald Smith who guided us along the way to create this machine learning project.