A wide variety of neurological imaging technologies is available to scientists who want to visualize the human brain. Computed tomography (CT) and magnetic resonance imaging (MRI) are often used to produce detailed images of the brain's anatomy, while techniques like fMRI and EEG are used to measure brain activity and stimulation.
For this study, we use MRI images taken from three reputable databases. Each database is sourced from a scientific institute with a focus on neurodegenerative disease and Alzheimer's.
For this process we used MATLAB, alongside a variety of developer toolboxes. Our data pre-processing pipeline consisted of the following three steps:
Image Normalization
Region of Interest (ROI) Segmentation
ImageID-Based Data Collection
All MRI images have slight variations in the placement and angle of the brain. To apply the ROI Segmentation process properly, all images must have the brain in the same place within the frame. Image Normalization is the process that makes all images uniform in their positioning, so that the only remaining variation between images lies in the content of the brain itself rather than in factors such as positioning and angle. To accomplish this, we used the SPM12 MATLAB toolbox.
MRI images are used to obtain volumetric data on individual regions of the brain. When the brains of many CN people are taken in aggregate, common patterns can be observed in the volume of specific regions. The same is true of people in the MCI and AD categories. To measure in as much detail as we possibly can, we used 8 different templates of the brain and the CAT12 MATLAB toolbox to apply this segmentation to our normalized MRI images from the above databases. This generated 914 individual region-of-interest points per MRI image, each of which corresponds to a unique portion of the brain.
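To illustrate how region volumes from several templates can end up as one set of uniquely named points per image, the following is a minimal Python sketch. It is not the CAT12 procedure itself: the function name flatten_roi_volumes, the template names, the region names, and the volume values are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def flatten_roi_volumes(
    template_volumes: Dict[str, Dict[str, float]]
) -> Tuple[List[str], List[float]]:
    """Flatten per-template ROI volumes into one ordered list of named values.

    Each name is prefixed with its template so that regions sharing a label
    in different templates stay distinct.
    """
    names: List[str] = []
    values: List[float] = []
    # Sort templates and regions so every image yields the same column order.
    for template in sorted(template_volumes):
        regions = template_volumes[template]
        for region in sorted(regions):
            names.append(f"{template}:{region}")
            values.append(regions[region])
    return names, values


# Hypothetical input: two templates with made-up regions and volumes.
example = {
    "template_b": {"hippocampus_left": 3.1, "hippocampus_right": 3.3},
    "template_a": {"amygdala_left": 1.4},
}
names, values = flatten_roi_volumes(example)
```

Fixing the ordering in this way means that the same position in the list refers to the same region for every image, which is what allows the points to be compared across patients.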
When the ROI segmentation is complete, we are given a variety of .xml and .csv files containing all of the relevant information. The program generates these files in a separate folder for each patient, but we wanted our model to ingest a single .csv file containing the ROI information for all patients. We therefore wrote a Python script that takes the individual files and combines them into one. This change allowed for easy ingestion, and potentially far faster processing times.
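The combining step can be sketched roughly as follows. This is not the script used in the study, and it rests on assumptions that the text does not confirm: that each patient folder is named after its ImageID, that each folder holds one or more .csv files with a header row, and that the function name combine_roi_csvs, the file name roi.csv, and the sample values are purely hypothetical.

```python
import csv
import tempfile
from pathlib import Path
from typing import Dict, List


def combine_roi_csvs(root: Path, output_path: Path, pattern: str = "*.csv") -> int:
    """Merge the CSV files found in each sub-folder of root into one file.

    Assumes each sub-folder name is the image identifier and that every CSV
    file is well formed with a header row. Returns the number of rows written.
    """
    rows: List[Dict[str, str]] = []
    columns: List[str] = ["image_id"]
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        for csv_file in sorted(folder.glob(pattern)):
            with csv_file.open(newline="") as handle:
                for record in csv.DictReader(handle):
                    row = {"image_id": folder.name, **record}
                    # Track every column seen, preserving first-seen order.
                    for key in row:
                        if key not in columns:
                            columns.append(key)
                    rows.append(row)
    with output_path.open("w", newline="") as handle:
        # restval fills in blanks for columns a given file did not provide.
        writer = csv.DictWriter(handle, fieldnames=columns, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)


# Small demonstration on temporary folders with made-up values.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for image_id, volume in [("I0001", "3.10"), ("I0002", "2.85")]:
        folder = root / image_id
        folder.mkdir()
        (folder / "roi.csv").write_text(f"hippocampus_left\n{volume}\n")
    combined = root / "combined.csv"
    n_rows = combine_roi_csvs(root, combined)
    with combined.open(newline="") as handle:
        merged = list(csv.DictReader(handle))
```

Carrying the folder name into an image_id column keeps each row traceable to its source scan once the per-patient folders are no longer part of the input.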