By Yusuke Kikuchi
Disorders in the human body are not always visible from the outside. The body is mostly covered by skin, and it is neither practical nor safe to open up a patient’s body every time a doctor needs to check what is happening inside. Most medical imaging technologies are non-invasive or only minimally invasive. In addition to a patient’s visible symptoms, medical images offer a view of the inner state of the body. Hence, medical images and their interpretation are a vital part of daily clinical practice.
The first medical imaging technology was the X-ray. X-rays were identified by the German physics professor Wilhelm Röntgen in 1895, and he applied them to image the human body. Röntgen received the first Nobel Prize in Physics in 1901 for this work. Following the X-ray, many other medical imaging methods have been invented, helping doctors understand diseases more deeply. Well-known examples include the following.
X-ray: X-rays are a form of electromagnetic radiation. Visible light belongs to the same electromagnetic family, but X-rays are invisible and carry much higher energy, which allows them to travel through the human body. Some of the X-rays are absorbed by structures such as bones or tumors, and the resulting gradation reveals the inner structure of the body. X-rays are mainly used to detect injuries such as broken bones and diseases of the lungs.
CT: Short for Computed Tomography. CT also uses X-rays, but it produces a 3D view of the human body. X-rays are projected through the patient’s body from many directions, and the rays that pass through are collected and analyzed by a computer to reconstruct a 3D scan. CT scans are used to detect problems in bones and joints, cancers, and heart disease.
MRI: An acronym for Magnetic Resonance Imaging. MRI uses a very strong magnet to generate a magnetic field and captures the response of hydrogen nuclei in the body to image its interior. MRI also produces 3D scans, and it carries less risk than X-ray-based methods because it does not use ionizing radiation. MRI is mainly used to scan the brain.
Because of its importance and advances in technology, the number of medical images taken each year keeps increasing; the WHO estimated that 3.6 billion medical images were taken in 2016. As a result, the workload of doctors, radiologists, and imaging specialists is overwhelming. In addition, automating the extraction of information from medical images has been a longstanding problem in the field of medical imaging. Because medical images are very high-dimensional objects, the problem has been very challenging, and the assistance computers could provide was substantially limited. This situation changed drastically with the emergence of deep learning.
Deep learning is an area of machine learning that uses deep neural networks to learn a target task. The technology itself is not new; research on neural networks began in the early 1940s. However, training a neural network, especially a large and powerful one, requires a lot of data and plenty of computational resources, neither of which was available at the time, so the field had to endure a long winter period. The stage for deep learning was slowly set by the early 2010s. Aside from advances in the theory of neural networks, two key technological developments helped deep learning flourish.
Amount of data: As computers became ubiquitous, most data came to be stored and processed digitally, providing the large datasets that neural networks need.
Advancement in computer hardware: Computer hardware improved significantly. In particular, the advancement of the GPU (graphics processing unit) was important. A GPU can process data in parallel far more efficiently than the usual computational hardware, the CPU (central processing unit).
While deep learning has many subfields, the area involving image processing is called computer vision, and it is one of the fields that has evolved the most. A large part of this evolution was led by the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). ImageNet is a real-world image dataset consisting of more than 14 million annotated images; the challenge’s classification task uses 1,000 categories such as sports car, eel, and iPod. State-of-the-art architectures from this benchmark are used as base architectures for many computer vision applications. The big wave of computer vision technology soon reached medical image analysis. In the next few paragraphs, we will look at a few sets of computer vision applications in medical image analysis.
The most straightforward application would be disease detection. Technically, this is a binary classification problem in which a medical image from a patient is the input and the output is the probability of the patient having the target disease. If there are stages of severity, we can turn the problem into a multi-class classification (i.e., the output is the probability of benign, stage 1, stage 2, and so on) to incorporate that aspect. An example is the detection and grading of diabetic retinopathy (DR), a diabetic complication that damages the retina. In the diagnosis of DR, an image modality called the color fundus photograph (CFP) is used. A CFP is captured with a fundus camera after dilating the pupil and shows the en face appearance of the retina. One of the pioneering works was done by a group led by researchers at Google. Their model achieved high sensitivity and high specificity and demonstrated the effectiveness of the deep learning approach in medical image analysis.
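To make the classification framing concrete, here is a minimal sketch of how such a model is commonly set up, assuming PyTorch and torchvision. The backbone, the five severity stages, and all hyperparameters are illustrative assumptions, not the published Google model.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_STAGES = 5  # illustrative: no DR, mild, moderate, severe, proliferative

# Start from an ImageNet-pretrained backbone and replace its 1,000-class head.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
model.fc = nn.Linear(model.fc.in_features, NUM_STAGES)

criterion = nn.CrossEntropyLoss()  # multi-class grading; a single-unit head with
                                   # BCEWithLogitsLoss would give binary detection
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    """images: (B, 3, H, W) fundus photographs; labels: (B,) integer stages."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()

# At inference time, softmax turns the logits into per-stage probabilities.
with torch.no_grad():
    probs = torch.softmax(model(torch.randn(1, 3, 224, 224)), dim=1)
```

Reusing a pretrained ImageNet backbone in this way is the “base architecture” idea mentioned above: only the final layer is replaced and the whole network is fine-tuned on the medical images.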
Another popular application is segmentation, whose purpose is to locate a target object and draw its boundary. The object can be an organ, a tumor, a disease lesion, a blood vessel, and so on. The segmentation itself is valuable, but it can also feed downstream tasks. For example, one can compute the volume of a cancer to monitor the progression of the disease, or use statistics such as volume, area, and length to predict disease risk, progression rate, and so on.
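As an illustration of such a downstream computation, the sketch below estimates a lesion’s volume from a predicted 3D binary mask. The array shape and voxel spacing here are made-up assumptions; a real pipeline would read them from the scan’s metadata (for example, DICOM or NIfTI headers).

```python
import numpy as np

def lesion_volume_ml(mask: np.ndarray, voxel_spacing_mm=(1.0, 0.8, 0.8)) -> float:
    """mask: (D, H, W) array of 0/1 predictions from a segmentation network.
    voxel_spacing_mm: physical size of one voxel along each axis, in millimeters."""
    voxel_volume_mm3 = float(np.prod(voxel_spacing_mm))
    n_voxels = int(mask.astype(bool).sum())
    return n_voxels * voxel_volume_mm3 / 1000.0  # 1 mL = 1000 mm^3

# Example: a synthetic 64x128x128 mask with a small "lesion" in the middle.
mask = np.zeros((64, 128, 128), dtype=np.uint8)
mask[30:34, 60:70, 60:70] = 1
print(f"Estimated volume: {lesion_volume_ml(mask):.2f} mL")
```

Tracking this number across visits is one simple way the segmentation output turns into a clinically meaningful measurement.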
A more creative way of using AI is biomarker discovery. This area has not yet been explored extensively in research or in practice, but the author strongly believes it is an application for the next generation. A biomarker is a measurable indicator of disease activity; in medical image analysis in particular, radiologists and physicians look for imaging biomarkers in the image. When we train a neural network, we do not tell it where to look. In fact, the network finds the important image features by looking at many images. Simply speaking, we can learn which image features the network uses by backtracking where it looks. However, deep neural networks are highly complicated, so it is very hard to understand what they do inside. That said, AI has an apparent advantage: it can look at much smaller objects. The changes caused by a disease are usually very small in its early stages, so if AI can find those small biomarkers, our ability to detect diseases early would be greatly increased.
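One simple way to “backtrack where the network looks” is a gradient-based saliency map, sketched below under the assumption of an already trained PyTorch image classifier. This is only one of many attribution methods (Grad-CAM and integrated gradients are common alternatives); the model and image here are placeholders, and any candidate biomarker such a map highlights would still need expert review.

```python
import torch

def saliency_map(model: torch.nn.Module, image: torch.Tensor, target_class: int) -> torch.Tensor:
    """image: (1, 3, H, W) tensor. Returns an (H, W) map of |d score / d pixel|."""
    model.eval()
    image = image.clone().requires_grad_(True)
    score = model(image)[0, target_class]  # logit of the class of interest
    score.backward()                       # gradients flow back to the input pixels
    return image.grad.abs().max(dim=1)[0].squeeze(0)  # max over color channels

# Pixels with large values influenced the prediction the most, so clusters of
# them are candidate imaging biomarkers to show a radiologist or physician.
```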
In this article, we reviewed AI and medical image analysis. We focused on the positive side of AI, but AI is not perfect.