Comparing the Architecture and Performance of AlexNet, Faster R-CNN, and YOLOv4 in the Multiclass Classification of Alzheimer Brain MRI Scans
Alzheimer’s Disease (AD) is a common and ultimately fatal illness characterized by progressive memory loss, disorientation, and pathological markers, including senile plaques and neurofibrillary tangles in the brain. It is the sixth leading cause of death in the United States and the most common form of dementia, accounting for up to 80% of dementia diagnoses. According to the Centers for Disease Control and Prevention (CDC), more than five million people were living with AD in 2014, a figure expected to nearly triple to fourteen million by 2060. Despite its prevalence, there is currently no cure for this degenerative disease.
Although advancements in diagnostic imaging such as magnetic resonance imaging (MRI) have led to a greater understanding of the diagnosis and treatment of AD, medical professionals are still required to analyze the images manually, a time-consuming and error-prone process. With the help of neural network models, diagnoses can be reached more accurately and efficiently.
In this study, we compared the performance of three well-known CNN-based algorithms (AlexNet, Faster R-CNN, and YOLOv4) to determine which performed multiclass classification of brain MRI scans of AD patients most accurately. The dataset, obtained from Kaggle, contained 6400 training and testing MRI images divided into four classes (NonDemented, VeryMildDemented, MildDemented, and ModerateDemented). Because the ModerateDemented class was severely underrepresented, additional images were generated for that class through data augmentation.
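The class-balancing step can be sketched in a minimal, framework-free way. The flip helpers and the toy 2x2 "scan" below are illustrative stand-ins, not the study's actual augmentation pipeline, which operated on full MRI slices.

```python
import random

def hflip(img):
    """Mirror each pixel row left-to-right."""
    return [row[::-1] for row in img]

def vflip(img):
    """Reverse the order of rows (top-to-bottom mirror)."""
    return img[::-1]

def augment(images, target_count, seed=0):
    """Grow an under-represented class to target_count by appending
    randomly flipped copies of existing images (hypothetical sketch)."""
    rng = random.Random(seed)
    out = list(images)
    while len(out) < target_count:
        src = rng.choice(images)
        out.append(rng.choice([hflip, vflip])(src))
    return out

sample = [[1, 2], [3, 4]]        # toy 2x2 "scan" standing in for an MRI slice
balanced = augment([sample], 5)  # grow a 1-image class to 5 images
print(len(balanced))             # 5
```

In practice a richer set of transforms (rotations, brightness shifts, crops) is usually applied, but the balancing logic is the same: sample from the minority class and append transformed copies until the class sizes are comparable.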
Experiments were conducted using Google Colab’s Tesla P100 GPU. Transfer learning was applied to all three pre-trained models and the datasets were adjusted according to their respective parameters.
AlexNet: The pre-trained AlexNet model and its parameter values were imported from the PyTorch library and trained on the Alzheimer's brain MRI dataset. Input batches of 32 images, each 227 pixels high by 227 pixels wide with 3 channels, were used, and the model was trained for 170 epochs.
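The 227 x 227 x 3 input size is the one AlexNet's convolutional stack was designed around, and the arithmetic can be checked with the standard output-size formula. The layer hyperparameters below are AlexNet's published ones, not values specific to this study.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a conv/pool layer: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1

# Trace AlexNet's feature extractor on a 227x227 input.
s = conv_out(227, kernel=11, stride=4)   # conv1 -> 55
s = conv_out(s, kernel=3, stride=2)      # pool1 -> 27
s = conv_out(s, kernel=5, padding=2)     # conv2 -> 27
s = conv_out(s, kernel=3, stride=2)      # pool2 -> 13
s = conv_out(s, kernel=3, padding=1)     # conv3 -> 13
s = conv_out(s, kernel=3, padding=1)     # conv4 -> 13
s = conv_out(s, kernel=3, padding=1)     # conv5 -> 13
s = conv_out(s, kernel=3, stride=2)      # pool3 -> 6
print(256 * s * s)                       # 9216 features feed the classifier head
```

For transfer learning, only the final fully connected layer needs to change, from the 1000 ImageNet classes to the 4 dementia classes; the 9216-feature interface computed above stays the same.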
Faster R-CNN: The Detectron2 implementation of Faster R-CNN (X101-FPN) was employed in this study. Detectron2 is an open-source object detection platform based on the PyTorch library that includes implementations of Faster R-CNN and other computer vision algorithms. The Detectron2 API provides the model architecture as well as its pre-trained weights, which are based on the COCO (Common Objects in Context) dataset. The Alzheimer's brain MRI images were converted into the COCO JSON format, and the model was trained for 8000 epochs.
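For reference, a COCO-format JSON file pairs image records with annotations and category definitions. The snippet below is a hypothetical single-image example using this study's four classes; the file name and the whole-image bounding box are illustrative assumptions, not taken from the actual converted dataset.

```python
import json

# Hypothetical minimal COCO-style record for one MRI slice.
coco = {
    "images": [
        {"id": 1, "file_name": "mild_001.jpg", "width": 227, "height": 227}
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 3,
            "bbox": [0, 0, 227, 227],   # [x, y, width, height]
            "area": 227 * 227,
            "iscrowd": 0,
        }
    ],
    "categories": [
        {"id": 1, "name": "NonDemented"},
        {"id": 2, "name": "VeryMildDemented"},
        {"id": 3, "name": "MildDemented"},
        {"id": 4, "name": "ModerateDemented"},
    ],
}

with open("alzheimer_train.json", "w") as f:
    json.dump(coco, f)
```

Because the task here is whole-image classification recast as detection, each annotation's bounding box can simply cover the full scan, with the class label carried by `category_id`.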
YOLOv4: YOLOv4 was implemented using Darknet, an open-source neural network framework written by Joseph Redmon. The initial weights of the Darknet YOLOv4 model were based on the COCO dataset, and the model was subsequently trained on the Alzheimer's brain MRI dataset for 8000 epochs.
Post-augmentation, AlexNet had the highest mAP (mean average precision), detecting the object of interest 100% of the time, while Faster R-CNN and YOLOv4 achieved mAPs of 99% and 84%, respectively. However, YOLOv4 performed best on the confusion matrix, especially for the ModerateDemented images.
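Per-class precision and recall, the quantities a confusion matrix exposes, can be computed as follows. The matrix entries here are invented for illustration and are not the study's actual results; they only show why a rare class such as ModerateDemented is where models diverge.

```python
# Hypothetical 4-class confusion matrix (rows = true class, cols = predicted).
classes = ["NonDemented", "VeryMildDemented", "MildDemented", "ModerateDemented"]
cm = [
    [630, 5, 3, 2],
    [8, 440, 0, 0],
    [4, 2, 170, 3],
    [0, 0, 2, 12],    # only 14 true ModerateDemented examples
]

def precision(cm, i):
    """Fraction of class-i predictions that are correct (column-wise)."""
    col = sum(row[i] for row in cm)
    return cm[i][i] / col if col else 0.0

def recall(cm, i):
    """Fraction of true class-i examples that are found (row-wise)."""
    return cm[i][i] / sum(cm[i])

for i, name in enumerate(classes):
    print(f"{name}: precision={precision(cm, i):.2f} recall={recall(cm, i):.2f}")
```

A model can post a high overall mAP while still missing many examples of the smallest class, which is why the per-class confusion matrix, rather than mAP alone, separated YOLOv4 from the others on the ModerateDemented images.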
Our experiments showed that a one-stage detector such as YOLOv4 can be both faster and more accurate than a two-stage detector such as Faster R-CNN. By successfully applying these models to medical image diagnosis, this study opens avenues for future research and development.
(This research was conducted by an AI4ALL team that included Ria Mirchandani, Caroline Yoon, Sonica Prakash, Archita Khaire, Alyssia Naran, Anupama Nair, and Supraja Ganti.)