Deep Residual Nets for Improved Alzheimer’s Diagnosis
Aly Valliani
Swarthmore College
aavalliani@gmail.com
Ameet Soni
Swarthmore College
soni@cs.swarthmore.edu
ABSTRACT
The eld of image analysis has seen large gains in recent years due
to advances in deep convolutional neural networks (CNNs). Work
in biomedical imaging domains, however, has seen more limited
success primarily due to limited training data, which is often ex-
pensive to collect. We propose a framework that leverages deep
CNNs pretrained on large, non-biomedical image data sets. Our
hypothesis, which we arm empirically, is that these pretrained
networks learn cross-domain features that improve low-level inter-
pretation of images. We evaluate our model on brain imaging data
to show our approach improves the ability to diagnose Alzheimer’s
Disease from patient brain MRIs. Importantly, our results show that
pretraining and the use of deep residual networks are crucial to
seeing large improvements in diagnosis accuracy.
ACM Reference format:
Aly Valliani and Ameet Soni. 2017. Deep Residual Nets for Improved Alzheimer’s
Diagnosis. In Proceedings of ACM-BCB ’17, Boston, MA, USA, August 20-23,
2017, 2 pages.
https://doi.org/10.1145/3107411.3108224
1 INTRODUCTION
Alzheimer’s Disease (AD) is a neurodegenerative disorder affecting over 5.3 million people in the United States. Diagnosis by experts is difficult, usually occurring well after symptoms have set in, and can only be verified via a postmortem autopsy. Compounding matters, early signs of Alzheimer’s are difficult to differentiate
from mild cognitive impairments (MCIs), which are often a result of
aging rather than onset of disease. While promising, brain imaging
remains an underutilized resource for aiding medical experts in
performing early diagnosis due to limitations of the human eye.
Automating this analysis for decision support is itself hindered by
the limited size of image data sets to help train models, particularly
deep convolutional neural networks (CNNs) [3], which have shown promise in many imaging problems. This limitation exists in many
bioimaging tasks, where the data is too expensive or sensitive to
generate in large quantities.
Suk and Shen [5] first proposed deep learning approaches for
AD diagnosis, utilizing a sparse autoencoder with a multi-modal
SVM to combine MRI and PET images from a patient. Gupta et
al. [1] showed improvements through the use of a single-layer CNN pretrained on a small set of natural images. Payan and Montana [4]
further extended this framework to 3D CNNs.

Figure 1: An overview of our approach. ResNet Feature Extractor refers to the pretrained 18-layer residual network defined by He et al. [2]. Each layer defines a typical 3x3 convolution layer (not all layers are shown) with the addition of residuals (arcing arrows across layers). The last two layers are a fully connected neural network trained from scratch for AD classification.

The main shortcom-
ing of existing CNN-based approaches is that the networks are
shallow, with only a single layer of convolutions to learn a latent
feature representation of the data. This limitation prevents the
learning of hierarchical representations of the data, which is crucial
to medical tasks where morphological changes are often subtle and
multifaceted. Learning deep networks, however, is difficult due to
vanishing gradients and the need for very large training sets.
We propose the use of pretrained residual network models for
predicting Alzheimer’s Disease from brain images. Specifically, we utilize the ResNet [2] network, which finished atop the
2015 ILSVRC ImageNet competition. This network is trained on
millions of natural images, thus overcoming the limitation of data
size. Second, the residual architecture allows for the learning of
“very deep” networks, which are empirically more accurate and
easier to optimize.
We validate our hypothesis that pretrained deep residual net-
works improve AD diagnosis by performing 3-way classification (AD vs. MCI vs. healthy) on brain MRIs provided by the Alzheimer’s Disease Neuroimaging Initiative (ADNI). Our results show that deeper and pretrained neural networks surpass shallower networks in classification accuracy.
2 APPROACH
Convolutional neural networks are hierarchical methods for end-
to-end feature learning. They are composed of a series of nonlinear
functions that transform pixels from an input image into class
scores for prediction. Convolutional layers break the input into
receptive elds where weights are tied across elds preventing

ACM-BCB ’17, August 20-23, 2017, Boston, MA, USA Aly Valliani and Ameet Soni
overparametization as in multilayer perceptrons. The layer-wise
architecture enables the network to learn increasingly abstract
spatial features to dierentiate object categories.
The residual neural network is a CNN variant that employs short-
cut connections (as seen in Fig. 1) to allow input from lower layers
of the network to be available to nodes at higher layers. These con-
nections are constructed from residual blocks that approximate a
residual function using input transformed from the preceding layer
and identity mappings of input from layers much further down
the network. Specifically, if a typical layer aims to learn a latent representation F(x), a residual block models this representation with the inclusion of an identity connection; i.e., H(x) = F(x) + x (see He et al. [2] for details). The architecture allows multiple pathways for gradients to flow through the network, which permits the
creation of much deeper networks without the burden of vanishing
gradients. The residual blocks have also been more recently con-
ceptualized as independent networks, thereby making the residual
network an ensemble of multiple independent networks.
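For illustration only (the paper's implementation used the Torch7 library; this PyTorch sketch and the class name BasicBlock are assumptions, not the authors' code), a residual block of the form H(x) = F(x) + x can be written roughly as:

```python
import torch.nn as nn

class BasicBlock(nn.Module):
    """Minimal residual block: H(x) = F(x) + x, in the spirit of He et al. [2]."""
    def __init__(self, channels):
        super().__init__()
        # F(x): two 3x3 convolutions, each followed by batch normalization
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        identity = x                          # the shortcut connection
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out = out + identity                  # H(x) = F(x) + x
        return self.relu(out)
```

The added identity term is what provides the extra gradient pathway described above.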
For our task, we construct a deep residual network consisting
of 18 layers modeled after the ResNet-18 architecture. To initialize
weights, we use the weights already learned on the ImageNet data set, as specified in He et al. [2] (hence the term pretrained). Since our task differs from ImageNet, we only take the convolutional layers as filters (i.e., we omit the fully connected classifier). We add two fully-connected layers, with 1000 and 100 hidden units respectively, that predict three outputs using a softmax classifier. The pretrained network is fine-tuned on MRI data (see below) with real-time data augmentation (affine transformations) to prevent overfitting. All networks include batch normalization after every convolutional layer and utilize the ReLU activation function. Networks were trained with mini-batch stochastic gradient descent using an early-stopping criterion.
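A rough sketch of this construction (again, not the authors' Torch7 code; torchvision's ResNet-18 is used here as a stand-in, and the handling of single-channel MRI slices is an assumption):

```python
import torch.nn as nn
from torchvision import models

def build_pretrained_resnet(num_classes=3):
    """Pretrained ResNet-18 feature extractor with a new two-layer
    fully connected head (1000 and 100 hidden units) and a 3-way output."""
    backbone = models.resnet18(pretrained=True)      # ImageNet weights
    in_features = backbone.fc.in_features            # 512 for ResNet-18
    # Discard the ImageNet classifier; train this head from scratch.
    backbone.fc = nn.Sequential(
        nn.Linear(in_features, 1000),
        nn.ReLU(inplace=True),
        nn.Linear(1000, 100),
        nn.ReLU(inplace=True),
        nn.Linear(100, num_classes),                 # softmax applied via the loss
    )
    # Note (assumption): grayscale MRI slices would need to be replicated to
    # three channels, or the first convolution adapted, to match the ImageNet stem.
    return backbone
```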
3 RESULTS AND DISCUSSION
Data used in the preparation of this article were obtained from
the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database
(adni.loni.ucla.edu). Our data set includes the median axial slice
of 660 ADNI images (only the first image for each patient was
maintained to avoid data leakage), 188 of which were diagnosed as
having Alzheimer’s Disease (AD), 243 as Mild Cognitive Impairment
(MCI) and 229 as Cognitively Normal (CN) based on examination
by a medical expert. Each image was skull-stripped and registered.
Using an 80/20 train/test set split, we aimed to assess our hypothesis
that pretrained residual networks would improve AD diagnosis. To
do so, we asked the following questions:
Q1: Does a pretrained residual network transfer to the MRI domain to improve prediction in AD diagnosis?
Q2: Does pretraining influence ResNet’s success?
Q3: Does data augmentation improve the ResNet’s ability to adapt to MRI images?
To answer these questions, we compare four different classifiers using accuracy on the 2-class AD vs. CN problem as well as the more difficult 3-way classification (AD vs. MCI vs. CN). The results on the held-aside test set are in Table 1. All approaches
are implemented using the Torch7 library.
Model                               AD vs. CN   3-way
Baseline CNN                        73.8%       49.2%
ResNet                              77.5%       50.8%
Pretrained ResNet                   78.8%       56.1%
Pretrained ResNet + augmentation    81.3%       56.8%

Table 1: Accuracy on Alzheimer’s Disease (AD) vs. Cognitively Normal (CN) classification and 3-way classification (AD vs. MCI vs. CN).
For Q1, we trained two networks: the proposed approach in
Sec. 2 (pretrained ResNet + augmentation) and a baseline CNN of
one convolutional layer containing 5x5 kernels and 64 feature maps,
and two fully-connected layers containing 1000 and 100 hidden
units, respectively, each with dropout prior to the non-linearity (to
approximate existing approaches [1, 4]). Our results show a large improvement in accuracy over the baseline CNN model on both tasks; thus we can answer Q1 affirmatively: the ResNet structure successfully adapts to the MRI domain and improves prediction.
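A sketch of such a shallow baseline follows; only the 5x5 kernels, 64 feature maps, dropout placement, and 1000/100-unit fully connected layers come from the description above, while the input slice size, pooling, and dropout rate are illustrative assumptions:

```python
import torch
import torch.nn as nn

class BaselineCNN(nn.Module):
    """One convolutional layer (5x5 kernels, 64 feature maps) followed by
    two fully connected layers with dropout before each non-linearity."""
    def __init__(self, in_channels=1, input_size=224, num_classes=3):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, 64, kernel_size=5, padding=2)
        self.pool = nn.MaxPool2d(4)                   # assumed downsampling
        flat_dim = 64 * (input_size // 4) ** 2
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(flat_dim, 1000),
            nn.Dropout(0.5),                          # dropout prior to the non-linearity
            nn.ReLU(inplace=True),
            nn.Linear(1000, 100),
            nn.Dropout(0.5),
            nn.ReLU(inplace=True),
            nn.Linear(100, num_classes),
        )

    def forward(self, x):
        out = torch.relu(self.conv(x))                # activation after conv (assumed)
        return self.head(self.pool(out))
```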
Q2 asks whether this improvement is due to pretraining, the deep residual structure, or both. Our results show that both features are important. The ResNet using randomly initialized weights improves upon the baseline CNN in both tasks. Furthermore, pretraining boasts higher accuracy on both tasks than the randomly initialized ResNet. Thus, we can answer Q2 affirmatively: both are key aspects to the result, though with different magnitudes of
importance in the two tasks we tested.
Lastly, a key method for regularizing networks and simulating
more data is the use of real-time data augmentation (affine transformations of the data through rotations, flips and translations during training). The last two rows in Table 1 examine a pretrained ResNet without and with data augmentation, respectively. We can also answer Q3 affirmatively, as augmentation improves accuracy on both tasks.
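A minimal sketch of such on-the-fly augmentation with torchvision transforms (the specific rotation and translation ranges are assumptions, since the paper does not report them):

```python
from torchvision import transforms

# Applied during training only, so every epoch sees slightly different
# affine variants (rotations, flips, translations) of each slice.
train_transform = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomAffine(degrees=10, translate=(0.05, 0.05)),
    transforms.ToTensor(),
])
```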
The results of this initial work show that our framework makes
signicant contributions through the use of both pretraining and
very deep residual neural networks. As part of future work, we
hope to understand more closely the contribution of pretraining
versus depth. Additionally, we are currently extending the network
to use 3D convolutions and exploring other avenues for volumetric
analysis. Finally, we plan to transfer our model to the more im-
portant medical question of early diagnosis: can we predict which
patients with MCI are likely to later develop Alzheimer’s Disease?
REFERENCES
[1] Ashish Gupta, Murat Ayhan, and Anthony Maida. 2013. Natural Image Bases to Represent Neuroimaging Data. In Proceedings of the 30th International Conference on Machine Learning. Atlanta, Georgia, USA, 987–994.
[2] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2015. Deep Residual Learning for Image Recognition. CoRR abs/1512.03385 (2015).
[3] Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. 1998. Gradient-based learning applied to document recognition. Proc. IEEE 86, 11 (1998), 2278–2324.
[4] Adrien Payan and Giovanni Montana. 2015. Predicting Alzheimer's disease: a neuroimaging study with 3D convolutional neural networks. CoRR abs/1502.02506 (2015).
[5] Heung-Il Suk and Dinggang Shen. 2013. Deep Learning-Based Feature Representation for AD/MCI Classification. In Proceedings of the 16th International Conference on Medical Image Computing and Computer-Assisted Intervention. 583–590.