Alzheimer's Dementia Recognition through Spontaneous Speech
The ADReSS Challenge
Dementia is a category of neurodegenerative diseases that entails
a long-term and usually gradual decrease of cognitive functioning.
The main risk factor for dementia is age, and therefore its
greatest incidence is amongst the elderly. Due to the severity of
the situation worldwide, institutions and researchers are
investing considerably in dementia prevention and early detection,
focusing on disease progression. There is a need for
cost-effective and scalable methods for detection of dementia from
its most subtle forms, such as the preclinical stage of Subjective
Memory Loss (SML), to more severe conditions like Mild Cognitive
Impairment (MCI) and Alzheimer's Dementia (AD) itself.
While a number of studies have investigated speech and language
features for the detection of Alzheimer's Disease and mild
cognitive impairment, and proposed various signal processing and
machine learning methods for this prediction task, the field still
lacks balanced and standardised data sets on which these different
approaches can be systematically compared.
The main objective of the ADReSS challenge is to make
available a benchmark dataset of
spontaneous speech, which is acoustically pre-processed
and balanced in terms of age and gender, defining a shared task
through which different approaches to AD recognition in spontaneous
speech can be compared. We expect that this challenge will bring
together groups working on this active area of research, and
provide the community with the very first comprehensive comparison
of different approaches to AD recognition using this
benchmark dataset.
In sum:
- The ADReSS Challenge will target a difficult automatic
prediction problem of societal and medical relevance, namely, the
detection of cognitive impairment and
Alzheimer's Dementia (AD). To the best of our knowledge, this will
be the first such shared-task event focused on AD.
- While a number of researchers have proposed speech processing and
natural language processing approaches to AD recognition through
speech, their studies have used different, often unbalanced and
acoustically varied data sets, consequently hindering
reproducibility and comparability of approaches. The ADReSS
Challenge will provide a forum for those different research groups
to test their existing methods (or develop novel approaches) on a
new shared standardized dataset.
- The ADReSS Challenge dataset has been carefully selected so as to
mitigate common biases often overlooked in evaluations of AD
detection methods, including repeated occurrences
of speech from the same participant (common in longitudinal
datasets), variations in audio quality, and imbalances of gender and
age distribution.
- Unlike some tests performed in clinical settings, where short
speech samples are collected under controlled conditions, this task
focuses on AD recognition using spontaneous speech.
How to participate
The ADReSS challenge consists of two tasks:
- an AD classification task, where you are required to produce a
model to predict the label (AD or non-AD) for a speech session. Your
model can use speech data, language data (transcripts are
provided), or both.
- an MMSE score regression task, where you will create a model to
infer the subject's Mini-Mental State Examination (MMSE) score
based on speech and/or language data.
You may choose to do one of these tasks, or both. You will be
provided with access to a training set (see relevant section below), and two weeks prior to the paper
submission deadline you will be given access to a separate set on which to
test your model. You may send your results to us for scoring up to 5 times.
You are required to submit all your attempts (up to 5 per task) together, in
separate files (see detailed instructions in the Readme file
distributed with the test set, below).
You will also be expected to
submit a paper to INTERSPEECH 2020, describing your approach and
results. If your paper is accepted, it will be presented at the
conference in the ADReSS special session.
Access to the data set
In order to gain access to the ADReSS data, you will need
to become a member of
DementiaBank (free of charge) by contacting Brian MacWhinney
by email. You should include
your contact information and affiliation, as well as a general
statement on how you plan to use the data, with specific mention of
the ADReSS challenge. If you are a student, please ask your
supervisor to join as a member as well. This membership will give you full
access to the DementiaBank database, where the ADReSS data set will be
available and clearly identified. For further information,
visit DementiaBank.
Once you have become a member of DementiaBank, please email us
at Fasih.Haider@ed.ac.uk for further
instructions.
The test data are now available! Please email ADReSS_is2020@ed.ac.uk for instructions on how to download them.
The data set
The DementiaBank directory to which you will gain access will contain
only the training data for the ADReSS Challenge. This will
consist of four folders of data (full enhanced
audio, normalised sub-chunks, transcriptions) as well as two text
files with information on age, gender and MMSE scores for participants
with and without a diagnosis of AD (cc_meta_data.txt,
cd_meta_data.txt). A README file is also included for further
details.
The composition of the full dataset is shown below:
Age Interval | AD Male | AD Female | non-AD Male | non-AD Female |
[50, 55) | 2 | 0 | 2 | 0 |
[55, 60) | 7 | 6 | 7 | 6 |
[60, 65) | 4 | 9 | 4 | 9 |
[65, 70) | 9 | 14 | 9 | 14 |
[70, 75) | 9 | 11 | 9 | 11 |
[75, 80) | 4 | 3 | 4 | 3 |
Total | 35 | 43 | 35 | 43 |
Each session was segmented for voice activity using a voice activity
detection system based on a signal energy threshold. We set the log
energy threshold parameter to 65 dB with a maximum duration of 10
seconds per speech segment. The segmented dataset contains 1,955
speech segments from 78 non-AD subjects and 2,122 speech segments from
78 AD subjects. The average number of speech segments produced per
participant was 24.86 (standard deviation = 12.84). Audio volume was
normalised across all speech segments to control for variation caused
by recording conditions, such as microphone placement.
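The segmentation and normalisation steps described above can be illustrated with a minimal sketch. This is not the organisers' actual pipeline: the 25 ms frame length, the use of raw 16-bit-style sample amplitudes as the dB reference, the peak-based normalisation and the function names energy_vad and peak_normalise are all assumptions made for illustration. Only the 65 dB log energy threshold and the 10-second maximum segment duration come from the text.

```python
import numpy as np


def energy_vad(signal, sample_rate, threshold_db=65.0, frame_ms=25.0, max_segment_s=10.0):
    """Return (start_sample, end_sample) spans whose frame log energy exceeds threshold_db.

    Assumption: energy is measured on the raw sample values, so the meaning of
    threshold_db depends on how the audio samples are scaled.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    if frame_len < 1:
        raise ValueError("frame length must be at least one sample")
    n_frames = len(signal) // frame_len
    frames = np.asarray(signal[: n_frames * frame_len], dtype=np.float64)
    frames = frames.reshape(n_frames, frame_len)
    # Mean-square energy per frame in dB; the small constant avoids log10(0).
    energy_db = 10.0 * np.log10(np.mean(frames ** 2, axis=1) + 1e-12)
    active = energy_db > threshold_db
    max_frames = int(max_segment_s * 1000 / frame_ms)

    segments = []
    start = None
    for i, is_active in enumerate(active):
        if is_active and start is None:
            start = i
        # Close the current segment on silence or when it reaches the maximum length.
        if start is not None and (not is_active or i - start >= max_frames):
            segments.append((start * frame_len, i * frame_len))
            start = i if is_active else None
    if start is not None:
        segments.append((start * frame_len, n_frames * frame_len))
    return segments


def peak_normalise(segment, target_peak=0.9):
    """Scale a segment so its largest absolute sample equals target_peak (assumed method)."""
    segment = np.asarray(segment, dtype=np.float64)
    if segment.size == 0:
        return segment
    peak = np.max(np.abs(segment))
    return segment if peak == 0 else segment * (target_peak / peak)


if __name__ == "__main__":
    sr = 16000
    t = np.arange(12 * sr) / sr
    tone = 5000.0 * np.sin(2 * np.pi * 200 * t)  # 12 s of synthetic "speech"
    audio = np.concatenate([np.zeros(sr), tone, np.zeros(sr)])
    for begin, end in energy_vad(audio, sr):
        print(begin / sr, end / sr, np.max(np.abs(peak_normalise(audio[begin:end]))))
```

In this sketch a stretch of activity longer than the maximum duration is simply split at the 10-second mark, which is one of several reasonable ways to enforce the limit.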
Performance Metrics
Task 1 (AD classification) will be evaluated through the following metrics:
\[ \displaystyle \operatorname {Accuracy} = {\frac { TN + TP }{N} } \]
and
\[ \displaystyle \operatorname {F_1} = { 2 \frac { \pi \times \rho
}{\pi + \rho} } \]
where
\[ \displaystyle \operatorname {\pi} = { \frac { TP }{TP + FP} }, \]
\[ \displaystyle \operatorname {\rho} = { \frac { TP }{TP + FN} }, \]
N is the number of patients, TP is the number of true
positives, TN is the number of true negatives, FP is the number of
false positives and FN the number of false negatives.
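As an illustration of the Task 1 metrics, the following sketch computes accuracy, precision (π), recall (ρ) and F1 from a list of true and predicted labels. It assumes that AD is the positive class (encoded as 1) and non-AD the negative class (encoded as 0), and it returns 0.0 where a ratio would otherwise divide by zero. Both choices, like the function name classification_metrics, are assumptions of this example and not part of the official scoring procedure.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (1 = AD, 0 = non-AD)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    n = len(pairs)

    # Assumed convention: undefined ratios are reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0

    return {
        "accuracy": (tn + tp) / n,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    truth = [1, 1, 1, 0, 0, 0]
    predicted = [1, 1, 0, 0, 0, 1]
    print(classification_metrics(truth, predicted))
```

In the usage example TP = 2, TN = 2, FP = 1 and FN = 1, so accuracy, precision, recall and F1 all equal 2/3.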
Task 2 (MMSE prediction) will be evaluated using the root mean squared error:
\[ \displaystyle \operatorname {RMSE} ={\sqrt {\frac {\sum _{i=1}^{N}({\hat {y}}_{i}-y_{i})^{2}}{N}}}, \]
where $\hat{y}_i$ is the predicted MMSE score and $y_i$ is the actual MMSE score
of patient $i$.
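The RMSE formula can be checked on a few values with a short sketch such as the one below. The function name rmse and the example scores are hypothetical and are not taken from the challenge data.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error between actual and predicted MMSE scores."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    squared_errors = [(p - t) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    actual = [30, 24, 18, 12]     # hypothetical MMSE scores
    predicted = [28, 25, 20, 15]  # hypothetical model outputs
    print(rmse(actual, predicted))
```

Here the squared errors are 4, 1, 4 and 9, giving a mean of 4.5 and an RMSE of about 2.12.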
Baseline Results
A basic set of baseline results can be found in the paper
below. Papers submitted to this Challenge using the ADReSS dataset
need to cite it as follows:
- S. Luz, F. Haider, S. de la Fuente, D. Fromm, and B. MacWhinney.
Alzheimer's dementia recognition through spontaneous speech: The
ADReSS challenge.
In Proceedings of INTERSPEECH 2020, Shanghai, China, 2020.
Test set labels and MMSE scores
The test set prediction targets are now available for download through
this link. These could be useful, for instance, if you wish to
prepare an extended version of your paper for our Frontiers Special
Research Topic on Alzheimer's dementia recognition through
spontaneous speech.
Important Dates
- January 24, 2020: ADReSS Challenge announced, training data made available
- March 15, 2020: test data made available
- April 23, 2020 (revised from March 17, 2020): Submission of results opens
(submission period: April 23 to May 8)
- May 8, 2020: Paper submission deadline
- July 24, 2020: Paper acceptance/rejection notification
- October 26-29, 2020: INTERSPEECH 2020, in Shanghai, China.
See other important dates on the INTERSPEECH 2020 website.
Paper Submission
Please format your paper following the INTERSPEECH 2020 guidelines, and submit it
indicating that it is meant for the ADReSS Challenge.
Papers submitted to this Challenge need to cite:
- S. Luz, F. Haider, S. de la Fuente, D. Fromm, and B. MacWhinney.
Alzheimer's dementia recognition through spontaneous speech: The
ADReSS challenge.
In Proceedings of INTERSPEECH 2020, Shanghai, China, 2020.
Special Issue
Revised and extended versions of the papers accepted for the ADReSS
Challenge can also be submitted to a Special Research Topic on Alzheimer's dementia
recognition through spontaneous speech, jointly hosted by the journals Frontiers in
Aging Neuroscience, Frontiers in Psychology and Frontiers in Computer Science.
Organizers
Saturnino Luz is a Reader at the Usher Institute,
University of Edinburgh's Medical School. He works in medical
informatics, devising and applying machine learning, signal
processing and natural language processing methods in the study of
behaviour and communication in healthcare contexts. His main
research interest is the computational modelling of behavioural and
biological changes caused by neurodegenerative diseases, with focus
on the analysis of vocal and linguistic signals in Alzheimer's
disease.
Fasih Haider is a Research Fellow at
the Usher Institute, University of Edinburgh's Medical School, UK. His areas of interest
are Social Signal Processing and Artificial Intelligence.
Before joining the Usher Institute, he was a Research Engineer at the ADAPT
Centre where he worked on methods of Social Signal Processing for video
intelligence. He holds a PhD in Computer Science from Trinity College
Dublin, Ireland. Currently, he is investigating the use of
social signal processing and machine learning for monitoring cognitive
health.
Sofia de la Fuente graduated in Psychology (BSc
Hons) at the Universidad Complutense de Madrid in 2015, and later in
Methodology for Behavioural and Health Sciences (MSc Hons) at the
Universidad Autonoma de Madrid in 2017. Recently, she became an
Associate Fellow of the Higher Education Academy, and is currently
finishing a Doctoral Training Programme in Precision Medicine at the University of Edinburgh. Her
doctoral research is an exploratory study of
psycholinguistic, linguistic, paralinguistic and acoustic features
that may help predict dementia onset later in life.
Davida Fromm is a Special Faculty member in the
Psychology Department at Carnegie Mellon University. Her research
interests have focused on aphasia, dementia, and apraxia of speech in
adults. For the past 12 years, she has helped to develop a large
shared database of multi-media discourse samples for a variety of
neurogenic communication disorders. The database includes educational
resources and research tools for an increasing number of automated
language analyses.
Brian MacWhinney is Teresa Heinz Professor of
Psychology, Computational Linguistics, and Modern Languages at Carnegie
Mellon University. He received his Ph.D. in psycholinguistics in 1974
from the University of California at Berkeley. With Elizabeth Bates,
he developed a model of first and second language processing and
acquisition based on competition between item-based patterns. In
1984, he and Catherine Snow co-founded the CHILDES (Child Language Data
Exchange System) Project for the computational study of child
language transcript data. This system has extended to 13 additional
research areas such as aphasiology, second language learning, TBI,
Conversation Analysis, developmental disfluency and others in the shape
of the TalkBank Project. MacWhinney's recent work includes studies
of online learning of second language vocabulary and grammar,
situationally embedded second language learning, neural network
modeling of lexical development, fMRI studies of children with focal
brain lesions, and ERP studies of between-language competition. He
is also exploring the role of grammatical constructions in the marking
of perspective shifting, the determination of linguistic forms across
contrasting time frames, and the construction of mental models in
scientific reasoning. Recent edited books include The Handbook of
Language Emergence (Wiley) and Competing Motivations in Grammar and
Usage (Oxford).
Sponsorship
The ADReSS Challenge acknowledges the support and sponsorship of the European Union's Horizon 2020 research
programme, under grant agreement No 769661, towards the SAAM project,
and of the above-mentioned Frontiers publications.