SORA

Features and machine learning classification of connected speech samples from patients with autopsy proven Alzheimer's disease with and without additional vascular pathology.

Rentoumi, V; Raoufian, L; Ahmed, S; de Jager, CA; Garrard, P (2014) Features and machine learning classification of connected speech samples from patients with autopsy proven Alzheimer's disease with and without additional vascular pathology. J Alzheimers Dis, 42 (Suppl 3). S3-S17. ISSN 1875-8908 https://doi.org/10.3233/JAD-140555
SGUL Authors: Garrard, Peter

Full text: Accepted Version, Microsoft Word (.doc), 549kB

Abstract

Mixed vascular and Alzheimer-type dementia and pure Alzheimer's disease are both associated with changes in spoken language. These changes have, however, seldom been subjected to systematic comparison. In the present study, we analyzed language samples obtained during the course of a longitudinal clinical study from patients in whom one or other pathology was verified at post mortem. The aims of the study were twofold: first, to confirm the presence of differences in language produced by members of the two groups using quantitative methods of evaluation; and second, to ascertain the most informative sources of variation between the groups. We adopted a computational approach to evaluate digitized transcripts of connected speech along a range of language-related dimensions. We then used machine learning text classification to assign the samples to one of the two pathological groups on the basis of these features. The classifiers' accuracies were tested using simple lexical features, syntactic features, and more complex statistical and information-theoretic characteristics. Maximum accuracy was achieved when word occurrences and frequencies alone were used. Features based on syntactic and lexical complexity yielded lower discrimination scores, but all combinations of features showed significantly better performance than a baseline condition in which every transcript was assigned randomly to one of the two classes. The classification results illustrate the word-content-specific differences in the spoken language of the two groups. In addition, those with mixed pathology were found to exhibit a marked reduction in lexical variation and complexity compared to their pure AD counterparts.
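
As a rough illustration of the kind of analysis the abstract describes, the sketch below classifies short transcripts from word occurrences and frequencies alone, and computes a type-token ratio as one crude index of lexical variation. It is not the authors' pipeline: the abstract names neither the classifier nor the exact features, so the multinomial naive Bayes model, the tokenizer, the type-token ratio, the group labels and the toy sentences are all assumptions made for illustration. The sentences are invented and carry no clinical meaning.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, strip surrounding punctuation."""
    tokens = (w.strip(".,;:!?\"'()") for w in text.lower().split())
    return [t for t in tokens if t]


def type_token_ratio(text):
    """Distinct words divided by total words (0.0 for empty input)."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0


class NaiveBayes:
    """Multinomial naive Bayes over word counts with add-one smoothing.

    Hypothetical stand-in for the unspecified classifiers in the study.
    """

    def fit(self, texts, labels):
        self.doc_counts = Counter(labels)   # transcripts per group
        self.word_counts = {}               # group -> word frequencies
        for text, label in zip(texts, labels):
            self.word_counts.setdefault(label, Counter()).update(tokenize(text))
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, text):
        n_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            total = sum(counts.values()) + len(self.vocab)
            score = math.log(self.doc_counts[label] / n_docs)
            for tok in tokenize(text):
                if tok in self.vocab:  # skip words never seen in training
                    score += math.log((counts[tok] + 1) / total)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy transcripts; "group_a" and "group_b" are placeholder labels.
train_texts = [
    "the boy is reaching for the cookie jar while the stool tips over",
    "the mother is drying dishes and the sink is overflowing",
    "the thing is there and the thing is there",
    "it is there it is there and that is that",
]
train_labels = ["group_a", "group_a", "group_b", "group_b"]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("the sink is overflowing"))
print(type_token_ratio(train_texts[2]))
```

A real evaluation would use held-out transcripts (for example, cross-validation) and compare accuracy against the random-assignment baseline mentioned in the abstract; the toy data here is far too small for either.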

Item Type: Article
Additional Information: The final publication is available at IOS Press through http://dx.doi.org/10.3233/JAD-140555
Keywords: Alzheimer's disease; computational methods; diagnosis; language; machine learning; vascular dementia; Aged; Aged, 80 and over; Alzheimer Disease; Artificial Intelligence; Autopsy; Female; Humans; Information Theory; Longitudinal Studies; Male; Middle Aged; Neuropsychological Tests; Speech; Vascular Diseases; 1103 Clinical Sciences; 1702 Cognitive Sciences; 1109 Neurosciences; Neurology & Neurosurgery
SGUL Research Institute / Research Centre: Academic Structure > Molecular and Clinical Sciences Research Institute (MCS)
Journal or Publication Title: J Alzheimers Dis
ISSN: 1875-8908
Language: eng
Dates:
Published: 2 September 2014
Accepted: 3 July 2014
Publisher License: Publisher's own licence
Projects:
Project ID: G0801370
Funder: Medical Research Council
Funder ID: http://dx.doi.org/10.13039/501100000265
PubMed ID: 25061045
Web of Science ID: WOS:000341595800002
URI: https://openaccess.sgul.ac.uk/id/eprint/111768
Publisher's version: https://doi.org/10.3233/JAD-140555
