SORA


Trustworthy Evaluation of Clinical AI for Analysis of Medical Images in Diverse Populations

Fajtl, J; Welikala, RA; Barman, S; Chambers, R; Bolter, L; Anderson, J; Olvera-Barrios, A; Shakespeare, R; Egan, C; Owen, CG; Tufail, A; Rudnicka, AR (2024) Trustworthy Evaluation of Clinical AI for Analysis of Medical Images in Diverse Populations. NEJM AI, 1 (9). ISSN 2836-9386 https://doi.org/10.1056/aioa2400353
SGUL Authors: Rudnicka, Alicja Regina; Owen, Christopher Grant

Accepted Version: Microsoft Word (.docx), 1MB. Restricted to Repository staff only until 13 February 2025.
Supplementary Material: Microsoft Word (.docx), 567kB. Restricted to Repository staff only until 13 February 2025.

Abstract

Background The deployment of algorithms in health care screening programs has been hindered by a lack of agreed-upon methodology to evaluate trustworthiness and equity. We outline transferable methodology for independent evaluation of algorithms using a routine, high-volume, multiethnic national diabetic eye screening program as an exemplar. Automated retinal image analysis systems (ARIAS), including artificial intelligence (AI), for detection of diabetic retinopathy (DR) could substantially increase image-grading capacity. We report technical and operational considerations relevant to implementation and evaluation in large-scale population screening.
Methods Twenty-five vendors with current or pending Conformité Européenne Class IIa ARIAS for DR detection from retinal images were invited. Sample data (6268 images) were provided to confirm that ARIAS outputs could be replicated in a trusted research environment. We curated consecutive routine screening encounters between January 1, 2021 and December 31, 2022 at the North East London Diabetic Eye Screening Programme for evaluation. Sample size calculations focused on precision for detection of severe DR by population subgroups, particularly ethnicity. Vendor algorithms did not have access to human grading data or other metadata during image processing.
Results Eight of 25 eligible vendors participated. In total, 202,886 encounters were evaluated, representing 1.2 million images from 32% white, 17% Black, and 39% South Asian ethnic groups, including approximately 25,000 cases requiring referral to ophthalmology for review and treatment. Image resolutions varied from 150 × 300 to 6000 × 4000 pixels. Time from study invitation to ARIAS installation and algorithm verification ranged from 96 to 460 days; image processing required between 13.5 hours and 105 days.
Conclusions This comparison of ARIAS at scale on a range of images with different characteristics, including a population of different ethnicities, wide age range, levels of deprivation, and spectrum of DR, provides the framework for transparent, equitable, robust, and trustworthy evaluation of clinical AI in screening to inform standards in health care before deployment. (Funded by the NHS Transformation Directorate and The Health Foundation and managed by the National Institute for Health and Social Care Research.)

Item Type: Article
Additional Information: From NEJM AI, Trustworthy Evaluation of Clinical AI for Analysis of Medical Images in Diverse Populations, 1(9), Copyright © 2024 Massachusetts Medical Society. Reprinted with permission.
SGUL Research Institute / Research Centre: Academic Structure > Population Health Research Institute (INPH)
Journal or Publication Title: NEJM AI
ISSN: 2836-9386
Language: en
Dates:
Date | Event
22 August 2024 | Published
13 August 2024 | Published Online
25 June 2024 | Accepted
Publisher License: Publisher's own licence
Projects:
Project ID | Funder | Funder ID
AI_HI200008 | National Institute for Social Care and Health Research | http://dx.doi.org/10.13039/100009250
URI: https://openaccess.sgul.ac.uk/id/eprint/116912
Publisher's version: https://doi.org/10.1056/aioa2400353
