SORA


Can ChatGPT make surgical decisions with confidence similar to experienced knee surgeons?

Musbahi, O; Nurek, M; Pouris, K; Vella-Baldacchino, M; Bottle, A; Hing, C; Kostopoulou, O; Cobb, JP; Jones, GG (2024) Can ChatGPT make surgical decisions with confidence similar to experienced knee surgeons? Knee, 51. pp. 120-129. ISSN 0968-0160 https://doi.org/10.1016/j.knee.2024.08.015
SGUL Authors: Hing, Caroline Blanca

PDF Published Version
Available under License Creative Commons Attribution Non-commercial.


Abstract

BACKGROUND: Unicompartmental knee replacements (UKRs) have become an increasingly attractive option for end-stage single-compartment knee osteoarthritis (OA). However, there remains controversy in patient selection. Natural language processing (NLP) is a form of artificial intelligence (AI). We aimed to determine whether general-purpose open-source natural language programs can make decisions regarding a patient's suitability for a total knee replacement (TKR) or a UKR and how confident AI NLP programs are in surgical decision making.

METHODS: We conducted a case-based cohort study using data from a separate study, where participants (73 surgeons and AI NLP programs) were presented with 32 fictitious clinical case scenarios that simulated patients with predominantly medial knee OA who would require surgery. Using the overall UKR/TKR judgments of the 73 experienced knee surgeons as the gold standard reference, we calculated the sensitivity, specificity, and positive predictive value of AI NLP programs to identify whether a patient should undergo UKR.

RESULTS: There was disagreement between the surgeons and ChatGPT in only five scenarios (15.6%). With the 73 surgeons' decision as the gold standard, the sensitivity of ChatGPT in determining whether a patient should undergo UKR was 0.91 (95% confidence interval (CI): 0.71 to 0.98). The positive predictive value for ChatGPT was 0.87 (95% CI: 0.72 to 0.94). ChatGPT was more confident in its UKR decision making (surgeon mean confidence = 1.7, ChatGPT mean confidence = 2.4).

CONCLUSIONS: It has been demonstrated that ChatGPT can make surgical decisions, and exceeded the confidence of experienced knee surgeons with substantial inter-rater agreement when deciding whether a patient was most appropriate for a UKR.
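
For readers unfamiliar with the metrics named in the METHODS, the following is a minimal sketch of how sensitivity, specificity, and positive predictive value are derived from a 2x2 table in which the surgeons' majority judgment is the reference and UKR is the positive class. The function name `diagnostic_metrics` is hypothetical, and the counts are illustrative placeholders only: the abstract does not report the underlying table, so the split below was simply chosen to total 32 scenarios with five disagreements and should not be read as the study's data.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Standard 2x2 metrics, with the reference judgment defining the truth.

    tp: reference says UKR, model says UKR
    fp: reference says TKR, model says UKR
    fn: reference says UKR, model says TKR
    tn: reference says TKR, model says TKR
    """
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),        # share of reference-UKR cases the model found
        "specificity": tn / (tn + fp),        # share of reference-TKR cases the model found
        "ppv": tp / (tp + fp),                # share of model-UKR calls that match the reference
        "disagreement_rate": (fp + fn) / total,
    }


# Hypothetical counts (NOT taken from the paper): 32 scenarios, 5 disagreements.
metrics = diagnostic_metrics(tp=20, fp=3, fn=2, tn=7)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Confidence intervals such as those quoted in the RESULTS would need an additional interval method for a binomial proportion; the abstract does not say which one the authors used, so none is shown here.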

Item Type: Article
Additional Information: © 2024 IMPERIAL COLLEGE LONDON. Published by Elsevier B.V. This is an open access article under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
Keywords: Artificial intelligence; Decision making; Knee arthroplasty; Natural language processing; Humans; Arthroplasty, Replacement, Knee; Osteoarthritis, Knee; Clinical Decision-Making; Female; Male; Patient Selection; Clinical Competence; Middle Aged; Surgeons
SGUL Research Institute / Research Centre: Academic Structure > Cardiovascular & Genomics Research Institute
Academic Structure > Cardiovascular & Genomics Research Institute > Clinical Cardiology
Journal or Publication Title: Knee
ISSN: 0968-0160
Language: eng
Media of Output: Print-Electronic
Publisher License: Creative Commons: Attribution-Noncommercial 4.0
Projects:
Project ID | Funder | Funder ID
302632 | National Institute for Health Research | http://dx.doi.org/10.13039/501100000272
PSTRC-2016-004 | Patient Safety Translational Research Centre | https://doi.org/10.13039/501100013631
PubMed ID: 39255525
URI: https://openaccess.sgul.ac.uk/id/eprint/117432
Publisher's version: https://doi.org/10.1016/j.knee.2024.08.015
