SORA


A visually grounded language model for fetal ultrasound understanding.

Guo, X; Alsharid, M; Zhao, H; Wang, Y; Lander, J; Papageorghiou, AT; Noble, JA (2026) A visually grounded language model for fetal ultrasound understanding. Nat Biomed Eng. ISSN 2157-846X https://doi.org/10.1038/s41551-025-01578-3
SGUL Authors: Papageorghiou, Aris

Files:
PDF (Published Version), 13 MB — available under a Creative Commons Attribution licence
PDF (Reporting Summary), 62 kB — supporting information
PDF (Peer Review File), 1 MB — supporting information, available under a Creative Commons Attribution licence
Video, QuickTime (Supplementary Video 1), 118 MB — supporting information

Abstract

Freehand fetal ultrasound examinations require substantial clinical skill. Here we propose Sonomate ('mate' of a sonographer), an AI assistant for users during fetal ultrasound examinations. Sonomate aligns video features with text features derived from transcribed audio to facilitate real-time interaction between an ultrasound machine and a user. Our approach combines coarse-grained video-text alignment with fine-grained image-sentence alignment to build a robust visually grounded language model capable of understanding fetal ultrasound videos. To tackle the challenges of heterogeneous language and asynchronous content in real-world video-audio pairs, we design anatomy-aware alignment and context-label correction for the fine-grained alignment stage. Sonomate is effective at anatomy detection in fetal ultrasound images without the need for retraining on manually annotated data. Furthermore, Sonomate shows promising performance in visual question answering for both fetal ultrasound images and videos. Guardrails are built in to ensure the safety of Sonomate during deployment. This advancement paves the way towards AI-assistive technology that supports sonography training and enhanced diagnostic capabilities.
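The abstract describes aligning video and text embeddings so that matched video-audio pairs sit close together in a shared space. As a rough illustration only — not the authors' actual method or model — the general contrastive-alignment idea (CLIP-style symmetric InfoNCE over a batch of paired embeddings) can be sketched as follows; all names and the temperature value here are illustrative assumptions:

```python
import numpy as np

def info_nce_loss(video_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired embeddings.

    Row i of video_emb is assumed to match row i of text_emb;
    all other rows in the batch act as negatives.
    """
    # L2-normalize so the dot product is cosine similarity
    v = video_emb / np.linalg.norm(video_emb, axis=1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = (v @ t.T) / temperature  # pairwise similarity matrix

    n = len(v)
    idx = np.arange(n)  # matched pairs lie on the diagonal

    def xent(lg):
        # numerically stable cross-entropy against the diagonal targets
        lg = lg - lg.max(axis=1, keepdims=True)
        logp = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -logp[idx, idx].mean()

    # average the video-to-text and text-to-video directions
    return 0.5 * (xent(logits) + xent(logits.T))
```

For example, calling `info_nce_loss` on a batch where each video embedding equals its paired text embedding yields a much lower loss than calling it with the text rows shuffled, which is the signal that pulls matched pairs together during training.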

Item Type: Article
Additional Information: Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. © Crown 2026
SGUL Research Institute / Research Centre: Academic Structure > Institute of Medical, Biomedical and Allied Health Education (IMBE)
Academic Structure > Institute of Medical, Biomedical and Allied Health Education (IMBE) > Centre for Clinical Education (INMECE)
Journal or Publication Title: Nat Biomed Eng
ISSN: 2157-846X
Language: eng
Publisher License: Creative Commons: Attribution 4.0
Projects:
Project ID | Funder | Funder ID
EP/X040186/1 | UK Research and Innovation | https://doi.org/10.13039/100014013
EP/T028572/1 | Engineering and Physical Sciences Research Council | http://dx.doi.org/10.13039/501100000266
ERC-ADG-2015 694581 | European Research Council | http://dx.doi.org/10.13039/501100000781
22203525 | Hong Kong Research Grants Council | UNSPECIFIED
UNSPECIFIED | National Institute for Health and Care Research (NIHR) Oxford Biomedical Research Centre | UNSPECIFIED
PubMed ID: 41540148
Dates:
Date Event
2026-01-15 Published Online
2025-10-24 Accepted
URI: https://openaccess.sgul.ac.uk/id/eprint/118201
Publisher's version: https://doi.org/10.1038/s41551-025-01578-3
