
Ultrasound Image Representation Learning by Modeling Sonographer Visual Attention.

Droste, R; Cai, Y; Sharma, H; Chatelain, P; Drukker, L; Papageorghiou, AT; Noble, JA (2019) Ultrasound Image Representation Learning by Modeling Sonographer Visual Attention. In: 26th International Conference on Information Processing in Medical Imaging (IPMI 2019), June 2-7, 2019, Hong Kong University of Science and Technology.
SGUL Authors: Papageorghiou, Aris

PDF (Accepted Version) available for download (3MB), under the publisher's own licence.
Abstract

Image representations are commonly learned from class labels, which are a simplistic approximation of human image understanding. In this paper we demonstrate that transferable representations of images can be learned without manual annotations by modeling human visual attention. The basis of our analyses is a unique gaze tracking dataset of sonographers performing routine clinical fetal anomaly screenings. Models of sonographer visual attention are learned by training a convolutional neural network (CNN) to predict gaze on ultrasound video frames through visual saliency prediction or gaze-point regression. We evaluate the transferability of the learned representations to the task of ultrasound standard plane detection in two contexts. Firstly, we perform transfer learning by fine-tuning the CNN with a limited number of labeled standard plane images. We find that fine-tuning the saliency predictor is superior to training from random initialization, with an average F1-score improvement of 9.6% overall and 15.3% for the cardiac planes. Secondly, we train a simple softmax regression on the feature activations of each CNN layer in order to evaluate the representations independently of transfer learning hyper-parameters. We find that the attention models derive strong representations, approaching the precision of a fully-supervised baseline model for all but the last layer.
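To make the two evaluation settings in the abstract concrete, here is a minimal sketch (not the authors' code) of (1) self-supervised saliency pretraining from gaze and (2) the layer-wise softmax regression probe. It assumes PyTorch, a toy two-layer encoder, a KL-divergence saliency objective, 13 standard-plane classes, and dummy tensors in place of the non-public gaze-tracked ultrasound data; all of these are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Toy stand-in for the paper's CNN backbone."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
    def forward(self, x):
        return self.features(x)

class SaliencyNet(nn.Module):
    """Encoder plus a 1x1-conv head that emits a gaze saliency map."""
    def __init__(self):
        super().__init__()
        self.encoder = Encoder()
        self.head = nn.Conv2d(32, 1, 1)
    def forward(self, x):
        return self.head(self.encoder(x))

def saliency_loss(pred, target):
    # KL divergence between predicted and ground-truth gaze distributions,
    # a common saliency objective (assumed here, not taken from the paper).
    p = F.log_softmax(pred.flatten(1), dim=1)
    q = target.flatten(1)
    q = q / q.sum(dim=1, keepdim=True)
    return F.kl_div(p, q, reduction="batchmean")

# --- (1) Self-supervised pretraining on gaze (dummy data) ---
model = SaliencyNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
frames = torch.rand(8, 1, 64, 64)       # stand-in ultrasound video frames
gaze_maps = torch.rand(8, 1, 16, 16)    # stand-in fixation density maps
loss = saliency_loss(model(frames), gaze_maps)
opt.zero_grad(); loss.backward(); opt.step()

# --- (2) Layer-wise probe: softmax regression on frozen activations ---
with torch.no_grad():
    feats = model.encoder(frames).flatten(1)   # activations of one layer
probe = nn.Linear(feats.shape[1], 13)          # 13 plane classes (assumed)
labels = torch.randint(0, 13, (8,))            # stand-in plane labels
probe_loss = F.cross_entropy(probe(feats), labels)
```

Training only `probe` while the encoder stays frozen mirrors the abstract's layer-wise evaluation, which measures representation quality independently of transfer learning hyper-parameters; fine-tuning (the first setting) would instead update the encoder weights as well.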

Item Type: Conference or Workshop Item (Poster)
Additional Information: This is a post-peer-review, pre-copyedit version of an article published in Information Processing in Medical Imaging. IPMI 2019. Lecture Notes in Computer Science, vol 11492. The final authenticated version is available online at: http://dx.doi.org/10.1007/978-3-030-20351-1_46
Keywords: Convolutional neural networks, Fetal ultrasound, Gaze tracking, Representation learning, Saliency prediction, Self-supervised learning, Transfer learning
SGUL Research Institute / Research Centre: Academic Structure > Institute of Medical & Biomedical Education (IMBE)
Academic Structure > Institute of Medical & Biomedical Education (IMBE) > Centre for Clinical Education (INMECE)
Journal or Publication Title: Inf Process Med Imaging
ISSN: 1011-2499
Dates:
Published: June 2019
Published Online: 22 May 2019
Accepted: 26 February 2019
Publisher License: Publisher's own licence
Projects:
694581, European Research Council (http://dx.doi.org/10.13039/501100000781)
EP/R013853/1, Engineering and Physical Sciences Research Council (http://dx.doi.org/10.13039/501100000266)
EP/M013774/1, Engineering and Physical Sciences Research Council (http://dx.doi.org/10.13039/501100000266)
PubMed ID: 31992944
Web of Science ID: WOS:000493380900046
URI: https://openaccess.sgul.ac.uk/id/eprint/111944
Publisher's version: https://doi.org/10.1007/978-3-030-20351-1_46
