Open Access · Posted Content
Unsupervised Multimodal Representation Learning across Medical Images and Reports
TL;DR: This paper establishes baseline joint embedding results measured via both local and global retrieval methods on the soon-to-be-released MIMIC-CXR dataset, which consists of chest X-ray images and the associated radiology reports.
Abstract:
Joint embeddings between medical imaging modalities and associated radiology reports have the potential to offer significant benefits to the clinical community, ranging from cross-domain retrieval to conditional generation of reports to the broader goals of multimodal representation learning. In this work, we establish baseline joint embedding results measured via both local and global retrieval methods on the soon-to-be-released MIMIC-CXR dataset consisting of both chest X-ray images and the associated radiology reports. We examine both supervised and unsupervised methods on this task and show that for document retrieval tasks with the learned representations, only a limited amount of supervision is needed to yield results comparable to those of fully-supervised methods.
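The cross-domain retrieval the abstract describes amounts to ranking items of one modality by similarity to a query from the other, once both are projected into a shared space. A minimal sketch with NumPy, using hypothetical random vectors in place of learned image and report embeddings (the names, dimensions, and the Recall@1 metric here are illustrative assumptions, not the paper's actual setup):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical embeddings: 4 chest X-ray images and their 4 paired
# reports, already projected into a shared 8-dimensional joint space.
# Paired reports are modeled as small perturbations of their images.
image_emb = rng.normal(size=(4, 8))
report_emb = image_emb + 0.05 * rng.normal(size=(4, 8))

def l2_normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

# Image-to-report retrieval: score every report against every image
# by cosine similarity, then rank reports per image.
sims = l2_normalize(image_emb) @ l2_normalize(report_emb).T
ranked = np.argsort(-sims, axis=1)

# Recall@1: fraction of images whose top-ranked report is the true pair.
recall_at_1 = float(np.mean(ranked[:, 0] == np.arange(4)))
```

The same similarity matrix supports both directions of retrieval (image-to-report and report-to-image) by ranking along the other axis.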
Citations
Posted Content
Clinically Accurate Chest X-Ray Report Generation
Guanxiong Liu, Tzu-Ming Harry Hsu, Matthew B. A. McDermott, Willie Boag, Wei-Hung Weng, Peter Szolovits, Marzyeh Ghassemi +6 more
TL;DR: A domain-aware automatic chest X-ray radiology report generation system which first predicts what topics will be discussed in the report, then conditionally generates sentences corresponding to these topics, and is fine-tuned using reinforcement learning.
Journal ArticleDOI
Foundations and Recent Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions
TL;DR: This paper proposes a taxonomy of six core technical challenges: representation, alignment, reasoning, generation, transference, and quantification, covering historical and recent trends, and defines two key principles, modality heterogeneity and interconnections, that have driven subsequent innovations.
Book ChapterDOI
Making the Most of Text Semantics to Improve Biomedical Vision-Language Processing
Benedikt Böcking, Naoto Usuyama, Shruthi Bannur, Daniel C. Castro, Anton Schwaighofer, Stephanie L. Hyland, T. Baumann, Aditya Nori, Javier Alvarez-Valle, H. Poon, Ozan Oktay +10 more
TL;DR: This paper proposes a self-supervised joint vision-language approach with a focus on better text modelling, which achieves state-of-the-art results in radiology natural language inference through its improved vocabulary and a novel language pretraining objective that leverages semantics and discourse characteristics in radiology reports.
Posted Content
A Comparison of Pre-trained Vision-and-Language Models for Multimodal Representation Learning across Medical Images and Reports
Yikuan Li, Hanyin Wang, Yuan Luo +2 more
TL;DR: External evaluation using the OpenI dataset shows that the joint embeddings learned by pre-trained LXMERT, VisualBERT, UNITER, and PixelBERT models demonstrate a 1.4% performance improvement in thoracic finding classification tasks compared to a pioneering CNN + RNN model.
Proceedings ArticleDOI
A Comparison of Pre-trained Vision-and-Language Models for Multimodal Representation Learning across Medical Images and Reports
Yikuan Li, Hanyin Wang, Yuan Luo +2 more
TL;DR: In this article, the authors adopt four pre-trained models (LXMERT, VisualBERT, UNITER, and PixelBERT) to learn multimodal representations from MIMIC-CXR images and the associated reports.
References
Proceedings ArticleDOI
Glove: Global Vectors for Word Representation
TL;DR: A new global log-bilinear regression model that combines the advantages of the two major model families in the literature, global matrix factorization and local context window methods, and produces a vector space with meaningful substructure.
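The GloVe objective behind this summary fits word and context vectors so that their dot product (plus biases) approximates the log co-occurrence count, with a weighting function that down-weights rare pairs and caps frequent ones. A minimal NumPy sketch on a toy co-occurrence matrix (the vocabulary size, dimensions, and variable names are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
V, d = 5, 3  # toy vocabulary size and embedding dimension

# Toy co-occurrence counts X_ij (all >= 1 so log X is defined).
X = rng.integers(1, 50, size=(V, V)).astype(float)

# Word vectors, context vectors, and the two bias terms.
W = rng.normal(size=(V, d))
W_tilde = rng.normal(size=(V, d))
b = np.zeros(V)
b_tilde = np.zeros(V)

def glove_weight(x, x_max=100.0, alpha=0.75):
    # f(x) = (x / x_max)^alpha capped at 1: down-weights rare
    # co-occurrences without letting frequent ones dominate.
    return np.minimum(1.0, (x / x_max) ** alpha)

def glove_loss(W, W_tilde, b, b_tilde, X):
    # J = sum_ij f(X_ij) * (w_i . w~_j + b_i + b~_j - log X_ij)^2
    pred = W @ W_tilde.T + b[:, None] + b_tilde[None, :]
    return float(np.sum(glove_weight(X) * (pred - np.log(X)) ** 2))
```

Training would minimize this loss with AdaGrad over the nonzero entries of X; only the objective itself is sketched here.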
Journal ArticleDOI
Dermatologist-level classification of skin cancer with deep neural networks
Andre Esteva, Brett Kuprel, Roberto A. Novoa, Justin M. Ko, Susan M. Swetter, Helen M. Blau, Sebastian Thrun +7 more
TL;DR: This work demonstrates an artificial intelligence capable of classifying skin cancer with a level of competence comparable to dermatologists, trained end-to-end from images directly, using only pixels and disease labels as inputs.
Book ChapterDOI
Domain-adversarial training of neural networks
Yaroslav Ganin, Evgeniya Ustinova, Hana Ajakan, Pascal Germain, Hugo Larochelle, François Laviolette, Mario Marchand, Victor Lempitsky +7 more
TL;DR: In this article, a new representation learning approach for domain adaptation is proposed for settings in which data at training and test time come from similar but different distributions; the approach promotes the emergence of features that are discriminative for the main learning task on the source domain yet cannot discriminate between the training (source) and test (target) domains.
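The mechanism that makes features domain-invariant in this approach is the gradient reversal layer: an identity map in the forward pass whose backward pass negates (and scales) the gradient flowing from the domain classifier, so the feature extractor learns to confuse it. A minimal sketch of the two passes, written without an autograd framework (the function names and the example values are illustrative assumptions):

```python
import numpy as np

def grl_forward(x):
    # Forward pass: identity, features flow unchanged to the
    # domain classifier.
    return x

def grl_backward(grad_from_domain_head, lam=1.0):
    # Backward pass: the gradient is negated and scaled by lambda,
    # pushing the feature extractor to *maximize* domain confusion.
    return -lam * grad_from_domain_head

features = np.array([0.5, -1.2, 3.0])
upstream_grad = np.array([0.1, 0.2, -0.3])

forwarded = grl_forward(features)          # identical to features
reversed_grad = grl_backward(upstream_grad, lam=0.5)  # [-0.05, -0.1, 0.15]
```

In a full model this layer sits between the shared feature extractor and the domain-classifier head, while the main task head receives ordinary, unreversed gradients.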
Journal ArticleDOI
Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs
Varun Gulshan, Lily Peng, Marc Coram, Martin C. Stumpe, Derek Wu, Arunachalam Narayanaswamy, Subhashini Venugopalan, Kasumi Widner, Tom Madams, Jorge Cuadros, Ramasamy Kim, Rajiv Raman, Philip C. Nelson, Jessica L. Mega, Dale R. Webster +14 more
TL;DR: An algorithm based on deep machine learning had high sensitivity and specificity for detecting referable diabetic retinopathy and diabetic macular edema in retinal fundus photographs from adults with diabetes.
Journal ArticleDOI
Cumulated gain-based evaluation of IR techniques
TL;DR: This article proposes several novel measures that compute the cumulative gain the user obtains by examining the retrieval result up to a given ranked position, and test results indicate that the proposed measures credit IR methods for their ability to retrieve highly relevant documents and allow testing of statistical significance of effectiveness differences.
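The cumulated-gain measures summarized here discount each document's graded relevance by the logarithm of its rank, and normalization by the ideal ranking yields a score in [0, 1] (NDCG). A minimal sketch, assuming graded relevance labels 0-3 and the common log2 discount (the function names and the example ranking are illustrative):

```python
import math

def dcg(relevances):
    # Discounted cumulative gain: the gain at rank i (1-indexed)
    # is discounted by log2(i + 1), so rank 1 is undiscounted.
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))

def ndcg(relevances):
    # Normalize by the DCG of the ideal (descending) ordering.
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Graded relevance of retrieved documents, from the top of the ranking.
ranking = [3, 2, 3, 0, 1]
score = ndcg(ranking)  # roughly 0.972: near-ideal, penalized for
                       # placing a grade-3 document at rank 3
```

Because the discount grows with rank, swapping a highly relevant document down the list costs more than swapping a marginal one, which is exactly how these measures credit systems for retrieving highly relevant documents early.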