Unleashing the potential of digital pathology data by training computer-aided diagnosis models without human annotations

doi:10.1038/s41746-022-00635-4

Open Access, Journal Article

Niccolò Marini, +19 more

22 Jul 2022

npj Digital Medicine

Vol. 5, Iss. 1

TLDR

The authors propose and evaluate an approach that eliminates the need for manual annotations when training computer-aided diagnosis tools in digital pathology. It has two components: one automatically extracts semantically meaningful concepts from diagnostic reports, and the other uses those concepts as weak labels to train convolutional neural networks (CNNs) for histopathology diagnosis.

Abstract:

The digitalization of clinical workflows and the increasing performance of deep learning algorithms are paving the way towards new methods for tackling cancer diagnosis. However, the availability of medical specialists to annotate digitized images and free-text diagnostic reports does not scale with the need for the large datasets required to train robust computer-aided diagnosis methods that can handle the high variability of clinical cases and of the data produced. This work proposes and evaluates an approach to eliminate the need for manual annotations to train computer-aided diagnosis tools in digital pathology. The approach includes two components: the first automatically extracts semantically meaningful concepts from diagnostic reports, and the second uses them as weak labels to train convolutional neural networks (CNNs) for histopathology diagnosis. The approach is trained (through 10-fold cross-validation) on 3,769 clinical images and reports provided by two hospitals, and tested on over 11,000 images from private and publicly available datasets. The CNN trained with automatically generated labels is compared with the same architecture trained with manual labels. Results show that combining text analysis and end-to-end deep neural networks makes it possible to build computer-aided diagnosis tools that reach solid performance (micro-accuracy = 0.908 at image level) based only on existing clinical data, without the need for manual annotations.
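To make the idea of report-derived weak labels concrete, here is a minimal Python sketch. It is not the authors' pipeline: the label set, the keyword lists, and the plain substring matching are all hypothetical stand-ins for the concept-extraction component, and negation (for example "no dysplasia") is deliberately not handled. The `micro_accuracy` helper assumes the common definition in which correct label decisions are pooled over all images and all classes; the abstract does not spell out the exact formula used.

```python
from typing import List, Sequence

# Hypothetical label set and keyword vocabulary (assumed for illustration,
# not taken from the paper).
LABELS = ["cancer", "dysplasia", "hyperplastic_polyp"]
KEYWORDS = {
    "cancer": ["adenocarcinoma", "carcinoma"],
    "dysplasia": ["dysplasia"],
    "hyperplastic_polyp": ["hyperplastic polyp"],
}


def weak_labels(report: str) -> List[int]:
    """Map a free-text report to a multi-label binary vector.

    A label is set to 1 if any of its keywords occurs in the report.
    This is naive substring matching and ignores negation.
    """
    text = report.lower()
    return [int(any(kw in text for kw in KEYWORDS[label])) for label in LABELS]


def micro_accuracy(y_true: Sequence[Sequence[int]],
                   y_pred: Sequence[Sequence[int]]) -> float:
    """Fraction of correct label decisions pooled over all images and classes."""
    total = 0
    correct = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            total += 1
            correct += int(t == p)
    return correct / total


if __name__ == "__main__":
    reports = [
        "Invasive adenocarcinoma of the sigmoid colon.",
        "Tubular adenoma with low-grade dysplasia.",
    ]
    targets = [weak_labels(r) for r in reports]
    print(targets)
```

In a full system, vectors like `targets` would replace manually assigned labels as the training signal for the image classifier, and `micro_accuracy` would be computed between the classifier's thresholded outputs and reference labels on held-out images.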

Citations

Data-driven color augmentation for H&E stained images in computational pathology

Empowering digital pathology applications through explainable knowledge extraction tools

Attention-based Interpretable Regression of Gene Expression in Histology

Interpretable classification of pathology whole-slide images using attention based context-aware graph convolutional neural network
