Author
John-Melle Bokhorst
Other affiliations: Leiden University Medical Center
Bio: John-Melle Bokhorst is an academic researcher from Radboud University Nijmegen. The author has contributed to research in topics: Medicine & Tumor budding. The author has an h-index of 6 and has co-authored 11 publications receiving 220 citations. Previous affiliations of John-Melle Bokhorst include Leiden University Medical Center.
Papers
TL;DR: In this article, the authors compared stain color augmentation and normalization techniques and quantified their effect on CNN classification performance using a heterogeneous dataset of hematoxylin and eosin histopathology images from 4 organs and 9 pathology laboratories.
362 citations
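As a rough illustration of what color augmentation means in this setting, the sketch below randomly perturbs hue, saturation, and brightness of RGB pixels. It shows only one simple, generic form of color augmentation (HSV jitter); the function names, the jitter ranges, and the choice of HSV space are assumptions made for this example and are not taken from the paper.

```python
import colorsys
import random


def hsv_jitter(pixels, hue_shift, sat_scale, val_scale):
    """Shift hue and scale saturation/value of RGB pixels with floats in [0, 1]."""
    out = []
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r, g, b)
        h = (h + hue_shift) % 1.0                 # hue is circular
        s = min(1.0, max(0.0, s * sat_scale))     # clamp to the valid range
        v = min(1.0, max(0.0, v * val_scale))
        out.append(colorsys.hsv_to_rgb(h, s, v))
    return out


def random_hsv_jitter(pixels, rng, max_hue=0.05, max_scale=0.2):
    """Draw one random perturbation per image (ranges are illustrative assumptions)."""
    hue_shift = rng.uniform(-max_hue, max_hue)
    sat_scale = 1.0 + rng.uniform(-max_scale, max_scale)
    val_scale = 1.0 + rng.uniform(-max_scale, max_scale)
    return hsv_jitter(pixels, hue_shift, sat_scale, val_scale)


# Hypothetical two-pixel "image": a pinkish pixel and a gray pixel.
image = [(0.8, 0.4, 0.6), (0.2, 0.2, 0.2)]
augmented = random_hsv_jitter(image, random.Random(0))
```

Drawing a single perturbation per image, rather than per pixel, keeps the tissue structure intact while changing its overall color appearance, which is the point of simulating stain variation between laboratories.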
TL;DR: In this article, the authors propose reducing performance variability by using cycle-consistent generative adversarial networks (CycleGAN) to remove staining variation, and improve upon the regular CycleGAN by incorporating residual learning.
39 citations
TL;DR: Despite reports of moderate-to-substantial agreement with respect to tumor budding grade, agreement with respect to individual pan-cytokeratin-stained tumor buds is moderate at most, suggesting a machine learning approach may prove especially useful for a more robust assessment of individual tumor buds.
26 citations
TL;DR: In this paper, the authors propose and evaluate an approach that eliminates the need for manual annotations when training computer-aided diagnosis tools in digital pathology. It has two components: automatically extracting semantically meaningful concepts from diagnostic reports, and using them as weak labels to train convolutional neural networks (CNNs) for histopathology diagnosis.
Abstract: The digitalization of clinical workflows and the increasing performance of deep learning algorithms are paving the way towards new methods for tackling cancer diagnosis. However, the availability of medical specialists to annotate digitized images and free-text diagnostic reports does not scale with the need for large datasets required to train robust computer-aided diagnosis methods that can target the high variability of clinical cases and data produced. This work proposes and evaluates an approach to eliminate the need for manual annotations to train computer-aided diagnosis tools in digital pathology. The approach includes two components, to automatically extract semantically meaningful concepts from diagnostic reports and use them as weak labels to train convolutional neural networks (CNNs) for histopathology diagnosis. The approach is trained (through 10-fold cross-validation) on 3'769 clinical images and reports, provided by two hospitals and tested on over 11'000 images from private and publicly available datasets. The CNN, trained with automatically generated labels, is compared with the same architecture trained with manual labels. Results show that combining text analysis and end-to-end deep neural networks allows building computer-aided diagnosis tools that reach solid performance (micro-accuracy = 0.908 at image-level) based only on existing clinical data without the need for manual annotations.
21 citations
13 Dec 2018
TL;DR: This work introduces and compares two approaches of loss balancing when sparse annotations are provided, namely (1) instance based balancing and (2) mini-batch based balancing, and considers a scenario of full supervision in the form of dense annotations.
18 citations
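The difference between the two balancing schemes named in the TL;DR above can be sketched as per-pixel loss weights. This is one plausible reading, not the paper's implementation: it assumes unannotated pixels are marked with a hypothetical `IGNORE` label and receive zero weight, and that each annotated class gets an equal share of the total weight within the chosen scope (a single image for instance-based balancing, the whole mini-batch for mini-batch-based balancing).

```python
from collections import Counter

IGNORE = -1  # hypothetical marker for pixels without annotation


def _balanced_weights(labels):
    """Give every annotated class an equal share of a total weight of 1."""
    counts = Counter(label for label in labels if label != IGNORE)
    n_classes = len(counts)
    return [
        0.0 if label == IGNORE else 1.0 / (n_classes * counts[label])
        for label in labels
    ]


def instance_balanced_weights(batch):
    """Balance classes separately inside each image of the batch."""
    return [_balanced_weights(image) for image in batch]


def minibatch_balanced_weights(batch):
    """Balance classes once, using label counts pooled over the whole batch."""
    flat = [label for image in batch for label in image]
    weights = _balanced_weights(flat)
    out, start = [], 0
    for image in batch:
        out.append(weights[start:start + len(image)])
        start += len(image)
    return out


# Two flattened, sparsely annotated "images" with classes 0 and 1.
batch = [[0, 0, 0, 1, IGNORE], [1, IGNORE, IGNORE, IGNORE, IGNORE]]
per_instance = instance_balanced_weights(batch)
per_batch = minibatch_balanced_weights(batch)
```

Under these assumptions, the single annotated pixel in the second image carries that image's entire weight with instance-based balancing, but only a share of the class-1 weight with mini-batch-based balancing.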
Cited by
TL;DR: WILDS is presented, a benchmark of in-the-wild distribution shifts spanning diverse data modalities and applications, and is hoped to encourage the development of general-purpose methods that are anchored to real-world distribution shifts and that work well across different applications and problem settings.
Abstract: Distribution shifts -- where the training distribution differs from the test distribution -- can substantially degrade the accuracy of machine learning (ML) systems deployed in the wild. Despite their ubiquity, these real-world distribution shifts are under-represented in the datasets widely used in the ML community today. To address this gap, we present WILDS, a curated collection of 8 benchmark datasets that reflect a diverse range of distribution shifts which naturally arise in real-world applications, such as shifts across hospitals for tumor identification; across camera traps for wildlife monitoring; and across time and location in satellite imaging and poverty mapping. On each dataset, we show that standard training results in substantially lower out-of-distribution than in-distribution performance, and that this gap remains even with models trained by existing methods for handling distribution shifts. This underscores the need for new training methods that produce models which are more robust to the types of distribution shifts that arise in practice. To facilitate method development, we provide an open-source package that automates dataset loading, contains default model architectures and hyperparameters, and standardizes evaluations. Code and leaderboards are available at this https URL.
579 citations
TL;DR: This article provides a detailed review of the solutions above, summarizing both the technical novelties and empirical results, and compares the benefits and requirements of the surveyed methodologies and provides recommended solutions.
487 citations
27 Jul 2020
TL;DR: Deep transfer learning is used to quantify histopathological patterns across 17,355 hematoxylin and eosin-stained histopathology slide images from 28 cancer types and correlate these with matched genomic, transcriptomic and survival data, showing the remarkable potential of computer vision in characterizing the molecular basis of tumor histopathology.
Abstract: We use deep transfer learning to quantify histopathological patterns across 17,355 hematoxylin and eosin-stained histopathology slide images from 28 cancer types and correlate these with matched genomic, transcriptomic and survival data. This approach accurately classifies cancer types and provides spatially resolved tumor and normal tissue distinction. Automatically learned computational histopathological features correlate with a large range of recurrent genetic aberrations across cancer types. This includes whole-genome duplications, which display universal features across cancer types, individual chromosomal aneuploidies, focal amplifications and deletions, as well as driver gene mutations. There are widespread associations between bulk gene expression levels and histopathology, which reflect tumor composition and enable the localization of transcriptomically defined tumor-infiltrating lymphocytes. Computational histopathology augments prognosis based on histopathological subtyping and grading, and highlights prognostically relevant areas such as necrosis or lymphocytic aggregates. These findings show the remarkable potential of computer vision in characterizing the molecular basis of tumor histopathology. Two papers by Kather and colleagues and Gerstung and colleagues develop workflows to predict a wide range of molecular alterations from pan-cancer digital pathology slides.
307 citations
TL;DR: This paper presents a comprehensive review of state-of-the-art deep learning approaches used in histopathological image analysis, surveying over 130 papers.
260 citations
University of Illinois at Chicago1, Case Western Reserve University2, Indian Institute of Technology Bombay3, The Chinese University of Hong Kong4, Beijing University of Posts and Telecommunications5, Peking University6, University of Oklahoma7, University of Warwick8, Shanghai Jiao Tong University9, University of North Carolina at Chapel Hill10, Zhejiang University11, Sun Yat-sen University12, University of Hong Kong13, Medical University of Vienna14, Loughborough University15, Royal Institute of Technology16, Carnegie Mellon University17, University of Illinois at Urbana–Champaign18, Vietnam National University, Ho Chi Minh City19, Sejong University20, Indian Institute of Technology Madras21, University of California, Berkeley22, Hong Kong University of Science and Technology23, Islamic Azad University24, RWTH Aachen University25, University of Science and Technology of China26, University of Lübeck27, Agilent Technologies28, Shenzhen University29, Nanjing University of Science and Technology30, Tata Consultancy Services31, Korea University32, Polytechnic University of Valencia33, Old Dominion University34, Jadavpur University35, University of Castilla–La Mancha36, Cognizant37, Xiamen University38, Tongji University39
TL;DR: Several of the top techniques in the MoNuSeg 2018 challenge compared favorably to an individual human annotator and can be used with confidence for nuclear morphometrics. Color normalization and heavy data augmentation were among the trends that contributed to increased accuracy.
Abstract: Generalized nucleus segmentation techniques can contribute greatly to reducing the time to develop and validate visual biomarkers for new digital pathology datasets. We summarize the results of MoNuSeg 2018 Challenge whose objective was to develop generalizable nuclei segmentation techniques in digital pathology. The challenge was an official satellite event of the MICCAI 2018 conference in which 32 teams with more than 80 participants from geographically diverse institutes participated. Contestants were given a training set with 30 images from seven organs with annotations of 21,623 individual nuclei. A test dataset with 14 images taken from seven organs, including two organs that did not appear in the training set was released without annotations. Entries were evaluated based on average aggregated Jaccard index (AJI) on the test set to prioritize accurate instance segmentation as opposed to mere semantic segmentation. More than half the teams that completed the challenge outperformed a previous baseline. Among the trends observed that contributed to increased accuracy were the use of color normalization as well as heavy data augmentation. Additionally, fully convolutional networks inspired by variants of U-Net, FCN, and Mask-RCNN were popularly used, typically based on ResNet or VGG base architectures. Watershed segmentation on predicted semantic segmentation maps was a popular post-processing strategy. Several of the top techniques compared favorably to an individual human annotator and can be used with confidence for nuclear morphometrics.
251 citations
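The aggregated Jaccard index (AJI) used to rank entries in the challenge above can be sketched as follows. This is a minimal version that assumes each nucleus is given as a set of pixel coordinates; the function name and data layout are choices made for this example and do not come from the challenge's evaluation code. Each ground-truth nucleus is paired with the predicted nucleus of highest Jaccard overlap, intersections and unions are accumulated over all pairs, and predicted nuclei that were never paired are added to the union as a penalty.

```python
def aggregated_jaccard_index(gt_nuclei, pred_nuclei):
    """AJI for lists of nuclei, each nucleus a set of (row, col) pixels."""
    matched = set()
    intersection_sum = 0
    union_sum = 0
    for gt in gt_nuclei:
        best_index, best_iou = None, 0.0
        for index, pred in enumerate(pred_nuclei):
            overlap = len(gt & pred)
            if overlap == 0:
                continue
            iou = overlap / len(gt | pred)
            if iou > best_iou:
                best_index, best_iou = index, iou
        if best_index is None:
            # A missed nucleus adds its full area to the union.
            union_sum += len(gt)
        else:
            pred = pred_nuclei[best_index]
            intersection_sum += len(gt & pred)
            union_sum += len(gt | pred)
            matched.add(best_index)
    # Predictions never paired with a ground-truth nucleus are false positives.
    for index, pred in enumerate(pred_nuclei):
        if index not in matched:
            union_sum += len(pred)
    return intersection_sum / union_sum if union_sum else 0.0


ground_truth = [{(0, 0), (0, 1)}, {(5, 5)}]
prediction = [{(0, 1), (0, 2)}, {(9, 9)}]
score = aggregated_jaccard_index(ground_truth, prediction)
```

Because missed and spurious nuclei both enlarge the denominator, the score drops for instance-level errors that a plain pixel-wise Jaccard index would overlook, which is why the challenge preferred it over a purely semantic segmentation metric.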