scispace - formally typeset
Search or ask a question

Showing papers by "Veronika Cheplygina published in 2022"


Journal ArticleDOI
TL;DR: In this article , the authors review roadblocks to developing and assessing methods in computer analysis of medical images and provide recommendations on how to further address these problems in the future, and also discuss on-going efforts to counteract these problems.
Abstract: Research in computer analysis of medical images bears many promises to improve patients' health. However, a number of systematic challenges are slowing down the progress of the field, from limitations of the data, such as biases, to research incentives, such as optimizing for publication. In this paper we review roadblocks to developing and assessing methods. Building our analysis on evidence from the literature and data challenges, we show that at every step, potential biases can creep in. On a positive note, we also discuss on-going efforts to counteract these problems. Finally we provide recommendations on how to further address these problems in the future.

114 citations


Journal ArticleDOI
TL;DR: The Metrics Reloaded framework was developed in a multi-stage Delphi process and is based on the novel concept of a problem fingerprint – a structured representation of the given problem that captures all aspects that are relevant for metric selection from the domain interest to the properties of the target structure(s), data set and algorithm output.
Abstract: Increasing evidence shows that flaws in machine learning (ML) algorithm validation are an underestimated global problem. Particularly in automatic biomedical image analysis, chosen performance metrics often do not reflect the domain interest, thus failing to adequately measure scientific progress and hindering translation of ML techniques into practice. To overcome this, a large international expert consortium createdMetrics Reloaded, a comprehensive framework guiding researchers towards choosing metrics in a problem-aware manner. Following the convergence of ML methodology across application domains,Metrics Reloaded fosters the convergence of validation methodology. The framework was developed in a multi-stage Delphi process and is based on the novel concept of a problem fingerprint – a structured representation of the given problem that captures all aspects that are relevant for metric selection from the domain interest to the properties of the target structure(s), data set and algorithm output. Metrics Reloaded targets image analysis problems that can be interpreted as a classification task at image, object or pixel level, namely image-level classification, object detection, semantic segmentation, and instance segmentation tasks. Users are guided through the process of selecting and applying appropriate validation metrics while being made aware of potential pitfalls. To improve the user experience, we implemented the framework in the Metrics Reloaded online tool, which also provides a common point of access to explore weaknesses and strengths of the most common validation metrics. An instantiation of the framework for various biological and medical image analysis use cases demonstrates its broad applicability across domains.

46 citations


Journal ArticleDOI
TL;DR: In this paper , only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%), while 48% of respondents applied postprocessing steps.
Abstract: participants and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.

6 citations


Proceedings ArticleDOI
07 Mar 2022
TL;DR: This paper focuses on rolling-elements bearings and proposes a framework for predicting their degradation stages automatically using a k-means bearing lifetime segmentation method based on high-frequency bearing vibration signal embedded in a latent low-dimensional subspace using an AutoEncoder.
Abstract: In the pharmaceutical industry, the maintenance of production machines must be audited by the regulator. In this context, the problem of predictive maintenance is not when to maintain a machine, but what parts to maintain at a given point in time. The focus shifts from the entire machine to its component parts and prediction becomes a classification problem. In this paper, we focus on rolling-elements bearings and we propose a framework for predicting their degradation stages automatically. Our main contribution is a k-means bearing lifetime segmentation method based on high-frequency bearing vibration signal embedded in a latent low-dimensional subspace using an AutoEncoder. Given high-frequency vibration data, our framework generates a labeled dataset that is used to train a supervised model for bearing degradation stage detection. Our experimental results, based on the publicly available FEMTO Bearing run-to-failure dataset, show that our framework is scalable and that it provides reliable and actionable predictions for a range of different bearings.

2 citations


03 Jun 2022
TL;DR: Metrics Reloaded as discussed by the authors is a comprehensive framework to guide researchers in the problem-aware selection of metrics, based on the concept of a problem fingerprint, which captures all aspects that are relevant for metric selection, from the domain interest to the properties of the target structure(s), data set and algorithm output.
Abstract: Increasing evidence shows that flaws in machine learning (ML) algorithm validation are an underestimated global problem. Particularly in automatic biomedical image analysis, chosen performance metrics often do not reflect the domain interest, thus failing to adequately measure scientific progress and hindering translation of ML techniques into practice. To overcome this, our large international expert consortium created Metrics Reloaded, a comprehensive framework guiding researchers in the problem-aware selection of metrics. Following the convergence of ML methodology across application domains, Metrics Reloaded fosters the convergence of validation methodology. The framework was developed in a multi-stage Delphi process and is based on the novel concept of a problem fingerprint - a structured representation of the given problem that captures all aspects that are relevant for metric selection, from the domain interest to the properties of the target structure(s), data set and algorithm output. Based on the problem fingerprint, users are guided through the process of choosing and applying appropriate validation metrics while being made aware of potential pitfalls. Metrics Reloaded targets image analysis problems that can be interpreted as a classification task at image, object or pixel level, namely image-level classification, object detection, semantic segmentation, and instance segmentation tasks. To improve the user experience, we implemented the framework in the Metrics Reloaded online tool, which also provides a point of access to explore weaknesses, strengths and specific recommendations for the most common validation metrics. The broad applicability of our framework across domains is demonstrated by an instantiation for various biological and medical image analysis use cases.

2 citations


Journal ArticleDOI
TL;DR: In this article , the authors provide several strategies for learning from and dealing with failure instead of ignoring it, while still taking into account individual differences between academics, and these simple rules allow academics to further develop their own strategies for failing successfully.
Abstract: Failure is an integral part of life and by extension academia. At the same time, failure is often ignored, with potentially negative consequences both for the science and the scientists involved. This article provides several strategies for learning from and dealing with failure instead of ignoring it. Hopefully, our recommendations are widely applicable, while still taking into account individual differences between academics. These simple rules allow academics to further develop their own strategies for failing successfully in academia.

1 citations


Journal ArticleDOI
TL;DR: In this article , the authors present a case study on chest X-rays using two publicly available datasets and share annotations for a subset of pneumothorax images with drains, and conclude with general recommendations for medical image classification.
Abstract: The availability of large public datasets and the increased amount of computing power have shifted the interest of the medical community to high-performance algorithms. How-ever, little attention is paid to the quality of the data and their annotations. High performance on benchmark datasets may be reported without considering possible shortcuts or artifacts in the data, besides, models are not tested on sub-population groups. With this work, we aim to raise awareness about shortcuts problems. We validate previous findings, and present a case study on chest X-rays using two publicly available datasets. We share annotations for a subset of pneumothorax images with drains. We conclude with general recommendations for medical image classification. We make our code available 1 .

Journal ArticleDOI
TL;DR: An evaluation metric for data containing Japanese written media and annotations of furigana is proposed which is similar to the evaluation protocols used in object detection except that it allows groups of objects to be labeled by one annotation.
Abstract: — Furigana are pronunciation notes used in Japanese writing. Being able to detect these can help improve optical character recognition (OCR) performance or make more accurate digital copies of Japanese written media by correctly displaying furigana . This project focuses on detecting furigana in Japanese books and comics. While there has been research into the detection of Japanese text in general, there are currently no proposed methods for detecting furigana . We construct a new dataset containing Japanese written media and annotations of furigana . We propose an evaluation metric for such data which is similar to the evaluation protocols used in object detection except that it allows groups of objects to be labeled by one annotation. We propose a method for detection of furigana that is based on mathematical morphology and connected component analysis. We evaluate the detections of the dataset and compare different methods for text extraction. We also evaluate different types of images such as books and comics individually and discuss the challenges of each type of image. The proposed method reaches an F1-score of 76% on the dataset. The method performs well on regular books, but less so on comics, and books of irregular format. Finally, we show that the proposed method can improve the performance of OCR by 5% on the manga109 dataset. Source code is available via https://github.com/nikolajkb/ FuriganaDetection 1

Journal Article
TL;DR: A benchmark of recent prior-based losses for medical image segmentation is established to provide intuition onto which losses to choose given a particular task or dataset, based on dataset characteristics and properties.
Abstract: Today, deep convolutional neural networks (CNNs) have demonstrated state-of-the-art performance for medical image segmentation, on various imaging modalities and tasks. Despite early success, segmentation networks may still generate anatomically aberrant segmentations, with holes or inaccuracies near the object boundaries. To enforce anatomical plausibility, recent research studies have focused on incorporating prior knowledge such as object shape or boundary, as constraints in the loss function. Prior integrated could be low-level referring to reformulated representations extracted from the ground-truth segmentations, or high-level representing external medical information such as the organ's shape or size. Over the past few years, prior-based losses exhibited a rising interest in the research field since they allow integration of expert knowledge while still being architecture-agnostic. However, given the diversity of prior-based losses on different medical imaging challenges and tasks, it has become hard to identify what loss works best for which dataset. In this paper, we establish a benchmark of recent prior-based losses for medical image segmentation. The main objective is to provide intuition onto which losses to choose given a particular task or dataset. To this end, four low-level and high-level prior-based losses are selected. The considered losses are validated on 8 different datasets from a variety of medical image segmentation challenges including the Decathlon, the ISLES and the WMH challenge. Results show that whereas low-level prior-based losses can guarantee an increase in performance over the Dice loss baseline regardless of the dataset characteristics, high-level prior-based losses can increase anatomical plausibility as per data characteristics.