Showing papers by "Veronika Cheplygina published in 2020"
••
TL;DR: Ten simple rules to help researchers who are planning to start their journey on Twitter take their first steps and advance their careers.
Abstract: Twitter is one of the most popular social media platforms, with over 320 million active users as of February 2019. Twitter users can enjoy free content delivered by other users whom they actively decide to follow. However, unlike in other areas where Twitter is used passively (e.g., to follow influential figures and/or information agencies), in science it can be used in a much more active, collaborative way: to ask for advice, to form new bonds and scientific collaborations, to announce jobs and find employees, and to find new mentors and positions. This is particularly important in the early stages of a scientific career, during which lack of collaboration or delayed access to information can have the most impact.
For these reasons, using Twitter appropriately [1] can be more than just a social media activity; it can be a real career incubator in which researchers can develop their professional circles, launch new research projects and receive help from the community at various stages of those projects. Twitter is a tool that facilitates decentralization in science: you are able to present yourself to the community, to develop your personal brand, to set up a dialogue with people inside and outside your research field and to create or join a professional environment in your field without mediators such as your direct boss.
This article is written by a group of researchers who have a strong feeling that they have personally benefited from using Twitter, both research-wise and network-wise. We (@DrVeronikaCH, @Felienne, @CaAl, @nbielczyk_neuro, @ionicasmeets) share our personal experience and advice in the form of ten simple rules, and we hope that this material will help a number of researchers who are planning to start their journey on Twitter to take their first steps and advance their careers using Twitter.
48 citations
••
01 Dec 2020
TL;DR: This survey reviews studies applying crowdsourcing to the analysis of medical images, published prior to July 2018, and identifies common approaches, challenges and considerations, providing guidance of utility to researchers adopting this approach.
Abstract: Rapid advances in image processing capabilities have been seen across many domains, fostered by the application of machine learning algorithms to "big-data". However, within the realm of medical image analysis, advances have been curtailed, in part, due to the limited availability of large-scale, well-annotated datasets. One of the main reasons for this is the high cost often associated with producing large amounts of high-quality meta-data. Recently, there has been growing interest in the application of crowdsourcing for this purpose; a technique that has proven effective for creating large-scale datasets across a range of disciplines, from computer vision to astrophysics. Despite the growing popularity of this approach, there has not yet been a comprehensive literature review to provide guidance to researchers considering using crowdsourcing methodologies in their own medical imaging analysis. In this survey, we review studies applying crowdsourcing to the analysis of medical images, published prior to July 2018. We identify common approaches, challenges and considerations, providing guidance of utility to researchers adopting this approach. Finally, we discuss future opportunities for development within this emerging domain.
42 citations
••
Heidelberg University1, Université de Montréal2, Queen Mary University of London3, Trinity College, Dublin4, University of Hong Kong5, ETH Zurich6, University of Zurich7, University of California, Los Angeles8, University of Southern California9, University of Michigan10, University of California, Berkeley11, Cairo University12, Harvard University13, Yale University14, MIND Institute15, University of Oxford16, Florey Institute of Neuroscience and Mental Health17, Cardiff University18, Florida International University19, Georgia Institute of Technology20, Eindhoven University of Technology21, University of São Paulo22, Leiden University Medical Center23, Nicolaus Copernicus University in Toruń24, University of New South Wales25, University of Birmingham26, Stanford University27, University of Münster28, Montreal Neurological Institute and Hospital29, McGill University30, University of Miami31, Massachusetts Eye and Ear Infirmary32, Max Planck Society33, University of Queensland34, Monash University35, Auburn University36, University of Electronic Science and Technology of China37
TL;DR: Early career researchers (ECRs) face a range of competing pressures in academia, so the Organization for Human Brain Mapping undertook a group effort to gather helpful advice for ECRs on self-management.
19 citations
•
04 Oct 2020
TL;DR: A survey of the MICCAI 2018 proceedings investigates common practice in medical image analysis applications and shows that unbiased features can be learned by explicitly using demographic variables in an adversarial training setup, leading to balanced scores per subgroup.
Abstract: One of the critical challenges in machine learning applications is to have fair predictions. There are numerous recent examples in various domains that convincingly show that algorithms trained with biased datasets can easily lead to erroneous or discriminatory conclusions. This is even more crucial in clinical applications, where predictive algorithms are designed mainly based on a limited or given set of medical images, while demographic variables such as age, sex and race are not taken into account. In this work, we conduct a survey of the MICCAI 2018 proceedings to investigate the common practice in medical image analysis applications. Surprisingly, we found that papers focusing on diagnosis rarely describe the demographics of the datasets used, and the diagnosis is purely based on images. In order to highlight the importance of considering demographics in diagnosis tasks, we used a publicly available dataset of skin lesions. We then demonstrate that a classifier with an overall area under the curve (AUC) of 0.83 has variable performance between 0.76 and 0.91 on subgroups based on age and sex, even though the training set was relatively balanced. Moreover, we show that it is possible to learn unbiased features by explicitly using demographic variables in an adversarial training setup, which leads to balanced scores per subgroup. Finally, we discuss the implications of these results and provide recommendations for further research.
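The subgroup analysis described above (overall AUC 0.83, but 0.76 to 0.91 across age and sex groups) amounts to computing the AUC separately per demographic group. A minimal stdlib sketch, with illustrative toy scores rather than the paper's data:

```python
def auc(scores, labels):
    """AUC via the Mann-Whitney U statistic: the fraction of
    positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need both classes to compute AUC")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def subgroup_auc(scores, labels, groups):
    """AUC computed separately for each demographic subgroup
    (e.g. age bins or sex), exposing gaps the overall AUC hides."""
    out = {}
    for g in set(groups):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        out[g] = auc([scores[i] for i in idx], [labels[i] for i in idx])
    return out
```

A model can look strong overall while one subgroup's AUC is much lower; reporting the per-group scores alongside the overall one makes that visible.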
13 citations
••
TL;DR: This work proposes to use the "wisdom of the crowds" – internet users without specific expertise – to improve the predictions of the algorithms, and will validate these methods on three challenging detection tasks in chest computed tomography, histopathology images, and endoscopy video.
Abstract: Machine learning (ML) has great potential for early diagnosis of disease from medical scans, and at times, has even been shown to outperform experts. However, ML algorithms need large amounts of annotated data – scans with outlined abnormalities – for good performance. The time-consuming annotation process limits the progress of ML in this field. To address the annotation problem, multiple instance learning (MIL) algorithms were proposed, which learn from scans that have been diagnosed, but not annotated in detail. Unfortunately, these algorithms are not good enough at predicting where the abnormalities are located, which is important for diagnosis and prognosis of disease. This limits the application of these algorithms in research and in clinical practice. I propose to use the “wisdom of the crowds” – internet users without specific expertise – to improve the predictions of the algorithms. While the crowd does not have experience with medical imaging, recent studies and pilot data I collected show they can still provide useful information about the images, for example by saying whether images are visually similar or not. Such information has not been leveraged before in medical imaging applications. I will validate these methods on three challenging detection tasks in chest computed tomography, histopathology images, and endoscopy video. Understanding how the crowd can contribute to applications that typically require expert knowledge will allow harnessing the potential of large unannotated sets of data, training more reliable algorithms, and ultimately paving the way towards using ML algorithms in clinical practice.
Keywords: machine learning, artificial intelligence, medical imaging, crowdsourcing, computer-aided diagnosis
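The multiple instance learning setup mentioned above can be sketched under its standard assumption: a scan (bag) is abnormal iff at least one of its patches (instances) is, so the bag score is the maximum instance score and the argmax gives a crude localisation. The scorer output and threshold below are placeholders, not the proposal's actual models:

```python
def mil_bag_predict(instance_scores, threshold=0.5):
    """Max-pooling MIL: predict the scan-level label from per-patch
    abnormality scores, and return the index of the most suspicious
    patch as a rough localisation of the finding."""
    best = max(range(len(instance_scores)),
               key=lambda i: instance_scores[i])
    return instance_scores[best] >= threshold, best
```

The weak localisation is exactly where such models struggle, which is the gap the crowd information above is meant to help close.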
3 citations
••
04 Oct 2020
TL;DR: This work investigates meta-learning for segmentation across ten datasets of different organs and modalities, proposing four ways to represent each dataset by meta-features – one based on statistical features of the images and three based on deep learning features – and using support vector regression and deep neural networks to predict model performance from them.
Abstract: Deep learning has led to state-of-the-art results for many medical imaging tasks, such as segmentation of different anatomical structures. With the increased numbers of deep learning publications and openly available code, the approach to choosing a model for a new task becomes more complicated, while time and (computational) resources are limited. A possible solution to choosing a model efficiently is meta-learning, a learning method in which prior performance of a model is used to predict the performance for new tasks. We investigate meta-learning for segmentation across ten datasets of different organs and modalities. We propose four ways to represent each dataset by meta-features: one based on statistical features of the images and three based on deep learning features. We use support vector regression and deep neural networks to learn the relationship between the meta-features and prior model performance. On three external test datasets these methods give Dice scores within 0.10 of the true performance. These results demonstrate the potential of meta-learning in medical imaging.
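The statistical variant of the meta-feature idea above can be sketched in stdlib Python: summarise each dataset's intensities by a few statistics (the mean/std/min/max below are illustrative; the paper does not fix them to these four), then predict performance on a new dataset from previously observed ones. A 1-nearest-neighbour lookup stands in for the paper's support vector regression and deep networks:

```python
import statistics

def meta_features(pixel_values):
    """Represent a whole dataset by simple intensity statistics."""
    return (statistics.fmean(pixel_values),
            statistics.pstdev(pixel_values),
            min(pixel_values),
            max(pixel_values))

def predict_dice(query, known):
    """Predict the Dice score on a new dataset as the score observed on
    the most similar known dataset (squared Euclidean distance in
    meta-feature space). `known` maps meta-feature tuples to Dice."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return known[min(known, key=lambda f: dist(f, query))]
```

The point of the sketch is the pipeline shape: a cheap per-dataset summary feeds a regressor, so candidate models can be ranked without training each one.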
2 citations
•
TL;DR: In this article, the authors focus on high-level priors, embedded at the loss function level, and categorize the articles according to the nature of the prior: object shape, size, topology, and inter-region constraints.
Abstract: Today, deep convolutional neural networks (CNNs) have demonstrated state-of-the-art performance for supervised medical image segmentation, across various imaging modalities and tasks. Despite early success, segmentation networks may still generate anatomically aberrant segmentations, with holes or inaccuracies near the object boundaries. To mitigate this effect, recent research works have focused on incorporating spatial information or prior knowledge to enforce anatomically plausible segmentation. While the integration of prior knowledge in image segmentation is not a new topic in classical optimization approaches, it is today an increasing trend in CNN-based image segmentation, as shown by the growing literature on the topic. In this survey, we focus on high-level priors embedded at the loss function level. We categorize the articles according to the nature of the prior: object shape, size, topology, and inter-region constraints. We highlight strengths and limitations of current approaches, discuss the challenges related to the design, integration and optimization of prior-based losses, and draw future research directions.
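As one concrete instance of the loss-level size priors the survey covers, a differentiable penalty can keep the soft size of the predicted foreground (the sum of per-pixel probabilities) inside a plausible anatomical range; the quadratic form and the bounds in the test are illustrative, not taken from any surveyed paper:

```python
def size_prior_loss(fg_probs, lower, upper):
    """Penalise a predicted foreground whose soft size falls outside
    [lower, upper]. Using the probability sum rather than a hard mask
    keeps the term differentiable w.r.t. the network output."""
    size = sum(fg_probs)
    return max(0.0, lower - size) ** 2 + max(0.0, size - upper) ** 2
```

Added to the segmentation loss with a weight, such a term discourages anatomically implausible outputs such as an empty mask or an organ far larger than expected.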
2 citations
•
28 Apr 2020
TL;DR: It is shown that multi-task models with individual crowdsourced features have limited effect on the model, but when combined in an ensemble lead to improved generalisation.
Abstract: Machine learning has a recognised need for large amounts of annotated data. Due to the high cost of expert annotations, crowdsourcing, where non-experts are asked to label or outline images, has been proposed as an alternative. Although many promising results are reported, the quality of diagnostic crowdsourced labels is still unclear. We propose to address this by instead asking the crowd about visual features of the images, which can be provided more intuitively, and by using these features in a multi-task learning framework through ensemble strategies. We compare our proposed approach to a baseline model with a set of 2000 skin lesions from the ISIC 2017 challenge dataset. The baseline model only predicts a binary label from the skin lesion image, while our multi-task model also predicts one of the following features: asymmetry of the lesion, border irregularity and color. We show that multi-task models with individual crowdsourced features have limited effect on the model, but when combined in an ensemble lead to improved generalisation. The area under the receiver operating characteristic curve is 0.794 for the baseline model and 0.811 and 0.808 for the two multi-task ensembles, respectively. Finally, we discuss the findings, identify some limitations and recommend directions for further research. The code of the models is available at this https URL.
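The multi-task setup above can be sketched as a joint loss per lesion (diagnosis plus one weighted crowdsourced auxiliary target) and an ensemble that averages the diagnosis probabilities of the per-feature models; the weight `lam` and the probabilities in the usage are illustrative, not values from the paper:

```python
import math

def bce(p, y, eps=1e-7):
    """Binary cross-entropy for a single prediction."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def multitask_loss(diag_p, diag_y, aux_p, aux_y, lam=0.5):
    """Diagnosis loss plus a weighted loss on one crowdsourced feature
    (asymmetry, border irregularity or colour)."""
    return bce(diag_p, diag_y) + lam * bce(aux_p, aux_y)

def ensemble_prob(per_model_probs):
    """Ensemble the per-feature multi-task models by averaging their
    melanoma probabilities."""
    return sum(per_model_probs) / len(per_model_probs)
```

Each auxiliary head acts as a regulariser on the shared representation; the averaging step is where the paper reports the generalisation gain appears.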
1 citation
•
TL;DR: This work suggests that a pre-trained feature extractor can be used as a primary tumor origin classifier for lung nodules, eliminating the need for elaborate fine-tuning of a new network and large datasets.
Abstract: Early detection of lung cancer has been proven to decrease mortality significantly. A recent development in computed tomography (CT), spectral CT, can potentially improve diagnostic accuracy, as it yields more information per scan than regular CT. However, the sheer workload involved with analyzing a large number of scans drives the need for automated diagnosis methods. Therefore, we propose a detection and classification system for lung nodules in CT scans. Furthermore, we want to observe whether spectral images can increase classifier performance. For the detection of nodules we trained a VGG-like 3D convolutional neural net (CNN). To obtain a primary tumor classifier for our dataset we pre-trained a 3D CNN with a similar architecture on nodule malignancies of a large publicly available dataset, the LIDC-IDRI dataset. Subsequently we used this pre-trained network as a feature extractor for the nodules in our dataset. The resulting feature vectors were classified into two (benign/malignant) and three (benign/primary lung cancer/metastases) classes using a support vector machine (SVM). This classification was performed at both nodule and scan level. We obtained state-of-the-art performance for detection and malignancy regression on the LIDC-IDRI database. Classification performance on our own dataset was higher for scan- than for nodule-level predictions. For the three-class scan-level classification we obtained an accuracy of 78%. Spectral features did increase classifier performance, but not significantly. Our work suggests that a pre-trained feature extractor can be used as a primary tumor origin classifier for lung nodules, eliminating the need for elaborate fine-tuning of a new network and large datasets. Code is available at this https URL.
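The transfer-learning recipe above (freeze a network pre-trained on LIDC-IDRI, then classify its feature vectors) can be sketched with a small linear classifier trained on frozen features; a stdlib perceptron stands in for the paper's SVM, and the toy feature vectors in the usage are placeholders for real CNN embeddings:

```python
def train_linear(features, labels, epochs=20, lr=0.1):
    """Train only a small linear classifier (perceptron) on top of frozen
    feature vectors; the pre-trained extractor itself is never updated.
    Labels are -1 (benign) / +1 (malignant)."""
    w, b = [0.0] * len(features[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            # Perceptron rule: update only on a misclassified example.
            if y * (sum(wi * xi for wi, xi in zip(w, x)) + b) <= 0:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def predict_label(w, b, x):
    """Classify one frozen-feature vector as -1 or +1."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
```

Because only this thin classifier is trained, the approach needs far less data and compute than fine-tuning the whole network, which is the point the abstract makes.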