Author

Valeriia Cherepanova

Other affiliations: University College London
Bio: Valeriia Cherepanova is an academic researcher from University of Maryland, College Park. The author has contributed to research on topics including facial recognition systems and meta-learning (computer science). The author has an h-index of 6 and has co-authored 12 publications receiving 84 citations. Previous affiliations of Valeriia Cherepanova include University College London.

Papers
Posted Content
TL;DR: It is found that strong data augmentations, such as mixup and CutMix, can significantly diminish the threat of poisoning and backdoor attacks without trading off performance.
Abstract: Data poisoning and backdoor attacks manipulate victim models by maliciously modifying training data. In light of this growing threat, a recent survey of industry professionals revealed heightened fear in the private sector regarding data poisoning. Many previous defenses against poisoning either fail in the face of increasingly strong attacks, or they significantly degrade performance. However, we find that strong data augmentations, such as mixup and CutMix, can significantly diminish the threat of poisoning and backdoor attacks without trading off performance. We further verify the effectiveness of this simple defense against adaptive poisoning methods, and we compare to baselines including the popular differentially private SGD (DP-SGD) defense. In the context of backdoors, CutMix greatly mitigates the attack while simultaneously increasing validation accuracy by 9%.

49 citations
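The mixup defense described in the abstract above blends pairs of training examples (and their labels) with a random coefficient, which dilutes any poison signal carried by a single example. A minimal illustrative sketch, not the authors' implementation:

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=1.0, rng=None):
    """Convexly blend two training examples; a poisoned example mixed
    with a clean one dilutes the poison signal."""
    rng = rng or np.random.default_rng()
    lam = float(rng.beta(alpha, alpha))   # mixing coefficient in [0, 1]
    x = lam * x1 + (1.0 - lam) * x2
    y = lam * y1 + (1.0 - lam) * y2      # labels as one-hot vectors
    return x, y, lam

# Toy usage: blend a "poisoned" image with a clean one
rng = np.random.default_rng(0)
x_a, y_a = np.ones((8, 8)), np.array([1.0, 0.0])
x_b, y_b = np.zeros((8, 8)), np.array([0.0, 1.0])
x_mix, y_mix, lam = mixup(x_a, y_a, x_b, y_b, rng=rng)
```

CutMix works analogously but pastes a rectangular crop of one image onto the other instead of blending pixel intensities.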

Proceedings Article
12 Jul 2020
TL;DR: A better understanding of the underlying mechanics of meta-learning is developed and a regularizer is developed which boosts the performance of standard training routines for few-shot classification.
Abstract: Meta-learning algorithms produce feature extractors which achieve state-of-the-art performance on few-shot classification. While the literature is rich with meta-learning methods, little is known about why the resulting feature extractors perform so well. We develop a better understanding of the underlying mechanics of meta-learning and the difference between models trained using meta-learning and models which are trained classically. In doing so, we introduce and verify several hypotheses for why meta-learned models perform better. Furthermore, we develop a regularizer which boosts the performance of standard training routines for few-shot classification. In many cases, our routine outperforms meta-learning while simultaneously running an order of magnitude faster.

34 citations

Journal ArticleDOI
TL;DR: In this article, the authors used deep learning to classify images of rapid human immunodeficiency virus (HIV) tests acquired in rural South Africa using newly developed image capture protocols with the Samsung SM-P585 tablet.
Abstract: Although deep learning algorithms show increasing promise for disease diagnosis, their use with rapid diagnostic tests performed in the field has not been extensively tested. Here we use deep learning to classify images of rapid human immunodeficiency virus (HIV) tests acquired in rural South Africa. Using newly developed image capture protocols with the Samsung SM-P585 tablet, 60 fieldworkers routinely collected images of HIV lateral flow tests. From a library of 11,374 images, deep learning algorithms were trained to classify tests as positive or negative. A pilot field study of the algorithms deployed as a mobile application demonstrated high levels of sensitivity (97.8%) and specificity (100%) compared with traditional visual interpretation by humans—experienced nurses and newly trained community health worker staff—and reduced the number of false positives and false negatives. Our findings lay the foundations for a new paradigm of deep learning–enabled diagnostics in low- and middle-income countries, termed REASSURED diagnostics1, an acronym for real-time connectivity, ease of specimen collection, affordable, sensitive, specific, user-friendly, rapid, equipment-free and deliverable. Such diagnostics have the potential to provide a platform for workforce training, quality assurance, decision support and mobile connectivity to inform disease control strategies, strengthen healthcare system efficiency and improve patient outcomes and outbreak management in emerging infections. In a pilot field study conducted in rural South Africa, deep learning algorithms can accurately classify rapid HIV tests as positive or negative, highlighting the potential of deep learning–enabled diagnostics for use in low- and middle-income countries.

27 citations

Journal ArticleDOI
TL;DR: PancRISK score enables easy interpretation of the biomarker panel data and is currently being tested to confirm that it can be used for stratification of patients at risk of developing pancreatic cancer completely non-invasively, using urine samples.
Abstract: An accurate and simple risk prediction model that would facilitate earlier detection of pancreatic adenocarcinoma (PDAC) is not available at present. In this study, we compare different algorithms of risk prediction in order to select the best one for constructing a biomarker-based risk score, PancRISK. Three hundred and seventy-nine patients with available measurements of three urine biomarkers, (LYVE1, REG1B and TFF1) using retrospectively collected samples, as well as creatinine and age, were randomly split into training and validation sets, following stratification into cases (PDAC) and controls (healthy patients). Several machine learning algorithms were used, and their performance characteristics were compared. The latter included AUC (area under ROC curve) and sensitivity at clinically relevant specificity. None of the algorithms significantly outperformed all others. A logistic regression model, the easiest to interpret, was incorporated into a PancRISK score and subsequently evaluated on the whole data set. The PancRISK performance could be even further improved when CA19-9, commonly used PDAC biomarker, is added to the model. PancRISK score enables easy interpretation of the biomarker panel data and is currently being tested to confirm that it can be used for stratification of patients at risk of developing pancreatic cancer completely non-invasively, using urine samples.

27 citations
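The PancRISK score above is built from a logistic regression over the three urine biomarkers plus creatinine and age. The sketch below shows the general shape of such a score with hypothetical coefficients; the published PancRISK weights are not reproduced here.

```python
import math

# Hypothetical coefficients for illustration only -- NOT the
# published PancRISK weights.
COEF = {"LYVE1": 0.8, "REG1B": 0.5, "TFF1": 0.4, "creatinine": -0.3, "age": 0.02}
INTERCEPT = -4.0

def panc_risk(features):
    """Logistic-regression risk score: sigmoid of a weighted sum of
    urine biomarkers, creatinine and age."""
    z = INTERCEPT + sum(COEF[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))  # score in (0, 1)
```

A score like this is easy to interpret clinically because each coefficient directly states how much a unit change in one biomarker shifts the log-odds of PDAC.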

Proceedings ArticleDOI
06 Jun 2021
TL;DR: In this paper, the authors show that strong data augmentations, such as mixup and CutMix, can significantly diminish the threat of poisoning and backdoor attacks without trading off performance; they further verify the effectiveness of this simple defense against adaptive poisoning methods and compare it to baselines, including the popular differentially private SGD (DP-SGD) defense.
Abstract: Data poisoning and backdoor attacks manipulate victim models by maliciously modifying training data. In light of this growing threat, a recent survey of industry professionals revealed heightened fear in the private sector regarding data poisoning. Many previous defenses against poisoning either fail in the face of increasingly strong attacks, or they significantly degrade performance. However, we find that strong data augmentations, such as mixup and CutMix, can significantly diminish the threat of poisoning and backdoor attacks without trading off performance. We further verify the effectiveness of this simple defense against adaptive poisoning methods, and we compare to baselines including the popular differentially private SGD (DP-SGD) defense. In the context of backdoors, CutMix greatly mitigates the attack while simultaneously increasing validation accuracy by 9%.

22 citations


Cited by
Posted Content
TL;DR: FedProx as discussed by the authors is a generalization and re-parametrization of FedAvg, which is the state-of-the-art method for federated learning.
Abstract: Federated Learning is a distributed learning paradigm with two key challenges that differentiate it from traditional distributed optimization: (1) significant variability in terms of the systems characteristics on each device in the network (systems heterogeneity), and (2) non-identically distributed data across the network (statistical heterogeneity). In this work, we introduce a framework, FedProx, to tackle heterogeneity in federated networks. FedProx can be viewed as a generalization and re-parametrization of FedAvg, the current state-of-the-art method for federated learning. While this re-parameterization makes only minor modifications to the method itself, these modifications have important ramifications both in theory and in practice. Theoretically, we provide convergence guarantees for our framework when learning over data from non-identical distributions (statistical heterogeneity), and while adhering to device-level systems constraints by allowing each participating device to perform a variable amount of work (systems heterogeneity). Practically, we demonstrate that FedProx allows for more robust convergence than FedAvg across a suite of realistic federated datasets. In particular, in highly heterogeneous settings, FedProx demonstrates significantly more stable and accurate convergence behavior relative to FedAvg---improving absolute test accuracy by 22% on average.

490 citations
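The core of FedProx described above is a proximal term added to each client's local objective, F_k(w) + (mu/2)·||w − w_global||², which keeps local updates from drifting too far from the global model. A toy sketch with a quadratic local loss (illustrative, not the paper's code):

```python
import numpy as np

def fedprox_grad(w, w_global, grad_loss, mu):
    """Gradient of the FedProx local objective:
    F_k(w) + (mu/2) * ||w - w_global||^2."""
    return grad_loss(w) + mu * (w - w_global)

# Toy quadratic local loss F_k(w) = 0.5 * ||w - c||^2 (hypothetical)
c = np.array([3.0, -1.0])
grad_loss = lambda w: w - c
w_global = np.zeros(2)

w = w_global.copy()
for _ in range(200):
    w -= 0.1 * fedprox_grad(w, w_global, grad_loss, mu=1.0)
# The proximal term pulls the local optimum from c toward w_global:
# the minimizer is (c + mu * w_global) / (1 + mu) = c / 2.
```

With mu = 0 this reduces to plain local SGD as in FedAvg; larger mu trades local fit for stability across heterogeneous clients.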

Posted Content
TL;DR: This paper summarizes and categorizes existing backdoor attacks and defenses based on their characteristics, and provides a unified framework for analyzing poisoning-based backdoor attacks.
Abstract: Backdoor attack intends to embed hidden backdoor into deep neural networks (DNNs), such that the attacked model performs well on benign samples, whereas its prediction will be maliciously changed if the hidden backdoor is activated by the attacker-defined trigger. This threat could happen when the training process is not fully controlled, such as training on third-party datasets or adopting third-party models, which poses a new and realistic threat. Although backdoor learning is an emerging and rapidly growing research area, a systematic review of it has been lacking. In this paper, we present the first comprehensive survey of this realm. We summarize and categorize existing backdoor attacks and defenses based on their characteristics, and provide a unified framework for analyzing poisoning-based backdoor attacks. Besides, we also analyze the relation between backdoor attacks and relevant fields ($i.e.,$ adversarial attacks and data poisoning), and summarize widely adopted benchmark datasets. Finally, we briefly outline certain future research directions relying upon reviewed works. A curated list of backdoor-related resources is also available at \url{this https URL}.

260 citations

Posted Content
TL;DR: This work revisits existing backdoor triggers from a frequency perspective and performs a comprehensive analysis, which shows that many current backdoor attacks exhibit severe high-frequency artifacts, which persist across different datasets and resolutions.
Abstract: Backdoor attacks have been considered a severe security threat to deep learning. Such attacks can make models perform abnormally on inputs with predefined triggers and still retain state-of-the-art performance on clean data. While backdoor attacks have been thoroughly investigated in the image domain from both attackers' and defenders' sides, an analysis in the frequency domain has been missing thus far. This paper first revisits existing backdoor triggers from a frequency perspective and performs a comprehensive analysis. Our results show that many current backdoor attacks exhibit severe high-frequency artifacts, which persist across different datasets and resolutions. We further demonstrate these high-frequency artifacts enable a simple way to detect existing backdoor triggers at a detection rate of 98.50% without prior knowledge of the attack details and the target model. Acknowledging previous attacks' weaknesses, we propose a practical way to create smooth backdoor triggers without high-frequency artifacts and study their detectability. We show that existing defense works can benefit by incorporating these smooth triggers into their design consideration. Moreover, we show that the detector tuned over stronger smooth triggers can generalize well to unseen weak smooth triggers. In short, our work emphasizes the importance of considering frequency analysis when designing both backdoor attacks and defenses in deep learning.

78 citations
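The high-frequency artifacts described above can be measured, for example, as the fraction of an image's spectral energy beyond a radial frequency cutoff. The sketch below uses a 2-D FFT as a rough stand-in for the paper's frequency analysis; the cutoff and images are illustrative assumptions:

```python
import numpy as np

def high_freq_ratio(img, cutoff=0.25):
    """Fraction of spectral energy beyond a radial cutoff frequency
    (a rough proxy for high-frequency trigger artifacts)."""
    F = np.fft.fftshift(np.fft.fft2(img))
    power = np.abs(F) ** 2
    h, w = img.shape
    yy, xx = np.mgrid[0:h, 0:w]
    r = np.hypot(yy - h / 2, xx - w / 2) / (min(h, w) / 2)
    return power[r > cutoff].sum() / power.sum()

smooth = np.outer(np.hanning(32), np.hanning(32))  # smooth, low-frequency image
patched = smooth.copy()
patched[0:4, 0:4] = 1.0                            # sharp patch-style trigger
# the sharp patch concentrates extra energy at high frequencies
```

A sharp patch trigger raises this ratio markedly relative to the smooth image, which is the kind of signature the paper's detector exploits.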

Proceedings Article
05 Feb 2022
TL;DR: This work proposes a novel backdoor defense via decoupling the original end-to-end training process into three stages, and reveals that poisoned samples tend to cluster together in the feature space of the attacked DNN model, which is mostly due to the endto- end supervised training paradigm.
Abstract: Recent studies have revealed that deep neural networks (DNNs) are vulnerable to backdoor attacks, where attackers embed hidden backdoors in the DNN model by poisoning a few training samples. The attacked model behaves normally on benign samples, whereas its prediction will be maliciously changed when the backdoor is activated. We reveal that poisoned samples tend to cluster together in the feature space of the attacked DNN model, which is mostly due to the end-to-end supervised training paradigm. Inspired by this observation, we propose a novel backdoor defense via decoupling the original end-to-end training process into three stages. Specifically, we first learn the backbone of a DNN model via \emph{self-supervised learning} based on training samples without their labels. The learned backbone will map samples with the same ground-truth label to similar locations in the feature space. Then, we freeze the parameters of the learned backbone and train the remaining fully connected layers via standard training with all (labeled) training samples. Lastly, to further alleviate side-effects of poisoned samples in the second stage, we remove labels of some `low-credible' samples determined based on the learned model and conduct a \emph{semi-supervised fine-tuning} of the whole model. Extensive experiments on multiple benchmark datasets and DNN models verify that the proposed defense is effective in reducing backdoor threats while preserving high accuracy in predicting benign samples. Our code is available at \url{https://github.com/SCLBD/DBD}.

56 citations

Journal ArticleDOI
08 Mar 2022-Gut
TL;DR: Faecal metagenomic classifiers performed much better than saliva-based classifiers and identified patients with PDAC with an accuracy of up to 0.84 area under the receiver operating characteristic curve (AUROC) based on a set of 27 microbial species, with consistent accuracy across early and late disease stages.
Abstract: Background Recent evidence suggests a role for the microbiome in pancreatic ductal adenocarcinoma (PDAC) aetiology and progression. Objective To explore the faecal and salivary microbiota as potential diagnostic biomarkers. Methods We applied shotgun metagenomic and 16S rRNA amplicon sequencing to samples from a Spanish case–control study (n=136), including 57 cases, 50 controls, and 29 patients with chronic pancreatitis in the discovery phase, and from a German case–control study (n=76), in the validation phase. Results Faecal metagenomic classifiers performed much better than saliva-based classifiers and identified patients with PDAC with an accuracy of up to 0.84 area under the receiver operating characteristic curve (AUROC) based on a set of 27 microbial species, with consistent accuracy across early and late disease stages. Performance further improved to up to 0.94 AUROC when we combined our microbiome-based predictions with serum levels of carbohydrate antigen (CA) 19–9, the only current non-invasive, Food and Drug Administration approved, low specificity PDAC diagnostic biomarker. Furthermore, a microbiota-based classification model confined to PDAC-enriched species was highly disease-specific when validated against 25 publicly available metagenomic study populations for various health conditions (n=5792). Both microbiome-based models had a high prediction accuracy on a German validation population (n=76). Several faecal PDAC marker species were detectable in pancreatic tumour and non-tumour tissue using 16S rRNA sequencing and fluorescence in situ hybridisation. Conclusion Taken together, our results indicate that non-invasive, robust and specific faecal microbiota-based screening for the early detection of PDAC is feasible.

56 citations
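The AUROC figures reported above have a simple probabilistic reading: the chance that a randomly chosen case (PDAC patient) receives a higher classifier score than a randomly chosen control. A minimal sketch of that computation, for illustration:

```python
def auroc(scores_pos, scores_neg):
    """AUROC as the probability that a random positive outranks a
    random negative; ties count as half a win."""
    wins = sum((p > n) + 0.5 * (p == n)
               for p in scores_pos
               for n in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))
```

Under this reading, the reported 0.84 AUROC means a PDAC case outranks a control about 84% of the time, and 0.5 would correspond to a classifier no better than chance.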