scispace - formally typeset
Search or ask a question
Author

Yinzhi Cao

Other affiliations: Cornell University, Lehigh University, Northwestern University  ...read more
Bio: Yinzhi Cao is an academic researcher from Johns Hopkins University. The author has contributed to research in topics: Computer science & JavaScript. The author has an hindex of 19, co-authored 48 publications receiving 2156 citations. Previous affiliations of Yinzhi Cao include Cornell University & Lehigh University.


Papers
More filters
Proceedings ArticleDOI
14 Oct 2017
TL;DR: DeepXplore efficiently finds thousands of incorrect corner case behaviors in state-of-the-art DL models with thousands of neurons trained on five popular datasets including ImageNet and Udacity self-driving challenge data.
Abstract: Deep learning (DL) systems are increasingly deployed in safety- and security-critical domains including self-driving cars and malware detection, where the correctness and predictability of a system's behavior for corner case inputs are of great importance Existing DL testing depends heavily on manually labeled data and therefore often fails to expose erroneous behaviors for rare inputs We design, implement, and evaluate DeepXplore, the first whitebox framework for systematically testing real-world DL systems First, we introduce neuron coverage for systematically measuring the parts of a DL system exercised by test inputs Next, we leverage multiple DL systems with similar functionality as cross-referencing oracles to avoid manual checking Finally, we demonstrate how finding inputs for DL systems that both trigger many differential behaviors and achieve high neuron coverage can be represented as a joint optimization problem and solved efficiently using gradient-based search techniques DeepXplore efficiently finds thousands of incorrect corner case behaviors (eg, self-driving cars crashing into guard rails and malware masquerading as benign software) in state-of-the-art DL models with thousands of neurons trained on five popular datasets including ImageNet and Udacity self-driving challenge data For all tested DL models, on average, DeepXplore generated one test input demonstrating incorrect behavior within one second while running only on a commodity laptop We further show that the test inputs generated by DeepXplore can also be used to retrain the corresponding DL model to improve the model's accuracy by up to 3%

884 citations

Proceedings ArticleDOI
TL;DR: DeepXplore as discussed by the authors is a white box framework for systematically testing real-world deep learning (DL) systems, which leverages multiple DL systems with similar functionality as cross-referencing oracles to avoid manual checking.
Abstract: Deep learning (DL) systems are increasingly deployed in safety- and security-critical domains including self-driving cars and malware detection, where the correctness and predictability of a system's behavior for corner case inputs are of great importance. Existing DL testing depends heavily on manually labeled data and therefore often fails to expose erroneous behaviors for rare inputs. We design, implement, and evaluate DeepXplore, the first whitebox framework for systematically testing real-world DL systems. First, we introduce neuron coverage for systematically measuring the parts of a DL system exercised by test inputs. Next, we leverage multiple DL systems with similar functionality as cross-referencing oracles to avoid manual checking. Finally, we demonstrate how finding inputs for DL systems that both trigger many differential behaviors and achieve high neuron coverage can be represented as a joint optimization problem and solved efficiently using gradient-based search techniques. DeepXplore efficiently finds thousands of incorrect corner case behaviors (e.g., self-driving cars crashing into guard rails and malware masquerading as benign software) in state-of-the-art DL models with thousands of neurons trained on five popular datasets including ImageNet and Udacity self-driving challenge data. For all tested DL models, on average, DeepXplore generated one test input demonstrating incorrect behavior within one second while running only on a commodity laptop. We further show that the test inputs generated by DeepXplore can also be used to retrain the corresponding DL model to improve the model's accuracy by up to 3%.

651 citations

Proceedings ArticleDOI
17 May 2015
TL;DR: This paper presents a general, efficient unlearning approach by transforming learning algorithms used by a system into a summation form, and applies to all stages of machine learning, including feature selection and modeling.
Abstract: Today's systems produce a rapidly exploding amount of data, and the data further derives more data, forming a complex data propagation network that we call the data's lineage. There are many reasons that users want systems to forget certain data including its lineage. From a privacy perspective, users who become concerned with new privacy risks of a system often want the system to forget their data and lineage. From a security perspective, if an attacker pollutes an anomaly detector by injecting manually crafted data into the training data set, the detector must forget the injected data to regain security. From a usability perspective, a user can remove noise and incorrect entries so that a recommendation engine gives useful recommendations. Therefore, we envision forgetting systems, capable of forgetting certain data and their lineages, completely and quickly. This paper focuses on making learning systems forget, the process of which we call machine unlearning, or simply unlearning. We present a general, efficient unlearning approach by transforming learning algorithms used by a system into a summation form. To forget a training data sample, our approach simply updates a small number of summations -- asymptotically faster than retraining from scratch. Our approach is general, because the summation form is from the statistical query learning in which many machine learning algorithms can be implemented. Our approach also applies to all stages of machine learning, including feature selection and modeling. Our evaluation, on four diverse learning systems and real-world workloads, shows that our approach is general, effective, fast, and easy to use.

267 citations

Proceedings ArticleDOI
01 Jan 2015
TL;DR: The sheer number of mobile applications prompted researchers from academia and industry to develop static analysis techniques that scrutinize these applications for vulnerabilities and malicious functionality.
Abstract: Android users can choose from over one million applications (apps) offered through the official Google Play marketplace Furthermore, a wealth of alternative sources for Android applications is available for users to choose from These range from curated stores, such as Amazon’s Appstore to less legitimate sources that offer pirated content The sheer number of mobile applications prompted researchers from academia and industry to develop static analysis techniques that scrutinize these applications for vulnerabilities and malicious functionality Android applications always execute in the context of the Android framework — a comprehensive collection of functionality that developers can conveniently use from their applications The prolific use of the framework poses unique challenges for the analysis of Android applications

186 citations

Proceedings ArticleDOI
01 Jan 2017
TL;DR: This paper proposes a browser fingerprinting technique that can track users not only within a single browser but also across different browsers on the same machine, and can achieve higher uniqueness rate than the only cross-browser approach in the literature with similar stability.
Abstract: In this paper, we propose a browser fingerprinting technique that can track users not only within a single browser but also across different browsers on the same machine. Specifically, our approach utilizes many novel OS and hardware level features, such as those from graphics cards, CPU, and installed writing scripts. We extract these features by asking browsers to perform tasks that rely on corresponding OS and hardware functionalities. Our evaluation shows that our approach can successfully identify 99.24% of users as opposed to 90.84% for state of the art on single-browser fingerprinting against the same dataset. Further, our approach can achieve higher uniqueness rate than the only cross-browser approach in the literature with similar stability.

163 citations


Cited by
More filters
01 Jan 2009
TL;DR: This paper presents a meta-modelling framework for modeling and testing the robustness of the modeled systems and some of the techniques used in this framework have been developed and tested in the field.
Abstract: ing WS1S Systems to Verify Parameterized Networks . . . . . . . . . . . . 188 Kai Baukus, Saddek Bensalem, Yassine Lakhnech and Karsten Stahl FMona: A Tool for Expressing Validation Techniques over Infinite State Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204 J.-P. Bodeveix and M. Filali Transitive Closures of Regular Relations for Verifying Infinite-State Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220 Bengt Jonsson and Marcus Nilsson Diagnostic and Test Generation Using Static Analysis to Improve Automatic Test Generation . . . . . . . . . . . . . 235 Marius Bozga, Jean-Claude Fernandez and Lucian Ghirvu Efficient Diagnostic Generation for Boolean Equation Systems . . . . . . . . . . . . 251 Radu Mateescu Efficient Model-Checking Compositional State Space Generation with Partial Order Reductions for Asynchronous Communicating Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266 Jean-Pierre Krimm and Laurent Mounier Checking for CFFD-Preorder with Tester Processes . . . . . . . . . . . . . . . . . . . . . . . 283 Juhana Helovuo and Antti Valmari Fair Bisimulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299 Thomas A. Henzinger and Sriram K. Rajamani Integrating Low Level Symmetries into Reachability Analysis . . . . . . . . . . . . . 315 Karsten Schmidt Model-Checking Tools Model Checking Support for the ASM High-Level Language . . . . . . . . . . . . . . 331 Giuseppe Del Castillo and Kirsten Winter Table of

1,687 citations

Journal ArticleDOI
TL;DR: In this paper, the authors review recent findings on adversarial examples for DNNs, summarize the methods for generating adversarial samples, and propose a taxonomy of these methods.
Abstract: With rapid progress and significant successes in a wide spectrum of applications, deep learning is being applied in many safety-critical environments. However, deep neural networks (DNNs) have been recently found vulnerable to well-designed input samples called adversarial examples . Adversarial perturbations are imperceptible to human but can easily fool DNNs in the testing/deploying stage. The vulnerability to adversarial examples becomes one of the major risks for applying DNNs in safety-critical environments. Therefore, attacks and defenses on adversarial examples draw great attention. In this paper, we review recent findings on adversarial examples for DNNs, summarize the methods for generating adversarial examples, and propose a taxonomy of these methods. Under the taxonomy, applications for adversarial examples are investigated. We further elaborate on countermeasures for adversarial examples. In addition, three major challenges in adversarial examples and the potential solutions are discussed.

1,203 citations

Proceedings ArticleDOI
27 May 2018
TL;DR: DeepTest is a systematic testing tool for automatically detecting erroneous behaviors of DNN-driven vehicles that can potentially lead to fatal crashes and systematically explore different parts of the DNN logic by generating test inputs that maximize the numbers of activated neurons.
Abstract: Recent advances in Deep Neural Networks (DNNs) have led to the development of DNN-driven autonomous cars that, using sensors like camera, LiDAR, etc., can drive without any human intervention. Most major manufacturers including Tesla, GM, Ford, BMW, and Waymo/Google are working on building and testing different types of autonomous vehicles. The lawmakers of several US states including California, Texas, and New York have passed new legislation to fast-track the process of testing and deployment of autonomous vehicles on their roads. However, despite their spectacular progress, DNNs, just like traditional software, often demonstrate incorrect or unexpected corner-case behaviors that can lead to potentially fatal collisions. Several such real-world accidents involving autonomous cars have already happened including one which resulted in a fatality. Most existing testing techniques for DNN-driven vehicles are heavily dependent on the manual collection of test data under different driving conditions which become prohibitively expensive as the number of test conditions increases. In this paper, we design, implement, and evaluate DeepTest, a systematic testing tool for automatically detecting erroneous behaviors of DNN-driven vehicles that can potentially lead to fatal crashes. First, our tool is designed to automatically generated test cases leveraging real-world changes in driving conditions like rain, fog, lighting conditions, etc. DeepTest systematically explore different parts of the DNN logic by generating test inputs that maximize the numbers of activated neurons. DeepTest found thousands of erroneous behaviors under different realistic driving conditions (e.g., blurring, rain, fog, etc.) many of which lead to potentially fatal crashes in three top performing DNNs in the Udacity self-driving car challenge.

963 citations

Posted Content
TL;DR: This work discusses core RL elements, including value function, in particular, Deep Q-Network (DQN), policy, reward, model, planning, and exploration, and important mechanisms for RL, including attention and memory, unsupervised learning, transfer learning, multi-agent RL, hierarchical RL, and learning to learn.
Abstract: We give an overview of recent exciting achievements of deep reinforcement learning (RL). We discuss six core elements, six important mechanisms, and twelve applications. We start with background of machine learning, deep learning and reinforcement learning. Next we discuss core RL elements, including value function, in particular, Deep Q-Network (DQN), policy, reward, model, planning, and exploration. After that, we discuss important mechanisms for RL, including attention and memory, unsupervised learning, transfer learning, multi-agent RL, hierarchical RL, and learning to learn. Then we discuss various applications of RL, including games, in particular, AlphaGo, robotics, natural language processing, including dialogue systems, machine translation, and text generation, computer vision, neural architecture design, business management, finance, healthcare, Industry 4.0, smart grid, intelligent transportation systems, and computer systems. We mention topics not reviewed yet, and list a collection of RL resources. After presenting a brief summary, we close with discussions. Please see Deep Reinforcement Learning, arXiv:1810.06339, for a significant update.

935 citations