DeepXplore: Automated Whitebox Testing of Deep Learning Systems

doi:10.1145/3132747.3132785

Open Access, Proceedings Article

pp. 1-18

TLDR

DeepXplore efficiently finds thousands of incorrect corner case behaviors in state-of-the-art DL models with thousands of neurons trained on five popular datasets including ImageNet and Udacity self-driving challenge data.

Abstract:

Deep learning (DL) systems are increasingly deployed in safety- and security-critical domains including self-driving cars and malware detection, where the correctness and predictability of a system's behavior for corner case inputs are of great importance. Existing DL testing depends heavily on manually labeled data and therefore often fails to expose erroneous behaviors for rare inputs. We design, implement, and evaluate DeepXplore, the first whitebox framework for systematically testing real-world DL systems. First, we introduce neuron coverage for systematically measuring the parts of a DL system exercised by test inputs. Next, we leverage multiple DL systems with similar functionality as cross-referencing oracles to avoid manual checking. Finally, we demonstrate how finding inputs for DL systems that both trigger many differential behaviors and achieve high neuron coverage can be represented as a joint optimization problem and solved efficiently using gradient-based search techniques. DeepXplore efficiently finds thousands of incorrect corner case behaviors (e.g., self-driving cars crashing into guard rails and malware masquerading as benign software) in state-of-the-art DL models with thousands of neurons trained on five popular datasets including ImageNet and Udacity self-driving challenge data. For all tested DL models, on average, DeepXplore generated one test input demonstrating incorrect behavior within one second while running only on a commodity laptop. We further show that the test inputs generated by DeepXplore can also be used to retrain the corresponding DL model to improve the model's accuracy by up to 3%.
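
The two measurement ideas named in the abstract, neuron coverage and cross-referencing oracles, can be illustrated with a minimal sketch. It assumes neuron coverage means the fraction of neurons whose activation exceeds a threshold for at least one test input, and it treats any disagreement among models with similar functionality as a candidate erroneous behavior. The function names, the threshold value, and the toy activations are hypothetical and are not taken from the DeepXplore implementation; the gradient-based joint optimization is not shown.

```python
import numpy as np


def neuron_coverage(activations, threshold=0.0):
    """Fraction of neurons activated above `threshold` by at least one input.

    `activations` is a list with one array per layer, each shaped
    (num_inputs, num_neurons). This layout is an assumption of the sketch.
    """
    covered = 0
    total = 0
    for layer in activations:
        layer = np.asarray(layer)
        # A neuron counts as covered if any test input drives it above the threshold.
        fired = (layer > threshold).any(axis=0)
        covered += int(fired.sum())
        total += fired.size
    return covered / total if total else 0.0


def disagreements(predictions):
    """Indices of inputs on which the models do not all predict the same label.

    `predictions` is shaped (num_models, num_inputs). Disagreement only flags
    a candidate corner case; it does not say which model is wrong.
    """
    predictions = np.asarray(predictions)
    differs = (predictions != predictions[0]).any(axis=0)
    return np.flatnonzero(differs)


# Toy data (hypothetical): two layers, two test inputs.
layer1 = np.array([[0.9, 0.0, 0.2],
                   [0.1, 0.0, 0.8]])
layer2 = np.array([[0.0, 0.6],
                   [0.0, 0.3]])
coverage = neuron_coverage([layer1, layer2], threshold=0.5)

# Three models, three inputs; only the last input splits the models.
preds = np.array([[1, 2, 3],
                  [1, 2, 0],
                  [1, 2, 3]])
suspicious = disagreements(preds)
```

In this toy case three of the five neurons exceed the threshold for some input, so the coverage is 0.6, and only the input at index 2 is flagged by the oracle. A test generator in the spirit of the abstract would search for inputs that raise both quantities at once.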
