Proceedings ArticleDOI

CRADLE: Cross-Backend Validation to Detect and Localize Bugs in Deep Learning Libraries

TLDR
This work proposes CRADLE, a new approach that performs cross-implementation inconsistency checking to detect bugs in DL libraries, and leverages anomaly propagation tracking and analysis to localize faulty functions in DL libraries that cause the bugs.
Abstract
Deep learning (DL) systems are widely used in domains including aircraft collision avoidance systems, Alzheimer's disease diagnosis, and autonomous driving cars. Despite the requirement for high reliability, DL systems are difficult to test. Existing DL testing work focuses on testing the DL models, not the implementations (e.g., DL software libraries) of the models. One key challenge of testing DL libraries is the difficulty of knowing the expected output of DL libraries given an input instance. Fortunately, there are multiple implementations of the same DL algorithms in different DL libraries. Thus, we propose CRADLE, a new approach that focuses on finding and localizing bugs in DL software libraries. CRADLE (1) performs cross-implementation inconsistency checking to detect bugs in DL libraries, and (2) leverages anomaly propagation tracking and analysis to localize faulty functions in DL libraries that cause the bugs. We evaluate CRADLE on three libraries (TensorFlow, CNTK, and Theano), 11 datasets (including ImageNet, MNIST, and KGS Go game), and 30 pre-trained models. CRADLE detects 12 bugs and 104 unique inconsistencies, and highlights functions relevant to the causes of inconsistencies for all 104 unique inconsistencies.
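The cross-implementation checking described above can be illustrated with a minimal sketch: run the same input through each backend's prediction function and flag backend pairs whose outputs deviate beyond a threshold. All names here are illustrative, not CRADLE's API, and the relative-error metric is a stand-in for the distance metrics the paper actually uses.

```python
import numpy as np

def relative_inconsistency(out_a, out_b, eps=1e-7):
    """Maximum element-wise relative deviation between two backend outputs.
    A simplified metric; CRADLE uses its own output-distance metrics."""
    return float(np.max(np.abs(out_a - out_b) / (np.abs(out_b) + eps)))

def check_backends(predict_fns, x, threshold=1e-3):
    """Run input x through each backend's predict function (a dict of
    name -> callable) and return pairs whose outputs are inconsistent."""
    outputs = {name: fn(x) for name, fn in predict_fns.items()}
    names = list(outputs)
    inconsistent = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            d = relative_inconsistency(outputs[names[i]], outputs[names[j]])
            if d > threshold:
                inconsistent.append((names[i], names[j], d))
    return inconsistent
```

Because a bug typically affects only one backend, the buggy implementation shows up as the common member of the inconsistent pairs, which is the intuition behind localizing it before tracking anomaly propagation through individual functions.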


Citations
Posted Content

Machine Learning Testing: Survey, Landscapes and Horizons

TL;DR: This paper provides a comprehensive survey of techniques for testing machine learning systems (Machine Learning Testing, or ML testing), covering 144 papers on testing properties, testing components, and application scenarios.
Proceedings ArticleDOI

Deep learning library testing via effective model generation

TL;DR: This work designs a series of mutation rules for DL models to explore different invoking sequences of library code and hard-to-trigger behaviors, and proposes a heuristic strategy that guides model generation toward amplifying the degree of inconsistency between different DL libraries caused by bugs.
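The guided-generation idea in this summary amounts to a search that keeps a mutant when it increases measured cross-library inconsistency. A minimal hill-climbing sketch, with hypothetical `inconsistency` and mutation-rule callables standing in for the paper's actual rules and metric:

```python
import random

def guided_mutation(model, inconsistency, mutate_rules, steps=10, seed=0):
    """Hill-climb over model mutants: apply a randomly chosen mutation
    rule and keep the mutant only if it raises the inconsistency score.
    `model` is any representation the rules understand; all names here
    are illustrative, not the paper's API."""
    rng = random.Random(seed)
    best, best_score = model, inconsistency(model)
    for _ in range(steps):
        rule = rng.choice(mutate_rules)
        candidate = rule(best)
        score = inconsistency(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score
```

In practice the mutation rules would insert or reorder layers to exercise different library-code invoking sequences, and the score would compare outputs of the same mutant across libraries.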
Proceedings ArticleDOI

Repairing deep neural networks: fix patterns and challenges

TL;DR: This work presents a comprehensive study of bug-fix patterns for Deep Neural Networks (DNNs), investigating the challenges developers face when repairing DNNs and the patterns they use when fixing bugs manually.
Proceedings ArticleDOI

Problems and opportunities in training deep learning software systems: an analysis of variance

TL;DR: In this paper, the authors study the variance of deep learning systems and the awareness of this variance among researchers and practitioners, and find that only 19.5±3% of papers in recent top software engineering (SE), artificial intelligence (AI), and systems conferences use multiple identical training runs to quantify the variance in their DL approaches.
Proceedings ArticleDOI

Audee: automated testing for deep learning frameworks

TL;DR: Audee as discussed by the authors adopts a search-based approach and implements three different mutation strategies to generate diverse test cases by exploring combinations of model structures, parameters, weights and inputs, which is able to detect three types of bugs: logical bugs, crashes and Not-a-Number (NaN) errors.
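The three bug classes this summary names (logical bugs, crashes, and NaN errors) can be separated with a small triage step on each test execution. The sketch below is illustrative only, not Audee's implementation; detecting logical bugs would additionally require a cross-framework reference output, as in CRADLE-style inconsistency checking:

```python
import numpy as np

def triage(run_model, x):
    """Classify one test execution: an exception is a crash candidate,
    a NaN in the output is a NaN error, otherwise the output is returned
    for downstream inconsistency (logical-bug) checking."""
    try:
        out = run_model(x)
    except Exception as exc:
        return ("crash", repr(exc))
    if np.isnan(out).any():
        return ("nan", None)
    return ("ok", out)
```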
References
Journal ArticleDOI

Gradient-based learning applied to document recognition

TL;DR: In this article, a graph transformer network (GTN) is proposed for handwritten character recognition, which can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters.
Journal ArticleDOI

ImageNet Large Scale Visual Recognition Challenge

TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) as mentioned in this paper is a benchmark in object category classification and detection on hundreds of object categories and millions of images, which has been run annually from 2010 to present, attracting participation from more than fifty institutions.
Proceedings ArticleDOI

TensorFlow: a system for large-scale machine learning

TL;DR: TensorFlow as mentioned in this paper is a machine learning system that operates at large scale and in heterogeneous environments, using dataflow graphs to represent computation, shared state, and the operations that mutate that state.
Proceedings Article

Intriguing properties of neural networks

TL;DR: It is found that there is no distinction between individual high-level units and random linear combinations of high-level units under various methods of unit analysis, suggesting that it is the space, rather than the individual units, that contains the semantic information in the high layers of neural networks.
Proceedings Article

Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning

TL;DR: In this article, the authors show that training with residual connections accelerates the training of Inception networks significantly, and they also present several new streamlined architectures for both residual and non-residual Inception Networks.