Topic

Symbolic data analysis

About: Symbolic data analysis is a research topic. Over the lifetime, 1931 publications have been published within this topic receiving 43610 citations.

...read moreread less

Papers published on a yearly basis

1 / 2

Papers

PDF

Open Access

More filters

Journal Article•DOI•

Symbolic Boolean manipulation with ordered binary-decision diagrams

[...]

Randal E. Bryant¹•Institutions (1)

Carnegie Mellon University¹

01 Sep 1992-ACM Computing Surveys

TL;DR: The OBDD data structure is described and a number of applications that have been solved by OBDd-based symbolic analysis are surveyed.

...read moreread less

Abstract: Ordered Binary-Decision Diagrams (OBDDs) represent Boolean functions as directed acyclic graphs. They form a canonical representation, making testing of functional properties such as satisfiability and equivalence straightforward. A number of operations on Boolean functions can be implemented as graph algorithms on OBDD data structures. Using OBDDs, a wide variety of problems can be solved through symbolic analysis. First, the possible variations in system parameters and operating conditions are encoded with Boolean variables. Then the system is evaluated for all variations by a sequence of OBDD operations. Researchers have thus solved a number of problems in digital-system design, finite-state system analysis, artificial intelligence, and mathematical logic. This paper describes the OBDD data structure and surveys a number of applications that have been solved by OBDD-based symbolic analysis.

...read moreread less

2,196 citations

Proceedings Article•DOI•

A symbolic representation of time series, with implications for streaming algorithms

[...]

Jessica Lin¹, Eamonn Keogh¹, Stefano Lonardi¹, Bill Chiu¹•Institutions (1)

University of California, Riverside¹

13 Jun 2003

TL;DR: A new symbolic representation of time series is introduced that is unique in that it allows dimensionality/numerosity reduction, and it also allows distance measures to be defined on the symbolic approach that lower bound corresponding distance measuresdefined on the original series.

...read moreread less

Abstract: The parallel explosions of interest in streaming data, and data mining of time series have had surprisingly little intersection. This is in spite of the fact that time series data are typically streaming data. The main reason for this apparent paradox is the fact that the vast majority of work on streaming data explicitly assumes that the data is discrete, whereas the vast majority of time series data is real valued.Many researchers have also considered transforming real valued time series into symbolic representations, nothing that such representations would potentially allow researchers to avail of the wealth of data structures and algorithms from the text processing and bioinformatics communities, in addition to allowing formerly "batch-only" problems to be tackled by the streaming community. While many symbolic representations of time series have been introduced over the past decades, they all suffer from three fatal flaws. Firstly, the dimensionality of the symbolic representation is the same as the original data, and virtually all data mining algorithms scale poorly with dimensionality. Secondly, although distance measures can be defined on the symbolic approaches, these distance measures have little correlation with distance measures defined on the original time series. Finally, most of these symbolic approaches require one to have access to all the data, before creating the symbolic representation. This last feature explicitly thwarts efforts to use the representations with streaming algorithms.In this work we introduce a new symbolic representation of time series. Our representation is unique in that it allows dimensionality/numerosity reduction, and it also allows distance measures to be defined on the symbolic approach that lower bound corresponding distance measures defined on the original series. As we shall demonstrate, this latter feature is particularly exciting because it allows one to run certain data mining algorithms on the efficiently manipulated symbolic representation, while producing identical results to the algorithms that operate on the original data. Finally, our representation allows the real valued data to be converted in a streaming fashion, with only an infinitesimal time and space overhead.We will demonstrate the utility of our representation on the classic data mining tasks of clustering, classification, query by content and anomaly detection.

...read moreread less

1,922 citations

Journal Article•DOI•

Algorithm 832: UMFPACK V4.3---an unsymmetric-pattern multifrontal method

[...]

Timothy A. Davis¹•Institutions (1)

University of Florida¹

01 Jun 2004-ACM Transactions on Mathematical Software

TL;DR: An ANSI C code for sparse LU factorization is presented that combines a column pre-ordering strategy with a right-looking unsymmetric-pattern multifrontal numerical factorization, and an upper bound on fill-in, work, and memory usage is computed.

...read moreread less

Abstract: An ANSI C code for sparse LU factorization is presented that combines a column pre-ordering strategy with a right-looking unsymmetric-pattern multifrontal numerical factorization. The pre-ordering and symbolic analysis phase computes an upper bound on fill-in, work, and memory usage during the subsequent numerical factorization. User-callable routines are provided for ordering and analyzing a sparse matrix, computing the numerical factorization, solving a system with the LU factors, transposing and permuting a sparse matrix, and converting between sparse matrix representations. The simple user interface shields the user from the details of the complex sparse factorization data structures by returning simple handles to opaque objects. Additional user-callable routines are provided for printing and extracting the contents of these opaque objects. An even simpler way to use the package is through its MATLAB interface. UMFPACK is incorporated as a built-in operator in MATLAB 6.5 as x = Abb when A is sparse and unsymmetric.

...read moreread less

1,469 citations

Journal Article•DOI•

Experiencing SAX: a novel symbolic representation of time series

[...]

Jessica Lin¹, Eamonn Keogh², Li Wei², Stefano Lonardi²•Institutions (2)

George Mason University¹, University of California, Riverside²

01 Oct 2007-Data Mining and Knowledge Discovery

TL;DR: The utility of the new symbolic representation of time series formed is demonstrated, which allows dimensionality/numerosity reduction, and it also allows distance measures to be defined on the symbolic approach that lower bound corresponding distance measuresdefined on the original series.

...read moreread less

Abstract: Many high level representations of time series have been proposed for data mining, including Fourier transforms, wavelets, eigenwaves, piecewise polynomial models, etc. Many researchers have also considered symbolic representations of time series, noting that such representations would potentiality allow researchers to avail of the wealth of data structures and algorithms from the text processing and bioinformatics communities. While many symbolic representations of time series have been introduced over the past decades, they all suffer from two fatal flaws. First, the dimensionality of the symbolic representation is the same as the original data, and virtually all data mining algorithms scale poorly with dimensionality. Second, although distance measures can be defined on the symbolic approaches, these distance measures have little correlation with distance measures defined on the original time series. In this work we formulate a new symbolic representation of time series. Our representation is unique in that it allows dimensionality/numerosity reduction, and it also allows distance measures to be defined on the symbolic approach that lower bound corresponding distance measures defined on the original series. As we shall demonstrate, this latter feature is particularly exciting because it allows one to run certain data mining algorithms on the efficiently manipulated symbolic representation, while producing identical results to the algorithms that operate on the original data. In particular, we will demonstrate the utility of our representation on various data mining tasks of clustering, classification, query by content, anomaly detection, motif discovery, and visualization.

...read moreread less

1,452 citations

Proceedings Article•DOI•

AI2: Safety and Robustness Certification of Neural Networks with Abstract Interpretation

[...]

Timon Gehr¹, Matthew Mirman¹, Dana Drachsler-Cohen¹, Petar Tsankov¹, Swarat Chaudhuri¹, Martin Vechev¹ - Show less +2 more•Institutions (1)

ETH Zurich¹

20 May 2018

TL;DR: This work presents AI2, the first sound and scalable analyzer for deep neural networks, and introduces abstract transformers that capture the behavior of fully connected and convolutional neural network layers with rectified linear unit activations (ReLU), as well as max pooling layers.

...read moreread less

Abstract: We present AI2, the first sound and scalable analyzer for deep neural networks. Based on overapproximation, AI2 can automatically prove safety properties (e.g., robustness) of realistic neural networks (e.g., convolutional neural networks). The key insight behind AI2 is to phrase reasoning about safety and robustness of neural networks in terms of classic abstract interpretation, enabling us to leverage decades of advances in that area. Concretely, we introduce abstract transformers that capture the behavior of fully connected and convolutional neural network layers with rectified linear unit activations (ReLU), as well as max pooling layers. This allows us to handle real-world neural networks, which are often built out of those types of layers. We present a complete implementation of AI2 together with an extensive evaluation on 20 neural networks. Our results demonstrate that: (i) AI2 is precise enough to prove useful specifications (e.g., robustness), (ii) AI2 can be used to certify the effectiveness of state-of-the-art defenses for neural networks, (iii) AI2 is significantly faster than existing analyzers based on symbolic analysis, which often take hours to verify simple fully connected networks, and (iv) AI2 can handle deep convolutional networks, which are beyond the reach of existing methods.

...read moreread less

841 citations

Collapse

Network Information

Performance

Metrics

1,977

Papers

46,687

Citations

No. of papers in the topic in previous years
Year	Papers
2023	11
2022	35
2021	36
2020	37
2019	41
2018	60

Symbolic data analysis

Papers published on a yearly basis

Papers

Trending Questions (10)

Network Information

Related Topics (5)

Performance

Metrics