Conference

International Conference on Data Engineering

About: The International Conference on Data Engineering is an academic conference. It publishes mainly in the areas of relational databases and query optimization. Over its lifetime, the conference has published 6,255 papers, which have received 251,626 citations.

Topics: Relational database, Query optimization, Query language, Data modeling, Sargable

Papers

Proceedings Article

Mining sequential patterns

Rakesh Agrawal¹, Ramakrishnan Srikant¹

IBM¹

06 Mar 1995

TL;DR: Three algorithms are presented for mining sequential patterns over databases of customer transactions; an empirical evaluation on synthetic data shows that two of them, AprioriSome and AprioriAll, have comparable performance.

Abstract: We are given a large database of customer transactions, where each transaction consists of customer-id, transaction time, and the items bought in the transaction. We introduce the problem of mining sequential patterns over such databases. We present three algorithms to solve this problem, and empirically evaluate their performance using synthetic data. Two of the proposed algorithms, AprioriSome and AprioriAll, have comparable performance, albeit AprioriSome performs a little better when the minimum number of customers that must support a sequential pattern is low. Scale-up experiments show that both AprioriSome and AprioriAll scale linearly with the number of customer transactions. They also have excellent scale-up properties with respect to the number of transactions per customer and the number of items in a transaction.
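
The notion of support used in the abstract above (the number of customers whose time-ordered transactions contain a pattern) can be illustrated with a short Python sketch. It is not AprioriAll or AprioriSome; it only counts the support of a single candidate pattern. The function names and the toy `TRANSACTIONS` data are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: (customer_id, transaction_time, items bought).
TRANSACTIONS = [
    (1, 1, {"bread"}), (1, 2, {"milk", "eggs"}),
    (2, 1, {"milk"}), (2, 2, {"bread"}),
    (3, 1, {"bread", "jam"}), (3, 3, {"milk"}),
]

def customer_sequences(transactions):
    """Group rows into one time-ordered list of itemsets per customer."""
    by_customer = defaultdict(list)
    for customer_id, time, items in transactions:
        by_customer[customer_id].append((time, frozenset(items)))
    return {
        cust: [items for _, items in sorted(rows, key=lambda r: r[0])]
        for cust, rows in by_customer.items()
    }

def contains(sequence, pattern):
    """True if each itemset in `pattern` is a subset of a distinct
    transaction in `sequence`, in the same order."""
    pos = 0
    for wanted in pattern:
        # Greedily advance to the earliest transaction containing `wanted`.
        while pos < len(sequence) and not wanted <= sequence[pos]:
            pos += 1
        if pos == len(sequence):
            return False
        pos += 1  # the next itemset must match a strictly later transaction
    return True

def support(transactions, pattern):
    """Number of customers whose sequence contains `pattern`."""
    pattern = [frozenset(p) for p in pattern]
    sequences = customer_sequences(transactions)
    return sum(contains(seq, pattern) for seq in sequences.values())
```

A pattern would count as frequent when `support(...)` reaches a chosen minimum number of customers; that threshold is left to the caller here.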

5,663 citations

Proceedings Article

t-Closeness: Privacy Beyond k-Anonymity and l-Diversity

Ninghui Li¹, Tiancheng Li¹, Suresh Venkatasubramanian²

Purdue University¹, AT&T Labs²

15 Apr 2007

TL;DR: t-closeness requires that the distribution of a sensitive attribute in any equivalence class be close to the distribution of the attribute in the overall table (i.e., the distance between the two distributions should be no more than a threshold t).

Abstract: The k-anonymity privacy requirement for publishing microdata requires that each equivalence class (i.e., a set of records that are indistinguishable from each other with respect to certain "identifying" attributes) contains at least k records. Recently, several authors have recognized that k-anonymity cannot prevent attribute disclosure. The notion of l-diversity has been proposed to address this; l-diversity requires that each equivalence class has at least l well-represented values for each sensitive attribute. In this paper we show that l-diversity has a number of limitations. In particular, it is neither necessary nor sufficient to prevent attribute disclosure. We propose a novel privacy notion called t-closeness, which requires that the distribution of a sensitive attribute in any equivalence class is close to the distribution of the attribute in the overall table (i.e., the distance between the two distributions should be no more than a threshold t). We choose to use the earth mover distance measure for our t-closeness requirement. We discuss the rationale for t-closeness and illustrate its advantages through examples and experiments.
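
The t-closeness check described in the abstract above can be sketched in Python as follows. The sketch assumes a categorical sensitive attribute where every pair of distinct values is at ground distance 1, in which case the earth mover distance reduces to half the L1 distance between the two distributions. The function names and the example classes are hypothetical, and numeric attributes would need an ordered ground distance instead.

```python
from collections import Counter

def distribution(values, domain):
    """Relative frequency of each domain value within `values`."""
    counts = Counter(values)
    n = len(values)
    return [counts[v] / n for v in domain]

def emd_equal_distance(p, q):
    """Earth mover distance when all distinct values are at distance 1.

    Under that assumption the distance equals half the L1 distance
    between the two distributions.
    """
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def satisfies_t_closeness(classes, t):
    """`classes` is a list of equivalence classes, each a list of the
    sensitive values of its records. Returns True if every class is
    within distance t of the overall distribution."""
    all_values = [v for cls in classes for v in cls]
    domain = sorted(set(all_values))
    overall = distribution(all_values, domain)
    return all(
        emd_equal_distance(distribution(cls, domain), overall) <= t
        for cls in classes
    )

# Hypothetical example: two equivalence classes of three records each.
CLASSES = [["flu", "flu", "cancer"], ["flu", "cancer", "cancer"]]
```

With these classes the overall distribution is split evenly between the two values, and each class is at distance 1/6 from it, so the table would pass for `t = 0.2` and fail for `t = 0.1`.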

3,281 citations

Proceedings Article

L-diversity: privacy beyond k-anonymity

Ashwin Machanavajjhala¹, Johannes Gehrke¹, Daniel Kifer¹, Muthuramakrishnan Venkitasubramaniam¹

Cornell University¹

03 Apr 2006

TL;DR: This paper shows with two simple attacks that a k-anonymized dataset has some subtle, but severe privacy problems, and proposes a novel and powerful privacy definition called l-diversity, which is practical and can be implemented efficiently.

Abstract: Publishing data about individuals without revealing sensitive information about them is an important problem. In recent years, a new definition of privacy called k-anonymity has gained popularity. In a k-anonymized dataset, each record is indistinguishable from at least k-1 other records with respect to certain "identifying" attributes. In this paper we show with two simple attacks that a k-anonymized dataset has some subtle, but severe privacy problems. First, we show that an attacker can discover the values of sensitive attributes when there is little diversity in those sensitive attributes. Second, attackers often have background knowledge, and we show that k-anonymity does not guarantee privacy against attackers using background knowledge. We give a detailed analysis of these two attacks and we propose a novel and powerful privacy definition called l-diversity. In addition to building a formal foundation for l-diversity, we show in an experimental evaluation that l-diversity is practical and can be implemented efficiently.
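
The first attack in the abstract above (little diversity in a sensitive attribute) can be illustrated with a small Python sketch. It checks k-anonymity and two common readings of l-diversity: at least l distinct sensitive values per equivalence class, and entropy of at least log(l) per class. The record layout, field names, and function names are assumptions for illustration, not the paper's implementation.

```python
import math
from collections import Counter

def equivalence_classes(records, qi_keys, sensitive_key):
    """Group sensitive values by the tuple of quasi-identifier values."""
    classes = {}
    for record in records:
        key = tuple(record[k] for k in qi_keys)
        classes.setdefault(key, []).append(record[sensitive_key])
    return classes

def is_k_anonymous(classes, k):
    """Every equivalence class holds at least k records."""
    return all(len(values) >= k for values in classes.values())

def is_distinct_l_diverse(classes, l):
    """Every class has at least l distinct sensitive values."""
    return all(len(set(values)) >= l for values in classes.values())

def is_entropy_l_diverse(classes, l, tol=1e-12):
    """Every class has sensitive-value entropy of at least log(l)."""
    def entropy(values):
        n = len(values)
        return -sum((c / n) * math.log(c / n) for c in Counter(values).values())
    # `tol` absorbs floating-point rounding at the exact boundary.
    return all(entropy(v) >= math.log(l) - tol for v in classes.values())

# Hypothetical generalized table: 2-anonymous, but the second class is
# homogeneous, so anyone known to be in it is revealed to have cancer.
RECORDS = [
    {"zip": "130**", "age": "<30", "disease": "flu"},
    {"zip": "130**", "age": "<30", "disease": "cancer"},
    {"zip": "148**", "age": ">=40", "disease": "cancer"},
    {"zip": "148**", "age": ">=40", "disease": "cancer"},
]
CLASSES = equivalence_classes(RECORDS, ["zip", "age"], "disease")
```

On this toy table `is_k_anonymous(CLASSES, 2)` holds while both 2-diversity checks fail, which is the gap between the two definitions that the abstract points to.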

2,700 citations

Proceedings Article

The Skyline operator

S. Borzsony¹, Donald Kossmann², Konrad Stocker¹

University of Passau¹, Technische Universität München²

02 Apr 2001

TL;DR: This work shows how SQL can be extended to pose Skyline queries, presents and evaluates alternative algorithms to implement the Skyline operation, and shows how this operation can be combined with other database operations, e.g., join.

Abstract: We propose to extend database systems by a Skyline operation. This operation filters out a set of interesting points from a potentially large set of data points. A point is interesting if it is not dominated by any other point. For example, a hotel might be interesting for somebody traveling to Nassau if no other hotel is both cheaper and closer to the beach. We show how SQL can be extended to pose Skyline queries, present and evaluate alternative algorithms to implement the Skyline operation, and show how this operation can be combined with other database operations, e.g., join.
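
The dominance test and the hotel example from the abstract above can be sketched in Python. This is a simple in-memory scan that keeps a window of points not yet dominated; it is offered as an illustration under stated assumptions, not as any of the algorithms evaluated in the paper. It assumes smaller is better in every dimension, and the `HOTELS` data (price, distance to beach in km) is made up.

```python
def dominates(a, b):
    """`a` dominates `b` if it is at least as good in every dimension
    and strictly better in at least one (smaller is better)."""
    return (
        all(x <= y for x, y in zip(a, b))
        and any(x < y for x, y in zip(a, b))
    )

def skyline(points):
    """Return the points that no other point dominates."""
    window = []  # skyline of the points seen so far
    for p in points:
        if any(dominates(w, p) for w in window):
            continue  # p is dominated, so it cannot be in the skyline
        # p survives: drop anything it dominates, then keep it.
        window = [w for w in window if not dominates(p, w)]
        window.append(p)
    return window

# Hypothetical hotels as (price, distance_to_beach_km).
HOTELS = [(50, 2.0), (80, 0.5), (60, 1.0), (90, 1.5), (50, 3.0)]
```

Here `(90, 1.5)` is dropped because `(60, 1.0)` is both cheaper and closer, and `(50, 3.0)` is dropped because `(50, 2.0)` costs the same and is closer. Dimensions where larger is better could be handled by negating them before the call.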

2,509 citations

Journal Article

Data cube: a relational aggregation operator generalizing GROUP-BY, CROSS-TAB, and SUB-TOTALS

Jim Gray¹, A. Bosworth¹, A. Layman¹, Hamid Pirahesh²

Microsoft¹, IBM²

26 Feb 1996

TL;DR: The data cube operator generalizes the histogram, cross-tabulation, roll-up, drill-down, and sub-total constructs found in most report writers.

Abstract: Data analysis applications typically aggregate data across many dimensions looking for unusual patterns. The SQL aggregate functions and the GROUP BY operator produce zero-dimensional or one-dimensional answers. Applications need the N-dimensional generalization of these operators. The paper defines that operator, called the data cube or simply cube. The cube operator generalizes the histogram, cross-tabulation, roll-up, drill-down, and sub-total constructs found in most report writers. The cube treats each of the N aggregation attributes as a dimension of N-space. The aggregate of a particular set of attribute values is a point in this space. The set of points forms an N-dimensional cube. Super-aggregates are computed by aggregating the N-cube to lower dimensional spaces. Aggregation points are represented by an "infinite value": ALL, so the point (ALL,ALL,...,ALL, sum(*)) represents the global sum of all items. Each ALL value actually represents the set of values contributing to that aggregation.
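
The role of ALL in the abstract above can be made concrete with a naive Python sketch for a SUM aggregate. Each input row contributes to 2^N cells, one for every way of replacing some of its N dimension values with ALL. The function name, the `"ALL"` string sentinel, and the sample rows are assumptions for illustration; the sketch also assumes no real dimension value equals `"ALL"`, and it makes no attempt at the efficient computation a database engine would need.

```python
from collections import defaultdict
from itertools import product

ALL = "ALL"  # sentinel standing for "every value of this dimension"

def cube_sum(rows, dims, measure):
    """Sum `measure` over every combination of concrete and ALL values
    of the dimensions listed in `dims`."""
    totals = defaultdict(int)
    for row in rows:
        values = [row[d] for d in dims]
        # Each mask says which dimensions are rolled up to ALL.
        for mask in product((False, True), repeat=len(dims)):
            key = tuple(ALL if rolled else v for v, rolled in zip(values, mask))
            totals[key] += row[measure]
    return dict(totals)

# Hypothetical sales rows with two dimensions and one measure.
SALES = [
    {"model": "Chevy", "year": 1994, "sales": 10},
    {"model": "Chevy", "year": 1995, "sales": 20},
    {"model": "Ford", "year": 1994, "sales": 5},
]
CUBE = cube_sum(SALES, ["model", "year"], "sales")
```

In the result, the key `("ALL", "ALL")` holds the global sum, keys with one `"ALL"` hold the sub-totals per model or per year, and fully concrete keys match what a plain GROUP BY on both columns would return.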

2,308 citations

Performance

Metrics

Papers: 6,255

Citations: 251,626

No. of papers from the Conference in previous years
Year	Papers
2021	334
2020	357
2019	321
2018	292
2017	235
2016	249