Showing papers by "Paolo Ciaccia" published in 2000

Open Access

Proceedings Article

PAC nearest neighbor queries: Approximate and controlled search in high-dimensional and metric spaces

Paolo Ciaccia, Marco Patella

29 Feb 2000

TL;DR: This paper describes sequential and index-based PAC-NN algorithms that exploit the distance distribution of the query object in order to determine a stopping condition that respects the error bound, and provides experimental evidence that indexing can further speed-up the retrieval process by up to 1-2 orders of magnitude without giving up the accuracy of the result.

Abstract: In high-dimensional and complex metric spaces, determining the nearest neighbor (NN) of a query object q can be a very expensive task, because of the poor partitioning operated by index structures (the so-called "curse of dimensionality"). This also affects approximately correct (AC) algorithms, which return as results a point whose distance from q is less than (1+ε) times the distance between q and its true NN. In this paper we introduce a new approach to approximate similarity search, called PAC-NN queries, where the error bound ε can be exceeded with probability δ and both ε and δ parameters can be tuned at query time to trade the quality of the result for the cost of the search. We describe sequential and index-based PAC-NN algorithms that exploit the distance distribution of the query object in order to determine a stopping condition that respects the error bound. Analysis and experimental evaluation of the sequential algorithm confirm that, for moderately large data sets and suitable ε and δ values, PAC-NN queries can be efficiently solved and the error controlled. Then, we provide experimental evidence that indexing can further speed-up the retrieval process by up to 1-2 orders of magnitude without giving up the accuracy of the result.

161 citations
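
The stopping condition described in the PAC-NN abstract above can be illustrated with a minimal sketch. This is one reading of the abstract, not the authors' implementation. It assumes the caller supplies a hypothetical radius `r_delta`, chosen (for example, from an estimate of the query's distance distribution) so that the true NN lies closer than `r_delta` with probability at most δ. Under that assumption, any point within (1+ε)·`r_delta` of the query satisfies the error bound ε with probability at least 1-δ, so the scan can stop there. The function name, the parameters, and the choice of Euclidean distance are all illustrative.

```python
from math import dist  # Euclidean distance (Python 3.8+); an assumed metric


def pac_nn_sequential(points, query, eps, r_delta):
    """Scan points in order and stop as soon as the best distance found
    is at most (1 + eps) * r_delta.

    r_delta is a caller-supplied radius (hypothetical parameter) such that
    the true nearest neighbor is closer than r_delta with probability
    at most delta.

    Returns (best_point, best_distance, points_examined).
    """
    if not points:
        raise ValueError("points must be non-empty")
    threshold = (1.0 + eps) * r_delta
    best_point, best_dist = None, float("inf")
    examined = 0
    for p in points:
        examined += 1
        d = dist(p, query)
        if d < best_dist:
            best_point, best_dist = p, d
        # Early exit: the current best is already close enough.
        if best_dist <= threshold:
            break
    return best_point, best_dist, examined


data = [(5.0, 5.0), (2.0, 0.0), (1.5, 0.0), (3.0, 3.0)]
early = pac_nn_sequential(data, (0.0, 0.0), eps=0.5, r_delta=1.4)
exact = pac_nn_sequential(data, (0.0, 0.0), eps=0.5, r_delta=0.0)
```

With `r_delta=1.4` the scan stops at the second point, whose distance 2.0 is within (1+0.5) times the true NN distance 1.5. With `r_delta=0.0` the threshold is never met and the scan degenerates to an exact search over all four points.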

Book Chapter

Imprecision and User Preferences in Multimedia Queries: A Generic Algebraic Approach

Paolo Ciaccia, Danilo Montesi, Wilma Penzo, Alberto Trombetta

14 Feb 2000

TL;DR: This chapter presents an integrated algebraic framework that covers many relevant aspects of similarity query processing, and defines a generic similarity algebra, SAMEW, in which the semantics of operators is deliberately left unspecified in order to better adapt to specific scenarios.

Abstract: Specification and efficient processing of similarity queries on multimedia databases have recently attracted several research efforts, even if most of them have considered specific aspects, such as indexing, of this new exciting scenario. In this paper we try to remedy this by presenting an integrated algebraic framework which allows many relevant aspects of similarity query processing to be dealt with. As a starting point, we assume the more general case where "imprecision" is already present at the data level, typically because of the ambiguous nature of multimedia objects' content. We then define a generic similarity algebra, SAMEW, where semantics of operators is deliberately left unspecified in order to better adapt to specific scenarios. A basic feature of SAMEW is that it allows user preferences, expressed in the form of weights, to be specified so as to alter the default behavior of most operators. Finally, we discuss some issues related to "approximation" and to "user evaluation" of query results.

38 citations

The M2-tree: Processing Complex Multi-Feature Queries with Just One Index.

Paolo Ciaccia¹, Marco Patella¹

University of Bologna¹

01 Jan 2000

TL;DR: The proposed approach combines within a single index structure information from multiple metric spaces, thus being able to efficiently support queries on arbitrary combinations of indexed features.

Abstract: Motivated by the needs for efficient similarity retrieval in multimedia digital libraries, we present basic principles of a new paged and balanced index structure, the M2-tree. The M2-tree can be applied whenever "complex" range and/or best-match queries over different descriptions (features) of objects need to be solved. The proposed approach combines within a single index structure information from multiple metric spaces, thus being able to efficiently support queries on arbitrary combinations of indexed features. Efficiency of the structure is presented through preliminary experimental results over a real-world data set.

34 citations

Proceedings Article

A sound algorithm for region-based image retrieval using an index

Ilaria Bartolini¹, Paolo Ciaccia¹, Marco Patella¹

University of Bologna¹

06 Sep 2000

TL;DR: This work proposes the first provably sound algorithm for performing region-based similarity search when regions are accessed through an index, and demonstrates the effectiveness of this approach as also compared to alternative retrieval strategies.

Abstract: Region-based image retrieval systems aim to improve the effectiveness of content-based search by decomposing each image into a set of "homogeneous" regions. Thus, similarity between images is assessed by computing similarity between pairs of regions and then combining the results at the image level. We propose the first provably sound algorithm for performing region-based similarity search when regions are accessed through an index. Experimental results demonstrate the effectiveness of our approach, as also compared to alternative retrieval strategies.

29 citations
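
The two-step scheme mentioned in the region-based retrieval abstract above (score pairs of regions, then combine at the image level) can be sketched as follows. The sketch assumes a greedy one-to-one matching of regions followed by averaging over the query regions. That combination rule is an illustrative assumption, not necessarily the one used in the paper, and the function and parameter names are hypothetical.

```python
def image_similarity(query_regions, image_regions, region_sim):
    """Combine region-level similarities into one image-level score.

    Assumed rule (for illustration only): greedily pair regions
    one-to-one in decreasing order of similarity, then average the
    matched scores over the number of query regions. Unmatched query
    regions contribute 0.
    """
    if not query_regions:
        return 0.0
    # Score every (query region, image region) pair.
    scored = [
        (region_sim(q, r), qi, ri)
        for qi, q in enumerate(query_regions)
        for ri, r in enumerate(image_regions)
    ]
    scored.sort(key=lambda t: t[0], reverse=True)
    used_q, used_r, total = set(), set(), 0.0
    for score, qi, ri in scored:
        if qi in used_q or ri in used_r:
            continue  # each region may be matched at most once
        used_q.add(qi)
        used_r.add(ri)
        total += score
    return total / len(query_regions)


def toy_sim(a, b):
    # Toy similarity on one-dimensional region features (an assumption).
    return 1.0 / (1.0 + abs(a - b))


score = image_similarity([0.0, 10.0], [10.0, 0.5, 20.0], toy_sim)
```

Here the query region 10.0 is matched to the identical image region (similarity 1.0) and the query region 0.0 to 0.5 (similarity 1/1.5), so the image-level score is their mean.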

Optimization and evaluation of generalized top queries.

Paolo Ciaccia, Roberto Cornacchia, Andrea Ghidini

01 Jan 2000

3 citations

Using the Wavelet Transform to Learn from User Feedback

Ilaria Bartolini¹, Paolo Ciaccia¹, Florian Waas

University of Bologna¹

01 Jan 2000

TL;DR: The authors propose multidimensional unbalanced wavelets to store the parameters determined during the feedback process and to predict, by interpolation, parameter settings for queries similar to earlier ones.

Abstract: User feedback has proven very successful for querying large multimedia databases. Due to the nature of the data representation and the mismatch between mathematical models and human perception, query techniques benefit substantially from interactive modification of a query. Typical examples are generalized ellipsoid queries, where optimal ratios and orientations of the half-axes are determined by relevance feedback. However, no information about the outcome of a feedback process is stored once the process has terminated. Accordingly, the entire feedback loop has to be repeated, starting out with default parameters, if the same query is posed again. In this paper we present preliminary results on how to preserve feedback results in a space-efficient way and learn from user feedback. Our system is built on multidimensional unbalanced wavelets, which store the parameters determined during the feedback process. Using wavelets lets us not only store parameter combinations but also predict, by interpolation, parameter settings for queries similar to earlier ones: the feedback process for an entirely new query can be started with a parameter setting that is usually much closer to the optimal one than the default parameters. As a result, after an initial learning phase, feedback is needed for fine tuning only, improving the effectiveness and response time of multimedia databases.

1 citation