Proceedings Article

Patterns for Indexing Large Datasets

TLDR
This work presents a few basic reusable indexing structures that can be combined to create advanced and complex indexing structures with less effort and time.
Abstract
Searching is one of the fundamental tasks in computer science. An intuitive way to search is linearly: start at the beginning of the dataset and continue until the searched-for item is found or the data is exhausted. However, as the volume of data increases, the response time of linear search is no longer acceptable. Indexes are designed to search through massive datasets quickly. There are a number of different ways of building complex and advanced indexes. Selecting and modifying indexing structures appropriately as business requirements change is crucial for data-intensive applications. In this work, we present a few basic reusable indexing structures. These structures can be used to create advanced and complex indexing structures with less effort and time.
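
The contrast between a linear scan and an index lookup can be illustrated with a minimal sketch. It is not one of the structures presented in the paper. It assumes a small in-memory list of comparable values, and the function names (`linear_search`, `build_sorted_index`, `indexed_search`) are hypothetical, chosen only for this illustration.

```python
from bisect import bisect_left


def linear_search(records, key):
    """Scan from the start until the key is found: O(n) comparisons."""
    for position, value in enumerate(records):
        if value == key:
            return position
    return -1  # reached the end without finding the key


def build_sorted_index(records):
    """Build a simple index: (value, position) pairs sorted by value.

    Assumption: values are mutually comparable and the data fits in memory.
    """
    return sorted((value, position) for position, value in enumerate(records))


def indexed_search(index, key):
    """Binary-search the sorted index: O(log n) comparisons per lookup."""
    # (key,) sorts before any (key, position) pair, so bisect_left lands
    # on the first index entry whose value equals key, if one exists.
    i = bisect_left(index, (key,))
    if i < len(index) and index[i][0] == key:
        return index[i][1]
    return -1


records = [42, 7, 19, 7, 88]
index = build_sorted_index(records)
print(linear_search(records, 19), indexed_search(index, 19))
```

Building the index costs a one-time sort, which pays off only when the same dataset is searched many times. That trade-off is the usual motivation for maintaining an index at all.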

