Showing papers by "Vincent Oria" published in 2010

Proceedings Article

Combining multi-probe histogram and order-statistics based LSH for scalable audio content retrieval

Yi Yu¹, Michel Crucianu², Vincent Oria¹, Ernesto Damiani³

New Jersey Institute of Technology¹, Conservatoire national des arts et métiers², University of Milan³

25 Oct 2010

TL;DR: A new multi-stage LSH scheme is suggested that extracts compact but accurate representations from audio tracks by exploiting the LSH idea to summarize them, and organizes the resulting representations in LSH tables while retaining almost the same accuracy as exact kNN retrieval.

Abstract: In order to improve the reliability and the scalability of content-based retrieval of variant audio tracks from large music databases, we suggest a new multi-stage LSH scheme that consists in (i) extracting compact but accurate representations from audio tracks by exploiting the LSH idea to summarize audio tracks, and (ii) adequately organizing the resulting representations in LSH tables, retaining almost the same accuracy as an exact kNN retrieval. In the first stage, we use major bins of successive chroma features to calculate a multi-probe histogram (MPH) that is concise but retains the information about local temporal correlations. In the second stage, based on the order statistics (OS) of the MPH, we propose a new LSH scheme, OS-LSH, to organize and probe the histograms. The representation and organization of the audio tracks are storage efficient and support robust and scalable retrieval. Extensive experiments over a large dataset with 30,000 real audio tracks confirm the effectiveness and efficiency of the proposed scheme.
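
The idea of hashing a histogram by its order statistics can be illustrated with a minimal sketch. This is not the paper's OS-LSH scheme: the key function `top_bins_key`, the parameter `k`, and the toy histograms are all assumptions made for illustration. The sketch uses the indices of the `k` largest bins as a bucket key, so histograms whose dominant bins agree land in the same bucket.

```python
from collections import defaultdict

def top_bins_key(histogram, k=2):
    """Return the indices of the k largest bins, largest first.

    Hypothetical key function: ties are broken by the lower bin index.
    """
    ranked = sorted(range(len(histogram)), key=lambda i: (-histogram[i], i))
    return tuple(ranked[:k])

# Toy histograms (assumed data, not taken from the paper).
tracks = {
    "track_a": [9, 1, 7, 0, 2],
    "track_a_cover": [8, 2, 6, 1, 1],
    "track_b": [0, 5, 1, 9, 3],
}

# Build one hash table: bucket key -> list of track names.
table = defaultdict(list)
for name, hist in tracks.items():
    table[top_bins_key(hist)].append(name)

# Probe with a query histogram; only tracks in the same bucket are candidates.
query = [10, 0, 8, 1, 1]
candidates = table[top_bins_key(query)]
```

A real system would then rank the candidates with an exact distance, and would typically probe several buckets rather than one.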

23 citations

Proceedings Article

PARINET: A tunable access method for in-network trajectories

Iulian Sandu Popa, Karine Zeitouni, Vincent Oria¹, Dominique Barth, Sandrine Vial

New Jersey Institute of Technology¹

01 Mar 2010

TL;DR: PARINET is a new access method for efficiently retrieving the trajectories of objects moving in networks. It combines graph partitioning with a set of composite B+-tree local indexes and significantly outperforms both MON-tree and another R-tree-based access method, the reference indexing techniques for in-network trajectory databases.

Abstract: In this paper we propose PARINET, a new access method to efficiently retrieve the trajectories of objects moving in networks. The structure of PARINET is based on a combination of graph partitioning and a set of composite B+-tree local indexes. PARINET is designed for historical data and relies on the distribution of the data over the network, since for historical data the distribution is known in advance. Because the network can be modeled using graphs, the partitioning of the trajectory data is based on graph partitioning theory and can be tuned for a given query load. The data in each partition is indexed on the time component using B+-trees. We study different types of queries, and provide an optimal configuration for several scenarios. PARINET can easily be integrated into any RDBMS, which is an essential asset particularly for industrial or commercial applications. The experimental evaluation under an off-the-shelf DBMS shows that PARINET is robust. It also significantly outperforms both MON-tree and another R-tree-based access method, which are the reference indexing techniques for in-network trajectory databases.
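
The combination of network partitioning with a per-partition time index can be sketched roughly as follows. This is an illustrative assumption, not the PARINET implementation: the class `PartitionedTimeIndex`, the segment-to-partition mapping, and the sample records are hypothetical, and a sorted Python list searched with `bisect` stands in for each B+-tree.

```python
import bisect
from collections import defaultdict

# Hypothetical result of graph partitioning: road segment id -> partition id.
segment_to_partition = {"s1": 0, "s2": 0, "s3": 1}

class PartitionedTimeIndex:
    """One time-ordered index per network partition (sorted list as a B+-tree stand-in)."""

    def __init__(self, segment_to_partition):
        self.segment_to_partition = segment_to_partition
        # partition id -> sorted list of (time, segment, object_id)
        self.entries = defaultdict(list)

    def insert(self, time, segment, object_id):
        part = self.segment_to_partition[segment]
        bisect.insort(self.entries[part], (time, segment, object_id))

    def query(self, segments, t_start, t_end):
        """Return records on the given segments with t_start <= time <= t_end."""
        wanted = set(segments)
        parts = {self.segment_to_partition[s] for s in wanted}
        hits = []
        for part in sorted(parts):  # only partitions touched by the query
            rows = self.entries[part]
            start = bisect.bisect_left(rows, (t_start,))
            for time, segment, object_id in rows[start:]:
                if time > t_end:
                    break
                if segment in wanted:
                    hits.append((time, segment, object_id))
        return hits

index = PartitionedTimeIndex(segment_to_partition)
index.insert(10, "s1", "car1")
index.insert(20, "s2", "car1")
index.insert(15, "s3", "car2")
index.insert(30, "s1", "car3")

result = index.query(["s1", "s2"], 5, 25)
```

The query touches only partition 0 here, which is the point of partitioning: records in unrelated parts of the network are never scanned.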

18 citations

Proceedings Article

Active caching for similarity queries based on shared-neighbor information

Michael E. Houle¹, Vincent Oria², Umar Qasim²

National Institute of Informatics¹, New Jersey Institute of Technology²

26 Oct 2010

TL;DR: This paper proposes an 'active caching' technique for similarity queries that is capable of synthesizing query results from cached information even when the required result list is not explicitly stored in the cache.

Abstract: Novel applications such as recommender systems, uncertain databases, and multimedia databases are designed to process similarity queries that produce ranked lists of objects as their results. Similarity queries typically result in disk access latency and incur a substantial computational cost. In this paper, we propose an 'active caching' technique for similarity queries that is capable of synthesizing query results from cached information even when the required result list is not explicitly stored in the cache. Our solution, the Cache Estimated Significance (CES) model, is based on shared-neighbor similarity measures, which assess the strength of the relationship between two objects as a function of the number of other objects in the common intersection of their neighborhoods. The proposed method is general in that it does not require that the features be drawn from a metric space, nor does it require that the partial orders induced by the similarity measure be monotonic. Experimental results on real data sets show a substantial cache hit rate when compared with traditional caching approaches.
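
A shared-neighbor similarity of the kind described above can be sketched in a few lines. This is a generic overlap measure under stated assumptions, not the CES model itself: the function name, the normalization by neighborhood size, and the cached neighbor lists are hypothetical.

```python
def shared_neighbor_similarity(neighbors_a, neighbors_b):
    """Share of neighbors that two neighbor lists have in common, in [0, 1]."""
    set_a, set_b = set(neighbors_a), set(neighbors_b)
    if not set_a or not set_b:
        return 0.0
    # Normalize the intersection size by the larger neighborhood.
    return len(set_a & set_b) / max(len(set_a), len(set_b))

# Hypothetical cached k-NN lists (k = 4), keyed by object id.
cache = {
    "a": ["b", "c", "d", "e"],
    "b": ["a", "c", "d", "f"],
    "g": ["h", "i", "j", "k"],
}

sim_ab = shared_neighbor_similarity(cache["a"], cache["b"])  # shares "c" and "d"
sim_ag = shared_neighbor_similarity(cache["a"], cache["g"])  # shares nothing
```

Because the measure only compares sets of neighbor ids, it needs no metric on the underlying features, which matches the generality claimed in the abstract.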

11 citations

Journal Article

Supplementing virtual documents with just-in-time hypermedia functionality

Li Zhang¹, Michael Bieber¹, Min Song¹, Vincent Oria¹, David E. Millard²

New Jersey Institute of Technology¹, University of Southampton²

01 Sep 2010 - International Journal on Digital Libraries

TL;DR: This article describes the specific challenges for virtual documents and dynamic hypermedia functionality (dynamic regeneration, and dynamic anchor re-identification and re-location) and discusses issues prompted by this research.

Abstract: Digital library systems and other analytic or computational applications create documents and display screens in response to user queries “dynamically” or in “real time.” These “virtual documents” do not exist in advance, and thus hypermedia features (links, comments, and bookmark anchors) must be generated “just in time”—automatically and dynamically. In addition, accessing the hypermedia features may cause target documents to be generated or re-generated. This article describes the specific challenges for virtual documents and dynamic hypermedia functionality: dynamic regeneration, and dynamic anchor re-identification and re-location. It presents Just-in-time Hypermedia Engine to support just-in-time hypermedia across digital library and other third-party applications with dynamic content, and discusses issues prompted by this research.

3 citations

Proceedings Article

Recommender system for MIR research community

Yi Yu¹, Vincent Oria¹, J. Stephen Downie²

New Jersey Institute of Technology¹, University of Illinois at Urbana–Champaign²

21 Jun 2010

TL;DR: This demonstration shows a recommender system for the Music Information Retrieval (MIR) research community that extracts the key topics and tags by analyzing the ten-year cumulative ISMIR proceedings, and recommends papers and research colleagues to users in an interactive way.

Abstract: In this demonstration, we show a recommender system for the Music Information Retrieval (MIR) research community. We extract the key topics and tags by analyzing the ten-year cumulative ISMIR proceedings, and recommend papers and research colleagues to users in an interactive way.
