
Showing papers by "Ye Yuan" published in 2013


Open Access

Journal Article

Efficient Keyword Search on Uncertain Graph Data


Ye Yuan¹, Guoren Wang¹, Lei Chen², Haixun Wang³

Northeastern University (China)¹, Hong Kong University of Science and Technology², Microsoft³

01 Dec 2013-IEEE Transactions on Knowledge and Data Engineering

TL;DR: A filtering-and-verification strategy based on a probabilistic keyword index, PKIndex, is employed: path-based top-k probabilities are computed offline for each keyword and attached to PKIndex in an optimal, compressed way to improve search efficiency.


Abstract: As a popular search mechanism, keyword search has been applied to retrieve useful data in documents, texts, graphs, and even relational databases. However, so far, there is no work on keyword search over uncertain graph data, even though uncertain graphs have been widely used in many real applications, such as modeling road networks, influence detection in social networks, and data analysis on PPI networks. Therefore, in this paper, we study the problem of top-k keyword search over uncertain graph data. Following an answer definition similar to that for keyword search over deterministic graphs, we consider a subtree in the uncertain graph as an answer to a keyword query if 1) it contains all the keywords; 2) it has a high score (defined by users or applications) based on keyword matching; and 3) it has low uncertainty. Keyword search over deterministic graphs is already a hard problem, as stated in [1], [2], [3]. Due to the existence of uncertainty, keyword search over uncertain graphs is much harder. Therefore, to improve the search efficiency, we employ a filtering-and-verification strategy based on a probabilistic keyword index, PKIndex. For each keyword, we compute path-based top-k probabilities offline and attach these values to PKIndex in an optimal, compressed way. In the filtering phase, we perform existence, path-based, and tree-based probabilistic pruning, which filters out most false subtrees. In the verification phase, we propose a sampling algorithm to verify the candidates. Extensive experimental results demonstrate the effectiveness of the proposed algorithms.
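
The filtering-and-verification idea in this abstract can be illustrated with a small sketch. It is not the paper's PKIndex or its sampling algorithm: it assumes a toy uncertain graph whose edges exist independently, represents each candidate subtree as a tuple of edges, uses the smallest edge probability as a cheap upper bound for filtering, and verifies survivors by sampling possible worlds. The names `upper_bound`, `sample_probability`, and `filter_and_verify` are hypothetical.

```python
import random

def upper_bound(tree, prob):
    # Assumption: edges are independent, so the tree's existence probability
    # is a product of edge probabilities and cannot exceed its smallest factor.
    return min(prob[e] for e in tree)

def sample_probability(tree, prob, n_samples=20000, seed=0):
    # Monte Carlo estimate: draw possible worlds and count those in which
    # every edge of the candidate tree is present.
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        if all(rng.random() < prob[e] for e in tree):
            hits += 1
    return hits / n_samples

def filter_and_verify(candidates, prob, threshold):
    # Filtering: discard candidates whose cheap upper bound is already too low.
    survivors = [t for t in candidates if upper_bound(t, prob) >= threshold]
    # Verification: estimate the probability of each survivor by sampling.
    verified = [(t, sample_probability(t, prob)) for t in survivors]
    return [(t, p) for t, p in verified if p >= threshold]

# Toy data (hypothetical): three uncertain edges and two candidate subtrees.
edge_prob = {("a", "b"): 0.9, ("b", "c"): 0.8, ("a", "d"): 0.3}
candidates = [
    (("a", "b"), ("b", "c")),
    (("a", "b"), ("a", "d")),
]
answers = filter_and_verify(candidates, edge_prob, threshold=0.5)
```

In this toy run the second candidate is pruned by the bound alone (0.3 is below the threshold), so sampling is spent only on the first one. With independent edges the exact probability is just a product, so the sampling step here only shows the shape of a verification phase.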


48 citations

Journal Article

Semantic concept detection for video based on extreme learning machine


Bo Lu¹, Guoren Wang¹, Ye Yuan¹, Dong Han

Northeastern University (China)¹

01 Feb 2013-Neurocomputing

TL;DR: An Extreme Learning Machine (ELM) based Multi-modality Classifier Combination Framework (MCCF) is proposed to improve the accuracy of semantic concept detection while running at extremely high speed.


22 citations

Proceedings Article

Efficient Probabilistic Skyline Query Processing in MapReduce


Linlin Ding¹, Guoren Wang¹, Junchang Xin¹, Ye Yuan¹

Northeastern University (China)¹

27 Jun 2013

TL;DR: A two-phase filter-refine approach in MapReduce is proposed that translates the probabilistic skyline query into two decomposable computations for obtaining the final results, together with an optimized probabilistic skyline query processing algorithm that prunes unpromising data in both the filter and refine phases.


Abstract: MapReduce is a popular parallel programming model, and how to process probabilistic skyline queries over uncertain data in the MapReduce framework is becoming an urgent problem. Implementing probabilistic skyline queries in MapReduce is nontrivial because the probabilistic skyline query is not decomposable. Therefore, in this paper, we propose a two-phase filter-refine approach in MapReduce that translates the probabilistic skyline query into two decomposable computations for obtaining the final results. First, we describe the whole filter-refine processing procedure, and then propose an efficient probabilistic skyline query processing algorithm in MapReduce. Furthermore, to reduce the computation and communication cost, we develop an optimized probabilistic skyline query processing algorithm that prunes unpromising data in both the filter and refine phases. Finally, we conduct extensive experiments on synthetic data to verify the effectiveness and efficiency of the proposed filter-refine approach under various experimental settings.
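
One way a filter-refine split could look is sketched below in plain Python, with no MapReduce runtime. It is not the paper's algorithm. The sketch assumes tuple-level uncertainty with independent existence probabilities, smaller values being better in every dimension, and a threshold query; under those assumptions the skyline probability of a tuple is its own probability times the probability that none of its dominators exists. The function names and the partition layout are hypothetical.

```python
from math import prod

def dominates(a, b):
    # a dominates b if it is no worse in every dimension and strictly
    # better in at least one (assumption: smaller values are better).
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_occurrence(point, partition):
    # Probability that no tuple of this partition that dominates `point` exists.
    return prod(1.0 - p for q, p in partition if dominates(q, point))

def filter_phase(partitions, threshold):
    # Each partition is processed on its own (the "map" side). The local
    # skyline probability is an upper bound on the global one, so tuples
    # below the threshold can be dropped without cross-partition traffic.
    survivors = []
    for part in partitions:
        for point, p in part:
            if p * non_occurrence(point, part) >= threshold:
                survivors.append((point, p))
    return survivors

def refine_phase(survivors, partitions, threshold):
    # For each survivor, combine the per-partition factors (the "reduce"
    # side) into the global skyline probability and apply the threshold.
    result = {}
    for point, p in survivors:
        global_prob = p * prod(non_occurrence(point, part) for part in partitions)
        if global_prob >= threshold:
            result[point] = global_prob
    return result

# Toy data (hypothetical): two partitions of (point, existence probability).
partitions = [
    [((1, 5), 0.9), ((2, 6), 0.8)],
    [((3, 1), 0.7), ((4, 2), 0.6), ((0, 4), 0.5)],
]
survivors = filter_phase(partitions, threshold=0.3)
skyline = refine_phase(survivors, partitions, threshold=0.3)
```

In this toy run, `(2, 6)` and `(4, 2)` are pruned locally in the filter phase, and the refine phase lowers the probability of `(1, 5)` once its dominator `(0, 4)` from the other partition is taken into account.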


9 citations

Book Chapter

An Algorithm for Outlier Detection on Uncertain Data Stream


Keyan Cao¹, Donghong Han¹, Guoren Wang¹, Yachao Hu¹, Ye Yuan¹ (+1 more)

Northeastern University¹

04 Apr 2013

TL;DR: A new outlier concept on uncertain data streams, based on possible worlds, is proposed, together with a detection method that meets the demands of limited storage and real-time processing and an efficient range query method based on SM-tree (Statistics M-tree) that reduces redundant calculation.


Abstract: Outlier detection plays an important role in fraud detection, sensor networks, computer network management, and many other areas. As the streaming nature and uncertainty of data become more and more apparent, outlier detection on uncertain data streams has become a new research topic. First, we propose a new outlier concept on uncertain data streams based on possible worlds. Then an outlier detection method on uncertain data streams is proposed to meet the demands of limited storage and real-time processing. Next, a dynamic storage structure is designed for outlier detection on uncertain data streams over a sliding window, to meet the demands of limited storage and real-time response. Furthermore, an efficient range query method based on SM-tree (Statistics M-tree) is proposed to reduce redundant calculation. Finally, the performance of our method is verified through a large number of simulation experiments. The experimental results show that our method is an effective way to solve the problem of outlier detection on uncertain data streams, and that it can significantly reduce execution time and storage space.
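
The abstract does not spell out its outlier definition, so the sketch below assumes a common distance-based one: a window element is an outlier if, over all possible worlds, the probability that fewer than `k` of its neighbours within radius `r` exist reaches a threshold. Elements are one-dimensional values with independent existence probabilities, and a plain linear scan stands in for the SM-tree range query. The class and function names are hypothetical.

```python
from collections import deque

def prob_fewer_than_k(neighbor_probs, k):
    # Dynamic program over independent neighbours: dp[j] is the probability
    # that exactly j neighbours exist, with bucket k meaning "k or more".
    dp = [1.0] + [0.0] * k
    for p in neighbor_probs:
        new = [0.0] * (k + 1)
        for j, mass in enumerate(dp):
            new[j] += mass * (1 - p)          # this neighbour is absent
            new[min(j + 1, k)] += mass * p    # this neighbour is present
        dp = new
    return sum(dp[:k])

class UncertainWindow:
    def __init__(self, size, radius, k, threshold):
        # deque(maxlen=size) drops the oldest element automatically,
        # which gives a count-based sliding window with bounded storage.
        self.items = deque(maxlen=size)
        self.radius = radius
        self.k = k
        self.threshold = threshold

    def add(self, value, prob):
        self.items.append((value, prob))

    def outliers(self):
        found = []
        for i, (v, _) in enumerate(self.items):
            # Linear range query; an index would replace this scan.
            nbrs = [p for j, (u, p) in enumerate(self.items)
                    if j != i and abs(u - v) <= self.radius]
            if prob_fewer_than_k(nbrs, self.k) >= self.threshold:
                found.append(v)
        return found

# Toy stream (hypothetical values): three close readings and one far away.
window = UncertainWindow(size=4, radius=1.0, k=2, threshold=0.5)
for value, prob in [(1.0, 0.9), (1.2, 0.9), (0.8, 0.9), (5.0, 0.9)]:
    window.add(value, prob)
flagged = window.outliers()
```

The dynamic program avoids enumerating possible worlds explicitly: with `n` neighbours it runs in `O(n * k)` time instead of touching all `2^n` worlds.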


3 citations