Gang Qian

Other affiliations: Michigan State University

Bio: Gang Qian is an academic researcher at the University of Central Oklahoma. The author has contributed to research on the topics of search engine indexing and tree data structures. The author has an h-index of 9 and has co-authored 28 publications receiving 994 citations. Previous affiliations of Gang Qian include Michigan State University.

Papers

Proceedings Article•DOI•

Segmentation and histogram generation using the HSV color space for image retrieval

Shamik Sural¹, Gang Qian¹, Sakti Pramanik¹•Institutions (1)

Michigan State University¹

10 Dec 2002

TL;DR: The feature extraction method has been applied for both image segmentation as well as histogram generation applications - two distinct approaches to content based image retrieval (CBIR), showing better identification of objects in an image.

Abstract: We have analyzed the properties of the HSV (hue, saturation and value) color space with emphasis on the visual perception of the variation in hue, saturation and intensity values of an image pixel. We extract pixel features by either choosing the hue or the intensity as the dominant property based on the saturation value of a pixel. The feature extraction method has been applied for both image segmentation as well as histogram generation applications - two distinct approaches to content based image retrieval (CBIR). Segmentation using this method shows better identification of objects in an image. The histogram retains a uniform color transition that enables us to do a window-based smoothing during retrieval. The results have been compared with those generated using the RGB color space.

555 citations
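
The per-pixel rule described in the abstract above (use hue when a pixel is well saturated, intensity otherwise) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `dominant_feature` and the fixed cut-off `SAT_THRESHOLD = 0.2` are assumptions made for the example, and the abstract does not say how the saturation threshold is actually chosen.

```python
import colorsys

# Hypothetical fixed cut-off; the abstract does not give the real threshold.
SAT_THRESHOLD = 0.2


def dominant_feature(r, g, b, sat_threshold=SAT_THRESHOLD):
    """Pick the dominant property of one RGB pixel (channels in 0..255).

    Returns ("hue", h) when saturation is at or above the threshold,
    otherwise ("intensity", v). Both h and v are in the range [0, 1].
    """
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    if s >= sat_threshold:
        # A saturated pixel is perceived mainly by its color.
        return ("hue", h)
    # A weakly saturated pixel looks gray, so brightness dominates.
    return ("intensity", v)


if __name__ == "__main__":
    print(dominant_feature(255, 0, 0))      # pure red: hue dominates
    print(dominant_feature(128, 128, 128))  # mid gray: intensity dominates
```

The resulting (kind, value) pairs could then feed either a segmentation step or a histogram, which are the two uses the abstract mentions.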

Proceedings Article•DOI•

Gang Qian¹, Shamik Sural², Yuelong Gu¹, Sakti Pramanik¹•Institutions (2)

Michigan State University¹, Indian Institute of Technology Kharagpur²

14 Mar 2004

TL;DR: This paper compares two commonly used distance measures in vector models, namely, Euclidean distance (EUD) and cosine angle distance (CAD), for nearest neighbor (NN) queries in high dimensional data spaces and shows that CAD works no worse than EUD.

Abstract: Understanding the relationship among different distance measures is helpful in choosing a proper one for a particular application. In this paper, we compare two commonly used distance measures in vector models, namely, Euclidean distance (EUD) and cosine angle distance (CAD), for nearest neighbor (NN) queries in high dimensional data spaces. Using theoretical analysis and experimental results, we show that the retrieval results based on EUD are similar to those based on CAD when dimension is high. We have applied CAD for content based image retrieval (CBIR). Retrieval results show that CAD works no worse than EUD, which is a commonly used distance measure for CBIR, while providing other advantages, such as naturally normalized distance.

281 citations
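
To make the two measures compared in the abstract above concrete, here is a small sketch. It assumes that the cosine angle distance is defined as one minus the cosine of the angle between the vectors, which is a common convention but may differ from the paper's exact definition; the function names are illustrative only. For unit-length vectors the two are tied by the identity EUD² = 2 · CAD, so they rank neighbors identically in that case.

```python
import math


def euclidean_distance(x, y):
    """Straight-line (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def cosine_angle_distance(x, y):
    """1 - cos(angle between x and y); assumed definition of CAD."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return 1.0 - dot / (norm_x * norm_y)


def nearest_neighbor(query, points, dist):
    """Index of the point closest to `query` under the distance `dist`."""
    return min(range(len(points)), key=lambda i: dist(query, points[i]))


if __name__ == "__main__":
    points = [[10.0, 0.0], [0.1, 0.1], [3.0, 3.5]]
    query = [1.0, 1.0]
    print(nearest_neighbor(query, points, euclidean_distance))
    print(nearest_neighbor(query, points, cosine_angle_distance))
```

Because CAD ignores vector length, it is insensitive to uniform scaling of a feature vector, which is one way to read the "naturally normalized distance" advantage mentioned in the abstract.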

Book Chapter•DOI•

The ND-tree: a dynamic indexing technique for multidimensional non-ordered discrete data spaces

Gang Qian¹, Qiang Zhu², Qiang Xue¹, Sakti Pramanik¹•Institutions (2)

Michigan State University¹, University of Michigan²

09 Sep 2003

TL;DR: In this paper, a dynamic indexing technique called the ND-tree is proposed to support efficient similarity searches in an NDDS, which extends the relevant geometric concepts as well as some indexing strategies used in CDSs to NDDSs.

Abstract: Similarity searches in multidimensional Nonordered Discrete Data Spaces (NDDS) are becoming increasingly important for application areas such as genome sequence databases. Existing indexing methods developed for multidimensional (ordered) Continuous Data Spaces (CDS) such as R-tree cannot be directly applied to an NDDS. This is because some essential geometric concepts/properties such as the minimum bounding region and the area of a region in a CDS are no longer valid in an NDDS. On the other hand, indexing methods based on metric spaces such as M-tree are too general to effectively utilize the data distribution characteristics in an NDDS. Therefore, their retrieval performance is not optimized. To support efficient similarity searches in an NDDS, we propose a new dynamic indexing technique, called the ND-tree. The key idea is to extend the relevant geometric concepts as well as some indexing strategies used in CDSs to NDDSs. Efficient algorithms for ND-tree construction are presented. Our experimental results on synthetic and genomic sequence data demonstrate that the performance of the ND-tree is significantly better than that of the linear scan and M-tree in high dimensional NDDSs.

35 citations
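
For readers unfamiliar with similarity search in a non-ordered discrete data space, the sketch below shows the linear-scan baseline that the abstract above compares against. It is not the ND-tree. It assumes Hamming distance (the number of positions at which two vectors differ) as the similarity measure, a natural choice when letters such as A, C, G, T have no ordering; the function names are hypothetical.

```python
def hamming_distance(u, v):
    """Number of positions at which two equal-length sequences differ."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of dimensions")
    return sum(a != b for a, b in zip(u, v))


def linear_scan_range_query(database, query, radius):
    """Return every vector within `radius` mismatches of `query`.

    Every vector in the database is examined; an index structure
    aims to answer the same query while visiting far fewer of them.
    """
    return [v for v in database if hamming_distance(v, query) <= radius]


if __name__ == "__main__":
    genome_fragments = ["ACGT", "ACGA", "TTTT", "AGGT"]
    print(linear_scan_range_query(genome_fragments, "ACGT", 1))
```

The scan costs one distance computation per stored vector, which is the cost an index has to beat.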

A Dynamic Indexing Technique for Multidimensional Non-ordered Discrete Data Spaces

Gang Qian, Qiang Zhu, Qiang Xue, Sakti Pramanik

01 Jan 2006

TL;DR: The key idea is to extend the relevant geometric concepts as well as some indexing strategies used in CDSs to NDDSs; experiments demonstrate that the performance of the ND-tree is significantly better than that of the linear scan and M-tree in high dimensional NDDSs.

Abstract: Similarity searches in multidimensional Non-ordered Discrete Data Spaces (NDDS) are becoming increasingly important for application areas such as bioinformatics, biometrics, data mining and E-commerce. Efficient similarity searches require robust indexing techniques. Unfortunately, existing indexing methods developed for multidimensional (ordered) Continuous Data Spaces (CDS) such as the R-tree cannot be directly applied to an NDDS. This is because some essential geometric concepts/properties such as the minimum bounding region and the area of a region in a CDS are no longer valid in an NDDS. Other indexing methods based on metric spaces such as the M-tree and the Slim-trees are too general to effectively utilize the special characteristics of NDDSs, resulting in non-optimized performance. In this paper, we propose a new dynamic indexing technique, called the ND-tree, to support efficient similarity searches in an NDDS. The key idea is to extend the relevant geometric concepts as well as some indexing strategies used in CDSs to NDDSs. Efficient algorithms for ND-tree construction and techniques to solve relevant issues such as handling dimensions with different alphabets in an NDDS are presented. Our experimental results on synthetic data and real genome sequence data demonstrate that the ND-tree outperforms the linear scan, the M-tree and the Slim-trees for similarity searches in multidimensional NDDSs. A theoretical model is also developed to predict the performance of the ND-tree for random data.

33 citations

Journal Article•DOI•

Dynamic indexing for multidimensional non-ordered discrete data spaces using a data-partitioning approach

Gang Qian¹, Qiang Zhu², Qiang Xue³, Sakti Pramanik³•Institutions (3)

University of Central Oklahoma¹, University of Michigan², Michigan State University³

01 Jun 2006-ACM Transactions on Database Systems

TL;DR: The experimental results on synthetic data and real genome sequence data demonstrate that the ND-tree outperforms the linear scan, the M-tree and the Slim-trees for similarity searches in multidimensional NDDSs.

Abstract: Similarity searches in multidimensional Non-ordered Discrete Data Spaces (NDDS) are becoming increasingly important for application areas such as bioinformatics, biometrics, data mining and E-commerce. Efficient similarity searches require robust indexing techniques. Unfortunately, existing indexing methods developed for multidimensional (ordered) Continuous Data Spaces (CDS) such as the R-tree cannot be directly applied to an NDDS. This is because some essential geometric concepts/properties such as the minimum bounding region and the area of a region in a CDS are no longer valid in an NDDS. Other indexing methods based on metric spaces such as the M-tree and the Slim-trees are too general to effectively utilize the special characteristics of NDDSs, resulting in nonoptimized performance. In this article, we propose a new dynamic data-partitioning-based indexing technique, called the ND-tree, to support efficient similarity searches in an NDDS. The key idea is to extend the relevant geometric concepts as well as some indexing strategies used in CDSs to NDDSs. Efficient algorithms for ND-tree construction and techniques to solve relevant issues such as handling dimensions with different alphabets in an NDDS are presented. Our experimental results on synthetic data and real genome sequence data demonstrate that the ND-tree outperforms the linear scan, the M-tree and the Slim-trees for similarity searches in multidimensional NDDSs. A theoretical model is also developed to predict the performance of the ND-tree for random data.

31 citations

Cited by

SPAdes, a new genome assembly algorithm and its applications to single-cell sequencing (7th Annual SFAF Meeting, 2012)

Glenn Tesler

01 Jun 2012

TL;DR: SPAdes is a new assembler for both single-cell and standard (multicell) assembly that improves on the recently released E+V-SC assembler and on the popular assemblers Velvet and SoapDeNovo (for multicell data).

Abstract: The lion's share of bacteria in various environments cannot be cloned in the laboratory and thus cannot be sequenced using existing technologies. A major goal of single-cell genomics is to complement gene-centric metagenomic data with whole-genome assemblies of uncultivated organisms. Assembly of single-cell data is challenging because of highly non-uniform read coverage as well as elevated levels of sequencing errors and chimeric reads. We describe SPAdes, a new assembler for both single-cell and standard (multicell) assembly, and demonstrate that it improves on the recently released E+V-SC assembler (specialized for single-cell data) and on popular assemblers Velvet and SoapDeNovo (for multicell data). SPAdes generates single-cell assemblies, providing information about genomes of uncultivatable bacteria that vastly exceeds what may be obtained via traditional metagenomics studies. SPAdes is available online ( http://bioinf.spbau.ru/spades ). It is distributed as open source software.

10,124 citations

Journal Article•

Data Mining Practical Machine Learning Tools and Techniques

อนิรุธ สืบสิงห์

01 Jan 2014-Journal of management science

9,185 citations

Proceedings Article•DOI•

Segmentation and histogram generation using the HSV color space for image retrieval

Shamik Sural¹, Gang Qian¹, Sakti Pramanik¹•Institutions (1)

Michigan State University¹

10 Dec 2002

555 citations

Journal Article•

ACM Transactions on Database Systems

Dan Suciu, Gerhard Weikum

01 Jan 2005-ACM Transactions on Database Systems

373 citations

Proceedings Article•DOI•

Scalable Graph-based Bug Search for Firmware Images

Qian Feng¹, Rundong Zhou¹, Chengcheng Xu¹, Yao Cheng¹, Brian Testa², Heng Yin³•Institutions (3)

Syracuse University¹, Air Force Research Laboratory², University of California, Riverside³

24 Oct 2016

TL;DR: A new bug search scheme is proposed which addresses the scalability challenge in existing cross-platform bug search techniques and further improves search accuracy, and implemented a bug search engine, Genius, and compared it with state-of-art bug search approaches.

Abstract: Because of rampant security breaches in IoT devices, searching vulnerabilities in massive IoT ecosystems is more crucial than ever. Recent studies have demonstrated that control-flow graph (CFG) based bug search techniques can be effective and accurate in IoT devices across different architectures. However, these CFG-based bug search approaches are far from being scalable to handle an enormous amount of IoT devices in the wild, due to their expensive graph matching overhead. Inspired by rich experience in image and video search, we propose a new bug search scheme which addresses the scalability challenge in existing cross-platform bug search techniques and further improves search accuracy. Unlike existing techniques that directly conduct searches based upon raw features (CFGs) from the binary code, we convert the CFGs into high-level numeric feature vectors. Compared with the CFG feature, high-level numeric feature vectors are more robust to code variation across different architectures, and can easily achieve realtime search by using state-of-the-art hashing techniques. We have implemented a bug search engine, Genius, and compared it with state-of-art bug search approaches. Experimental results show that Genius outperforms baseline approaches for various query loads in terms of speed and accuracy. We also evaluated Genius on a real-world dataset of 33,045 devices which was collected from public sources and our system. The experiment showed that Genius can finish a search within 1 second on average when performed over 8,126 firmware images of 420,558,702 functions. By only looking at the top 50 candidates in the search result, we found 38 potentially vulnerable firmware images across 5 vendors, and confirmed 23 of them by our manual analysis. We also found that it took only 0.1 seconds on average to finish searching for all 154 vulnerabilities in two latest commercial firmware images from D-LINK. 103 of them are potentially vulnerable in these images, and 16 of them were confirmed.

325 citations
