Showing papers by "Liqing Zhang" published in 2012


Book ChapterDOI
07 Oct 2012
TL;DR: A real-time solution is presented to automatically segment a user's sketch during his/her drawing using a graph-based sketch segmentation algorithm and a semantic-based approach to simulate the past experience in the perceptual system by leveraging a web-scale clipart database.
Abstract: In this paper, we study the problem of how to segment a freehand sketch at the object level. By carefully considering the basic principles of human perceptual organization, a real-time solution is presented to automatically segment a user's sketch during his/her drawing. First, a graph-based sketch segmentation algorithm is proposed to segment a cluttered sketch into multiple parts based on the factor of proximity. Then, to improve the ability of detecting semantically meaningful objects, a semantic-based approach is introduced to simulate the past experience in the perceptual system by leveraging a web-scale clipart database. Finally, other important factors learnt from past experience, such as similarity, symmetry, direction, and closure, are also taken into account to make the approach more robust and practical. The proposed sketch segmentation framework has the ability to handle complex sketches with overlapped objects. Extensive experimental results show the effectiveness of the proposed framework and algorithms.

64 citations
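
The proximity factor in the sketch-segmentation abstract above can be illustrated with a small graph example. This is only a minimal sketch under stated assumptions, not the paper's algorithm: strokes are assumed to be lists of (x, y) points, two strokes are linked when their closest points lie within a hypothetical `threshold`, and segments are taken to be the connected components of that graph. The function names are illustrative.

```python
from itertools import combinations
from math import dist


def min_stroke_distance(a, b):
    """Smallest point-to-point distance between two strokes."""
    return min(dist(p, q) for p in a for q in b)


def group_strokes_by_proximity(strokes, threshold):
    """Return groups of stroke indices that are connected by proximity.

    Assumption: an edge joins two strokes whose minimum distance is at
    most `threshold`; each connected component is one segment.
    """
    parent = list(range(len(strokes)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(strokes)), 2):
        if min_stroke_distance(strokes[i], strokes[j]) <= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(strokes)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Two nearby strokes and one distant stroke (made-up coordinates).
strokes = [
    [(0, 0), (1, 0)],
    [(1, 1), (2, 1)],
    [(10, 10), (11, 10)],
]
segments = group_strokes_by_proximity(strokes, threshold=1.5)
print(segments)
```

The abstract also names semantic cues and other perceptual factors (similarity, symmetry, direction, closure); none of those are modeled here.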


Proceedings ArticleDOI
29 Oct 2012
TL;DR: A simple and effective sketch-based algorithm for large-scale image retrieval that uses two types of candidate regions to increase the retrieval rate, and that extracts orientation features and organizes them hierarchically to generate global-to-local features.
Abstract: The paper presents a simple and effective sketch-based algorithm for large-scale image retrieval. One of the main challenges in image retrieval is to localize a region in an image that matches the query image in contour. To tackle this problem, we use the human perception mechanism to identify two types of regions in one image. The first type of region (the main region) is defined by a weighted center of image features, suggesting that we could retrieve objects in images regardless of their sizes and positions. The second type of region, called the region of interest (ROI), captures the most salient part of an image and helps retrieve images that contain objects similar to the query in a complicated scene. Using the two types of regions as candidate regions for feature extraction, our algorithm could increase the retrieval rate dramatically. In addition, to speed up retrieval, we first extract orientation features and then organize them hierarchically to generate global-to-local features. Based on this characteristic, a hierarchical database index structure can be built, which makes it possible to retrieve images from a very large-scale image database online. Finally, a real-time image retrieval system on a database of 4.5 million images is developed to verify the proposed algorithm. The experimental results show excellent retrieval performance of the proposed algorithm, and comparisons with other algorithms are also given.

57 citations
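
The "weighted center of image features" that defines the main region in the abstract above can be made concrete with a short example. This is a hedged sketch only: the abstract does not say how the region's extent is chosen, so the bounding box here (weighted standard deviation times a hypothetical `scale`) is an assumption, as are the function names and the sample data.

```python
from math import sqrt


def weighted_center(points, weights):
    """Weighted centroid of 2-D feature locations."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    cx = sum(w * x for (x, _), w in zip(points, weights)) / total
    cy = sum(w * y for (_, y), w in zip(points, weights)) / total
    return cx, cy


def main_region(points, weights, scale=1.0):
    """Axis-aligned box around the weighted center.

    Assumption: the half-width and half-height are `scale` times the
    weighted standard deviation along each axis.
    """
    cx, cy = weighted_center(points, weights)
    total = sum(weights)
    sx = sqrt(sum(w * (x - cx) ** 2 for (x, _), w in zip(points, weights)) / total)
    sy = sqrt(sum(w * (y - cy) ** 2 for (_, y), w in zip(points, weights)) / total)
    return (cx - scale * sx, cy - scale * sy, cx + scale * sx, cy + scale * sy)


# Four feature locations with equal weights (made-up data).
points = [(0, 0), (4, 0), (0, 4), (4, 4)]
weights = [1, 1, 1, 1]
center = weighted_center(points, weights)
box = main_region(points, weights, scale=1.0)
print(center, box)
```

Because the box is placed relative to the features rather than the image frame, the same object yields a comparable region wherever it appears, which is the intuition the abstract gives for size and position independence.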


Proceedings ArticleDOI
29 Oct 2012
TL;DR: A query-adaptive shape topic model is proposed to mine object topics and shape topics related to the sketch, in which multiple layers of information such as sketch, object, shape, image, and semantic labels are modeled in a generative process.
Abstract: In this work, we study the problem of hand-drawn sketch recognition. Due to the large intra-class variations present in hand-drawn sketches, most existing work was limited to a particular domain or to a limited set of pre-defined classes. Different from existing work, we aim to develop a general sketch recognition system that recognizes any semantically meaningful object that a child can recognize. To increase the recognition coverage, a web-scale clipart image collection is leveraged as the knowledge base of the recognition system. To alleviate the problems of intra-class shape variation and inter-class shape ambiguity in this unconstrained situation, a query-adaptive shape topic model is proposed to mine object topics and shape topics related to the sketch, in which multiple layers of information such as sketch, object, shape, image, and semantic labels are modeled in a generative process. Besides sketch recognition, the proposed topic model can also be used for related applications such as sketch tagging, image tagging, and sketch-based image search. Extensive experiments on different applications show the effectiveness of the proposed topic model and the recognition system.

40 citations


Journal ArticleDOI
TL;DR: This work proposed a new paradigm where subjects participated in training more actively than in the traditional paradigm, and this active training paradigm may generate better training samples with fewer inconsistent labels because it overcomes mistakes when the subject’s motor imagination does not match the given cues.
Abstract: Brain–computer interface (BCI) allows people to use brain activity to communicate directly with the external world or to control external devices without the participation of any peripheral nerves and muscles. Motor imagery is one of the most popular modes in the research field of brain–computer interface. Although motor imagery BCI has some advantages compared with other modes of BCI, such as asynchronization, it requires training sessions before use. The performance of a trained BCI system depends on the quality of the training samples and on subject engagement. In order to improve the training effect and decrease training time, we proposed a new paradigm where subjects participated in training more actively than in the traditional paradigm. In the traditional paradigm, a cue (to indicate what kind of motor imagery should be imagined during the current trial) is given to the subject at the beginning of a trial or during a trial, and this cue is also used as a label for this trial. It is usually assumed that labels for trials are accurate in the traditional paradigm, although subjects may not have performed the required or correct kind of motor imagery, and trials may thus be mislabeled. Those mislabeled trials then interfere with model training. In our proposed paradigm, the subject is required to reconfirm the label and can correct the label when necessary. This active training paradigm may generate better training samples with fewer inconsistent labels because it overcomes mistakes when the subject’s motor imagination does not match the given cues. The experiments confirm that our proposed paradigm achieves better performance; the improvement is significant according to statistical analysis.

30 citations


Proceedings ArticleDOI
01 Jan 2012
TL;DR: Results show that ERP and EEG spectral power modulations contribute complementary information to decoding intended movement directions in the PPC, which might lead to a practical brain-computer interface (BCI) for decoding movement intention of individuals.
Abstract: The posterior parietal cortex (PPC) plays an important role in visuomotor transformations for movement planning and execution. To investigate how noninvasive electroencephalographic (EEG) signals correlate with intended movement directions in the PPC, this study recorded whole-head EEG during a delayed saccade-or-reach task and found direction-related changes in both event-related potentials (ERPs) and the EEG power in the theta and alpha bands in the PPC. Single-trial (left versus right) classification using ERP and EEG spectral features prior to motor execution obtained an average accuracy of 65.4% and 65.6% respectively on 10 subjects. By combining the two types of features, the classification accuracy increased to 69.7%. These results show that ERP and EEG spectral power modulations contribute complementary information to decoding intended movement directions in the PPC. The proposed paradigm might lead to a practical brain-computer interface (BCI) for decoding movement intention of individuals.

24 citations


Proceedings ArticleDOI
29 Oct 2012
TL;DR: Sketch2Tag is a general sketch recognition system, towards recognizing any semantically meaningful object that a child can recognize, and a web-scale clipart image collection is leveraged as the knowledge base of the recognition system.
Abstract: In this work, we introduce the Sketch2Tag system for hand-drawn sketch recognition. Due to the large variations present in hand-drawn sketches, most existing work was limited to a particular domain or to a limited set of predefined classes. Different from existing work, Sketch2Tag is a general sketch recognition system that aims to recognize any semantically meaningful object that a child can recognize. This system enables a user to draw a sketch on the query panel, and then provides real-time recognition results. To increase the recognition coverage, a web-scale clipart image collection is leveraged as the knowledge base of the recognition system. A better understanding of a user's drawing will be of great value to a variety of applications, such as improving sketch-based image search by using the recognition results as textual queries.

8 citations


Journal ArticleDOI
TL;DR: The proposed model generalizes traditional LDA by introducing concept-continuous words, and the experimental results show the method is a valuable direction for generalizing topic models.

6 citations


Book ChapterDOI
11 Jul 2012
TL;DR: A feature extraction method based on PCA and ICA is proposed that can effectively exclude the influence of inaccurate cardiology features and greatly improve the classification accuracy for heart diseases.
Abstract: In ECG auto-diagnosis, classification accuracy is a vital factor in providing diagnosis decision support for remote ECG diagnosis. The final accuracy depends on ECG preprocessing, feature extraction, feature selection, and classification. However, different heart diseases produce different ECG wave shapes, and there are large numbers of heart diseases, so it is hard to accurately extract cardiology features from diverse ECG waveforms. The extracted cardiology features also often carry large errors, which to some extent influence the classification accuracy. To deal with these problems, we propose a feature extraction method based on PCA and ICA. We calculate an adaptive basis with ICA and PCA for the ECG of a given disease type and extract the coefficients with respect to the trained basis, which are used as classification features in combination with the cardiology features. To prevent the curse of dimensionality brought by the additional ICA and PCA features, a minimal-redundancy maximal-relevance feature selection method is adapted to reduce the dimension of the feature vector. Experiments show that our method can effectively exclude the influence of inaccurate cardiology features and greatly improve the classification accuracy for heart diseases.

4 citations


Proceedings ArticleDOI
29 Oct 2012
TL;DR: A real-time image retrieval system that allows users to search for target images whose objects are similar to the query in contour, regardless of their sizes and positions in the images, and that has a better retrieval rate than existing systems and algorithms.
Abstract: We propose a real-time image retrieval system which allows users to search for target images whose objects are similar to the query in contour, regardless of their sizes and positions in the images. Even in a complicated scene, as long as the object's contour is the most salient in the target image, the system is still able to capture it and list the image in the retrieval results. Therefore, the system has a better retrieval rate than existing systems and algorithms. One typical application of the proposed system is to help the computer understand what the user draws or uploads. Based on the statistical distribution of the tags of the retrieved images, the proposed system feeds back some candidate tags related to the query image. Such tags could be used for further retrieval to refine the result list. In addition, the system provides a friendly interactive interface with multiple queries. These queries are formed from different combinations of tags, a hand-drawn sketch, and a natural image, and could help users search images flexibly and conveniently. The system runs on a database of 1.3 million images and could achieve real-time retrieval speed. The results in the demonstration show excellent retrieval performance of the proposed system.

3 citations
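
The tag feedback described in the abstract above, where candidate tags come from the statistical distribution of tags over the retrieved images, might look roughly like the following. This is an illustrative sketch, not the system's implementation: the function name `suggest_tags`, the per-image counting rule, and the sample tags are all assumptions.

```python
from collections import Counter


def suggest_tags(retrieved_tag_lists, top_k=3):
    """Rank tags by the fraction of retrieved images that carry them.

    Assumption: each tag counts at most once per image, so the score is
    the share of retrieved images containing that tag.
    """
    if not retrieved_tag_lists:
        return []
    counts = Counter()
    for tags in retrieved_tag_lists:
        counts.update(set(tags))
    total = len(retrieved_tag_lists)
    return [(tag, n / total) for tag, n in counts.most_common(top_k)]


# Tags of four hypothetical images returned for one query.
retrieved = [
    ["apple", "fruit"],
    ["apple", "red"],
    ["apple", "fruit"],
    ["car"],
]
candidates = suggest_tags(retrieved, top_k=2)
print(candidates)
```

The top-ranked tags could then be submitted as a follow-up textual query, matching the refinement step mentioned in the abstract.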


Proceedings Article
01 Dec 2012
TL;DR: A higher-order PLS approach is proposed that finds the latent variables related to the target labels and then performs classification based on those latent variables, in order to effectively extract the underlying components of brain activity that correspond to a specific mental state.
Abstract: The EEG signals recorded during Brain Computer Interfaces (BCIs) are naturally represented by multi-way arrays in spatial, temporal, and frequency domains. In order to effectively extract the underlying components from brain activities which correspond to the specific mental state, we propose the higher-order PLS approach to find the latent variables related to the target labels and then perform classification based on the latent variables. To this end, the low-dimensional latent space can be optimized by using the higher-order SVD on a cross-product tensor, and the latent variables are considered as shared components between observed data and target output. The EEG signals recorded under the P300-type affective BCI paradigm were used to demonstrate the effectiveness of our new approach.

3 citations


01 Jan 2012
TL;DR: An unsupervised learning approach based on an extended Independent Subspace Analysis model is investigated to extract spatio-temporal features directly from the video data, followed by a bag-of-words procedure and an SVM classifier.
Abstract: In TRECVID 2012, our team took part in the Surveillance Event Detection (SED) task and completed detection for four human events. We investigate an unsupervised learning approach based on an extended Independent Subspace Analysis model to extract spatio-temporal features directly from the video data. A bag-of-words procedure and an SVM classifier are then used. We present the results and comparison of our primary run on the detection task.
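
The bag-of-words step named in this abstract can be illustrated as follows. This is a minimal sketch under assumptions: the codebook is taken as given (in practice it would be learned, for example by clustering), descriptors are short tuples rather than real spatio-temporal features, and the SVM stage is omitted. The function name and data are hypothetical.

```python
from math import dist


def bag_of_words(descriptors, codebook):
    """Normalized histogram of nearest-codeword assignments.

    Each descriptor votes for its closest codeword (Euclidean distance);
    the histogram is divided by the number of descriptors.
    """
    hist = [0] * len(codebook)
    for d in descriptors:
        nearest = min(range(len(codebook)), key=lambda k: dist(d, codebook[k]))
        hist[nearest] += 1
    n = len(descriptors)
    if n == 0:
        return [0.0] * len(codebook)
    return [h / n for h in hist]


# Two codewords and four local descriptors from one clip (made-up values).
codebook = [(0.0, 0.0), (10.0, 10.0)]
descriptors = [(1.0, 0.0), (0.0, 1.0), (9.0, 9.0), (1.0, 1.0)]
histogram = bag_of_words(descriptors, codebook)
print(histogram)
```

The resulting fixed-length histogram is the kind of vector that a classifier such as an SVM can take as input, regardless of how many local descriptors a clip produced.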