Author

Rong Yan

Bio: Rong Yan is an academic researcher from IBM. The author has contributed to research on topics including TRECVID and image retrieval. The author has an h-index of 48 and has co-authored 158 publications receiving 11,918 citations. Previous affiliations of Rong Yan include Facebook and Wuhan Polytechnic University.


Papers
Journal ArticleDOI
01 Aug 2007-Fuel
TL;DR: In this article, the pyrolysis characteristics of the three main components of biomass (hemicellulose, cellulose and lignin) were investigated using, respectively, a thermogravimetric analyzer (TGA) with a differential scanning calorimetry (DSC) detector and a packed bed.

5,859 citations

Proceedings ArticleDOI
29 Sep 2007
TL;DR: This paper proposes Adaptive Support Vector Machines (A-SVMs) as a general method to adapt one or more existing classifiers of any type to a new dataset. The method outperforms several baseline and competing methods in classification accuracy and efficiency for cross-domain concept detection on the TRECVID corpus.
Abstract: Many multimedia applications can benefit from techniques for adapting existing classifiers to data with different distributions. One example is cross-domain video concept detection which aims to adapt concept classifiers across various video domains. In this paper, we explore two key problems for classifier adaptation: (1) how to transform existing classifier(s) into an effective classifier for a new dataset that only has a limited number of labeled examples, and (2) how to select the best existing classifier(s) for adaptation. For the first problem, we propose Adaptive Support Vector Machines (A-SVMs) as a general method to adapt one or more existing classifiers of any type to the new dataset. It aims to learn the "delta function" between the original and adapted classifier using an objective function similar to SVMs. For the second problem, we estimate the performance of each existing classifier on the sparsely-labeled new dataset by analyzing its score distribution and other meta features, and select the classifiers with the best estimated performance. The proposed method outperforms several baseline and competing methods in terms of classification accuracy and efficiency in cross-domain concept detection in the TRECVID corpus.

745 citations

01 Jan 2004
TL;DR: The authors describe their participation in the NIST TRECVID-2004 evaluation, covering four tasks: shot boundary detection, high-level feature detection, story segmentation, and search.
Abstract: In this paper we describe our participation in the NIST TRECVID-2004 evaluation. We participated in four tasks of the benchmark including shot boundary detection, high-level feature detection, story segmentation, and search. We describe the different runs we submitted for each track and provide a preliminary analysis of our performance.

386 citations

Journal ArticleDOI
TL;DR: In this paper, the effects of temperature, residence time and catalyst addition on the yields and distribution of hydrogen-rich gas products were investigated. Ni showed the greatest catalytic effect, with the maximum H2 yield achieved at 29.78% (raw biomass sample basis).

237 citations

Book ChapterDOI
24 Jul 2003
TL;DR: The results are encouraging, indicating that pseudo-relevance feedback shows great promise for multimedia retrieval with very varied and errorful data.
Abstract: We present an algorithm for video retrieval that fuses the decisions of multiple retrieval agents in both text and image modalities. While the normalization and combination of evidence is novel, this paper emphasizes the successful use of negative pseudo-relevance feedback to improve image retrieval performance. Although we have not solved all problems in video information retrieval, the results are encouraging, indicating that pseudo-relevance feedback shows great promise for multimedia retrieval with very varied and errorful data.

229 citations


Cited by
Journal ArticleDOI
01 Aug 2007-Fuel
TL;DR: In this article, the pyrolysis characteristics of the three main components of biomass (hemicellulose, cellulose and lignin) were investigated using, respectively, a thermogravimetric analyzer (TGA) with a differential scanning calorimetry (DSC) detector and a packed bed.

5,859 citations

01 Jan 2009
TL;DR: This report provides a general introduction to active learning and a survey of the literature, including a discussion of the scenarios in which queries can be formulated, and an overview of the query strategy frameworks proposed in the literature to date.
Abstract: The key idea behind active learning is that a machine learning algorithm can achieve greater accuracy with fewer training labels if it is allowed to choose the data from which it learns. An active learner may pose queries, usually in the form of unlabeled data instances to be labeled by an oracle (e.g., a human annotator). Active learning is well-motivated in many modern machine learning problems, where unlabeled data may be abundant or easily obtained, but labels are difficult, time-consuming, or expensive to obtain. This report provides a general introduction to active learning and a survey of the literature. This includes a discussion of the scenarios in which queries can be formulated, and an overview of the query strategy frameworks proposed in the literature to date. An analysis of the empirical and theoretical evidence for successful active learning, a summary of problem setting variants and practical issues, and a discussion of related topics in machine learning research are also presented.

5,227 citations

01 Jan 2004
TL;DR: Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance and describes numerous important application areas such as image based rendering and digital libraries.
Abstract: From the Publisher: The accessible presentation of this book gives both a general view of the entire computer vision enterprise and also offers sufficient detail to be able to build useful applications. Users learn techniques that have proven to be useful by first-hand experience and a wide range of mathematical methods. A CD-ROM with every copy of the text contains source code for programming practice, color images, and illustrative movies. Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance. Topics are discussed in substantial and increasing depth. Application surveys describe numerous important application areas such as image based rendering and digital libraries. Many important algorithms are broken down and illustrated in pseudo code. Appropriate for use by engineers as a comprehensive reference to the computer vision enterprise.

3,627 citations

Journal ArticleDOI
TL;DR: This survey paper formally defines transfer learning, presents information on current solutions, and reviews applications applied to transfer learning. The solutions surveyed are independent of data size and can be applied to big data environments.
Abstract: Machine learning and data mining techniques have been used in numerous real-world applications. An assumption of traditional machine learning methodologies is the training data and testing data are taken from the same domain, such that the input feature space and data distribution characteristics are the same. However, in some real-world machine learning scenarios, this assumption does not hold. There are cases where training data is expensive or difficult to collect. Therefore, there is a need to create high-performance learners trained with more easily obtained data from different domains. This methodology is referred to as transfer learning. This survey paper formally defines transfer learning, presents information on current solutions, and reviews applications applied to transfer learning. Lastly, there is information listed on software downloads for various transfer learning solutions and a discussion of possible future research work. The transfer learning solutions surveyed are independent of data size and can be applied to big data environments.

2,900 citations