Home
/
Authors
/
Yong Rui

Author

Yong Rui

Other affiliations: Zhejiang University, Advanced Technology Center, University of Illinois at Urbana–Champaign

Bio: Yong Rui is an academic researcher from Microsoft. The author has contributed to research in topics: Image retrieval & Automatic summarization. The author has an hindex of 72, co-authored 287 publications receiving 20768 citations. Previous affiliations of Yong Rui include Zhejiang University & Advanced Technology Center.

Papers published on a yearly basis

2023
2022
2021
2020
2019
2018
2017
2016
2015
2014
2013
2012
2011
2010
2009
2008
2007
2006
2005
2004
2003
2002
2001
2000
1999
1998
1997
1996

Papers

PDF

Open Access

More filters

Journal Article•DOI•

Relevance feedback: a power tool for interactive content-based image retrieval

[...]

Yong Rui¹, Thomas S. Huang¹, Michael Ortega¹, Sharad Mehrotra¹•Institutions (1)

University of Illinois at Urbana–Champaign¹

01 Sep 1998-IEEE Transactions on Circuits and Systems for Video Technology

TL;DR: A relevance feedback based interactive retrieval approach that effectively takes into account the subjectivity of human perception of visual content and the gap between high-level concepts and low-level features in CBIR.

...read moreread less

Abstract: Content-based image retrieval (CBIR) has become one of the most active research areas in the past few years. Many visual feature representations have been explored and many systems built. While these research efforts establish the basis of CBIR, the usefulness of the proposed approaches is limited. Specifically, these efforts have relatively ignored two distinct characteristics of CBIR systems: (1) the gap between high-level concepts and low-level features, and (2) the subjectivity of human perception of visual content. This paper proposes a relevance feedback based interactive retrieval approach, which effectively takes into account the above two characteristics in CBIR. During the retrieval process, the user's high-level query and perception subjectivity are captured by dynamically updated weights based on the user's feedback. The experimental results over more than 70000 images show that the proposed approach greatly reduces the user's effort of composing a query, and captures the user's information need more precisely.

...read moreread less

1,933 citations

Proceedings Article•DOI•

MSR-VTT: A Large Video Description Dataset for Bridging Video and Language

[...]

Jun Xu¹, Tao Mei¹, Ting Yao¹, Yong Rui¹•Institutions (1)

Microsoft¹

01 Jun 2016

TL;DR: A detailed analysis of MSR-VTT in comparison to a complete set of existing datasets, together with a summarization of different state-of-the-art video-to-text approaches, shows that the hybrid Recurrent Neural Networkbased approach, which combines single-frame and motion representations with soft-attention pooling strategy, yields the best generalization capability on this dataset.

...read moreread less

Abstract: While there has been increasing interest in the task of describing video with natural language, current computer vision algorithms are still severely limited in terms of the variability and complexity of the videos and their associated language that they can recognize. This is in part due to the simplicity of current benchmarks, which mostly focus on specific fine-grained domains with limited videos and simple descriptions. While researchers have provided several benchmark datasets for image captioning, we are not aware of any large-scale video description dataset with comprehensive categories yet diverse video content. In this paper we present MSR-VTT (standing for "MSRVideo to Text") which is a new large-scale video benchmark for video understanding, especially the emerging task of translating video to text. This is achieved by collecting 257 popular queries from a commercial video search engine, with 118 videos for each query. In its current version, MSR-VTT provides 10K web video clips with 41.2 hours and 200K clip-sentence pairs in total, covering the most comprehensive categories and diverse visual content, and representing the largest dataset in terms of sentence and vocabulary. Each clip is annotated with about 20 natural sentences by 1,327 AMT workers. We present a detailed analysis of MSR-VTT in comparison to a complete set of existing datasets, together with a summarization of different state-of-the-art video-to-text approaches. We also provide an extensive evaluation of these approaches on this dataset, showing that the hybrid Recurrent Neural Networkbased approach, which combines single-frame and motion representations with soft-attention pooling strategy, yields the best generalization capability on MSR-VTT.

...read moreread less

933 citations

Proceedings Article•DOI•

Content-based image retrieval with relevance feedback in MARS

[...]

Yong Rui¹, Thomas S. Huang¹, Sharad Mehrotra¹•Institutions (1)

University of Illinois at Urbana–Champaign¹

26 Oct 1997

TL;DR: Experimental results show that the image retrieval precision increases considerably by using the proposed integration approach, and the relevance feedback technique from the IR domain is used in content-based image retrieval to demonstrate the effectiveness of this conversion.

...read moreread less

Abstract: Technology advances in the areas of image processing (IP) and information retrieval (IR) have evolved separately for a long time. However, successful content-based image retrieval systems require the integration of the two. There is an urgent need to develop integration mechanisms to link the image retrieval model to text retrieval model, such that the well established text retrieval techniques can be utilized. Approaches of converting image feature vectors (IF domain) to weighted-term vectors (IR domain) are proposed in this paper. Furthermore, the relevance feedback technique from the IR domain is used in content-based image retrieval to demonstrate the effectiveness of this conversion. Experimental results show that the image retrieval precision increases considerably by using the proposed integration approach.

...read moreread less

815 citations

Proceedings Article•DOI•

Adaptive key frame extraction using unsupervised clustering

[...]

Yueting Zhuang¹, Yong Rui², T.S. Huang², Sharad Mehrotra²•Institutions (2)

University of Illinois at Urbana–Champaign¹, Zhejiang University²

04 Oct 1998

TL;DR: A new algorithm for key frame extraction based on unsupervised clustering is introduced, both computationally simple and able to adapt to the visual content, which is validated by large amount of real-world videos.

...read moreread less

Abstract: Key frame extraction has been recognized as one of the important research issues in video information retrieval. Although progress has been made in key frame extraction, the existing approaches are either computationally expensive or ineffective in capturing salient visual content. We first discuss the importance of key frame selection; and then review and evaluate the existing approaches. To overcome the shortcomings of the existing approaches, we introduce a new algorithm for key frame extraction based on unsupervised clustering. The proposed algorithm is both computationally simple and able to adapt to the visual content. The efficiency and effectiveness are validated by large amount of real-world videos.

...read moreread less

620 citations

Proceedings Article•DOI•

GeoMF: joint geographical modeling and matrix factorization for point-of-interest recommendation

[...]

Defu Lian¹, Cong Zhao¹, Xing Xie², Guangzhong Sun¹, Enhong Chen¹, Yong Rui² - Show less +2 more•Institutions (2)

University of Science and Technology of China¹, Microsoft²

24 Aug 2014

TL;DR: The results indicate that weighted matrix factorization is superior to other forms of factorization models and that incorporating the spatial clustering phenomenon in human mobility behavior on the LBSNs into matrixfactorization improves recommendation performance.

...read moreread less

Abstract: Point-of-Interest (POI) recommendation has become an important means to help people discover attractive locations However, extreme sparsity of user-POI matrices creates a severe challenge To cope with this challenge, viewing mobility records on location-based social networks (LBSNs) as implicit feedback for POI recommendation, we first propose to exploit weighted matrix factorization for this task since it usually serves collaborative filtering with implicit feedback better Besides, researchers have recently discovered a spatial clustering phenomenon in human mobility behavior on the LBSNs, ie, individual visiting locations tend to cluster together, and also demonstrated its effectiveness in POI recommendation, thus we incorporate it into the factorization model Particularly, we augment users' and POIs' latent factors in the factorization model with activity area vectors of users and influence area vectors of POIs, respectively Based on such an augmented model, we not only capture the spatial clustering phenomenon in terms of two-dimensional kernel density estimation, but we also explain why the introduction of such a phenomenon into matrix factorization helps to deal with the challenge from matrix sparsity We then evaluate the proposed algorithm on a large-scale LBSN dataset The results indicate that weighted matrix factorization is superior to other forms of factorization models and that incorporating the spatial clustering phenomenon into matrix factorization improves recommendation performance

...read moreread less

582 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60

Collapse

Cited by

PDF

Open Access

More filters

Journal Article•

다중혈관 관상동맥 환자에서 y-문합을 이용하여 양쪽 내흉동맥만을 사용한 우회술의 조기 성적

[...]

성기익, 이영탁, 박계현, 전태국, 박표원, 한일용, 장윤희 - Show less +3 more

01 Mar 2003-The Korean Journal of Thoracic and Cardiovascular Surgery

28,685 citations

Pattern Recognition and Machine Learning

[...]

Christopher M. Bishop¹•Institutions (1)

Microsoft¹

01 Jan 2006

TL;DR: Probability distributions of linear models for regression and classification are given in this article, along with a discussion of combining models and combining models in the context of machine learning and classification.

...read moreread less

Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

...read moreread less

10,141 citations

Journal Article•

Data Mining Practical Machine Learning Tools and Techniques

[...]

อนิรุธ สืบสิงห์

01 Jan 2014-Journal of management science

9,185 citations

Journal Article•DOI•

Content-based image retrieval at the end of the early years

[...]

Arnold W. M. Smeulders¹, Marcel Worring¹, Simone Santini², Amarnath Gupta², Ramesh Jain - Show less +1 more•Institutions (2)

University of Amsterdam¹, University of California, San Diego²

01 Dec 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: The working conditions of content-based retrieval: patterns of use, types of pictures, the role of semantics, and the sensory gap are discussed, as well as aspects of system engineering: databases, system architecture, and evaluation.

...read moreread less

Abstract: Presents a review of 200 references in content-based image retrieval. The paper starts with discussing the working conditions of content-based retrieval: patterns of use, types of pictures, the role of semantics, and the sensory gap. Subsequent sections discuss computational steps for image retrieval systems. Step one of the review is image processing for retrieval sorted by color, texture, and local geometry. Features for retrieval are discussed next, sorted by: accumulative and global features, salient points, object and shape features, signs, and structural combinations thereof. Similarity of pictures and objects in pictures is reviewed for each of the feature types, in close connection to the types and means of feedback the user of the systems is capable of giving by interaction. We briefly discuss aspects of system engineering: databases, system architecture, and evaluation. In the concluding section, we present our view on: the driving force of the field, the heritage from computer vision, the influence on computer vision, the role of similarity and of interaction, the need for databases, the problem of evaluation, and the role of the semantic gap.

...read moreread less

6,447 citations

Journal Article•DOI•

Object tracking: A survey

[...]

Alper Yilmaz¹, Omar Javed, Mubarak Shah²•Institutions (2)

Ohio State University¹, University of Central Florida²

25 Dec 2006-ACM Computing Surveys

TL;DR: The goal of this article is to review the state-of-the-art tracking methods, classify them into different categories, and identify new trends to discuss the important issues related to tracking including the use of appropriate image features, selection of motion models, and detection of objects.

...read moreread less

Abstract: The goal of this article is to review the state-of-the-art tracking methods, classify them into different categories, and identify new trends. Object tracking, in general, is a challenging problem. Difficulties in tracking objects can arise due to abrupt object motion, changing appearance patterns of both the object and the scene, nonrigid object structures, object-to-object and object-to-scene occlusions, and camera motion. Tracking is usually performed in the context of higher-level applications that require the location and/or shape of the object in every frame. Typically, assumptions are made to constrain the tracking problem in the context of a particular application. In this survey, we categorize the tracking methods on the basis of the object and motion representations used, provide detailed descriptions of representative methods in each category, and examine their pros and cons. Moreover, we discuss the important issues related to tracking including the use of appropriate image features, selection of motion models, and detection of objects.

...read moreread less

5,318 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse