
Sheng Tang

Researcher at Chinese Academy of Sciences

Publications -  143
Citations -  3507

Sheng Tang is an academic researcher at the Chinese Academy of Sciences. His research focuses on topics including Visual Word and TRECVID. He has an h-index of 25 and has co-authored 131 publications receiving 2,431 citations. His previous affiliations include the National University of Singapore and the Dalian University of Technology.

Papers
Proceedings ArticleDOI

Large visual words for large scale image classification

TL;DR: This paper presents an efficient approach for generating large visual words from a very compact vocabulary, namely two dictionaries learned with sparse non-negative matrix factorization (NMF). By incorporating fast KNN search based on large visual words into the SVM-KNN method, images can be classified very efficiently.
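At the core of this approach is NMF dictionary learning. As a rough illustration only (the paper uses *sparse* NMF and two dictionaries, which this sketch omits; adding an L1 penalty to the updates would encourage sparsity), here is a minimal plain NMF via Lee–Seung multiplicative updates:

```python
import numpy as np

def nmf(V, k, n_iter=300, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (features x samples) as V ~ W @ H
    using Lee-Seung multiplicative updates; W's columns act as a dictionary."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # update codes
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # update dictionary
    return W, H

# Toy stand-in for local image descriptors: 6 features x 8 samples, non-negative.
V = np.abs(np.random.default_rng(1).normal(size=(6, 8)))
W, H = nmf(V, k=3)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

The multiplicative form guarantees that W and H stay non-negative throughout, which is what makes the learned atoms interpretable as (visual-word-like) parts.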
Book ChapterDOI

Spatiotemporal Breast Mass Detection Network (MD-Net) in 4D DCE-MRI Images

TL;DR: This work leverages recent deep learning techniques for breast lesion detection and proposes the Spatiotemporal Breast Mass Detection Network (MD-Net) to detect masses in 4D DCE-MRI images automatically.
Book ChapterDOI

Document Clustering Based on Spectral Clustering and Non-negative Matrix Factorization

TL;DR: This work applies a novel non-negative matrix factorization to the affinity matrix for document clustering, enforcing non-negativity and orthogonality constraints simultaneously, and yields a much more reasonable clustering interpretation than previous NMF-based clustering methods.
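The basic step the paper builds on, factoring the document affinity matrix and reading cluster memberships off the non-negative factor, can be sketched as follows. This uses scikit-learn's plain NMF and does not enforce the paper's orthogonality constraint; the affinity matrix here is a hypothetical toy example:

```python
import numpy as np
from sklearn.decomposition import NMF

# Toy affinity matrix: two well-separated groups of 3 "documents" each.
A = np.full((6, 6), 0.05)
A[:3, :3] = 1.0
A[3:, 3:] = 1.0

# Factor the affinity matrix; each row of W is a soft cluster membership vector.
model = NMF(n_components=2, init='nndsvda', random_state=0, max_iter=500)
W = model.fit_transform(A)
labels = W.argmax(axis=1)  # hard assignment: largest factor weight per document
```

With an orthogonality constraint on the factor (as in the paper), the rows of W move closer to a true cluster indicator matrix, which is what gives the cleaner clustering interpretation.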
Proceedings ArticleDOI

A Novel Anchorperson Detection Algorithm Based on Spatio-temporal Slice

TL;DR: This paper presents a novel anchorperson detection algorithm based on the spatio-temporal slice (STS). Through STS pattern analysis, clustering, and decision fusion, anchorperson shots can be detected for browsing news video.
Proceedings ArticleDOI

A distribution based video representation for human action recognition

TL;DR: The proposed representation encodes the visual and motion information of an ensemble of local spatio-temporal features of a video into a distribution estimated by a generative probabilistic model, such as the Gaussian mixture model, which naturally gives rise to an information-theoretic distance metric between videos.
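A minimal sketch of this idea, under assumptions not taken from the paper (random vectors standing in for local spatio-temporal descriptors, a symmetrized Monte Carlo KL divergence standing in for whatever information-theoretic metric the authors use): each video's descriptor set is summarized by a fitted GMM, and videos are compared by a divergence between their GMMs.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_video_model(descriptors, k=3, seed=0):
    """Summarize a video's local descriptors (n_samples x n_dims) by a GMM."""
    return GaussianMixture(n_components=k, covariance_type='diag',
                           random_state=seed).fit(descriptors)

def kl_mc(p, q, n=2000):
    """Monte Carlo estimate of KL(p || q) between two fitted GMMs."""
    X, _ = p.sample(n)
    return float(np.mean(p.score_samples(X) - q.score_samples(X)))

rng = np.random.default_rng(0)
vid_a = rng.normal(0.0, 1.0, size=(500, 8))  # stand-in for video A's descriptors
vid_b = rng.normal(3.0, 1.0, size=(500, 8))  # video B: a shifted distribution
gmm_a, gmm_b = fit_video_model(vid_a), fit_video_model(vid_b)

# Symmetrized KL as a distance-like score between the two videos.
d_ab = 0.5 * (kl_mc(gmm_a, gmm_b) + kl_mc(gmm_b, gmm_a))
d_aa = kl_mc(gmm_a, gmm_a)  # self-comparison: exactly zero
```

Modeling each video as a distribution, rather than a single pooled vector, is what lets divergences between the fitted models serve directly as a video distance.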