scispace - formally typeset
Journal ArticleDOI

Understanding bag-of-words model: A statistical framework

TLDR
A statistical framework which generalizes the bag-of-words representation, in which the visual words are generated by a statistical process rather than using a clustering algorithm, while the empirical performance is competitive to clustering-based method.
Abstract
The bag-of-words model is one of the most popular representation methods for object categorization. The key idea is to quantize each extracted key point into one of visual words, and then represent each image by a histogram of the visual words. For this purpose, a clustering algorithm (e.g., K-means), is generally used for generating the visual words. Although a number of studies have shown encouraging results of the bag-of-words representation for object categorization, theoretical studies on properties of the bag-of-words model is almost untouched, possibly due to the difficulty introduced by using a heuristic clustering process. In this paper, we present a statistical framework which generalizes the bag-of-words representation. In this framework, the visual words are generated by a statistical process rather than using a clustering algorithm, while the empirical performance is competitive to clustering-based method. A theoretical analysis based on statistical consistency is presented for the proposed framework. Moreover, based on the framework we developed two algorithms which do not rely on clustering, while achieving competitive performance in object categorization when compared to clustering-based bag-of-words representations.

read more

Citations
More filters
Posted Content

ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission

TL;DR: ClinicalBERT uncovers high-quality relationships between medical concepts as judged by humans and outperforms baselines on 30-day hospital readmission prediction using both discharge summaries and the first few days of notes in the intensive care unit.
Proceedings ArticleDOI

SourcererCC: scaling code clone detection to big-code

TL;DR: In this article, a token-based clone detector, SourcererCC, is proposed to detect both exact and near-miss clones from large inter-project repositories using a standard workstation.
Book ChapterDOI

Multi-modal Transformer for Video Retrieval

TL;DR: A multi-modal transformer to jointly encode the different modalities in video, which allows each of them to attend to the others, and a novel framework to establish state-of-the-art results for video retrieval on three datasets.
Journal ArticleDOI

Advanced internet of things for personalised healthcare systems

TL;DR: This paper will give a systematic review on advanced IoT enabled PHS, and key enabling technologies, major IoT enabled applications and successful case studies in healthcare, and finally point out future research trends and challenges.
References
More filters
Journal ArticleDOI

Distinctive Image Features from Scale-Invariant Keypoints

TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.
Book ChapterDOI

Text Categorization with Suport Vector Machines: Learning with Many Relevant Features

TL;DR: This paper explores the use of Support Vector Machines for learning text classifiers from examples and analyzes the particular properties of learning with text data and identifies why SVMs are appropriate for this task.
BookDOI

Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond

TL;DR: Learning with Kernels provides an introduction to SVMs and related kernel methods that provide all of the concepts necessary to enable a reader equipped with some basic mathematical knowledge to enter the world of machine learning using theoretically well-founded yet easy-to-use kernel algorithms.
Proceedings ArticleDOI

Video Google: a text retrieval approach to object matching in videos

TL;DR: An approach to object and scene retrieval which searches for and localizes all the occurrences of a user outlined object in a video, represented by a set of viewpoint invariant region descriptors so that recognition can proceed successfully despite changes in viewpoint, illumination and partial occlusion.
Related Papers (5)