Multiple Modal Features and Multiple Kernel Learning for Human Daily Activity Recognition

A novel approach to daily activity recognition is proposed, based on the hypothesis that combining multimodal features improves system performance; the results show that the proposed methods are effective and feasible for activity recognition in daily environments.

Abstract:

Introduction: Recognizing human activity in daily environments has attracted much research in computer vision and pattern recognition in recent years. It is a challenging problem, not only because of background clutter, occlusion, and intra-class variation in image sequences, but also because complex activity patterns arise from interactions between people or between people and objects. It is also valuable for many practical applications, such as smart homes, gaming, health care, human-computer interaction, and robotics. We are now at the beginning of the fourth industrial revolution (Industry 4.0), in which intelligent systems have become a central subject for both the research and industrial communities. Recent advances in 3D cameras, such as Microsoft's Kinect and Intel's RealSense, make it possible to capture RGB, depth, and skeleton data in real time. This creates a new opportunity to improve human activity recognition in daily environments. In this research, we propose a novel approach to daily activity recognition and hypothesize that system performance can be improved by combining multimodal features.
Methods: We extract spatial-temporal features for the human body, using a part-based representation built on skeleton data from RGB-D data. Then, we combine multiple features from the two sources to yield robust features for activity representation. Finally, we use the Multiple Kernel Learning algorithm to fuse the multiple features and identify the activity label for each video. To show generalizability, the proposed framework was tested on two challenging datasets using a cross-validation scheme.
Results: The experiments achieve accuracies of 94.16% on CAD120 and 95.31% on MSR-Daily Activity 3D.
Conclusion: These results show that the proposed methods are effective and feasible for activity recognition systems in daily environments.
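
The kernel-fusion step described under Methods can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which Multiple Kernel Learning solver, kernels, or features were used. The sketch assumes one RBF kernel per modality, weights each kernel by a simple kernel-target alignment heuristic (swapped in here in place of a full MKL optimizer), and classifies with kernel ridge regression on two classes. The data are synthetic, and the names `skeleton` and `appearance` are hypothetical stand-ins for two feature sources.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq_dist = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))

def alignment_weights(kernels, y):
    """Weight each kernel by its alignment with the label matrix y y^T.

    Heuristic stand-in for an MKL solver (an assumption of this sketch):
    negative alignments are clipped to zero and weights are normalized to sum to 1.
    """
    Y = np.outer(y, y)
    scores = np.array([
        max((K * Y).sum() / (np.linalg.norm(K) * np.linalg.norm(Y)), 0.0)
        for K in kernels
    ])
    if scores.sum() == 0.0:  # no kernel is aligned: fall back to uniform weights
        return np.full(len(kernels), 1.0 / len(kernels))
    return scores / scores.sum()

# Synthetic two-class data with labels +1 / -1 (illustration only).
rng = np.random.default_rng(0)
n_train, n_test = 80, 40
y_train = np.repeat([1.0, -1.0], n_train // 2)
y_test = np.repeat([1.0, -1.0], n_test // 2)

def make_modalities(y):
    """Return two hypothetical feature sources for the given labels."""
    skeleton = 2.0 * y[:, None] + rng.normal(size=(len(y), 5))  # informative
    appearance = rng.normal(size=(len(y), 8))                   # pure noise
    return [skeleton, appearance]

train_feats = make_modalities(y_train)
test_feats = make_modalities(y_test)
gammas = [1.0 / X.shape[1] for X in train_feats]  # one bandwidth per modality

# One kernel per modality, for train/train and test/train pairs.
K_train = [rbf_kernel(X, X, g) for X, g in zip(train_feats, gammas)]
K_test = [rbf_kernel(Xt, X, g) for Xt, X, g in zip(test_feats, train_feats, gammas)]

# Fuse the kernels as a non-negative weighted sum.
weights = alignment_weights(K_train, y_train)
K_fused = sum(w * K for w, K in zip(weights, K_train))
K_fused_test = sum(w * K for w, K in zip(weights, K_test))

# Kernel ridge regression on the +1 / -1 labels, then threshold at zero.
alpha = np.linalg.solve(K_fused + 1e-2 * np.eye(n_train), y_train)
pred = np.sign(K_fused_test @ alpha)
acc = float((pred == y_test).mean())
print("kernel weights:", weights, "test accuracy:", acc)
```

A weighted sum of valid kernels with non-negative weights is itself a valid kernel, which is why the fused matrix can be passed to any kernel classifier. A multi-class setting such as the one in the abstract would need a one-vs-rest or similar extension, which this sketch omits.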


