Home
/
Authors
/
Mengyuan Liu

Author

Mengyuan Liu

Other affiliations: Tencent, Nanyang Technological University, Peking University

Bio: Mengyuan Liu is an academic researcher from Sun Yat-sen University. The author has contributed to research in topics: Computer science & Artificial intelligence. The author has an hindex of 17, co-authored 47 publications receiving 1354 citations. Previous affiliations of Mengyuan Liu include Tencent & Nanyang Technological University.

Papers published on a yearly basis

2023
2022
2021
2020
2019
2018
2017
2016
2015
2014
2013

Papers

PDF

Open Access

More filters

Journal Article•DOI•

Enhanced skeleton visualization for view invariant human action recognition

[...]

Mengyuan Liu¹, Hong Liu¹, Chen Chen²•Institutions (2)

Peking University¹, University of Central Florida²

01 Aug 2017-Pattern Recognition

TL;DR: Enhanced skeleton visualization method encodes spatio-temporal skeletons as visual and motion enhanced color images in a compact yet distinctive manner and consistently achieves the highest accuracies on four datasets, including the largest and most challenging NTU RGB+D dataset for skeleton-based action recognition.

...read moreread less

668 citations

Proceedings Article•DOI•

Recognizing Human Actions as the Evolution of Pose Estimation Maps

[...]

Mengyuan Liu¹, Junsong Yuan²•Institutions (2)

Nanyang Technological University¹, University at Buffalo²

18 Jun 2018

TL;DR: This work presents a novel method to recognize human action as the evolution of pose estimation maps, which outperforms most state-of-the-art methods.

...read moreread less

Abstract: Most video-based action recognition approaches choose to extract features from the whole video to recognize actions. The cluttered background and non-action motions limit the performances of these methods, since they lack the explicit modeling of human body movements. With recent advances of human pose estimation, this work presents a novel method to recognize human action as the evolution of pose estimation maps. Instead of relying on the inaccurate human poses estimated from videos, we observe that pose estimation maps, the byproduct of pose estimation, preserve richer cues of human body to benefit action recognition. Specifically, the evolution of pose estimation maps can be decomposed as an evolution of heatmaps, e.g., probabilistic maps, and an evolution of estimated 2D human poses, which denote the changes of body shape and body pose, respectively. Considering the sparse property of heatmap, we develop spatial rank pooling to aggregate the evolution of heatmaps as a body shape evolution image. As body shape evolution image does not differentiate body parts, we design body guided sampling to aggregate the evolution of poses as a body pose evolution image. The complementary properties between both types of images are explored by deep convolutional neural networks to predict action label. Experiments on NTU RGB+D, UTD-MHAD and PennAction datasets verify the effectiveness of our method, which outperforms most state-of-the-art methods.

...read moreread less

287 citations

Posted Content•

Two-Stream 3D Convolutional Neural Network for Skeleton-Based Action Recognition.

[...]

Hong Liu, Juanhui Tu, Mengyuan Liu

23 May 2017-arXiv: Computer Vision and Pattern Recognition

TL;DR: This paper proposes a novel two-stream model using 3D CNN in skeleton-based action recognition, which outperforms most of RNN-based methods, which verify the complementary property between spatial and temporal information and the robustness to noise.

...read moreread less

Abstract: It remains a challenge to efficiently extract spatialtemporal information from skeleton sequences for 3D human action recognition. Although most recent action recognition methods are based on Recurrent Neural Networks which present outstanding performance, one of the shortcomings of these methods is the tendency to overemphasize the temporal information. Since 3D convolutional neural network(3D CNN) is a powerful tool to simultaneously learn features from both spatial and temporal dimensions through capturing the correlations between three dimensional signals, this paper proposes a novel two-stream model using 3D CNN. To our best knowledge, this is the first application of 3D CNN in skeleton-based action recognition. Our method consists of three stages. First, skeleton joints are mapped into a 3D coordinate space and then encoding the spatial and temporal information, respectively. Second, 3D CNN models are seperately adopted to extract deep features from two streams. Third, to enhance the ability of deep features to capture global relationships, we extend every stream into multitemporal version. Extensive experiments on the SmartHome dataset and the large-scale NTU RGB-D dataset demonstrate that our method outperforms most of RNN-based methods, which verify the complementary property between spatial and temporal information and the robustness to noise.

...read moreread less

106 citations

Posted Content•

A Survey on 3D Skeleton-Based Action Recognition Using Learning Method.

[...]

Bin Ren, Mengyuan Liu, Runwei Ding, Hong Liu

14 Feb 2020-arXiv: Computer Vision and Pattern Recognition

TL;DR: This survey highlights the necessity of action recognition and the significance of 3D-skeleton data, and gives an overall discussion over deep learning-based action recognitin using 3D skeleton data.

...read moreread less

Abstract: 3D skeleton-based action recognition, owing to the latent advantages of skeleton, has been an active topic in computer vision. As a consequence, there are lots of impressive works including conventional handcraft feature based and learned feature based have been done over the years. However, previous surveys about action recognition mostly focus on the video or RGB data dominated methods, and the scanty existing reviews related to skeleton data mainly indicate the representation of skeleton data or performance of some classic techniques on a certain dataset. Besides, though deep learning methods has been applied to this field for years, there is no related reserach concern about an introduction or review from the perspective of deep learning architectures. To break those limitations, this survey firstly highlight the necessity of action recognition and the significance of 3D-skeleton data. Then a comprehensive introduction about Recurrent Neural Network(RNN)-based, Convolutional Neural Network(CNN)-based and Graph Convolutional Network(GCN)-based main stream action recognition techniques are illustrated in a data-driven manner. Finally, we give a brief talk about the biggest 3D skeleton dataset NTU-RGB+D and its new edition called NTU-RGB+D 120, accompanied with several existing top rank algorithms within those two datasets. To our best knowledge, this is the first research which give an overall discussion over deep learning-based action recognitin using 3D skeleton data.

...read moreread less

71 citations

Proceedings Article•

3D action recognition using multi-temporal depth motion maps and fisher vector

[...]

Chen Chen¹, Mengyuan Liu², Baochang Zhang³, Jungong Han⁴, Junjun Jiang⁵, Hong Liu² - Show less +2 more•Institutions (5)

University of Texas at Dallas¹, Peking University², Beihang University³, Northumbria University⁴, China University of Geosciences (Wuhan)⁵

09 Jul 2016

TL;DR: Extensive experiments on the public MSRAction3D, MSRGesture3D and DHA datasets show that the proposed method outperforms state-of-the-art approaches for depth-based action recognition.

...read moreread less

Abstract: This paper presents an effective local spatio-temporal descriptor for action recognition from depth video sequences. The unique property of our descriptor is that it takes the shape discrimination and action speed variations into account, intending to solve the problems of distinguishing different pose shapes and identifying the actions with different speeds in one goal. The entire algorithm is carried out in three stages. In the first stage, a depth sequence is divided into temporally overlapping depth segments which are used to generate three depth motion maps (DMMs), capturing the shape and motion cues. To cope with speed variations in actions, multiple frame lengths of depth segments are utilized, leading to a multitemporal DMMs representation. In the second stage, all the DMMs are first partitioned into dense patches. Then, the local binary patterns (LBP) descriptor is exploited to characterize local rotation invariant texture information in those patches. In the third stage, the Fisher kernel is employed to encode the patch descriptors for a compact feature representation, which is fed into a kernel-based extreme learning machine classifier. Extensive experiments on the public MSRAction3D, MSRGesture3D and DHA datasets show that our proposed method outperforms state-of-the-art approaches for depth-based action recognition.

...read moreread less

64 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14

Collapse

Cited by

PDF

Open Access

More filters

Journal Article•DOI•

I and i

[...]

Kevin Barraclough

08 Dec 2001-BMJ

TL;DR: There is, I think, something ethereal about i —the square root of minus one, which seems an odd beast at that time—an intruder hovering on the edge of reality.

...read moreread less

Abstract: There is, I think, something ethereal about i —the square root of minus one. I remember first hearing about it at school. It seemed an odd beast at that time—an intruder hovering on the edge of reality. Usually familiarity dulls this sense of the bizarre, but in the case of i it was the reverse: over the years the sense of its surreal nature intensified. It seemed that it was impossible to write mathematics that described the real world in …

...read moreread less

33,785 citations

Pattern Recognition and Machine Learning

[...]

Christopher M. Bishop¹•Institutions (1)

Microsoft¹

01 Jan 2006

TL;DR: Probability distributions of linear models for regression and classification are given in this article, along with a discussion of combining models and combining models in the context of machine learning and classification.

...read moreread less

Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

...read moreread less

10,141 citations

Proceedings Article•DOI•

Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognition

[...]

Lei Shi¹, Yifan Zhang¹, Jian Cheng¹, Hanqing Lu•Institutions (1)

Chinese Academy of Sciences¹

15 Jun 2019

TL;DR: Zhang et al. as mentioned in this paper proposed a two-stream adaptive graph convolutional network (2s-AGCN) to model both the first-order and the second-order information simultaneously, which shows notable improvement for the recognition accuracy.

...read moreread less

Abstract: In skeleton-based action recognition, graph convolutional networks (GCNs), which model the human body skeletons as spatiotemporal graphs, have achieved remarkable performance. However, in existing GCN-based methods, the topology of the graph is set manually, and it is fixed over all layers and input samples. This may not be optimal for the hierarchical GCN and diverse samples in action recognition tasks. In addition, the second-order information (the lengths and directions of bones) of the skeleton data, which is naturally more informative and discriminative for action recognition, is rarely investigated in existing methods. In this work, we propose a novel two-stream adaptive graph convolutional network (2s-AGCN) for skeleton-based action recognition. The topology of the graph in our model can be either uniformly or individually learned by the BP algorithm in an end-to-end manner. This data-driven method increases the flexibility of the model for graph construction and brings more generality to adapt to various data samples. Moreover, a two-stream framework is proposed to model both the first-order and the second-order information simultaneously, which shows notable improvement for the recognition accuracy. Extensive experiments on the two large-scale datasets, NTU-RGBD and Kinetics-Skeleton, demonstrate that the performance of our model exceeds the state-of-the-art with a significant margin.

...read moreread less

997 citations

Journal Article•DOI•

NTU RGB+D 120: A Large-Scale Benchmark for 3D Human Activity Understanding

[...]

Jun Liu¹, Amir Shahroudy², Mauricio Perez¹, Gang Wang³, Ling-Yu Duan⁴, Alex C. Kot¹ - Show less +2 more•Institutions (4)

Nanyang Technological University¹, Chalmers University of Technology², Alibaba Group³, Peking University⁴

01 Oct 2020-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This work introduces a large-scale dataset for RGB+D human action recognition, which is collected from 106 distinct subjects and contains more than 114 thousand video samples and 8 million frames, and investigates a novel one-shot 3D activity recognition problem on this dataset.

...read moreread less

Abstract: Research on depth-based human activity analysis achieved outstanding performance and demonstrated the effectiveness of 3D representation for action recognition. The existing depth-based and RGB+D-based action recognition benchmarks have a number of limitations, including the lack of large-scale training samples, realistic number of distinct class categories, diversity in camera views, varied environmental conditions, and variety of human subjects. In this work, we introduce a large-scale dataset for RGB+D human action recognition, which is collected from 106 distinct subjects and contains more than 114 thousand video samples and 8 million frames. This dataset contains 120 different action classes including daily, mutual, and health-related activities. We evaluate the performance of a series of existing 3D activity analysis methods on this dataset, and show the advantage of applying deep learning methods for 3D-based human action recognition. Furthermore, we investigate a novel one-shot 3D activity recognition problem on our dataset, and a simple yet effective Action-Part Semantic Relevance-aware (APSR) framework is proposed for this task, which yields promising results for recognition of the novel action classes. We believe the introduction of this large-scale dataset will enable the community to apply, adapt, and develop various data-hungry learning techniques for depth-based and RGB+D-based human activity understanding.

...read moreread less

837 citations

Journal Article•DOI•

Enhanced skeleton visualization for view invariant human action recognition

[...]