Institution

Baidu

Company•Beijing, China

About: Baidu is a company based in Beijing, China. It is known for research contributions in the topics Computer science and Feature (computer vision). The organization has 4250 authors who have published 4856 publications receiving 85219 citations. It is also known as Bǎidù and Baidu, Inc.

Topics: Computer science, Feature (computer vision), Artificial neural network, Terminal (electronics), Deep learning

Papers

Journal Article•DOI•

3D Convolutional Neural Networks for Human Action Recognition

Shuiwang Ji¹, Wei Xu², Ming Yang, Kai Yu³•Institutions (3)

Old Dominion University¹, Facebook², Baidu³

01 Jan 2013-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This paper develops a novel 3D CNN model for action recognition, which extracts features from both the spatial and the temporal dimensions by performing 3D convolutions, thereby capturing the motion information encoded in multiple adjacent frames.

Abstract: We consider the automated recognition of human actions in surveillance videos. Most current methods build classifiers based on complex handcrafted features computed from the raw inputs. Convolutional neural networks (CNNs) are a type of deep model that can act directly on the raw inputs. However, such models are currently limited to handling 2D inputs. In this paper, we develop a novel 3D CNN model for action recognition. This model extracts features from both the spatial and the temporal dimensions by performing 3D convolutions, thereby capturing the motion information encoded in multiple adjacent frames. The developed model generates multiple channels of information from the input frames, and the final feature representation combines information from all channels. To further boost the performance, we propose regularizing the outputs with high-level features and combining the predictions of a variety of different models. We apply the developed models to recognize human actions in the real-world environment of airport surveillance videos, and they achieve superior performance in comparison to baseline methods.
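
A minimal sketch of the 3D convolution the abstract refers to, in plain Python and for illustration only. The function name `conv3d_valid`, the toy clip, and the temporal-difference kernel are assumptions made here and are not taken from the paper; as in most CNN code, the operation is a cross-correlation (the kernel is not flipped).

```python
def conv3d_valid(video, kernel):
    """Slide a (kt, kh, kw) kernel over a (T, H, W) volume without padding."""
    T, H, W = len(video), len(video[0]), len(video[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                # Sum over adjacent frames (dt) as well as over space (dy, dx).
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += kernel[dt][dy][dx] * video[t + dt][y + dy][x + dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Hypothetical 3-frame, 2x2 clip: a bright pixel moves right, then stays put.
clip = [
    [[1, 0], [0, 0]],
    [[0, 1], [0, 0]],
    [[0, 1], [0, 0]],
]
# Hypothetical kernel spanning two frames: next frame minus current frame.
temporal_diff = [[[-1]], [[1]]]

motion = conv3d_valid(clip, temporal_diff)
```

Because the kernel extends along the time axis, `motion` is non-zero only where consecutive frames differ, which is the sense in which a 3D convolution can pick up motion that a per-frame 2D convolution cannot.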

4,545 citations

Proceedings Article•

3D Convolutional Neural Networks for Human Action Recognition

Shuiwang Ji¹, Wei Xu², Ming Yang, Kai Yu³•Institutions (3)

Arizona State University¹, Facebook², Baidu³

21 Jun 2010

TL;DR: A novel 3D CNN model for action recognition that extracts features from both the spatial and the temporal dimensions by performing 3D convolutions, thereby capturing the motion information encoded in multiple adjacent frames.

Abstract: We consider the fully automated recognition of actions in uncontrolled environments. Most existing work relies on domain knowledge to construct complex handcrafted features from inputs. In addition, the environments are usually assumed to be controlled. Convolutional neural networks (CNNs) are a type of deep model that can act directly on the raw inputs, thus automating the process of feature construction. However, such models are currently limited to handling 2D inputs. In this paper, we develop a novel 3D CNN model for action recognition. This model extracts features from both spatial and temporal dimensions by performing 3D convolutions, thereby capturing the motion information encoded in multiple adjacent frames. The developed model generates multiple channels of information from the input frames, and the final feature representation is obtained by combining information from all channels. We apply the developed model to recognize human actions in real-world environments, and it achieves superior performance without relying on handcrafted features.

4,087 citations

Proceedings Article•DOI•

Multi-view 3D Object Detection Network for Autonomous Driving

Xiaozhi Chen¹, Huimin Ma¹, Ji Wan², Bo Li², Xia Tian²•Institutions (2)

Tsinghua University¹, Baidu²

21 Jul 2017

TL;DR: This paper proposes Multi-View 3D networks (MV3D), a sensory-fusion framework that takes both LIDAR point cloud and RGB images as input and predicts oriented 3D bounding boxes. It also designs a deep fusion scheme to combine region-wise features from multiple views and enable interactions between intermediate layers of different paths.

Abstract: This paper aims at high-accuracy 3D object detection in autonomous driving scenarios. We propose Multi-View 3D networks (MV3D), a sensory-fusion framework that takes both LIDAR point cloud and RGB images as input and predicts oriented 3D bounding boxes. We encode the sparse 3D point cloud with a compact multi-view representation. The network is composed of two subnetworks: one for 3D object proposal generation and another for multi-view feature fusion. The proposal network generates 3D candidate boxes efficiently from the bird's eye view representation of the 3D point cloud. We design a deep fusion scheme to combine region-wise features from multiple views and enable interactions between intermediate layers of different paths. Experiments on the challenging KITTI benchmark show that our approach outperforms the state-of-the-art by around 25% and 30% AP on the tasks of 3D localization and 3D detection. In addition, for 2D detection, our approach obtains 14.9% higher AP than the state-of-the-art on the hard data among the LIDAR-based methods.
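
As a rough illustration of what a bird's eye view encoding of a point cloud can look like, the sketch below bins points into a top-down grid and keeps the maximum height and the point count per cell. The function name `bev_maps`, the choice of those two per-cell features, the grid extents, and the sample points are all assumptions made here; the abstract does not specify the paper's exact encoding.

```python
def bev_maps(points, x_range, y_range, cell):
    """Bin (x, y, z) points into a top-down grid of max height and point count."""
    x_min, x_max = x_range
    y_min, y_max = y_range
    n_rows = int(round((x_max - x_min) / cell))
    n_cols = int(round((y_max - y_min) / cell))
    height = [[None] * n_cols for _ in range(n_rows)]  # None marks an empty cell
    count = [[0] * n_cols for _ in range(n_rows)]
    for x, y, z in points:
        # Drop points that fall outside the region covered by the grid.
        if not (x_min <= x < x_max and y_min <= y < y_max):
            continue
        r = min(int((x - x_min) / cell), n_rows - 1)
        c = min(int((y - y_min) / cell), n_cols - 1)
        count[r][c] += 1
        if height[r][c] is None or z > height[r][c]:
            height[r][c] = z
    return height, count


# Hypothetical cloud: two points in the first cell, one in the second, one out of range.
cloud = [(0.5, 0.5, 1.0), (0.7, 0.2, 2.0), (1.5, 0.5, -0.3), (5.0, 5.0, 9.0)]
height, count = bev_maps(cloud, x_range=(0.0, 2.0), y_range=(0.0, 1.0), cell=1.0)
```

The result is a dense 2D array that ordinary image convolutions can consume, which is one reason a top-down projection is a convenient input for generating box proposals from sparse 3D data.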

2,569 citations

Journal Article•DOI•

A Comprehensive Survey on Transfer Learning

Fuzhen Zhuang¹, Zhiyuan Qi¹, Keyu Duan¹, Dongbo Xi¹, Yongchun Zhu¹, Hengshu Zhu², Hui Xiong³, Qing He¹•Institutions (3)

Chinese Academy of Sciences¹, Baidu², Rutgers University³

01 Jan 2021

TL;DR: Transfer learning aims to improve the performance of target learners on target domains by transferring the knowledge contained in different but related source domains, which reduces the dependence on large amounts of target-domain data when constructing target learners.

Abstract: Transfer learning aims at improving the performance of target learners on target domains by transferring the knowledge contained in different but related source domains. In this way, the dependence on a large number of target-domain data can be reduced for constructing target learners. Due to the wide application prospects, transfer learning has become a popular and promising area in machine learning. Although there are already some valuable and impressive surveys on transfer learning, these surveys introduce approaches in a relatively isolated way and lack the recent advances in transfer learning. Due to the rapid expansion of the transfer learning area, it is both necessary and challenging to comprehensively review the relevant studies. This survey attempts to connect and systematize the existing transfer learning research studies, as well as to summarize and interpret the mechanisms and the strategies of transfer learning in a comprehensive way, which may help readers have a better understanding of the current research status and ideas. Unlike previous surveys, this survey article reviews more than 40 representative transfer learning approaches, especially homogeneous transfer learning approaches, from the perspectives of data and model. The applications of transfer learning are also briefly introduced. In order to show the performance of different transfer learning models, over 20 representative transfer learning models are used for experiments. The models are performed on three different data sets, that is, Amazon Reviews, Reuters-21578, and Office-31, and the experimental results demonstrate the importance of selecting appropriate transfer learning models for different applications in practice.

2,433 citations

Proceedings Article•DOI•

Conditional Random Fields as Recurrent Neural Networks

Shuai Zheng¹, Sadeep Jayasumana¹, Bernardino Romera-Paredes¹, Vibhav Vineet², Zhizhong Su, Dalong Du, Chang Huang³, Philip H. S. Torr¹•Institutions (3)

University of Oxford¹, Stanford University², Baidu³

07 Dec 2015

TL;DR: In this article, a new form of convolutional neural network that combines the strengths of Convolutional Neural Networks (CNNs) and Conditional Random Fields (CRFs)-based probabilistic graphical modelling is introduced.

Abstract: Pixel-level labelling tasks, such as semantic segmentation, play a central role in image understanding. Recent approaches have attempted to harness the capabilities of deep learning techniques for image recognition to tackle pixel-level labelling tasks. One central issue in this methodology is the limited capacity of deep learning techniques to delineate visual objects. To solve this problem, we introduce a new form of convolutional neural network that combines the strengths of Convolutional Neural Networks (CNNs) and Conditional Random Fields (CRFs)-based probabilistic graphical modelling. To this end, we formulate Conditional Random Fields with Gaussian pairwise potentials and mean-field approximate inference as Recurrent Neural Networks. This network, called CRF-RNN, is then plugged in as a part of a CNN to obtain a deep network that has desirable properties of both CNNs and CRFs. Importantly, our system fully integrates CRF modelling with CNNs, making it possible to train the whole deep network end-to-end with the usual back-propagation algorithm, avoiding offline post-processing methods for object delineation. We apply the proposed method to the problem of semantic image segmentation, obtaining top results on the challenging Pascal VOC 2012 segmentation benchmark.
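
The idea of treating mean-field inference as a recurrent computation can be sketched on a toy problem: every iteration applies the same update to the label distribution, so a fixed number of iterations can be unrolled like the steps of a recurrent network. The sketch below is a simplification written for illustration, not the paper's CRF-RNN layer. It assumes a 1D row of pixels, a single Gaussian kernel over pixel positions, a Potts label compatibility, and unary terms expressed as costs; the names `mean_field`, `sigma`, `weight`, and `steps` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_field(unary, positions, sigma=1.0, weight=1.0, steps=5):
    """Approximate per-pixel label marginals by repeated mean-field updates."""
    n, n_labels = len(unary), len(unary[0])
    # Gaussian pairwise kernel between pixel positions (no self-connection).
    kernel = [
        [0.0 if i == j else
         math.exp(-((positions[i] - positions[j]) ** 2) / (2 * sigma ** 2))
         for j in range(n)]
        for i in range(n)
    ]
    # Initialise from the unary costs alone.
    q = [softmax([-u for u in row]) for row in unary]
    for _ in range(steps):
        new_q = []
        for i in range(n):
            # Message passing: kernel-weighted sum of the other pixels' beliefs.
            msg = [sum(kernel[i][j] * q[j][l] for j in range(n))
                   for l in range(n_labels)]
            total = sum(msg)
            # Potts compatibility: label l is penalised by support for other labels.
            pairwise = [weight * (total - msg[l]) for l in range(n_labels)]
            # Add the unary costs and renormalise.
            new_q.append(softmax([-(unary[i][l] + pairwise[l])
                                  for l in range(n_labels)]))
        q = new_q
    return q


# Hypothetical costs: the middle pixel weakly prefers label 1, its neighbours label 0.
unary = [[0.0, 2.0], [0.0, 2.0], [0.6, 0.4], [0.0, 2.0], [0.0, 2.0]]
q = mean_field(unary, positions=[0, 1, 2, 3, 4])
```

In this toy run the middle pixel ends up agreeing with its neighbours, which is the smoothing behaviour a CRF contributes on top of per-pixel predictions. Since every operation in the loop is differentiable, the same structure can in principle be trained with back-propagation alongside the network that produces the unary terms.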

1,973 citations

Authors

Showing all 4293 results

Name	H-index	Papers	Citations
Philip S. Yu	148	1914	107374
Yi Yang	143	2456	92268
Andrew Y. Ng	130	345	164995
Shuai Liu	129	1095	80823
Wei Liu	102	2927	65228
Tong Zhang	93	414	36519
William W. Cohen	85	384	31495
Hui Xiong	69	470	16776
Peng Li	66	825	17800
Ruigang Yang	63	324	16176
Kenneth Church	61	295	21179
Jun Zhu	61	468	15391
Kai Yu	60	173	27897
Wei Xu	58	133	24351
Wei Fan	57	330	15157

Network Information

Related Institutions (5)

Google

39.8K papers, 2.1M citations

95% related

Facebook

10.9K papers, 570.1K citations

94% related

Microsoft

86.9K papers, 4.1M citations

93% related

Adobe Systems

8K papers, 214.7K citations

38.6K papers, 1.3M citations

88% related

Performance

Metrics

Papers: 4,893

Citations: 124,457

No. of papers from the Institution in previous years
Year	Papers
2023	7
2022	41
2021	698
2020	951
2019	1,064
2018	791