
Ping Tak Peter Tang

Researcher at Intel

Publications -  16
Citations -  2271

Ping Tak Peter Tang is an academic researcher from Intel. The author has contributed to research on topics including Xeon Phi and generalization. The author has an h-index of 11 and has co-authored 15 publications receiving 1,607 citations.

Papers
Posted Content

On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima

TL;DR: In this paper, the authors investigate the cause of the generalization drop in the large-batch regime and present numerical evidence supporting the view that large-batch methods tend to converge to sharp minima of the training and testing functions.
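As a rough illustration of the kind of numerical evidence described in the summary, the sketch below probes the sharpness of a minimum by measuring how much a loss rises under small random parameter perturbations. This is a minimal toy probe, not the paper's sharpness metric; loss_fn, params, and the perturbation radius are hypothetical placeholders.

```python
# Minimal, illustrative sharpness probe (not the paper's exact metric):
# estimate how much the loss increases when the parameters returned by the
# optimizer are perturbed by small random vectors. Sharp minima show a
# larger increase than flat ones for the same perturbation radius.
import numpy as np

def sharpness_proxy(loss_fn, params, radius=1e-2, n_samples=20, seed=0):
    """Average loss increase within a ball of the given radius."""
    rng = np.random.default_rng(seed)
    base = loss_fn(params)
    increases = []
    for _ in range(n_samples):
        direction = rng.standard_normal(params.shape)
        direction *= radius / np.linalg.norm(direction)
        increases.append(loss_fn(params + direction) - base)
    return np.mean(increases) / (1.0 + abs(base))

# Toy usage: two quadratic "losses" whose curvature controls the sharpness.
sharp_loss = lambda w: 100.0 * np.sum(w ** 2)   # high curvature -> sharp minimum
flat_loss = lambda w: 0.01 * np.sum(w ** 2)     # low curvature  -> flat minimum
w_star = np.zeros(10)
print(sharpness_proxy(sharp_loss, w_star), sharpness_proxy(flat_loss, w_star))
```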
Proceedings Article

On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima

TL;DR: In this article, the authors investigate the cause of the generalization drop in the large-batch regime and present numerical evidence supporting the view that large-batch methods tend to converge to sharp minima of the training and testing functions.
Posted Content

Faster CNNs with Direct Sparse Convolutions and Guided Pruning

TL;DR: In this paper, the authors present an efficient sparse-with-dense matrix multiplication implementation for convolving feature maps with kernels of arbitrary sparsity patterns, and develop a performance model that predicts the sweet spots of sparsity levels for different layers and on different computer architectures.
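A minimal sketch of the general idea in the summary, assuming a SciPy environment: store the pruned kernel weights in a compressed sparse row format and express the convolution as a sparse-with-dense matrix multiplication against an im2col-lowered feature map. The shapes, function names, and naive lowering below are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch: convolution of a feature map with a sparse kernel,
# expressed as a sparse-with-dense matrix multiplication. The im2col
# lowering and shapes are simplified assumptions, not the paper's code.
import numpy as np
from scipy.sparse import csr_matrix

def im2col(x, kh, kw):
    """Lower a (C, H, W) input into a (C*kh*kw, out_h*out_w) matrix."""
    c, h, w = x.shape
    out_h, out_w = h - kh + 1, w - kw + 1
    cols = np.empty((c * kh * kw, out_h * out_w), dtype=x.dtype)
    idx = 0
    for ci in range(c):
        for i in range(kh):
            for j in range(kw):
                cols[idx] = x[ci, i:i + out_h, j:j + out_w].reshape(-1)
                idx += 1
    return cols, out_h, out_w

def sparse_conv(x, weights, kh, kw):
    """weights: dense (out_channels, C*kh*kw) array with many zero entries."""
    w_sparse = csr_matrix(weights)              # keep only nonzero kernel entries
    cols, out_h, out_w = im2col(x, kh, kw)
    out = np.asarray(w_sparse @ cols)           # sparse-with-dense matrix product
    return out.reshape(weights.shape[0], out_h, out_w)

# Toy usage: an ~80%-sparse 3x3 kernel bank applied to a random feature map.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8, 8)).astype(np.float32)
w = rng.standard_normal((16, 4 * 3 * 3)).astype(np.float32)
w[rng.random(w.shape) < 0.8] = 0.0
print(sparse_conv(x, w, 3, 3).shape)            # (16, 6, 6)
```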
Proceedings Article

A Progressive Batching L-BFGS Method for Machine Learning

TL;DR: This article presents a new version of the L-BFGS algorithm that combines three basic components (progressive batching, a stochastic line search, and stable quasi-Newton updating) and performs well on training logistic regression models and deep neural networks.
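A simplified sketch of the progressive-batching component described in the summary: stochastic gradients are drawn from a sample that is enlarged over the run, and search directions come from the standard L-BFGS two-loop recursion. The stochastic line search and stability safeguards of the actual method are omitted; the function names, growth rule, and fixed step size are illustrative assumptions.

```python
# Simplified sketch of progressive batching with L-BFGS directions: start
# with a small sample, grow it over the run, and build curvature pairs from
# successive stochastic gradients. Not the paper's algorithm in full.
import numpy as np

def two_loop_direction(grad, s_list, y_list):
    """Standard L-BFGS two-loop recursion for the search direction."""
    q = grad.copy()
    alphas = []
    for s, y in zip(reversed(s_list), reversed(y_list)):
        rho = 1.0 / (y @ s)
        a = rho * (s @ q)
        alphas.append((rho, a))
        q -= a * y
    if s_list:
        s, y = s_list[-1], y_list[-1]
        q *= (s @ y) / (y @ y)                  # initial Hessian scaling
    for (rho, a), (s, y) in zip(reversed(alphas), zip(s_list, y_list)):
        b = rho * (y @ q)
        q += (a - b) * s
    return -q

def progressive_lbfgs(grad_fn, w, n_data, iters=50, batch0=32, growth=1.5,
                      lr=0.1, memory=10, seed=0):
    rng = np.random.default_rng(seed)
    batch = float(batch0)
    s_list, y_list = [], []
    w_prev = g_prev = None
    for _ in range(iters):
        idx = rng.choice(n_data, size=min(int(batch), n_data), replace=False)
        g = grad_fn(w, idx)
        if w_prev is not None:
            # Note: the paper builds curvature pairs from sample-consistent
            # gradients; this toy version reuses gradients from different samples.
            s, y = w - w_prev, g - g_prev
            if s @ y > 1e-10:                   # keep only positive-curvature pairs
                s_list.append(s); y_list.append(y)
                if len(s_list) > memory:
                    s_list.pop(0); y_list.pop(0)
        d = two_loop_direction(g, s_list, y_list)
        w_prev, g_prev = w, g
        w = w + lr * d                          # fixed step in place of a line search
        batch *= growth                         # progressively enlarge the sample
    return w

# Toy usage: least-squares on random data with mini-batch gradients.
rng = np.random.default_rng(1)
A, b = rng.standard_normal((1000, 20)), rng.standard_normal(1000)
grad = lambda w, idx: A[idx].T @ (A[idx] @ w - b[idx]) / len(idx)
w_hat = progressive_lbfgs(grad, np.zeros(20), n_data=1000)
```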
Proceedings Article

Faster CNNs with Direct Sparse Convolutions and Guided Pruning

TL;DR: The authors develop an efficient, general sparse-with-dense matrix multiplication implementation applicable to convolution of feature maps with kernels of arbitrary sparsity patterns, together with a performance model that predicts the sweet spots of sparsity levels for different layers and on different computer architectures.