Institution

Alibaba Group

Company · Hangzhou, China
About: Alibaba Group is a company based in Hangzhou, China. It is known for research contributions in the topics of Computer science and Terminal (electronics). The organization has 6810 authors who have published 7389 publications receiving 55653 citations. It is also known as Alibaba Group Holding Limited and Alibaba Group (Cayman Islands).


Papers
Posted Content
TL;DR: GC is shown to regularize both the weight space and the output feature space, boosting the generalization performance of DNNs, and to improve the Lipschitzness of the loss function and its gradient, making the training process more efficient and stable.
Abstract: Optimization techniques are of great importance for effectively and efficiently training a deep neural network (DNN). It has been shown that using first- and second-order statistics (e.g., mean and variance) to perform Z-score standardization on network activations or weight vectors, as in batch normalization (BN) and weight standardization (WS), can improve training performance. Unlike these existing methods, which mostly operate on activations or weights, we present a new optimization technique, gradient centralization (GC), which operates directly on gradients by centralizing the gradient vectors to have zero mean. GC can be viewed as a projected gradient descent method with a constrained loss function. We show that GC can regularize both the weight space and the output feature space, boosting the generalization performance of DNNs. Moreover, GC improves the Lipschitzness of the loss function and its gradient, making the training process more efficient and stable. GC is very simple to implement and can be embedded into existing gradient-based DNN optimizers with only one line of code. It can also be used directly to fine-tune pre-trained DNNs. Our experiments on various applications, including general image classification, fine-grained image classification, detection, and segmentation, demonstrate that GC consistently improves the performance of DNN learning. The code for GC can be found at this https URL.
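
To make the mechanism concrete, here is a minimal NumPy sketch of what centralizing a gradient to zero mean and folding it into a plain SGD step could look like. This is only an illustration of the idea, not the authors' released code (which is at the linked URL); the function names and toy shapes are our own assumptions.

```python
import numpy as np

def gradient_centralization(grad):
    """Centralize a gradient to zero mean across all axes except the
    first (the output-channel axis), per the description above."""
    if grad.ndim > 1:
        axes = tuple(range(1, grad.ndim))
        grad = grad - grad.mean(axis=axes, keepdims=True)
    return grad

def sgd_step_with_gc(weight, grad, lr=0.01):
    """One plain SGD update with GC applied to the gradient first."""
    return weight - lr * gradient_centralization(grad)

# Toy usage on a fully connected layer's weight gradient.
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 8))
g = rng.normal(size=(4, 8))
w = sgd_step_with_gc(w, g)
# Each row of the centralized gradient now has (numerically) zero mean.
```

The "one line of code" claim in the abstract corresponds to the single subtraction inside gradient_centralization, applied just before the optimizer's update rule.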

53 citations

Journal ArticleDOI
TL;DR: VRANet is built on a bilinear visual attention module that identifies critical objects, together with a novel Visual Relational Reasoning (VRR) module that reasons about pair-wise and inner-group visual relationships among objects, guided by textual information.
Abstract: Cross-modal analysis has become a promising direction for artificial intelligence. Visual representation is crucial for various cross-modal analysis tasks that require visual content understanding. Visual features that contain semantic information can disentangle the underlying correlation between different modalities, thus benefiting the downstream tasks. In this paper, we propose a Visual Reasoning and Attention Network (VRANet) as a plug-and-play module to capture rich visual semantics and enhance visual representations for cross-modal analysis. VRANet is built on a bilinear visual attention module that identifies the critical objects. We further propose a novel Visual Relational Reasoning (VRR) module to reason about pair-wise and inner-group visual relationships among objects, guided by the textual information. The two modules enhance the visual features at both the relation level and the object level. We demonstrate the effectiveness of VRANet by applying it to both Visual Question Answering (VQA) and Cross-Modal Information Retrieval (CMIR) tasks. Extensive experiments conducted on the VQA 2.0, CLEVR, CMPlaces, and MS-COCO datasets indicate superior performance compared with state-of-the-art methods.
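
The abstract does not give implementation details, but the core of a bilinear, text-guided attention over detected objects can be sketched in a few lines. In this NumPy illustration, every name, shape, and the single bilinear weight are our own assumptions, not VRANet's actual layers:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def bilinear_text_guided_attention(V, q, W):
    """Attend over object features V with a text query q through a
    bilinear form, then return the attention-weighted visual feature.

    V : (num_objects, d_v) object features
    q : (d_q,)             text query embedding
    W : (d_v, d_q)         bilinear weight (a stand-in, not VRANet's)
    """
    scores = V @ W @ q        # one bilinear score per object
    alpha = softmax(scores)   # attention distribution over objects
    return alpha @ V          # attended visual feature, shape (d_v,)

rng = np.random.default_rng(0)
V = rng.normal(size=(36, 256))          # e.g., 36 detected regions
q = rng.normal(size=(128,))
W = rng.normal(size=(256, 128)) * 0.01
attended = bilinear_text_guided_attention(V, q, W)
print(attended.shape)                   # (256,)
```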

53 citations

Proceedings ArticleDOI
01 Jul 2019
TL;DR: In this paper, the authors propose MGNER, a multi-grained NER framework that detects and recognizes entities at multiple granularities: it can recognize named entities without explicitly assuming non-overlapping or totally nested structures.
Abstract: This paper presents a novel framework, MGNER, for Multi-Grained Named Entity Recognition, where multiple entities or entity mentions in a sentence may be non-overlapping or totally nested. Unlike traditional approaches, which regard NER as a sequential labeling task and annotate entities consecutively, MGNER detects and recognizes entities at multiple granularities: it is able to recognize named entities without explicitly assuming non-overlapping or totally nested structures. MGNER consists of a Detector that examines all possible word segments and a Classifier that categorizes entities. In addition, contextual information and a self-attention mechanism are utilized throughout the framework to improve NER performance. Experimental results show that MGNER outperforms current state-of-the-art baselines by up to 4.4% in terms of F1 score on nested and non-overlapping NER tasks.
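
The Detector-then-Classifier design implies enumerating candidate word segments rather than tagging tokens left to right. Here is a small Python sketch of that enumeration step; the span-length limit and the downstream scoring described in the comments are our assumptions, not MGNER's actual components:

```python
def enumerate_spans(tokens, max_len=4):
    """Enumerate every candidate word segment up to max_len tokens,
    the way a span-based Detector would examine them."""
    spans = []
    for start in range(len(tokens)):
        for end in range(start + 1, min(start + max_len, len(tokens)) + 1):
            spans.append((start, end, " ".join(tokens[start:end])))
    return spans

tokens = "The University of California campus".split()
for start, end, segment in enumerate_spans(tokens):
    # A Detector would score each segment as entity vs. non-entity;
    # a Classifier would then type the surviving spans (PER, ORG, ...).
    # Nested candidates such as "University of California" and
    # "The University of California" can both be kept, which is what
    # allows recognition at multiple granularities.
    print(start, end, segment)
```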

53 citations

Proceedings ArticleDOI
23 Apr 2019
TL;DR: An innovative training strategy is proposed that learns the parameters of the student intertwined with the teachers, achieved by "projecting" its amalgamated features onto each teacher's domain and computing the loss.
Abstract: In this paper, we investigate a novel deep-model reusing task. Our goal is to train a lightweight and versatile student model, without human-labelled annotations, that amalgamates the knowledge and masters the expertise of two pre-trained teacher models working on heterogeneous problems, one on scene parsing and the other on depth estimation. To this end, we propose an innovative training strategy that learns the parameters of the student intertwined with the teachers, achieved by "projecting" its amalgamated features onto each teacher's domain and computing the loss. We also introduce two options that generalize the proposed training strategy to handle three or more tasks simultaneously. The proposed scheme yields very encouraging results. As demonstrated on several benchmarks, the trained student model achieves results even superior to those of the teachers in their own expertise domains, and on par with state-of-the-art fully supervised models relying on human-labelled annotations.
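
The "project, then compute the loss per teacher" idea can be captured in a compact form. Below is a NumPy sketch under our own assumptions (MSE as the feature loss, plain linear projections, made-up dimensions); the paper's actual projection heads and loss terms may differ:

```python
import numpy as np

def mse(a, b):
    return float(np.mean((a - b) ** 2))

def amalgamation_loss(student_feat, teacher_feats, projections):
    """Project the student's shared (amalgamated) feature into each
    teacher's feature space and sum the per-teacher losses.

    student_feat  : (d_s,) student feature
    teacher_feats : list of (d_t,) target features from frozen teachers
    projections   : list of (d_t, d_s) projection matrices (hypothetical)
    """
    return sum(mse(P @ student_feat, t)
               for P, t in zip(projections, teacher_feats))

rng = np.random.default_rng(0)
student = rng.normal(size=(512,))
teachers = [rng.normal(size=(256,)),   # e.g., scene-parsing teacher
            rng.normal(size=(128,))]   # e.g., depth-estimation teacher
heads = [rng.normal(size=(256, 512)) * 0.01,
         rng.normal(size=(128, 512)) * 0.01]
print(amalgamation_loss(student, teachers, heads))
```

Extending the list of teachers and projection heads is one natural way the strategy generalizes to three or more tasks, as the abstract mentions.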

53 citations

Proceedings ArticleDOI
20 Jun 2021
TL;DR: In this article, the authors propose a collaborative compression scheme, which combines channel pruning and tensor decomposition to compress CNN models by simultaneously learning the model sparsity and low-rankness.
Abstract: Channel pruning and tensor decomposition have received extensive attention in convolutional neural network compression. However, these two techniques are traditionally deployed in isolation, leading to significant accuracy drops when pursuing high compression rates. In this paper, we propose a Collaborative Compression (CC) scheme, which combines channel pruning and tensor decomposition to compress CNN models by simultaneously learning the model's sparsity and low-rankness. Specifically, we first investigate the compression sensitivity of each layer in the network, and then propose a Global Compression Rate Optimization that transforms the choice of per-layer compression rates into an optimization problem. After that, we propose a multi-step heuristic compression procedure that removes redundant compression units step by step while fully accounting for the remaining compression space (i.e., the unremoved compression units). Our method demonstrates superior performance gains over previous ones on various datasets and backbone architectures. For example, we achieve a 52.9% FLOPs reduction by removing 48.4% of the parameters on ResNet-50, with only a 0.56% Top-1 accuracy drop on ImageNet 2012.
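
As a rough illustration of combining the two techniques on a single layer, the NumPy sketch below prunes low-norm output channels and then low-rank factorizes what remains via truncated SVD. This mimics the spirit of joint sparsity and low-rankness; CC's actual scheme selects compression rates globally and jointly, which this toy does not do:

```python
import numpy as np

def compress_layer(W, rank, keep_channels):
    """Toy joint compression of one layer's weight matrix: prune the
    lowest-norm output channels, then factorize the remainder with a
    truncated SVD so that W_pruned ~= A @ B."""
    # Channel pruning: keep the output channels (rows) with largest L2 norm.
    norms = np.linalg.norm(W, axis=1)
    kept = np.sort(np.argsort(norms)[-keep_channels:])
    W_pruned = W[kept]

    # Low-rank decomposition of the pruned weight via truncated SVD.
    U, S, Vt = np.linalg.svd(W_pruned, full_matrices=False)
    A = U[:, :rank] * S[:rank]    # (keep_channels, rank)
    B = Vt[:rank]                 # (rank, in_features)
    return A, B

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 128))
A, B = compress_layer(W, rank=16, keep_channels=48)
print(W.size, "->", A.size + B.size)   # 8192 -> 2816 parameters
```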

53 citations


Authors


Name | H-index | Papers | Citations
Philip S. Yu | 148 | 1,914 | 107,374
Lei Zhang | 130 | 2,312 | 86,950
Jian Xu | 94 | 1,366 | 52,057
Wei Chu | 80 | 670 | 28,771
Le Song | 76 | 345 | 21,382
Yuan Xie | 76 | 739 | 24,155
Narendra Ahuja | 76 | 474 | 29,517
Rong Jin | 75 | 449 | 19,456
Beng Chin Ooi | 73 | 408 | 19,174
Wotao Yin | 72 | 303 | 27,233
Deng Cai | 70 | 326 | 24,524
Xiaofei He | 70 | 260 | 28,215
Irwin King | 67 | 476 | 19,056
Gang Wang | 65 | 373 | 21,579
Xiaodan Liang | 61 | 318 | 14,121
Network Information
Related Institutions (5)
Microsoft: 86.9K papers, 4.1M citations, 94% related
Google: 39.8K papers, 2.1M citations, 94% related
Facebook: 10.9K papers, 570.1K citations, 93% related
AT&T Labs: 5.5K papers, 483.1K citations, 90% related

Performance Metrics
No. of papers from the Institution in previous years

Year | Papers
2023 | 5
2022 | 30
2021 | 1,352
2020 | 1,671
2019 | 1,459
2018 | 863