Institution

Alibaba Group

Company · Hangzhou, China
About: Alibaba Group is a company based in Hangzhou, China. It is known for research contributions in the topics of Computer science and Terminal (electronics). The organization has 6810 authors who have published 7389 publications receiving 55653 citations. It is also known as Alibaba Group Holding Limited and Alibaba Group (Cayman Islands).


Papers
Posted Content
TL;DR: GC is shown to regularize both the weight space and the output feature space, boosting the generalization performance of DNNs, and to improve the Lipschitzness of the loss function and its gradient, making the training process more efficient and stable.
Abstract: Optimization techniques are of great importance for effectively and efficiently training a deep neural network (DNN). It has been shown that using first- and second-order statistics (e.g., mean and variance) to perform Z-score standardization on network activations or weight vectors, as in batch normalization (BN) and weight standardization (WS), can improve training performance. Unlike these existing methods, which mostly operate on activations or weights, we present a new optimization technique, gradient centralization (GC), which operates directly on gradients by centralizing the gradient vectors to have zero mean. GC can be viewed as a projected gradient descent method with a constrained loss function. We show that GC can regularize both the weight space and the output feature space, boosting the generalization performance of DNNs. Moreover, GC improves the Lipschitzness of the loss function and its gradient, making the training process more efficient and stable. GC is very simple to implement and can be embedded into existing gradient-based DNN optimizers with only one line of code. It can also be used directly to fine-tune pre-trained DNNs. Our experiments on various applications, including general image classification, fine-grained image classification, detection, and segmentation, demonstrate that GC consistently improves the performance of DNN learning. The code for GC can be found at this https URL.
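
To make the mechanism concrete, here is a minimal NumPy sketch of what centralizing a gradient to zero mean and folding it into a plain SGD step could look like. This is only an illustration of the idea, not the authors' released code (which is at the linked URL); the function names and toy shapes are our own assumptions.

```python
import numpy as np

def gradient_centralization(grad):
    """Centralize a gradient to zero mean across all axes except the
    first (the output-channel axis), per the description above."""
    if grad.ndim > 1:
        axes = tuple(range(1, grad.ndim))
        grad = grad - grad.mean(axis=axes, keepdims=True)
    return grad

def sgd_step_with_gc(weight, grad, lr=0.01):
    """One plain SGD update with GC applied to the gradient first."""
    return weight - lr * gradient_centralization(grad)

# Toy usage on a fully connected layer's weight gradient.
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 8))
g = rng.normal(size=(4, 8))
w = sgd_step_with_gc(w, g)
# Each row of the centralized gradient now has (numerically) zero mean.
```

The "one line of code" claim in the abstract corresponds to the single subtraction inside gradient_centralization, applied just before the optimizer's update rule.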

53 citations

Journal ArticleDOI
TL;DR: VRANet is built on a bilinear visual attention module that identifies critical objects, together with a novel Visual Relational Reasoning (VRR) module that reasons about pair-wise and inner-group visual relationships among objects, guided by textual information.
Abstract: Cross-modal analysis has become a promising direction for artificial intelligence. Visual representation is crucial for various cross-modal analysis tasks that require visual content understanding. Visual features that contain semantic information can disentangle the underlying correlation between different modalities, thus benefiting the downstream tasks. In this paper, we propose a Visual Reasoning and Attention Network (VRANet) as a plug-and-play module to capture rich visual semantics and enhance visual representations for cross-modal analysis. VRANet is built on a bilinear visual attention module that identifies the critical objects. We further propose a novel Visual Relational Reasoning (VRR) module to reason about pair-wise and inner-group visual relationships among objects, guided by the textual information. The two modules enhance the visual features at both the relation level and the object level. We demonstrate the effectiveness of VRANet by applying it to both Visual Question Answering (VQA) and Cross-Modal Information Retrieval (CMIR) tasks. Extensive experiments conducted on the VQA 2.0, CLEVR, CMPlaces, and MS-COCO datasets indicate superior performance compared with state-of-the-art methods.
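
The abstract does not give implementation details, but the core of a bilinear, text-guided attention over detected objects can be sketched in a few lines. In this NumPy illustration, every name, shape, and the single bilinear weight are our own assumptions, not VRANet's actual layers:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def bilinear_text_guided_attention(V, q, W):
    """Attend over object features V with a text query q through a
    bilinear form, then return the attention-weighted visual feature.

    V : (num_objects, d_v) object features
    q : (d_q,)             text query embedding
    W : (d_v, d_q)         bilinear weight (a stand-in, not VRANet's)
    """
    scores = V @ W @ q        # one bilinear score per object
    alpha = softmax(scores)   # attention distribution over objects
    return alpha @ V          # attended visual feature, shape (d_v,)

rng = np.random.default_rng(0)
V = rng.normal(size=(36, 256))          # e.g., 36 detected regions
q = rng.normal(size=(128,))
W = rng.normal(size=(256, 128)) * 0.01
attended = bilinear_text_guided_attention(V, q, W)
print(attended.shape)                   # (256,)
```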

53 citations

Proceedings ArticleDOI
01 Jul 2019
TL;DR: In this paper, the authors propose MGNER, a multi-grained NER framework that detects and recognizes entities at multiple granularities: it can recognize named entities without explicitly assuming non-overlapping or totally nested structures.
Abstract: This paper presents a novel framework, MGNER, for Multi-Grained Named Entity Recognition, where multiple entities or entity mentions in a sentence may be non-overlapping or totally nested. Unlike traditional approaches, which regard NER as a sequential labeling task and annotate entities consecutively, MGNER detects and recognizes entities at multiple granularities: it is able to recognize named entities without explicitly assuming non-overlapping or totally nested structures. MGNER consists of a Detector that examines all possible word segments and a Classifier that categorizes entities. In addition, contextual information and a self-attention mechanism are utilized throughout the framework to improve NER performance. Experimental results show that MGNER outperforms current state-of-the-art baselines by up to 4.4% in terms of F1 score on nested and non-overlapping NER tasks.
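
The Detector-then-Classifier design implies enumerating candidate word segments rather than tagging tokens left to right. Here is a small Python sketch of that enumeration step; the span-length limit and the downstream scoring described in the comments are our assumptions, not MGNER's actual components:

```python
def enumerate_spans(tokens, max_len=4):
    """Enumerate every candidate word segment up to max_len tokens,
    the way a span-based Detector would examine them."""
    spans = []
    for start in range(len(tokens)):
        for end in range(start + 1, min(start + max_len, len(tokens)) + 1):
            spans.append((start, end, " ".join(tokens[start:end])))
    return spans

tokens = "The University of California campus".split()
for start, end, segment in enumerate_spans(tokens):
    # A Detector would score each segment as entity vs. non-entity;
    # a Classifier would then type the surviving spans (PER, ORG, ...).
    # Nested candidates such as "University of California" and
    # "The University of California" can both be kept, which is what
    # allows recognition at multiple granularities.
    print(start, end, segment)
```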

53 citations

Proceedings ArticleDOI
23 Apr 2019
TL;DR: An innovative training strategy is proposed that learns the parameters of the student intertwined with the teachers, achieved by "projecting" its amalgamated features onto each teacher's domain and computing the loss.
Abstract: In this paper, we investigate a novel deep-model reusing task. Our goal is to train a lightweight and versatile student model, without human-labelled annotations, that amalgamates the knowledge and masters the expertise of two pre-trained teacher models working on heterogeneous problems, one on scene parsing and the other on depth estimation. To this end, we propose an innovative training strategy that learns the parameters of the student intertwined with the teachers, achieved by "projecting" its amalgamated features onto each teacher's domain and computing the loss. We also introduce two options that generalize the proposed training strategy to handle three or more tasks simultaneously. The proposed scheme yields very encouraging results. As demonstrated on several benchmarks, the trained student model achieves results even superior to those of the teachers in their own expertise domains, and on par with state-of-the-art fully supervised models relying on human-labelled annotations.
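
The "project, then compute the loss per teacher" idea can be captured in a compact form. Below is a NumPy sketch under our own assumptions (MSE as the feature loss, plain linear projections, made-up dimensions); the paper's actual projection heads and loss terms may differ:

```python
import numpy as np

def mse(a, b):
    return float(np.mean((a - b) ** 2))

def amalgamation_loss(student_feat, teacher_feats, projections):
    """Project the student's shared (amalgamated) feature into each
    teacher's feature space and sum the per-teacher losses.

    student_feat  : (d_s,) student feature
    teacher_feats : list of (d_t,) target features from frozen teachers
    projections   : list of (d_t, d_s) projection matrices (hypothetical)
    """
    return sum(mse(P @ student_feat, t)
               for P, t in zip(projections, teacher_feats))

rng = np.random.default_rng(0)
student = rng.normal(size=(512,))
teachers = [rng.normal(size=(256,)),   # e.g., scene-parsing teacher
            rng.normal(size=(128,))]   # e.g., depth-estimation teacher
heads = [rng.normal(size=(256, 512)) * 0.01,
         rng.normal(size=(128, 512)) * 0.01]
print(amalgamation_loss(student, teachers, heads))
```

Extending the list of teachers and projection heads is one natural way the strategy generalizes to three or more tasks, as the abstract mentions.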

53 citations

Proceedings ArticleDOI
20 Jun 2021
TL;DR: In this article, the authors propose a collaborative compression scheme, which combines channel pruning and tensor decomposition to compress CNN models by simultaneously learning the model sparsity and low-rankness.
Abstract: Channel pruning and tensor decomposition have received extensive attention in convolutional neural network compression. However, these two techniques are traditionally deployed in isolation, leading to significant accuracy drops when pursuing high compression rates. In this paper, we propose a Collaborative Compression (CC) scheme, which combines channel pruning and tensor decomposition to compress CNN models by simultaneously learning the model's sparsity and low-rankness. Specifically, we first investigate the compression sensitivity of each layer in the network, and then propose a Global Compression Rate Optimization that transforms the choice of per-layer compression rates into an optimization problem. After that, we propose a multi-step heuristic compression procedure that removes redundant compression units step by step while fully accounting for the remaining compression space (i.e., the unremoved compression units). Our method demonstrates superior performance gains over previous ones on various datasets and backbone architectures. For example, we achieve a 52.9% FLOPs reduction by removing 48.4% of the parameters on ResNet-50, with only a 0.56% Top-1 accuracy drop on ImageNet 2012.
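
As a rough illustration of combining the two techniques on a single layer, the NumPy sketch below prunes low-norm output channels and then low-rank factorizes what remains via truncated SVD. This mimics the spirit of joint sparsity and low-rankness; CC's actual scheme selects compression rates globally and jointly, which this toy does not do:

```python
import numpy as np

def compress_layer(W, rank, keep_channels):
    """Toy joint compression of one layer's weight matrix: prune the
    lowest-norm output channels, then factorize the remainder with a
    truncated SVD so that W_pruned ~= A @ B."""
    # Channel pruning: keep the output channels (rows) with largest L2 norm.
    norms = np.linalg.norm(W, axis=1)
    kept = np.sort(np.argsort(norms)[-keep_channels:])
    W_pruned = W[kept]

    # Low-rank decomposition of the pruned weight via truncated SVD.
    U, S, Vt = np.linalg.svd(W_pruned, full_matrices=False)
    A = U[:, :rank] * S[:rank]    # (keep_channels, rank)
    B = Vt[:rank]                 # (rank, in_features)
    return A, B

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 128))
A, B = compress_layer(W, rank=16, keep_channels=48)
print(W.size, "->", A.size + B.size)   # 8192 -> 2816 parameters
```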

53 citations


Authors


Name | H-index | Papers | Citations
Philip S. Yu | 148 | 1,914 | 107,374
Lei Zhang | 130 | 2,312 | 86,950
Jian Xu | 94 | 1,366 | 52,057
Wei Chu | 80 | 670 | 28,771
Le Song | 76 | 345 | 21,382
Yuan Xie | 76 | 739 | 24,155
Narendra Ahuja | 76 | 474 | 29,517
Rong Jin | 75 | 449 | 19,456
Beng Chin Ooi | 73 | 408 | 19,174
Wotao Yin | 72 | 303 | 27,233
Deng Cai | 70 | 326 | 24,524
Xiaofei He | 70 | 260 | 28,215
Irwin King | 67 | 476 | 19,056
Gang Wang | 65 | 373 | 21,579
Xiaodan Liang | 61 | 318 | 14,121
Network Information
Related Institutions (5)
Microsoft: 86.9K papers, 4.1M citations, 94% related
Google: 39.8K papers, 2.1M citations, 94% related
Facebook: 10.9K papers, 570.1K citations, 93% related
AT&T Labs: 5.5K papers, 483.1K citations, 90% related

Performance Metrics
No. of papers from the Institution in previous years

Year | Papers
2023 | 5
2022 | 30
2021 | 1,352
2020 | 1,671
2019 | 1,459
2018 | 863