Institution

Alibaba Group

Company • Hangzhou, China

About: Alibaba Group is a company based in Hangzhou, China. It is known for research contributions in the topics Computer science and Terminal (electronics). The organization has 6810 authors who have published 7389 publications receiving 55653 citations. It is also known as Alibaba Group Holding Limited and Alibaba Group (Cayman Islands).

Topics: Computer science, Terminal (electronics), Graph (abstract data type), Node (networking), Deep learning

Papers

Journal Article

UniFuse: Unidirectional Fusion for 360° Panorama Depth Estimation

Hualie Jiang¹, Zhe Sheng², Siyu Zhu², Zilong Dong², Rui Huang¹

The Chinese University of Hong Kong¹, Alibaba Group²

12 Feb 2021

TL;DR: This letter introduces a new framework that fuses features from the cubemap and equirectangular projections, unidirectionally feeding the cubemap features to the equirectangular features only at the decoding stage.

Abstract: Learning depth from spherical panoramas is becoming a popular research topic because a panorama has a full field-of-view of the environment and provides a relatively complete description of a scene. However, applying well-studied CNNs for perspective images to the standard representation of spherical panoramas, i.e., the equirectangular projection, is suboptimal, as it becomes distorted towards the poles. Another representation is the cubemap projection, which is distortion-free but discontinuous at the edges and limited in the field-of-view. This letter introduces a new framework to fuse features from the two projections, unidirectionally feeding the cubemap features to the equirectangular features only at the decoding stage. Unlike the recent bidirectional fusion approach operating at both the encoding and decoding stages, our fusion scheme is much more efficient. Besides, we also designed a more effective fusion module for our fusion scheme. Experiments verify the effectiveness of our proposed fusion strategy and module, and our model achieves state-of-the-art performance on four popular datasets. Additional experiments show that our model also has advantages in model complexity and generalization capability.

80 citations

Book Chapter

Gradient Centralization: A New Optimization Technique for Deep Neural Networks

Hongwei Yong¹, Jianqiang Huang², Xian-Sheng Hua², Lei Zhang¹

Hong Kong Polytechnic University¹, Alibaba Group²

23 Aug 2020

Abstract: Optimization techniques are of great importance to effectively and efficiently train a deep neural network (DNN). It has been shown that using the first and second order statistics (e.g., mean and variance) to perform Z-score standardization on network activations or weight vectors, such as batch normalization (BN) and weight standardization (WS), can improve the training performance. Different from these existing methods that mostly operate on activations or weights, we present a new optimization technique, namely gradient centralization (GC), which operates directly on gradients by centralizing the gradient vectors to have zero mean. GC can be viewed as a projected gradient descent method with a constrained loss function. We show that GC can regularize both the weight space and output feature space so that it can boost the generalization performance of DNNs. Moreover, GC improves the Lipschitzness of the loss function and its gradient so that the training process becomes more efficient and stable. GC is very simple to implement and it can be embedded into existing gradient based DNN optimizers with only one line of code. Our experiments on various applications, including general image classification, fine-grained image classification, detection and segmentation, demonstrate that GC can consistently improve the performance of DNN learning. The code of GC can be found at https://github.com/Yonghongwei/Gradient-Centralization.

80 citations
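
The gradient centralization (GC) abstract above describes centralizing gradient vectors so that they have zero mean. A minimal NumPy sketch of that idea follows. It assumes the gradient of a weight tensor stores the output unit on its first axis and that the mean is taken over all remaining axes; one-dimensional gradients (such as biases) are left unchanged. The function names `centralize_gradient` and `sgd_step_with_gc` are hypothetical and are not taken from the authors' repository.

```python
import numpy as np


def centralize_gradient(grad: np.ndarray) -> np.ndarray:
    """Return a copy of grad in which each output-unit slice has zero mean.

    Assumption: axis 0 indexes output units; the mean is taken over the
    remaining axes. Gradients with one axis or fewer are returned unchanged.
    """
    if grad.ndim <= 1:
        return grad.copy()
    axes = tuple(range(1, grad.ndim))
    return grad - grad.mean(axis=axes, keepdims=True)


def sgd_step_with_gc(weight: np.ndarray, grad: np.ndarray, lr: float = 0.1) -> np.ndarray:
    """Plain SGD update that centralizes the gradient before applying it."""
    return weight - lr * centralize_gradient(grad)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grad = rng.standard_normal((4, 3, 3, 3))  # e.g. a conv weight gradient
    centered = centralize_gradient(grad)
    # Each of the 4 output-unit slices now averages to (numerically) zero.
    print(np.abs(centered.mean(axis=(1, 2, 3))).max())
```

The centralization is a single subtraction applied to the gradient before the optimizer update, which is consistent with the abstract's remark that GC can be embedded into existing gradient-based optimizers with very little code.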

Journal Article

Guaranteeing Deadlines for Inter-Data Center Transfers

Hong Zhang¹, Kai Chen¹, Wei Bai¹, Dongsu Han², Chen Tian³, Hao Wang⁴, Haibing Guan⁵, Ming Zhang⁶

Hong Kong University of Science and Technology¹, KAIST², Nanjing University³, University of Toronto⁴, Shanghai Jiao Tong University⁵, Alibaba Group⁶

01 Feb 2017 • IEEE/ACM Transactions on Networking

TL;DR: Simulations and test bed experiments show that Amoeba, a system implementing a deadline-based network abstraction (DNA) for inter-data center WANs, harnesses DNA’s malleability to accommodate 15% more user requests with deadlines while achieving 60% higher WAN utilization than prior solutions.

Abstract: Inter-data center wide area networks (inter-DC WANs) carry a significant amount of data transfers that must be completed within certain time periods, or deadlines. However, very little work has been done to guarantee such deadlines. The crux is that the current inter-DC WAN lacks an interface for users to specify their transfer deadlines and a mechanism for providers to ensure the completion while maintaining high WAN utilization. In this paper, we address the problem by introducing a deadline-based network abstraction (DNA) for inter-DC WANs. DNA allows users to explicitly specify the amount of data to be delivered and the deadline by which it has to be completed. The malleability of DNA provides flexibility in resource allocation. Based on this, we develop a system called Amoeba that implements DNA. Our simulations and test bed experiments show that Amoeba, by harnessing DNA’s malleability, accommodates 15% more user requests with deadlines, while achieving 60% higher WAN utilization than prior solutions.

80 citations

Proceedings Article

A Short-Term Rainfall Prediction Model Using Multi-task Convolutional Neural Networks

Minghui Qiu¹, Peilin Zhao², Zhang Ke¹, Jun Huang¹, Shi Xing, Wang Xiaoguang, Wei Chu³

Alibaba Group¹, South China University of Technology², Hong Kong Polytechnic University³

01 Nov 2017

TL;DR: The proposed multi-task convolutional neural network is, to the authors' knowledge, the first attempt to use multi-task learning and deep learning techniques to predict short-term rainfall amount based on multi-site features, and it significantly outperforms a broad set of baseline models including the European Centre for Medium-range Weather Forecasts system.

Abstract: Precipitation prediction, such as short-term rainfall prediction, is a very important problem in the field of meteorological service. In practice, most recent studies focus on leveraging radar data or satellite images to make predictions. However, there is another scenario where a set of weather features are collected by various sensors at multiple observation sites. The observations of a site are sometimes incomplete but provide important clues for weather prediction at nearby sites, which are not yet fully exploited in existing work. To solve this problem, we propose a multi-task convolutional neural network model to automatically extract features from the time series measured at observation sites and leverage the correlation between the multiple sites for weather prediction via multi-tasking. To the best of our knowledge, this is the first attempt to use multi-task learning and deep learning techniques to predict short-term rainfall amount based on multi-site features. Specifically, we formulate the learning task as an end-to-end multi-site neural network model, which makes it possible to transfer the knowledge learned at one site to other correlated sites and to model the correlations between different sites. Extensive experiments show that the learned site correlations are insightful and the proposed model significantly outperforms a broad set of baseline models including the European Centre for Medium-range Weather Forecasts system (ECMWF).

80 citations

Proceedings Article

An Efficient Hardware Accelerator for Sparse Convolutional Neural Networks on FPGAs

Liqiang Lu¹, Jiaming Xie¹, Ruirui Huang², Jiansong Zhang², Wei Lin², Yun Liang¹

Peking University¹, Alibaba Group²

01 Apr 2019

TL;DR: An FPGA accelerator for sparse CNNs is developed that can achieve 223.4-309.0 GOP/s for modern CNNs on Xilinx ZCU102, which provides a 3.6x-12.9x speedup over previous dense CNN FPGA accelerators.

Abstract: Deep convolutional neural networks (CNN) have achieved remarkable performance at the cost of huge computation. As CNN models become more complex and deeper, compressing a CNN into a sparse model by pruning the redundant connections in the network has emerged as an attractive approach to reduce the amount of computation and memory requirement. In recent years, FPGAs have been demonstrated to be an effective hardware platform to accelerate CNN inference. However, most existing FPGA architectures focus on dense CNN models. Architectures designed for dense CNN models are inefficient when executing sparse models, as most of the arithmetic operations involve addition and multiplication with zero operands. On the other hand, recent sparse FPGA accelerators only focus on FC layers. In this work, we aim to develop an FPGA accelerator for sparse CNNs. To efficiently deal with the irregular connections in the sparse convolutional layer, we propose a weight-oriented dataflow that processes each weight individually. Then we design an FPGA architecture which can handle input-weight connection and weight-output connection efficiently. For input-weight connection, we design a tile look-up table to eliminate the runtime indexing match of compressed weights. Moreover, we develop a weight layout to enable high on-chip memory access. To cooperate with the weight layout, a channel multiplexer is inserted to locate the address, which can ensure no data access conflict. Experiments demonstrate that our accelerator can achieve 223.4-309.0 GOP/s for modern CNNs on Xilinx ZCU102, which provides a 3.6x-12.9x speedup over previous dense CNN FPGA accelerators.

79 citations
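
The sparse CNN abstract above mentions a weight-oriented dataflow that processes each weight individually. The following software sketch illustrates only that general idea, under stated assumptions: stride 1, no padding, and a dense weight array from which the nonzero entries are extracted with `np.argwhere`. The function name `sparse_conv2d_weight_oriented` is hypothetical, and the paper's hardware components (tile look-up table, weight layout, channel multiplexer) are not modeled here.

```python
import numpy as np


def sparse_conv2d_weight_oriented(x: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Convolve x (C_in, H, W) with weights (C_out, C_in, K, K), skipping zeros.

    Assumptions: stride 1, no padding, cross-correlation convention.
    Each nonzero weight is visited once and its contribution is
    accumulated into the whole output map it touches.
    """
    c_out, _, k, _ = weights.shape
    _, h, w = x.shape
    out_h, out_w = h - k + 1, w - k + 1
    out = np.zeros((c_out, out_h, out_w), dtype=np.result_type(x, weights))

    # "Compressed" view of the pruned layer: coordinates of nonzero weights.
    for oc, ic, ky, kx in np.argwhere(weights != 0):
        # One weight multiplies a shifted window of a single input channel.
        window = x[ic, ky:ky + out_h, kx:kx + out_w]
        out[oc] += weights[oc, ic, ky, kx] * window
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((2, 5, 5))
    w = rng.standard_normal((3, 2, 3, 3))
    w[rng.random(w.shape) < 0.7] = 0.0  # prune roughly 70% of the weights
    print(sparse_conv2d_weight_oriented(x, w).shape)
```

Because the loop runs only over nonzero weights, the multiply-accumulate work scales with the number of surviving weights rather than with the dense kernel size, which is the inefficiency of dense architectures that the abstract points out.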

Authors

Showing all 6829 results

Name	H-index	Papers	Citations
Philip S. Yu	148	1914	107374
Lei Zhang	130	2312	86950
Jian Xu	94	1366	52057
Wei Chu	80	670	28771
Le Song	76	345	21382
Yuan Xie	76	739	24155
Narendra Ahuja	76	474	29517
Rong Jin	75	449	19456
Beng Chin Ooi	73	408	19174
Wotao Yin	72	303	27233
Deng Cai	70	326	24524
Xiaofei He	70	260	28215
Irwin King	67	476	19056
Gang Wang	65	373	21579
Xiaodan Liang	61	318	14121

Network Information

Related Institutions (5)

Microsoft

86.9K papers, 4.1M citations

94% related

Google

39.8K papers, 2.1M citations

94% related

Facebook

10.9K papers, 570.1K citations

93% related

AT&T Labs

5.5K papers, 483.1K citations

38.6K papers, 1.3M citations

87% related

Performance

Metrics

Papers: 7,410

Citations: 106,380

No. of papers from the Institution in previous years
Year	Papers
2023	5
2022	30
2021	1,352
2020	1,671
2019	1,459
2018	863