Institution

Alibaba Group

Company•Hangzhou, China•

About: Alibaba Group is a company organization based out in Hangzhou, China. It is known for research contribution in the topics: Computer science & Terminal (electronics). The organization has 6810 authors who have published 7389 publications receiving 55653 citations. The organization is also known as: Alibaba Group Holding Limited & Alibaba Group (Cayman Islands).

...read moreread less

Topics: Computer science, Terminal (electronics), Graph (abstract data type), Node (networking), Deep learning ...read more

Papers published on a yearly basis

Papers

PDF

Open Access

More filters

Proceedings Article•

Lessons and Actions: What We Learned from 10K SSD-Related Storage System Failures

[...]

Erci Xu¹, Mai Zheng², Feng Qin¹, Yikang Xu³, Jiesheng Wu³ - Show less +1 more•Institutions (3)

Ohio State University¹, Iowa State University², Alibaba Group³

01 Jan 2019

TL;DR: This paper study the reliability of SSD-based storage systems deployed in Alibaba Cloud, which cover near half a million SSDs and span over three years of usage under representative cloud services, and derives a number of major lessons and a set of effective methods to address the issues observed.

...read moreread less

Abstract: Modern datacenters increasingly use flash-based solid state drives (SSDs) for high performance and low energy cost. However, SSD introduces more complex failure modes compared to traditional hard disk. While great efforts have been made to understand the reliability of SSD itself, it remains unclear what types of system level failures are related to SSD, what are the root causes, and how the rest of the system interacts with SSD and contributes to failures. Answering these questions can help practitioners build and maintain highly reliable SSD-based storage systems. In this paper, we study the reliability of SSD-based storage systems deployed in Alibaba Cloud, which cover near half a million SSDs and span over three years of usage under representative cloud services. We take a holistic view to analyze both device errors and system failures to better understand the potential casual relations. Particularly, we focus on failures that are Reported As “SSD-Related” (RASR) by system status monitoring daemons. Through log analysis, field studies, and validation experiments, we identify the characteristics of RASR failures in terms of their distribution, symptoms, and correlations. Moreover, we derive a number of major lessons and a set of effective methods to address the issues observed. We believe that our study and experience would be beneficial to the community and could facilitate building highly-reliable SSD-based storage systems.

...read moreread less

43 citations

Proceedings Article•DOI•

A Minimax Game for Instance based Selective Transfer Learning

[...]

Bo Wang¹, Minghui Qiu², Xisen Wang¹, Yaliang Li¹, Yu Gong¹, Xiaoyi Zeng, Jun Huang¹, Bo Zheng¹, Deng Cai², Jingren Zhou¹ - Show less +6 more•Institutions (2)

Alibaba Group¹, Zhejiang University²

25 Jul 2019

TL;DR: This work proposes a general Minimax Game based model for selective transfer learning that outperforms the competing methods by a large margin and is shown to speed up the training process of the learning task in the target domain than traditional TL methods.

...read moreread less

Abstract: Deep neural network based transfer learning has been widely used to leverage information from the domain with rich data to help domain with insufficient data. When the source data distribution is different from the target data, transferring knowledge between these domains may lead to negative transfer. To mitigate this problem, a typical way is to select useful source domain data for transferring. However, limited studies focus on selecting high-quality source data to help neural network based transfer learning. To bridge this gap, we propose a general Minimax Game based model for selective Transfer Learning (MGTL). More specifically, we build a selector, a discriminator and a TL module in the proposed method. The discriminator aims to maximize the differences between selected source data and target data, while the selector acts as an attacker to selected source data that are close to the target to minimize the differences. The TL module trains on the selected data and provides rewards to guide the selector. Those three modules play a minimax game to help select useful source data for transferring. Our method is also shown to speed up the training process of the learning task in the target domain than traditional TL methods. To the best of our knowledge, this is the first to build a minimax game based model for selective transfer learning. To examine the generality of our method, we evaluate it on two different tasks: item recommendation and text retrieval. Extensive experiments over both public and real-world datasets demonstrate that our model outperforms the competing methods by a large margin. Meanwhile, the quantitative evaluation shows our model can select data which are close to target data. Our model is also deployed in a real-world system and significant improvement over the baselines is observed.

...read moreread less

43 citations

Proceedings Article•DOI•

Generating Informative Conversational Response using Recurrent Knowledge-Interaction and Knowledge-Copy

[...]

Lin Xiexiong, Jian Weiyu, Jianshan He, Taifeng Wang¹, Wei Chu² - Show less +1 more•Institutions (2)

Alibaba Group¹, Hong Kong Polytechnic University²

01 Jul 2020

TL;DR: This paper proposes a method that uses recurrent knowledge interaction among response decoding steps to incorporate appropriate knowledge and introduces a knowledge copy mechanism using a knowledge-aware pointer network to copy words from external knowledge according to knowledge attention distribution.

...read moreread less

Abstract: Knowledge-driven conversation approaches have achieved remarkable research attention recently. However, generating an informative response with multiple relevant knowledge without losing fluency and coherence is still one of the main challenges. To address this issue, this paper proposes a method that uses recurrent knowledge interaction among response decoding steps to incorporate appropriate knowledge. Furthermore, we introduce a knowledge copy mechanism using a knowledge-aware pointer network to copy words from external knowledge according to knowledge attention distribution. Our joint neural conversation model which integrates recurrent Knowledge-Interaction and knowledge Copy (KIC) performs well on generating informative responses. Experiments demonstrate that our model with fewer parameters yields significant improvements over competitive baselines on two datasets Wizard-of-Wikipedia(average Bleu +87%; abs.: 0.034) and DuConv(average Bleu +20%; abs.: 0.047)) with different knowledge formats (textual & structured) and different languages (English & Chinese).

...read moreread less

43 citations

Posted Content•

Improving Biomedical Pretrained Language Models with Knowledge

[...]

Zheng Yuan¹, Yijia Liu², Chuanqi Tan², Songfang Huang², Fei Huang² - Show less +1 more•Institutions (2)

Tsinghua University¹, Alibaba Group²

21 Apr 2021-arXiv: Computation and Language

TL;DR: KeBioLM is proposed, a biomedical pretrained language model that explicitly leverages knowledge from the UMLS knowledge bases and has better ability to model medical knowledge.

...read moreread less

Abstract: Pretrained language models have shown success in many natural language processing tasks. Many works explore incorporating knowledge into language models. In the biomedical domain, experts have taken decades of effort on building large-scale knowledge bases. For example, the Unified Medical Language System (UMLS) contains millions of entities with their synonyms and defines hundreds of relations among entities. Leveraging this knowledge can benefit a variety of downstream tasks such as named entity recognition and relation extraction. To this end, we propose KeBioLM, a biomedical pretrained language model that explicitly leverages knowledge from the UMLS knowledge bases. Specifically, we extract entities from PubMed abstracts and link them to UMLS. We then train a knowledge-aware language model that firstly applies a text-only encoding layer to learn entity representation and applies a text-entity fusion encoding to aggregate entity representation. Besides, we add two training objectives as entity detection and entity linking. Experiments on the named entity recognition and relation extraction from the BLURB benchmark demonstrate the effectiveness of our approach. Further analysis on a collected probing dataset shows that our model has better ability to model medical knowledge.

...read moreread less

43 citations

Proceedings Article•DOI•

DAPPLE: a pipelined data parallel approach for training large models

[...]

Shiqing Fan¹, Yi Rong¹, Chen Meng¹, Zongyan Cao¹, Siyu Wang¹, Zhen Zheng¹, Chuan Wu², Guoping Long¹, Jun Yang¹, Lixue Xia¹, Lansong Diao¹, Xiaoyong Liu¹, Wei Lin¹ - Show less +9 more•Institutions (2)

Alibaba Group¹, University of Hong Kong²

17 Feb 2021

TL;DR: DAPPLE as mentioned in this paper is a synchronous training framework which combines data parallelism and pipeline parallelism for large DNN models, and it features a novel parallelization strategy planner to solve the partition and placement problems.

...read moreread less

Abstract: It is a challenging task to train large DNN models on sophisticated GPU platforms with diversified interconnect capabilities. Recently, pipelined training has been proposed as an effective approach for improving device utilization. However, there are still several tricky issues to address: improving computing efficiency while ensuring convergence, and reducing memory usage without incurring additional computing costs. We propose DAPPLE, a synchronous training framework which combines data parallelism and pipeline parallelism for large DNN models. It features a novel parallelization strategy planner to solve the partition and placement problems, and explores the optimal hybrid strategies of data and pipeline parallelism. We also propose a new runtime scheduling algorithm to reduce device memory usage, which is orthogonal to re-computation approach and does not come at the expense of training throughput. Experiments show that DAPPLE planner consistently outperforms strategies generated by PipeDream's planner by up to 3.23× speedup under synchronous training scenarios, and DAPPLE runtime outperforms GPipe by 1.6× speedup of training throughput and saves 12% of memory consumption at the same time.

...read moreread less

43 citations

Collapse

Authors

Showing all 6829 results

Name	H-index	Papers	Citations
Philip S. Yu	148	1914	107374
Lei Zhang	130	2312	86950
Jian Xu	94	1366	52057
Wei Chu	80	670	28771
Le Song	76	345	21382
Yuan Xie	76	739	24155
Narendra Ahuja	76	474	29517
Rong Jin	75	449	19456
Beng Chin Ooi	73	408	19174
Wotao Yin	72	303	27233
Deng Cai	70	326	24524
Xiaofei He	70	260	28215
Irwin King	67	476	19056
Gang Wang	65	373	21579
Xiaodan Liang	61	318	14121

Network Information

Related Institutions (5)

Microsoft

86.9K papers, 4.1M citations

94% related

Google

39.8K papers, 2.1M citations

94% related

Facebook

10.9K papers, 570.1K citations

93% related

AT&T Labs

5.5K papers, 483.1K citations

38.6K papers, 1.3M citations

87% related

Performance

Metrics

7,410

Papers

106,380

Citations

No. of papers from the Institution in previous years
Year	Papers
2023	5
2022	30
2021	1,352
2020	1,671
2019	1,459
2018	863