Institution

Alibaba Group

Company•Hangzhou, China

About: Alibaba Group is a company based in Hangzhou, China. It is known for research contributions in the topics Computer science and Terminal (electronics). The organization has 6810 authors who have published 7389 publications receiving 55653 citations. It is also known as Alibaba Group Holding Limited and Alibaba Group (Cayman Islands).

Topics: Computer science, Terminal (electronics), Graph (abstract data type), Node (networking), Deep learning

Papers

Proceedings Article

Sentiment Classification towards Question-Answering with Hierarchical Matching Network

Shen Chenlin, Changlong Sun¹, Jingjing Wang², Yangyang Kang³, Shoushan Li², Xiaozhong Liu⁴, Luo Si³, Min Zhang², Guodong Zhou²

Zhejiang University¹, Soochow University (Suzhou)², Alibaba Group³, Indiana University⁴

01 Jan 2018

TL;DR: A high-quality annotated corpus with specially-designed annotation guidelines for QA-style sentiment classification is created and a three-stage hierarchical matching network is proposed to explore deep sentiment information in a QA text pair.

Abstract: In an e-commerce environment, user-oriented question-answering (QA) text pair could carry rich sentiment information. In this study, we propose a novel task/method to address QA sentiment analysis. In particular, we create a high-quality annotated corpus with specially-designed annotation guidelines for QA-style sentiment classification. On the basis, we propose a three-stage hierarchical matching network to explore deep sentiment information in a QA text pair. First, we segment both the question and answer text into sentences and construct a number of [Q-sentence, A-sentence] units in each QA text pair. Then, by leveraging a QA bidirectional matching layer, the proposed approach can learn the matching vectors of each [Q-sentence, A-sentence] unit. Finally, we characterize the importance of the generated matching vectors via a self-matching attention layer. Experimental results, comparing with a number of state-of-the-art baselines, demonstrate the impressive effectiveness of the proposed approach for QA-style sentiment classification.

33 citations
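
As a rough illustration of the first stage described in the abstract above (segmenting the question and answer into sentences and forming [Q-sentence, A-sentence] units), the following Python sketch pairs every question sentence with every answer sentence. The regex-based splitter, the all-pairs construction, and the names `split_sentences` and `qa_units` are assumptions made for illustration; the abstract does not say how the authors segment text or which sentence pairs they keep.

```python
import re
from itertools import product


def split_sentences(text):
    """Naive splitter (an assumption): break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def qa_units(question, answer):
    """Build [Q-sentence, A-sentence] units by pairing every Q sentence with every A sentence."""
    return list(product(split_sentences(question), split_sentences(answer)))


units = qa_units(
    "Is the battery good? Does it charge fast?",
    "Battery lasts long. Charging is slow.",
)
for q_sentence, a_sentence in units:
    print(q_sentence, "|", a_sentence)
```

In the approach the abstract describes, each such unit would then be passed to the bidirectional matching layer; that part is not sketched here.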

Posted Content

Stagewise Training Accelerates Convergence of Testing Error Over SGD

Zhuoning Yuan¹, Yan Yan¹, Rong Jin², Tianbao Yang¹

University of Iowa¹, Alibaba Group²

10 Dec 2018-arXiv: Machine Learning

TL;DR: This paper considers a stagewise training strategy for minimizing empirical risk that satisfies the Polyak-Łojasiewicz (PL) condition, which has been observed/proved for neural networks and also holds for a broad family of convex functions.

Abstract: Stagewise training strategy is widely used for learning neural networks, which runs a stochastic algorithm (e.g., SGD) starting with a relatively large step size (aka learning rate) and geometrically decreasing the step size after a number of iterations. It has been observed that the stagewise SGD has much faster convergence than the vanilla SGD with a polynomially decaying step size in terms of both training error and testing error. But how to explain this phenomenon has been largely ignored by existing studies. This paper provides some theoretical evidence for explaining this faster convergence. In particular, we consider a stagewise training strategy for minimizing empirical risk that satisfies the Polyak-Łojasiewicz (PL) condition, which has been observed/proved for neural networks and also holds for a broad family of convex functions. For convex loss functions and two classes of "nice-behaviored" non-convex objectives that are close to a convex function, we establish faster convergence of stagewise training than the vanilla SGD under the PL condition on both training error and testing error. Experiments on stagewise learning of deep residual networks exhibits that it satisfies one type of non-convexity assumption and therefore can be explained by our theory. Of independent interest, the testing error bounds for the considered non-convex loss functions are dimensionality and norm independent.

33 citations
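
The two step-size schedules contrasted in the abstract of "Stagewise Training Accelerates Convergence of Testing Error Over SGD" can be sketched in a few lines of Python. This is a minimal sketch, not the authors' algorithm: the function names, the decay factor, the stage length, and the noise-free toy gradient are all assumptions chosen so the example stays deterministic.

```python
def stagewise_step_sizes(eta0, decay, iters_per_stage, num_stages):
    """Constant step size within a stage, shrunk geometrically between stages."""
    sizes = []
    for stage in range(num_stages):
        sizes.extend([eta0 * decay ** stage] * iters_per_stage)
    return sizes


def polynomial_step_sizes(eta0, power, total_iters):
    """Polynomially decaying step size: eta0 / (t + 1) ** power."""
    return [eta0 / (t + 1) ** power for t in range(total_iters)]


def run_gradient_steps(grad, w0, step_sizes):
    """Apply w <- w - eta * grad(w) once per step size (no gradient noise in this toy)."""
    w = w0
    for eta in step_sizes:
        w = w - eta * grad(w)
    return w


# Toy objective f(w) = 0.5 * w**2, whose gradient is w (an illustrative assumption).
stagewise = stagewise_step_sizes(eta0=0.5, decay=0.5, iters_per_stage=2, num_stages=2)
polynomial = polynomial_step_sizes(eta0=0.5, power=1.0, total_iters=4)
print(stagewise)
print(polynomial)
print(run_gradient_steps(lambda w: w, 1.0, stagewise))
```

A real comparison would use stochastic gradients and a held-out set to measure testing error; the sketch only makes the shape of the two schedules concrete.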

Proceedings Article

ROSE: Cluster Resource Scheduling via Speculative Over-Subscription

Xiaoyang Sun¹, Chunming Hu¹, Renyu Yang¹, Peter Garraghan², Tianyu Wo¹, Jie Xu¹, Jianyong Zhu¹, Chao Li³

Beihang University¹, Lancaster University², Alibaba Group³

02 Jul 2018

TL;DR: This paper presents a new cluster scheduling system, ROSE, that is based on a multi-layered scheduling architecture with an ability to over-subscribe idle resources to accommodate unfulfilled resource requests and can almost double the average CPU utilization and reduce the workload makespan.

Abstract: A long-standing challenge in cluster scheduling is to achieve a high degree of utilization of heterogeneous resources in a cluster. In practice there exists a substantial disparity between perceived and actual resource utilization. A scheduler might regard a cluster as fully utilized if a large resource request queue is present, but the actual resource utilization of the cluster can be in fact very low. This disparity results in the formation of idle resources, leading to inefficient resource usage and incurring high operational costs and an inability to provision services. In this paper we present a new cluster scheduling system, ROSE, that is based on a multi-layered scheduling architecture with an ability to over-subscribe idle resources to accommodate unfulfilled resource requests. ROSE books idle resources in a speculative manner: instead of waiting for resource allocation to be confirmed by the centralized scheduler, it requests intelligently to launch tasks within machines according to their suitability to oversubscribe resources. A threshold control with timely task rescheduling ensures fully-utilized cluster resources without generating potential task stragglers. Experimental results show that ROSE can almost double the average CPU utilization, from 36.37% to 65.10%, compared with a centralized scheduling scheme, and reduce the workload makespan by 30.11%, with an 8.23% disk utilization improvement over other scheduling strategies.

33 citations

Proceedings Article

An empirical study on crash recovery bugs in large-scale distributed systems

Yu Gao¹, Wensheng Dou¹, Feng Qin², Chushu Gao¹, Dong Wang¹, Jun Wei¹, Ruirui Huang³, Li Zhou³, Wu Yongming³

Chinese Academy of Sciences¹, Ohio State University², Alibaba Group³

26 Oct 2018

TL;DR: CREB is presented, the most comprehensive study on 103 Crash REcovery Bugs from four popular open-source distributed systems, including ZooKeeper, Hadoop MapReduce, Cassandra and HBase, and obtains many interesting findings that can open up new research directions for combating crash recovery bugs.

Abstract: In large-scale distributed systems, node crashes are inevitable, and can happen at any time. As such, distributed systems are usually designed to be resilient to these node crashes via various crash recovery mechanisms, such as write-ahead logging in HBase and hinted handoffs in Cassandra. However, faults in crash recovery mechanisms and their implementations can introduce intricate crash recovery bugs, and lead to severe consequences. In this paper, we present CREB, the most comprehensive study on 103 Crash REcovery Bugs from four popular open-source distributed systems, including ZooKeeper, Hadoop MapReduce, Cassandra and HBase. For all the studied bugs, we analyze their root causes, triggering conditions, bug impacts and fixing. Through this study, we obtain many interesting findings that can open up new research directions for combating crash recovery bugs.

33 citations

Journal Article

Robust Dual Clustering with Adaptive Manifold Regularization

Nengwen Zhao¹, Lefei Zhang¹, Bo Du¹, Qian Zhang², Jane You³, Dacheng Tao⁴

Wuhan University¹, Alibaba Group², Hong Kong Polytechnic University³, University of Sydney⁴

01 Nov 2017-IEEE Transactions on Knowledge and Data Engineering

TL;DR: A novel clustering algorithm called robust dual clustering with adaptive manifold regularization (RDC) is proposed, which simultaneously performs dual matrix factorization tasks with the target of an identical cluster indicator in both of the original and projected feature spaces, respectively.

Abstract: In recent years, various data clustering algorithms have been proposed in the data mining and engineering communities. However, there are still drawbacks in traditional clustering methods which are worth to be further investigated, such as clustering for the high dimensional data, learning an ideal affinity matrix which optimally reveals the global data structure, discovering the intrinsic geometrical and discriminative properties of the data space, and reducing the noises influence brings by the complex data input. In this paper, we propose a novel clustering algorithm called robust dual clustering with adaptive manifold regularization (RDC), which simultaneously performs dual matrix factorization tasks with the target of an identical cluster indicator in both of the original and projected feature spaces, respectively. Among which, the $l_{2,1}$-norm is used instead of the conventional $l_{2}$-norm to measure the loss, which helps to improve the model robustness by relieving the influences by the noises and outliers. In order to better consider the intrinsic geometrical and discriminative data structure, we incorporate the manifold regularization term on the cluster indicator by using a particularly learned affinity matrix which is more suitable for the clustering task. Moreover, a novel augmented lagrangian method (ALM) based procedure is designed to effectively and efficiently seek the optimal solution of the proposed RDC optimization. Numerous experiments on the representative data sets demonstrate the superior performance of the proposed method compares to the existing clustering algorithms.

33 citations
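
To make the robustness claim in the abstract of "Robust Dual Clustering with Adaptive Manifold Regularization" concrete, the sketch below compares the $l_{2,1}$-norm of a residual matrix with its squared $l_{2}$ (Frobenius) loss. It assumes the common row-wise convention, where the $l_{2,1}$-norm is the sum of the Euclidean norms of the rows and each row is one sample; the abstract does not state which convention the paper uses, and the function names and toy residuals are illustrative only.

```python
import math


def l21_norm(matrix):
    """Sum of the Euclidean norms of the rows (row-wise convention, an assumption)."""
    return sum(math.sqrt(sum(x * x for x in row)) for row in matrix)


def squared_frobenius_norm(matrix):
    """Sum of all squared entries, i.e. the conventional squared l2 loss."""
    return sum(x * x for row in matrix for x in row)


# Toy residual matrix: the last row plays the role of an outlier sample.
residual = [
    [3.0, 4.0],
    [0.0, 0.0],
    [30.0, 40.0],
]

print(l21_norm(residual))                # 5 + 0 + 50
print(squared_frobenius_norm(residual))  # 25 + 0 + 2500
```

The outlier row contributes in proportion to its length under the $l_{2,1}$-norm but in proportion to its squared length under the squared $l_{2}$ loss, which is why the former is less dominated by noisy samples.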

Authors

Showing all 6829 results

Name	H-index	Papers	Citations
Philip S. Yu	148	1914	107374
Lei Zhang	130	2312	86950
Jian Xu	94	1366	52057
Wei Chu	80	670	28771
Le Song	76	345	21382
Yuan Xie	76	739	24155
Narendra Ahuja	76	474	29517
Rong Jin	75	449	19456
Beng Chin Ooi	73	408	19174
Wotao Yin	72	303	27233
Deng Cai	70	326	24524
Xiaofei He	70	260	28215
Irwin King	67	476	19056
Gang Wang	65	373	21579
Xiaodan Liang	61	318	14121

Network Information

Related Institutions (5)

Microsoft

86.9K papers, 4.1M citations

94% related

Google

39.8K papers, 2.1M citations

94% related

Facebook

10.9K papers, 570.1K citations

93% related

AT&T Labs

5.5K papers, 483.1K citations

38.6K papers, 1.3M citations

87% related

Performance

Metrics

Papers: 7,410

Citations: 106,380

No. of papers from the Institution in previous years
Year	Papers
2023	5
2022	30
2021	1,352
2020	1,671
2019	1,459
2018	863