Institution

Alibaba Group

Company•Hangzhou, China•

About: Alibaba Group is a company based in Hangzhou, China. It is known for research contributions in the topics Computer science and Terminal (electronics). The organization has 6810 authors who have published 7389 publications receiving 55653 citations. It is also known as Alibaba Group Holding Limited and Alibaba Group (Cayman Islands).

Topics: Computer science, Terminal (electronics), Graph (abstract data type), Node (networking), Deep learning

Papers

Proceedings Article•DOI•

Large-scale Causal Approaches to Debiasing Post-click Conversion Rate Estimation with Multi-task Learning

Wenhao Zhang¹, Wentian Bao², Xiao-Yang Liu³, Keping Yang², Quan Lin², Hong Wen², Ramin Ramezani¹•Institutions (3)

University of California, Los Angeles¹, Alibaba Group², Columbia University³

20 Apr 2020

TL;DR: Two principled, efficient and highly effective estimators for industrial CVR estimation are proposed, Multi-IPW and Multi-DR, which build on the multi-task learning framework and mitigate the data sparsity issue.

Abstract: Post-click conversion rate (CVR) estimation is a critical task in e-commerce recommender systems. This task is deemed quite challenging in industrial settings because of two major issues: 1) selection bias caused by user self-selection, and 2) data sparsity due to rare click events. A successful conversion typically follows the sequence "exposure → click → conversion". Conventional CVR estimators are trained in the click space, but inference is done in the entire exposure space. They fail to account for the causes of the missing data and treat them as missing at random. Hence, their estimates are highly likely to deviate from the real values by a large margin. In addition, the data sparsity issue can also handicap many industrial CVR estimators, which usually have large parameter spaces. In this paper, we propose two principled, efficient and highly effective CVR estimators for industrial CVR estimation, namely, Multi-IPW and Multi-DR. The proposed models approach CVR estimation from a causal perspective and account for the causes of data missing not at random. In addition, our methods are based on the multi-task learning framework and mitigate the data sparsity issue. Extensive experiments on industrial-level datasets show that our methods outperform the state-of-the-art CVR models.

60 citations
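
The inverse propensity weighting (IPW) idea that the abstract above refers to can be sketched in a few lines of Python. This is a minimal illustration of generic IPW, not the paper's Multi-IPW model: the function names, the choice of log loss, and the toy numbers are all assumptions, and in practice the click propensities would come from a separately trained CTR model.

```python
import math


def log_loss(label, pred):
    """Binary cross-entropy for one example, clipped for numerical safety."""
    eps = 1e-12
    pred = min(max(pred, eps), 1.0 - eps)
    return -(label * math.log(pred) + (1 - label) * math.log(1.0 - pred))


def naive_cvr_loss(clicks, conversions, cvr_preds):
    """Average loss over clicked items only (the click space)."""
    losses = [
        log_loss(r, p)
        for o, r, p in zip(clicks, conversions, cvr_preds)
        if o == 1
    ]
    return sum(losses) / len(losses)


def ipw_cvr_loss(clicks, conversions, cvr_preds, click_propensities):
    """IPW estimate of the average loss over all exposed items.

    Each clicked item is up-weighted by 1 / propensity, so items that
    were unlikely to be clicked count for more. Unclicked items have no
    observed conversion label and contribute zero to the sum.
    """
    total = 0.0
    for o, r, p_cvr, p_click in zip(
        clicks, conversions, cvr_preds, click_propensities
    ):
        if o == 1:
            total += log_loss(r, p_cvr) / p_click
    return total / len(clicks)


# Hypothetical toy data: four exposures, two of which were clicked.
clicks = [1, 0, 1, 0]
conversions = [1, 0, 0, 0]          # only meaningful where clicks == 1
cvr_preds = [0.8, 0.3, 0.2, 0.1]
click_propensities = [0.9, 0.2, 0.1, 0.3]

naive = naive_cvr_loss(clicks, conversions, cvr_preds)
weighted = ipw_cvr_loss(clicks, conversions, cvr_preds, click_propensities)
```

When every item has the same click propensity, equal to the overall click rate, the two losses coincide; they diverge once click probability varies across items, which is the selection-bias case the abstract describes.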

Journal Article•DOI•

Boundary Enhanced Neural Span Classification for Nested Named Entity Recognition.

Chuanqi Tan¹, Wei Qiu¹, Mosha Chen¹, Rui Wang¹, Fei Huang¹•Institutions (1)

Alibaba Group¹

03 Apr 2020

TL;DR: This work proposes a boundary enhanced neural span classification model that has the ability to generate high-quality candidate spans and greatly reduces the time complexity during inference, and incorporates an additional boundary detection task to predict those words that are boundaries of entities.

Abstract: Named entity recognition (NER) is a well-studied task in natural language processing. However, the widely used sequence labeling framework usually has difficulty detecting entities with nested structures. The span-based method, which can easily detect nested entities in different subsequences, is naturally suitable for the nested NER problem. However, previous span-based methods have two main issues. First, classifying all subsequences is computationally expensive and very inefficient at inference. Second, the span-based methods mainly focus on learning span representations but lack explicit boundary supervision. To tackle the above two issues, we propose a boundary enhanced neural span classification model. In addition to classifying the span, we propose incorporating an additional boundary detection task to predict those words that are boundaries of entities. The two tasks are jointly trained under a multitask learning framework, which enhances the span representation with additional boundary supervision. In addition, the boundary detection model has the ability to generate high-quality candidate spans, which greatly reduces the time complexity during inference. Experiments show that our approach outperforms all existing methods and achieves 85.3, 83.9, and 78.3 scores in terms of F1 on the ACE2004, ACE2005, and GENIA datasets, respectively.

60 citations
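
The efficiency argument in the abstract above (classifying every subsequence versus classifying only spans suggested by predicted boundaries) can be illustrated with a small Python sketch. The function names, the inclusive `(start, end)` index convention, and the hand-picked boundary sets are assumptions for illustration; in the paper the boundaries are predicted by a trained model, which is not reproduced here.

```python
def all_spans(n_tokens, max_len):
    """Enumerate every (start, end) span, inclusive, up to max_len tokens."""
    return [
        (i, j)
        for i in range(n_tokens)
        for j in range(i, min(n_tokens, i + max_len))
    ]


def boundary_spans(starts, ends, max_len):
    """Keep only spans that begin at a predicted start token and
    finish at a predicted end token."""
    return [
        (i, j)
        for i in sorted(starts)
        for j in sorted(ends)
        if i <= j < i + max_len
    ]


# Hypothetical sentence with a nested entity:
# "Bank of China" (tokens 0..2) contains "China" (token 2).
tokens = ["Bank", "of", "China", "opened", "today"]
predicted_starts = {0, 2}   # assumed output of a boundary detector
predicted_ends = {2}

exhaustive = all_spans(len(tokens), max_len=5)
candidates = boundary_spans(predicted_starts, predicted_ends, max_len=5)
```

Here the exhaustive enumeration yields 15 spans to classify, while the boundary-guided version yields 2, and both nested entities are still among the candidates.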

Posted Content•

LotteryFL: Personalized and Communication-Efficient Federated Learning with Lottery Ticket Hypothesis on Non-IID Datasets.

Ang Li¹, Jingwei Sun¹, Binghui Wang¹, Lin Duan¹, Sicheng Li², Yiran Chen¹, Hai Li¹•Institutions (2)

Duke University¹, Alibaba Group²

07 Aug 2020-arXiv: Learning

TL;DR: This work proposes LotteryFL -- a personalized and communication-efficient federated learning framework via exploiting the Lottery Ticket hypothesis, and constructs non-IID datasets based on MNIST, CIFAR-10 and EMNIST by taking feature distribution skew, label distribution skew and quantity skew into consideration.

Abstract: Federated learning is a popular distributed machine learning paradigm with enhanced privacy. Its primary goal is learning a global model that offers good performance for as many participants as possible. The technology is rapidly advancing with many unsolved challenges, among which statistical heterogeneity (i.e., non-IID data) and communication efficiency are two critical ones that hinder the development of federated learning. In this work, we propose LotteryFL -- a personalized and communication-efficient federated learning framework that exploits the Lottery Ticket hypothesis. In LotteryFL, each client learns a lottery ticket network (i.e., a subnetwork of the base model) by applying the Lottery Ticket hypothesis, and only these lottery networks are communicated between the server and clients. Rather than learning a shared global model as in classic federated learning, each client learns a personalized model via LotteryFL; the communication cost can be significantly reduced due to the compact size of lottery networks. To support the training and evaluation of our framework, we construct non-IID datasets based on MNIST, CIFAR-10 and EMNIST by taking feature distribution skew, label distribution skew and quantity skew into consideration. Experiments on these non-IID datasets demonstrate that LotteryFL significantly outperforms existing solutions in terms of personalization and communication cost.

59 citations
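
As a rough illustration of why communicating only a subnetwork is cheaper, the sketch below performs a single magnitude-pruning step on a flat list of weights and collects the surviving entries. It is an assumption-laden toy, not LotteryFL itself: the abstract does not say how the lottery ticket networks are found, and the function names, the `keep_ratio` parameter, and the weight values are hypothetical.

```python
def magnitude_mask(weights, keep_ratio):
    """Return a 0/1 mask that keeps the largest-magnitude weights.

    Exactly k = max(1, round(len(weights) * keep_ratio)) entries are
    kept; ties at the threshold are broken by position.
    """
    k = max(1, round(len(weights) * keep_ratio))
    threshold = sorted((abs(w) for w in weights), reverse=True)[k - 1]
    mask = []
    kept = 0
    for w in weights:
        keep = abs(w) >= threshold and kept < k
        mask.append(1 if keep else 0)
        kept += 1 if keep else 0
    return mask


def sparse_payload(weights, mask):
    """Collect (index, value) pairs for the weights a client would send."""
    return [(i, w) for i, (w, m) in enumerate(zip(weights, mask)) if m == 1]


# Hypothetical client weights after local training.
client_weights = [0.9, -0.1, 0.05, -0.7]
mask = magnitude_mask(client_weights, keep_ratio=0.5)
payload = sparse_payload(client_weights, mask)
```

With a keep ratio of 0.5, only two of the four weights are placed in the payload, so the message size scales with the size of the subnetwork rather than the full model.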

Posted Content•

Blind Face Restoration via Deep Multi-scale Component Dictionaries

Xiaoming Li¹, Chaofeng Chen², Shangchen Zhou³, Xianhui Lin⁴, Wangmeng Zuo¹, Lei Zhang⁵•Institutions (5)

Harbin Institute of Technology¹, University of Hong Kong², Nanyang Technological University³, Alibaba Group⁴, Hong Kong Polytechnic University⁵

02 Aug 2020-arXiv: Computer Vision and Pattern Recognition

TL;DR: A deep face dictionary network (termed DFDNet) is proposed to guide the restoration of degraded observations; it achieves plausible performance in both quantitative and qualitative evaluation and can generate realistic and promising results on real degraded images without requiring an identity-belonging reference.

Abstract: Recent reference-based face restoration methods have received considerable attention due to their great capability in recovering high-frequency details on real low-quality images. However, most of these methods require a high-quality reference image of the same identity, making them only applicable in limited scenes. To address this issue, this paper suggests a deep face dictionary network (termed as DFDNet) to guide the restoration process of degraded observations. To begin with, we use K-means to generate deep dictionaries for perceptually significant face components (i.e., left/right eyes, nose and mouth) from high-quality images. Next, with the degraded input, we match and select the most similar component features from their corresponding dictionaries and transfer the high-quality details to the input via the proposed dictionary feature transfer (DFT) block. In particular, component AdaIN is leveraged to eliminate the style diversity between the input and dictionary features (e.g., illumination), and a confidence score is proposed to adaptively fuse the dictionary feature to the input. Finally, multi-scale dictionaries are adopted in a progressive manner to enable the coarse-to-fine restoration. Experiments show that our proposed method can achieve plausible performance in both quantitative and qualitative evaluation, and more importantly, can generate realistic and promising results on real degraded images without requiring an identity-belonging reference. The source code and models are available at this https URL.

59 citations

Proceedings Article•DOI•

Deep Attentive Sentence Ordering Network

Baiyun Cui¹, Yingming Li¹, Ming Chen², Zhongfei Zhang³•Institutions (3)

Zhejiang University¹, Alibaba Group², Binghamton University³

01 Jan 2018

TL;DR: A novel deep attentive sentence ordering network (referred to as ATTOrderNet) is proposed that integrates a self-attention mechanism with LSTMs in the encoding of input sentences, which captures global dependencies among sentences regardless of their input order and yields a reliable representation of the sentence set.

Abstract: In this paper, we propose a novel deep attentive sentence ordering network (referred to as ATTOrderNet) which integrates a self-attention mechanism with LSTMs in the encoding of input sentences. It enables us to capture global dependencies among sentences regardless of their input order and obtains a reliable representation of the sentence set. With this representation, a pointer network is exploited to generate an ordered sequence. The proposed model is evaluated on Sentence Ordering and Order Discrimination tasks. The extensive experimental results demonstrate its effectiveness and superiority to the state-of-the-art methods.

59 citations

Authors

Showing all 6829 results

Name	H-index	Papers	Citations
Philip S. Yu	148	1914	107374
Lei Zhang	130	2312	86950
Jian Xu	94	1366	52057
Wei Chu	80	670	28771
Le Song	76	345	21382
Yuan Xie	76	739	24155
Narendra Ahuja	76	474	29517
Rong Jin	75	449	19456
Beng Chin Ooi	73	408	19174
Wotao Yin	72	303	27233
Deng Cai	70	326	24524
Xiaofei He	70	260	28215
Irwin King	67	476	19056
Gang Wang	65	373	21579
Xiaodan Liang	61	318	14121
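
For readers unfamiliar with the H-index column, the short Python sketch below computes it under the standard definition: the largest h such that an author has h papers with at least h citations each. The page does not state how its values were derived, so this is only the textbook definition, and the per-paper citation counts in the example are made up.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    h = 0
    for rank, count in enumerate(sorted(citations, reverse=True), start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical per-paper citation counts for one author.
example_citations = [10, 8, 5, 4, 3]
example_h = h_index(example_citations)
```

In this example four papers have at least four citations each, but there are not five papers with at least five, so the index is 4.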

Network Information

Related Institutions (5)

Microsoft

86.9K papers, 4.1M citations

94% related

Google

39.8K papers, 2.1M citations

94% related

Facebook

10.9K papers, 570.1K citations

93% related

AT&T Labs

5.5K papers, 483.1K citations

38.6K papers, 1.3M citations

87% related

Performance

Metrics

7,410

Papers

106,380

Citations

No. of papers from the Institution in previous years
Year	Papers
2023	5
2022	30
2021	1,352
2020	1,671
2019	1,459
2018	863