Institution

Alibaba Group

Company • Hangzhou, China

About: Alibaba Group is a company based in Hangzhou, China. It is known for research contributions in the topics Computer science and Terminal (electronics). The organization has 6810 authors who have published 7389 publications receiving 55653 citations. It is also known as Alibaba Group Holding Limited and Alibaba Group (Cayman Islands).

Topics: Computer science, Terminal (electronics), Graph (abstract data type), Node (networking), Deep learning

Papers

Posted Content

Multi-Interest Network with Dynamic Routing for Recommendation at Tmall

Chao Li¹, Zhiyuan Liu¹, Mengmeng Wu¹, Yuchi Xu¹, Pipei Huang¹, Huan Zhao², Guoliang Kang³, Qiwei Chen¹, Wei Li¹, Dik Lun Lee²

Alibaba Group¹, Hong Kong University of Science and Technology², University of Technology, Sydney³

17 Apr 2019 - arXiv: Information Retrieval

TL;DR: This paper designs a multi-interest extractor layer based on the recently proposed dynamic routing mechanism, which is applicable for modeling and extracting diverse interests from user's behaviors, and proposes a technique named label-aware attention to help the learning process of user representations.

Abstract: Industrial recommender systems usually consist of the matching stage and the ranking stage, in order to handle the billion-scale of users and items. The matching stage retrieves candidate items relevant to user interests, while the ranking stage sorts candidate items by user interests. Thus, the most critical ability is to model and represent user interests for either stage. Most of the existing deep learning-based models represent one user as a single vector which is insufficient to capture the varying nature of user's interests. In this paper, we approach this problem from a different view, to represent one user with multiple vectors encoding the different aspects of the user's interests. We propose the Multi-Interest Network with Dynamic routing (MIND) for dealing with user's diverse interests in the matching stage. Specifically, we design a multi-interest extractor layer based on capsule routing mechanism, which is applicable for clustering historical behaviors and extracting diverse interests. Furthermore, we develop a technique named label-aware attention to help learn a user representation with multiple vectors. Through extensive experiments on several public benchmarks and one large-scale industrial dataset from Tmall, we demonstrate that MIND can achieve superior performance than state-of-the-art methods for recommendation. Currently, MIND has been deployed for handling major online traffic at the homepage on Mobile Tmall App.

187 citations

Proceedings Article

Spatial-Phase Shallow Learning: Rethinking Face Forgery Detection in Frequency Domain

Honggu Liu¹, Xiaodan Li², Wenbo Zhou¹, Yuefeng Chen², Yuan He², Hui Xue², Weiming Zhang¹, Nenghai Yu¹

University of Science and Technology of China¹, Alibaba Group²

02 Mar 2021

TL;DR: This paper proposes a Spatial-Phase Shallow Learning (SPSL) method, which combines the spatial image and the phase spectrum to capture the up-sampling artifacts of face forgery and improve transferability.

Abstract: The remarkable success in face forgery techniques has received considerable attention in computer vision due to security concerns. We observe that up-sampling is a necessary step of most face forgery techniques, and cumulative up-sampling will result in obvious changes in the frequency domain, especially in the phase spectrum. According to the property of natural images, the phase spectrum preserves abundant frequency components that provide extra information and complement the loss of the amplitude spectrum. To this end, we present a novel Spatial-Phase Shallow Learning (SPSL) method, which combines spatial image and phase spectrum to capture the up-sampling artifacts of face forgery to improve the transferability, for face forgery detection. And we also theoretically analyze the validity of utilizing the phase spectrum. Moreover, we notice that local texture information is more crucial than high-level semantic information for the face forgery detection task. So we reduce the receptive fields by shallowing the network to suppress high-level features and focus on the local region. Extensive experiments show that SPSL can achieve the state-of-the-art performance on cross-datasets evaluation as well as multi-class classification and obtain comparable results on single dataset evaluation.

183 citations

Book Chapter

Suppress and Balance: A Simple Gated Network for Salient Object Detection

Xiaoqi Zhao¹, Youwei Pang¹, Lihe Zhang¹, Huchuan Lu¹, Lei Zhang², Lei Zhang³

Dalian University of Technology¹, Hong Kong Polytechnic University², Alibaba Group³

23 Aug 2020

TL;DR: This paper proposes a simple gated network (GateNet) to address two key problems that arise when the encoder exchanges information with the decoder: the lack of interference control between them, and the failure to account for the differing contributions of different encoder blocks.

Abstract: Most salient object detection approaches use U-Net or feature pyramid networks (FPN) as their basic structures. These methods ignore two key problems when the encoder exchanges information with the decoder: one is the lack of interference control between them, the other is without considering the disparity of the contributions of different encoder blocks. In this work, we propose a simple gated network (GateNet) to solve both issues at once. With the help of multilevel gate units, the valuable context information from the encoder can be optimally transmitted to the decoder. We design a novel gated dual branch structure to build the cooperation among different levels of features and improve the discriminability of the whole network. Through the dual branch design, more details of the saliency map can be further restored. In addition, we adopt the atrous spatial pyramid pooling based on the proposed “Fold” operation (Fold-ASPP) to accurately localize salient objects of various scales. Extensive experiments on five challenging datasets demonstrate that the proposed model performs favorably against most state-of-the-art methods under different evaluation metrics.

180 citations

Proceedings Article

BERT with History Answer Embedding for Conversational Question Answering

Chen Qu¹, Liu Yang¹, Minghui Qiu², W. Bruce Croft¹, Yongfeng Zhang³, Mohit Iyyer¹

University of Massachusetts Amherst¹, Alibaba Group², Rutgers University³

18 Jul 2019

TL;DR: This work proposes a conceptually simple yet highly effective approach referred to as history answer embedding that enables seamless integration of conversation history into a conversational question answering (ConvQA) model built on BERT (Bidirectional Encoder Representations from Transformers).

Abstract: Conversational search is an emerging topic in the information retrieval community. One of the major challenges to multi-turn conversational search is to model the conversation history to answer the current question. Existing methods either prepend history turns to the current question or use complicated attention mechanisms to model the history. We propose a conceptually simple yet highly effective approach referred to as history answer embedding. It enables seamless integration of conversation history into a conversational question answering (ConvQA) model built on BERT (Bidirectional Encoder Representations from Transformers). We first explain our view that ConvQA is a simplified but concrete setting of conversational search, and then we provide a general framework to solve ConvQA. We further demonstrate the effectiveness of our approach under this framework. Finally, we analyze the impact of different numbers of history turns under different settings to provide new insights into conversation history modeling in ConvQA.

177 citations

Proceedings Article

Dual Encoding for Zero-Example Video Retrieval

Jianfeng Dong¹, Xirong Li², Chaoxi Xu², Shouling Ji³, Yuan He⁴, Gang Yang², Xun Wang¹

Zhejiang Gongshang University¹, Renmin University of China², Zhejiang University³, Alibaba Group⁴

15 Jun 2019

TL;DR: In this paper, a dual deep encoding network is proposed to encode videos and queries into powerful dense representations of their own, achieving state-of-the-art performance for zero-example video retrieval.

Abstract: This paper attacks the challenging problem of zero-example video retrieval. In such a retrieval paradigm, an end user searches for unlabeled videos by ad-hoc queries described in natural language text with no visual example provided. Given videos as sequences of frames and queries as sequences of words, an effective sequence-to-sequence cross-modal matching is required. The majority of existing methods are concept based, extracting relevant concepts from queries and videos and accordingly establishing associations between the two modalities. In contrast, this paper takes a concept-free approach, proposing a dual deep encoding network that encodes videos and queries into powerful dense representations of their own. Dual encoding is conceptually simple, practically effective and end-to-end. As experiments on three benchmarks, i.e. MSR-VTT, TRECVID 2016 and 2017 Ad-hoc Video Search show, the proposed solution establishes a new state-of-the-art for zero-example video retrieval.

177 citations

Authors

Showing all 6829 results

Name	H-index	Papers	Citations
Philip S. Yu	148	1914	107374
Lei Zhang	130	2312	86950
Jian Xu	94	1366	52057
Wei Chu	80	670	28771
Le Song	76	345	21382
Yuan Xie	76	739	24155
Narendra Ahuja	76	474	29517
Rong Jin	75	449	19456
Beng Chin Ooi	73	408	19174
Wotao Yin	72	303	27233
Deng Cai	70	326	24524
Xiaofei He	70	260	28215
Irwin King	67	476	19056
Gang Wang	65	373	21579
Xiaodan Liang	61	318	14121

Network Information

Related Institutions (5)

Microsoft

86.9K papers, 4.1M citations

94% related

Google

39.8K papers, 2.1M citations

94% related

Facebook

10.9K papers, 570.1K citations

93% related

AT&T Labs

5.5K papers, 483.1K citations

38.6K papers, 1.3M citations

87% related

Performance

Metrics

Papers: 7,410

Citations: 106,380

No. of papers from the Institution in previous years
Year	Papers
2023	5
2022	30
2021	1,352
2020	1,671
2019	1,459
2018	863