Institution

Alibaba Group

Company, Hangzhou, China
About: Alibaba Group is a company based in Hangzhou, China. It is known for research contributions in the topics Computer science and Terminal (electronics). The organization has 6810 authors who have published 7389 publications receiving 55653 citations. The organization is also known as Alibaba Group Holding Limited and Alibaba Group (Cayman Islands).


Papers
Proceedings Article
01 Jan 2016
TL;DR: This paper proposes a new data selection method which uses semi-supervised convolutional neural networks based on bitokens (Bi-SSCNNs) for training machine translation systems from a large bilingual corpus.
Abstract: In this paper, we propose a new data selection method which uses semi-supervised convolutional neural networks based on bitokens (Bi-SSCNNs) for training machine translation systems from a large bilingual corpus. In earlier work, we devised a data selection method based on semi-supervised convolutional neural networks (SSCNNs). The new method, Bi-SSCNN, is based on bitokens, which use bilingual information. When the new methods are tested on two translation tasks (Chinese-to-English and Arabic-to-English), they significantly outperform the other three data selection methods in the experiments. We also show that the Bi-SSCNN method is much more effective than other methods in preventing noisy sentence pairs from being chosen for training. More interestingly, this method only needs a tiny amount of in-domain data to train the selection model, which makes fine-grained topic-dependent translation adaptation possible. In the follow-up experiments, we find that neural machine translation (NMT) is more sensitive to noisy data than statistical machine translation (SMT). Therefore, Bi-SSCNN, which can effectively screen out noisy sentence pairs, can benefit NMT much more than SMT. We observed a BLEU improvement of over 3 points on an English-to-French WMT task when Bi-SSCNNs were used.

27 citations

Proceedings Article
01 Jul 2019
TL;DR: A Reinforced Bidirectional Attention Network (RBAN) approach is proposed to address two inherent challenges in ASC-QA, i.e., semantic matching between question and answer, and data noise.
Abstract: In the literature, existing studies on aspect sentiment classification (ASC) focus on individual non-interactive reviews. This paper extends the research to interactive reviews and proposes a new research task, namely Aspect Sentiment Classification towards Question-Answering (ASC-QA), for real-world applications. This new task aims to predict sentiment polarities for specific aspects from interactive QA style reviews. In particular, a high-quality annotated corpus is constructed for ASC-QA to facilitate corresponding research. On this basis, a Reinforced Bidirectional Attention Network (RBAN) approach is proposed to address two inherent challenges in ASC-QA, i.e., semantic matching between question and answer, and data noise. Experimental results demonstrate the great advantage of the proposed approach to ASC-QA against several state-of-the-art baselines.

27 citations

Journal Article
TL;DR: In this article, a reinforcement learning framework is proposed to train an AI agent that assists users in exploring the design space efficiently and generating well-optimized storylines, and PlotThread, an authoring tool that integrates a set of flexible interactions to support easy customization of storyline visualizations, is introduced.
Abstract: Storyline visualizations are an effective means to present the evolution of plots and reveal the scenic interactions among characters. However, the design of storyline visualizations is a difficult task, as users need to balance aesthetic goals against narrative constraints. Although optimization-based methods have improved significantly at producing aesthetic and legible layouts, the existing (semi-)automatic methods are still limited regarding 1) efficient exploration of the storyline design space and 2) flexible customization of storyline layouts. In this work, we propose a reinforcement learning framework to train an AI agent that assists users in exploring the design space efficiently and generating well-optimized storylines. Based on the framework, we introduce PlotThread, an authoring tool that integrates a set of flexible interactions to support easy customization of storyline visualizations. To seamlessly integrate the AI agent into the authoring process, we employ a mixed-initiative approach where both the agent and designers work on the same canvas to boost the collaborative design of storylines. We evaluate the reinforcement learning model through qualitative and quantitative experiments and demonstrate the usage of PlotThread using a collection of use cases.

27 citations

Journal Article
TL;DR: This work proposes to cognitively formalize the semantics of the key elements of the DIKW in a conceptual process and shows the initial case for using this formalization to construct security protection solutions for edge computing scenarios centering on type conversions among typed resources formalized through the proposed formalization of the DIKW.
Abstract: Currently, with the growth of the Internet of Things devices and the emergence of massive edge resources, security protection content has not only empowered IoT devices with the accumulation of networked computing and storage as a flexible whole but also enabled storing, transferring and processing DIKW (data, information, knowledge, and wisdom) content at the edge of the network from multiple devices in a mobile manner. However, understanding various DIKW content or resources poses a conceptual challenge in unifying the semantics of the core concepts as a starting point. Through building metamodels of the DIKW framework, we propose to cognitively formalize the semantics of the key elements of the DIKW in a conceptual process. The formalization centers on modeling the perceived world only by relationships or semantics as the prime atomic comprising elements. Based on this cognitive world model, we reveal the difference between relationships and entities during the conceptualization process as a foundation for distinguishing data and information. Thereafter, we show the initial case for using this formalization to construct security protection solutions for edge computing scenarios centering on type conversions among typed resources formalized through our proposed formalization of the DIKW.

27 citations

Proceedings Article
01 Nov 2018
TL;DR: A novel Heterogeneous Embedding Propagation (HEP) model is proposed, which iteratively reconstructs a node's embedding from its heterogeneous neighbors in a weighted manner, and meanwhile propagates its embedding updates from reconstruction loss and/or classification loss to its neighbors.
Abstract: We study the important problem of user alignment in e-commerce: to predict whether two online user identities that access an e-commerce site from different devices belong to one real-world person. As input, we have a set of user activity logs from Taobao and some labeled user identity linkages. User activity logs can be modeled using a heterogeneous interaction graph (HIG), and subsequently the user alignment task can be formulated as a semi-supervised HIG embedding problem. HIG embedding is challenging for two reasons: its heterogeneous nature and the presence of edge features. To address the challenges, we propose a novel Heterogeneous Embedding Propagation (HEP) model. The core idea is to iteratively reconstruct a node's embedding from its heterogeneous neighbors in a weighted manner, and meanwhile propagate its embedding updates from reconstruction loss and/or classification loss to its neighbors. We conduct extensive experiments on large-scale datasets from Taobao, demonstrating that HEP significantly outperforms state-of-the-art baselines often by more than 10% in F-scores.

27 citations


Authors

Showing all 6829 results

Name              H-index  Papers  Citations
Philip S. Yu      148      1914    107374
Lei Zhang         130      2312    86950
Jian Xu           94       1366    52057
Wei Chu           80       670     28771
Le Song           76       345     21382
Yuan Xie          76       739     24155
Narendra Ahuja    76       474     29517
Rong Jin          75       449     19456
Beng Chin Ooi     73       408     19174
Wotao Yin         72       303     27233
Deng Cai          70       326     24524
Xiaofei He        70       260     28215
Irwin King        67       476     19056
Gang Wang         65       373     21579
Xiaodan Liang     61       318     14121

Network Information
Related Institutions (5)
Microsoft
86.9K papers, 4.1M citations

94% related

Google
39.8K papers, 2.1M citations

94% related

Facebook
10.9K papers, 570.1K citations

93% related

AT&T Labs
5.5K papers, 483.1K citations

90% related

Performance
Metrics
No. of papers from the Institution in previous years
Year  Papers
2023  5
2022  30
2021  1,352
2020  1,671
2019  1,459
2018  863