Institution

Alibaba Group

Company•Hangzhou, China

About: Alibaba Group is a company based in Hangzhou, China. It is known for research contributions in the topics Computer science and Terminal (electronics). The organization has 6810 authors who have published 7389 publications receiving 55653 citations. It is also known as Alibaba Group Holding Limited and Alibaba Group (Cayman Islands).

Topics: Computer science, Terminal (electronics), Graph (abstract data type), Node (networking), Deep learning

Papers

Proceedings Article

Interaction Embeddings for Prediction and Explanation in Knowledge Graphs

Wen Zhang¹, Bibek Paudel², Wei Zhang³, Abraham Bernstein², Huajun Chen¹

Zhejiang University¹, University of Zurich², Alibaba Group³

30 Jan 2019

TL;DR: This paper proposes CrossE, a novel knowledge graph embedding that explicitly simulates crossover interactions between entities and relations to select related information when predicting a new triple, and that achieves state-of-the-art results on complex and more challenging datasets.

Abstract: Knowledge graph embedding aims to learn distributed representations for entities and relations, and is proven to be effective in many applications. Crossover interactions (bi-directional effects between entities and relations) help select related information when predicting a new triple, but haven't been formally discussed before. In this paper, we propose CrossE, a novel knowledge graph embedding which explicitly simulates crossover interactions. It not only learns one general embedding for each entity and relation as most previous methods do, but also generates multiple triple-specific embeddings for both of them, named interaction embeddings. We evaluate embeddings on typical link prediction tasks and find that CrossE achieves state-of-the-art results on complex and more challenging datasets. Furthermore, we evaluate embeddings from a new perspective: giving explanations for predicted triples, which is important for real applications. In this work, an explanation for a triple is regarded as a reliable closed-path between the head and the tail entity. Compared to other baselines, we show experimentally that CrossE, benefiting from interaction embeddings, is more capable of generating reliable explanations to support its predictions.

91 citations
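
The interaction embeddings described in the CrossE abstract above can be illustrated with a small scoring function. This is a minimal sketch, not the authors' implementation: it assumes that a relation-specific interaction vector (`c_r`, a name chosen here) is combined element-wise with the general head and relation embeddings, passed through `tanh`, and matched against the tail embedding with a sigmoid. The function name, the bias vector `b`, and the toy values are all hypothetical.

```python
import math


def crosse_style_score(h, r, t, c_r, b):
    """Score a triple (h, r, t) in (0, 1) using a CrossE-style interaction.

    h, r, t: general embeddings of head, relation, and tail (lists of floats).
    c_r:     interaction vector tied to the relation (assumed name).
    b:       bias vector (assumed).
    """
    # Triple-specific (interaction) embedding of the head entity.
    h_i = [c * x for c, x in zip(c_r, h)]
    # Triple-specific (interaction) embedding of the relation.
    r_i = [x * y for x, y in zip(h_i, r)]
    # Combine both interaction embeddings into a query vector.
    q = [math.tanh(x + y + z) for x, y, z in zip(h_i, r_i, b)]
    # Similarity between the query and the tail embedding, squashed to (0, 1).
    logit = sum(x * y for x, y in zip(q, t))
    return 1.0 / (1.0 + math.exp(-logit))


# Toy values: the same head and tail score differently under two interaction vectors.
head, rel, tail = [0.5, -0.2], [0.1, 0.3], [0.4, 0.7]
bias = [0.0, 0.0]
score_a = crosse_style_score(head, rel, tail, [1.0, 1.0], bias)
score_b = crosse_style_score(head, rel, tail, [-1.0, 2.0], bias)
```

The point of the sketch is that the head embedding seen by the scorer depends on the relation it appears with, which is what distinguishes an interaction embedding from a single general embedding.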

Journal Article

Combining Graph-Based Learning With Automated Data Collection for Code Vulnerability Detection

Huanting Wang¹, Guixin Ye¹, Zhanyong Tang¹, Shin Hwei Tan², Songfang Huang³, Dingyi Fang¹, Yansong Feng⁴, Lizhong Bian, Zheng Wang⁵

Northwest University (China)¹, Southern University of Science and Technology², Alibaba Group³, Peking University⁴, University of Leeds⁵

01 Jan 2021-IEEE Transactions on Information Forensics and Security

TL;DR: Funded leverages the advances in graph neural networks to develop a novel graph-based learning method to capture and reason about the program’s control, data, and call dependencies to identify software vulnerabilities at the function level from program source code.

Abstract: This paper presents FUNDED (Flow-sensitive vUl-Nerability coDE Detection), a novel learning framework for building vulnerability detection models. Funded leverages the advances in graph neural networks (GNNs) to develop a novel graph-based learning method to capture and reason about the program’s control, data, and call dependencies. Unlike prior work that treats the program as a sequential sequence or an untyped graph, Funded learns and operates on a graph representation of the program source code, in which individual statements are connected to other statements through relational edges. By capturing the program syntax, semantics and flows, Funded finds better code representation for the downstream software vulnerability detection task. To provide sufficient training data to build an effective deep learning model, we combine probabilistic learning and statistical assessments to automatically gather high-quality training samples from open-source projects. This provides many real-life vulnerable code training samples to complement the limited vulnerable code samples available in standard vulnerability databases. We apply Funded to identify software vulnerabilities at the function level from program source code. We evaluate Funded on large real-world datasets with programs written in C, Java, Swift and Php, and compare it against six state-of-the-art code vulnerability detection models. Experimental results show that Funded significantly outperforms alternative approaches across evaluation settings.

90 citations

Journal Article

Learning deep semantic segmentation network under multiple weakly-supervised constraints for cross-domain remote sensing image semantic segmentation

Yansheng Li¹, Te Shi¹, Yongjun Zhang¹, Wei Chen¹, Zhibin Wang², Hao Li²

Wuhan University¹, Alibaba Group²

01 May 2021-ISPRS Journal of Photogrammetry and Remote Sensing

TL;DR: This paper presents a novel objective function with multiple weakly-supervised constraints to learn DSSN for cross-domain RS image semantic segmentation, together with a dynamic optimization strategy that adjusts the constraint weights of the objective function during training.

Abstract: Due to its wide applications, remote sensing (RS) image semantic segmentation has attracted increasing research interest in recent years. Benefiting from its hierarchical abstract ability, the deep semantic segmentation network (DSSN) has achieved tremendous success on RS image semantic segmentation and has gradually become the mainstream technology. However, the superior performance of DSSN highly depends on two conditions: (I) massive quantities of labeled training data exist; (II) the testing data seriously resemble the training data. In actual RS applications, it is difficult to fully meet these conditions due to the RS sensor variation and the distinct landscape variation in different geographic locations. To make DSSN fit the actual RS scenario, this paper exploits the cross-domain RS image semantic segmentation task, which means that DSSN is trained on one labeled dataset (i.e., the source domain) but is tested on another varied dataset (i.e., the target domain). In this setting, the performance of DSSN is inevitably very limited due to the data shift between the source and target domains. To reduce the disadvantageous influence of data shift, this paper proposes a novel objective function with multiple weakly-supervised constraints to learn DSSN for cross-domain RS image semantic segmentation. Through carefully examining the characteristics of cross-domain RS image semantic segmentation, multiple weakly-supervised constraints include the weakly-supervised transfer invariant constraint (WTIC), weakly-supervised pseudo-label constraint (WPLC) and weakly-supervised rotation consistency constraint (WRCC). Specifically, DualGAN is recommended to conduct unsupervised style transfer between the source and target domains to carry out WTIC. To make full use of the merits of multiple constraints, this paper presents a dynamic optimization strategy that dynamically adjusts the constraint weights of the objective function during the training process. 
With full consideration of the characteristics of the cross-domain RS image semantic segmentation task, this paper gives two cross-domain RS image semantic segmentation settings: (I) variation in geographic location and (II) variation in both geographic location and imaging mode. Extensive experiments demonstrate that our proposed method remarkably outperforms the state-of-the-art methods under both of these settings. The collected datasets and evaluation benchmarks have been made publicly available online ( https://github.com/te-shi/MUCSS ).

90 citations
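
The abstract above describes an objective that combines several weakly-supervised constraints (WTIC, WPLC, WRCC) with weights adjusted during training, but it does not say how the weights change. The sketch below only illustrates the general shape of such an objective. The linear ramp-up schedule, the function names, and the loss values are assumptions made here for illustration and are not taken from the paper.

```python
def constraint_weight(step, total_steps, max_weight=1.0):
    """Hypothetical schedule: ramp a constraint weight linearly from 0 to max_weight."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp progress to [0, 1] so the weight never leaves [0, max_weight].
    progress = min(max(step / total_steps, 0.0), 1.0)
    return max_weight * progress


def total_loss(supervised_loss, constraint_losses, weights):
    """Supervised source-domain loss plus a weighted sum of constraint losses."""
    weighted = sum(weights[name] * value for name, value in constraint_losses.items())
    return supervised_loss + weighted


# Made-up loss values for the three constraints named in the abstract.
losses = {"WTIC": 0.8, "WPLC": 0.5, "WRCC": 0.2}
# Halfway through training, every constraint gets weight 0.5 under this schedule.
weights = {name: constraint_weight(50, 100) for name in losses}
loss_value = total_loss(1.0, losses, weights)
```

Under this assumed schedule the constraints contribute little early in training, when pseudo-labels and style-transferred images are least reliable, and more later. A different schedule could be swapped in without changing `total_loss`.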

Proceedings Article

Unpaired Image Captioning via Scene Graph Alignments

Jiuxiang Gu¹, Shafiq Joty¹, Jianfei Cai¹, Handong Zhao², Xu Yang¹, Gang Wang³

Nanyang Technological University¹, Adobe Systems², Alibaba Group³

01 Oct 2019

TL;DR: This paper proposes an unsupervised feature alignment method that maps the scene graph features from the image to the sentence modality and can generate quite promising results without using any image-caption training pairs, outperforming existing methods by a wide margin.

Abstract: Most current image captioning models rely heavily on paired image-caption datasets. However, getting large-scale image-caption paired data is labor-intensive and time-consuming. In this paper, we present a scene graph-based approach for unpaired image captioning. Our framework comprises an image scene graph generator, a sentence scene graph generator, a scene graph encoder, and a sentence decoder. Specifically, we first train the scene graph encoder and the sentence decoder on the text modality. To align the scene graphs between images and sentences, we propose an unsupervised feature alignment method that maps the scene graph features from the image to the sentence modality. Experimental results show that our proposed model can generate quite promising results without using any image-caption training pairs, outperforming existing methods by a wide margin.

89 citations

Posted Content

MLPerf Inference Benchmark

Vijay Janapa Reddi¹, Christine Cheng², David Kanter, Peter Mattson³, Guenther Schmuelling⁴, Carole-Jean Wu⁵, Brian M. Anderson³, Maximilien Breughe⁶, Mark Charlebois⁷, William Chou⁷, Ramesh Chukka², Cody Coleman⁸, Sam Davis, Pan Deng⁹, Greg Diamos, Jared Duke³, Dave Fick, J. Scott Gardner, Itay Hubara, Sachin Satish Idgunji⁶, Thomas B. Jablin³, Jeff Jiao, Tom St. John, Pankaj Kanwar³, David Lee¹⁰, Jeffery Liao¹¹, Anton Lokhmotov, Francisco Massa⁵, Peng Meng⁹, Paulius Micikevicius⁶, Colin Osborne, Gennady Pekhimenko¹², Arun Tejusve Raghunath Rajan², Dilip Sequeira⁶, Ashish Sirasao¹³, Fei Sun⁵, Hanlin Tang², Michael Thomson¹⁴, Frank Wei¹⁵, Ephrem C. Wu¹³, Lingjie Xu, Koichi Yamada², Bing Yu¹⁰, George Yuan⁶, Aaron Zhong, Peizhao Zhang⁵, Yuchen Zhou¹⁶

Harvard University¹, Intel², Google³, Microsoft⁴, Facebook⁵, Nvidia⁶, Qualcomm⁷, Stanford University⁸, Tencent⁹, MediaTek¹⁰, Synopsys¹¹, University of Toronto¹², Xilinx¹³, Centaur Technology¹⁴, Alibaba Group¹⁵, General Motors¹⁶

06 Nov 2019-arXiv: Learning

TL;DR: This paper presents MLPerf Inference, a benchmarking method for evaluating ML inference systems with widely differing architectures. The first call for submissions garnered more than 600 reproducible inference-performance measurements from 14 organizations, representing over 30 systems that showcase a wide range of capabilities.

Abstract: Machine-learning (ML) hardware and software system demand is burgeoning. Driven by ML applications, the number of different ML inference systems has exploded. Over 100 organizations are building ML inference chips, and the systems that incorporate existing models span at least three orders of magnitude in power consumption and five orders of magnitude in performance; they range from embedded devices to data-center solutions. Fueling the hardware are a dozen or more software frameworks and libraries. The myriad combinations of ML hardware and ML software make assessing ML-system performance in an architecture-neutral, representative, and reproducible manner challenging. There is a clear need for industry-wide standard ML benchmarking and evaluation criteria. MLPerf Inference answers that call. In this paper, we present our benchmarking method for evaluating ML inference systems. Driven by more than 30 organizations as well as more than 200 ML engineers and practitioners, MLPerf prescribes a set of rules and best practices to ensure comparability across systems with wildly differing architectures. The first call for submissions garnered more than 600 reproducible inference-performance measurements from 14 organizations, representing over 30 systems that showcase a wide range of capabilities. The submissions attest to the benchmark's flexibility and adaptability.

89 citations

Authors

Showing all 6829 results

Name	H-index	Papers	Citations
Philip S. Yu	148	1914	107374
Lei Zhang	130	2312	86950
Jian Xu	94	1366	52057
Wei Chu	80	670	28771
Le Song	76	345	21382
Yuan Xie	76	739	24155
Narendra Ahuja	76	474	29517
Rong Jin	75	449	19456
Beng Chin Ooi	73	408	19174
Wotao Yin	72	303	27233
Deng Cai	70	326	24524
Xiaofei He	70	260	28215
Irwin King	67	476	19056
Gang Wang	65	373	21579
Xiaodan Liang	61	318	14121

Network Information

Related Institutions (5)

Institution	Papers	Citations	Related
Microsoft	86.9K	4.1M	94%
Google	39.8K	2.1M	94%
Facebook	10.9K	570.1K	93%
AT&T Labs	5.5K	483.1K	
	38.6K	1.3M	87%

Performance

Metrics

Papers: 7,410

Citations: 106,380

No. of papers from the Institution in previous years
Year	Papers
2023	5
2022	30
2021	1,352
2020	1,671
2019	1,459
2018	863