Institution

Alibaba Group

Company · Hangzhou, China
About: Alibaba Group is a company based in Hangzhou, China. It is known for research contributions in the topics of Computer science and Terminal (electronics). The organization has 6810 authors who have published 7389 publications receiving 55653 citations. The organization is also known as Alibaba Group Holding Limited and Alibaba Group (Cayman Islands).


Papers
Proceedings Article
01 Jun 2021
TL;DR: Wang et al. propose a Generalized Pooling Operator (GPO) that automatically adapts itself to the best pooling strategy for different features, requiring no manual tuning while staying effective and efficient.
Abstract: Visual Semantic Embedding (VSE) is a dominant approach for vision-language retrieval, which aims at learning a deep embedding space such that visual data are embedded close to their semantic text labels or descriptions. Recent VSE models use complex methods to better contextualize and aggregate multi-modal features into holistic embeddings. However, we discover that surprisingly simple (but carefully selected) global pooling functions (e.g., max pooling) outperform those complex models, across different feature extractors. Despite its simplicity and effectiveness, seeking the best pooling function for different data modality and feature extractor is costly and tedious, especially when the size of features varies (e.g., text, video). Therefore, we propose a Generalized Pooling Operator (GPO), which learns to automatically adapt itself to the best pooling strategy for different features, requiring no manual tuning while staying effective and efficient. We extend the VSE model using this proposed GPO and denote it as VSE∞. Without bells and whistles, VSE∞ outperforms previous VSE methods significantly on image-text retrieval benchmarks across popular feature extractors. With a simple adaptation, variants of VSE∞ further demonstrate its strength by achieving the new state of the art on two video-text retrieval datasets. Comprehensive experiments and visualizations confirm that GPO always discovers the best pooling strategy and can be a plug-and-play feature aggregation module for standard VSE models. Code and pre-trained models are available at http://jcchen.me/vse_infty/
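The core idea is simple enough to sketch. Below is a minimal, illustrative pooling module in PyTorch that learns a weighted combination of sorted feature values, interpolating between mean and max pooling; the class name and the plain learnable weight vector (instead of the small sequence model the paper uses to generate weights for variable-length inputs) are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleGeneralizedPooling(nn.Module):
    """Illustrative pooling that learns a weighted combination of
    sorted feature values, interpolating between mean- and max-pooling."""

    def __init__(self, max_seq_len: int):
        super().__init__()
        # One learnable score per (sorted) position; softmax turns them into
        # mixing weights. The actual GPO generates such weights with a small
        # sequence model so it can handle arbitrary sequence lengths.
        self.position_scores = nn.Parameter(torch.zeros(max_seq_len))

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # features: (batch, seq_len, dim), e.g. region or token features
        batch, seq_len, dim = features.shape
        # Sort each dimension's values over the sequence, largest first.
        sorted_feats, _ = features.sort(dim=1, descending=True)
        weights = F.softmax(self.position_scores[:seq_len], dim=0)
        # Weighted sum over sorted positions -> (batch, dim) holistic embedding.
        return (sorted_feats * weights.view(1, seq_len, 1)).sum(dim=1)

# Uniform weights recover mean pooling; a one-hot weight on the first
# sorted position recovers max pooling.
pooled = SimpleGeneralizedPooling(max_seq_len=36)(torch.randn(2, 36, 1024))
```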

37 citations

Patent
21 Sep 2016
TL;DR: In this article, a device and method for automatically allocating computing resources is described: a task is received from a client together with a resource description manifest representing the resource needs of its instances, and an initial computing resource allocation of a cluster of machines is determined from the resource needs stated in that manifest.
Abstract: A device and method for automatically allocating computing resources is disclosed herein. The method includes receiving a task from a client, the task including a plurality of instances and a resource description manifest representing resource needs of the plurality of instances; determining an initial computing resource allocation of a cluster of machines based on the resource description manifest, wherein the initial computing resource allocation is determined based on the resource needs included in the resource description manifest; determining that the resource description manifest indicates a request to utilize an actual computing resource allocation in excess of the initial computing resource allocation; configuring a plurality of actual computing resources to process the plurality of instances, wherein the plurality of actual computing resources are configured to utilize resources in excess of the initial computing resource allocation; and executing the plurality of instances using the plurality of actual computing resources.
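As a rough illustration of the claimed workflow (not the patent's actual implementation), the sketch below walks through the four steps: read a resource description manifest, compute an initial allocation, detect a request to exceed it, and execute the instances with the configured resources. All field names, the 2x burst cap, and the helper run_instance are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class ResourceManifest:
    cpu_cores: int          # requested baseline CPU per instance
    memory_gb: int          # requested baseline memory per instance
    allow_burst: bool       # whether usage may exceed the initial allocation

@dataclass
class Task:
    instances: list         # opaque work items submitted by the client
    manifest: ResourceManifest

def run_instance(instance, cores: int) -> None:
    print(f"running {instance!r} with {cores} core(s)")

def allocate_and_run(task: Task, cluster_capacity_cores: int) -> None:
    # Step 1: initial allocation derived from the manifest's stated needs.
    initial_cores = task.manifest.cpu_cores * len(task.instances)

    # Step 2: check whether the manifest asks to exceed that allocation.
    if task.manifest.allow_burst:
        # Step 3: configure actual resources above the initial allocation,
        # capped by what the cluster can provide (2x is an arbitrary cap here).
        actual_cores = min(cluster_capacity_cores, initial_cores * 2)
    else:
        actual_cores = initial_cores

    # Step 4: execute the instances with the configured resources.
    for instance in task.instances:
        run_instance(instance, cores=actual_cores // len(task.instances))

# Example: three instances, each asking for 2 cores / 4 GB, burst allowed.
allocate_and_run(
    Task(instances=["i0", "i1", "i2"],
         manifest=ResourceManifest(cpu_cores=2, memory_gb=4, allow_burst=True)),
    cluster_capacity_cores=16)
```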

37 citations

Posted Content
Dai Rui, Shenkun Xu, Qian Gu, Chenguang Ji, Kaikui Liu
TL;DR: The Hybrid Spatio-Temporal Graph Convolutional Network (H-STGCN) "deduces" future travel time by exploiting upcoming traffic volume data, taking advantage of the piecewise-linear flow-density relationship.
Abstract: Traffic forecasting has recently attracted increasing interest due to the popularity of online navigation services, ridesharing and smart city projects. Owing to the non-stationary nature of road traffic, forecasting accuracy is fundamentally limited by the lack of contextual information. To address this issue, we propose the Hybrid Spatio-Temporal Graph Convolutional Network (H-STGCN), which is able to "deduce" future travel time by exploiting the data of upcoming traffic volume. Specifically, we propose an algorithm to acquire the upcoming traffic volume from an online navigation engine. Taking advantage of the piecewise-linear flow-density relationship, a novel transformer structure converts the upcoming volume into its equivalent in travel time. We combine this signal with the commonly-utilized travel-time signal, and then apply graph convolution to capture the spatial dependency. Particularly, we construct a compound adjacency matrix which reflects the innate traffic proximity. We conduct extensive experiments on real-world datasets. The results show that H-STGCN remarkably outperforms state-of-the-art methods in various metrics, especially for the prediction of non-recurring congestion.
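To make the two key ingredients concrete, here is a hedged sketch: a piecewise-linear mapping from forecast volume to an equivalent travel time, followed by one normalized graph-convolution step over road segments. The breakpoints, slopes, and toy adjacency are illustrative assumptions; the paper's transformer-based conversion and compound adjacency construction are more elaborate.

```python
import numpy as np

def volume_to_travel_time(volume, free_flow_tt, breakpoints, slopes):
    """Map forecast traffic volume to an equivalent travel time using a
    piecewise-linear relationship (illustrative breakpoints and slopes)."""
    tt = np.full_like(volume, free_flow_tt, dtype=float)
    for bp, slope in zip(breakpoints, slopes):
        tt += slope * np.clip(volume - bp, 0.0, None)  # add a kink at each breakpoint
    return tt

def graph_convolution(adjacency, features, weight):
    """One symmetric-normalised graph-convolution step over road segments."""
    deg = adjacency.sum(axis=1)
    norm = adjacency / np.sqrt(np.outer(deg, deg))
    return np.maximum(norm @ features @ weight, 0.0)   # ReLU

# Toy example: 3 road segments, upcoming volume per segment.
volume = np.array([20.0, 55.0, 90.0])
equiv_tt = volume_to_travel_time(volume, free_flow_tt=60.0,
                                 breakpoints=[40.0, 80.0], slopes=[1.5, 3.0])
adjacency = np.eye(3) + np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
# Stack the volume-derived signal with an observed travel-time signal.
signals = np.stack([equiv_tt, np.array([62.0, 95.0, 240.0])], axis=1)
hidden = graph_convolution(adjacency, signals, weight=np.random.randn(2, 4))
```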

36 citations

Posted Content
TL;DR: This paper designs a novel classifier determinacy disparity (CDD) metric, which formulates classifier discrepancy as the class relevance of distinct target predictions and implicitly introduces a constraint on the target feature discriminability.
Abstract: Unsupervised domain adaptation challenges the problem of transferring knowledge from a well-labelled source domain to an unlabelled target domain. Recently, adversarial learning with bi-classifier has been proven effective in pushing cross-domain distributions close. Prior approaches typically leverage the disagreement between bi-classifier to learn transferable representations, however, they often neglect the classifier determinacy in the target domain, which could result in a lack of feature discriminability. In this paper, we present a simple yet effective method, namely Bi-Classifier Determinacy Maximization (BCDM), to tackle this problem. Motivated by the observation that target samples cannot always be separated distinctly by the decision boundary, here in the proposed BCDM, we design a novel classifier determinacy disparity (CDD) metric, which formulates classifier discrepancy as the class relevance of distinct target predictions and implicitly introduces constraint on the target feature discriminability. To this end, the BCDM can generate discriminative representations by encouraging target predictive outputs to be consistent and determined, meanwhile, preserve the diversity of predictions in an adversarial manner. Furthermore, the properties of CDD as well as the theoretical guarantees of BCDM's generalization bound are both elaborated. Extensive experiments show that BCDM compares favorably against the existing state-of-the-art domain adaptation methods.
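A rough sketch of this kind of bi-classifier determinacy objective is given below: it penalizes the probability mass that the two classifier heads place on different classes (the off-diagonal mass of their per-sample joint prediction). The exact CDD formulation and the adversarial training schedule in the paper differ in detail; the function and variable names here are illustrative.

```python
import torch
import torch.nn.functional as F

def determinacy_disparity(logits_a: torch.Tensor, logits_b: torch.Tensor) -> torch.Tensor:
    """Rough sketch of a bi-classifier determinacy-style loss: penalise
    probability mass the two classifier heads place on *different* classes.

    logits_a, logits_b: (batch, num_classes) outputs of the two heads
    on unlabelled target samples.
    """
    p_a = F.softmax(logits_a, dim=1)
    p_b = F.softmax(logits_b, dim=1)
    # Per-sample outer product of the two predictive distributions.
    joint = torch.bmm(p_a.unsqueeze(2), p_b.unsqueeze(1))   # (batch, C, C)
    agreement = joint.diagonal(dim1=1, dim2=2).sum(dim=1)   # mass on the same class
    disparity = 1.0 - agreement                             # mass spread across classes
    return disparity.mean()

# Minimising this on target data pushes both heads toward consistent,
# confident (determined) predictions; the adversarial step in such methods
# instead maximises it w.r.t. the classifiers to probe ambiguous samples.
loss = determinacy_disparity(torch.randn(8, 31), torch.randn(8, 31))
```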

36 citations

Journal Article
TL;DR: A new integration framework of texture and color information for background modeling is proposed, in which the foreground decision equation includes three parts (one part for color information, one part for texture information, and the remaining part for the integration of color and texture information).
Abstract: The detection of moving objects in videos is very important in many video processing applications, and background modeling is often an indispensable process to achieve this goal. Most of the traditional background modeling methods utilize color or texture information. However, color information is sensitive to illumination variations and texture information cannot be utilized to separate smooth foreground from smooth background in most cases. Achieving good performance in terms of high foreground detection accuracy and low computational cost is also challenging. In this paper, we propose a new integration framework of texture and color information for background modeling, in which the foreground decision equation includes three parts (one part for color information, one part for texture information, and the remaining part for the integration of color and texture information). This framework is able to combine the advantages of texture and color features while inhibiting their disadvantages as well. Moreover, we propose a block-based method to accelerate the background modeling. In particular, in the texture information modeling process, a single histogram model is established for each block whose bins indicate the occurrence probabilities of different patterns, which is different from the traditional multihistogram model for block-based background modeling, and then dominant background patterns are selected to calculate the background likelihood of newly arriving blocks. Dynamic background and multimodal problems can be handled through this technique. To evaluate the foreground detection performance reasonably, a new quality measure is proposed. Extensive experiments on various challenging videos validate the effectiveness of the proposed method over state-of-the-art methods.
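The block-based texture model can be sketched as follows: each block keeps a single running histogram of texture-pattern occurrences, and the background likelihood of an incoming block is the fraction of its patterns that fall among the dominant (most frequent) ones. The toy 4-neighbour LBP descriptor, learning rate, and top-k selection below are illustrative assumptions rather than the paper's exact design.

```python
import numpy as np

def lbp_codes(gray_block: np.ndarray) -> np.ndarray:
    """Toy 4-neighbour local binary pattern codes for one image block
    (a stand-in for whichever texture descriptor the method uses)."""
    c = gray_block[1:-1, 1:-1]
    bits = [
        gray_block[:-2, 1:-1] >= c,   # up
        gray_block[2:, 1:-1] >= c,    # down
        gray_block[1:-1, :-2] >= c,   # left
        gray_block[1:-1, 2:] >= c,    # right
    ]
    codes = np.zeros_like(c, dtype=np.uint8)
    for i, b in enumerate(bits):
        codes |= (b.astype(np.uint8) << i)
    return codes

class BlockHistogramModel:
    """Single running histogram of pattern occurrences per block; the most
    frequent patterns are treated as background."""

    def __init__(self, num_patterns: int = 16, lr: float = 0.05, top_k: int = 4):
        self.hist = np.full(num_patterns, 1.0 / num_patterns)
        self.lr, self.top_k = lr, top_k

    def update(self, block: np.ndarray) -> None:
        counts = np.bincount(lbp_codes(block).ravel(), minlength=len(self.hist))
        self.hist = (1 - self.lr) * self.hist + self.lr * counts / counts.sum()

    def background_likelihood(self, block: np.ndarray) -> float:
        dominant = set(np.argsort(self.hist)[-self.top_k:])
        codes = lbp_codes(block).ravel()
        return float(np.isin(codes, list(dominant)).mean())
```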

36 citations


Authors


Name | H-index | Papers | Citations
Philip S. Yu | 148 | 1914 | 107374
Lei Zhang | 130 | 2312 | 86950
Jian Xu | 94 | 1366 | 52057
Wei Chu | 80 | 670 | 28771
Le Song | 76 | 345 | 21382
Yuan Xie | 76 | 739 | 24155
Narendra Ahuja | 76 | 474 | 29517
Rong Jin | 75 | 449 | 19456
Beng Chin Ooi | 73 | 408 | 19174
Wotao Yin | 72 | 303 | 27233
Deng Cai | 70 | 326 | 24524
Xiaofei He | 70 | 260 | 28215
Irwin King | 67 | 476 | 19056
Gang Wang | 65 | 373 | 21579
Xiaodan Liang | 61 | 318 | 14121
Network Information
Related Institutions (5)
Microsoft
86.9K papers, 4.1M citations

94% related

Google
39.8K papers, 2.1M citations

94% related

Facebook
10.9K papers, 570.1K citations

93% related

AT&T Labs
5.5K papers, 483.1K citations

90% related

Performance Metrics
No. of papers from the Institution in previous years
Year | Papers
2023 | 5
2022 | 30
2021 | 1,352
2020 | 1,671
2019 | 1,459
2018 | 863