Showing papers on "Matching (statistics)" published in 2019


Journal ArticleDOI
TL;DR: It is shown that propensity score matching (PSM), an enormously popular method of preprocessing data for causal inference, often accomplishes the opposite of its intended goal—thus increasing imbalance, inefficiency, model dependence, and bias.
Abstract: We show that propensity score matching (PSM), an enormously popular method of preprocessing data for causal inference, often accomplishes the opposite of its intended goal—thus increasing imbalance, inefficiency, model dependence, and bias. The weakness of PSM comes from its attempts to approximate a completely randomized experiment, rather than, as with other matching methods, a more efficient fully blocked randomized experiment. PSM is thus uniquely blind to the often large portion of imbalance that can be eliminated by approximating full blocking with other matching methods. Moreover, in data balanced enough to approximate complete randomization, either to begin with or after pruning some observations, PSM approximates random matching which, we show, increases imbalance even relative to the original data. Although these results suggest researchers replace PSM with one of the other available matching methods, propensity scores have other productive uses.
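
For concreteness, here is a minimal sketch (not from the paper) of the 1:1 nearest-neighbor PSM procedure being critiqued; the variable names are illustrative and the greedy matching is the simplest possible variant:

```python
# Minimal 1:1 greedy propensity score matching sketch (illustrative only).
# Assumes at least as many control units as treated units.
import numpy as np
from sklearn.linear_model import LogisticRegression

def psm_match(X, treated):
    """Match each treated unit to the control with the closest propensity score."""
    ps = LogisticRegression(max_iter=1000).fit(X, treated).predict_proba(X)[:, 1]
    t_idx = np.where(treated == 1)[0]
    c_idx = np.where(treated == 0)[0]
    pairs, available = [], set(c_idx)
    for i in t_idx:
        c = min(available, key=lambda c: abs(ps[i] - ps[c]))  # nearest control on the score
        pairs.append((i, c))
        available.remove(c)                                   # match without replacement
    return pairs, ps
```

The paper's argument is that pruning on this one-dimensional score discards covariate information that direct matching methods, which approximate full blocking, retain.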

991 citations


Proceedings ArticleDOI
01 Oct 2019
TL;DR: A new deep learning approach, Moment Matching for Multi-Source Domain Adaptation (M3SDA), which aims to transfer knowledge learned from multiple labeled source domains to an unlabeled target domain by dynamically aligning moments of their feature distributions.
Abstract: Conventional unsupervised domain adaptation (UDA) assumes that training data are sampled from a single domain. This neglects the more practical scenario where training data are collected from multiple sources, requiring multi-source domain adaptation. We make three major contributions towards addressing this problem. First, we collect and annotate by far the largest UDA dataset, called DomainNet, which contains six domains and about 0.6 million images distributed among 345 categories, addressing the gap in data availability for multi-source UDA research. Second, we propose a new deep learning approach, Moment Matching for Multi-Source Domain Adaptation (M3SDA), which aims to transfer knowledge learned from multiple labeled source domains to an unlabeled target domain by dynamically aligning moments of their feature distributions. Third, we provide new theoretical insights specifically for moment matching approaches in both single and multiple source domain adaptation. Extensive experiments are conducted to demonstrate the power of our new dataset in benchmarking state-of-the-art multi-source domain adaptation methods, as well as the advantage of our proposed model. Dataset and Code are available at http://ai.bu.edu/M3SDA/
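
As a hedged illustration of the moment-alignment idea (not the authors' exact objective), one can penalize differences between the first k feature moments across every pair of domains:

```python
# Sketch of a moment-matching alignment loss in the spirit of M3SDA.
# feats: list of [n_i, d] feature tensors, one per domain (sources + target).
import torch

def moment_distance(feats, k_moments=2):
    loss = feats[0].new_zeros(())
    for k in range(1, k_moments + 1):
        moments = [(f ** k).mean(dim=0) for f in feats]   # k-th moment per domain
        for i in range(len(moments)):
            for j in range(i + 1, len(moments)):
                loss = loss + (moments[i] - moments[j]).norm(p=2)
    return loss
```

Minimizing such a term alongside the source classification losses pulls the domains' feature distributions together during training.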

597 citations


Proceedings ArticleDOI
15 Jun 2019
TL;DR: In this article, a semi-global aggregation layer and a local guided aggregation layer are proposed to capture whole-image and local cost dependencies, respectively; they can replace the widely used 3D convolutional layer, which is computationally costly and memory-consuming.
Abstract: In the stereo matching task, matching cost aggregation is crucial in both traditional methods and deep neural network models in order to accurately estimate disparities. We propose two novel neural net layers, aimed at capturing whole-image and local cost dependencies, respectively. The first is a semi-global aggregation layer, which is a differentiable approximation of semi-global matching; the second is a local guided aggregation layer, which follows a traditional cost filtering strategy to refine thin structures. These two layers can be used to replace the widely used 3D convolutional layer, which is computationally costly and memory-consuming as it has cubic computational/memory complexity. In the experiments, we show that nets with a two-layer guided aggregation block easily outperform the state-of-the-art GC-Net, which has nineteen 3D convolutional layers. We also train a deep guided aggregation network (GA-Net) which achieves better accuracies than state-of-the-art methods on both the Scene Flow dataset and the KITTI benchmarks.
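
For background, the SGA layer is a differentiable approximation of the classical semi-global matching recurrence; a numpy sketch of that recurrence along one scan direction (with the usual small/large disparity-change penalties P1/P2) looks like this:

```python
# Classical one-direction SGM cost aggregation (background, not the SGA layer itself).
import numpy as np

def sgm_aggregate_left_to_right(cost, P1=1.0, P2=8.0):
    """cost: [H, W, D] matching cost volume; returns aggregated costs."""
    L = cost.astype(float).copy()
    H, W, D = L.shape
    for x in range(1, W):
        prev = L[:, x - 1, :]                        # already-aggregated column
        min_prev = prev.min(axis=1, keepdims=True)
        plus = np.pad(prev, ((0, 0), (1, 0)), constant_values=np.inf)[:, :D] + P1
        minus = np.pad(prev, ((0, 0), (0, 1)), constant_values=np.inf)[:, 1:] + P1
        L[:, x, :] += np.minimum(np.minimum(prev, min_prev + P2),
                                 np.minimum(plus, minus)) - min_prev
    return L
```

Roughly speaking, the SGA layer makes this hard-minimum recurrence differentiable so the whole network can be trained end to end.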

503 citations


Journal ArticleDOI
TL;DR: The authors' method can accomplish mismatch removal from thousands of putative correspondences in only a few milliseconds, and achieves better or favorably competitive accuracy while cutting time cost by more than two orders of magnitude.

Abstract: Seeking reliable correspondences between two feature sets is a fundamental and important task in computer vision. This paper attempts to remove mismatches from given putative image feature correspondences. To achieve this goal, an efficient approach, termed locality preserving matching (LPM), is designed; its principle is to maintain the local neighborhood structures of the potential true matches. We formulate the problem as a mathematical model and derive a closed-form solution with linearithmic time and linear space complexity. Our method can accomplish mismatch removal from thousands of putative correspondences in only a few milliseconds. To demonstrate the generality of our strategy for handling image matching problems, extensive experiments on various real image pairs are conducted for general feature matching, as well as for point set registration, visual homing, and near-duplicate image retrieval. Compared with other state-of-the-art alternatives, our LPM achieves better or favorably competitive accuracy while cutting time cost by more than two orders of magnitude.
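
A much-simplified neighborhood-consistency filter conveys the principle (true matches preserve their local neighborhoods); the paper itself derives a closed-form solution rather than this greedy check:

```python
# Simplified locality-preserving mismatch filter (for intuition only).
import numpy as np
from sklearn.neighbors import NearestNeighbors

def filter_matches(X, Y, k=6, tau=0.5):
    """X, Y: [N, 2] coordinates of putatively matched keypoints; returns inlier mask."""
    _, ix = NearestNeighbors(n_neighbors=k + 1).fit(X).kneighbors(X)
    _, iy = NearestNeighbors(n_neighbors=k + 1).fit(Y).kneighbors(Y)
    keep = np.empty(len(X), dtype=bool)
    for i in range(len(X)):
        # Fraction of i's neighbors in image 1 that remain neighbors in image 2.
        shared = np.intersect1d(ix[i, 1:], iy[i, 1:])
        keep[i] = len(shared) / k >= tau
    return keep
```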

416 citations


Journal ArticleDOI
TL;DR: In this article, two-branch neural networks are proposed to learn the similarity between image and text modalities for image-sentence matching and visual grounding, achieving high accuracies for phrase localization on the Flickr30K Entities dataset and for bi-directional image-sentence retrieval on the Flickr30K and MSCOCO datasets.
Abstract: Image-language matching tasks have recently attracted a lot of attention in the computer vision field. These tasks include image-sentence matching, i.e., given an image query, retrieving relevant sentences and vice versa, and region-phrase matching or visual grounding, i.e., matching a phrase to relevant regions. This paper investigates two-branch neural networks for learning the similarity between these two data modalities. We propose two network structures that produce different output representations. The first one, referred to as an embedding network, learns an explicit shared latent embedding space with a maximum-margin ranking loss and novel neighborhood constraints. Compared to standard triplet sampling, we perform improved neighborhood sampling that takes neighborhood information into consideration while constructing mini-batches. The second network structure, referred to as a similarity network, fuses the two branches via element-wise product and is trained with regression loss to directly predict a similarity score. Extensive experiments show that our networks achieve high accuracies for phrase localization on the Flickr30K Entities dataset and for bi-directional image-sentence retrieval on Flickr30K and MSCOCO datasets.
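
The embedding branch is trained with a bidirectional max-margin ranking loss; a minimal sketch (omitting the paper's neighborhood constraints and sampling scheme) is:

```python
# Bidirectional max-margin ranking loss over in-batch negatives (sketch).
import torch

def ranking_loss(img, txt, margin=0.2):
    """img, txt: [B, d] L2-normalized embeddings of matched image-text pairs."""
    sim = img @ txt.t()                         # [B, B] similarity matrix
    pos = sim.diag().unsqueeze(1)               # similarities of true pairs
    mask = torch.eye(sim.size(0), dtype=torch.bool, device=sim.device)
    cost_t = (margin + sim - pos).clamp(min=0).masked_fill(mask, 0)      # image -> wrong text
    cost_i = (margin + sim - pos.t()).clamp(min=0).masked_fill(mask, 0)  # text -> wrong image
    return cost_t.mean() + cost_i.mean()
```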

391 citations


Proceedings ArticleDOI
15 Jun 2019
TL;DR: In this paper, a Reinforced Cross-Modal Matching (RCM) approach is proposed that enforces cross-modal grounding both locally and globally via reinforcement learning (RL), where a matching critic provides an intrinsic reward to encourage global matching between instructions and trajectories.
Abstract: Vision-language navigation (VLN) is the task of navigating an embodied agent to carry out natural language instructions inside real 3D environments. In this paper, we study how to address three critical challenges for this task: the cross-modal grounding, the ill-posed feedback, and the generalization problems. First, we propose a novel Reinforced Cross-Modal Matching (RCM) approach that enforces cross-modal grounding both locally and globally via reinforcement learning (RL). Particularly, a matching critic is used to provide an intrinsic reward to encourage global matching between instructions and trajectories, and a reasoning navigator is employed to perform cross-modal grounding in the local visual scene. Evaluation on a VLN benchmark dataset shows that our RCM model significantly outperforms previous methods by 10% on SPL and achieves the new state-of-the-art performance. To improve the generalizability of the learned policy, we further introduce a Self-Supervised Imitation Learning (SIL) method to explore unseen environments by imitating its own past good decisions. We demonstrate that SIL can approximate a better and more efficient policy, which substantially reduces the success rate gap between seen and unseen environments (from 30.7% to 11.7%).

331 citations


Journal ArticleDOI
TL;DR: This paper proposes an efficient incentive mechanism based on contract theoretical modeling to minimize the network delay from a contract-matching integration perspective and demonstrates that significant performance improvement can be achieved by the proposed scheme.
Abstract: Vehicular fog computing (VFC) has emerged as a promising solution to relieve the overload on the base station and reduce the processing delay during the peak time. The computation tasks can be offloaded from the base station to vehicular fog nodes by leveraging the under-utilized computation resources of nearby vehicles. However, the wide-area deployment of VFC still confronts several critical challenges, such as the lack of efficient incentive and task assignment mechanisms. In this paper, we address the above challenges and provide a solution to minimize the network delay from a contract-matching integration perspective. First, we propose an efficient incentive mechanism based on contract theoretical modeling. The contract is tailored for the unique characteristic of each vehicle type to maximize the expected utility of the base station. Next, we transform the task assignment problem into a two-sided matching problem between vehicles and user equipment. The formulated problem is solved by a pricing-based stable matching algorithm, which iteratively carries out the “propose” and “price-rising” procedures to derive a stable matching based on the dynamically updated preference lists. Finally, numerical results demonstrate that significant performance improvement can be achieved by the proposed scheme.
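
For background, the classical deferred-acceptance routine below produces a stable two-sided matching from fixed preference lists; the paper's algorithm augments this idea with a price-rising step over dynamically updated preferences:

```python
# Gale-Shapley deferred acceptance (background sketch): user equipment proposes,
# vehicles tentatively accept. Assumes complete preference lists of equal size.
def gale_shapley(ue_prefs, veh_prefs):
    rank = {v: {u: r for r, u in enumerate(p)} for v, p in veh_prefs.items()}
    engaged = {}                          # vehicle -> user equipment
    nxt = {u: 0 for u in ue_prefs}        # next vehicle each UE will propose to
    free = list(ue_prefs)
    while free:
        u = free.pop()
        v = ue_prefs[u][nxt[u]]
        nxt[u] += 1
        if v not in engaged:
            engaged[v] = u
        elif rank[v][u] < rank[v][engaged[v]]:
            free.append(engaged[v])       # vehicle upgrades; old partner is free again
            engaged[v] = u
        else:
            free.append(u)                # rejected; u proposes further down its list
    return {u: v for v, u in engaged.items()}
```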

263 citations


Proceedings ArticleDOI
15 Jun 2019
TL;DR: Hierarchical Discrete Distribution Decomposition (HD^3), a framework suitable for learning probabilistic pixel correspondences in both optical flow and stereo matching, is proposed and achieves state-of-the-art results.
Abstract: Explicit representations of the global match distributions of pixel-wise correspondences between pairs of images are desirable for uncertainty estimation and downstream applications. However, the computation of the match density for each pixel may be prohibitively expensive due to the large number of candidates. In this paper, we propose Hierarchical Discrete Distribution Decomposition (HD^3), a framework suitable for learning probabilistic pixel correspondences in both optical flow and stereo matching. We decompose the full match density into multiple scales hierarchically, and estimate the local matching distributions at each scale conditioned on the matching and warping at coarser scales. The local distributions can then be composed together to form the global match density. Despite its simplicity, our probabilistic method achieves state-of-the-art results for both optical flow and stereo matching on established benchmarks. We also find the estimated uncertainty is a good indication of the reliability of the predicted correspondences.

243 citations


Journal ArticleDOI
TL;DR: An overview of data-driven methods for fault detection, variable prediction, and advanced control of wastewater treatment plants (WWTPs) is provided, and practical guidance is given for matching a desired goal with a particular methodology, along with considerations regarding the assumed data structure.

200 citations


Journal ArticleDOI
TL;DR: This work proposes an efficient and privacy-preserving carpooling scheme using blockchain-assisted vehicular fog computing to support conditional privacy, one-to-many matching, destination matching, and data auditability, and authenticates users in a conditionally anonymous way.
Abstract: Carpooling enables passengers to share a vehicle to reduce traveling time, vehicle carbon emissions, and traffic congestion. However, the majority of passengers prefer to find local drivers, yet querying a remote cloud server leads to unnecessary communication overhead and increased response delay. Recently, fog computing has been introduced to provide local data processing with low latency, but it also raises new security and privacy concerns, because users’ private information (e.g., identity and location) could be disclosed when this information is shared during carpooling. While such information can be encrypted before transmission, encryption makes user matching a challenging task, and malicious users can upload false locations. Moreover, carpooling records should be kept in a distributed manner to guarantee reliable data auditability. To address these problems, we propose an efficient and privacy-preserving carpooling scheme using blockchain-assisted vehicular fog computing to support conditional privacy, one-to-many matching, destination matching, and data auditability. Specifically, we authenticate users in a conditionally anonymous way. Also, we adopt a private proximity test to achieve one-to-many proximity matching and extend it to efficiently establish a secret communication key between a passenger and a driver. We store all location grids in a tree and achieve get-off location matching using a range query technique. A private blockchain is built to store carpooling records. Finally, we analyze the security and privacy properties of the proposed scheme and evaluate its performance in terms of computational costs and communication overhead.

181 citations


Journal ArticleDOI
TL;DR: A new nonparametric matching framework is introduced that elucidates how various unit fixed effects models implicitly compare treated and control observations to draw causal inference and enables a diverse set of identification strategies to adjust for unobservables in the absence of dynamic causal relationships between treatment and outcome variables.
Abstract: Many researchers use unit fixed effects regression models as their default methods for causal inference with longitudinal data. We show that the ability of these models to adjust for unobserved time‐invariant confounders comes at the expense of dynamic causal relationships, which are permitted under an alternative selection‐on‐observables approach. Using the nonparametric directed acyclic graph, we highlight two key causal identification assumptions of unit fixed effects models: Past treatments do not directly influence current outcome, and past outcomes do not affect current treatment. Furthermore, we introduce a new nonparametric matching framework that elucidates how various unit fixed effects models implicitly compare treated and control observations to draw causal inference. By establishing the equivalence between matching and weighted unit fixed effects estimators, this framework enables a diverse set of identification strategies to adjust for unobservables in the absence of dynamic causal relationships between treatment and outcome variables. We illustrate the proposed methodology through its application to the estimation of GATT membership effects on dyadic trade volume.
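
As a reference point for what the paper analyzes, the standard one-way unit fixed effects estimator reduces to OLS on within-unit demeaned data; a small pandas sketch with illustrative column names:

```python
# Within-unit demeaning form of the unit fixed effects estimator (sketch).
import pandas as pd

def unit_fe_estimate(df, unit="unit", treat="D", outcome="Y"):
    """Slope from regressing demeaned outcome on demeaned treatment."""
    g = df.groupby(unit)
    d = df[treat] - g[treat].transform("mean")      # within-unit treatment variation
    y = df[outcome] - g[outcome].transform("mean")  # within-unit outcome variation
    return (d * y).sum() / (d * d).sum()            # no-intercept OLS slope
```

The paper's contribution is to show which treated-control comparisons this estimator implicitly makes, by expressing it as a weighted matching estimator.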

Proceedings ArticleDOI
15 Jun 2019
TL;DR: A recurrent neural network-based reinforcement learning model is proposed which selectively observes a sequence of frames and associates the given sentence with video content in a matching-based manner; the method is extended to a semantic matching reinforcement learning (SM-RL) model by extracting semantic concepts of videos and then fusing them with global context features.
Abstract: Current studies on action detection in untrimmed videos are mostly designed for action classes, where an action is described at the word level, such as jumping, tumbling, swing, etc. This paper focuses on the rarely investigated problem of localizing an activity via a sentence query, which is more challenging and practical. Considering that current methods are generally time-consuming due to their dense frame-processing manner, we propose a recurrent neural network-based reinforcement learning model which selectively observes a sequence of frames and associates the given sentence with video content in a matching-based manner. However, directly matching sentences with video content performs poorly due to the large visual-semantic discrepancy. Thus, we extend the method to a semantic matching reinforcement learning (SM-RL) model by extracting semantic concepts of videos and then fusing them with global context features. Extensive experiments on three benchmark datasets, TACoS, Charades-STA and DiDeMo, show that our method achieves state-of-the-art performance with a high detection speed, demonstrating both the effectiveness and efficiency of our method.

Proceedings ArticleDOI
16 Aug 2019
TL;DR: The PubLayNet dataset for document layout analysis is developed by automatically matching the XML representations and the content of over 1 million PDF articles that are publicly available on PubMed Central; experiments demonstrate that deep neural networks trained on PubLayNet accurately recognize the layout of scientific articles.
Abstract: Recognizing the layout of unstructured digital documents is an important step when parsing the documents into structured machine-readable format for downstream applications. Deep neural networks that are developed for computer vision have been proven to be an effective method to analyze layout of document images. However, document layout datasets that are currently publicly available are several orders of magnitude smaller than established computer vision datasets. Models have to be trained by transfer learning from a base model that is pre-trained on a traditional computer vision dataset. In this paper, we develop the PubLayNet dataset for document layout analysis by automatically matching the XML representations and the content of over 1 million PDF articles that are publicly available on PubMed Central. The size of the dataset is comparable to established computer vision datasets, containing over 360 thousand document images, where typical document layout elements are annotated. The experiments demonstrate that deep neural networks trained on PubLayNet accurately recognize the layout of scientific articles. The pre-trained models are also a more effective base model for transfer learning on a different document domain. We release the dataset (https://github.com/ibm-aur-nlp/PubLayNet) to support development and evaluation of more advanced models for document layout analysis.

Posted Content
TL;DR: In this paper, sliced score matching is used to learn deep score estimators for implicit distributions, which can be used for variational inference with implicit distributions and training Wasserstein Auto-Encoders.
Abstract: Score matching is a popular method for estimating unnormalized statistical models. However, it has been so far limited to simple, shallow models or low-dimensional data, due to the difficulty of computing the Hessian of log-density functions. We show this difficulty can be mitigated by projecting the scores onto random vectors before comparing them. This objective, called sliced score matching, only involves Hessian-vector products, which can be easily implemented using reverse-mode automatic differentiation. Therefore, sliced score matching is amenable to more complex models and higher dimensional data compared to score matching. Theoretically, we prove the consistency and asymptotic normality of sliced score matching estimators. Moreover, we demonstrate that sliced score matching can be used to learn deep score estimators for implicit distributions. In our experiments, we show sliced score matching can learn deep energy-based models effectively, and can produce accurate score estimates for applications such as variational inference with implicit distributions and training Wasserstein Auto-Encoders.
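
The computational core, sketched in PyTorch under assumed shapes: the Hessian trace in the score matching objective is replaced by Hessian-vector products along random projection directions, which reverse-mode autodiff provides cheaply:

```python
# Sliced score matching loss (sketch). score_net maps [B, d] -> [B, d].
import torch

def ssm_loss(score_net, x, n_slices=1):
    x = x.detach().requires_grad_(True)
    s = score_net(x)                                 # model score estimate
    loss = x.new_zeros(())
    for _ in range(n_slices):
        v = torch.randn_like(x)                      # random slicing directions
        hv = torch.autograd.grad((s * v).sum(), x, create_graph=True)[0]
        loss = loss + ((v * hv).sum(dim=1)           # v^T (ds/dx) v via one HVP
                       + 0.5 * (s * v).sum(dim=1) ** 2).mean()
    return loss / n_slices
```

Minimizing this over the parameters of score_net recovers the score matching objective in expectation over the random projections.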

Proceedings ArticleDOI
Chongyang Tao, Wei Wu, Can Xu, Wenpeng Hu, Dongyan Zhao, Rui Yan
01 Jul 2019
TL;DR: Evaluation results on three benchmark data sets indicate that IoI can significantly outperform state-of-the-art methods in terms of various matching metrics and unveil how the depth of interaction affects the performance of IoI.
Abstract: Currently, researchers have paid great attention to open-domain retrieval-based dialogues. In particular, people study the problem by investigating context-response matching for multi-turn response selection based on publicly recognized benchmark data sets. State-of-the-art methods require a response to interact with each utterance in a context from the beginning, but the interaction is performed in a shallow way. In this work, we let utterance-response interaction go deep by proposing an interaction-over-interaction network (IoI). The model performs matching by stacking multiple interaction blocks in which residual information from one round of interaction initiates another round of interaction. Thus, matching information within an utterance-response pair is extracted from the interaction of the pair in an iterative fashion, and the information flows along the chain of the blocks via representations. Evaluation results on three benchmark data sets indicate that IoI can significantly outperform state-of-the-art methods in terms of various matching metrics. Through further analysis, we also unveil how the depth of interaction affects the performance of IoI.

Journal ArticleDOI
TL;DR: The proposed matching framework has been evaluated using many different types of multimodal images, and the results demonstrate its superior matching performance with respect to the state-of-the-art methods.
Abstract: While image matching has been studied in the remote sensing community for decades, matching multimodal data [e.g., optical, light detection and ranging (LiDAR), synthetic aperture radar (SAR), and map] remains a challenging problem because of significant nonlinear intensity differences between such data. To address this problem, we present a novel fast and robust template matching framework integrating local descriptors for multimodal images. First, a local descriptor [such as the histogram of oriented gradients (HOG), local self-similarity (LSS), or speeded-up robust features (SURF)] is extracted at each pixel to form a pixelwise feature representation of an image. Then, we define a fast similarity measure based on the feature representation using the fast Fourier transform (FFT) in the frequency domain. A template matching strategy is employed to detect correspondences between images. In this procedure, we also propose a novel pixelwise feature representation using orientated gradients of images, which is named channel features of orientated gradients (CFOG). This novel feature is an extension of the pixelwise HOG descriptor with superior performance in image matching and computational efficiency. The major advantages of the proposed matching framework include: 1) structural similarity representation using the pixelwise feature description and 2) high computational efficiency due to the use of FFT. The proposed matching framework has been evaluated using many different types of multimodal images, and the results demonstrate its superior matching performance with respect to the state-of-the-art methods.
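
The speed of such a framework comes from evaluating template similarity in the frequency domain; here is a single-channel numpy sketch of the FFT correlation trick (a CFOG-style similarity would sum such terms over feature channels):

```python
# Correlation of a template against all placements in a search image via FFT.
import numpy as np

def fft_correlation(search, template):
    """search: [H, W] float array; template: [h, w]; returns [H-h+1, W-w+1] scores."""
    H, W = search.shape
    h, w = template.shape
    F_s = np.fft.rfft2(search)
    # Correlation = convolution with the flipped template, zero-padded to [H, W].
    F_t = np.fft.rfft2(template[::-1, ::-1].copy(), s=(H, W))
    corr = np.fft.irfft2(F_s * F_t, s=(H, W))
    return corr[h - 1:, w - 1:]                      # keep valid placements only
```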

Proceedings ArticleDOI
25 Jul 2019
TL;DR: Sherlock is introduced, a multi-input deep neural network for detecting semantic types that achieves a support-weighted F1 score of 0.89, exceeding that of machine learning baselines, dictionary and regular expression benchmarks, and the consensus of crowdsourced annotations.

Abstract: Correctly detecting the semantic type of data columns is crucial for data science tasks such as automated data cleaning, schema matching, and data discovery. Existing data preparation and analysis systems rely on dictionary lookups and regular expression matching to detect semantic types. However, these matching-based approaches often are not robust to dirty data and only detect a limited number of types. We introduce Sherlock, a multi-input deep neural network for detecting semantic types. We train Sherlock on 686,765 data columns retrieved from the VizNet corpus by matching 78 semantic types from DBpedia to column headers. We characterize each matched column with 1,588 features describing the statistical properties, character distributions, word embeddings, and paragraph vectors of column values. Sherlock achieves a support-weighted F1 score of 0.89, exceeding that of machine learning baselines, dictionary and regular expression benchmarks, and the consensus of crowdsourced annotations.

Journal ArticleDOI
TL;DR: The detailed analysis of different V-reID methods in terms of mean average precision (mAP) and the cumulative matching curve (CMC) provides objective insight into the strengths and weaknesses of these methods.

Proceedings ArticleDOI
Chongyang Tao, Wei Wu, Can Xu, Wenpeng Hu, Dongyan Zhao, Rui Yan
30 Jan 2019
TL;DR: This work proposes a multi-representation fusion network where the representations can be fused into matching at an early stage, at an intermediate stage, or at the last stage, and demonstrates the effect of each representation on matching, which sheds light on how to select them in practical systems.
Abstract: We consider context-response matching with multiple types of representations for multi-turn response selection in retrieval-based chatbots. The representations encode semantics of contexts and responses on words, n-grams, and sub-sequences of utterances, and capture both short-term and long-term dependencies among words. With such a number of representations in hand, we study how to fuse them in a deep neural architecture for matching and how each of them contributes to matching. To this end, we propose a multi-representation fusion network where the representations can be fused into matching at an early stage, at an intermediate stage, or at the last stage. We empirically compare different representations and fusing strategies on two benchmark data sets. Evaluation results indicate that late fusion is always better than early fusion, and by fusing the representations at the last stage, our model significantly outperforms the existing methods, and achieves new state-of-the-art performance on both data sets. Through a thorough ablation study, we demonstrate the effect of each representation on matching, which sheds light on how to select them in practical systems.

Proceedings ArticleDOI
14 Jan 2019
TL;DR: This paper addresses the problem of 3D pose estimation for multiple people in a few calibrated camera views by using a multi-way matching algorithm to cluster the detected 2D poses in all views and proposes to combine geometric and appearance cues for cross-view matching.
Abstract: This paper addresses the problem of 3D pose estimation for multiple people in a few calibrated camera views. The main challenge of this problem is to find the cross-view correspondences among noisy and incomplete 2D pose predictions. Most previous methods address this challenge by directly reasoning in 3D using a pictorial structure model, which is inefficient due to the huge state space. We propose a fast and robust approach to solve this problem. Our key idea is to use a multi-way matching algorithm to cluster the detected 2D poses in all views. Each resulting cluster encodes 2D poses of the same person across different views and consistent correspondences across the keypoints, from which the 3D pose of each person can be effectively inferred. The proposed convex optimization based multi-way matching algorithm is efficient and robust against missing and false detections, without knowing the number of people in the scene. Moreover, we propose to combine geometric and appearance cues for cross-view matching. The proposed approach achieves significant performance gains from the state-of-the-art (96.3% vs. 90.6% and 96.9% vs. 88% on the Campus and Shelf datasets, respectively), while being efficient for real-time applications.

Proceedings ArticleDOI
15 Jun 2019
TL;DR: Experimental results show that the MCUA scheme for cost volume calculation outperforms state-of-the-art methods by a notable margin and effectively improves the accuracy of stereo matching.
Abstract: Exploiting multi-level context information in the cost volume can improve the performance of learning-based stereo matching methods. In recent years, 3-D Convolutional Neural Networks (3-D CNNs) have shown advantages in regularizing the cost volume but are limited by unary feature learning in matching cost computation. However, existing methods only use features from plain convolution layers or a simple aggregation of multi-level features to calculate the cost volume, which is insufficient because stereo matching requires discriminative features to identify corresponding pixels in rectified stereo image pairs. In this paper, we propose a unary features descriptor using multi-level context ultra-aggregation (MCUA), which encapsulates all convolutional features into a more discriminative representation by intra- and inter-level feature combination. Specifically, a child module that takes low-resolution images as input captures larger context information; the larger context information from each layer is densely connected to the main branch of the network. MCUA makes good use of multi-level features with richer context and performs the image-to-image prediction holistically. We introduce our MCUA scheme for cost volume calculation and test it on PSM-Net. We also evaluate our method on the Scene Flow and KITTI 2012/2015 stereo datasets. Experimental results show that our method outperforms state-of-the-art methods by a notable margin and effectively improves the accuracy of stereo matching.

Journal ArticleDOI
TL;DR: A classification of methods for tracking community evolution in dynamic social networks into four main approaches is proposed, using the functioning principle as the criterion; the first approach is based on independent successive static detection and matching.

Abstract: This paper presents a survey of previous studies on the problem of tracking community evolution over time in dynamic social networks. This problem is of crucial importance in the field of social network analysis. The goal of our paper is to classify existing methods dealing with the issue. We propose a classification of methods for tracking community evolution in dynamic social networks into four main approaches, using the functioning principle as the criterion: the first is based on independent successive static detection and matching; the second is based on dependent successive static detection; the third is based on simultaneous study of all stages of community evolution; finally, the fourth concerns methods working directly on temporal networks. Our paper starts by giving basic concepts about social networks, community structure, and strategies for evaluating community detection methods. Then, it describes the different approaches and exposes the strengths as well as the weaknesses of each.

Proceedings ArticleDOI
01 Oct 2019
TL;DR: A coarse-to-fine consistency learning scheme is proposed to learn consistency globally and locally in two steps; to avoid learning ineffective knowledge, the prior common knowledge of intra-camera matching in the pretrained model is preserved as reliable guiding information, since it does not suffer from cross-camera scene variation the way cross-camera matching does.
Abstract: For matching pedestrians across disjoint camera views in surveillance, person re-identification (Re-ID) has made great progress in supervised learning. However, it is infeasible to label data in a number of new scenes when extending a Re-ID system. Thus, studying unsupervised learning for Re-ID is important for saving labelling cost. Yet, cross-camera scene variation is a key challenge for unsupervised Re-ID, such as illumination, background and viewpoint variations, which cause domain shift in the feature space and result in inconsistent pairwise similarity distributions that degrade matching performance. To alleviate the effect of cross-camera scene variation, we propose a Camera-Aware Similarity Consistency Loss to learn consistent pairwise similarity distributions for intra-camera matching and cross-camera matching. To avoid learning ineffective knowledge in consistency learning, we preserve the prior common knowledge of intra-camera matching in the pretrained model as reliable guiding information, which does not suffer from cross-camera scene variation as cross-camera matching. To learn similarity consistency more effectively, we further develop a coarse-to-fine consistency learning scheme to learn consistency globally and locally in two steps. Experiments show that our method outperformed the state-of-the-art unsupervised Re-ID methods.

Proceedings ArticleDOI
01 Jun 2019
TL;DR: A multi-level matching and aggregation network (MLMAN) for few-shot relation classification that encodes the query instance and each support set in an interactive way by considering their matching information at both local and instance levels.
Abstract: This paper presents a multi-level matching and aggregation network (MLMAN) for few-shot relation classification. Previous studies on this topic adopt prototypical networks, which calculate the embedding vector of a query instance and the prototype vector of the support set for each relation candidate independently. On the contrary, our proposed MLMAN model encodes the query instance and each support set in an interactive way by considering their matching information at both local and instance levels. The final class prototype for each support set is obtained by attentive aggregation over the representations of support instances, where the weights are calculated using the query instance. Experimental results demonstrate the effectiveness of our proposed methods, which achieve a new state-of-the-art performance on the FewRel dataset.

Journal ArticleDOI
TL;DR: This work describes how both relative and absolute measures of treatment effect can be obtained when using propensity-score matching with competing risks data, and recommends that a marginal subdistribution hazard model that accounts for the within-pair clustering of outcomes be used to test the equality of CIFs and to estimate subdistribution hazard ratios.
Abstract: Propensity-score matching is a popular analytic method to remove the effects of confounding due to measured baseline covariates when using observational data to estimate the effects of treatment. Time-to-event outcomes are common in medical research. Competing risks are outcomes whose occurrence precludes the occurrence of the primary time-to-event outcome of interest. All non-fatal outcomes and all cause-specific mortality outcomes are potentially subject to competing risks. There is a paucity of guidance on the conduct of propensity-score matching in the presence of competing risks. We describe how both relative and absolute measures of treatment effect can be obtained when using propensity-score matching with competing risks data. Estimates of the relative effect of treatment can be obtained by using cause-specific hazard models in the matched sample. Estimates of absolute treatment effects can be obtained by comparing cumulative incidence functions (CIFs) between matched treated and matched control subjects. We conducted a series of Monte Carlo simulations to compare the empirical type I error rate of different statistical methods for testing the equality of CIFs estimated in the matched sample. We also examined the performance of different methods to estimate the marginal subdistribution hazard ratio. We recommend that a marginal subdistribution hazard model that accounts for the within-pair clustering of outcomes be used to test the equality of CIFs and to estimate subdistribution hazard ratios. We illustrate the described methods by using data on patients discharged from hospital with acute myocardial infarction to estimate the effect of discharge prescribing of statins on cardiovascular death.
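
For intuition, the cumulative incidence function being compared between matched groups has the standard nonparametric (Aalen-Johansen) form: at each event time, add the all-cause survival just before that time multiplied by the cause-specific event fraction. A plain numpy sketch, which ignores the within-pair clustering the authors emphasize:

```python
# Nonparametric cumulative incidence under competing risks (sketch).
import numpy as np

def cumulative_incidence(time, event, cause=1):
    """time: follow-up times; event: 0 = censored, 1, 2, ... = cause of event."""
    time, event = np.asarray(time, float), np.asarray(event)
    surv, cif, out = 1.0, 0.0, []
    for t in np.unique(time[event > 0]):              # ordered event times
        n_t = np.sum(time >= t)                       # number at risk just before t
        d_all = np.sum((time == t) & (event > 0))     # events of any cause at t
        d_k = np.sum((time == t) & (event == cause))  # events of the cause of interest
        cif += surv * d_k / n_t
        surv *= 1.0 - d_all / n_t                     # all-cause Kaplan-Meier update
        out.append((t, cif))
    return out
```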

Proceedings ArticleDOI
01 Oct 2019
TL;DR: TIMAM, a Text-Image Modality Adversarial Matching approach, learns modality-invariant feature representations using adversarial and cross-modal matching objectives, achieving state-of-the-art performance on four widely used publicly available datasets.
Abstract: For many computer vision applications such as image captioning, visual question answering, and person search, learning discriminative feature representations at both image and text level is an essential yet challenging problem. Its challenges originate from the large word variance in the text domain as well as the difficulty of accurately measuring the distance between the features of the two modalities. Most prior work focuses on the latter challenge, by introducing loss functions that help the network learn better feature representations but fail to account for the complexity of the textual input. With that in mind, we introduce TIMAM: a Text-Image Modality Adversarial Matching approach that learns modality-invariant feature representations using adversarial and cross-modal matching objectives. In addition, we demonstrate that BERT, a publicly-available language model that extracts word embeddings, can successfully be applied in the text-to-image matching domain. The proposed approach achieves state-of-the-art cross-modal matching performance on four widely-used publicly-available datasets resulting in absolute improvements ranging from 2% to 5% in terms of rank-1 accuracy.

Journal ArticleDOI
TL;DR: This paper proposes enhancement and matching for latent fingerprints using the Scale Invariant Feature Transform (SIFT); the matching results are satisfactory compared with minutiae-point matching.
Abstract: Latent fingerprint identification is a difficult task for law enforcement agencies and border security in identifying suspects. It is complicated by poor-quality images with non-linear distortion and complex background noise. Hence, image enhancement is required before such latent fingerprints can be matched. Most current research is based on minutiae points for fingerprint matching because their accuracy is acceptable. In an effort to extend the technology for fingerprint matching, our model proposes enhancement and matching for latent fingerprints using the Scale Invariant Feature Transform (SIFT). It involves two phases: (i) latent fingerprint contrast enhancement using an intuitionistic type-2 fuzzy set and (ii) extraction of SIFT feature points from the latent fingerprints. The matching algorithm is then performed with n images, and scores are calculated by Euclidean distance. We tested our algorithm for matching using public-domain fingerprint databases such as FVC-2004 and the IIIT latent fingerprint database. The experimental results indicate that the matching results are satisfactory compared with minutiae points.
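
A sketch of the SIFT matching and Euclidean-distance scoring stage (assuming OpenCV >= 4.4 for cv2.SIFT_create; the fuzzy-set enhancement phase is omitted):

```python
# SIFT keypoint matching with Lowe's ratio test (sketch).
import cv2

def sift_match_score(img1, img2, ratio=0.75):
    """img1, img2: grayscale uint8 images; returns the number of good matches."""
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img1, None)
    kp2, des2 = sift.detectAndCompute(img2, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)             # Euclidean distance on descriptors
    pairs = matcher.knnMatch(des1, des2, k=2)
    good = [m for m, n in pairs if m.distance < ratio * n.distance]
    return len(good)
```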

Journal ArticleDOI
17 Jul 2019
TL;DR: A general framework named DeepCF, short for Deep Collaborative Filtering, is proposed to combine the strengths of the two types of CF methods and overcome their respective flaws: the limited expressiveness of the dot product and the weakness in capturing low-rank relations.
Abstract: In general, recommendation can be viewed as a matching problem, i.e., match proper items for proper users. However, due to the huge semantic gap between users and items, it’s almost impossible to directly match users and items in their initial representation spaces. To solve this problem, many methods have been studied, which can be generally categorized into two types, i.e., representation learning-based CF methods and matching function learning-based CF methods. Representation learning-based CF methods try to map users and items into a common representation space. In this case, the higher similarity between a user and an item in that space implies they match better. Matching function learning-based CF methods try to directly learn the complex matching function that maps user-item pairs to matching scores. Although both methods are well developed, they suffer from two fundamental flaws, i.e., the limited expressiveness of dot product and the weakness in capturing low-rank relations respectively. To this end, we propose a general framework named DeepCF, short for Deep Collaborative Filtering, to combine the strengths of the two types of methods and overcome such flaws. Extensive experiments on four publicly available datasets demonstrate the effectiveness of the proposed DeepCF framework.
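
A hedged sketch of the two styles DeepCF unifies, with illustrative layer sizes (not the authors' exact architecture): a representation-learning branch (element-wise product of embeddings, dot-product style) and a matching-function branch (an MLP over concatenated embeddings), fused into one score:

```python
# Dual-branch collaborative filtering sketch.
import torch
import torch.nn as nn

class DualBranchCF(nn.Module):
    def __init__(self, n_users, n_items, d=64):
        super().__init__()
        self.u_rep, self.i_rep = nn.Embedding(n_users, d), nn.Embedding(n_items, d)
        self.u_mat, self.i_mat = nn.Embedding(n_users, d), nn.Embedding(n_items, d)
        self.mlp = nn.Sequential(nn.Linear(2 * d, d), nn.ReLU(), nn.Linear(d, d))
        self.out = nn.Linear(2 * d, 1)

    def forward(self, users, items):
        rep = self.u_rep(users) * self.i_rep(items)   # representation-learning branch
        mat = self.mlp(torch.cat([self.u_mat(users), self.i_mat(items)], dim=-1))
        return torch.sigmoid(self.out(torch.cat([rep, mat], dim=-1))).squeeze(-1)
```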

Proceedings ArticleDOI
15 Oct 2019
TL;DR: This work proposes a novel Multi-modal Tensor Fusion Network (MTFN) to explicitly learn an accurate image-text similarity function with rank-based tensor fusion rather than seeking a common embedding space for each image- text instance.
Abstract: A major challenge in matching images and text is that they have intrinsically different data distributions and feature representations. Most existing approaches are based either on embedding or classification, the first one mapping image and text instances into a common embedding space for distance measuring, and the second one regarding image-text matching as a binary classification problem. Neither of these approaches can, however, balance the matching accuracy and model complexity well. We propose a novel framework that achieves remarkable matching performance with acceptable model complexity. Specifically, in the training stage, we propose a novel Multi-modal Tensor Fusion Network (MTFN) to explicitly learn an accurate image-text similarity function with rank-based tensor fusion rather than seeking a common embedding space for each image-text instance. Then, during testing, we deploy a generic Cross-modal Re-ranking (RR) scheme for refinement without requiring additional training procedure. Extensive experiments on two datasets demonstrate that our MTFN-RR consistently achieves the state-of-the-art matching performance with much less time complexity.