Showing papers on "Matching (statistics) published in 2012"

PDF

Open Access

Journal Article•DOI•

Hiring as cultural matching: The case of elite professional service firms

[...]

28 Nov 2012-American Sociological Review

TL;DR: The authors presents a case study of hiring in elite professional service firms, and investigates the often suggested but heretofore empiricallistic assumptions about culture as a vehicle of labor market sorting.

...read moreread less

Abstract: This article presents culture as a vehicle of labor market sorting. Providing a case study of hiring in elite professional service firms, I investigate the often suggested but heretofore empiricall...

...read moreread less

813 citations

Book•

Data Matching: Concepts and Techniques for Record Linkage, Entity Resolution, and Duplicate Detection

[...]

Peter Christen

05 Jul 2012

TL;DR: Data matching (also known as record or data linkage, entity resolution, object identification, or field matching) is the task of identifying, matching and merging records that correspond to the same entities from several databases or even within one database as mentioned in this paper.

...read moreread less

Abstract: Data matching (also known as record or data linkage, entity resolution, object identification, or field matching) is the task of identifying, matching and merging records that correspond to the same entities from several databases or even within one database. Based on research in various domains including applied statistics, health informatics, data mining, machine learning, artificial intelligence, database management, and digital libraries, significant advances have been achieved over the last decade in all aspects of the data matching process, especially on how to improve the accuracy of data matching, and its scalability to large databases. Peter Christens book is divided into three parts: Part I, Overview, introduces the subject by presenting several sample applications and their special challenges, as well as a general overview of a generic data matching process. Part II, Steps of the Data Matching Process, then details its main steps like pre-processing, indexing, field and record comparison, classification, and quality evaluation. Lastly, part III, Further Topics, deals with specific aspects like privacy, real-time matching, or matching unstructured data. Finally, it briefly describes the main features of many research and open source systems available today. By providing the reader with a broad range of data matching concepts and techniques and touching on all aspects of the data matching process, this book helps researchers as well as students specializing in data quality or data matching aspects to familiarize themselves with recent research advances and to identify open research challenges in the area of data matching. To this end, each chapter of the book includes a final section that provides pointers to further background and research material. Practitioners will better understand the current state of the art in data matching as well as the internal workings and limitations of current systems. Especially, they will learn that it is often not feasible to simply implement an existing off-the-shelf data matching system without substantial adaption and customization. Such practical considerations are discussed for each of the major steps in the data matching process.

...read moreread less

713 citations

Journal Article•DOI•

A Survey of Indexing Techniques for Scalable Record Linkage and Deduplication

[...]

Peter Christen¹•Institutions (1)

Australian National University¹

01 Sep 2012-IEEE Transactions on Knowledge and Data Engineering

TL;DR: A survey of 12 variations of 6 indexing techniques for record linkage and deduplication aimed at reducing the number of record pairs to be compared in the matching process by removing obvious nonmatching pairs, while at the same time maintaining high matching quality is presented.

...read moreread less

Abstract: Record linkage is the process of matching records from several databases that refer to the same entities. When applied on a single database, this process is known as deduplication. Increasingly, matched data are becoming important in many application areas, because they can contain information that is not available otherwise, or that is too costly to acquire. Removing duplicate records in a single database is a crucial step in the data cleaning process, because duplicates can severely influence the outcomes of any subsequent data processing or data mining. With the increasing size of today's databases, the complexity of the matching process becomes one of the major challenges for record linkage and deduplication. In recent years, various indexing techniques have been developed for record linkage and deduplication. They are aimed at reducing the number of record pairs to be compared in the matching process by removing obvious nonmatching pairs, while at the same time maintaining high matching quality. This paper presents a survey of 12 variations of 6 indexing techniques. Their complexity is analyzed, and their performance and scalability is evaluated within an experimental framework using both synthetic and real data sets. No such detailed survey has so far been published.

...read moreread less

663 citations

Book Chapter•DOI•

Relaxed pairwise learned metric for person re-identification

[...]

Martin Hirzer¹, Peter M. Roth¹, Martin Köstinger¹, Horst Bischof¹•Institutions (1)

Graz University of Technology¹

07 Oct 2012

TL;DR: This paper proposes to learn a metric from pairs of samples from different cameras, so that even less sophisticated features describing color and texture information are sufficient for finally getting state-of-the-art classification results.

...read moreread less

Abstract: Matching persons across non-overlapping cameras is a rather challenging task. Thus, successful methods often build on complex feature representations or sophisticated learners. A recent trend to tackle this problem is to use metric learning to find a suitable space for matching samples from different cameras. However, most of these approaches ignore the transition from one camera to the other. In this paper, we propose to learn a metric from pairs of samples from different cameras. In this way, even less sophisticated features describing color and texture information are sufficient for finally getting state-of-the-art classification results. Moreover, once the metric has been learned, only linear projections are necessary at search time, where a simple nearest neighbor classification is performed. The approach is demonstrated on three publicly available datasets of different complexity, where it can be seen that state-of-the-art results can be obtained at much lower computational costs.

...read moreread less

472 citations

Journal Article•DOI•

How to control confounding effects by statistical analysis

[...]

Mohamad Amin Pourhoseingholi¹, Ahmad Reza Baghestani², Mohsen Vahedi³•Institutions (3)

Shahid Beheshti University of Medical Sciences and Health Services¹, Islamic Azad University South Tehran Branch², Tehran University of Medical Sciences³

10 Mar 2012-Gastroenterology and hepatology from bed to bench

TL;DR: A Confounder is a variable whose presence affects the variables being studied so that the results do not reflect the actual relationship.

...read moreread less

Abstract: A Confounder is a variable whose presence affects the variables being studied so that the results do not reflect the actual relationship. There are various ways to exclude or control confounding variables including Randomization, Restriction and Matching. But all these methods are applicable at the time of study design. When experimental designs are premature, impractical, or impossible, researchers must rely on statistical methods to adjust for potentially confounding effects. These Statistical models (especially regression models) are flexible to eliminate the effects of confounders.

...read moreread less

373 citations

Journal Article•DOI•

One-to-many propensity score matching in cohort studies

[...]

Jeremy A. Rassen¹, Abhi Shelat², Jessica A. Myers¹, Robert J. Glynn¹, Kenneth J. Rothman³, Sebastian Schneeweiss¹ - Show less +2 more•Institutions (3)

Brigham and Women's Hospital¹, University of Virginia², Research Triangle Park³

01 May 2012-Pharmacoepidemiology and Drug Safety

TL;DR: Among the large number of cohort studies that employ propensity score matching, most match patients 1:1 but increasing the matching ratio is thought to improve precision but may come with a trade‐off with respect to bias.

...read moreread less

Abstract: Background Among the large number of cohort studies that employ propensity score matching, most match patients 1:1. Increasing the matching ratio is thought to improve precision but may come with a trade-off with respect to bias. Objective To evaluate several methods of propensity score matching in cohort studies through simulation and empirical analyses. Methods We simulated cohorts of 20 000 patients with exposure prevalence of 10%–50%. We simulated five dichotomous and five continuous confounders. We estimated propensity scores and matched using digit-based greedy (“greedy”), pairwise nearest neighbor within a caliper (“nearest neighbor”), and a nearest neighbor approach that sought to balance the scores of the comparison patient above and below that of the treated patient (“balanced nearest neighbor”). We matched at both fixed and variable matching ratios and also evaluated sequential and parallel schemes for the order of formation of 1:n match groups. We then applied this same approach to two cohorts of patients drawn from administrative claims data. Results Increasing the match ratio beyond 1:1 generally resulted in somewhat higher bias. It also resulted in lower variance with variable ratio matching but higher variance with fixed. The parallel approach generally resulted in higher mean squared error but lower bias than the sequential approach. Variable ratio, parallel, balanced nearest neighbor matching generally yielded the lowest bias and mean squared error. Conclusions 1:n matching can be used to increase precision in cohort studies. We recommend a variable ratio, parallel, balanced 1:n, nearest neighbor approach that increases precision over 1:1 matching at a small cost in bias. Copyright © 2012 John Wiley & Sons, Ltd.

...read moreread less

363 citations

Book Chapter•DOI•

Person re-identification: what features are important?

[...]

Chunxiao Liu¹, Shaogang Gong², Chen Change Loy, Xinggang Lin¹•Institutions (2)

Tsinghua University¹, Queen Mary University of London²

07 Oct 2012

TL;DR: This study shows that certain features play more important role than others under different circumstances, and proposes a novel unsupervised approach for learning a bottom-up feature importance, so features extracted from different individuals are weighted adaptively driven by their unique and inherent appearance attributes.

...read moreread less

Abstract: State-of-the-art person re-identification methods seek robust person matching through combining various feature types. Often, these features are implicitly assigned with a single vector of global weights, which are assumed to be universally good for all individuals, independent to their different appearances. In this study, we show that certain features play more important role than others under different circumstances. Consequently, we propose a novel unsupervised approach for learning a bottom-up feature importance, so features extracted from different individuals are weighted adaptively driven by their unique and inherent appearance attributes. Extensive experiments on two public datasets demonstrate that attribute-sensitive feature importance facilitates more accurate person matching when it is fused together with global weights obtained using existing methods.

...read moreread less

318 citations

Proceedings Article•DOI•

Image Retrieval for Image-Based Localization Revisited.

[...]

Torsten Sattler¹, Tobias Weyand¹, Bastian Leibe¹, Leif Kobbelt¹•Institutions (1)

RWTH Aachen University¹

01 Jan 2012

TL;DR: It is shown that retrieval methods using a selective voting scheme are able to outperform state-of-the-art direct matching methods and how both selective voting and correspondence computation can be accelerated by using a Hamming embedding of feature descriptors.

...read moreread less

Abstract: To reliably determine the camera pose of an image relative to a 3D point cloud of a scene, correspondences between 2D features and 3D points are needed. Recent work has demonstrated that directly matching the features against the points outperforms methods that take an intermediate image retrieval step in terms of the number of images that can be localized successfully. Yet, direct matching is inherently less scalable than retrievalbased approaches. In this paper, we therefore analyze the algorithmic factors that cause the performance gap and identify false positive votes as the main source of the gap. Based on a detailed experimental evaluation, we show that retrieval methods using a selective voting scheme are able to outperform state-of-the-art direct matching methods. We explore how both selective voting and correspondence computation can be accelerated by using a Hamming embedding of feature descriptors. Furthermore, we introduce a new dataset with challenging query images for the evaluation of image-based localization.

...read moreread less

302 citations

Journal Article•DOI•

Estimating Heterogeneous Treatment Effects with Observational Data

[...]

Yu Xie¹, Jennie E. Brand², Ben Jann³•Institutions (3)

University of Michigan¹, University of California, Los Angeles², University of Bern³

01 Aug 2012-Sociological Methodology

TL;DR: A practical approach to studying heterogeneous treatment effects as a function of the treatment propensity, under the same assumption commonly underlying regression analysis: ignorability is discussed.

...read moreread less

Abstract: Individuals differ not only in their background characteristics, but also in how they respond to a particular treatment, intervention, or stimulation. In particular, treatment effects may vary systematically by the propensity for treatment. In this paper, we discuss a practical approach to studying heterogeneous treatment effects as a function of the treatment propensity, under the same assumption commonly underlying regression analysis: ignorability. We describe one parametric method and two non-parametric methods for estimating interactions between treatment and the propensity for treatment. For the first method, we begin by estimating propensity scores for the probability of treatment given a set of observed covariates for each unit and construct balanced propensity score strata; we then estimate propensity score stratum-specific average treatment effects and evaluate a trend across them. For the second method, we match control units to treated units based on the propensity score and transform the data into treatment-control comparisons at the most elementary level at which such comparisons can be constructed; we then estimate treatment effects as a function of the propensity score by fitting a non-parametric model as a smoothing device. For the third method, we first estimate non-parametric regressions of the outcome variable as a function of the propensity score separately for treated units and for control units and then take the difference between the two non-parametric regressions. We illustrate the application of these methods with an empirical example of the effects of college attendance on womens fertility.

...read moreread less

287 citations

Proceedings Article•

Evaluation of local detectors and descriptors for fast feature matching

[...]

Ondrej Miksik, Krystian Mikolajczyk

01 Nov 2012

TL;DR: A performance evaluation of recent feature detectors is provided and compares their matching precision and speed in randomized kd-trees setup as well as an evaluation of binary descriptors with efficient computation of Hamming distance.

...read moreread less

Abstract: Local feature detectors and descriptors are widely used in many computer vision applications and various methods have been proposed during the past decade. There have been a number of evaluations focused on various aspects of local features, matching accuracy in particular, however there has been no comparisons considering the accuracy and speed trade-offs of recent extractors such as BRIEF, BRISK, ORB, MRRID, MROGH and LIOP. This paper provides a performance evaluation of recent feature detectors and compares their matching precision and speed in randomized kd-trees setup as well as an evaluation of binary descriptors with efficient computation of Hamming distance.

...read moreread less

282 citations

Proceedings Article•DOI•

InfoGather: entity augmentation and attribute discovery by holistic matching with web tables

[...]

Mohamed Yakout¹, Kris Ganjam², Kaushik Chakrabarti², Surajit Chaudhuri²•Institutions (2)

Purdue University¹, Microsoft²

20 May 2012

TL;DR: A novel architecture that leverages preprocessing in MapReduce to achieve extremely fast response times at query time is proposed and has significantly higher precision and coverage and four orders of magnitude faster response times compared with the state-of-the-art approach.

...read moreread less

Abstract: The Web contains a vast corpus of HTML tables, specifically entity attribute tables. We present three core operations, namely entity augmentation by attribute name, entity augmentation by example and attribute discovery, that are useful for "information gathering" tasks (e.g., researching for products or stocks). We propose to use web table corpus to perform them automatically. We require the operations to have high precision and coverage, have fast (ideally interactive) response times and be applicable to any arbitrary domain of entities. The naive approach that attempts to directly match the user input with the web tables suffers from poor precision and coverage.Our key insight is that we can achieve much higher precision and coverage by considering indirectly matching tables in addition to the directly matching ones. The challenge is to be robust to spuriously matched tables: we address it by developing a holistic matching framework based on topic sensitive pagerank and an augmentation framework that aggregates predictions from multiple matched tables. We propose a novel architecture that leverages preprocessing in MapReduce to achieve extremely fast response times at query time. Our experiments on real-life datasets and 573M web tables show that our approach has (i) significantly higher precision and coverage and (ii) four orders of magnitude faster response times compared with the state-of-the-art approach.

...read moreread less

Journal Article•DOI•

Evaluation of existing image matching methods for deriving glacier surface displacements globally from optical satellite imagery

[...]

Torborg Heid, Andreas Kääb

15 Mar 2012-Remote Sensing of Environment

TL;DR: In this paper, the authors compare and evaluate different image matching methods for glacier flow determination over large scales, and they consider CCF-O and COSI-Corr to be the two most robust matching methods.

...read moreread less

Journal Article•DOI•

Using Mixed Integer Programming for Matching in an Observational Study of Kidney Failure After Surgery

[...]

José R. Zubizarreta¹•Institutions (1)

University of Pennsylvania¹

21 Dec 2012-Journal of the American Statistical Association

TL;DR: In this paper, the authors present a new method for optimal matching in observational studies based on mixed integer programming, which achieves covariate balance directly by minimizing both the total sum of distances and a weighted sum of specific measures of covariate imbalance.

...read moreread less

Abstract: This article presents a new method for optimal matching in observational studies based on mixed integer programming. Unlike widely used matching methods based on network algorithms, which attempt to achieve covariate balance by minimizing the total sum of distances between treated units and matched controls, this new method achieves covariate balance directly, either by minimizing both the total sum of distances and a weighted sum of specific measures of covariate imbalance, or by minimizing the total sum of distances while constraining the measures of imbalance to be less than or equal to certain tolerances. The inclusion of these extra terms in the objective function or the use of these additional constraints explicitly optimizes or constrains the criteria that will be used to evaluate the quality of the match. For example, the method minimizes or constrains differences in univariate moments, such as means, variances, and skewness; differences in multivariate moments, such as correlations between covari...

...read moreread less

Posted Content•

Propensity score matching in SPSS

[...]

Felix Thoemmes

30 Jan 2012-arXiv: Applications

TL;DR: The presented SPSS custom dialog allows researchers to specify propensity score methods using the familiar point-and-click interface, and the software allows estimation of the propensity score using logistic regression and specifying nearest-neighbor matching with many options.

...read moreread less

Abstract: Propensity score matching is a tool for causal inference in non-randomized studies that allows for conditioning on large sets of covariates. The use of propensity scores in the social sciences is currently experiencing a tremendous increase; however it is far from a commonly used tool. One impediment towards a more wide-spread use of propensity score methods is the reliance on specialized software, because many social scientists still use SPSS as their main analysis tool. The current paper presents an implementation of various propensity score matching methods in SPSS. Specifically the presented SPSS custom dialog allows researchers to specify propensity score methods using the familiar point-and-click interface. The software allows estimation of the propensity score using logistic regression and specifying nearest-neighbor matching with many options, e.g., calipers, region of common support, matching with and without replacement, and matching one to many units. Detailed balance statistics and graphs are produced by the program.

...read moreread less

Journal Article•DOI•

Fatter Attraction: Anthropometric and Socioeconomic Matching on the Marriage Market

[...]

Pierre-André Chiappori, Sonia Oreffice, Climent Quintana-Domeque

01 Aug 2012-Journal of Political Economy

TL;DR: In this paper, a marriage market model of matching along multiple dimensions, some of which are unobservable, was constructed, in which individual preferences can be summarized by a one-dimensional index combining the various characteristics.

...read moreread less

Abstract: We construct a marriage market model of matching along multiple dimensions, some of which are unobservable, in which individual preferences can be summarized by a one-dimensional index combining the various characteristics. We show that, under testable assumptions, these indices are ordinally identified and that the male and female trade-offs between their partners’ characteristics are overidentified. Using PSID data on married couples, we recover the marginal rates of substitution between body mass index (BMI) and wages or education: men may compensate 1.3 additional units of BMI with a 1 percent increase in wages, whereas women may compensate two BMI units with 1 year of education.

...read moreread less

Journal Article•DOI•

Propensity scores: From naïve enthusiasm to intuitive understanding

[...]

Elizabeth A. Williamson¹, Elizabeth A. Williamson², Ruth Morley², Alan Lucas³, James R. Carpenter³ - Show less +1 more•Institutions (3)

Monash University¹, University of Melbourne², University of London³

01 Jun 2012-Statistical Methods in Medical Research

TL;DR: This article provides a non-technical and intuitive discussion of propensity score methodology, motivating the use of the propensity score approach by analogy with randomised studies, and describes the four main ways in which this methodology can be implemented.

...read moreread less

Abstract: Estimation of the effect of a binary exposure on an outcome in the presence of confounding is often carried out via outcome regression modelling. An alternative approach is to use propensity score methodology. The propensity score is the conditional probability of receiving the exposure given the observed covariates and can be used, under the assumption of no unmeasured confounders, to estimate the causal effect of the exposure. In this article, we provide a non-technical and intuitive discussion of propensity score methodology, motivating the use of the propensity score approach by analogy with randomised studies, and describe the four main ways in which this methodology can be implemented. We carefully describe the population parameters being estimated - an issue that is frequently overlooked in the medical literature. We illustrate these four methods using data from a study investigating the association between maternal choice to provide breast milk and the infant's subsequent neurodevelopment. We outline useful extensions of propensity score methodology and discuss directions for future research. Propensity score methods remain controversial and there is no consensus as to when, if ever, they should be used in place of traditional outcome regression models. We therefore end with a discussion of the relative advantages and disadvantages of each.

...read moreread less

Proceedings Article•DOI•

Large-scale interactive ontology matching: algorithms and implementation

[...]

Ernesto Jiménez-Ruiz¹, Bernardo Cuenca Grau¹, Yujiao Zhou¹, Ian Horrocks¹•Institutions (1)

University of Oxford¹

27 Aug 2012

TL;DR: This paper presents the ontology matching system LogMap 2, a much improved version of its predecessor LogMap, which supports user interaction during the matching process, which is essential for use cases requiring very accurate mappings.

...read moreread less

Abstract: In this paper we present the ontology matching system LogMap 2, a much improved version of its predecessor LogMap. LogMap 2 supports user interaction during the matching process, which is essential for use cases requiring very accurate mappings. Interactivity, however, imposes very strict scalability requirements; we are able to satisfy these requirements by providing real-time user response even for large-scale ontologies. Finally, LogMap 2 implements scalable reasoning and diagnosis algorithms, which minimise any logical inconsistencies introduced by the matching process.

...read moreread less

Journal Article•DOI•

Recognizing Human Actions by Learning and Matching Shape-Motion Prototype Trees

[...]

Zhuolin Jiang¹, Zhe Lin², Larry S. Davis¹•Institutions (2)

University of Maryland, College Park¹, Adobe Systems²

01 Mar 2012-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This work introduces a shape-motion prototype-based approach for action recognition that enables robust action matching in challenging situations and allows automatic alignment of action sequences.

...read moreread less

Abstract: A shape-motion prototype-based approach is introduced for action recognition. The approach represents an action as a sequence of prototypes for efficient and flexible action matching in long video sequences. During training, an action prototype tree is learned in a joint shape and motion space via hierarchical K-means clustering and each training sequence is represented as a labeled prototype sequence; then a look-up table of prototype-to-prototype distances is generated. During testing, based on a joint probability model of the actor location and action prototype, the actor is tracked while a frame-to-prototype correspondence is established by maximizing the joint probability, which is efficiently performed by searching the learned prototype tree; then actions are recognized using dynamic prototype sequence matching. Distance measures used for sequence matching are rapidly obtained by look-up table indexing, which is an order of magnitude faster than brute-force computation of frame-to-frame distances. Our approach enables robust action matching in challenging situations (such as moving cameras, dynamic backgrounds) and allows automatic alignment of action sequences. Experimental results demonstrate that our approach achieves recognition rates of 92.86 percent on a large gesture data set (with dynamic backgrounds), 100 percent on the Weizmann action data set, 95.77 percent on the KTH action data set, 88 percent on the UCF sports data set, and 87.27 percent on the CMU action data set.

...read moreread less

Proceedings Article•DOI•

Fine-grained private matching for proximity-based mobile social networking

[...]

Rui Zhang¹, Yanchao Zhang¹, Jinyuan Sun¹, Guanhua Yan¹•Institutions (1)

Arizona State University¹

25 Mar 2012

TL;DR: This paper designs a suite of novel fine-grained private matching protocols for proximity-based mobile social networking that allow finer differentiation between PMSN users and can support a wide range of matching metrics at different privacy levels.

...read moreread less

Abstract: Proximity-based mobile social networking (PMSN) refers to the social interaction among physically proximate mobile users directly through the Bluetooth/WiFi interfaces on their smartphones or other mobile devices. It becomes increasingly popular due to the recently explosive growth of smartphone users. Profile matching means two users comparing their personal profiles and is often the first step towards effective PMSN. It, however, conflicts with users' growing privacy concerns about disclosing their personal profiles to complete strangers before deciding to interact with them. This paper tackles this open challenge by designing a suite of novel fine-grained private matching protocols. Our protocols enable two users to perform profile matching without disclosing any information about their profiles beyond the comparison result. In contrast to existing coarse-grained private matching schemes for PMSN, our protocols allow finer differentiation between PMSN users and can support a wide range of matching metrics at different privacy levels. The security and communication/computation overhead of our protocols are thoroughly analyzed and evaluated via detailed simulations.

...read moreread less

Proceedings Article•DOI•

Efficient online structured output learning for keypoint-based object tracking

[...]

Sam Hare¹, Amir Saffari¹, Philip H. S. Torr¹•Institutions (1)

Oxford Brookes University¹

16 Jun 2012

TL;DR: Experiments on challenging video sequences show that the algorithm significantly improves over state-of-the-art descriptor matching techniques using a range of descriptors, as well as recent online learning based approaches.

...read moreread less

Abstract: Efficient keypoint-based object detection methods are used in many real-time computer vision applications. These approaches often model an object as a collection of keypoints and associated descriptors, and detection then involves first constructing a set of correspondences between object and image keypoints via descriptor matching, and subsequently using these correspondences as input to a robust geometric estimation algorithm such as RANSAC to find the transformation of the object in the image. In such approaches, the object model is generally constructed offline, and does not adapt to a given environment at runtime. Furthermore, the feature matching and transformation estimation stages are treated entirely separately. In this paper, we introduce a new approach to address these problems by combining the overall pipeline of correspondence generation and transformation estimation into a single structured output learning framework. Following the recent trend of using efficient binary descriptors for feature matching, we also introduce an approach to approximate the learned object model as a collection of binary basis functions which can be evaluated very efficiently at runtime. Experiments on challenging video sequences show that our algorithm significantly improves over state-of-the-art descriptor matching techniques using a range of descriptors, as well as recent online learning based approaches.

...read moreread less

Journal Article•DOI•

Face Matching and Retrieval in Forensics Applications

[...]

Anil K. Jain¹, Brendan Klare¹, Unsang Park¹•Institutions (1)

Michigan State University¹

01 Jan 2012-IEEE MultiMedia

TL;DR: This article surveys forensic face-recognition approaches and the challenges they face in improving matching and retrieval results as well as processing low-quality images.

...read moreread less

Abstract: This article surveys forensic face-recognition approaches and the challenges they face in improving matching and retrieval results as well as processing low-quality images.

...read moreread less

Journal Article•DOI•

Search and matching in the housing market

[...]

David Genesove¹, David Genesove², Lu Han³•Institutions (3)

Center for Economic and Policy Research¹, Hebrew University of Jerusalem², University of Toronto³

01 Jul 2012-Journal of Urban Economics

TL;DR: In this article, the authors apply a random matching model to unique multi-year, multi-market survey data on both buyers and sellers, and examine how demand affects housing market liquidity.

...read moreread less

Journal Article•DOI•

Measuring the effect of M&A on patenting quantity and quality

[...]

Giovanni Valentini¹•Institutions (1)

Bocconi University¹

01 Mar 2012-Strategic Management Journal

TL;DR: It is found that M&A have a positive effect on patenting output, but decrease patent impact, originality, and generality.

...read moreread less

Abstract: I explore the effect of M&A on the patenting quantity and quality of the firms involved in a deal. Three measures of quality are considered: impact, generality, and originality. The impact of a patent denotes its influence on future inventions. Generality refers to a patent's applicability across technological fields. Finally, the originality of a patent indicates the extent to which an invention synthesizes diverse technological inputs. Applying a matching estimator to data from the U.S. ‘medical devices and photographic equipment’ industry from 1988 to 1996, I find that M&A have a positive effect on patenting output, but decrease patent impact, originality, and generality. Copyright © 2011 John Wiley & Sons, Ltd.

...read moreread less

Journal Article•DOI•

Robust and Efficient Ridge-Based Palmprint Matching

[...]

Jifeng Dai¹, Jianjiang Feng¹, Jie Zhou¹•Institutions (1)

Tsinghua University¹

01 Aug 2012-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: A segment-based matching and fusion algorithm is proposed to deal with the skin distortion and the varying discrimination power of different palmprint regions, and to reduce the computational complexity, an orientation field-based registration algorithm is designed for registering the palmprints into the same coordinate system before matching and a cascade filter is built to reject the nonmated gallery palmprints in early stage.

...read moreread less

Abstract: During the past decade, many efforts have been made to use palmprints as a biometric modality. However, most of the existing palmprint recognition systems are based on encoding and matching creases, which are not as reliable as ridges. This affects the use of palmprints in large-scale person identification applications where the biometric modality needs to be distinctive as well as insensitive to changes in age and skin conditions. Recently, several ridge-based palmprint matching algorithms have been proposed to fill the gap. Major contributions of these systems include reliable orientation field estimation in the presence of creases and the use of multiple features in matching, while the matching algorithms adopted in these systems simply follow the matching algorithms for fingerprints. However, palmprints differ from fingerprints in several aspects: 1) Palmprints are much larger and thus contain a large number of minutiae, 2) palms are more deformable than fingertips, and 3) the quality and discrimination power of different regions in palmprints vary significantly. As a result, these matchers are unable to appropriately handle the distortion and noise, despite heavy computational cost. Motivated by the matching strategies of human palmprint experts, we developed a novel palmprint recognition system. The main contributions are as follows: 1) Statistics of major features in palmprints are quantitatively studied, 2) a segment-based matching and fusion algorithm is proposed to deal with the skin distortion and the varying discrimination power of different palmprint regions, and 3) to reduce the computational complexity, an orientation field-based registration algorithm is designed for registering the palmprints into the same coordinate system before matching and a cascade filter is built to reject the nonmated gallery palmprints in early stage. The proposed matcher is tested by matching 840 query palmprints against a gallery set of 13,736 palmprints. Experimental results show that the proposed matcher outperforms the existing matchers a lot both in matching accuracy and speed.

...read moreread less

Book Chapter•DOI•

YAM++: a multi-strategy based approach for ontology matching task

[...]

DuyHoa Ngo¹, Zohra Bellahsene¹•Institutions (1)

University of Montpellier¹

08 Oct 2012

TL;DR: The capability of the ontology matching tool YAM++ is presented, which shows that it is able to discover mappings between entities of given two ontologies by using machine learning approach and it is shown that it can deal with multi-lingual ontologies matching problem.

...read moreread less

Abstract: In this paper, we present the capability of our ontology matching tool YAM++. We show that YAM++ is able to discover mappings between entities of given two ontologies by using machine learning approach. Besides, we also demonstrate that if the training data are not available, YAM++ can discover mappings by using information retrieval techniques. Finally, we show that YAM++ is able to deal with multi-lingual ontologies matching problem.

...read moreread less

Journal Article•DOI•

Propensity score applied to survival data analysis through proportional hazards models: a Monte Carlo study.

[...]

Etienne Gayat¹, Matthieu Resche-Rigon¹, Matthieu Resche-Rigon², Jean-Yves Mary², Jean-Yves Mary¹, Raphaël Porcher², Raphaël Porcher¹ - Show less +3 more•Institutions (2)

French Institute of Health and Medical Research¹, Paris Diderot University²

01 May 2012-Pharmaceutical Statistics

TL;DR: It was showed that propensity scores applied to survival data can lead to unbiased estimation of both marginal and conditional treatment effect, when marginal and adjusted Cox models are used.

...read moreread less

Abstract: Propensity score methods are increasingly used in medical literature to estimate treatment effect using data from observational studies. Despite many papers on propensity score analysis, few have focused on the analysis of survival data. Even within the framework of the popular proportional hazard model, the choice among marginal, stratified or adjusted models remains unclear. A Monte Carlo simulation study was used to compare the performance of several survival models to estimate both marginal and conditional treatment effects. The impact of accounting or not for pairing when analysing propensity-score-matched survival data was assessed. In addition, the influence of unmeasured confounders was investigated. After matching on the propensity score, both marginal and conditional treatment effects could be reliably estimated. Ignoring the paired structure of the data led to an increased test size due to an overestimated variance of the treatment effect. Among the various survival models considered, stratified models systematically showed poorer performance. Omitting a covariate in the propensity score model led to a biased estimation of treatment effect, but replacement of the unmeasured confounder by a correlated one allowed a marked decrease in this bias. Our study showed that propensity scores applied to survival data can lead to unbiased estimation of both marginal and conditional treatment effect, when marginal and adjusted Cox models are used. In all cases, it is necessary to account for pairing when analysing propensity-score-matched data, using a robust estimator of the variance.

...read moreread less

Journal Article•DOI•

Matching in Networks with Bilateral Contracts

[...]

John William Hatfield¹, Scott Duke Kominers²•Institutions (2)

Stanford University¹, University of Chicago²

01 Feb 2012-American Economic Journal: Microeconomics

TL;DR: In this article, the authors introduce a model in which firms trade goods via bilateral contracts which specify a buyer, a seller, and the terms of the exchange, and this setting subsumes (many-to-many) matching with contracts, as well as supply chain matching.

...read moreread less

Abstract: We introduce a model in which firms trade goods via bilateral contracts which specify a buyer, a seller, and the terms of the exchange. This setting subsumes (many-to-many) matching with contracts, as well as supply chain matching. When firms’ relationships do not exhibit a supply chain structure, stable allocations need not exist. By contrast, in the presence of supply chain structure, a natural substitutability condition characterizes the maximal domain of firm preferences for which stable allocations are guaranteed to exist. Furthermore, the classical lattice structure, rural hospitals theorem, and one-sided strategy-proofness results all generalize to this setting. ( C78, D85, D86, L14)

...read moreread less

Journal Article•DOI•

Technical efficiency analysis correcting for biases from observed and unobserved variables: an application to a natural resource management project

[...]

Boris E. Bravo-Ureta¹, Boris E. Bravo-Ureta², William H. Greene³, Daniel Solis⁴•Institutions (4)

University of Connecticut¹, University of Talca², New York University³, University of Miami⁴

01 Aug 2012-Empirical Economics

TL;DR: In this article, the authors combine the stochastic frontier framework with impact evaluation methodology to compare technical efficiency (TE) across treatment and control groups using cross-sectional data associated with the MARENA Program in Honduras.

...read moreread less

Abstract: This article brings together the stochastic frontier framework with impact evaluation methodology to compare technical efficiency (TE) across treatment and control groups using cross-sectional data associated with the MARENA Program in Honduras. A matched group of beneficiaries and control farmers is determined using propensity score matching techniques to mitigate biases stemming from observed variables. In addition, possible self-selection arising from unobserved variables is addressed using a selectivity correction model for stochastic frontiers recently introduced by Greene (J Prod Anal 34:15–24, 2010). The results reveal that average TE is consistently higher for beneficiary farmers than the control group while the presence of selectivity bias cannot be rejected. TE ranges from 0.67 to 0.75 for beneficiaries and from 0.40 to 0.65 for the control depending on whether biases were controlled or not. The TE gap between beneficiaries and control farmers decreases by implementing the matching technique and the sample selection framework decreases this gap even further. The analysis also suggests that beneficiaries do not only exhibit higher TE but also higher frontier output.

...read moreread less

Journal Article•DOI•

Job Market Signaling of Relative Position, or Becker Married to Spence

[...]

Ed Hopkins¹•Institutions (1)

University of Edinburgh¹

01 Apr 2012-Journal of the European Economic Association

TL;DR: In this article, the authors consider a matching model of the labor market where workers, who have private information on their quality, signal to firms that also differ in quality, allowing assortative matching in which the highest-quality workers send the highest signals and are hired by the best firms.

...read moreread less

Abstract: This paper considers a matching model of the labor market where workers, who have private information on their quality, signal to firms that also differ in quality. Signals allow assortative matching in which the highest-quality workers send the highest signals and are hired by the best firms. Matching is considered both when wages are rigid (nontransferable utility) and when they are fully flexible (transferable utility). In both cases, equilibrium strategies and payoffs depend on the distributions of worker and firm types. This is in contrast to separating equilibria of the standard model, which do not respond to changes in supply or demand. With sticky wages, despite incomplete information, equilibrium investment in education by low-ability workers can be inefficiently low, and this distortion can become worse in a more competitive environment. In contrast, with flexible wages, greater competition improves efficiency. (JEL: C72, C78, D82)

...read moreread less

Patent•

Identifying candidates for job openings using a scoring function based on features in resumes and job descriptions

[...]

David Hardtke, Jacob Bollinger, Ben Martin, Eduardo Vivas

26 Oct 2012

TL;DR: In this paper, a computer-based method, and computer system, for matching candidates with job openings was proposed. And the technology more particularly relates to methods of providing a candidate with a score for a particular job opening, where the score is derived from a comparison of features in the candidate's resume with job features in a description of the job opening.

...read moreread less

Abstract: A computer-based method, and computer system, for matching candidates with job openings. The technology more particularly relates to methods of providing a candidate with a score for a particular job opening, where the score is derived from a comparison of features in the candidate's resume with job features in a description of the job opening, as well as use of external data gathered from other sources and based on information contained in the candidate's resume and/or in the description of the job opening. Particular features are weighted to take account of their significance in matching candidates to job openings in a statistical survey of such matching. The technology further provides for notifying employers that one or more high scoring candidates have been identified.

...read moreread less

Collapse