Proceedings ArticleDOI

Can Combining Demographics and Biometrics Improve De-duplication Performance?

23 Jun 2013, pp. 188-193
TL;DR: The study presents results when demographic and biometric information are processed individually and when complementary information from the two modalities is combined at the match score level for de-duplication under different operating scenarios.
Abstract: With the prevalent use of citizen databases, an individual has to prove his or her identity to access services such as banking, health care, and social welfare benefits. These databases increasingly use demographic and biometric information to uniquely identify individuals. This protects the core identity of citizens and enables them to receive the benefits and rights to which they are entitled. It is therefore important that every citizen enroll only once in the database and be assigned a single unique identifier. The de-duplication process prevents an individual from enrolling multiple times in the database, so it is essential to understand the contribution of the constituent information (demographic and biometric) to that process. Using a large database, this research attempts to fill a gap in the existing literature by analyzing the performance of demographic and biometric information for de-duplication. The study presents results when demographic and biometric information are processed individually and when complementary information from the two modalities is combined at the match score level, under different operating scenarios.
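As the citation context under References notes, the paper combines the two modalities at the match score level with a learning-based SVM fusion [4]. A minimal sketch of score-level SVM fusion in that spirit, assuming scikit-learn; the score ranges and synthetic training data below are illustrative assumptions, not the paper's setup:

```python
# Minimal sketch: score-level fusion of biometric and demographic match
# scores with an SVM. Feature layout, score ranges, and training data are
# illustrative assumptions, not the paper's data.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Each row: [biometric match score, demographic (name) match score],
# both min-max normalized to [0, 1]. Label 1 = genuine (same person).
genuine = np.column_stack([rng.uniform(0.6, 1.0, 200), rng.uniform(0.5, 1.0, 200)])
impostor = np.column_stack([rng.uniform(0.0, 0.5, 200), rng.uniform(0.0, 0.6, 200)])
X = np.vstack([genuine, impostor])
y = np.concatenate([np.ones(200), np.zeros(200)])

# Train an SVM that learns a fused decision boundary over the two scores.
clf = SVC(kernel="rbf", probability=True).fit(X, y)

# Fused score for a new (probe, gallery) comparison.
fused = clf.predict_proba([[0.72, 0.40]])[0, 1]
print(f"fused match probability: {fused:.3f}")
```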
Topics: Biometrics
Citations

Journal ArticleDOI
TL;DR: This paper proposes an adaptive sequential framework to automatically determine which subset of biometric traits and biographic information is adequate for de-duplication of a given query, evaluated on a virtual multi-biometric database of 27,000 subjects.
Abstract: Highlights: an efficient, robust, and accurate de-duplication mechanism; real-time prediction of whether additional identifiers are needed to make a decision; biographic information is utilized for only a small fraction of queries. Use of biometrics for person identification has increased tremendously over the past decade, e.g., in large-scale national identification programs, law enforcement and border control applications, and social welfare initiatives. For such large-scale applications with a diverse target population, unimodal biometric systems, which use a single biometric trait (e.g., fingerprints), are inadequate due to their limited capacity. Multimodal biometric systems, which fuse multiple biometric traits (e.g., fingerprints and face), are required for large-scale identification applications, e.g., de-duplication, where the goal is to ensure that the same person does not hold two different official credentials (e.g., national ID cards). While multimodal biometric systems offer several advantages (e.g., improvement in recognition accuracy, decrease in failure-to-enroll rate), they require large enrollment and de-duplication times. This paper proposes an adaptive sequential framework to automatically determine which subset of biometric traits and biographic information is adequate for de-duplication of a given query. An analysis of this strategy is presented on a virtual multi-biometric database of 27,000 subjects (fingerprints from the NIST SD14 dataset and face images from the PCSO dataset), along with biographic information sampled from US census data. Experimental results, using three-fold cross-validation, show that without any loss in de-duplication accuracy, on average, fingerprint capture alone is adequate for 63.18% of the 27,000 queries; both fingerprint and face are required for an additional 28.69% of queries; and only 8.13% of queries needed biographic information in addition to fingerprint and face.
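A minimal sketch of the sequential idea described above: consult an additional identifier only when the evidence so far is inconclusive. The stage ordering matches the abstract (fingerprint, then face, then biographic), but the fusion rule, thresholds, and scores are placeholder assumptions:

```python
# Minimal sketch of an adaptive sequential framework: run identifier
# stages in order and stop as soon as the fused evidence is decisive.
# Score functions and thresholds are hypothetical, not the paper's models.
from typing import Callable, Optional

def sequential_dedup(
    stages: list[tuple[str, Callable[[], float]]],
    accept: float = 0.9,
    reject: float = 0.1,
) -> tuple[Optional[bool], list[str]]:
    """Return (duplicate?, identifiers consulted)."""
    used, total = [], 0.0
    for name, score_fn in stages:
        used.append(name)
        total = (total * (len(used) - 1) + score_fn()) / len(used)  # running-mean fusion
        if total >= accept:
            return True, used    # duplicate found; no further capture needed
        if total <= reject:
            return False, used   # confidently not a duplicate
    return None, used            # still inconclusive after all identifiers

decision, consulted = sequential_dedup([
    ("fingerprint", lambda: 0.95),  # placeholder match scores
    ("face",        lambda: 0.70),
    ("biographic",  lambda: 0.60),
])
print(decision, consulted)  # True ['fingerprint'] -- face/biographic never captured
```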

11 citations


Journal ArticleDOI
01 Jan 2018, IET Biometrics
TL;DR: This work proposes the use of a graph structure to model the relationship between the biometric records in a database and shows the benefits of such a graph in deducing biographic labels of incomplete records, i.e. records that may have missing biographic information.
Abstract: A biometric system uses the physical or behavioural attributes of a person, such as face, fingerprint, iris or voice, to recognise an individual. Many operational biometric systems store the biographic information of an individual, viz., name, gender, age and ethnicity, besides the biometric data itself. Thus, the biometric record pertaining to an individual consists of both biometric data and biographic data. We propose the use of a graph structure to model the relationship between the biometric records in a database. We show the benefits of such a graph in deducing biographic labels of incomplete records, i.e. records that may have missing biographic information. In particular, we use a label propagation scheme to deduce missing values for both binary-valued biographic attributes (e.g. gender) as well as multi-valued biographic attributes (e.g. age group). Experimental results using face-based biometric records consisting of name, age, gender and ethnicity convey the pros and cons of the proposed method.
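A minimal sketch of label propagation over a record-similarity graph of the kind described above. The adjacency weights stand in for biometric match scores, and the tiny example with a binary gender attribute is entirely made up:

```python
# Minimal sketch of label propagation on a graph of biometric records,
# deducing a missing binary biographic attribute. The graph, weights,
# and labels are illustrative assumptions; -1 marks a missing label.
import numpy as np

# Symmetric adjacency: W[i, j] = biometric similarity between records i, j.
W = np.array([
    [0.0, 0.9, 0.1, 0.0],
    [0.9, 0.0, 0.2, 0.1],
    [0.1, 0.2, 0.0, 0.8],
    [0.0, 0.1, 0.8, 0.0],
])
labels = np.array([0, -1, -1, 1])  # 0 = male, 1 = female, -1 = missing

# One-hot label matrix; unlabeled rows start uniform.
F = np.full((4, 2), 0.5)
for i, l in enumerate(labels):
    if l >= 0:
        F[i] = np.eye(2)[l]

P = W / W.sum(axis=1, keepdims=True)  # row-normalized transition matrix
for _ in range(50):                   # iterate: absorb neighbors' labels
    F = P @ F
    F[labels >= 0] = np.eye(2)[labels[labels >= 0]]  # clamp known labels

print(F.argmax(axis=1))  # deduced labels, here [0 0 1 1]
```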

5 citations


Cites background or methods from "Can Combining Demographics and Biometrics Improve De-duplication Performance?"

  • ...There are two different strategies to integrate biographic data into a biometric system: (a) the biographic information can be used to filter the gallery database such that the input probe is only compared against those gallery records sharing a similar biographic profile [24], [25] and (b) the biometrics and biographics are combined at the match score level in order to improve the recognition accuracy [26]–[28]....


  • ...[28] combine biometric and biographic match scores for a de-duplication application....

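A minimal sketch of strategy (a) above, biographic filtering of the gallery before any biometric comparison; the record fields, tolerance, and matcher are illustrative assumptions:

```python
# Minimal sketch: use biographic attributes to shortlist the gallery so the
# expensive biometric matcher runs on fewer records. Fields are made up.
from dataclasses import dataclass

@dataclass
class Record:
    id: str
    gender: str
    birth_year: int
    template: bytes  # enrolled biometric template

def candidate_gallery(probe: Record, gallery: list[Record],
                      year_tolerance: int = 2) -> list[Record]:
    """Keep only gallery records with a compatible biographic profile."""
    return [
        g for g in gallery
        if g.gender == probe.gender
        and abs(g.birth_year - probe.birth_year) <= year_tolerance
    ]

# The (expensive) biometric matcher then runs only on the shortlist:
#   for g in candidate_gallery(probe, gallery):
#       score = biometric_match(probe.template, g.template)  # hypothetical
```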


Dissertation
01 Jan 2015
Abstract: The abstract of the thesis consists of three sections, namely Motivation, Chapter Organization, and Salient Contributions. The complete abstract is included with the thesis; the final section, Salient Contributions, is reproduced below. The research presents the following salient contributions:

i. A novel technique has been developed for comparing biographical information by combining the average impact of the Levenshtein, Damerau-Levenshtein, and editor distances. The impact is calculated as the ratio of the edit distance to the maximum possible edit distance between two strings of the same lengths as the given pair of strings. This impact lies in the range [0, 1] and can easily be converted to a similarity (matching) score by subtracting the impact from unity.

ii. A universal soft-computing framework is proposed for adaptively fusing biometric and biographical information by making real-time decisions, after consideration of each individual identifier, on whether computation of matching scores and subsequent fusion of additional identifiers, including biographical information, is required. The framework not only improves the accuracy of the system by fusing less reliable information (e.g., biographical information) only for instances where such fusion is required, but also improves its efficiency by computing matching scores for the various available identifiers only when this computation is considered necessary.

iii. A scientific method for comparing the efficiency of fusion strategies through a predicted effort-to-error trade-off curve.
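Contribution (i) is concrete enough to sketch: the impact of an edit distance is its ratio to the maximum possible distance for the given string lengths, and the similarity is one minus the mean impact. Only Levenshtein and its transposition-aware (Damerau/OSA) variant are averaged below; the thesis's third "editor" distance is omitted, and taking max(len(a), len(b)) as the maximum possible distance is an assumption:

```python
# Minimal sketch of the "impact"-based name similarity: average the
# normalized edit distances of several variants, then subtract from 1.
def levenshtein(a: str, b: str, transpositions: bool = False) -> int:
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
            if (transpositions and i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[m][n]

def name_similarity(a: str, b: str) -> float:
    max_dist = max(len(a), len(b)) or 1          # assumed maximum distance
    impacts = [levenshtein(a, b) / max_dist,
               levenshtein(a, b, transpositions=True) / max_dist]
    return 1.0 - sum(impacts) / len(impacts)     # similarity = 1 - mean impact

print(name_similarity("Jonathan", "Jonathon"))   # 0.875: likely same name
print(name_similarity("Jonathan", "Priya"))      # low similarity
```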

2 citations


Proceedings ArticleDOI
01 Jan 2017
TL;DR: A case study on existing de-duplication methods for passport enrolments and other such documents, which also helps identify big- and fast-data platforms for such e-governance plans by evaluating the accuracy and efficiency of existing algorithms.
Abstract: Big data is an emerging technology that is becoming an essential part of national governance. Aadhaar is the unique identification scheme of India, handled by the Unique Identification Authority of India (UIDAI), which deals with big data. Every person above the age of 5 years has to register their demographic details (name, date of birth, address, and phone number) and biometric details (10 fingerprints and both irises), and these details are then used to verify the person's authenticity whenever he or she requires services. A passport is a legal document carried by a person travelling between countries, but in the case of older passports with no biometric data, a person may hold more than one legal passport under different demographic details. This paper presents a case study on existing de-duplication methods for passport enrolments and other such documents. For newer passports, linking with Aadhaar at the time of registration takes 10 days; hence, the aim is to reduce the processing time of linking and verification. String-matching algorithms are used to compare the demographics, and techniques such as genetic programming and hashing are used for de-duplication. The case study also helps identify big- and fast-data platforms for such e-governance plans by evaluating the accuracy and efficiency of existing algorithms. The proposed system aims to predict duplication of passports by linking Aadhaar and passport details, and to reduce the processing time of the Aadhaar database by using parallel algorithms.
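A minimal sketch of hashing-style blocking of the kind mentioned above: records are bucketed by a cheap demographic key so that string comparisons run only within a bucket. The fields, key, and threshold are illustrative assumptions:

```python
# Minimal sketch of hash-based blocking for demographic de-duplication.
# Record fields, the blocking key, and the 0.85 threshold are made up.
from collections import defaultdict
from difflib import SequenceMatcher

records = [  # made-up demographic records
    {"id": "P1", "name": "Ravi Kumar",  "dob": "1985-03-12"},
    {"id": "P2", "name": "Ravi Kumaar", "dob": "1985-03-12"},
    {"id": "P3", "name": "Anita Singh", "dob": "1990-07-01"},
]

def blocking_key(rec: dict) -> str:
    # Bucket by date of birth plus the name's first letter.
    return rec["dob"] + ":" + rec["name"][0].upper()

buckets: dict[str, list[dict]] = defaultdict(list)
for rec in records:
    buckets[blocking_key(rec)].append(rec)

# Compare name strings only within each bucket, never across buckets.
for bucket in buckets.values():
    for i in range(len(bucket)):
        for j in range(i + 1, len(bucket)):
            a, b = bucket[i], bucket[j]
            sim = SequenceMatcher(None, a["name"], b["name"]).ratio()
            if sim > 0.85:
                print(f"possible duplicate: {a['id']} ~ {b['id']} ({sim:.2f})")
```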

1 citation


References

Book
01 Jan 2000
TL;DR: This is the first comprehensive introduction to Support Vector Machines (SVMs), a new generation learning system based on recent advances in statistical learning theory, and will guide practitioners to updated literature, new applications, and on-line software.
Abstract: From the publisher: This is the first comprehensive introduction to Support Vector Machines (SVMs), a new generation learning system based on recent advances in statistical learning theory. SVMs deliver state-of-the-art performance in real-world applications such as text categorisation, hand-written character recognition, image classification, biosequences analysis, etc., and are now established as one of the standard tools for machine learning and data mining. Students will find the book both stimulating and accessible, while practitioners will be guided smoothly through the material required for a good grasp of the theory and its applications. The concepts are introduced gradually in accessible and self-contained stages, while the presentation is rigorous and thorough. Pointers to relevant literature and web sites containing software ensure that it forms an ideal starting point for further study. Equally, the book and its associated web site will guide practitioners to updated literature, new applications, and on-line software.

13,269 citations


"Can Combining Demographics and Biom..." refers methods in this paper

  • ...The analysis is conducted using the proposed learning-based Support Vector Machine (SVM) [4] fusion algorithm, which combines information from two modalities at the match score level....





Book
10 Mar 2005
TL;DR: This unique reference work is an absolutely essential resource for all biometric security professionals, researchers, and systems administrators.
Abstract: A major new professional reference work on fingerprint security systems and technology from leading international researchers in the field. The handbook provides authoritative and comprehensive coverage of all major topics, concepts, and methods for fingerprint security systems. This unique reference work is an absolutely essential resource for all biometric security professionals, researchers, and systems administrators.

3,730 citations


"Can Combining Demographics and Biom..." refers background in this paper

  • ...To know more about how fingerprints can be faked/spoofed, readers are directed to [11, 17]....


  • ...Fingerprint recognition is one of the oldest and most well-known biometrics, used in several applications because of its uniqueness and consistency over time [11]....



Journal ArticleDOI
TL;DR: This paper presents an extensive set of duplicate detection algorithms that can detect approximately duplicate records in a database and covers similarity metrics that are commonly used to detect similar field entries.
Abstract: Often, in the real world, entities have two or more representations in databases. Duplicate records do not share a common key and/or they contain errors that make duplicate matching a difficult task. Errors are introduced as the result of transcription errors, incomplete information, lack of standard formats, or any combination of these factors. In this paper, we present a thorough analysis of the literature on duplicate record detection. We cover similarity metrics that are commonly used to detect similar field entries, and we present an extensive set of duplicate detection algorithms that can detect approximately duplicate records in a database. We also cover multiple techniques for improving the efficiency and scalability of approximate duplicate detection algorithms. We conclude with coverage of existing tools and with a brief discussion of the big open problems in the area.
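One classic scalability technique of the kind this survey covers is the sorted-neighborhood method: sort the records by a key, then compare only records inside a sliding window. A minimal sketch, with the key choice and window size as illustrative assumptions:

```python
# Minimal sketch of the sorted-neighborhood method: after sorting by a key,
# only records within a fixed-size window are compared, turning O(n^2)
# pairwise comparison into roughly O(n * window). Data is made up.
records = ["smith john 1970", "smyth john 1970", "doe jane 1988",
           "smith johm 1970", "doe jane 1989"]

window = 3
ordered = sorted(records)  # sort by a (here: whole-string) key
for i, rec in enumerate(ordered):
    for other in ordered[i + 1 : i + window]:
        # A real system would apply a field-level similarity test here;
        # printing the candidate pair stands in for that comparison.
        print("candidate pair:", rec, "|", other)
```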

1,643 citations




"Can Combining Demographics and Biom..." refers background in this paper

  • ...Deduplication has been extensively studied by researchers in database and information management systems [7, 9]....



Network Information
Related Papers (5)
  • 30 Oct 2006 (Svetlana Yanushkevich)

  • 18 Sep 2006, IEEE Aerospace and Electronic Systems Magazine (Marcos Faundez-Zanuy, J. Fierrez-Aguilar +2 more)

  • 22 Jan 2009 (Sitalakshmi Venkatraman)

Performance Metrics
No. of citations received by the paper in previous years:

Year  Citations
2018  1
2017  1
2016  1
2015  1