Journal Article

Secure Approximate String Matching for Privacy-Preserving Record Linkage

TLDR
This work presents a novel public-key construction for secure two-party evaluation of threshold functions in restricted domains based on embeddings found in the message spaces of additively homomorphic encryption schemes.
Abstract
Real-world applications of record linkage often require matching to be robust in spite of small variations in string fields. For example, two health care providers should be able to detect a patient in common, even if one record contains a typo or transcription error. In the privacy-preserving setting, however, the problem of approximate string matching has been cast as a trade-off between security and practicality, and the literature has mainly focused on Bloom filter encodings, an approach that can leak significant information about the underlying records. We present a novel public-key construction for secure two-party evaluation of threshold functions in restricted domains based on embeddings found in the message spaces of additively homomorphic encryption schemes. We use this to construct an efficient two-party protocol for privately computing the threshold Dice coefficient. Relative to the approach of Bloom filter encodings, our proposal offers formal security guarantees and greater matching accuracy. We implement the protocol and demonstrate the feasibility of this approach in linking medium-sized patient databases with tens of thousands of records.
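
For readers unfamiliar with the threshold Dice coefficient, the following is a minimal plaintext sketch of the function that the protocol is described as computing privately. It is not the secure protocol and involves no encryption. The character-bigram tokenization, the lowercasing, the example names, and the thresholds are assumptions made for illustration only; the abstract does not specify them. The integer comparison in `dice_match` is likewise an illustrative choice that avoids floating-point division, not a description of the paper's embedding.

```python
from __future__ import annotations


def bigrams(s: str) -> set[str]:
    """Return the set of character bigrams of s (assumed tokenization)."""
    s = s.lower()
    return {s[i:i + 2] for i in range(len(s) - 1)}


def dice(a: str, b: str) -> float:
    """Dice coefficient 2|X ∩ Y| / (|X| + |Y|) over bigram sets."""
    x, y = bigrams(a), bigrams(b)
    if not x and not y:
        return 1.0  # convention chosen here: two empty sets count as identical
    return 2 * len(x & y) / (len(x) + len(y))


def dice_match(a: str, b: str, num: int = 4, den: int = 5) -> bool:
    """Threshold test dice(a, b) >= num/den, done in integers only.

    Cross-multiplying gives den * 2|X ∩ Y| >= num * (|X| + |Y|),
    so no division is needed.
    """
    x, y = bigrams(a), bigrams(b)
    return den * 2 * len(x & y) >= num * (len(x) + len(y))


if __name__ == "__main__":
    # Hypothetical name pair with a single-letter variation.
    print(dice("jonathan", "jonathon"))
    print(dice_match("jonathan", "jonathon", num=3, den=4))
```

In a plaintext setting the threshold test reveals only a yes/no answer to whoever runs it, but both strings must be visible to that party; the point of a secure two-party protocol is to reach the same yes/no answer without either party seeing the other's record.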
