Journal Article

Getty's Synoname™ and its cousins: A survey of applications of personal name-matching algorithms

TL;DR: Personal name‐matching techniques may be included in name authority work, information retrieval, or duplicate detection, with some applications matching on name only, and others combining personal names with other data elements in record linkage techniques.
Abstract
The study reported in this article was commissioned by the Getty Art History Information Program (AHIP) as a background investigation of personal name-matching programs in fields other than art history, for purposes of comparing them and their approaches with AHIP's Synoname™ project. We review techniques employed in a variety of applications, including art history, bibliography, genealogy, commerce, and government, providing a framework of personal name characteristics, factors in selecting matching techniques, and types of applications. Personal names, as data elements in information systems, vary for a wide range of legitimate reasons, including cultural and historical traditions, translation and transliteration, reporting and recording variations, as well as typographical and phonetic errors. Some matching applications seek to link variants, while others seek to correct errors. The choice of matching techniques will vary in the amount of domain knowledge about the names that is incorporated, the sources of data, and the human and computing resources required. Personal name-matching techniques may be included in name authority work, information retrieval, or duplicate detection, with some applications matching on name only, and others combining personal names with other data elements in record linkage techniques. We discuss both phonetic- and pattern-matching techniques, reviewing a range of implemented and proposed name-matching techniques in the context of these factors. © 1992 John Wiley & Sons, Inc.
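
The contrast the abstract draws between phonetic and pattern matching can be illustrated with a minimal sketch. It uses American Soundex as a stand-in for a phonetic technique and Levenshtein edit distance as a stand-in for a pattern-matching technique; both are chosen here only because they are widely known, and neither is presented as Synoname's method or as the article's own implementation. The function names `soundex` and `edit_distance` are hypothetical.

```python
# Illustrative sketch only: Soundex (phonetic) and Levenshtein distance
# (pattern matching) are assumed examples, not code from the article.

# Build the Soundex consonant table: letters in the same group share a digit.
CODES = {}
for digit, letters in enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1):
    for ch in letters:
        CODES[ch] = str(digit)


def soundex(name: str) -> str:
    """Return the four-character American Soundex code for an ASCII name."""
    letters = [c for c in name.lower() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    out = [first.upper()]
    prev = CODES.get(first, "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            out.append(code)
        prev = code  # a vowel resets prev, so a repeated code is kept
    return ("".join(out) + "000")[:4]


def edit_distance(a: str, b: str) -> int:
    """Return the Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    for x, y in [("Smith", "Smyth"), ("Robert", "Rupert")]:
        print(x, y, soundex(x), soundex(y), edit_distance(x, y))
```

In this sketch, "Smith" and "Smyth" receive the same phonetic code and differ by a single edit, so both approaches would link them. A phonetic code groups spellings that sound alike regardless of how many characters differ, while an edit distance tolerates typographical slips regardless of pronunciation, which is one reason an application might combine the two.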


Citations
Journal Article

Why are online catalogs still hard to use?

TL;DR: This article discusses the problems with query matching systems, which were designed for skilled search intermediaries rather than end‐users, and the knowledge and skills these systems require in the information‐seeking process, illustrated with examples of searching card and online catalogs.
Proceedings Article

A Comparison of Personal Name Matching: Techniques and Practical Issues

TL;DR: The characteristics of personal names are discussed, potential sources of variations and errors are presented, and a comprehensive range of commonly used, as well as some recently developed, name matching techniques is reviewed.
Journal Article

Data integration using similarity joins and a word-based information representation language

TL;DR: WHIRL is described, a “soft” database management system which supports “similarity joins,” based on certain robust, general-purpose similarity metrics for text, which enables fragments of text to be used as keys.
Journal Article

Finding approximate matches in large lexicons

TL;DR: This paper shows how to use string matching techniques in conjunction with lexicon indexes to find approximate matches in a large lexicon, and proposes methods for combining these techniques, and shows experimentally that these combinations yield good retrieval effectiveness while keeping index size and retrieval time low.
Book

Publishing and Using Cultural Heritage Linked Data on the Semantic Web

TL;DR: This book gives an overview of why, when, and how Linked (Open) Data and Semantic Web technologies can be employed in practice to publish cultural heritage (CH) collections and other content on the Web, and motivates and presents a general semantic portal model and publishing framework as a solution approach to distributed semantic content creation, based on an ontology infrastructure.