Open Access Journal Article
Fast approximate string matching with finite automata
Abstract:
We present a fast algorithm for finding approximate matches of a string in a finite-state automaton, given some metric of similarity. The algorithm can be adapted to use a variety of metrics for determining the distance between two words.
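To make the idea of approximate matching against an automaton concrete, the following is a minimal sketch, not the algorithm of the paper. It assumes a trie (an acyclic deterministic automaton) stored as nested dictionaries, unit-cost Levenshtein distance as the similarity metric, and a simple depth-first traversal with pruning. The names `build_trie` and `fuzzy_search` and the `"$"` final-state marker are hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple


def build_trie(words: List[str]) -> Dict:
    """Build a trie (an acyclic deterministic automaton) as nested dicts."""
    root: Dict = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node["$"] = True  # final-state marker; assumes "$" is not in the alphabet
    return root


def fuzzy_search(trie: Dict, query: str, max_dist: int) -> List[Tuple[str, int]]:
    """Return (word, distance) pairs for all words within max_dist of query."""
    results: List[Tuple[str, int]] = []
    # Row 0 of the edit-distance table: distance from the empty prefix
    # to each prefix of the query.
    first_row = list(range(len(query) + 1))

    def walk(node: Dict, prefix: str, prev_row: List[int]) -> None:
        # A final state whose last cell is within the bound is a match.
        if "$" in node and prev_row[-1] <= max_dist:
            results.append((prefix, prev_row[-1]))
        for ch, child in node.items():
            if ch == "$":
                continue
            # Extend the table by one row for the transition labelled ch.
            row = [prev_row[0] + 1]
            for j in range(1, len(query) + 1):
                cost = 0 if query[j - 1] == ch else 1
                row.append(min(row[j - 1] + 1,          # skip a query symbol
                               prev_row[j] + 1,         # skip an automaton symbol
                               prev_row[j - 1] + cost)) # match or substitute
            # Prune: if every cell exceeds the bound, no extension can match.
            if min(row) <= max_dist:
                walk(child, prefix + ch, row)

    walk(trie, "", first_row)
    return sorted(results)


trie = build_trie(["cat", "cart", "card", "dog", "cot"])
print(fuzzy_search(trie, "cat", 1))  # [('cart', 1), ('cat', 0), ('cot', 1)]
```

Swapping the unit costs in the inner loop for another cost function is one way to plug in a different metric, provided the costs remain non-negative so that the pruning test stays valid.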
Citations
Book Chapter
HFST—Framework for Compiling and Applying Morphologies
TL;DR: HFST (Helsinki Finite-State Technology) offers a path from language descriptions to efficient language applications in key environments and operating systems, and provides an opportunity to exchange transducers between different software providers in order to get the best out of each finite-state library.
Proceedings Article
Correcting noisy OCR: context beats confusion
John Evershed, Kent Fitch +1 more
TL;DR: This work describes a system for automatic post-OCR text correction of digital collections of historical texts, which uses a "noisy channel" approach and shows good improvements in word error rate.
Proceedings Article
Arabic Word Generation and Modelling for Spell Checking
TL;DR: This work creates an adequate, open-source and large-coverage word list for Arabic containing 9,000,000 fully inflected surface words and creates a character-based tri-gram language model to approximate knowledge about permissible character clusters in Arabic, creating a novel method for detecting spelling errors.
Finite-State Spell-Checking with Weighted Language and Error Models
Tommi A. Pirinen, Krister Lindén +1 more
TL;DR: This paper uses a freely available open-source implementation of Finnish morphology, made with traditional finite-state morphology tools, and demonstrates rapid building of Northern Sámi and English spell checkers from tools and resources available from the Internet.
Proceedings Article
Effective Spell Checking Methods Using Clustering Algorithms
TL;DR: A novel approach to spell checking using dictionary clustering that combines the application of anomalous pattern initialization and partition around medoids (PAM) and an English misspelling list compiled using real examples extracted from the Birkbeck spelling error corpus is presented.
References
Journal Article
A Formal Basis for the Heuristic Determination of Minimum Cost Paths
TL;DR: How heuristic information from the problem domain can be incorporated into a formal mathematical theory of graph searching is described and an optimality property of a class of search strategies is demonstrated.
Book
The Design and Analysis of Computer Algorithms
Alfred V. Aho, John E. Hopcroft +1 more
TL;DR: This text introduces the basic data structures and programming techniques often used in efficient algorithms, and covers use of lists, push-down stacks, queues, trees, and graphs.
Journal Article
Depth-First Search and Linear Graph Algorithms
TL;DR: The value of depth-first search or “backtracking” as a technique for solving problems is illustrated by two examples, one of which is an improved version of an algorithm for finding the strongly connected components of a directed graph.
Book
Heuristics: Intelligent Search Strategies for Computer Problem Solving
TL;DR: This book presents, characterizes, and analyzes problem-solving strategies that are guided by heuristic information.
Book
Finite State Morphology
TL;DR: This volume is a practical guide to finite-state theory and the affiliated programming languages lexc and xfst, and readers will learn how to write tokenizers, spelling checkers, and especially morphological analyzer/generators for words in English, French, Finnish, Hungarian and other languages.