
Showing papers on "Feature (machine learning) published in 1978"


Journal ArticleDOI
TL;DR: A tutorial survey of techniques for using contextual information in pattern recognition is presented, with emphasis on the problems of image classification and text recognition, where the text is in the form of machine and handprinted characters, cursive script, and speech.

119 citations
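One contextual technique covered by surveys of this kind is combining per-character classifier scores with letter-transition statistics. The sketch below is a minimal illustration under invented numbers (the alphabet, likelihoods, and bigram table are all hypothetical), not an algorithm taken from the survey itself.

```python
# Contextual character recognition sketch: per-position classifier likelihoods
# are combined with bigram letter statistics via Viterbi decoding.
import numpy as np

alphabet = ["a", "e", "o", "n"]
n_sym = len(alphabet)

# Hypothetical per-position likelihoods P(image_t | letter) from an isolated
# character classifier (rows: positions in the word, columns: alphabet).
likelihood = np.array([
    [0.30, 0.05, 0.60, 0.05],   # looks most like 'o'
    [0.10, 0.10, 0.10, 0.70],   # clearly 'n'
    [0.25, 0.60, 0.10, 0.05],   # looks most like 'e'
])

# Hypothetical unigram priors and bigram transition probabilities
# P(letter_t | letter_{t-1}), standing in for language statistics.
prior = np.array([0.35, 0.30, 0.20, 0.15])
bigram = np.array([
    [0.10, 0.20, 0.20, 0.50],
    [0.30, 0.10, 0.20, 0.40],
    [0.20, 0.20, 0.10, 0.50],
    [0.40, 0.30, 0.25, 0.05],
])

def viterbi(likelihood, prior, bigram):
    """Most probable letter sequence given shape scores plus bigram context."""
    T = likelihood.shape[0]
    delta = np.log(prior) + np.log(likelihood[0])
    back = np.zeros((T, n_sym), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + np.log(bigram) + np.log(likelihood[t])[None, :]
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0)
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return [alphabet[i] for i in reversed(path)]

print("context-free decision:", [alphabet[i] for i in likelihood.argmax(axis=1)])
print("contextual decision:  ", viterbi(likelihood, prior, bigram))
```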


Book ChapterDOI
01 Jan 1978
TL;DR: The concept of data structures is a continuation of the initial notion of “feature” from statistical pattern recognition—that is, a simple clue helpful in decision-making—to include multilevel or hierarchical classification processes: “understanding”.
Abstract: The term “data structures”(1) is well-established in computer science, conveying the idea of how tables and lists are stored, but it is a relatively new concept in pattern recognition. In pattern recognition, the concept of data structures is a continuation of the initial notion of “feature” from statistical pattern recognition(2)—that is, a simple clue helpful in decision-making—to include: 1. Nonstatistical or descriptive pattern recognition methodologies(3): Linguistic, syntactic, and structural methods. 2. Sequential calculations: (a) Feature extraction depending on decision to be made. (b) Multilevel or hierarchical classification processes: “understanding”(4,5).

58 citations


Journal ArticleDOI
TL;DR: An error-correcting syntax analyzer for tree languages with substitution errors, called a structure-preserved error-correcting tree automaton (ECTA), is studied.
Abstract: The syntax errors on trees are defined in terms of five types of error transformations, namely, substitution, stretch, split, branch, and deletion. The distance between two trees is the least cost sequence of error transformations needed to transform one to the other. Based on this definition, a class of error-correcting tree automata (ECTA) is proposed. The operation of ECTA is illustrated by a character recognition example.

52 citations
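For the structure-preserved case named in the TL;DR (substitution errors only), the tree distance reduces to summing substitution costs over corresponding nodes of two trees with identical shape. The sketch below illustrates just that reduced case, with invented labels and a unit cost; the paper's full model also allows stretch, split, branch, and deletion transformations.

```python
# Substitution-only tree distance between structurally identical labelled trees.
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)

def substitution_distance(a: Node, b: Node, cost=lambda x, y: 0 if x == y else 1):
    """Distance between two trees of identical shape under substitution errors only."""
    if len(a.children) != len(b.children):
        raise ValueError("trees must be structure-preserved (same shape)")
    d = cost(a.label, b.label)
    for ca, cb in zip(a.children, b.children):
        d += substitution_distance(ca, cb, cost)
    return d

# A reference tree describing a character as primitives, and a noisy observation
# in which one primitive label was substituted.
reference = Node("char", [Node("vertical"), Node("horizontal"), Node("arc")])
observed  = Node("char", [Node("vertical"), Node("horizontal"), Node("dot")])
print(substitution_distance(reference, observed))   # 1 substitution error
```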


Journal ArticleDOI
TL;DR: A zero-order Markov feature prediction model is postulated and applied to South Saskatchewan River flow data in an effort to demonstrate its usefulness in real situations.
Abstract: It is reasonable to consider that sequences of hydrologic data corresponding to daily, weekly, or monthly measurements occur in well-defined groups. These groups possess collective properties of the data forming them. Such a collection of properties can be called a hydrologic pattern. A pattern is a description of an object, and the objects of concern in this paper are groups of data on hydrologic phenomena observed at regular time intervals. Hydrologic patterns describing each of these groups are expressed by n appropriate properties. Further, the dimensionality, n in number, can be reduced by considering only those characteristic properties, m in number (m≤n), that are common in all hydrologic patterns of the same category. These m characteristic properties are called features. A procedure is presented to extract information present within patterns and among patterns of the pertinent hydrologic data. In addition, on the basis of the above information, a zero-order Markov feature prediction model is postulated. The basic assumptions of the model and their implications are presented. The model is applied to South Saskatchewan River flow data in an effort to demonstrate its usefulness in real situations.

30 citations
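A rough reading of the pipeline (grouping a series into patterns, describing each pattern by n properties, keeping m characteristic features, and predicting with a zero-order, i.e. memoryless, Markov model of feature classes) can be sketched as follows. Everything here is assumed for illustration: the synthetic flow series, the variance-based choice of the m features, and the three-way quantization are stand-ins, not the paper's formulation.

```python
# Segment a flow series into groups, describe each group by candidate
# properties, keep m of them as features, and predict the next group's feature
# class from marginal (zero-order Markov) class frequencies.
import numpy as np

rng = np.random.default_rng(0)
flow = rng.gamma(shape=2.0, scale=50.0, size=360)      # synthetic daily flows
groups = flow.reshape(12, 30)                          # 12 "monthly" patterns

# n = 4 candidate properties per pattern
props = np.column_stack([groups.mean(1), groups.std(1),
                         groups.max(1), groups.min(1)])

# keep the m = 2 most variable properties as features (a stand-in criterion
# for "characteristic properties common to all patterns")
m = 2
features = props[:, np.argsort(props.std(0))[-m:]]

# zero-order Markov prediction for the first feature: quantize past values
# into low/medium/high and predict the most frequent class, with no
# dependence on the preceding pattern
f = features[:, 0]
edges = np.quantile(f, [1 / 3, 2 / 3])
labels = np.digitize(f, edges)                         # 0, 1 or 2
counts = np.bincount(labels, minlength=3)
print("class relative frequencies:", counts / counts.sum())
print("zero-order prediction for next pattern: class", counts.argmax())
```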


Patent
Robert K. Brayton
26 Jun 1978
TL;DR: In this article, an automatic character recognition system and method for identifying an unknown character which is one of a class of known characters is presented. The system is set up using known specimen characters from a large character training set which must first be selected based on the use to which the character recognition will be put.
Abstract: An automatic character recognition system and method for identifying an unknown character which is one of a class of known characters. The system is set up using known specimen characters from a large character training set which must first be selected based on the use to which the character recognition will be put. Using the selected set, features and shapes of the character vocabulary in the set are obtained using selected feature scan parameters, are processed as a plurality of representative and normalized pieces of curves and are then stored in the form of binary coded representations. The system set-up also includes selecting and storing canonic shape parameters. The canonic shapes are separate pieces or segments of lines and curves which are selected on the basis that their shapes can be found as component parts or within a significant number of the characters within the character set. Having selected a character training set, selected and stored feature scan parameters and selected and stored canonic shape parameters, the system set-up procedure is completed. The next procedure is referred to as "system training" in which the individual characters within the large character training set are processed with the prior knowledge of the identity of each character being processed. This consists of an individual curve following for each of the plurality of characters on the character set and recording the path coordinates resulting from the curve following operation. The path coordinates are matched against the stored canonic shape parameters and the "best match" features are encoded. Statistical tables are then formed based on the best match relationships between the known training characters and the canonic shapes. The character recognition system is now capable of hereinafter operating with and identifying unknown characters belonging to the recognition space character set. In this procedure, the unknown character is examined using feature scan parameters to extract the features from the unknown character, providing complex vectors of the measured path coordinates of the extracted features which are matched against the stored canonic shape parameters by computing complex inner products, and a best match feature is determined. Finally, a plurality of row vectors are extracted from the statistical tables and combined to form a product vector. The largest component of the product vector is selected, and the column index j of the maximum component is noted. The unknown character is then identified as being a member of the character membership class whose column index is j.

28 citations
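The two matching stages the patent describes (complex inner products against stored canonic shapes, then combination of statistical-table row vectors into a product vector whose largest component names the class) can be sketched roughly as below. The canonic shapes, the statistical table, and the scoring by the real part of the inner product are all invented for illustration and are not taken from the patent.

```python
# Match curve-following features to canonic shapes via complex inner products,
# then combine statistical-table rows into a product vector and take its argmax.
import numpy as np

def as_complex(path):
    """Encode a curve-following path (columns: x, y) as a centred, unit-norm complex vector."""
    z = path[:, 0] + 1j * path[:, 1]
    z = z - z.mean()
    return z / np.linalg.norm(z)

t = np.linspace(0.0, 1.0, 32)
canonic_shapes = {                               # invented shape primitives
    "vertical_bar": as_complex(np.column_stack([np.zeros_like(t), t])),
    "horizontal_bar": as_complex(np.column_stack([t, np.zeros_like(t)])),
    "left_arc": as_complex(np.column_stack([np.cos(np.pi * t), np.sin(np.pi * t)])),
}
shape_names = list(canonic_shapes)

def best_match(feature_path):
    """Canonic shape maximizing the (real part of the) complex inner product with the feature."""
    f = as_complex(feature_path)
    scores = {name: np.real(np.vdot(c, f)) for name, c in canonic_shapes.items()}
    return max(scores, key=scores.get)

# Invented statistical table: rows follow shape_names, columns are characters,
# entries stand for how strongly each canonic shape indicates each character.
characters = ["I", "T", "C"]
table = np.array([
    [0.80, 0.55, 0.05],   # vertical_bar
    [0.05, 0.60, 0.10],   # horizontal_bar
    [0.05, 0.05, 0.85],   # left_arc
])

# Two features extracted from an unknown character by curve following:
# one nearly vertical piece and one nearly horizontal piece.
unknown_features = [
    np.column_stack([0.02 * np.sin(7 * t), t]),
    np.column_stack([t, 0.03 * np.sin(5 * t)]),
]

product = np.ones(len(characters))               # product vector over matched shapes
for path in unknown_features:
    product *= table[shape_names.index(best_match(path))]
j = int(product.argmax())                        # column index of the largest component
print("identified as:", characters[j])
```

With the two extracted features matching a vertical and a horizontal bar, the product vector peaks at the column for "T".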


Journal ArticleDOI
Fehlauer, Eisenstein
TL;DR: A feature extraction technique based on a new criterion for "declustering" is presented that enhances class separation and is robust over a wide range of measurement statistics.
Abstract: A feature extraction technique based on a new criterion for "declustering" is presented. Declustering occurs when sample vectors from one pattern class form a densely packed point constellation, or cluster, in feature space while vectors from another class do not form a cluster but instead array themselves as outliers. Features chosen to optimize the declustering criterion enhance class separation and are robust over a wide range of measurement statistics.

28 citations
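The declustering idea (one class packed into a tight cluster, the other spread out as outliers) can be illustrated with a stand-in criterion: choose a linear feature w that maximizes the scatter of class B about class A's mean relative to class A's own scatter, which reduces to a generalized eigenvalue problem. The criterion and data below are assumptions for illustration, not necessarily the paper's exact formulation.

```python
# Stand-in declustering criterion: maximize (w' S_B w) / (w' S_A w), where
# S_A is class A's within-cluster scatter and S_B is class B's scatter about
# class A's mean.
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(0.0, 0.3, size=(200, 2))                    # compact cluster
B = rng.normal(0.0, 0.3, size=(200, 2)) + \
    rng.choice([-3.0, 3.0], size=(200, 1)) * np.array([[1.0, 0.2]])  # outliers

mu_A = A.mean(axis=0)
S_A = np.cov(A.T)                                          # within-cluster scatter of A
D = B - mu_A
S_B = D.T @ D / len(B)                                     # scatter of B about A's mean

# leading eigenvector of inv(S_A) S_B maximizes the ratio
eigvals, eigvecs = np.linalg.eig(np.linalg.solve(S_A, S_B))
w = np.real(eigvecs[:, np.argmax(np.real(eigvals))])

print("declustering feature direction:", w / np.linalg.norm(w))
print("criterion value:", (w @ S_B @ w) / (w @ S_A @ w))
```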


Journal ArticleDOI
David H. Foster
TL;DR: Subjects made Same—Different judgements on pairs of briefly presented random-dot patterns: they had to judge in separate experiments either whether the members of each pair were identical in shape or whether the number of dots in each pattern was the same.
Abstract: Subjects made Same–Different judgements on pairs of briefly presented random-dot patterns: they had to judge in separate experiments either whether the members of each pair were identical in shape or whether the number of dots in each pattern was the same. When one pattern was the rotated version of the other, the proportion of Same responses varied with the angle of rotation in the same way for the two types of judgement. From these and other data obtained with pattern pairs in which members differed in shape and in dot-number, the following inferences are made. First, in making both kinds of Same judgements, a fixed visual association is established between local features (dot-clusters within the pattern) and certain spatial relations between these local features. Thus when spatial-relation information is in principle irrelevant to the pattern-comparison task, as in judgements of dot-number, this information is not separated from the relevant local-feature information in the pattern representation. Sec...

24 citations


Journal ArticleDOI
C.H. Chen
TL;DR: The fundamental problems in automatic recognition of seismic events are examined with particular emphasis on feature extraction and digital signal processing, together with a discussion of geophysical features.

23 citations


Journal ArticleDOI
E. Bloch, D. Galage
TL;DR: Anyone concerned with understanding the past, present, and future of high-speed computing must of necessity focus on progress in the components field.
Abstract: Anyone concerned with understanding the past, present, and future of high-speed computing must of necessity focus on progress in the components field. He must formulate searching questions about the outlook of that technology and how it affects progress in high-speed computer development.

16 citations


Journal ArticleDOI
TL;DR: The methodology suggested in the paper provides a structural pattern recognition generalization to phrase-structured syntactic pattern recognition.

15 citations


Book ChapterDOI
01 Jan 1978
TL;DR: In this paper, a system for inference on partial information, which uses production rules, is described, where only a partial match of the antecedent of a rule is needed, and some selected extensions of the system are presented.
Abstract: Some extensions of a system for inference on partial information, which uses production rules, are described. Basically, the system consists of RULES, an active set of rules (a subset of a potentially large set of rules), partially ordered by specificity, and FACTS, a small active set of facts (a subset of a potentially large set of data base facts). The critical feature of the inference method is that only a partial match of the antecedent of a rule is needed. Some selected extensions of the system are presented. These concern some approaches to the problems of selecting from an ambiguous response and, more importantly, transforming or dynamic clustering of FACTS and RULES. This latter problem is important because partial match is defined over the sets RULES and FACTS, and unless these sets are reasonably small, partial match can be an unmanageable operation. Several issues concerning the use of this inference system in certain applications are also briefly discussed.
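A minimal sketch of partial-match firing, with invented RULES and FACTS: a rule may fire when only a fraction of its antecedent is present in FACTS, and among the rules above the match threshold the most specific one (largest antecedent) is preferred, echoing the specificity ordering mentioned in the abstract. The threshold and the tie-breaking policy are assumptions.

```python
# Partial-match inference over small RULES and FACTS sets.
RULES = [
    # (antecedent facts, consequent)
    ({"fever", "cough", "fatigue", "aches"}, "suspect influenza"),
    ({"fever", "cough"}, "suspect respiratory infection"),
    ({"fatigue"}, "suggest rest"),
]
FACTS = {"fever", "fatigue", "aches"}

def partial_match(rules, facts, threshold=0.6):
    candidates = []
    for antecedent, consequent in rules:
        matched = len(antecedent & facts) / len(antecedent)
        if matched >= threshold:
            # record (specificity, match fraction, consequent)
            candidates.append((len(antecedent), matched, consequent))
    if not candidates:
        return None
    # prefer the most specific rule, then the higher match fraction
    return max(candidates)[2]

print(partial_match(RULES, FACTS))   # fires the influenza rule on a 3/4 antecedent match
```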


Journal ArticleDOI
TL;DR: A recognizer of isolated words spoken in the Italian language is presented; it is based on a classification of speech units into broad phonetic classes, followed by a classification into more detailed classes if ambiguities remain.

Journal ArticleDOI
Kashyap, Mittal
TL;DR: A method of recognizing isolated words and phrases from a given vocabulary spoken by any member in a given group of speakers, the identity of the speaker being unknown to the system is described.
Abstract: We describe a method of recognizing isolated words and phrases from a given vocabulary spoken by any member in a given group of speakers, the identity of the speaker being unknown to the system. The word utterance is divided into 20-30 nearly equal frames, frame boundaries being aligned with glottal pulses for voiced speech. A constant number of pitch periods are included in each frame. Statistical decision rules are used to determine the phoneme in each frame. Using the string of phonemes from all the frames of the utterance, a word decision is obtained using (phonological) syntactic rules. The syntactic rules used here are of 2 types, namely, 1) those obtained from the theory of word construction from phonemes in English as applied to our vocabulary, 2) those used to correct possible errors in phonemic decisions obtained earlier based on the decisions of neighboring segments. In our experiment, the vocabulary had 40 words, consisting of many pairs of words which are phonemically close to each other. The number of speakers was 6. The identity of the speaker is not known to the system. In testing 400 word utterances, the recognition rate was about 80 percent for phonemes (for 11 phonemes), but the word recognition was 98.1 percent correct. Phonological-syntactic rules played an important role in upgrading the word recognition rate over the phoneme recognition rate.
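The decision chain described above (per-frame phoneme decisions, correction of phonemic errors from the decisions of neighbouring segments, then a word decision) can be sketched roughly as follows. The frame labels, the single correction rule, and the tiny lexicon with its phonemic forms are all invented; the paper's statistical decision rules and phonological-syntactic rules are richer than this.

```python
# Per-frame phoneme labels -> neighbour-based error correction -> word decision.
from difflib import SequenceMatcher

# hypothetical per-frame phoneme decisions for one utterance of "nine"
frames = ["n", "n", "a", "a", "e", "a", "a", "n", "n", "n"]

def correct_isolated_errors(frames):
    """Relabel any single frame that disagrees with both of its neighbours."""
    out = list(frames)
    for i in range(1, len(frames) - 1):
        if frames[i - 1] == frames[i + 1] != frames[i]:
            out[i] = frames[i - 1]
    return out

def collapse(frames):
    """Merge runs of identical frame labels into a phoneme string."""
    out = []
    for p in frames:
        if not out or out[-1] != p:
            out.append(p)
    return "".join(out)

lexicon = {"nine": "nan", "five": "fav", "one": "wan"}    # invented phonemic forms

phonemes = collapse(correct_isolated_errors(frames))
best = max(lexicon, key=lambda w: SequenceMatcher(None, phonemes, lexicon[w]).ratio())
print(phonemes, "->", best)
```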

Journal ArticleDOI
TL;DR: The concept of irrelevant features in Bayesian models for pattern recognition is introduced, and its mathematical meaning is explained.
Abstract: The concept of irrelevant features in Bayesian models for pattern recognition is introduced, and its mathematical meaning is explained. A technique for computing the conditional probabilities of irrelevant features, if necessary, is described. The effect of irrelevant features on feature selection in sequential classification is discussed and illustrated.
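The mathematical point is easy to see numerically: if a feature's class-conditional probabilities are identical for every class, the term cancels when the posterior is normalized, so the feature is irrelevant to the Bayes decision. The numbers below are invented for illustration.

```python
# An irrelevant feature leaves the Bayes posterior unchanged.
import numpy as np

priors = np.array([0.5, 0.3, 0.2])                 # P(class)
p_x1 = np.array([0.70, 0.20, 0.40])                # P(x1 = observed | class)
p_x2 = np.array([0.30, 0.30, 0.30])                # P(x2 = observed | class): identical across classes

def posterior(likelihoods):
    joint = priors * np.prod(likelihoods, axis=0)
    return joint / joint.sum()

print("using x1 only:   ", posterior(np.array([p_x1])))
print("using x1 and x2: ", posterior(np.array([p_x1, p_x2])))   # unchanged
```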

Journal ArticleDOI
TL;DR: One of the most difficult problems in speaker recognition is that the feature parameters frequently vary after a long time interval; this effect is examined for two recognition methods, one using the time pattern of both the fundamental frequency and log‐area‐ratio parameters and the other using several kinds of statistical features derived from them.
Abstract: One of the most difficult problems in speaker recognition is that the feature parameters frequently vary after a long time interval. We examined this effect on two kinds of speaker recognition; one uses the time pattern of both the fundamental frequency and log‐area‐ratio parameters and the other uses several kinds of statistical features derived from them. Results of speaker recognition experiments revealed that the long‐term variation effects have a great influence on both recognition methods, but are more evident in recognition using statistical parameters. In order to reduce the error rate after a long interval, it is desirable to collect learning samples of each speaker over a long period and measure the weighted distance based on the long‐term variability of the feature parameters. When the learning samples are collected over a short period, it is effective to apply spectral equalization using the spectrum averaged over all the voiced portions of the input speech. By this method, an accuracy of 95% can be obtained in speaker verification even after five years using statistical parameters of a spoken word.
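The weighted distance mentioned in the abstract (down-weighting parameters by their long-term variability) can be sketched as below. The session data, the specific features, and the use of the per-dimension standard deviation as the weight are assumptions for illustration.

```python
# Variability-weighted distance: dimensions that drift over long intervals
# are down-weighted by their long-term standard deviation.
import numpy as np

rng = np.random.default_rng(2)

# hypothetical reference sessions for one speaker, collected over a long
# period: rows = sessions, columns = statistical feature parameters
sessions = rng.normal(loc=[120.0, 0.4, -0.2, 1.1],
                      scale=[8.0, 0.02, 0.15, 0.05], size=(10, 4))
reference = sessions.mean(axis=0)
long_term_sigma = sessions.std(axis=0)

def weighted_distance(x, ref, sigma):
    """Euclidean distance with each dimension scaled by its long-term variability."""
    return np.sqrt(np.sum(((x - ref) / sigma) ** 2))

test = np.array([131.0, 0.41, -0.05, 1.12])        # a new utterance, years later
print("unweighted:", np.linalg.norm(test - reference))
print("weighted:  ", weighted_distance(test, reference, long_term_sigma))
```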

Journal ArticleDOI
TL;DR: In this paper, the encoding processes of recognition and recall for line-drawn faces were investigated, and the results indicated that recognition performance was higher than probe-recall performance for all groups.
Abstract: The encoding processes of recognition and recall for line-drawn faces were investigated. Subjects randomly received three-alternative forced-choice tests of recognition and probe recall of 20 male faces. Between each inspection and test, subjects performed an interference task for 10 sec. The interference tasks consisted of either identifying the missing facial feature in line drawings or in photographs, or correctly identifying the misspelled words describing different facial features. The results indicate that recognition performance was higher than probe-recall performance for all groups. The analysis of the recognition data suggests that recognition ability decreased as the similarity of the interference task to the target increased. This finding suggests that faces are encoded using visual rather than verbal imagery.

Journal ArticleDOI
King-Sun Fu
01 Dec 1978

Journal ArticleDOI
TL;DR: A novel system for recognition of handprinted alphanumeric characters has been developed and tested, and the importance of “good” features over sophistication in the classification procedures was recognized and the feature extractor is designed to extract features based on a variety of topological, morphological and similar properties.
Abstract: A novel system for recognition of handprinted alphanumeric characters has been developed and tested. The system can be employed for recognition of either the alphabet or the numeral by contextually switching to the corresponding branch of the recognition algorithm. The two major components of the system are the multistage feature extractor and the decision logic tree-type categorizer. The importance of “good” features over sophistication in the classification procedures was recognized, and the feature extractor is designed to extract features based on a variety of topological, morphological and similar properties. An information feedback path is provided between the decision logic and the feature extractor units to facilitate an interleaved or recursive mode of operation. This ensures that only those features essential to the recognition of a particular sample are extracted each time. Test implementation has demonstrated the reliability of the system in recognizing a variety of handprinted alphanumeric characters with close to 100% accuracy.
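The interleaved mode described above (the decision logic asking the feature extractor only for the features it needs for the current sample) can be sketched with a lazily evaluated feature cache. The toy image, the three features, and the hand-built tree are invented stand-ins for the system's multistage extractor and learned categorizer.

```python
# Decision-logic tree with on-demand (lazy) feature extraction.
import numpy as np

# toy 5x5 binary "character" image
img = np.array([
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 1, 1, 0],
])

# feature extractors, called only when the tree asks for them
EXTRACTORS = {
    "ink_density": lambda im: float(im.mean()),
    "left_right_symmetry": lambda im: float(np.mean(im == im[:, ::-1])),
    "bottom_heavy": lambda im: float(im[3:].sum() > im[:2].sum()),
}

def classify(im):
    cache = {}
    def feature(name):                       # lazy, cached feature extraction
        if name not in cache:
            cache[name] = EXTRACTORS[name](im)
        return cache[name]

    # hand-built decision-logic tree (stand-in for the learned categorizer)
    if feature("ink_density") < 0.2:
        label = "1"
    elif feature("left_right_symmetry") > 0.9:
        label = "0" if feature("bottom_heavy") else "8"
    else:
        label = "9"
    return label, sorted(cache)              # which features were actually used

print(classify(img))
```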

19 May 1978
TL;DR: This paper examines the current status of statistical pattern recognition under the topics of classification rules, feature extraction, contextual analysis, etc.

Abstract: This paper examines the current status of statistical pattern recognition under the topics of classification rules, feature extraction, contextual analysis, etc. Important but unsolved problem areas are also explored. The relationship between statistical pattern recognition and signal processing is also considered.

Patent
06 May 1978
TL;DR: In this patent, a steady-vowel-part extraction circuit is proposed to increase processing speed by extracting the sampling point that corresponds to the vowel part of a monosyllable.

Abstract: PURPOSE: To increase processing speed by extracting the sampling point that corresponds to the vowel part of a monosyllable, selecting candidate monosyllables by matching against the previously registered reference vowel features, and then matching those candidates against the unknown utterance. CONSTITUTION: Exploiting the steadiness of the extracted feature quantity, a steady-vowel-part extraction circuit 6 extracts the sampling point that corresponds to the vowel part of a monosyllable and then matches (10) the vowel feature at that sampling point against the reference vowel features previously registered for each monosyllable. The candidate monosyllables determined in this way are then matched (13) against the unknown utterance. Consequently, the processing speed can be increased. COPYRIGHT: (C)1979,JPO&Japio
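A rough sketch of the claimed idea: take as the steady vowel part those sampling points where the short-time feature vector barely changes over consecutive frames, then match the feature at such a point against registered reference vowel features. The synthetic frames, the change threshold, and the reference table below are all invented for illustration.

```python
# Steady-vowel-part detection by thresholding frame-to-frame feature change,
# followed by matching against registered reference vowel features.
import numpy as np

rng = np.random.default_rng(3)
# synthetic frame-by-frame feature vectors: a consonant transient (frames
# 0-9, rapidly changing) followed by a steady vowel (frames 10-29)
transient = rng.normal(size=(10, 3)) * 1.5
vowel = np.array([0.8, -0.3, 0.1]) + rng.normal(scale=0.03, size=(20, 3))
frames = np.vstack([transient, vowel])

def steady_points(frames, delta=0.2, run=3):
    """Frame indices where the feature change stays below delta for `run` steps."""
    change = np.linalg.norm(np.diff(frames, axis=0), axis=1)
    steady = change < delta
    return [i + 1 for i in range(len(steady) - run + 1) if steady[i:i + run].all()]

references = {"a": np.array([0.8, -0.3, 0.1]),         # registered vowel features
              "i": np.array([-0.5, 0.6, 0.2]),
              "u": np.array([0.1, 0.1, -0.7])}

idx = steady_points(frames)[0]                          # first steady sampling point
vowel_feature = frames[idx]
match = min(references, key=lambda v: np.linalg.norm(vowel_feature - references[v]))
print("steady point at frame", idx, "-> candidate vowel:", match)
```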

Journal ArticleDOI
TL;DR: A learning decision algorithm, using a set of distinctive features, is described and applied to character recognition, based on assumptions which have a wide application area and which can provide help to an industrial user who has to design a character recognizer.

Patent
25 Oct 1978
TL;DR: In this patent, a stroke density function is used as a feature expressing the complexity of a character pattern for rough classification when recognizing Chinese characters; the resulting similarity is subjected to threshold processing and judged in a deciding part 600.

Abstract: PURPOSE: To absorb minute modifications of a character outline by forming a feature parameter through linear filtering of the stroke density function, a feature quantity expressing the complexity of a character pattern that is effective for rough classification when recognizing Chinese characters. CONSTITUTION: A document 50 on which a character is written is converted into a digital pattern by a photoelectric converting part 100, segmented by a character segmentation part 200, passed through a preliminary processing part 300 that normalizes size and removes noise, and enters a feature extracting part 400. There, the stroke density function is extracted in an extracting part 410 and then smoothed, for example by median filtering, in a filtering part 450. The similarity between the feature 455 of the input character and a reference pattern feature 555 from a reference pattern file 550 is computed in an adjusting part 500. The result is subjected to threshold processing and judged in a deciding part 600. Finally, the recognition result obtained for the input pattern is accepted or rejected. A mode filter can also be used as the filtering part 450.
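A rough sketch of the claimed feature: a stroke density function counts the strokes crossed on each scan line of the binary character pattern, a median filter absorbs minute outline modifications, and the smoothed function is compared with reference pattern features for rough classification. The toy pattern and reference features below are invented.

```python
# Stroke density function + median filtering for rough classification.
import numpy as np

def stroke_density(img):
    """Number of 0->1 transitions (strokes entered) on each horizontal scan line."""
    padded = np.pad(img, ((0, 0), (1, 0)))
    return ((padded[:, 1:] == 1) & (padded[:, :-1] == 0)).sum(axis=1)

def median_filter(x, width=3):
    """Simple 1-D median filter to absorb minute outline modifications."""
    half = width // 2
    xp = np.pad(x, half, mode="edge")
    return np.array([np.median(xp[i:i + width]) for i in range(len(x))])

img = np.array([
    [1, 1, 1, 1, 1, 1],
    [1, 0, 0, 0, 0, 1],
    [1, 0, 1, 1, 0, 1],     # a small blob adds a spurious stroke on this line
    [1, 0, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 1],
])

raw = stroke_density(img)
smoothed = median_filter(raw)
reference = {"box": np.array([1, 2, 2, 2, 1]), "bar": np.array([1, 1, 1, 1, 1])}
best = min(reference, key=lambda k: np.linalg.norm(smoothed - reference[k]))
print("raw:", raw, "smoothed:", smoothed, "-> class:", best)
```

In this toy case the spurious extra stroke from the small blob is removed by the median filter, so the smoothed feature matches the "box" reference exactly.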

Proceedings ArticleDOI
10 Apr 1978
TL;DR: Two studies for determining the relative applicability of three feature extraction methods (linear prediction, FFT, and zero-crossing analysis) for speech recognition are presented; one study is aimed at the relative discriminability of the methods for vowel recognition and the other at the noise vulnerability of each method.
Abstract: Two popular methods of feature extraction which have been applied to automatic speech recognition are linear prediction and Fast Fourier Transform analysis. Recent work by the authors has indicated that zero-crossing analysis methods also have the potential to result in accurate speech recognition. In this paper two studies for determining the relative applicability of each of these three feature extraction methods for speech recognition are presented. One study is aimed at determining the relative discriminability of the methods for vowel recognition. The other study is aimed at determining the noise vulnerability of each method. Several Fast Fourier Transform and zero-crossing analysis algorithms perform well in the classification of vowels in a quiet environment. Exceptional classification results are obtained for several zero-crossing analysis algorithms applied to vowels in noise.
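The zero-crossing feature the study compares against linear prediction and FFT analysis is cheap to compute: the per-frame zero-crossing rate tracks the dominant frequency content. The sketch below, on synthetic tone-plus-noise signals invented for illustration, shows only the feature computation, not the classification experiments of the paper.

```python
# Per-frame zero-crossing rate as a crude spectral feature.
import numpy as np

fs = 8000
t = np.arange(0, 0.2, 1 / fs)

def zero_crossing_rate(x, frame=160):
    n = len(x) // frame
    frames = x[:n * frame].reshape(n, frame)
    return (np.diff(np.sign(frames), axis=1) != 0).sum(axis=1) / frame

low_vowel  = np.sin(2 * np.pi * 300 * t)                   # low-frequency tone
high_vowel = np.sin(2 * np.pi * 300 * t) + 0.8 * np.sin(2 * np.pi * 2300 * t)
noisy      = high_vowel + 0.3 * np.random.default_rng(4).normal(size=t.shape)

for name, sig in [("low vowel", low_vowel), ("high vowel", high_vowel), ("noisy", noisy)]:
    print(f"{name:10s} mean ZCR = {zero_crossing_rate(sig).mean():.3f}")
```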

01 Apr 1978
TL;DR: Alternate representations, based on state-space and AND/OR graphs and ordered search strategies, are described for multistage and nearest neighbor classification and for structural pattern analysis and feature extraction.

Abstract: Noting the major limitations of the much-developed multivariate statistical and syntactic pattern recognition models, this paper describes, in a tutorial manner, alternate representations, based on state-space and AND/OR graphs and ordered search strategies, for multistage and nearest neighbor classification and for structural pattern analysis and feature extraction. Some recent work in pattern recognition is reviewed from these vantage points. In addition, the paper touches on recent contributions to the continuing attempts to understand feature subset selection, measurement complexity, and nonparametric classification and error estimation. Surveys, conference proceedings, and edited collections providing quick access to the recent literature on pattern recognition methodologies and applications are cited in the bibliography. (Author)
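One ordered-search formulation of the kind such state-space treatments cover is branch-and-bound nearest-neighbour classification: prototypes are grouped into clusters, clusters are visited in order of a lower bound d(q, centre) - radius, and any cluster whose bound exceeds the best distance found so far is pruned. The data, the rounding-based clustering, and the specific bound below are assumptions for illustration, not the paper's own construction.

```python
# Ordered (best-first) branch-and-bound search for the nearest prototype.
import numpy as np

rng = np.random.default_rng(5)
prototypes = rng.normal(size=(300, 2)) + rng.choice([-4, 0, 4], size=(300, 1))
labels = (prototypes[:, 0] > 0).astype(int)

# group prototypes into clusters by simple rounding (a stand-in for k-means)
keys = np.round(prototypes / 2.0).astype(int)
clusters = {}
for p, lab, k in zip(prototypes, labels, map(tuple, keys)):
    clusters.setdefault(k, []).append((p, lab))

summaries = []                                     # (centre, radius, members)
for members in clusters.values():
    pts = np.array([p for p, _ in members])
    c = pts.mean(axis=0)
    summaries.append((c, np.max(np.linalg.norm(pts - c, axis=1)), members))

def nearest(q):
    # visit clusters in order of the lower bound d(q, centre) - radius
    order = sorted(summaries, key=lambda s: np.linalg.norm(q - s[0]) - s[1])
    best_d, best_label, examined = np.inf, None, 0
    for centre, radius, members in order:
        if np.linalg.norm(q - centre) - radius >= best_d:
            continue                               # whole cluster pruned
        for p, lab in members:
            examined += 1
            d = np.linalg.norm(q - p)
            if d < best_d:
                best_d, best_label = d, lab
    return best_label, examined

q = np.array([3.5, 0.2])
label, examined = nearest(q)
print("label:", label, "| prototypes examined:", examined, "of", len(prototypes))
```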



Journal ArticleDOI
Chun Chiang
TL;DR: A quantitative model is proposed to explain the order of accuracy for recognizing letters in meaningful words and unrelated words, based on the principle that the minimum number of features required for recognition of the whole pattern is equal to or less than the total number of features needed for its component patterns.


Journal ArticleDOI
TL;DR: A newly developed methodology and related system for the automatic pattern recognition of machine parts and similar objects on the basis of similarity, featuring the capability of recognizing a pattern in less than 4 seconds and classifying it into one of 22 kinds of matching patterns stored in the minicomputer.