
Showing papers on "Convolutional neural network" published in 2004


Patent
20 May 2004
TL;DR: This patent proposes a global optimization framework for optical character recognition (OCR) of low-resolution photographed documents that combines a binarization-type process, segmentation, and recognition into a single process.
Abstract: A global optimization framework for optical character recognition (OCR) of low-resolution photographed documents that combines a binarization-type process, segmentation, and recognition into a single process. The framework includes a machine learning approach trained on a large amount of data. A convolutional neural network can be employed to compute a classification function at multiple positions and to take grey-level input, which eliminates binarization. The framework utilizes preprocessing, layout analysis, character recognition, and word recognition to achieve high recognition rates. The framework also employs dynamic programming and language models to arrive at the desired output.

123 citations


Proceedings ArticleDOI
19 Jun 2004
TL;DR: A standard genetic algorithm is employed to train the weights of a 4-5x5 filter CNN in order to pass through the local minima, resulting in a 92.3 ± 1.4% average success rate using 25 GA-trained CNNs presented with 100 crack (320x240 pixel) images.
Abstract: Detecting cracks is an important function in building, tunnel and bridge structural analysis. Successful automation of crack detection can provide a uniform and timely means for preventing further damage to structures. This laboratory has successfully applied convolutional neural networks (CNNs) to online crack detection. CNNs represent an interesting method for adaptive image processing and form a link between artificial neural networks, and finite impulse response filters. As with most artificial neural networks, the CNN is susceptible to multiple local minima, thus complexity and time must be applied in order to avoid becoming trapped within the local minima. This paper employs a standard genetic algorithm (GA) to train the weights of a 4-5x5 filter CNN in order to pass through the local minima. This technique resulted in a 92.3 ± 1.4% average success rate using 25 GA-trained CNNs presented with 100 crack (320x240 pixel) images.

53 citations
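
The idea in the crack-detection entry above, training filter weights with a genetic algorithm rather than gradient descent, can be illustrated with a toy sketch. Everything specific here is an assumption for illustration only: the abstract gives no details of the GA or the network, so the single 5x5 filter, the synthetic data, the squared-error fitness, truncation selection, uniform crossover, Gaussian mutation, and elitism are stand-ins, not the authors' method.

```python
import numpy as np

def correlate_valid(image, kernel):
    """'Valid' 2-D cross-correlation (the usual CNN filter operation)."""
    kh, kw = kernel.shape
    out = np.empty((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def fitness(genome, images, targets):
    """Higher is better: negative mean squared error of the filter response."""
    kernel = genome.reshape(5, 5)
    return -float(np.mean([np.mean((correlate_valid(x, kernel) - t) ** 2)
                           for x, t in zip(images, targets)]))

def evolve_filter(images, targets, pop_size=20, generations=20,
                  mutation_scale=0.1, seed=0):
    """Evolve the 25 weights of one 5x5 filter (hypothetical GA settings)."""
    rng = np.random.default_rng(seed)
    population = rng.normal(0.0, 0.5, size=(pop_size, 25))
    best_per_generation = []
    for _ in range(generations):
        scores = np.array([fitness(g, images, targets) for g in population])
        population = population[np.argsort(scores)[::-1]]   # best first
        best_per_generation.append(float(scores.max()))
        parents = population[: pop_size // 2]               # truncation selection
        children = [population[0].copy()]                   # elitism: keep the best
        while len(children) < pop_size:
            a, b = parents[rng.integers(len(parents), size=2)]
            child = np.where(rng.random(25) < 0.5, a, b)    # uniform crossover
            child = child + rng.normal(0.0, mutation_scale, size=25)  # mutation
            children.append(child)
        population = np.array(children)
    scores = np.array([fitness(g, images, targets) for g in population])
    best_per_generation.append(float(scores.max()))
    return population[int(np.argmax(scores))].reshape(5, 5), best_per_generation

# Toy data: random images filtered by a hidden 5x5 kernel the GA must approximate.
rng = np.random.default_rng(1)
true_kernel = rng.normal(size=(5, 5))
images = [rng.random((10, 10)) for _ in range(3)]
targets = [correlate_valid(x, true_kernel) for x in images]

best_kernel, history = evolve_filter(images, targets)
print(f"best fitness: first generation {history[0]:.3f}, last {history[-1]:.3f}")
```

Because the best individual is copied unchanged into each new generation, the best fitness in `history` never decreases. A gradient-free search like this does not depend on the local slope of the error surface, which is the property the abstract appeals to when it speaks of passing through local minima.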


Book ChapterDOI
01 Dec 2004
TL;DR: A VLSI convolutional network architecture using a hybrid approach composed of pulse-width modulation (PWM) and digital circuits is proposed, which is called merged/mixed analog-digital architecture.
Abstract: Hierarchical convolutional neural networks represent a well-known robust image-recognition model. In order to apply this model to robot vision or various intelligent vision systems, its VLSI implementation with high performance and low power consumption is required. This paper proposes a VLSI convolutional network architecture using a hybrid approach composed of pulse-width modulation (PWM) and digital circuits. We call this approach merged/mixed analog-digital architecture. The VLSI chip includes PWM neuron circuits, PWM/digital converters, digital adder-subtracters, and digital memory. We have designed and fabricated a VLSI chip by using a 0.35 μm CMOS process. The VLSI chip can perform 6-bit precision convolution calculations for an image of 100 × 100 pixels with a receptive field area of up to 20 × 20 pixels within 5 ms, which means a performance of 2 GOPS. Power consumption of PWM neuron circuits was measured to be 20 mW. We have verified successful operations using a fabricated VLSI chip.

49 citations
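
The 2 GOPS figure quoted in the VLSI entry above can be roughly cross-checked from the other numbers in the abstract. The sketch below assumes that the receptive field is applied at every pixel of the 100 × 100 image and that each multiply-accumulate counts as two operations; the abstract does not state how operations are counted, so both are assumptions, and the function name is hypothetical.

```python
def conv_throughput(image_side, field_side, seconds, ops_per_mac=2):
    """Return (multiply-accumulates per image, operations per second)."""
    macs = image_side ** 2 * field_side ** 2   # one field evaluation per pixel
    return macs, macs * ops_per_mac / seconds

macs, ops_per_second = conv_throughput(image_side=100, field_side=20, seconds=5e-3)
print(macs)                                   # 4000000
print(f"{ops_per_second / 1e9:.1f} GOPS")     # 1.6 GOPS
```

Under these assumptions the estimate is of the same order as the quoted 2 GOPS.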


Proceedings ArticleDOI
25 Jul 2004
TL;DR: A face detection system based on a class of convolutional neural networks, namely shunting inhibitory convolutional neural networks (SICoNNets), in which all three connection schemes achieve 99% detection accuracy at a 5% false alarm rate, and a toeplitz-connected network trained on a larger training set achieves a 99% correct classification rate with only a 1% false alarm rate on the same test set.
Abstract: We present a face detection system based on a class of convolutional neural networks, namely shunting inhibitory convolutional neural networks (SICoNNets). The topology of these networks is a flexible feedforward architecture with three different connection schemes: fully-connected, toeplitz-connected and binary-connected. SICoNNets were trained, using a hybrid method based on Rprop, Quickprop and least squares, to discriminate between face and non-face patterns. All three connection schemes achieve 99% detection accuracy at a 5% false alarm rate, based on a test set of 7000 face and non-face patterns. Furthermore, a toeplitz-connected network was trained on a larger training set and achieved a 99% correct classification rate with only a 1% false alarm rate on the same test set. A face detection system is built based on the trained convolutional neural networks. The system accepts an input image of arbitrary size and localizes the face patterns in the image. To localize faces of different sizes, the convolutional neural network is applied as a face detection filter at different scales. The detection scores from different scales are aggregated to form the final decision.

30 citations
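
The multi-scale procedure described in the face-detection entry above (apply a fixed-size detector at several scales, then combine the scores) can be sketched as follows. This is only an illustration under stated assumptions: `score_fn` is a hypothetical stand-in for the trained network, the image is rescaled by crude subsampling, and scores are aggregated by pooling all above-threshold windows and ranking them. The abstract does not specify any of these details.

```python
import numpy as np

def window_scores(image, score_fn, win, stride):
    """Yield (row, col, score) for every window position in the image."""
    for r in range(0, image.shape[0] - win + 1, stride):
        for c in range(0, image.shape[1] - win + 1, stride):
            yield r, c, score_fn(image[r:r + win, c:c + win])

def multiscale_detect(image, score_fn, scales=(1, 2, 4), win=8, stride=4,
                      threshold=0.5):
    """Run a fixed-size detector on subsampled copies of the image."""
    detections = []
    for s in scales:
        small = image[::s, ::s]               # crude subsampling by factor s
        for r, c, score in window_scores(small, score_fn, win, stride):
            if score >= threshold:
                # Map the window back to original-image coordinates and size.
                detections.append((r * s, c * s, win * s, float(score)))
    # Assumed aggregation: pool detections from all scales, best score first.
    return sorted(detections, key=lambda d: d[3], reverse=True)

# Toy input: a bright 16x16 square; the stand-in "detector" is the window mean.
image = np.zeros((32, 32))
image[8:24, 8:24] = 1.0
hits = multiscale_detect(image, score_fn=np.mean)
print(hits[0])
```

Each detection is a tuple `(row, col, size, score)` in original-image coordinates. In this toy input the window of size 16 at position (8, 8), found at scale 2, covers the square exactly, which shows how a fixed-size filter can localize objects larger than its own window.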


Book ChapterDOI
01 Jan 2004
TL;DR: This work introduces a population coding scheme in the CSNN architecture, and shows how the biologically inspired model attains invariance to changes in size and position of face and ensures the efficiency of face detection.
Abstract: We propose a convolutional spiking neural network (CSNN) model with population coding for robust object (e.g., face) detection. Basic structure of the network involves hierarchically alternating layers for feature detection and feature pooling. The proposed model implements hierarchical template matching by temporal integration of structured pulse packet. The packet signal represents some intermediate or complex visual feature (e.g., a pair of line segments, corners, eye, nose, etc.) that constitutes a face model. The output pulse of a feature pooling neuron represents some local feature (e.g., end-stop, blob, eye, etc.). Introducing a population coding scheme in the CSNN architecture, we show how the biologically inspired model attains invariance to changes in size and position of face and ensures the efficiency of face detection.

20 citations


Book ChapterDOI
19 Aug 2004
TL;DR: An unsupervised feature selection procedure for constructing a training set for the CNN and a figural alphabet is introduced to be used for low-level feature detection with CNN, which turned out to be useful in detecting a vocabulary set of intermediate level features and considerably reduces the complexity of the CNN.
Abstract: Convolutional Neural Networks (CNN) have proven to be useful tools for object detection and object recognition. They act as feature extractor and classifier at the same time. In this study we present an unsupervised feature selection procedure for constructing a training set for the CNN and analyze in detail the learnt receptive fields. We then introduce, for the first time, a figural alphabet to be used for low-level feature detection with CNN. This alphabet turned out to be useful in detecting a vocabulary set of intermediate-level features and considerably reduces the complexity of the CNN. Moreover, we propose an optimal high-level feature selection procedure and apply this to the challenging problem of car detection. We demonstrate promising results for multi-class object detection, using the obtained figural alphabet to detect considerably different categories of objects (e.g., faces and cars).

10 citations


Book ChapterDOI
22 Nov 2004
TL;DR: A model for face recognition using a support vector machine being fed with a feature vector generated from outputs in several modules in bottom as well as intermediate layers of convolutional neural network (CNN) trained for face detection.
Abstract: We propose a model for face recognition using a support vector machine fed with a feature vector generated from outputs of several modules in the bottom as well as intermediate layers of a convolutional neural network (CNN) trained for face detection. The feature vector is composed of a set of local output distributions from feature detecting modules in the face detecting CNN. The set of local areas is automatically selected around facial components (e.g., eyes, mouth, nose, etc.) detected by the CNN. Local areas for intermediate level features are defined so that information on the spatial arrangement of facial components is implicitly included as output distribution from facial component detecting modules. Results demonstrate highly efficient and robust performance both in face recognition and in detection.

7 citations


Journal ArticleDOI
TL;DR: An entropy-based approach for constructing such neural networks for classification of acyclic structured patterns and results have shown that the networks constructed by this method can have a better performance, with respect to network size, learning speed, or recognition accuracy, than the networks obtained by other methods.
Abstract: Sperduti and Starita proposed a new type of neural network which consists of generalized recursive neurons for classification of structures. In this paper, we propose an entropy-based approach for constructing such neural networks for classification of acyclic structured patterns. Given a classification problem, the architecture, i.e., the number of hidden layers and the number of neurons in each hidden layer, and all the values of the link weights associated with the corresponding neural network are automatically determined. Experimental results have shown that the networks constructed by our method can have a better performance, with respect to network size, learning speed, or recognition accuracy, than the networks obtained by other methods.

6 citations


Book ChapterDOI
20 Sep 2004
TL;DR: A new algorithm for reducing multiply-and-accumulation operation by thresholding in a projection field and by performing weight decomposition in a 2-D neuron array is proposed.
Abstract: Hierarchical convolutional neural networks are a well-known robust image-recognition model. In order to apply this model to robot vision or various intelligent real-time vision systems, its VLSI implementation is essential. This paper proposes a new algorithm for reducing multiply-and-accumulation operation by thresholding in a projection field and by performing weight decomposition in a 2-D neuron array. We also propose a VLSI architecture based on the proposed algorithm, and estimate its operation performance.

4 citations
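
One way to read "thresholding in a projection field" in the entry above is that each input neuron scatters its value, weighted by the kernel, onto the outputs it projects to, and that inputs at or below a threshold are skipped so their multiply-accumulate operations are never performed. The sketch below illustrates that reading only. It is an assumption, not the paper's algorithm, it does not cover the weight-decomposition step, and the function name is hypothetical.

```python
import numpy as np

def projection_field_conv(image, kernel, threshold=0.0):
    """Full 2-D convolution computed by scattering each input pixel's
    projection field; pixels with |value| <= threshold are skipped."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h + kh - 1, w + kw - 1))
    macs = 0
    for i in range(h):
        for j in range(w):
            x = image[i, j]
            if abs(x) <= threshold:
                continue                        # no MACs spent on this pixel
            out[i:i + kh, j:j + kw] += x * kernel
            macs += kh * kw
    return out, macs

# Sparse toy input: only a 4x4 patch of a 16x16 image is active.
rng = np.random.default_rng(0)
image = np.zeros((16, 16))
image[4:8, 4:8] = rng.random((4, 4)) + 0.5
kernel = rng.normal(size=(5, 5))

sparse_out, sparse_macs = projection_field_conv(image, kernel, threshold=0.0)
dense_out, dense_macs = projection_field_conv(image, kernel, threshold=-1.0)
print(sparse_macs, dense_macs)                  # 400 6400
print(np.allclose(sparse_out, dense_out))       # True
```

With a threshold of zero the result is identical to the unthresholded computation, and the saving comes purely from input sparsity. A positive threshold would trade some accuracy for further savings.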


Proceedings ArticleDOI
01 Jan 2004
TL;DR: This paper extends the conventional perceptron and suggests a simple training strategy to extract reliability information about the decision of a perceptron-based neural network.
Abstract: Artificial neural networks are an established solution for classification. Many neural networks consist of perceptrons (e.g. multilayer-perceptron networks). In this paper, we propose a concept to extract reliability information about the decision of a perceptron-based neural network. To this end, we extend the conventional perceptron and suggest a simple training strategy.

2 citations


Proceedings ArticleDOI
08 Dec 2004
TL;DR: This work designs a recognition system based on a multi neural network model that can successfully discriminate between classes with similar features and has a high character recognition speed.
Abstract: Recognizing flight coupon information rapidly and correctly is essential to the modernization and information construction of civil aviation. Flight coupons usually have complex backgrounds and multiple fonts. In order to achieve a high recognition rate and reliability, we have designed our recognition system based on a multi neural network model. Both the back propagation neural network and the convolutional neural network are used in this model, which can successfully discriminate between classes with similar features. As a real-time recognition system, it achieves a high character recognition speed. In this paper, we present the solution and the experimental results of the system.