
Showing papers on "MNIST database published in 2003"


Proceedings ArticleDOI
03 Aug 2003
TL;DR: A set of concrete best practices that document analysis researchers can use to get good results with neural networks, including a simple "do-it-yourself" implementation of convolution with a flexible architecture suitable for many visual document problems.
Abstract: Neural networks are a powerful technology for classification of visual inputs arising from documents. However, there is a confusing plethora of different neural network methods that are used in the literature and in industry. This paper describes a set of concrete best practices that document analysis researchers can use to get good results with neural networks. The most important practice is getting a training set as large as possible: we expand the training set by adding a new form of distorted data. The next most important practice is that convolutional neural networks are better suited for visual document tasks than fully connected networks. We propose that a simple "do-it-yourself" implementation of convolution with a flexible architecture is suitable for many visual document problems. This simple convolutional neural network does not require complex methods, such as momentum, weight decay, structure-dependent learning rates, averaging layers, tangent prop, or even fine-tuning the architecture. The end result is a very simple yet general architecture which can yield state-of-the-art performance for document analysis. We illustrate our claims on the MNIST set of English digit images.

2,783 citations
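The paper's most emphasized practice, expanding the training set with a new form of distorted data, can be illustrated with an elastic distortion of the general kind used for MNIST augmentation. This is a minimal NumPy/SciPy sketch; the function name and the `alpha`/`sigma` values are illustrative, not the paper's:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates

def elastic_distort(image, alpha=8.0, sigma=4.0, rng=None):
    """Apply a random elastic distortion to a 2-D image.

    A per-pixel random displacement field is smoothed with a Gaussian
    (so nearby pixels move together), scaled by alpha, and the image is
    resampled at the displaced coordinates with bilinear interpolation.
    """
    rng = np.random.default_rng(rng)
    h, w = image.shape
    dx = gaussian_filter(rng.uniform(-1, 1, (h, w)), sigma) * alpha
    dy = gaussian_filter(rng.uniform(-1, 1, (h, w)), sigma) * alpha
    y, x = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    coords = np.stack([y + dy, x + dx])          # shape (2, h, w)
    return map_coordinates(image, coords, order=1, mode="nearest")
```

Each call produces a new plausible variant of the input digit, so the effective training set grows without collecting more data.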


Journal ArticleDOI
TL;DR: The results of handwritten digit recognition on well-known image databases using state-of-the-art feature extraction and classification techniques are competitive to the best ones previously reported on the same databases.

545 citations


Book ChapterDOI
05 Jul 2003
TL;DR: A fast SVM training algorithm for multi-class problems, consisting of parallel and sequential optimizations, is presented; without sacrificing generalization performance, it achieves a speed-up factor of 110 compared with Keerthi et al.'s modified SMO.
Abstract: A fast SVM training algorithm for multi-class problems, consisting of parallel and sequential optimizations, is presented. The main advantage of the parallel optimization step is that it removes most non-support vectors quickly, which dramatically reduces the training time at the stage of sequential optimization. In addition, strategies such as kernel caching, shrinking and calling BLAS functions are effectively integrated into the algorithm to speed up training. Experiments on the MNIST handwritten digit database have shown that, without sacrificing generalization performance, the proposed algorithm achieves a speed-up factor of 110 when compared with Keerthi et al.'s modified SMO. Moreover, for the first time we investigated the training performance of SVM on the handwritten Chinese database ETL9B, with more than 3000 categories and about 500,000 training samples. The total training time is just 5.1 hours, and a raw error rate of 1.1% on ETL9B has been achieved.

66 citations


Proceedings ArticleDOI
20 Jul 2003
TL;DR: Linear particle swarm optimization is presented as an alternative to current numeric SVM training methods and is demonstrated on the MNIST character recognition dataset.
Abstract: Training a support vector machine requires solving a constrained quadratic programming problem. Linear particle swarm optimization is intuitive and simple to implement, and is presented as an alternative to current numeric SVM training methods. Performance of the new algorithm is demonstrated on the MNIST character recognition dataset.

54 citations
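Particle swarm optimization itself is simple to state. The sketch below is a generic unconstrained PSO minimizing an arbitrary function, not the paper's linear, constraint-preserving variant for the SVM dual; the function name and parameter values are illustrative:

```python
import numpy as np

def pso_minimize(f, dim, n_particles=30, iters=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize f over R^dim with a basic particle swarm.

    Each particle keeps its personal best position; the swarm keeps a
    global best. Velocities blend inertia (w), attraction to the
    personal best (c1), and attraction to the global best (c2).
    """
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1, 1, (n_particles, dim))
    v = np.zeros_like(x)
    pbest = x.copy()
    pbest_val = np.array([f(p) for p in x])
    g = pbest[pbest_val.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = x + v
        vals = np.array([f(p) for p in x])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        g = pbest[pbest_val.argmin()].copy()
    return g, pbest_val.min()
```

Applied to the SVM dual, the objective would be the constrained quadratic the abstract mentions; handling those constraints is exactly what the paper's linear PSO adds.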


Journal ArticleDOI
TL;DR: A fast support vector machine (SVM) training algorithm is proposed under SVM's decomposition framework by effectively integrating kernel caching, digest and shrinking policies, and stopping conditions; its promising scalability paves a new way to solve larger-scale learning problems in other domains such as data mining.
Abstract: A fast support vector machine (SVM) training algorithm is proposed under SVM's decomposition framework by effectively integrating kernel caching, digest and shrinking policies, and stopping conditions. Kernel caching plays a key role in reducing the number of kernel evaluations through maximal reuse of cached kernel elements. Extensive experiments have been conducted on the large handwritten digit database MNIST to show that the proposed algorithm is about nine times faster than Keerthi et al.'s improved SMO. Combined with principal component analysis, the total training time for ten one-against-the-rest classifiers on MNIST took less than an hour. Moreover, the proposed fast algorithm speeds up SVM training without sacrificing generalization performance: a 0.6% error rate on the MNIST test set has been achieved. The promising scalability of the proposed scheme paves a new way to solve larger-scale learning problems in other domains such as data mining.

54 citations
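The kernel-caching idea common to these fast SVM trainers can be sketched as an LRU cache of kernel matrix rows. This is a hypothetical minimal design, not the paper's actual caching policy:

```python
import numpy as np
from collections import OrderedDict

class KernelRowCache:
    """LRU cache of kernel matrix rows K[i, :].

    Decomposition methods (SMO and its variants) touch the same
    working-set rows repeatedly; caching them trades memory for
    avoided kernel evaluations.
    """
    def __init__(self, X, kernel, max_rows=128):
        self.X, self.kernel, self.max_rows = X, kernel, max_rows
        self._rows = OrderedDict()
        self.evals = 0  # number of full-row kernel computations

    def row(self, i):
        if i in self._rows:
            self._rows.move_to_end(i)       # mark as recently used
            return self._rows[i]
        self.evals += 1
        r = self.kernel(self.X, self.X[i])  # compute K[i, :] once
        self._rows[i] = r
        if len(self._rows) > self.max_rows:
            self._rows.popitem(last=False)  # evict least-recently-used
        return r

# Example kernel: RBF with unit width (illustrative choice).
rbf = lambda X, x: np.exp(-np.sum((X - x) ** 2, axis=1))
```

With a working set much smaller than the cache, most `row` calls become dictionary lookups rather than O(n) kernel computations.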


Proceedings ArticleDOI
03 Aug 2003
TL;DR: A simple voting scheme for off-line recognition of handprinted numerals that is not script dependent, comparable with state-of-the-art technologies and sufficiently fast for real-life applications is proposed.
Abstract: This paper proposes a simple voting scheme for off-line recognition of handprinted numerals. One of the main features of the proposed scheme is that it is not script dependent. Another interesting feature is that it is sufficiently fast for real-life applications. In contrast to the usual practice, here we studied the efficiency of a majority voting approach in which all the classifiers involved are multilayer perceptrons (MLPs) of different sizes and their respective features are based on wavelet transforms at different resolution levels. The rationale for this approach is to explore how one can improve recognition performance without adding much to the requirements for computational time and resources. For simplicity and efficiency, in the present work we considered only three coarse-to-fine resolution levels of wavelet representation. We primarily simulated the proposed technique on a database of off-line handprinted Bangla (a major Indian script) numerals, achieving a 97.16% correct recognition rate on a test set of 5000 Bangla numerals. In this simulation we used two other disjoint sets (one for training and the other for validation purposes) of sizes 6000 and 1000, respectively. We have also tested our approach on the MNIST database of handwritten English digits; the result is comparable with state-of-the-art technologies.

54 citations
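The majority-voting step itself is straightforward. A minimal sketch over per-classifier label predictions (the function name is illustrative, and ties here simply go to the label counted first):

```python
import numpy as np
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier label predictions by simple majority.

    `predictions` is a list of 1-D arrays, one per classifier, each of
    length n_samples; the result is one label per sample.
    """
    stacked = np.stack(predictions)        # (n_classifiers, n_samples)
    out = []
    for col in stacked.T:                  # one column per sample
        out.append(Counter(col).most_common(1)[0][0])
    return np.array(out)
```

In the paper's setting, each array would come from one MLP trained on wavelet features at a different resolution level.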


Proceedings Article
01 Jan 2003
TL;DR: This paper describes its supervised variant (SLLE), a conceptually new method, where class membership information is used to map overlapping high dimensional data into disjoint clusters in the embedded space.
Abstract: The locally linear embedding (LLE) algorithm is an unsupervised technique recently proposed for nonlinear dimensionality reduction. In this paper, we describe its supervised variant (SLLE). This is a conceptually new method, where class membership information is used to map overlapping high-dimensional data into disjoint clusters in the embedded space. In experiments, we combined it with a support vector machine (SVM) for classifying handwritten digits from the MNIST database.

34 citations
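One common way to express SLLE's use of class labels is to inflate between-class pairwise distances before LLE's neighbour selection, so neighbours are drawn from the same class and classes separate in the embedding. The sketch below follows that general formulation and is not necessarily the paper's exact definition:

```python
import numpy as np

def supervised_distances(X, y, alpha=1.0):
    """Pairwise distances with a between-class penalty for SLLE.

    Pairs with different labels get a constant penalty, a fraction
    (alpha) of the largest pairwise distance, added to their Euclidean
    distance. alpha = 0 recovers unsupervised LLE's distance matrix;
    alpha = 1 pushes neighbour selection entirely within-class.
    """
    diff = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
    penalty = alpha * diff.max() * (y[:, None] != y[None, :])
    return diff + penalty
```

The modified matrix is then fed to the usual LLE pipeline (k-nearest neighbours, reconstruction weights, eigen-embedding) unchanged.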


Proceedings ArticleDOI
20 Jul 2003
TL;DR: A new neural classifier for handwritten digit recognition is proposed, based on the Permutative Coding technique, which is derived from the associative-projective neural networks developed in the 1980s and 1990s.
Abstract: A new neural classifier for handwritten digit recognition is proposed. The classifier is based on the Permutative Coding technique, which is derived from the associative-projective neural networks developed in the 1980s and 1990s. The classifier's performance was tested on the MNIST database, where an error rate of 0.54% was obtained.

16 citations


Book ChapterDOI
03 Jun 2003
TL;DR: A new discriminative training rule is presented for the Scanning N-Tuple classifier, in two versions based on minimizing the mean-squared error and the cross-entropy, respectively; it offers improved accuracy at the cost of slower training, since training is now iterative instead of single-pass.
Abstract: The Scanning N-Tuple classifier (SNT) was introduced by Lucas and Amiri [1, 2] as an efficient and accurate classifier for chain-coded handwritten digits. The SNT operates at speeds of tens of thousands of sequences per second, during both the training and the recognition phases. The main contribution of this paper is to present a new discriminative training rule for the SNT. Two versions of the rule are provided, based on minimizing the mean-squared error and the cross-entropy, respectively. The discriminative training rule offers improved accuracy at the cost of slower training time, since the training is now iterative instead of single-pass. The cross-entropy-trained SNT offers the best results, with an error rate of 2.5% on sequences derived from the MNIST test set.

15 citations
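In its simplest contiguous-window form (without the sampled, gapped tuples of the full SNT), a scanning n-tuple classifier amounts to per-class n-gram tables over chain-code symbols, scored by summed log-probabilities. All names and the Laplace smoothing here are illustrative:

```python
import numpy as np

def train_snt(seqs_by_class, n=3, alphabet=8):
    """Build per-class log-probability tables over length-n tuples.

    A window of length n scans each chain-code sequence; each tuple is
    mapped to a table index in base `alphabet` and counted (with
    add-one smoothing so unseen tuples keep nonzero probability).
    """
    tables = {}
    for c, seqs in seqs_by_class.items():
        counts = np.ones(alphabet ** n)          # Laplace smoothing
        for s in seqs:
            for i in range(len(s) - n + 1):
                idx = 0
                for sym in s[i:i + n]:
                    idx = idx * alphabet + sym   # tuple -> table index
                counts[idx] += 1
        tables[c] = np.log(counts / counts.sum())
    return tables

def classify_snt(seq, tables, n=3, alphabet=8):
    """Pick the class whose table gives the highest summed log-prob."""
    def score(logp):
        total = 0.0
        for i in range(len(seq) - n + 1):
            idx = 0
            for sym in seq[i:i + n]:
                idx = idx * alphabet + sym
            total += logp[idx]
        return total
    return max(tables, key=lambda c: score(tables[c]))
```

Training is a single counting pass, which is why the baseline SNT is so fast; the paper's discriminative rule replaces this one-pass count with iterative updates.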


Dissertation
01 Jan 2003
TL;DR: This thesis focuses on three problems (methodologies to adapt the structure of a neural network learning system, speeding up SVM training, and facilitating testing on huge data sets) and proposes effective solutions to all three.
Abstract: Over the past few years, considerable progress has been made in the area of machine learning. However, when these learning machines, including support vector machines (SVMs) and neural networks, are applied to massive sets of high-dimensional data, many challenging problems emerge, such as high computational cost and how to adapt the structure of a learning system. It is therefore important to develop new methods with computational efficiency and high accuracy so that learning algorithms can be applied more widely to areas such as data mining, Optical Character Recognition (OCR) and bioinformatics. In this thesis, we mainly focus on three problems: methodologies to adapt the structure of a neural network learning system, speeding up SVM training, and facilitating testing on huge data sets. For the first problem, a local learning framework is proposed to automatically construct an ensemble of neural networks, which are trained on local subsets so that the complexity and training time of the learning system can be reduced and its generalization performance enhanced. With respect to SVM training on a very large data set with thousands of classes and high-dimensional input vectors, block diagonal matrices are used to approximate the original kernel matrix so that the original SVM optimization process can be divided into hundreds of sub-problems, which can be solved efficiently. Theoretically, the run-time complexity of the proposed algorithm scales linearly with the size of the data set, the dimension of the input vectors and the number of classes. For the last problem, a fast iteration algorithm is proposed to approximate the reduced set vectors simultaneously for general kernel types, so that the number of vectors in the decision function of each class can be reduced considerably and the testing speed increased significantly. The main contributions of this thesis are effective solutions to the above three problems.
It is especially worth mentioning that the methods used to solve the last two problems are crucial in making support vector machines more competitive in tasks where both high accuracy and high classification speed are required. The proposed SVM algorithm runs at a much higher training speed than existing implementations such as svm-light and libsvm when applied to a huge data set with thousands of classes. The total training time of an SVM with the radial basis function kernel on Hanwang's handwritten Chinese database (2,144,489 training samples, 542,122 testing samples, 3755 classes and 392-dimensional input vectors) is 19 hours on a Pentium 4. In addition, the proposed testing algorithm has achieved a promising classification speed of 16,000 patterns per second on the MNIST database. Besides the efficient computation, state-of-the-art generalization performance has also been achieved on several well-known public and commercial databases. In particular, very low error rates of 0.38%, 0.5% and 1.0% have been reached on MNIST, the Hanwang handwritten digit database and the ETL9B handwritten Chinese database, respectively.

11 citations


Proceedings ArticleDOI
18 Jun 2003
TL;DR: A novel local appearance modeling method for object detection and recognition in cluttered scenes based on the joint distribution of local feature vectors at multiple salient points and factorization with the independent component analysis (ICA) leads to computationally tractable joint probability densities, which can model high-order dependencies.
Abstract: We propose a novel local appearance modeling method for object detection and recognition in cluttered scenes. The approach is based on the joint distribution of local feature vectors at multiple salient points and its factorization with independent component analysis (ICA). The resulting densities are simple multiplicative distributions modeled through adaptive Gaussian mixture models. This leads to computationally tractable joint probability densities which can model high-order dependencies. Furthermore, different models are compared based on appearance, color and geometry information; combining all of them yields a hybrid model, which obtains the best results on the COIL-100 object database. Our technique has been tested under different natural and cluttered scenes with different degrees of occlusion, with promising results. Finally, a large statistical test with the MNIST digit database is used to demonstrate the improved performance obtained by explicit modeling of high-order dependencies.

Book ChapterDOI
15 Dec 2003
TL;DR: 7 machine learning algorithms for image classification including the recent approach that combines building of ensembles of extremely randomized trees and extraction of sub-windows from the original images are evaluated, showing that generic methods can come remarkably close to specialized methods.
Abstract: In this paper, we evaluate 7 machine learning algorithms for image classification including our recent approach that combines building of ensembles of extremely randomized trees and extraction of sub-windows from the original images. For the approach to be generic, all these methods are applied directly on pixel values without any feature extraction. We compared them on four publicly available datasets corresponding to representative applications of image classification problems: handwritten digits (MNIST), faces (ORL), 3D objects (COIL-100), and textures (OUTEX). A comparison with studies from the computer vision literature shows that generic methods can come remarkably close to specialized methods. In particular, our sub-window algorithm is competitive with the state of the art, a remarkable result considering its generality and conceptual simplicity.
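The sub-window extraction step the authors describe can be sketched as drawing random fixed-size patches whose raw pixels become training vectors. The real method also randomizes window sizes and rescales the patches; this minimal version fixes the size, and the function name is illustrative:

```python
import numpy as np

def random_subwindows(image, size, n, rng=None):
    """Extract n random size x size sub-windows as raw pixel vectors.

    Each sub-window becomes one training vector; at test time,
    per-sub-window predictions are aggregated (e.g. by voting) into a
    single image-level label.
    """
    rng = np.random.default_rng(rng)
    h, w = image.shape
    out = np.empty((n, size * size))
    for k in range(n):
        r = rng.integers(0, h - size + 1)
        c = rng.integers(0, w - size + 1)
        out[k] = image[r:r + size, c:c + size].ravel()
    return out
```

These vectors would then be fed, without any hand-crafted features, to an ensemble of extremely randomized trees.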

Proceedings ArticleDOI
24 Nov 2003
TL;DR: This work provides a large statistical test with the MNIST digit database in order to demonstrate the improved performance obtained by explicit modeling of higher-order dependencies.
Abstract: We propose a novel local appearance modeling method for object detection and recognition in cluttered scenes. The approach is based on the joint distribution of local feature vectors at multiple salient points and their factorization with independent component analysis (ICA). The resulting densities are simple multiplicative distributions modeled through adaptive Gaussian mixture models. This leads to computationally tractable joint probability densities which can model high-order dependencies. Our technique was initially tested under different natural and cluttered scenes with different degrees of occlusion, yielding promising results. In this work, we provide a large statistical test with the MNIST digit database in order to demonstrate the improved performance obtained by explicit modeling of higher-order dependencies.