
Showing papers by "Yoshua Bengio published in 1998"


Journal ArticleDOI
01 Jan 1998
TL;DR: In this article, convolutional neural networks are shown to outperform other techniques on handwritten character recognition, and a new learning paradigm, graph transformer networks (GTNs), is proposed to train multi-module document recognition systems globally with gradient-based methods.
Abstract: Multilayer neural networks trained with the back-propagation algorithm constitute the best example of a successful gradient based learning technique. Given an appropriate network architecture, gradient-based learning algorithms can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters, with minimal preprocessing. This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task. Convolutional neural networks, which are specifically designed to deal with the variability of 2D shapes, are shown to outperform all other techniques. Real-life document recognition systems are composed of multiple modules including field extraction, segmentation recognition, and language modeling. A new learning paradigm, called graph transformer networks (GTN), allows such multimodule systems to be trained globally using gradient-based methods so as to minimize an overall performance measure. Two systems for online handwriting recognition are described. Experiments demonstrate the advantage of global training, and the flexibility of graph transformer networks. A graph transformer network for reading a bank cheque is also described. It uses convolutional neural network character recognizers combined with global training techniques to provide record accuracy on business and personal cheques. It is deployed commercially and reads several million cheques per day.
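The convolution operation these networks are built on can be sketched in a few lines. Below is an illustrative numpy implementation of a single valid-mode convolutional filter, not the paper's LeNet-style architecture; the edge-detecting kernel and toy stroke image are invented for the example.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Valid-mode 2D cross-correlation, the core operation of a
    convolutional layer (no padding, stride 1)."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    oh, ow = ih - kh + 1, iw - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A vertical-edge kernel applied to a toy "digit stroke".
img = np.zeros((5, 5))
img[:, 2] = 1.0                       # vertical stroke in the middle column
k = np.array([[1.0, 0.0, -1.0]] * 3)  # simple edge-detecting kernel
resp = conv2d_valid(img, k)
```

The filter responds with opposite signs on the two sides of the stroke, which is the kind of local, shift-invariant feature a convolutional layer learns from data rather than having hand-designed.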

42,067 citations



Journal ArticleDOI
TL;DR: A new image compression technique called DjVu is presented that enables fast transmission of document images over low-speed connections, while faithfully reproducing the visual aspect of the document, including color, fonts, pictures, and paper texture.

312 citations


Proceedings ArticleDOI
30 Mar 1998
TL;DR: The Z-coder is a new adaptive data compression coder for coding binary data, derived from the Golomb (1966) run-length coder, and retains most of the speed and simplicity of the earlier coder.
Abstract: We present the Z-coder, a new adaptive data compression coder for coding binary data. The Z-coder is derived from the Golomb (1966) run-length coder, and retains most of the speed and simplicity of the earlier coder. The Z-coder can also be thought of as a multiplication-free approximate arithmetic coder, showing the close relationship between run-length coding and arithmetic coding. The Z-coder improves upon existing arithmetic coders by its speed and its principled design. We present a derivation of the Z-coder as well as details of the construction of its adaptive probability estimation table.
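The Golomb run-length coder that the Z-coder is derived from can be sketched for the power-of-two parameter case (Rice coding): each zero-run length is sent as a unary quotient followed by a fixed-width remainder. This toy encoder only illustrates that starting point; it is not the Z-coder's adaptive scheme.

```python
def rice_encode(n, k):
    """Rice code (Golomb code with m = 2**k): unary-coded quotient,
    a terminating 0 bit, then the k-bit remainder."""
    q, r = n >> k, n & ((1 << k) - 1)
    return "1" * q + "0" + format(r, "0{}b".format(k))

def runs_of_zeros(bits):
    """Lengths of the zero-runs between 1s (final run included),
    the quantities a run-length coder transmits."""
    runs, count = [], 0
    for b in bits:
        if b == 0:
            count += 1
        else:
            runs.append(count)
            count = 0
    runs.append(count)
    return runs

bits = [0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1]
runs = runs_of_zeros(bits)
code = "".join(rice_encode(r, 1) for r in runs)
```

Short runs get short codewords, which is what makes run-length coding effective on skewed binary sources; the Z-coder keeps this speed while adapting its probability estimates.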

59 citations


Proceedings ArticleDOI
22 Apr 1998
TL;DR: Presents a new image compression technique called DjVu that is specifically geared towards the compression of high-resolution, high-quality images of scanned documents in color; a real-time decoder is available as a plug-in for popular Web browsers.
Abstract: Presents a new image compression technique called "DjVu" that is specifically geared towards the compression of high-resolution, high-quality images of scanned documents in color. With DjVu, any screen connected to the Internet can access and display images of scanned pages while faithfully reproducing the font, color, drawings, pictures and paper texture. A typical magazine page in color at 300 dpi can be compressed down to between 40 and 60 KBytes, approximately 5 to 10 times better than JPEG for a similar level of subjective quality. Black-and-white documents are typically 15 to 30 KBytes at 300 dpi, or 4 to 8 times better than CCITT-G4. A real-time, memory-efficient version of the decoder was implemented, and is available as a plug-in for popular Web browsers.
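The quoted sizes imply a very large ratio over the raw scan. A back-of-the-envelope check, assuming a standard 8.5 × 11 inch page (the abstract does not state page dimensions):

```python
dpi = 300
# Page dimensions are assumed (8.5 x 11 inches, US letter).
width_px, height_px = int(8.5 * dpi), 11 * dpi    # 2550 x 3300 pixels
raw_bytes = width_px * height_px * 3              # 24-bit RGB, uncompressed
djvu_bytes = 50 * 1024                            # mid-range of the 40-60 KB claim
ratio = raw_bytes / djvu_bytes                    # roughly 500:1 overall
```

An uncompressed color page at 300 dpi is about 25 MB, so reaching 40 to 60 KB is a compression ratio on the order of 500:1, which is why a 5 to 10 times improvement over JPEG at similar subjective quality mattered for low-speed connections.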

32 citations



Proceedings ArticleDOI
30 Mar 1998
TL;DR: A new algorithm for adaptive Huffman coding, called algorithm M, uses space proportional to the number of frequency classes: the leaves of its tree represent sets of symbols with the same frequency rather than individual symbols.
Abstract: Summary form only given. The problem of computing minimum-redundancy codes as we observe symbols one by one has received a lot of attention. However, existing algorithms implicitly assume either that we have a small alphabet or that we have an arbitrary amount of memory at our disposal for the creation of a coding tree. In real-life applications one may need to encode symbols coming from a much larger alphabet, e.g., when coding integers. We introduce a new algorithm for adaptive Huffman coding, called algorithm M, that uses space proportional to the number of frequency classes. The algorithm uses a tree with leaves that represent sets of symbols with the same frequency, rather than individual symbols. The code for each symbol is therefore composed of a prefix (specifying the set, or the leaf of the tree) and a suffix (specifying the symbol within the set of same-frequency symbols). The algorithm uses only two operations to remain as close as possible to the optimal code: set migration and rebalancing. We analyze the computational complexity of algorithm M, and point to its advantages in terms of low memory complexity and fast decoding. Comparative experiments were performed with algorithm M on the Calgary corpus, against static Huffman coding as well as another adaptive Huffman coding algorithm, Vitter's algorithm Λ. Experiments show that M performs comparably to or better than the other algorithms but requires much less memory. Finally, we present an improved algorithm, M+, for non-stationary data, which models the distribution of the data in a fixed-size window of the data sequence.
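The prefix/suffix code shape described above can be illustrated statically: build a Huffman prefix over the frequency classes (weighted by total class frequency), then append a fixed-length suffix selecting the symbol within its class. This sketch is not the adaptive algorithm M itself, which maintains such a structure incrementally via set migration and rebalancing; the example classes are invented.

```python
import heapq, itertools

def class_codes(classes):
    """Toy prefix/suffix coder: Huffman prefix per frequency class,
    fixed-length binary suffix per symbol inside the class.
    classes: list of (per-symbol frequency, list of symbols)."""
    tiebreak = itertools.count()
    heap = [(freq * len(syms), next(tiebreak), [(cid, "")])
            for cid, (freq, syms) in enumerate(classes)]
    heapq.heapify(heap)
    while len(heap) > 1:                       # standard Huffman merging
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = ([(cid, "0" + code) for cid, code in c1]
                  + [(cid, "1" + code) for cid, code in c2])
        heapq.heappush(heap, (w1 + w2, next(tiebreak), merged))
    prefix = dict(heap[0][2])
    codes = {}
    for cid, (freq, syms) in enumerate(classes):
        bits = (len(syms) - 1).bit_length()    # suffix width for this class
        for i, sym in enumerate(syms):
            suffix = format(i, "0{}b".format(bits)) if bits else ""
            codes[sym] = prefix[cid] + suffix
    return codes

# Three frequency classes: one common symbol, two medium, four rare.
codes = class_codes([(10, ["e"]), (4, ["a", "t"]), (1, ["q", "x", "z", "j"])])
```

The tree has one leaf per frequency class regardless of how many symbols share that frequency, which is exactly the source of the memory savings over a leaf-per-symbol Huffman tree.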

9 citations


Posted Content
TL;DR: In this article, the authors show that better results can be obtained when the model is directly trained in order to maximize the financial criterion of interest, here gains and losses (including those due to transactions) incurred during trading.
Abstract: The application of this work is to decision making with financial time series, using learning algorithms. The traditional approach is to train a model using a prediction criterion, such as minimizing the squared error between predictions and actual values of a dependent variable, or maximizing the likelihood of a conditional model of the dependent variable. We find here, with noisy time series, that better results can be obtained when the model is directly trained to maximize the financial criterion of interest, here gains and losses (including those due to transactions) incurred during trading. Experiments were performed on portfolio selection with 35 Canadian stocks.
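The contrast between the two training criteria can be sketched on synthetic data: instead of fitting return predictions by squared error, run gradient ascent directly on trading profit net of transaction costs. Everything below (features, cost level, the tanh position rule, numerical gradients) is invented for illustration and is not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 200
x = rng.normal(size=(T, 3))                           # synthetic features
true_w = np.array([0.5, -0.3, 0.2])                   # invented generator
returns = x @ true_w + rng.normal(scale=0.5, size=T)  # noisy asset returns
cost = 0.01                                           # transaction cost (assumed)

def profit(w):
    """Financial criterion: trading gains minus proportional transaction
    costs, for positions tanh(x . w) in a single asset."""
    pos = np.tanh(x @ w)
    trades = np.abs(np.diff(pos, prepend=0.0))
    return float(np.sum(pos * returns) - cost * np.sum(trades))

def num_grad(f, w, eps=1e-5):
    """Central-difference gradient; the paper backpropagates analytically."""
    g = np.zeros_like(w)
    for i in range(len(w)):
        e = np.zeros_like(w)
        e[i] = eps
        g[i] = (f(w + e) - f(w - e)) / (2 * eps)
    return g

# Gradient ascent on the financial criterion itself.
w = np.zeros(3)
for _ in range(200):
    w += 0.01 * num_grad(profit, w)
```

Because transaction costs enter the criterion directly, the optimizer is discouraged from churning positions, something a pure prediction-error criterion never sees.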

8 citations


Journal Article
TL;DR: Experimental results on nuclear plant data demonstrate the advantages of the proposed approach for both the classification of signals and their interpretation in nuclear plant monitoring.
Abstract: In this paper we are concerned with the application of learning algorithms to the classification of reactor states in nuclear plants. Two aspects must be considered: (1) some types of events (e.g., abnormal or rare ones) will not appear in the data set, but the system should be able to detect them; (2) not only the classification of signals but also their interpretation is important for nuclear plant monitoring. We address both issues with a mixture of mixtures of Gaussians in which some parameters are shared to reflect the similar signals observed in different states of the reactor. An EM algorithm for these shared Gaussian mixtures is presented. Experimental results on nuclear plant data demonstrate the advantages of the proposed approach with respect to the above two points.
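The EM iteration underlying such mixture models can be sketched for the plain case. This is standard EM for a two-component, one-dimensional Gaussian mixture on invented data; the paper's variant additionally ties ("shares") some parameters across reactor-state models, which this sketch does not do.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic 1-D signal samples from two regimes (stand-ins for reactor states).
data = np.concatenate([rng.normal(0.0, 1.0, 300), rng.normal(5.0, 1.0, 300)])

mu = np.array([-1.0, 1.0])      # initial component means
sigma = np.array([1.0, 1.0])    # initial standard deviations
pi = np.array([0.5, 0.5])       # initial mixing weights

for _ in range(50):
    # E-step: posterior responsibility of each component for each point.
    dens = pi * np.exp(-0.5 * ((data[:, None] - mu) / sigma) ** 2) / sigma
    resp = dens / dens.sum(axis=1, keepdims=True)
    # M-step: re-estimate weights, means, and variances from responsibilities.
    nk = resp.sum(axis=0)
    pi = nk / len(data)
    mu = (resp * data[:, None]).sum(axis=0) / nk
    sigma = np.sqrt((resp * (data[:, None] - mu) ** 2).sum(axis=0) / nk)
```

In the shared-parameter version, the M-step re-estimates a tied parameter from the pooled responsibilities of every mixture that uses it, which is what lets similar signals in different reactor states reinforce a common component.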

1 citation