
Showing papers by Koray Kavukcuoglu published in 2011


Journal Article
TL;DR: A unified neural network architecture and learning algorithm that can be applied to various natural language processing tasks including part-of-speech tagging, chunking, named entity recognition, and semantic role labeling is proposed.
Abstract: We propose a unified neural network architecture and learning algorithm that can be applied to various natural language processing tasks including part-of-speech tagging, chunking, named entity recognition, and semantic role labeling. This versatility is achieved by trying to avoid task-specific engineering and therefore disregarding a lot of prior knowledge. Instead of exploiting man-made input features carefully optimized for each task, our system learns internal representations on the basis of vast amounts of mostly unlabeled training data. This work is then used as a basis for building a freely available tagging system with good performance and minimal computational requirements.

6,734 citations
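
A minimal sketch of the window-based tagging architecture the abstract above describes, written in PyTorch. The layer sizes, window width, and tag count below are illustrative assumptions, not the configuration used in the paper.

```python
import torch
import torch.nn as nn

# Illustrative sizes (assumptions, not the paper's configuration).
VOCAB_SIZE, EMBED_DIM, WINDOW, HIDDEN, NUM_TAGS = 10000, 50, 5, 300, 45

class WindowTagger(nn.Module):
    """Scores tags for the center word of a fixed-size window of word indices."""
    def __init__(self):
        super().__init__()
        # Word embeddings are learned from data instead of hand-crafted features.
        self.embed = nn.Embedding(VOCAB_SIZE, EMBED_DIM)
        self.mlp = nn.Sequential(
            nn.Linear(WINDOW * EMBED_DIM, HIDDEN),
            nn.Tanh(),                    # plain tanh; the paper uses a hard variant
            nn.Linear(HIDDEN, NUM_TAGS),  # one score per tag (e.g. POS tags)
        )

    def forward(self, windows):            # windows: (batch, WINDOW) word indices
        e = self.embed(windows)             # (batch, WINDOW, EMBED_DIM)
        return self.mlp(e.flatten(1))       # (batch, NUM_TAGS) tag scores

model = WindowTagger()
scores = model(torch.randint(0, VOCAB_SIZE, (8, WINDOW)))
print(scores.shape)  # torch.Size([8, 45])
```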


Proceedings Article
01 Jan 2011
TL;DR: Torch7 is a versatile numeric computing framework and machine learning library that extends Lua; it can easily be interfaced to third-party software thanks to Lua’s light interface.
Abstract: Torch7 is a versatile numeric computing framework and machine learning library that extends Lua. Its goal is to provide a flexible environment to design and train learning machines. Flexibility is obtained via Lua, an extremely lightweight scripting language. High performance is obtained via efficient OpenMP/SSE and CUDA implementations of low-level numeric routines. Torch7 can easily be interfaced to third-party software thanks to Lua’s light interface.

1,602 citations
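
Torch7 itself is a Lua library, so no Python code appears in the paper; the snippet below uses PyTorch, its later Python-based descendant, only to illustrate the kind of define-and-train workflow the abstract describes. The model shape, data, and hyperparameters are arbitrary.

```python
import torch
import torch.nn as nn

# A toy define-and-train loop. This mirrors the workflow Torch7 targets but is
# written against PyTorch's API, not Torch7's Lua API; all sizes are arbitrary.
model = nn.Sequential(nn.Linear(10, 25), nn.Tanh(), nn.Linear(25, 1))
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

x, y = torch.randn(100, 10), torch.randn(100, 1)  # random regression data
for _ in range(10):                                # a few gradient steps
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
print(loss.item())
```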


Posted Content
TL;DR: The authors proposed a unified neural network architecture and learning algorithm that can be applied to various natural language processing tasks including: part-of-speech tagging, chunking, named entity recognition, and semantic role labeling.
Abstract: We propose a unified neural network architecture and learning algorithm that can be applied to various natural language processing tasks including: part-of-speech tagging, chunking, named entity recognition, and semantic role labeling. This versatility is achieved by trying to avoid task-specific engineering and therefore disregarding a lot of prior knowledge. Instead of exploiting man-made input features carefully optimized for each task, our system learns internal representations on the basis of vast amounts of mostly unlabeled training data. This work is then used as a basis for building a freely available tagging system with good performance and minimal computational requirements.

902 citations


Book ChapterDOI
01 Jan 2011
TL;DR: Most feature extraction systems share a common structure composed of a filter bank, a nonlinear operation (quantization, winner-take-all, sparsification, normalization, and/or pointwise saturation), and finally a pooling operation (max, average, or histogramming).
Abstract: […] appropriate internal representations automatically, the way animals and humans seem to learn by simply looking at the world? In the time-honored approach to computer vision (and to pattern recognition in general), the question is avoided: internal representations are produced by a hand-crafted feature extractor, whose output is fed to a trainable classifier. While the issue of learning features has been a topic of interest for many years, considerable progress has been achieved in the last few years with the development of so-called […]

134 citations
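
The common structure named in the TL;DR above (filter bank, pointwise nonlinearity with normalization, pooling) can be made concrete with a short PyTorch sketch. The filter count, kernel size, and choice of ReLU plus local response normalization are assumptions for illustration, not a configuration from the chapter.

```python
import torch
import torch.nn as nn

# Filter bank -> nonlinearity + local normalization -> pooling.
feature_extractor = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=9),       # filter bank: 16 learned 9x9 filters
    nn.ReLU(),                              # pointwise nonlinear operation
    nn.LocalResponseNorm(5),                # local normalization across channels
    nn.MaxPool2d(kernel_size=2, stride=2),  # pooling: max over 2x2 neighborhoods
)

image = torch.randn(1, 1, 64, 64)           # one grayscale 64x64 input
features = feature_extractor(image)
print(features.shape)                       # torch.Size([1, 16, 28, 28])
```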


Proceedings Article
01 Jan 2011
TL;DR: A system is presented that automatically learns features from audio in an unsupervised manner, using an overcomplete dictionary that sparsely decomposes log-scaled spectrograms and an efficient encoder that quickly maps new inputs to approximations of their sparse representations under the learned dictionary.
Abstract: In this work we present a system to automatically learn features from audio in an unsupervised manner. Our method first learns an overcomplete dictionary which can be used to sparsely decompose log-scaled spectrograms. It then trains an efficient encoder which quickly maps new inputs to approximations of their sparse representations using the learned dictionary. This avoids expensive iterative procedures usually required to infer sparse codes. We then use these sparse codes as inputs for a linear Support Vector Machine (SVM). Our system achieves 83.4% accuracy in predicting genres on the GTZAN dataset, which is competitive with current state-of-the-art approaches. Furthermore, the use of a simple linear classifier combined with a fast feature extraction system allows our approach to scale well to large datasets.

130 citations
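
A rough scikit-learn sketch of the pipeline the abstract above describes: learn an overcomplete dictionary over spectrogram frames, compute sparse codes, and feed them to a linear SVM. The random stand-in data, dimensions, and hyperparameters are assumptions, and the iterative sparse coder here stands in for the paper's fast learned encoder.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning
from sklearn.svm import LinearSVC

# Stand-in data: rows play the role of log-scaled spectrogram frames.
rng = np.random.default_rng(0)
frames = rng.standard_normal((500, 128))    # 500 frames, 128 frequency bins
genres = rng.integers(0, 10, size=500)      # 10 genre labels, as in GTZAN

# Learn an overcomplete dictionary (more atoms than input dimensions) and
# compute sparse codes. The paper trains a feed-forward encoder so that this
# iterative inference is not needed at test time.
dico = MiniBatchDictionaryLearning(n_components=256, alpha=1.0, random_state=0)
codes = dico.fit(frames).transform(frames)

# Linear SVM on the sparse codes.
clf = LinearSVC().fit(codes, genres)
print(clf.score(codes, genres))
```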


Patent
26 Apr 2011
TL;DR: In this patent, a cell phone having distributed artificial intelligence services is provided, which includes a neural network for performing a first pass of object recognition on an image to identify objects of interest therein based on one or more criteria.
Abstract: A cell phone having distributed artificial intelligence services is provided. The cell phone includes a neural network for performing a first pass of object recognition on an image to identify objects of interest therein based on one or more criteria. The cell phone also includes a patch generator for deriving patches from the objects of interest. Each of the patches includes a portion of a respective one of the objects of interest. The cell phone additionally includes a transmitter for transmitting the patches to a server for further processing in place of the entirety of the image, thereby reducing network traffic.

28 citations
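
A minimal sketch of the data flow the patent abstract describes: a first pass of detection on the device, patches cropped from the detected objects, and only those patches forwarded for server-side processing. The detector stub, box coordinates, and image size are hypothetical placeholders, not taken from the patent.

```python
import numpy as np

def first_pass_detect(image):
    """Stand-in for the on-device neural network's first pass: returns
    bounding boxes (x, y, w, h) for regions deemed interesting. A real
    implementation would run a detection model here."""
    return [(10, 10, 32, 32), (80, 40, 32, 32)]   # hypothetical detections

def extract_patches(image, boxes):
    """Crop one patch per detected object instead of keeping the whole frame."""
    return [image[y:y + h, x:x + w] for (x, y, w, h) in boxes]

frame = np.zeros((240, 320, 3), dtype=np.uint8)    # placeholder camera frame
patches = extract_patches(frame, first_pass_detect(frame))

# Only the patches, not the entire image, would be transmitted to the server,
# which is what reduces network traffic.
payload = [p.tobytes() for p in patches]
print(len(payload), sum(len(b) for b in payload), frame.nbytes)  # 2 6144 230400
```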