Conference

Neural Information Processing Systems

About: Neural Information Processing Systems is an academic conference. The conference publishes mainly in the areas of artificial neural networks and reinforcement learning. Over its lifetime, the conference has published 12,957 papers, which have received 1,236,093 citations.


Papers

Open access · Proceedings Article
03 Dec 2012
Abstract: We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0%, which are considerably better than the previous state of the art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers we employed a recently-developed regularization method called "dropout" that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.
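
The "dropout" regularizer the abstract credits with reducing overfitting zeroes a random subset of hidden units on each training pass. A minimal NumPy sketch of the now-common inverted variant (which rescales at training time; the original paper instead halved the weights at test time), with illustrative names:

```python
import numpy as np

def dropout(x, p=0.5, training=True, rng=np.random.default_rng()):
    """Inverted dropout (illustrative sketch, not the paper's exact code).

    Each activation is zeroed with probability p during training, and the
    survivors are scaled by 1/(1 - p) so the expected activation matches
    what the full network produces at test time.
    """
    if not training or p == 0.0:
        return x
    mask = rng.random(x.shape) >= p   # keep each unit with probability 1 - p
    return x * mask / (1.0 - p)

# Example: dropout on a batch of fully-connected activations
h = np.random.randn(4, 8)
h_train = dropout(h, p=0.5, training=True)   # randomly zeroed and rescaled
h_test = dropout(h, training=False)          # identity at test time
```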


Topics: Convolutional neural network (61%), Deep learning (59%), Dropout (neural networks) (54%)

73,871 Citations


Open access · Journal Article · DOI: 10.3156/JSOFT.29.5_177_2
Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, +4 more · Institutions (2)
08 Dec 2014
Abstract: We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G. The training procedure for G is to maximize the probability of D making a mistake. This framework corresponds to a minimax two-player game. In the space of arbitrary functions G and D, a unique solution exists, with G recovering the training data distribution and D equal to ½ everywhere. In the case where G and D are defined by multilayer perceptrons, the entire system can be trained with backpropagation. There is no need for any Markov chains or unrolled approximate inference networks during either training or generation of samples. Experiments demonstrate the potential of the framework through qualitative and quantitative evaluation of the generated samples.
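
The minimax game described above has a closed form worth writing out; both formulas below are from the paper's own analysis:

```latex
% Value function of the two-player minimax game
\min_G \max_D V(D, G)
  = \mathbb{E}_{x \sim p_{\mathrm{data}}}\!\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z}\!\left[\log\bigl(1 - D(G(z))\bigr)\right]

% For a fixed generator G, the optimal discriminator is
D^{*}_{G}(x) = \frac{p_{\mathrm{data}}(x)}{p_{\mathrm{data}}(x) + p_{g}(x)}
```

At the unique solution the generator's distribution p_g equals p_data, so D*(x) = 1/2 everywhere, which is exactly the "D equal to ½ everywhere" claim in the abstract.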


Topics: Generative model (64%), Discriminative model (54%), Approximate inference (53%)

29,410 Citations


Open access · Proceedings Article
03 Jan 2001
Abstract: We propose a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams [6], and Hofmann's aspect model, also known as probabilistic latent semantic indexing (pLSI) [3]. In the context of text modeling, our model posits that each document is generated as a mixture of topics, where the continuous-valued mixture proportions are distributed as a latent Dirichlet random variable. Inference and learning are carried out efficiently via variational algorithms. We present empirical results on applications of this model to problems in text modeling, collaborative filtering, and text classification.
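
The generative process in the abstract is short enough to sketch directly: draw a per-document topic mixture from a Dirichlet, then draw a topic and a word for each position. A NumPy sketch under assumed shapes (alpha over K topics, beta a K x V topic-word table); inference, which the paper handles with variational methods, is not shown:

```python
import numpy as np

def generate_document(alpha, beta, n_words, rng=np.random.default_rng()):
    """Sample one document from the LDA generative model (illustrative sketch).

    alpha : (K,)   Dirichlet parameter over K topics
    beta  : (K, V) per-topic word distributions over a V-word vocabulary
    """
    theta = rng.dirichlet(alpha)                   # this document's topic mixture
    words = []
    for _ in range(n_words):
        z = rng.choice(len(alpha), p=theta)        # latent topic for this position
        w = rng.choice(beta.shape[1], p=beta[z])   # word drawn from topic z
        words.append(w)
    return words

# Example: 3 topics, 10-word vocabulary, one 20-word document
K, V = 3, 10
beta = np.random.dirichlet(np.ones(V), size=K)     # random topic-word table
doc = generate_document(alpha=np.ones(K), beta=beta, n_words=20)
```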


25,546 Citations


Open access · Proceedings Article
Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S. Corrado, +1 more · Institutions (1)
05 Dec 2013
Abstract: The recently introduced continuous Skip-gram model is an efficient method for learning high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships. In this paper we present several extensions that improve both the quality of the vectors and the training speed. By subsampling the frequent words we obtain significant speedup and also learn more regular word representations. We also describe a simple alternative to the hierarchical softmax called negative sampling. An inherent limitation of word representations is their indifference to word order and their inability to represent idiomatic phrases. For example, the meanings of "Canada" and "Air" cannot be easily combined to obtain "Air Canada". Motivated by this example, we present a simple method for finding phrases in text, and show that learning good vector representations for millions of phrases is possible.
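
Negative sampling, the paper's simple alternative to the hierarchical softmax, scores each observed (center, context) pair against k sampled "noise" words. A NumPy sketch of the per-pair loss, with illustrative function and variable names:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgns_loss(center, context, negatives):
    """Skip-gram negative-sampling loss for one training pair (sketch).

    center    : (d,)   vector of the center word
    context   : (d,)   vector of the observed context word
    negatives : (k, d) vectors of k words sampled from a noise distribution
    """
    positive = np.log(sigmoid(center @ context))             # pull the true pair together
    negative = np.sum(np.log(sigmoid(-negatives @ center)))  # push noise words away
    return -(positive + negative)    # negative log-likelihood to minimize

# The paper's frequent-word subsampling step discards a word w with
# probability 1 - sqrt(t / f(w)), where f(w) is w's corpus frequency
# and t is a small threshold (around 1e-5 in the paper).
```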


Topics: Word2vec (60%), Word embedding (58%), Word order (52%)

23,982 Citations


Open access · Proceedings Article
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, +4 more · Institutions (2)
12 Jun 2017
Abstract: The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder and decoder configuration. The best performing such models also connect the encoder and decoder through an attention mechanism. We propose a novel, simple network architecture based solely on an attention mechanism, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our single model with 165 million parameters achieves 27.5 BLEU on English-to-German translation, improving over the existing best ensemble result by over 1 BLEU. On English-to-French translation, we outperform the previous single-model state of the art by 0.7 BLEU, achieving a BLEU score of 41.1.
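
The attention mechanism the architecture is built on is the paper's scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k))V. A single-head NumPy sketch (the multi-head projections and decoder masking are omitted):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V for one attention head, no masking.

    Q : (n_q, d_k) queries;  K : (n_k, d_k) keys;  V : (n_k, d_v) values
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                # query-key similarities
    scores -= scores.max(axis=-1, keepdims=True)   # stabilize the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # each query's weights sum to 1
    return weights @ V                             # weighted average of the values

# Example: 4 query positions attending over 6 key/value positions
Q, K = np.random.randn(4, 8), np.random.randn(6, 8)
V = np.random.randn(6, 16)
out = scaled_dot_product_attention(Q, K, V)        # shape (4, 16)
```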


Topics: Machine translation (58%), Encoder (52%), BLEU (51%)

21,996 Citations


Performance Metrics

No. of papers from the conference in previous years:

Year    Papers
2021    516
2020    1,876
2019    1,473
2018    1,054
2017    785
2016    651

Top Attributes

Conference's top 5 most impactful authors

Michael I. Jordan: 105 papers, 49.9K citations
Yoshua Bengio: 76 papers, 53.5K citations
Bernhard Schölkopf: 72 papers, 11.7K citations
Francis Bach: 42 papers, 5.4K citations
Geoffrey E. Hinton: 42 papers, 82.1K citations

Network Information
Related Conferences (5)
International Conference on Learning Representations: 3.3K papers, 458.8K citations (96% related)
International Conference on Machine Learning: 10.6K papers, 788.5K citations (95% related)
Conference on Learning Theory: 1.8K papers, 124K citations (91% related)
Uncertainty in Artificial Intelligence: 2.8K papers, 135.2K citations (90% related)