Razvan Pascanu
Researcher at Google
Publications - 151
Citations - 40977
Razvan Pascanu is an academic researcher at Google. His research focuses on artificial neural networks and reinforcement learning. He has an h-index of 67 and has co-authored 151 publications receiving 32887 citations. Previous affiliations of Razvan Pascanu include Université de Montréal.
Papers
Posted Content
On the difficulty of training Recurrent Neural Networks
TL;DR: This paper proposes a gradient norm clipping strategy to deal with exploding gradients and a soft constraint for the vanishing gradient problem, and validates the hypothesis and the proposed solutions empirically.
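The clipping strategy described in the TL;DR can be sketched in a few lines. This is an illustrative NumPy version, not the authors' code; the threshold value is an arbitrary choice for the example.

```python
import numpy as np

def clip_gradient_norm(grad, threshold):
    """Rescale grad so that its L2 norm does not exceed threshold."""
    norm = np.linalg.norm(grad)
    if norm > threshold:
        grad = grad * (threshold / norm)
    return grad

g = np.array([3.0, 4.0])                  # L2 norm is 5.0
clipped = clip_gradient_norm(g, 1.0)      # rescaled to norm 1.0
unchanged = clip_gradient_norm(np.array([0.3, 0.4]), 1.0)  # norm 0.5, untouched
```

In practice the same rescaling is applied to the concatenated gradient of all parameters at each update step, so the update direction is preserved while its magnitude is capped.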
Posted Content
Overcoming catastrophic forgetting in neural networks
James Kirkpatrick,Razvan Pascanu,Neil C. Rabinowitz,Joel Veness,Guillaume Desjardins,Andrei Rusu,Kieran Milan,John Quan,Tiago Ramalho,Agnieszka Grabska-Barwinska,Demis Hassabis,Claudia Clopath,Dharshan Kumaran,Raia Hadsell +13 more
TL;DR: It is shown that this limitation of connectionist models can be overcome: networks can maintain expertise on tasks they have not experienced for a long time by selectively slowing down learning on the weights important for previous tasks.
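The "selective slowing down" in this paper takes the form of a quadratic penalty (elastic weight consolidation) that anchors weights deemed important for earlier tasks. A minimal NumPy sketch of the penalty term, with made-up Fisher-information values for illustration:

```python
import numpy as np

def ewc_penalty(theta, theta_old, fisher, lam):
    """EWC penalty: (lam / 2) * sum_i F_i * (theta_i - theta_old_i)^2.
    Weights with large Fisher values are expensive to move away from
    their post-previous-task values; unimportant weights stay free."""
    return 0.5 * lam * np.sum(fisher * (theta - theta_old) ** 2)

theta_old = np.array([1.0, -2.0])   # weights after the previous task
fisher    = np.array([10.0, 0.1])   # per-weight importance (illustrative)
theta     = np.array([1.5,  0.0])   # current weights on the new task

penalty = ewc_penalty(theta, theta_old, fisher, lam=1.0)  # 1.45
```

During training on a new task, this penalty is added to the new task's loss, so gradient descent trades off new-task performance against drift on important weights.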
Journal ArticleDOI
Overcoming catastrophic forgetting in neural networks
James Kirkpatrick,Razvan Pascanu,Neil C. Rabinowitz,Joel Veness,Guillaume Desjardins,Andrei Rusu,Kieran Milan,John Quan,Tiago Ramalho,Agnieszka Grabska-Barwinska,Demis Hassabis,Claudia Clopath,Dharshan Kumaran,Raia Hadsell +13 more
TL;DR: In this paper, the authors show that it is possible to train networks that can maintain expertise on tasks that they have not experienced for a long time by selectively slowing down learning on the weights important for those tasks.
Proceedings Article
On the difficulty of training recurrent neural networks
TL;DR: In this article, a gradient norm clipping strategy is proposed to deal with the exploding gradient problem, and a soft constraint to address the vanishing gradient problem, in recurrent neural networks.
Posted Content
Theano: A Python framework for fast computation of mathematical expressions
Rami Al-Rfou,Guillaume Alain,Amjad Almahairi,Christof Angermueller,Dzmitry Bahdanau,Nicolas Ballas,Frédéric Bastien,Justin Bayer,Anatoly Belikov,Alexander Belopolsky,Yoshua Bengio,Arnaud Bergeron,James Bergstra,Valentin Bisson,Josh Bleecher Snyder,Nicolas Bouchard,Nicolas Boulanger-Lewandowski,Xavier Bouthillier,Alexandre de Brébisson,Olivier Breuleux,Pierre Luc Carrier,Kyunghyun Cho,Jan Chorowski,Paul F. Christiano,Tim Cooijmans,Marc-Alexandre Côté,Myriam Côté,Aaron Courville,Yann N. Dauphin,Olivier Delalleau,Julien Demouth,Guillaume Desjardins,Sander Dieleman,Laurent Dinh,Mélanie Ducoffe,Vincent Dumoulin,Samira Ebrahimi Kahou,Dumitru Erhan,Ziye Fan,Orhan Firat,Mathieu Germain,Xavier Glorot,Ian Goodfellow,Matthew M. Graham,Caglar Gulcehre,Philippe Hamel,Iban Harlouchet,Jean-Philippe Heng,Balázs Hidasi,Sina Honari,Arjun Jain,Sébastien Jean,Kai Jia,Mikhail Korobov,Vivek Kulkarni,Alex Lamb,Pascal Lamblin,Eric Larsen,César Laurent,Sean Lee,Simon Lefrancois,Simon Lemieux,Nicholas Léonard,Zhouhan Lin,Jesse A. Livezey,Cory Lorenz,Jeremiah Lowin,Qianli Ma,Pierre-Antoine Manzagol,Olivier Mastropietro,Robert T. McGibbon,Roland Memisevic,Bart van Merriënboer,Vincent Michalski,Mehdi Mirza,Alberto Orlandi,Chris Pal,Razvan Pascanu,Mohammad Pezeshki,Colin Raffel,Daniel Renshaw,Matthew Rocklin,Adriana Romero,Markus Roth,Peter Sadowski,John Salvatier,François Savard,Jan Schlüter,John Schulman,Gabriel Schwartz,Iulian Vlad Serban,Dmitriy Serdyuk,Samira Shabanian,Étienne Simon,Sigurd Spieckermann,S. Ramana Subramanyam,Jakub Sygnowski,Jérémie Tanguay,Gijs van Tulder,Joseph Turian,Sebastian Urban,Pascal Vincent,Francesco Visin,Harm de Vries,David Warde-Farley,Dustin J. Webb,Matthew Willson,Kelvin Xu,Lijun Xue,Li Yao,Saizheng Zhang,Ying Zhang +111 more
TL;DR: The performance of Theano is compared against Torch7 and TensorFlow on several machine learning models and recently-introduced functionalities and improvements are discussed.