Institution
Company•Tel Aviv, Israel•
About: Facebook is a company organization based out in Tel Aviv, Israel. It is known for research contribution in the topics: Computer science & Artificial neural network. The organization has 7856 authors who have published 10906 publications receiving 570123 citations. The organization is also known as: facebook.com & FB.
Topics: Computer science, Artificial neural network, Language model, Context (language use), Reinforcement learning
Papers published on a yearly basis
Papers
More filters
•
TL;DR: This paper shows that learning longer term patterns in real data, such as in natural language, is perfectly possible using gradient descent, by using a slight structural modification of the simple recurrent neural network architecture.
Abstract: Recurrent neural network is a powerful model that learns temporal patterns in sequential data. For a long time, it was believed that recurrent networks are difficult to train using simple optimizers, such as stochastic gradient descent, due to the so-called vanishing gradient problem. In this paper, we show that learning longer term patterns in real data, such as in natural language, is perfectly possible using gradient descent. This is achieved by using a slight structural modification of the simple recurrent neural network architecture. We encourage some of the hidden units to change their state slowly by making part of the recurrent weight matrix close to identity, thus forming kind of a longer term memory. We evaluate our model in language modeling experiments, where we obtain similar performance to the much more complex Long Short Term Memory (LSTM) networks (Hochreiter & Schmidhuber, 1997).
238 citations
••
01 Jun 2019TL;DR: This paper presents a new data set of 57k annotated utterances in English, Spanish, Spanish and Thai and uses this data set to evaluate three different cross-lingual transfer methods, finding that given several hundred training examples in the the target language, the latter two methods outperform translating the training data.
Abstract: One of the first steps in the utterance interpretation pipeline of many task-oriented conversational AI systems is to identify user intents and the corresponding slots. Since data collection for machine learning models for this task is time-consuming, it is desirable to make use of existing data in a high-resource language to train models in low-resource languages. However, development of such models has largely been hindered by the lack of multilingual training data. In this paper, we present a new data set of 57k annotated utterances in English (43k), Spanish (8.6k) and Thai (5k) across the domains weather, alarm, and reminder. We use this data set to evaluate three different cross-lingual transfer methods: (1) translating the training data, (2) using cross-lingual pre-trained embeddings, and (3) a novel method of using a multilingual machine translation encoder as contextual word representations. We find that given several hundred training examples in the the target language, the latter two methods outperform translating the training data. Further, in very low-resource settings, multilingual contextual word representations give better results than using cross-lingual static embeddings. We also compare the cross-lingual methods to using monolingual resources in the form of contextual ELMo representations and find that given just small amounts of target language data, this method outperforms all cross-lingual methods, which highlights the need for more sophisticated cross-lingual methods.
238 citations
•
25 Nov 2016TL;DR: An adversarial training approach to train semantic segmentation models that can detect and correct higher-order inconsistencies between ground truth segmentation maps and the ones produced by the segmentation net.
Abstract: Adversarial training has been shown to produce state of the art results for generative image modeling. In this paper we propose an adversarial training approach to train semantic segmentation models. We train a convolutional semantic segmentation network along with an adversarial network that discriminates segmentation maps coming either from the ground truth or from the segmentation network. The motivation for our approach is that it can detect and correct higher-order inconsistencies between ground truth segmentation maps and the ones produced by the segmentation net. Our experiments show that our adversarial training approach leads to improved accuracy on the Stanford Background and PASCAL VOC 2012 datasets.
237 citations
••
27 Apr 2015TL;DR: The pace of change and increasing complexity of modern code makes it difficult to produce error-free software, so available tools are often lacking in helping programmers develop more reliable and secure applications.
Abstract: For organisations like Facebook, high quality software is important. However, the pace of change and increasing complexity of modern code makes it difficult to produce error-free software. Available tools are often lacking in helping programmers develop more reliable and secure applications.
236 citations
•
15 Feb 2018TL;DR: House3D is built, a rich, extensible and efficient environment that contains 45,622 human-designed 3D scenes of houses, equipped with a diverse set of fully labeled 3D objects, textures and scene layouts, based on the SUNCG dataset and an emphasis on semantic-level generalization.
Abstract: Teaching an agent to navigate in an unseen 3D environment is a challenging task, even in the event of simulated environments. To generalize to unseen environments, an agent needs to be robust to low-level variations (e.g. color, texture, object changes), and also high-level variations (e.g. layout changes of the environment). To improve overall generalization, all types of variations in the environment have to be taken under consideration via different level of data augmentation steps. To this end, we propose House3D, a rich, extensible and efficient environment that contains 45,622 human-designed 3D scenes of visually realistic houses, ranging from single-room studios to multi-storied houses, equipped with a diverse set of fully labeled 3D objects, textures and scene layouts, based on the SUNCG dataset (Song this http URL.). The diversity in House3D opens the door towards scene-level augmentation, while the label-rich nature of House3D enables us to inject pixel- & task-level augmentations such as domain randomization (Toubin et. al.) and multi-task training. Using a subset of houses in House3D, we show that reinforcement learning agents trained with an enhancement of different levels of augmentations perform much better in unseen environments than our baselines with raw RGB input by over 8% in terms of navigation success rate. House3D is publicly available at this http URL.
236 citations
Authors
Showing all 7875 results
Name | H-index | Papers | Citations |
---|---|---|---|
Yoshua Bengio | 202 | 1033 | 420313 |
Xiang Zhang | 154 | 1733 | 117576 |
Jitendra Malik | 151 | 493 | 165087 |
Trevor Darrell | 148 | 678 | 181113 |
Christopher D. Manning | 138 | 499 | 147595 |
Robert W. Heath | 128 | 1049 | 73171 |
Pieter Abbeel | 126 | 589 | 70911 |
Yann LeCun | 121 | 369 | 171211 |
Li Fei-Fei | 120 | 420 | 145574 |
Jon Kleinberg | 117 | 444 | 87865 |
Sergey Levine | 115 | 652 | 59769 |
Richard Szeliski | 113 | 359 | 72019 |
Sanjeev Kumar | 113 | 1325 | 54386 |
Bruce Neal | 108 | 561 | 87213 |
Larry S. Davis | 107 | 693 | 49714 |