Institution
Company•Tel Aviv, Israel•
About: Facebook is a company organization based out in Tel Aviv, Israel. It is known for research contribution in the topics: Artificial neural network & Language model. The organization has 7856 authors who have published 10906 publications receiving 570123 citations. The organization is also known as: facebook.com & FB.
Topics: Artificial neural network, Language model, Reinforcement learning, Machine translation, Social network
Papers published on a yearly basis
Papers
More filters
••
13 May 2013TL;DR: This work finds that the space of subgraph frequencies is governed both by its combinatorial properties --- based on extremal results that constrain all graphs --- as well as by its empirical properties, manifested in the way that real social graphs appear to lie near a simple one-dimensional curve through this space.
Abstract: A growing set of on-line applications are generating data that can be viewed as very large collections of small, dense social graphs --- these range from sets of social groups, events, or collaboration projects to the vast collection of graph neighborhoods in large social networks A natural question is how to usefully define a domain-independent 'coordinate system' for such a collection of graphs, so that the set of possible structures can be compactly represented and understood within a common space In this work, we draw on the theory of graph homomorphisms to formulate and analyze such a representation, based on computing the frequencies of small induced subgraphs within each graph We find that the space of subgraph frequencies is governed both by its combinatorial properties --- based on extremal results that constrain all graphs --- as well as by its empirical properties --- manifested in the way that real social graphs appear to lie near a simple one-dimensional curve through this space We develop flexible frameworks for studying each of these aspects For capturing empirical properties, we characterize a simple stochastic generative model, a single-parameter extension of Erdos-Renyi random graphs, whose stationary distribution over subgraphs closely tracks the one-dimensional concentration of the real social graph families For the extremal properties, we develop a tractable linear program for bounding the feasible space of subgraph frequencies by harnessing a toolkit of known extremal graph theory Together, these two complementary frameworks shed light on a fundamental question pertaining to social graphs: what properties of social graphs are 'social' properties and what properties are 'graph' properties? We conclude with a brief demonstration of how the coordinate system we examine can also be used to perform classification tasks, distinguishing between structures arising from different types of social graphs
186 citations
•
TL;DR: VoxPopuli is introduced, a large-scale multilingual corpus providing 400K hours of unlabeled speech data in 23 languages and it is the largest open data to date for unsupervised representation learning as well as semi-supervised learning.
Abstract: We introduce VoxPopuli, a large-scale multilingual corpus providing 100K hours of unlabelled speech data in 23 languages. It is the largest open data to date for unsupervised representation learning as well as semi-supervised learning. VoxPopuli also contains 1.8K hours of transcribed speeches in 16 languages and their aligned oral interpretations into 5 other languages totaling 5.1K hours. We provide speech recognition baselines and validate the versatility of VoxPopuli unlabelled data in semi-supervised learning under challenging out-of-domain settings. We will release the corpus at this https URL under an open license.
186 citations
•
01 Jan 2018TL;DR: In this article, a double attention block is proposed to aggregate and propagate informative global features from the entire spatio-temporal space of input images/videos, enabling subsequent convolution layers to access features of the entire space efficiently.
Abstract: Learning to capture long-range relations is fundamental to image/video recognition. Existing CNN models generally rely on increasing depth to model such relations which is highly inefficient. In this work, we propose the “double attention block”, a novel component that aggregates and propagates informative global features from the entire spatio-temporal space of input images/videos, enabling subsequent convolution layers to access features from the entire space efficiently. The component is designed with a double attention mechanism in two steps, where the first step gathers features from the entire space into a compact set through second-order attention pooling and the second step adaptively selects and distributes features to each location via another attention. The proposed double attention block is easy to adopt and can be plugged into existing deep neural networks conveniently. We conduct extensive ablation studies and experiments on both image and video recognition tasks for evaluating its performance. On the image recognition task, a ResNet-50 equipped with our double attention blocks outperforms a much larger ResNet-152 architecture on ImageNet-1k dataset with over 40% less the number of parameters and less FLOPs. On the action recognition task, our proposed model achieves the state-of-the-art results on the Kinetics and UCF-101 datasets with significantly higher efficiency than recent works.
185 citations
••
03 Apr 2020TL;DR: The approximate financial and environmental costs of training and tuning neural network models for NLP are quantified, incorporating updated estimates and broader information from recent related publications, and actionable recommendations to reduce costs and improve equity in the machine learning and artificial intelligence community are provided.
Abstract: The field of artificial intelligence has experienced a dramatic methodological shift towards large neural networks trained on plentiful data This shift has been fueled by recent advances in hardware and techniques enabling remarkable levels of computation, resulting in impressive advances in AI across many applications However, the massive computation required to obtain these exciting results is costly both financially, due to the price of specialized hardware and electricity or cloud compute time, and to the environment, as a result of non-renewable energy used to fuel modern tensor processing hardware In a paper published this year at ACL, we brought this issue to the attention of NLP researchers by quantifying the approximate financial and environmental costs of training and tuning neural network models for NLP (Strubell, Ganesh, and McCallum 2019) In this extended abstract, we briefly summarize our findings in NLP, incorporating updated estimates and broader information from recent related publications, and provide actionable recommendations to reduce costs and improve equity in the machine learning and artificial intelligence community
185 citations
•
TL;DR: This work combines a scene representation network with a low-dimensional morphable model which provides explicit control over pose and expressions and shows that this learned volumetric representation allows for photorealistic image generation that surpasses the quality of state-of-the-art video-based reenactment methods.
Abstract: We present dynamic neural radiance fields for modeling the appearance and dynamics of a human face. Digitally modeling and reconstructing a talking human is a key building-block for a variety of applications. Especially, for telepresence applications in AR or VR, a faithful reproduction of the appearance including novel viewpoints or head-poses is required. In contrast to state-of-the-art approaches that model the geometry and material properties explicitly, or are purely image-based, we introduce an implicit representation of the head based on scene representation networks. To handle the dynamics of the face, we combine our scene representation network with a low-dimensional morphable model which provides explicit control over pose and expressions. We use volumetric rendering to generate images from this hybrid representation and demonstrate that such a dynamic neural scene representation can be learned from monocular input data only, without the need of a specialized capture setup. In our experiments, we show that this learned volumetric representation allows for photo-realistic image generation that surpasses the quality of state-of-the-art video-based reenactment methods.
185 citations
Authors
Showing all 7875 results
Name | H-index | Papers | Citations |
---|---|---|---|
Yoshua Bengio | 202 | 1033 | 420313 |
Xiang Zhang | 154 | 1733 | 117576 |
Jitendra Malik | 151 | 493 | 165087 |
Trevor Darrell | 148 | 678 | 181113 |
Christopher D. Manning | 138 | 499 | 147595 |
Robert W. Heath | 128 | 1049 | 73171 |
Pieter Abbeel | 126 | 589 | 70911 |
Yann LeCun | 121 | 369 | 171211 |
Li Fei-Fei | 120 | 420 | 145574 |
Jon Kleinberg | 117 | 444 | 87865 |
Sergey Levine | 115 | 652 | 59769 |
Richard Szeliski | 113 | 359 | 72019 |
Sanjeev Kumar | 113 | 1325 | 54386 |
Bruce Neal | 108 | 561 | 87213 |
Larry S. Davis | 107 | 693 | 49714 |