Institution

Facebook

Company•Tel Aviv, Israel•

About: Facebook is a company based in Tel Aviv, Israel. It is known for research contributions in the topics Artificial neural network and Language model. The organization has 7856 authors who have published 10906 publications, receiving 570123 citations. It is also known as facebook.com and FB.

Topics: Artificial neural network, Language model, Reinforcement learning, Machine translation, Social network

Papers

Posted Content•

Predicting Deeper into the Future of Semantic Segmentation

Pauline Luc¹, Natalia Neverova¹, Camille Couprie¹, Jakob Verbeek, Yann LeCun²•Institutions (2)

Facebook¹, New York University²

22 Mar 2017-arXiv: Computer Vision and Pattern Recognition

TL;DR: An autoregressive convolutional neural network that learns to iteratively generate multiple frames is developed and results show that directly predicting future segmentations is substantially better than predicting and then segmenting future RGB frames.

Abstract: The ability to predict and therefore to anticipate the future is an important attribute of intelligence. It is also of utmost importance in real-time systems, e.g. in robotics or autonomous driving, which depend on visual scene understanding for decision making. While prediction of the raw RGB pixel values in future video frames has been studied in previous work, here we introduce the novel task of predicting semantic segmentations of future frames. Given a sequence of video frames, our goal is to predict segmentation maps of not yet observed video frames that lie up to a second or further in the future. We develop an autoregressive convolutional neural network that learns to iteratively generate multiple frames. Our results on the Cityscapes dataset show that directly predicting future segmentations is substantially better than predicting and then segmenting future RGB frames. Prediction results up to half a second in the future are visually convincing and are much more accurate than those of a baseline based on warping semantic segmentations using optical flow.

179 citations

Posted Content•

Exploring Randomly Wired Neural Networks for Image Recognition

Saining Xie¹, Alexander Kirillov¹, Ross Girshick, Kaiming He¹•Institutions (1)

Facebook¹

02 Apr 2019-arXiv: Computer Vision and Pattern Recognition

TL;DR: In this article, the authors explore a more diverse set of connectivity patterns through the lens of randomly wired neural networks and define the concept of a stochastic network generator that encapsulates the entire network generation process.

Abstract: Neural networks for image recognition have evolved through extensive manual design from simple chain-like models to structures with multiple wiring paths. The success of ResNets and DenseNets is due in large part to their innovative wiring plans. Now, neural architecture search (NAS) studies are exploring the joint optimization of wiring and operation types, however, the space of possible wirings is constrained and still driven by manual design despite being searched. In this paper, we explore a more diverse set of connectivity patterns through the lens of randomly wired neural networks. To do this, we first define the concept of a stochastic network generator that encapsulates the entire network generation process. Encapsulation provides a unified view of NAS and randomly wired networks. Then, we use three classical random graph models to generate randomly wired graphs for networks. The results are surprising: several variants of these random generators yield network instances that have competitive accuracy on the ImageNet benchmark. These results suggest that new efforts focusing on designing better network generators may lead to new breakthroughs by exploring less constrained search spaces with more room for novel design.

179 citations

Patent•

Suggesting search results to users before receiving any search query from the users

Ryan Patterson¹, Michael Dudley Johnson¹•Institutions (1)

Facebook¹

09 Jul 2012

TL;DR: This patent describes compiling a first set of search results based on information known about the user and presenting it when the user accesses a search tool, before the user submits any search query or portion thereof.

Abstract: In one embodiment, in response to a user accessing a search tool and before the user submitting any search query or portion thereof to the search tool, compiling a first set of search results based on information known about the user and presenting the first set of search results to the user.

179 citations

Posted Content•DOI•

MSA Transformer

Roshan Rao¹, Jason Liu¹, Robert Verkuil¹, Joshua Meier¹, John Canny², Pieter Abbeel², Tom Sercu¹, Alexander Rives³•Institutions (3)

Facebook¹, University of California, Berkeley², New York University³

15 Feb 2021-bioRxiv

TL;DR: This article introduces a protein language model that takes as input a set of sequences in the form of a multiple sequence alignment, interleaves row and column attention across the input sequences, and is trained with a variant of the masked language modeling objective across many protein families.

Abstract: Unsupervised protein language models trained across millions of diverse sequences learn structure and function of proteins. Protein language models studied to date have been trained to perform inference from individual sequences. The longstanding approach in computational biology has been to make inferences from a family of evolutionarily related sequences by fitting a model to each family independently. In this work we combine the two paradigms. We introduce a protein language model which takes as input a set of sequences in the form of a multiple sequence alignment. The model interleaves row and column attention across the input sequences and is trained with a variant of the masked language modeling objective across many protein families. The performance of the model surpasses current state-of-the-art unsupervised structure learning methods by a wide margin, with far greater parameter efficiency than prior state-of-the-art protein language models.

179 citations

Proceedings Article•

Empirical Analysis of the Hessian of Over-Parametrized Neural Networks

Levent Sagun¹, Utku Evci², V. Ugur Guney³, Yann N. Dauphin⁴, Léon Bottou⁴•Institutions (4)

New York University¹, Google², City University of New York³, Facebook⁴

11 Feb 2018

TL;DR: In this article, the authors study the properties of common loss surfaces through their Hessian matrix and empirically show that the spectrum of the Hessian is composed of two parts: (1) the bulk centered near zero and (2) outliers away from the bulk.

Abstract: We study the properties of common loss surfaces through their Hessian matrix. In particular, in the context of deep learning, we empirically show that the spectrum of the Hessian is composed of two parts: (1) the bulk centered near zero, (2) and outliers away from the bulk. We present numerical evidence and mathematical justifications to the following conjectures laid out by Sagun et al. (2016): Fixing data, increasing the number of parameters merely scales the bulk of the spectrum; fixing the dimension and changing the data (for instance adding more clusters or making the data less separable) only affects the outliers. We believe that our observations have striking implications for non-convex optimization in high dimensions. First, the flatness of such landscapes (which can be measured by the singularity of the Hessian) implies that classical notions of basins of attraction may be quite misleading. And that the discussion of wide/narrow basins may be in need of a new perspective around over-parametrization and redundancy that are able to create large connected components at the bottom of the landscape. Second, the dependence of small number of large eigenvalues to the data distribution can be linked to the spectrum of the covariance matrix of gradients of model outputs. With this in mind, we may reevaluate the connections within the data-architecture-algorithm framework of a model, hoping that it would shed light into the geometry of high-dimensional and non-convex spaces in modern applications. In particular, we present a case that links the two observations: small and large batch gradient descent appear to converge to different basins of attraction but we show that they are in fact connected through their flat region and so belong to the same basin.

178 citations

Authors

Showing all 7875 results

Name	H-index	Papers	Citations
Yoshua Bengio	202	1033	420313
Xiang Zhang	154	1733	117576
Jitendra Malik	151	493	165087
Trevor Darrell	148	678	181113
Christopher D. Manning	138	499	147595
Robert W. Heath	128	1049	73171
Pieter Abbeel	126	589	70911
Yann LeCun	121	369	171211
Li Fei-Fei	120	420	145574
Jon Kleinberg	117	444	87865
Sergey Levine	115	652	59769
Richard Szeliski	113	359	72019
Sanjeev Kumar	113	1325	54386
Bruce Neal	108	561	87213
Larry S. Davis	107	693	49714

Network Information

Related Institutions (5)

Google

39.8K papers, 2.1M citations

98% related

Microsoft

86.9K papers, 4.1M citations

96% related

Adobe Systems

8K papers, 214.7K citations

94% related

Carnegie Mellon University

104.3K papers, 5.9M citations

38.6K papers, 1.3M citations

90% related

Performance

Metrics

10,939 papers, 851,954 citations

No. of papers from the Institution in previous years
Year	Papers
2024	1
2022	37
2021	1,738
2020	2,017
2019	1,607
2018	1,229