Institution

Facebook

Company · Tel Aviv, Israel
About: Facebook is a company based in Tel Aviv, Israel. It is known for research contributions in the topics: Artificial neural network & Language model. The organization has 7,856 authors who have published 10,906 publications receiving 570,123 citations. The organization is also known as: facebook.com & FB.


Papers
Proceedings Article
15 Jun 2019
TL;DR: A novel model architecture is introduced that reads text in the image, reasons about it in the context of the image and the question, and predicts an answer that may be a deduction based on the text and the image or composed of strings found in the image.
Abstract: Studies have shown that a dominant class of questions asked by visually impaired users on images of their surroundings involves reading text in the image. But today’s VQA models cannot read! Our paper takes a first step towards addressing this problem. First, we introduce a new “TextVQA” dataset to facilitate progress on this important problem. Existing datasets either have a small proportion of questions about text (e.g., the VQA dataset) or are too small (e.g., the VizWiz dataset). TextVQA contains 45,336 questions on 28,408 images that require reasoning about text to answer. Second, we introduce a novel model architecture that reads text in the image, reasons about it in the context of the image and the question, and predicts an answer which might be a deduction based on the text and the image or composed of the strings found in the image. Consequently, we call our approach Look, Read, Reason & Answer (LoRRA). We show that LoRRA outperforms existing state-of-the-art VQA models on our TextVQA dataset. We find that the gap between human performance and machine performance is significantly larger on TextVQA than on VQA 2.0, suggesting that TextVQA is well-suited to benchmark progress along directions complementary to VQA 2.0.
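
As a rough illustration of the answer module the abstract describes, the sketch below extends a fixed answer vocabulary with pointer-style scores over OCR tokens, so a single classifier head can either pick a vocabulary answer or copy a string read from the image. This is a minimal PyTorch sketch under assumed names and shapes (`AnswerModule`, `fused_features`), not the authors' implementation.

```python
import torch
import torch.nn as nn

class AnswerModule(nn.Module):
    """Scores answers as either vocabulary entries or copied OCR tokens."""

    def __init__(self, feat_dim, vocab_size, max_ocr_tokens):
        super().__init__()
        # Head over a fixed answer vocabulary.
        self.vocab_head = nn.Linear(feat_dim, vocab_size)
        # Pointer-style head: one score per OCR token detected in the image.
        self.copy_head = nn.Linear(feat_dim, max_ocr_tokens)

    def forward(self, fused_features):
        # fused_features: (B, feat_dim) joint question/image/OCR representation.
        vocab_scores = self.vocab_head(fused_features)  # (B, vocab_size)
        copy_scores = self.copy_head(fused_features)    # (B, max_ocr_tokens)
        # Concatenating lets a single argmax pick either a vocabulary answer
        # or one of the strings found in the image.
        return torch.cat([vocab_scores, copy_scores], dim=-1)
```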

363 citations

Proceedings Article
07 Aug 2017
TL;DR: The system, called SilkRoad, is defined in a 400-line P4 program and, when compiled to a state-of-the-art switching ASIC, can load-balance ten million connections simultaneously at line rate.
Abstract: In this paper, we show that up to hundreds of software load balancer (SLB) servers can be replaced by a single modern switching ASIC, potentially reducing the cost of load balancing by over two orders of magnitude. Today, large data centers typically employ hundreds or thousands of servers to load-balance incoming traffic over application servers. These software load balancers (SLBs) map packets destined to a service (with a virtual IP address, or VIP) to a pool of servers tasked with providing the service (with multiple direct IP addresses, or DIPs). An SLB is stateful: it must always map a connection to the same server, even if the pool of servers changes and/or the load is spread differently across the pool. This property is called per-connection consistency, or PCC. The challenge is that the load balancer must keep track of millions of connections simultaneously. Until recently, it was not possible to implement a load balancer with PCC in a merchant switching ASIC, because high-performance switching ASICs typically cannot maintain per-connection state at scale. Newer switching ASICs provide resources and primitives to enable PCC at large scale. In this paper, we explore how to use switching ASICs to build much faster load balancers than have been built before. Our system, called SilkRoad, is defined in a 400-line P4 program; when compiled to a state-of-the-art switching ASIC, we show it can load-balance ten million connections simultaneously at line rate.
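
The core property here, per-connection consistency, is easy to state in code. Below is a purely conceptual Python sketch (SilkRoad itself is a P4 program running on a switching ASIC, so none of these names come from the paper): established connections stay pinned to their DIP even when the pool behind the VIP changes.

```python
import hashlib

class PccLoadBalancer:
    """Conceptual model of per-connection-consistent VIP-to-DIP mapping."""

    def __init__(self, dips):
        self.dips = list(dips)   # current DIP pool behind one VIP
        self.conn_table = {}     # 5-tuple -> DIP; pins established flows

    def _hash_pick(self, five_tuple):
        # Stateless choice for new flows, similar in spirit to ECMP hashing.
        h = int(hashlib.md5(repr(five_tuple).encode()).hexdigest(), 16)
        return self.dips[h % len(self.dips)]

    def forward(self, five_tuple):
        # New connections get a DIP from the current pool; existing ones
        # keep their server even if the pool has changed since (PCC).
        if five_tuple not in self.conn_table:
            self.conn_table[five_tuple] = self._hash_pick(five_tuple)
        return self.conn_table[five_tuple]

    def update_pool(self, dips):
        # Pool changes affect only future connections, never pinned ones.
        self.dips = list(dips)
```

The hard part the paper addresses is doing exactly this bookkeeping for millions of concurrent flows within the memory and per-packet time budget of an ASIC, rather than in server software.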

362 citations

Proceedings Article
30 Apr 2020
TL;DR: It is shown that the likelihood objective itself is at fault, resulting in a model that assigns too much probability to sequences containing repeats and frequent words, unlike those from the human training distribution; the proposed unlikelihood training objective corrects this, providing a strong alternative to existing techniques.
Abstract: Neural text generation is a key tool in natural language applications, but it is well known that there are major problems at its core. In particular, standard likelihood training and decoding lead to dull and repetitive outputs. While some post-hoc fixes have been proposed, in particular top-k and nucleus sampling, they do not address the fact that the token-level probabilities predicted by the model are poor. In this paper we show that the likelihood objective itself is at fault, resulting in a model that assigns too much probability to sequences containing repeats and frequent words, unlike those from the human training distribution. We propose a new objective, unlikelihood training, which forces unlikely generations to be assigned lower probability by the model. We show that both token- and sequence-level unlikelihood training give less repetitive, less dull text while maintaining perplexity, yielding superior generations under standard greedy or beam search. According to human evaluations, our approach with standard beam search also outperforms the currently popular decoding methods of nucleus sampling and beam blocking, thus providing a strong alternative to existing techniques.
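
The unlikelihood idea can be stated compactly: alongside maximizing the likelihood of the gold token, push down the probability the model assigns to a set of negative candidates. The sketch below uses tokens already seen in the prefix as candidates (a natural reading of "sequences containing repeats"); the function name, shapes, and candidate choice are illustrative assumptions, not the paper's code.

```python
import torch
import torch.nn.functional as F

def unlikelihood_loss(logits, targets, alpha=1.0):
    """MLE loss plus a term penalizing probability mass on negative candidates.

    logits: (T, V) next-token scores; targets: (T,) gold token ids.
    """
    log_probs = F.log_softmax(logits, dim=-1)
    mle = F.nll_loss(log_probs, targets)  # standard likelihood term

    probs = log_probs.exp()
    T, _ = probs.shape
    ul = logits.new_zeros(())
    for t in range(1, T):
        # Negative candidates: tokens already seen in the prefix,
        # excluding the gold token at this step.
        prev = targets[:t].unique()
        prev = prev[prev != targets[t]]
        if prev.numel() > 0:
            p = probs[t, prev].clamp(max=1.0 - 1e-6)
            # -log(1 - p(c)): putting high probability on a repeat is penalized.
            ul = ul + (-torch.log1p(-p)).mean()

    return mle + alpha * ul / max(T - 1, 1)
```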

357 citations

Proceedings Article
02 Sep 2020
TL;DR: The authors use reinforcement learning to fine-tune a summarization policy from human feedback; according to human judges, this yields better summaries than optimizing ROUGE, and the models transfer to CNN/DM news articles, producing summaries nearly as good as the human references.
Abstract: As language models become more powerful, training and evaluation are increasingly bottlenecked by the data and metrics used for a particular task. For example, summarization models are often trained to predict human reference summaries and evaluated using ROUGE, but both of these metrics are rough proxies for what we really care about: summary quality. In this work, we show that it is possible to significantly improve summary quality by training a model to optimize for human preferences. We collect a large, high-quality dataset of human comparisons between summaries, train a model to predict the human-preferred summary, and use that model as a reward function to fine-tune a summarization policy using reinforcement learning. We apply our method to a version of the TL;DR dataset of Reddit posts and find that our models significantly outperform both human reference summaries and much larger models fine-tuned with supervised learning alone. Our models also transfer to CNN/DM news articles, producing summaries nearly as good as the human reference without any news-specific fine-tuning. We conduct extensive analyses to understand our human feedback dataset and fine-tuned models. We establish that our reward model generalizes to new datasets, and that optimizing our reward model results in better summaries than optimizing ROUGE, according to humans. We hope the evidence from our paper motivates machine learning researchers to pay closer attention to how their training loss affects the model behavior they actually want.
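
The reward-model step lends itself to a short sketch: given a human comparison, the preferred summary should receive the higher score. Below is a standard pairwise preference loss consistent with the abstract's description ("train a model to predict the human-preferred summary"); the function name and batching are illustrative, not the paper's code.

```python
import torch.nn.functional as F

def preference_loss(r_preferred, r_rejected):
    # r_preferred, r_rejected: (B,) scalar rewards for the two summaries in
    # each human comparison. Minimizing this pushes the preferred summary's
    # reward above the rejected one's.
    return -F.logsigmoid(r_preferred - r_rejected).mean()
```

The fitted reward model then supplies the reward signal when the summarization policy is fine-tuned with reinforcement learning.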

357 citations

Proceedings Article
01 Jun 2015
TL;DR: The 2015 iteration of the SemEval shared task on Sentiment Analysis in Twitter was the most popular sentiment analysis shared task to date with more than 40 teams participating in each of the last three years.
Abstract: In this paper, we describe the 2015 iteration of the SemEval shared task on Sentiment Analysis in Twitter. This was the most popular sentiment analysis shared task to date with more than 40 teams participating in each of the last three years. This year’s shared task competition consisted of five sentiment prediction subtasks. Two were reruns from previous years: (A) sentiment expressed by a phrase in the context of a tweet, and (B) overall sentiment of a tweet. We further included three new subtasks asking to predict (C) the sentiment towards a topic in a single tweet, (D) the overall sentiment towards a topic in a set of tweets, and (E) the degree of prior polarity of a phrase.
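
For concreteness, tweet-level polarity in this task series is conventionally scored by the average F1 of the positive and negative classes (neutral is predicted but not averaged in). The helper below sketches that metric; treat the exact scoring rule as an assumption about the task series rather than a quotation from this paper.

```python
from sklearn.metrics import f1_score

def f1_pn(gold, predicted):
    # gold, predicted: label sequences over {"positive", "negative", "neutral"}.
    f1_pos, f1_neg = f1_score(
        gold, predicted, labels=["positive", "negative"], average=None
    )
    return (f1_pos + f1_neg) / 2.0
```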

356 citations


Authors


Name                     H-index   Papers   Citations
Yoshua Bengio            202       1,033    420,313
Xiang Zhang              154       1,733    117,576
Jitendra Malik           151       493      165,087
Trevor Darrell           148       678      181,113
Christopher D. Manning   138       499      147,595
Robert W. Heath          128       1,049    73,171
Pieter Abbeel            126       589      70,911
Yann LeCun               121       369      171,211
Li Fei-Fei               120       420      145,574
Jon Kleinberg            117       444      87,865
Sergey Levine            115       652      59,769
Richard Szeliski         113       359      72,019
Sanjeev Kumar            113       1,325    54,386
Bruce Neal               108       561      87,213
Larry S. Davis           107       693      49,714

Network Information

Related Institutions (5)

Google: 39.8K papers, 2.1M citations (98% related)
Microsoft: 86.9K papers, 4.1M citations (96% related)
Adobe Systems: 8K papers, 214.7K citations (94% related)
Carnegie Mellon University: 104.3K papers, 5.9M citations (91% related)

Performance Metrics

No. of papers from the Institution in previous years

Year   Papers
2024   1
2022   37
2021   1,738
2020   2,017
2019   1,607
2018   1,229