Institution

Facebook

Company · Menlo Park, California, United States
About: Facebook is a company headquartered in Menlo Park, California. It is known for research contributions in the topics: Artificial neural network & Language model. The organization has 7,856 authors who have published 10,906 publications receiving 570,123 citations. The organization is also known as: facebook.com & FB.


Papers
Proceedings Article
13 May 2013
TL;DR: This paper bounds the error rate of the proposed algorithm, shows that the bound is governed by the expansion of the user-question graph, and demonstrates on several synthetic and real datasets that the algorithm outperforms the state of the art.
Abstract: In this paper we analyze a crowdsourcing system consisting of a set of users and a set of binary-choice questions. Each user has an unknown, fixed reliability that determines the user's error rate in answering questions. The problem is to determine the truth values of the questions solely from the user answers. Although this problem has been studied extensively, theoretical error bounds have been shown only for restricted settings: when the graph between users and questions is either random or complete. In this paper we consider a general setting of the problem where the user-question graph can be arbitrary. We obtain bounds on the error rate of our algorithm and show it is governed by the expansion of the graph. We demonstrate, using several synthetic and real datasets, that our algorithm outperforms the state of the art.
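To make the setting concrete, here is a minimal EM-style sketch that alternates between a reliability-weighted majority vote and re-estimating each user's reliability from agreement. This illustrates the problem setup only; it is not the paper's expansion-based algorithm, and all names are hypothetical:

```python
def infer_truth(answers, n_iters=20):
    """answers: dict mapping (user, question) -> +1/-1 label.

    Sketch of iterative truth inference on a user-question graph:
    E-step estimates answers, M-step re-estimates reliabilities.
    """
    users = {u for u, _ in answers}
    questions = {q for _, q in answers}
    reliability = {u: 1.0 for u in users}  # start fully trusted
    truth = {}
    for _ in range(n_iters):
        # E-step: reliability-weighted majority vote per question.
        for q in questions:
            score = sum(reliability[u] * a
                        for (u, qq), a in answers.items() if qq == q)
            truth[q] = 1 if score >= 0 else -1
        # M-step: reliability = agreement with current estimates,
        # mapped to [-1, 1] so adversarial users get negative weight.
        for u in users:
            votes = [(a, truth[q])
                     for (uu, q), a in answers.items() if uu == u]
            agree = sum(a == t for a, t in votes) / len(votes)
            reliability[u] = 2.0 * agree - 1.0
    return truth, reliability

# Example: two consistent users outvote one unreliable user.
ans = {("u1", "q1"): 1, ("u2", "q1"): 1, ("u3", "q1"): -1,
       ("u1", "q2"): -1, ("u2", "q2"): -1, ("u3", "q2"): 1}
print(infer_truth(ans)[0])
```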

230 citations

Posted Content
TL;DR: LayerDrop is a form of structured dropout that regularizes training and allows for efficient pruning at inference time; it also yields small BERT-like models of higher quality than training from scratch or using distillation.
Abstract: Overparameterized transformer networks have obtained state-of-the-art results in various natural language processing tasks, such as machine translation, language modeling, and question answering. These models contain hundreds of millions of parameters, necessitating a large amount of computation and making them prone to overfitting. In this work, we explore LayerDrop, a form of structured dropout, which has a regularization effect during training and allows for efficient pruning at inference time. In particular, we show that it is possible to select sub-networks of any depth from one large network without having to finetune them and with limited impact on performance. We demonstrate the effectiveness of our approach by improving the state of the art on machine translation, language modeling, summarization, question answering, and language understanding benchmarks. Moreover, we show that our approach leads to small BERT-like models of higher quality compared to training from scratch or using distillation.
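A minimal sketch of the LayerDrop idea, assuming a PyTorch-style stack of encoder layers; the drop probability and the keep-every-k pruning rule are illustrative choices, not the paper's exact recipe:

```python
import torch
import torch.nn as nn

class LayerDropEncoder(nn.Module):
    """Structured dropout over whole layers: each layer is skipped
    with probability p_drop during training, so any sub-network of
    layers remains usable at inference time."""
    def __init__(self, layers, p_drop=0.2):
        super().__init__()
        self.layers = nn.ModuleList(layers)
        self.p_drop = p_drop

    def forward(self, x, keep_every=None):
        for i, layer in enumerate(self.layers):
            if self.training and torch.rand(()) < self.p_drop:
                continue  # drop the entire layer this step
            if keep_every and i % keep_every != 0:
                continue  # prune at inference without finetuning
            x = layer(x)
        return x

# 12-layer encoder trained with LayerDrop ...
enc = LayerDropEncoder([nn.TransformerEncoderLayer(d_model=64, nhead=4)
                        for _ in range(12)])
x = torch.randn(10, 2, 64)      # (seq, batch, d_model)
enc.eval()
out = enc(x, keep_every=2)      # use only every 2nd layer at inference
```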

229 citations

Proceedings Article
Tao Stein, Erdong Chen, Karan Mangla
10 Apr 2011
TL;DR: This paper outlines the design of the Facebook Immune System, the challenges the system has faced and overcome, and the challenges it continues to face.
Abstract: Popular Internet sites are under attack all the time from phishers, fraudsters, and spammers. They aim to steal user information and expose users to unwanted spam. The attackers have vast resources at their disposal. They are well-funded, with full-time skilled labor, control over compromised and infected accounts, and access to global botnets. Protecting our users is a challenging adversarial learning problem with extreme scale and load requirements. Over the past several years we have built and deployed a coherent, scalable, and extensible realtime system to protect our users and the social graph. This Immune System performs realtime checks and classifications on every read and write action. As of March 2011, this is 25B checks per day, reaching 650K per second at peak. The system also generates signals for use as feedback in classifiers and other components. We believe this system has contributed to making Facebook the safest place on the Internet for people and their information. This paper outlines the design of the Facebook Immune System, the challenges we have faced and overcome, and the challenges we continue to face.
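A hedged sketch of the kind of per-action check pipeline the abstract describes: every read/write action is scored by a set of classifiers, and the scores are emitted as feedback signals for retraining. All types and names here are hypothetical illustrations, not Facebook's internal API:

```python
from dataclasses import dataclass, field

@dataclass
class Action:
    user_id: int
    kind: str          # e.g. "message", "friend_request", "post"
    payload: str

@dataclass
class CheckResult:
    allow: bool
    score: float
    signals: dict = field(default_factory=dict)

def check_action(action, classifiers, threshold=0.8):
    """Run all classifiers on one action; block if any is confident spam."""
    scores = {name: clf(action) for name, clf in classifiers.items()}
    worst = max(scores.values())
    # Per-classifier scores become feedback signals for other components.
    return CheckResult(allow=worst < threshold, score=worst, signals=scores)

classifiers = {"spam_keywords":
               lambda a: 0.9 if "free money" in a.payload else 0.1}
res = check_action(Action(1, "message", "claim your free money"), classifiers)
print(res.allow, res.score)    # False 0.9 -> the write is blocked
```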

229 citations

Proceedings Article
27 Sep 2018
TL;DR: In this paper, the authors investigate the efficiency of current lifelong learning approaches in terms of sample complexity and computational and memory cost, and propose an improved version of GEM (Lopez-Paz & Ranzato, 2017), dubbed A-GEM, which matches or exceeds the performance of GEM while being almost as computationally and memory efficient as EWC (Kirkpatrick et al., 2016) and other regularization-based methods.
Abstract: In lifelong learning, the learner is presented with a sequence of tasks, incrementally building a data-driven prior which may be leveraged to speed up learning of a new task. In this work, we investigate the efficiency of current lifelong learning approaches in terms of sample complexity and computational and memory cost. Towards this end, we first introduce a new, more realistic evaluation protocol, whereby learners observe each example only once and hyper-parameter selection is done on a small and disjoint set of tasks, which is not used for the actual learning experience and evaluation. Second, we introduce a new metric measuring how quickly a learner acquires a new skill. Third, we propose an improved version of GEM (Lopez-Paz & Ranzato, 2017), dubbed Averaged GEM (A-GEM), which matches or exceeds the performance of GEM while being almost as computationally and memory efficient as EWC (Kirkpatrick et al., 2016) and other regularization-based methods. Finally, we show that all algorithms, including A-GEM, can learn even more quickly if they are provided with task descriptors specifying the classification tasks under consideration. Our experiments on several standard lifelong learning benchmarks demonstrate that A-GEM has the best trade-off between accuracy and efficiency.
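The core of A-GEM is a single gradient projection: if the current-task gradient g conflicts with the reference gradient g_ref computed on a batch from episodic memory of past tasks (negative inner product), g is projected onto the half-space where the memory loss does not increase. A minimal NumPy sketch of that step:

```python
import numpy as np

def a_gem_project(g, g_ref):
    """A-GEM projection: if g conflicts with the episodic-memory
    gradient g_ref, remove the conflicting component so that the
    projected gradient satisfies g_tilde @ g_ref >= 0."""
    dot = g @ g_ref
    if dot >= 0:
        return g  # no interference with past tasks, keep as-is
    return g - (dot / (g_ref @ g_ref)) * g_ref

g = np.array([1.0, -1.0])
g_ref = np.array([1.0, 1.0])        # g @ g_ref = 0 -> unchanged
print(a_gem_project(g, g_ref))
g2 = np.array([-1.0, -1.0])         # conflicts with memory gradient
print(a_gem_project(g2, g_ref))     # projected to [0., 0.]
```

Unlike GEM, which solves a quadratic program over one constraint per past task, this single averaged constraint needs only a dot product and a vector update, which is where the memory and compute savings come from.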

229 citations

Proceedings Article
15 Jun 2019
TL;DR: The primary empirical finding is that pre-training at a very large scale (over 65 million videos), despite relying on noisy social-media videos and hashtags, substantially improves the state of the art on three challenging public action recognition datasets.
Abstract: Current fully-supervised video datasets consist of only a few hundred thousand videos and fewer than a thousand domain-specific labels. This hinders progress towards advanced video architectures. This paper presents an in-depth study of using large volumes of web videos for pre-training video models for the task of action recognition. Our primary empirical finding is that pre-training at a very large scale (over 65 million videos), despite relying on noisy social-media videos and hashtags, substantially improves the state of the art on three challenging public action recognition datasets. Further, we examine three questions in the construction of weakly-supervised video action datasets. First, given that actions involve interactions with objects, how should one construct a verb-object pre-training label space to benefit transfer learning the most? Second, frame-based models perform quite well on action recognition; is pre-training for good image features sufficient, or is pre-training for spatio-temporal features valuable for optimal transfer learning? Finally, actions are generally less well-localized in long videos than in short videos; since action labels are provided at the video level, how should one choose video clips for best performance, given a fixed budget on the number or total minutes of videos?
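For the last question, one simple baseline is to spend the clip budget uniformly across each video, with a little jitter between epochs. This is offered as an illustration of the trade-off under stated assumptions, not the paper's chosen sampling strategy:

```python
import numpy as np

def sample_clips(video_len_s, clip_len_s=2.0, budget=10, rng=None):
    """Return (start, end) times of up to `budget` fixed-length clips,
    spaced uniformly so a poorly-localized action is still likely covered."""
    rng = rng or np.random.default_rng()
    n = min(budget, int(video_len_s // clip_len_s))
    starts = np.linspace(0, video_len_s - clip_len_s, n)
    # Random jitter so successive epochs see slightly different frames.
    jitter = rng.uniform(-0.5, 0.5, size=n) * clip_len_s
    starts = np.clip(starts + jitter, 0, video_len_s - clip_len_s)
    return [(s, s + clip_len_s) for s in starts]

print(sample_clips(60.0, budget=5))   # 5 two-second clips from a 60s video
```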

229 citations


Authors


Name | H-index | Papers | Citations
Yoshua Bengio | 202 | 1,033 | 420,313
Xiang Zhang | 154 | 1,733 | 117,576
Jitendra Malik | 151 | 493 | 165,087
Trevor Darrell | 148 | 678 | 181,113
Christopher D. Manning | 138 | 499 | 147,595
Robert W. Heath | 128 | 1,049 | 73,171
Pieter Abbeel | 126 | 589 | 70,911
Yann LeCun | 121 | 369 | 171,211
Li Fei-Fei | 120 | 420 | 145,574
Jon Kleinberg | 117 | 444 | 87,865
Sergey Levine | 115 | 652 | 59,769
Richard Szeliski | 113 | 359 | 72,019
Sanjeev Kumar | 113 | 1,325 | 54,386
Bruce Neal | 108 | 561 | 87,213
Larry S. Davis | 107 | 693 | 49,714

Network Information
Network Information
Related Institutions (5)

Google | 39.8K papers, 2.1M citations | 98% related
Microsoft | 86.9K papers, 4.1M citations | 96% related
Adobe Systems | 8K papers, 214.7K citations | 94% related
Carnegie Mellon University | 104.3K papers, 5.9M citations | 91% related

Performance Metrics
No. of papers from the Institution in previous years
Year | Papers
2024 | 1
2022 | 37
2021 | 1,738
2020 | 2,017
2019 | 1,607
2018 | 1,229