Institution

Facebook

Company•Tel Aviv, Israel

About: Facebook is a company based in Tel Aviv, Israel. It is known for research contributions in the topics Computer science and Artificial neural network. The organization has 7856 authors who have published 10906 publications receiving 570123 citations. The organization is also known as facebook.com and FB.

Topics: Computer science, Artificial neural network, Language model, Context (language use), Reinforcement learning

Papers

Proceedings Article•DOI

Visual Storytelling

Ting-Hao Kenneth Huang¹, Francis Ferraro², Nasrin Mostafazadeh³, Ishan Misra¹, Aishwarya Agrawal⁴, Jacob Devlin¹, Ross Girshick⁵, Xiaodong He⁶, Pushmeet Kohli⁶, Dhruv Batra⁴, C. Lawrence Zitnick⁶, Devi Parikh⁴, Lucy Vanderwende⁶, Michel Galley⁶, Margaret Mitchell⁶•Institutions (6)

Carnegie Mellon University¹, Johns Hopkins University², University of Rochester³, Virginia Tech⁴, Facebook⁵, Microsoft⁶

13 Jun 2016

TL;DR: Modelling concrete description as well as figurative and social language, as provided in this dataset and the storytelling task, has the potential to move artificial intelligence from basic understandings of typical visual scenes towards more and more human-like understanding of grounded event structure and subjective expression.

Abstract: We introduce the first dataset for sequential vision-to-language, and explore how this data may be used for the task of visual storytelling. The first release of this dataset, SIND v.1, includes 81,743 unique photos in 20,211 sequences, aligned to both descriptive (caption) and story language. We establish several strong baselines for the storytelling task, and motivate an automatic metric to benchmark progress. Modelling concrete description as well as figurative and social language, as provided in this dataset and the storytelling task, has the potential to move artificial intelligence from basic understandings of typical visual scenes towards more and more human-like understanding of grounded event structure and subjective expression.

184 citations

Proceedings Article

Revisiting Classifier Two-Sample Tests

David Lopez-Paz¹, Maxime Oquab²•Institutions (2)

Facebook¹, École Normale Supérieure²

04 Nov 2016

TL;DR: The properties, performance, and uses of Classifier Two-Sample Tests (C2ST) are established: their main theoretical properties are analyzed, and their use to evaluate the sample quality of generative models with intractable likelihoods, such as Generative Adversarial Networks, is proposed.

Abstract: The goal of two-sample tests is to assess whether two samples, $S_P \sim P^n$ and $S_Q \sim Q^m$, are drawn from the same distribution. Perhaps intriguingly, one relatively unexplored method to build two-sample tests is the use of binary classifiers. In particular, construct a dataset by pairing the $n$ examples in $S_P$ with a positive label, and by pairing the $m$ examples in $S_Q$ with a negative label. If the null hypothesis "$P = Q$" is true, then the classification accuracy of a binary classifier on a held-out subset of this dataset should remain near chance-level. As we will show, such Classifier Two-Sample Tests (C2ST) learn a suitable representation of the data on the fly, return test statistics in interpretable units, have a simple null distribution, and their predictive uncertainty allow to interpret where $P$ and $Q$ differ. The goal of this paper is to establish the properties, performance, and uses of C2ST. First, we analyze their main theoretical properties. Second, we compare their performance against a variety of state-of-the-art alternatives. Third, we propose their use to evaluate the sample quality of generative models with intractable likelihoods, such as Generative Adversarial Networks (GANs). Fourth, we showcase the novel application of GANs together with C2ST for causal discovery.
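
The procedure in this abstract (label the two samples, train a binary classifier, check held-out accuracy against chance) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation. Its assumptions: the data are one-dimensional, the two samples have equal size (so chance level is 1/2), the classifier is a simple k-nearest-neighbour vote, and held-out accuracy under the null is approximated by a normal distribution with mean 1/2 and variance 1/(4 · n_test). The function name `c2st` and its parameters are hypothetical.

```python
import math
import random


def c2st(sample_p, sample_q, k=5, seed=0):
    """Classifier two-sample test on 1-D samples (illustrative sketch).

    Assumes len(sample_p) == len(sample_q) so that chance accuracy is 1/2.
    Returns (held-out accuracy, one-sided p-value under the null P = Q).
    """
    rng = random.Random(seed)
    # Pair P-samples with label 1 and Q-samples with label 0.
    data = [(x, 1) for x in sample_p] + [(x, 0) for x in sample_q]
    rng.shuffle(data)
    half = len(data) // 2
    train, test = data[:half], data[half:]

    def predict(x):
        # Majority vote among the k nearest training points (k should be odd).
        nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
        votes = sum(label for _, label in nearest)
        return 1 if 2 * votes > k else 0

    correct = sum(predict(x) == label for x, label in test)
    accuracy = correct / len(test)
    # Normal approximation of the null: accuracy ~ N(1/2, 1 / (4 * n_test)).
    z = (accuracy - 0.5) * 2.0 * math.sqrt(len(test))
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))
    return accuracy, p_value
```

A held-out accuracy well above 0.5, with a small p-value, suggests the two samples come from different distributions; accuracy near 0.5 gives no evidence against the null.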

184 citations

Proceedings Article

Self-Censorship on Facebook

Sauvik Das¹, Adam D. I. Kramer²•Institutions (2)

Carnegie Mellon University¹, Facebook²

28 Jun 2013

TL;DR: There is specific evidence supporting the theory that a user’s “perceived audience” lies at the heart of the issue: posts are censored more frequently than comments, with status updates and posts directed at groups censored most frequently of all sharing use cases investigated.

Abstract: We report results from an exploratory analysis examining “last-minute” self-censorship, or content that is filtered after being written, on Facebook. We collected data from 3.9 million users over 17 days and associate self-censorship behavior with features describing users, their social graph, and the interactions between them. Our results indicate that 71% of users exhibited some level of last-minute self-censorship in the time period, and provide specific evidence supporting the theory that a user’s “perceived audience” lies at the heart of the issue: posts are censored more frequently than comments, with status updates and posts directed at groups censored most frequently of all sharing use cases investigated. Furthermore, we find that: people with more boundaries to regulate censor more; males censor more posts than females and censor even more posts with mostly male friends than do females, but censor no more comments than females; people who exercise more control over their audience censor more content; and, users with more politically and age diverse friends censor less, in general.

183 citations

Proceedings Article•DOI

Learning to recognize reliable users and content in social media with coupled mutual reinforcement

Jiang Bian¹, Yandong Liu², Ding Zhou³, Eugene Agichtein², Hongyuan Zha¹•Institutions (3)

Georgia Institute of Technology¹, Emory University², Facebook³

20 Apr 2009

TL;DR: Results of a large-scale evaluation demonstrate that a semi-supervised coupled mutual reinforcement framework for simultaneously calculating content quality and user reputation significantly improves the accuracy of search over CQA archives compared with state-of-the-art methods.

Abstract: Community Question Answering (CQA) has emerged as a popular forum for users to pose questions for other users to answer. Over the last few years, CQA portals such as Naver and Yahoo! Answers have exploded in popularity, and now provide a viable alternative to general purpose Web search. At the same time, the answers to past questions submitted in CQA sites comprise a valuable knowledge repository which could be a gold mine for information retrieval and automatic question answering. Unfortunately, the quality of the submitted questions and answers varies widely - increasingly so that a large fraction of the content is not usable for answering queries. Previous approaches for retrieving relevant and high quality content have been proposed, but they require large amounts of manually labeled data -- which limits the applicability of the supervised approaches to new sites and domains. In this paper we address this problem by developing a semi-supervised coupled mutual reinforcement framework for simultaneously calculating content quality and user reputation, that requires relatively few labeled examples to initialize the training process. Results of a large scale evaluation demonstrate that our methods are more effective than previous approaches for finding high-quality answers, questions, and users. More importantly, our quality estimation significantly improves the accuracy of search over CQA archives over the state-of-the-art methods.

182 citations

Proceedings Article•DOI

Margin-based Parallel Corpus Mining with Multilingual Sentence Embeddings

Mikel Artetxe¹, Holger Schwenk²•Institutions (2)

University of the Basque Country¹, Facebook²

01 Jul 2019

TL;DR: This paper proposes a new method for parallel corpus mining based on multilingual sentence embeddings. Unlike previous approaches, which rely on nearest neighbor retrieval with a hard threshold over cosine similarity, it accounts for the scale inconsistencies of this measure by considering the margin between a given sentence pair and its closest candidates.

Abstract: Machine translation is highly sensitive to the size and quality of the training data, which has led to an increasing interest in collecting and filtering large parallel corpora. In this paper, we propose a new method for this task based on multilingual sentence embeddings. In contrast to previous approaches, which rely on nearest neighbor retrieval with a hard threshold over cosine similarity, our proposed method accounts for the scale inconsistencies of this measure, considering the margin between a given sentence pair and its closest candidates instead. Our experiments show large improvements over existing methods. We outperform the best published results on the BUCC mining task and the UN reconstruction task by more than 10 F1 and 30 precision points, respectively. Filtering the English-German ParaCrawl corpus with our approach, we obtain 31.2 BLEU points on newstest2014, an improvement of more than one point over the best official filtered version.
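
To make the idea of scoring by margin rather than by raw cosine similarity concrete, here is a small sketch. The abstract does not give the scoring formula, so the specific form used below is an assumption: a ratio margin, where the cosine similarity of a pair is divided by the average similarity of each sentence to its k nearest candidates on the other side. The function names, the tiny 2-D "embeddings", and the choice of k are all hypothetical, and the sketch assumes k does not exceed the number of candidates and that the averaged similarities are positive.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def margin_scores(src, tgt, k=2):
    """Ratio-margin score for every (source, target) pair (assumed formulation).

    score(x, y) = cos(x, y) / ((avg_k(x) + avg_k(y)) / 2), where avg_k is the
    mean cosine similarity to the k nearest neighbours on the other side.
    """
    sim = [[cosine(x, y) for y in tgt] for x in src]
    # Mean similarity of each source sentence to its k closest targets.
    src_avg = [sum(sorted(row, reverse=True)[:k]) / k for row in sim]
    # Mean similarity of each target sentence to its k closest sources.
    tgt_avg = [
        sum(sorted((sim[i][j] for i in range(len(src))), reverse=True)[:k]) / k
        for j in range(len(tgt))
    ]
    return [
        [sim[i][j] / ((src_avg[i] + tgt_avg[j]) / 2.0) for j in range(len(tgt))]
        for i in range(len(src))
    ]
```

Under this formulation, a pair scores above 1 only when it is more similar than the typical nearby candidate, which is one way to compensate for sentences whose cosine similarities are uniformly high or low.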

182 citations

Authors

Showing all 7875 results

Name	H-index	Papers	Citations
Yoshua Bengio	202	1033	420313
Xiang Zhang	154	1733	117576
Jitendra Malik	151	493	165087
Trevor Darrell	148	678	181113
Christopher D. Manning	138	499	147595
Robert W. Heath	128	1049	73171
Pieter Abbeel	126	589	70911
Yann LeCun	121	369	171211
Li Fei-Fei	120	420	145574
Jon Kleinberg	117	444	87865
Sergey Levine	115	652	59769
Richard Szeliski	113	359	72019
Sanjeev Kumar	113	1325	54386
Bruce Neal	108	561	87213
Larry S. Davis	107	693	49714

Network Information

Related Institutions (5)

Google

39.8K papers, 2.1M citations

98% related

Microsoft

86.9K papers, 4.1M citations

96% related

Adobe Systems

8K papers, 214.7K citations

94% related

Carnegie Mellon University

104.3K papers, 5.9M citations

38.6K papers, 1.3M citations

90% related

Performance

Metrics

Papers: 10,939

Citations: 851,954

No. of papers from the Institution in previous years
Year	Papers
2024	1
2022	37
2021	1,738
2020	2,017
2019	1,607
2018	1,229