scispace - formally typeset
Search or ask a question
Institution

Facebook

CompanyTel Aviv, Israel
About: Facebook is a company organization based out in Tel Aviv, Israel. It is known for research contribution in the topics: Computer science & Artificial neural network. The organization has 7856 authors who have published 10906 publications receiving 570123 citations. The organization is also known as: facebook.com & FB.


Papers
More filters
Patent
Jenny Yuen1, Luke St. Clair1
05 Dec 2012
TL;DR: In this article, a method for detecting a communication session between a first user and one or more second users is presented. But the method does not consider the social context of the communication session.
Abstract: In one embodiment, a method includes detecting a communication session between a first user and one or more second users. The method also includes determining a social context of the communication session, and determining based at least in part on the social context a set of symbols for communication by the first user in the communication session with the second users. The method further includes providing for display to the first user a set of keys corresponding to the set of symbols. The keys indicate symbols for input by the first user in the communication session.

130 citations

Posted Content
TL;DR: In this article, an ensemble of dynamics models is used to incentivize the agent to explore such that the disagreement of those ensembles is maximized, which results in a sample-efficient exploration.
Abstract: Efficient exploration is a long-standing problem in sensorimotor learning. Major advances have been demonstrated in noise-free, non-stochastic domains such as video games and simulation. However, most of these formulations either get stuck in environments with stochastic dynamics or are too inefficient to be scalable to real robotics setups. In this paper, we propose a formulation for exploration inspired by the work in active learning literature. Specifically, we train an ensemble of dynamics models and incentivize the agent to explore such that the disagreement of those ensembles is maximized. This allows the agent to learn skills by exploring in a self-supervised manner without any external reward. Notably, we further leverage the disagreement objective to optimize the agent's policy in a differentiable manner, without using reinforcement learning, which results in a sample-efficient exploration. We demonstrate the efficacy of this formulation across a variety of benchmark environments including stochastic-Atari, Mujoco and Unity. Finally, we implement our differentiable exploration on a real robot which learns to interact with objects completely from scratch. Project videos and code are at this https URL

130 citations

Patent
David Biderman1
30 Dec 2004
TL;DR: In this paper, an intelligent synchronization tool ensures access to desired content in a manner that automatically keeps the content current on the portable media device by introducing a variation threshold or user-specified degree of content variation among content downloaded to a user's mobile device to prevent the user from becoming bored.
Abstract: An intelligent synchronization tool ensures access to desired content in a manner that automatically keeps the content current on the portable media device. A variation threshold or user-specified degree of content variation may be introduced among content downloaded to a user's mobile device to prevent the user from becoming bored. Furthermore, intelligent synchronization may automatically populate the portable media device with popular content to save a user time and/or use passive monitoring techniques to ascertain a user's preferences for subsequent population.

130 citations

Proceedings ArticleDOI
01 Jul 2017
TL;DR: A detailed image annotation that captures information beyond the visible pixels and requires complex reasoning about full scene structure is proposed, and it is shown that the proposed full scene annotation is surprisingly consistent between annotators, including for regions and edges.
Abstract: Common visual recognition tasks such as classification, object detection, and semantic segmentation are rapidly reaching maturity, and given the recent rate of progress, it is not unreasonable to conjecture that techniques for many of these problems will approach human levels of performance in the next few years. In this paper we look to the future: what is the next frontier in visual recognition? We offer one possible answer to this question. We propose a detailed image annotation that captures information beyond the visible pixels and requires complex reasoning about full scene structure. Specifically, we create an amodal segmentation of each image: the full extent of each region is marked, not just the visible pixels. Annotators outline and name all salient regions in the image and specify a partial depth order. The result is a rich scene structure, including visible and occluded portions of each region, figure-ground edge information, semantic labels, and object overlap. We create two datasets for semantic amodal segmentation. First, we label 500 images in the BSDS dataset with multiple annotators per image, allowing us to study the statistics of human annotations. We show that the proposed full scene annotation is surprisingly consistent between annotators, including for regions and edges. Second, we annotate 5000 images from COCO. This larger dataset allows us to explore a number of algorithmic ideas for amodal segmentation and depth ordering. We introduce novel metrics for these tasks, and along with our strong baselines, define concrete new challenges for the community.

130 citations

Posted Content
TL;DR: This paper proposes a novel method that leverages a generative model to naturally push related samples together, and results in representations that explicitly encode semantics shared between samples, unlike noise contrastive learning.
Abstract: The dominant paradigm for learning video-text representations -- noise contrastive learning -- increases the similarity of the representations of pairs of samples that are known to be related, such as text and video from the same sample, and pushes away the representations of all other pairs. We posit that this last behaviour is too strict, enforcing dissimilar representations even for samples that are semantically-related -- for example, visually similar videos or ones that share the same depicted action. In this paper, we propose a novel method that alleviates this by leveraging a generative model to naturally push these related samples together: each sample's caption must be reconstructed as a weighted combination of other support samples' visual representations. This simple idea ensures that representations are not overly-specialized to individual samples, are reusable across the dataset, and results in representations that explicitly encode semantics shared between samples, unlike noise contrastive learning. Our proposed method outperforms others by a large margin on MSR-VTT, VATEX and ActivityNet, and MSVD for video-to-text and text-to-video retrieval.

129 citations


Authors

Showing all 7875 results

NameH-indexPapersCitations
Yoshua Bengio2021033420313
Xiang Zhang1541733117576
Jitendra Malik151493165087
Trevor Darrell148678181113
Christopher D. Manning138499147595
Robert W. Heath128104973171
Pieter Abbeel12658970911
Yann LeCun121369171211
Li Fei-Fei120420145574
Jon Kleinberg11744487865
Sergey Levine11565259769
Richard Szeliski11335972019
Sanjeev Kumar113132554386
Bruce Neal10856187213
Larry S. Davis10769349714
Network Information
Related Institutions (5)
Google
39.8K papers, 2.1M citations

98% related

Microsoft
86.9K papers, 4.1M citations

96% related

Adobe Systems
8K papers, 214.7K citations

94% related

Carnegie Mellon University
104.3K papers, 5.9M citations

91% related

Performance
Metrics
No. of papers from the Institution in previous years
YearPapers
20241
202237
20211,738
20202,017
20191,607
20181,229