Institution
Company • Tel Aviv, Israel
About: Facebook is a company based in Tel Aviv, Israel. It is known for its research contributions in Computer science and Artificial neural networks. The organization has 7856 authors who have published 10906 publications receiving 570123 citations. The organization is also known as facebook.com and FB.
Topics: Computer science, Artificial neural network, Language model, Context (language use), Reinforcement learning
Papers published on a yearly basis
Papers
06 Jun 2021. TL;DR: This paper proposes the Hidden-Unit BERT (HuBERT) model, which uses a cheap k-means clustering step to provide aligned target labels for pre-training a BERT model.
Abstract: Compared to vision and language applications, self-supervised pre-training approaches for ASR are challenged by three unique problems: (1) There are multiple sound units in each input utterance, (2) With audio-only pre-training, there is no lexicon of sound units, and (3) Sound units have variable lengths with no explicit segmentation. In this paper, we propose the Hidden-Unit BERT (HuBERT) model which utilizes a cheap k-means clustering step to provide aligned target labels for pre-training of a BERT model. A key ingredient of our approach is applying the predictive loss over the masked regions only. This allows the pre-training stage to benefit from the consistency of the unsupervised teacher rather than its intrinsic quality. Starting with a simple k-means teacher with 100 clusters, and using two iterations of clustering, the HuBERT model matches the state-of-the-art wav2vec 2.0 performance on the ultra low-resource Libri-light 10h, 1h, 10min supervised subsets.
106 citations
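The HuBERT abstract above hinges on two steps: discrete pseudo-labels from a cheap k-means teacher, and a predictive loss computed over masked frames only. A minimal NumPy sketch of those two steps, using toy random features and a nearest-centroid stand-in for a converged k-means (all data, sizes, and the 8-cluster count here are illustrative, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "acoustic features": 200 frames of 13-dim MFCC-like vectors (hypothetical data).
frames = rng.normal(size=(200, 13))

# Step 1: a cheap k-means teacher. Here we just assign each frame to its
# nearest centroid; a real system would run k-means to convergence
# (the paper starts from 100 clusters).
centroids = rng.normal(size=(8, 13))
dists = ((frames[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
pseudo_labels = dists.argmin(axis=1)   # one discrete unit per frame

# Step 2: mask a subset of frames; the predictive loss is computed
# over the masked positions only.
mask = rng.random(len(frames)) < 0.5

# Stand-in "model" predictions: logits over the 8 cluster units.
logits = rng.normal(size=(len(frames), 8))
log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

# Cross-entropy restricted to masked frames.
masked_loss = -log_probs[mask, pseudo_labels[mask]].mean()
print(float(masked_loss))
```

In the real model the logits come from a transformer over the masked audio; restricting the loss to masked positions is what lets a noisy teacher still supply a useful training signal.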
22 Nov 2005. TL;DR: In this article, a computer-implemented method for ranking files from an Internet search is provided, which comprises assigning a score to each file based on at least one of the following factors: recency, editorial popularity, clickthru popularity, favorites metadata, or collaborative filtering.
Abstract: A computer-implemented method is provided for ranking files from an Internet search. In one embodiment, the method comprises assigning a score to each file based on at least one of the following factors: recency, editorial popularity, clickthru popularity, favorites metadata, or favorites collaborative filtering. The files may be organized based on the assigned scores to provide users with more accurate search results.
106 citations
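The patent abstract lists ranking factors but not how they are combined. A hypothetical weighted-sum combination (the weights, field names, and example files below are all invented for illustration) could look like:

```python
from dataclasses import dataclass

# Hypothetical weights over the patent's factors; the abstract does not
# specify how the factors are aggregated into one score.
WEIGHTS = {"recency": 0.3, "editorial": 0.2, "clickthru": 0.3, "favorites": 0.2}

@dataclass
class FileRecord:
    name: str
    recency: float     # e.g. 1.0 = newest, 0.0 = oldest
    editorial: float   # editorial popularity, normalized to [0, 1]
    clickthru: float   # clickthru popularity, normalized to [0, 1]
    favorites: float   # favorites signal, normalized to [0, 1]

def score(f: FileRecord) -> float:
    """Weighted sum of the normalized factors."""
    return sum(WEIGHTS[k] * getattr(f, k) for k in WEIGHTS)

files = [
    FileRecord("a.html", recency=0.9, editorial=0.1, clickthru=0.4, favorites=0.2),
    FileRecord("b.html", recency=0.2, editorial=0.8, clickthru=0.9, favorites=0.7),
]
ranked = sorted(files, key=score, reverse=True)
print([f.name for f in ranked])   # → ['b.html', 'a.html']
```

A newer file can still lose to an older one whose popularity signals dominate, which matches the abstract's goal of organizing files by combined score rather than any single factor.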
01 Oct 2019. TL;DR: The nocaps benchmark, as discussed by the authors, is a large-scale benchmark for novel object captioning, consisting of 166,100 human-generated captions describing 15,100 images from the Open Images validation and test sets.
Abstract: Image captioning models have achieved impressive results on datasets containing limited visual concepts and large amounts of paired image-caption training data. However, if these models are to ever function in the wild, a much larger variety of visual concepts must be learned, ideally from less supervision. To encourage the development of image captioning models that can learn visual concepts from alternative data sources, such as object detection datasets, we present the first large-scale benchmark for this task. Dubbed ‘nocaps’, for novel object captioning at scale, our benchmark consists of 166,100 human-generated captions describing 15,100 images from the Open Images validation and test sets. The associated training data consists of COCO image-caption pairs, plus Open Images image-level labels and object bounding boxes. Since Open Images contains many more classes than COCO, nearly 400 object classes seen in test images have no or very few associated training captions (hence, nocaps). We extend existing novel object captioning models to establish strong baselines for this benchmark and provide analysis to guide future work.
105 citations
24 May 2018. TL;DR: A joint multilingual sentence embedding is learned and used to filter noisy parallel data, to mine for parallel data in large news collections, and to identify parallel sentences in comparable corpora on the BUCC shared task.
Abstract: We learn a joint multilingual sentence embedding and use the distance between sentences in different languages to filter noisy parallel data and to mine for parallel data in large news collections. We are able to improve a competitive baseline on the WMT’14 English to German task by 0.3 BLEU by filtering out 25% of the training data. The same approach is used to mine additional bitexts for the WMT’14 system and to obtain competitive results on the BUCC shared task to identify parallel sentences in comparable corpora. The approach is generic, it can be applied to many language pairs and it is independent of the architecture of the machine translation system.
105 citations
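The filtering idea in the abstract above, keeping sentence pairs whose embeddings are close in a shared multilingual space, can be sketched with toy vectors. The embeddings, the 0.8 threshold, and the deliberately misaligned pair are all illustrative; a real system would use a trained multilingual encoder and a tuned (often margin-based) score:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy sentence embeddings in a shared space: 5 source sentences and
# their candidate "translations" (hypothetical vectors, not real encodings).
src = rng.normal(size=(5, 4))
tgt = 2.0 * src        # scaled copies: cosine similarity exactly 1.0
tgt[3] = -src[3]       # one deliberately misaligned pair (cosine -1.0)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Keep only pairs whose similarity clears a threshold (value is illustrative).
keep = [i for i in range(len(src)) if cosine(src[i], tgt[i]) > 0.8]
print(keep)   # → [0, 1, 2, 4]
```

Because the similarity is language-pair-agnostic, the same filter can score English-German pairs for WMT cleaning and mine comparable corpora for BUCC without changes.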
23 Apr 2018. TL;DR: This work designs MyNVM, a key-value store that leverages an NVM block device to reduce DRAM usage and total cost of ownership while providing latency and queries-per-second comparable to MyRocks on a server with a much larger amount of DRAM.
Abstract: Popular SSD-based key-value stores consume a large amount of DRAM in order to provide high-performance database operations. However, DRAM can be expensive for data center providers, especially given recent global supply shortages that have resulted in increasing DRAM costs. In this work, we design a key-value store, MyNVM, which leverages an NVM block device to reduce DRAM usage, and to reduce the total cost of ownership, while providing comparable latency and queries-per-second (QPS) as MyRocks on a server with a much larger amount of DRAM. Replacing DRAM with NVM introduces several challenges. In particular, NVM has limited read bandwidth, and it wears out quickly under a high write bandwidth. We design novel solutions to these challenges, including using small block sizes with a partitioned index, aligning blocks post-compression to reduce read bandwidth, utilizing dictionary compression, implementing an admission control policy for which objects get cached in NVM to control its durability, as well as replacing interrupts with a hybrid polling mechanism. We implemented MyNVM and measured its performance in Facebook's production environment. Our implementation reduces the size of the DRAM cache from 96 GB to 16 GB, and incurs a negligible impact on latency and queries-per-second compared to MyRocks. Finally, to the best of our knowledge, this is the first study on the usage of NVM devices in a commercial data center environment.
105 citations
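One of the challenges the MyNVM abstract mentions is an admission control policy for the NVM cache, since NVM wears out under heavy writes. A simplistic doorkeeper-style sketch of that idea (not the paper's actual policy; the names and the second-access rule are illustrative) admits an object only on its second access, so one-hit objects never consume write endurance:

```python
# Doorkeeper-style admission control sketch: first access only records the
# key; the second access admits the value to the wear-limited NVM cache.
seen: set[str] = set()
nvm_cache: dict[str, bytes] = {}

def access(key: str, value: bytes) -> bytes:
    if key in nvm_cache:      # NVM hit: served without a new write
        return nvm_cache[key]
    if key in seen:           # second access: worth an NVM write, admit it
        nvm_cache[key] = value
    else:                     # first access: remember the key, skip the write
        seen.add(key)
    return value

for k in ["a", "b", "a", "c", "a"]:
    access(k, k.encode())
print(sorted(nvm_cache))   # → ['a']  (only "a" was accessed twice)
```

The same gatekeeping intent, spending scarce NVM write bandwidth only on objects likely to be re-read, is what the paper's admission policy serves, alongside its partitioned index and dictionary compression.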
Authors
Showing all 7875 results
Name | H-index | Papers | Citations |
---|---|---|---|
Yoshua Bengio | 202 | 1033 | 420313 |
Xiang Zhang | 154 | 1733 | 117576 |
Jitendra Malik | 151 | 493 | 165087 |
Trevor Darrell | 148 | 678 | 181113 |
Christopher D. Manning | 138 | 499 | 147595 |
Robert W. Heath | 128 | 1049 | 73171 |
Pieter Abbeel | 126 | 589 | 70911 |
Yann LeCun | 121 | 369 | 171211 |
Li Fei-Fei | 120 | 420 | 145574 |
Jon Kleinberg | 117 | 444 | 87865 |
Sergey Levine | 115 | 652 | 59769 |
Richard Szeliski | 113 | 359 | 72019 |
Sanjeev Kumar | 113 | 1325 | 54386 |
Bruce Neal | 108 | 561 | 87213 |
Larry S. Davis | 107 | 693 | 49714 |