Institution
Company • Tel Aviv, Israel
About: Facebook is a company based in Tel Aviv, Israel. It is known for research contributions in the topics Artificial neural network and Language model. The organization has 7856 authors who have published 10906 publications receiving 570123 citations. The organization is also known as facebook.com and FB.
Topics: Artificial neural network, Language model, Reinforcement learning, Machine translation, Social network
Papers
••
Pacific Northwest National Laboratory1, Lawrence Berkeley National Laboratory2, National Center for Computational Sciences3, Brookhaven National Laboratory4, Argonne National Laboratory5, Intel6, University of Texas at Arlington7, State University of New York System8, Pennsylvania State University9, Oak Ridge National Laboratory10, Washington University in St. Louis11, Wellesley College12, Maria Curie-Skłodowska University13, Iowa State University14, Academy of Sciences of the Czech Republic15, University of Tennessee at Martin16, Université libre de Bruxelles17, Facebook18, Russian Academy of Sciences19, University of Minnesota20, University of Washington21, United States Naval Research Laboratory22, Georgia Institute of Technology23, University of St Andrews24, Universidad Autónoma Metropolitana25, University of California, San Diego26, Saarland University27, Sandia National Laboratories28, University of Illinois at Urbana–Champaign29, University of Iceland30, Australian National University31, Florida Institute of Technology32, University of Science and Technology of China33, Oswaldo Cruz Foundation34, Cardiff University35, Louisiana State University36, Chinese Academy of Sciences37, National Autonomous University of Mexico38, University of Florida39, Los Alamos National Laboratory40, University of Oviedo41, Prince of Songkla University42, Ames Laboratory43, University of Utah44, Northwestern University45, Universal Display Corporation46, Federal University of Pernambuco47, CD-adapco48, Cray49, Massachusetts Institute of Technology50, Nvidia51, University of Tennessee52, Shandong Normal University53, University of Cambridge54, Advanced Micro Devices55, Technische Universität München56, Stanford University57, Wuhan University of Technology58, Stony Brook University59
TL;DR: The NWChem computational chemistry suite is reviewed, including its history, design principles, parallel tools, current capabilities, outreach, and outlook.
Abstract: Specialized computational chemistry packages have permanently reshaped the landscape of chemical and materials science by providing tools to support and guide experimental efforts and for the prediction of atomistic and electronic properties. In this regard, electronic structure packages have played a special role by using first-principle-driven methodologies to model complex chemical and materials processes. Over the past few decades, the rapid development of computing technologies and the tremendous increase in computational power have offered a unique chance to study complex transformations using sophisticated and predictive many-body techniques that describe correlated behavior of electrons in molecular and condensed phase systems at different levels of theory. In enabling these simulations, novel parallel algorithms have been able to take advantage of computational resources to address the polynomial scaling of electronic structure methods. In this paper, we briefly review the NWChem computational chemistry suite, including its history, design principles, parallel tools, current capabilities, outreach, and outlook.
342 citations
••
TL;DR: To improve performance on multi-turn conversations with humans, future systems must go beyond single word metrics like perplexity to measure the performance across sequences of utterances (conversations)—in terms of repetition, consistency and balance of dialogue acts.
Abstract: We describe the setting and results of the ConvAI2 NeurIPS competition that aims to further the state-of-the-art in open-domain chatbots. Some key takeaways from the competition are: (1) pretrained Transformer variants are currently the best performing models on this task, (2) but to improve performance on multi-turn conversations with humans, future systems must go beyond single word metrics like perplexity to measure the performance across sequences of utterances (conversations)—in terms of repetition, consistency and balance of dialogue acts (e.g. how many questions asked vs. answered).
340 citations
•
18 Jul 2021 • TL;DR: GPSA is introduced, a form of positional self-attention which can be equipped with a "soft" convolutional inductive bias and outperforms the DeiT on ImageNet, while offering a much improved sample efficiency.
Abstract: Convolutional architectures have proven extremely successful for vision tasks. Their hard inductive biases enable sample-efficient learning, but come at the cost of a potentially lower performance ceiling. Vision Transformers (ViTs) rely on more flexible self-attention layers, and have recently outperformed CNNs for image classification. However, they require costly pre-training on large external datasets or distillation from pre-trained convolutional networks. In this paper, we ask the following question: is it possible to combine the strengths of these two architectures while avoiding their respective limitations? To this end, we introduce gated positional self-attention (GPSA), a form of positional self-attention which can be equipped with a "soft" convolutional inductive bias. We initialise the GPSA layers to mimic the locality of convolutional layers, then give each attention head the freedom to escape locality by adjusting a gating parameter regulating the attention paid to position versus content information. The resulting convolutional-like ViT architecture, ConViT, outperforms the DeiT on ImageNet, while offering a much improved sample efficiency. We further investigate the role of locality in learning by first quantifying how it is encouraged in vanilla self-attention layers, then analysing how it is escaped in GPSA layers. We conclude by presenting various ablations to better understand the success of the ConViT. Our code and models are released publicly at this https URL.
339 citations
•
17 Jun 2010 • TL;DR: In this article, an information management and distribution system facilitates the controlled exchange of contact information over a network, which can support one or more of creation and design, rolodex, exchange, and update features.
Abstract: An information management and distribution system is disclosed. The information management and distribution system facilitates the controlled exchange of contact information over a network. The system can support one or more of creation and design, rolodex, exchange, and update features. In one embodiment, the information management and distribution system can include a networked server system accessible by remote user devices via the network, and at least one database maintained by the networked server system and storing content information and exchange settings of registered users.
338 citations
•
TL;DR: XLSR is presented which learns cross-lingual speech representations by pretraining a single model from the raw waveform of speech in multiple languages to enable a single multilingual speech recognition model which is competitive to strong individual models.
Abstract: This paper presents XLSR which learns cross-lingual speech representations by pretraining a single model from the raw waveform of speech in multiple languages. We build on wav2vec 2.0 which is trained by solving a contrastive task over masked latent speech representations and jointly learns a quantization of the latents shared across languages. The resulting model is fine-tuned on labeled data and experiments show that cross-lingual pretraining significantly outperforms monolingual pretraining. On the CommonVoice benchmark, XLSR shows a relative phoneme error rate reduction of 72% compared to the best known results. On BABEL, our approach improves word error rate by 16% relative compared to a comparable system. Our approach enables a single multilingual speech recognition model which is competitive to strong individual models. Analysis shows that the latent discrete speech representations are shared across languages with increased sharing for related languages. We hope to catalyze research in low-resource speech understanding by releasing XLSR-53, a large model pretrained in 53 languages.
337 citations
Authors
Showing all 7875 results
Name | H-index | Papers | Citations |
---|---|---|---|
Yoshua Bengio | 202 | 1033 | 420313 |
Xiang Zhang | 154 | 1733 | 117576 |
Jitendra Malik | 151 | 493 | 165087 |
Trevor Darrell | 148 | 678 | 181113 |
Christopher D. Manning | 138 | 499 | 147595 |
Robert W. Heath | 128 | 1049 | 73171 |
Pieter Abbeel | 126 | 589 | 70911 |
Yann LeCun | 121 | 369 | 171211 |
Li Fei-Fei | 120 | 420 | 145574 |
Jon Kleinberg | 117 | 444 | 87865 |
Sergey Levine | 115 | 652 | 59769 |
Richard Szeliski | 113 | 359 | 72019 |
Sanjeev Kumar | 113 | 1325 | 54386 |
Bruce Neal | 108 | 561 | 87213 |
Larry S. Davis | 107 | 693 | 49714 |