Open Access · Proceedings Article

Deep Neural Networks for YouTube Recommendations

Paul Covington, +2 more
pp. 191–198
TLDR
This paper details a deep candidate generation model, then describes a separate deep ranking model, and provides practical lessons and insights derived from designing, iterating on, and maintaining a massive recommendation system with enormous user-facing impact.
Abstract
YouTube represents one of the largest scale and most sophisticated industrial recommendation systems in existence. In this paper, we describe the system at a high level and focus on the dramatic performance improvements brought by deep learning. The paper is split according to the classic two-stage information retrieval dichotomy: first, we detail a deep candidate generation model and then describe a separate deep ranking model. We also provide practical lessons and insights derived from designing, iterating and maintaining a massive recommendation system with enormous user-facing impact.
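The two-stage dichotomy the abstract describes, candidate generation followed by ranking, can be sketched minimally as follows. All names, dimensions, and the dot-product scorer are illustrative stand-ins, not the paper's actual neural architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy corpus: random embeddings for 1,000 hypothetical videos and one user.
n_videos, dim = 1000, 32
video_emb = rng.normal(size=(n_videos, dim))
user_emb = rng.normal(size=dim)

def generate_candidates(user_emb, video_emb, k=100):
    """Stage 1 (candidate generation): retrieve the top-k videos by
    dot-product similarity, narrowing millions of items to a few hundred."""
    scores = video_emb @ user_emb
    return np.argpartition(-scores, k)[:k]

def rank(candidates, user_emb, video_emb):
    """Stage 2 (ranking): re-score only the retrieved candidates with a
    (here trivial) scoring model and sort them for presentation."""
    scores = video_emb[candidates] @ user_emb
    return candidates[np.argsort(-scores)]

candidates = generate_candidates(user_emb, video_emb)
ranked = rank(candidates, user_emb, video_emb)
```

In the real system both stages are learned deep models; the point of the split is that the cheap first stage bounds how many items the expensive second stage must score.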


Citations
Journal Article

Graph Neural Networks in Recommender Systems: A Survey

TL;DR: A comprehensive review of recent research on GNN-based recommender systems, systematically analyzing the challenges of applying GNNs to different types of data and discussing how existing work in the field addresses those challenges.
Posted Content

Behavior Sequence Transformer for E-commerce Recommendation in Alibaba

TL;DR: Proposes using the Transformer model to capture the sequential signals underlying users' behavior sequences for recommendation in Alibaba, obtaining significant improvements in online click-through rate (CTR) over two baselines.
Posted Content

Understanding Training Efficiency of Deep Learning Recommendation Models at Scale

TL;DR: The goal of this paper is to explain the intricacies of using GPUs for training recommendation models, factors affecting hardware efficiency at scale, and learnings from a new scale-up GPU server design, Zion.
Proceedings Article

SimpleX: A Simple and Strong Baseline for Collaborative Filtering

TL;DR: The authors propose the cosine contrastive loss (CCL) and incorporate it into a simple unified collaborative filtering model, dubbed SimpleX, which surpasses most sophisticated state-of-the-art models by a large margin.
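A minimal sketch of a cosine contrastive loss in the spirit of that TL;DR. The function name, margin, and negative weight here are assumptions for illustration, not SimpleX's exact formulation:

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def cosine_contrastive_loss(user, pos_item, neg_items, margin=0.8, neg_weight=1.0):
    """Reward cosine similarity with the positive item; penalize negative
    items only when their similarity to the user exceeds the margin."""
    neg_term = np.mean([max(0.0, cosine(user, n) - margin) for n in neg_items])
    return (1.0 - cosine(user, pos_item)) + neg_weight * neg_term
```

The margin filters out easy negatives so the gradient concentrates on informative (hard) ones, which the paper credits for much of the method's strength.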
Proceedings Article

Sampled Softmax with Random Fourier Features

TL;DR: This paper develops the first theoretical understanding of the role that different sampling distributions play in determining the quality of sampled softmax, and proposes the Random Fourier Softmax method, which yields low bias in estimating both the full softmax distribution and the full softmax gradient.
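For context, a rough sketch of the sampled-softmax setting the paper analyzes. This sketch uses a uniform sampling distribution; the paper's contribution is precisely about better sampling distributions, which are not implemented here, and all sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, dim, n_sampled = 10_000, 16, 64

class_emb = rng.normal(size=(vocab, dim)) / np.sqrt(dim)
hidden = rng.normal(size=dim)
target = 42  # index of the true class

# Sampling distribution q over classes (uniform here for simplicity; the
# paper studies how the choice of q drives bias in the estimate).
q = np.full(vocab, 1.0 / vocab)
negatives = rng.choice(vocab, size=n_sampled, replace=False, p=q)
classes = np.concatenate(([target], negatives))

# Each logit is corrected by -log(n_sampled * q) so the softmax over the
# sampled classes approximates the full softmax loss.
logits = class_emb[classes] @ hidden - np.log(n_sampled * q[classes])
sampled_loss = -logits[0] + np.log(np.exp(logits).sum())
```

The full softmax would sum over all 10,000 classes; sampled softmax sums over 65, trading exactness for speed, and the sampling distribution controls how biased that trade is.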
References
Proceedings Article

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
Proceedings Article

Distributed Representations of Words and Phrases and their Compositionality

TL;DR: This paper presents a simple method for finding phrases in text, and shows that learning good vector representations for millions of phrases is possible and describes a simple alternative to the hierarchical softmax called negative sampling.
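The negative-sampling alternative mentioned in the TL;DR replaces the full softmax with a binary objective per (center, context) pair against k noise words. A minimal sketch with random vectors; k and the vectors are illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
dim = 8
center = rng.normal(size=dim)        # "input" vector of the center word
context = rng.normal(size=dim)       # "output" vector of the true context word
noise = rng.normal(size=(5, dim))    # k = 5 sampled noise-word vectors

# Negative sampling: pull the true (center, context) pair together and
# push the k noise words away, instead of normalizing over the vocabulary.
loss = (-np.log(sigmoid(context @ center))
        - np.log(sigmoid(-(noise @ center))).sum())
```

The cost per training pair is O(k) dot products instead of O(vocabulary), which is what makes training on billions of words practical.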
Posted Content

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

TL;DR: Batch Normalization normalizes layer inputs for each training mini-batch to reduce internal covariate shift in deep neural networks, achieving state-of-the-art performance on ImageNet.
Posted Content

Distributed Representations of Words and Phrases and their Compositionality

TL;DR: The Skip-gram model is used to learn high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships; simple extensions improve both the quality of the vectors and the training speed.