Open Access, Proceedings Article

Deep Neural Networks for YouTube Recommendations

Paul Covington, +2 more
- pp 191-198
TLDR
This paper details a deep candidate generation model, describes a separate deep ranking model, and provides practical lessons and insights derived from designing, iterating on, and maintaining a massive recommendation system with enormous user-facing impact.
Abstract
YouTube represents one of the largest scale and most sophisticated industrial recommendation systems in existence. In this paper, we describe the system at a high level and focus on the dramatic performance improvements brought by deep learning. The paper is split according to the classic two-stage information retrieval dichotomy: first, we detail a deep candidate generation model and then describe a separate deep ranking model. We also provide practical lessons and insights derived from designing, iterating and maintaining a massive recommendation system with enormous user-facing impact.


Citations
Proceedings Article

Contrastive Learning for Sequential Recommendation

TL;DR: A novel multi-task framework called Contrastive Learning for Sequential Recommendation (CL4SRec) is proposed, which not only takes advantage of the traditional next item prediction task but also utilizes the contrastive learning framework to derive self-supervision signals from the original user behavior sequences.
Proceedings Article

Explore, exploit, and explain: personalizing explainable recommendations with bandits

TL;DR: This work provides the first method that combines bandits and explanations in a principled manner and is able to jointly learn which explanations each user responds to; learn the best content to recommend for each user; and balance exploration with exploitation to deal with uncertainty.
Posted Content

Failing Loudly: An Empirical Study of Methods for Detecting Dataset Shift

TL;DR: This paper explores the problem of building ML systems that fail loudly, investigating methods for detecting dataset shift, identifying exemplars that most typify the shift, and quantifying shift malignancy, and demonstrates that domain-discriminating approaches tend to be helpful for characterizing shifts qualitatively and determining if they are harmful.
Proceedings Article

Learning Tree-based Deep Model for Recommender Systems

TL;DR: This paper proposes a novel tree-based method that provides logarithmic complexity w.r.t. corpus size even with more expressive models such as deep neural networks; the tree can be jointly learnt towards better compatibility with users' interest distribution, which facilitates both training and prediction.
Proceedings Article

RecNMP: accelerating personalized recommendation with near-memory processing

TL;DR: RecNMP is a lightweight, commodity-DRAM-compliant near-memory processing solution that accelerates personalized recommendation inference and is specifically tailored to production environments with heavy co-location of operators on a single server.
References
Book Chapter

I and J

Proceedings Article

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
Proceedings Article

Distributed Representations of Words and Phrases and their Compositionality

TL;DR: This paper presents a simple method for finding phrases in text, shows that learning good vector representations for millions of phrases is possible, and describes a simple alternative to the hierarchical softmax called negative sampling.
Posted Content

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

TL;DR: Batch Normalization normalizes layer inputs for each training mini-batch to reduce internal covariate shift in deep neural networks, and achieves state-of-the-art performance on ImageNet.
Posted Content

Distributed Representations of Words and Phrases and their Compositionality

TL;DR: This paper presents several extensions to the Skip-gram model, which learns high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships; the extensions improve both the quality of the vectors and the training speed.