Deep Neural Networks for YouTube Recommendations
Paul Covington, Jay Adams, Emre Sargin +2 more
- pp 191-198
TLDR
This paper details a deep candidate generation model, describes a separate deep ranking model, and provides practical lessons and insights derived from designing, iterating on, and maintaining a massive recommendation system with enormous user-facing impact.
Abstract:
YouTube represents one of the largest scale and most sophisticated industrial recommendation systems in existence. In this paper, we describe the system at a high level and focus on the dramatic performance improvements brought by deep learning. The paper is split according to the classic two-stage information retrieval dichotomy: first, we detail a deep candidate generation model and then describe a separate deep ranking model. We also provide practical lessons and insights derived from designing, iterating and maintaining a massive recommendation system with enormous user-facing impact.
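The two-stage split described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the paper's implementation: the function names, the toy embeddings, and the score table are hypothetical, and a plain dot product and a lookup table stand in for the deep candidate generation and deep ranking models.

```python
import heapq
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def generate_candidates(user_vec: Vector,
                        item_vecs: Dict[str, Vector],
                        n: int) -> List[str]:
    """Stage 1: cut the whole corpus down to n items with a cheap score.

    A dot product between user and item embeddings is a stand-in here
    for the deep candidate generation model.
    """
    return heapq.nlargest(
        n, item_vecs, key=lambda item: dot(user_vec, item_vecs[item]))


def rank(candidates: List[str],
         score: Callable[[str], float],
         k: int) -> List[str]:
    """Stage 2: rescore only the candidates and keep the best k.

    `score` is a stand-in for the separate deep ranking model.
    """
    return sorted(candidates, key=score, reverse=True)[:k]


def recommend(user_vec: Vector,
              item_vecs: Dict[str, Vector],
              score: Callable[[str], float],
              n_candidates: int,
              k: int) -> List[str]:
    """Run candidate generation, then ranking."""
    candidates = generate_candidates(user_vec, item_vecs, n_candidates)
    return rank(candidates, score, k)


# Toy data (hypothetical): four items with 2-d embeddings.
ITEM_VECS = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [0.5, 0.5]}
USER_VEC = [1.0, 0.0]
# Hypothetical ranking scores, as if computed from richer features.
RANK_SCORES = {"a": 0.2, "b": 0.7, "c": 0.9, "d": 0.5}

if __name__ == "__main__":
    print(recommend(USER_VEC, ITEM_VECS, RANK_SCORES.get, n_candidates=3, k=2))
```

In this toy run, item `c` has the highest ranking score but never reaches the second stage because candidate generation filters it out. That shows the trade-off of the design: the expensive scorer only ever sees the shortlist produced by the cheap one.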
Citations
Proceedings Article
Cold Start Similar Artists Ranking with Gravity-Inspired Graph Autoencoders
TL;DR: In this paper, a graph autoencoder architecture is used to learn node embedding representations from a graph, and to automatically rank the top-k most similar neighbors of new artists using a gravity-inspired mechanism.
Proceedings Article
ATBRG: Adaptive Target-Behavior Relational Graph Network for Effective Recommendation
TL;DR: This paper proposes an Adaptive Target-Behavior Relational Graph network (ATBRG) to capture structural relations of target user-item pairs over a knowledge graph (KG).
Journal Article
Memory-aware gated factorization machine for top-N recommendation
TL;DR: This paper proposes a memory-aware gated factorization machine (MAGFM), which improves the FM method by introducing two new components, including an external memory matrix for each user that enriches representation capacity by leveraging the user's historical items and the auxiliary information associated with them.
Proceedings Article
Effects of Past Interactions on User Experience with Recommended Documents
TL;DR: An initial exploration of users' experience with recommended documents, with a focus on how prior interactions influence recognition and interest, suggests that in addition to helping users quickly access documents they intend to re-find, document recommendation can add value in helping users discover other documents.
Journal Article
Disentangled Representation Learning for Recommendation
TL;DR: This paper presents the SEMantic MACRo-mIcro Disentangled Variational Auto-Encoder (SEM-MacridVAE), a model that learns disentangled representations from user behaviors while taking item semantic information into account, and demonstrates significant improvement over state-of-the-art baselines.
References
Proceedings Article
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Sergey Ioffe, Christian Szegedy +1 more
TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
Proceedings Article
Distributed Representations of Words and Phrases and their Compositionality
TL;DR: This paper presents a simple method for finding phrases in text, shows that learning good vector representations for millions of phrases is possible, and describes a simple alternative to the hierarchical softmax called negative sampling.
Posted Content
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Sergey Ioffe, Christian Szegedy +1 more
TL;DR: Batch Normalization normalizes layer inputs for each training mini-batch to reduce internal covariate shift in deep neural networks, and achieves state-of-the-art performance on ImageNet.
Posted Content
Distributed Representations of Words and Phrases and their Compositionality
TL;DR: The Skip-gram model learns high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships; this paper presents several extensions that improve both the quality of the vectors and the training speed.