Deep Neural Networks for YouTube Recommendations
Paul Covington, Jay Adams, Emre Sargin
pp. 191-198
TL;DR: This paper details a deep candidate generation model and then describes a separate deep ranking model and provides practical lessons and insights derived from designing, iterating and maintaining a massive recommendation system with enormous user-facing impact.
Abstract:
YouTube represents one of the largest scale and most sophisticated industrial recommendation systems in existence. In this paper, we describe the system at a high level and focus on the dramatic performance improvements brought by deep learning. The paper is split according to the classic two-stage information retrieval dichotomy: first, we detail a deep candidate generation model and then describe a separate deep ranking model. We also provide practical lessons and insights derived from designing, iterating and maintaining a massive recommendation system with enormous user-facing impact.
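The two-stage split named in the abstract can be illustrated with a minimal sketch. This is not the paper's models: the dot-product retrieval, the `freshness` feature, the item vectors, and every function name below are assumptions made up for illustration. It only shows the shape of the pipeline, in which a cheap first stage narrows a catalogue to a few candidates and a costlier second stage orders only those candidates.

```python
import heapq

def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def generate_candidates(user_vec, item_vecs, n):
    """Stage 1 (hypothetical): cheap retrieval, keep the n items whose
    embedding has the largest dot product with the user vector."""
    return heapq.nlargest(n, item_vecs, key=lambda i: dot(user_vec, item_vecs[i]))

def rank(candidates, score_fn, k):
    """Stage 2 (hypothetical): apply a richer scoring function, but only
    to the small candidate set produced by stage 1."""
    return sorted(candidates, key=score_fn, reverse=True)[:k]

# Toy data, invented for this example.
item_vecs = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [-1.0, 0.0]}
freshness = {"a": 0.1, "b": 0.9, "c": 0.5, "d": 0.7}
user_vec = [1.0, 0.2]

candidates = generate_candidates(user_vec, item_vecs, n=2)
top = rank(candidates, lambda i: dot(user_vec, item_vecs[i]) + freshness[i], k=1)
print(candidates, top)
```

In this toy setup the ranking stage may reorder what retrieval returned, because it sees a feature (`freshness`) that the retrieval stage ignores.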
Citations
Proceedings Article
Probabilistic Metric Learning with Adaptive Margin for Top-K Recommendation
TL;DR: This work develops a distance-based recommendation model with several novel aspects: each user and item is parameterized by a Gaussian distribution to capture learning uncertainty, and an adaptive margin generation scheme produces margins for different training triplets.
Proceedings Article
AutoGroup: Automatic Feature Grouping for Modelling Explicit High-Order Feature Interactions in CTR Prediction
TL;DR: An end-to-end model, AutoGroup, is proposed, which casts the selection of feature interactions as a structural optimization problem and performs both dimensionality reduction and feature selection, which are not seen in previous models.
Journal Article
Social movie recommender system based on deep autoencoder network using Twitter data
TL;DR: A hybrid social recommender system utilizing a deep autoencoder network is introduced, and the proposed approach improves accuracy and effectiveness compared to other state-of-the-art methods.
Proceedings Article
Two-stage Model for Automatic Playlist Continuation at Scale
TL;DR: This paper uses a two-stage model in which the first stage is optimized for fast retrieval and the second stage re-ranks the retrieved candidates to maximize accuracy at the top of the recommended list.
Journal Article
HierTrain: Fast Hierarchical Edge AI Learning With Hybrid Parallelism in Mobile-Edge-Cloud Computing
TL;DR: This paper proposes HierTrain, a hierarchical edge AI learning framework that efficiently deploys the DNN training task over the hierarchical MECC architecture. It also develops a novel hybrid parallelism method, the key to HierTrain, which adaptively assigns the model layers and the data samples across the three levels of edge device, edge server and cloud center.
References
Proceedings Article
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Sergey Ioffe, Christian Szegedy
TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
Proceedings Article
Distributed Representations of Words and Phrases and their Compositionality
TL;DR: This paper presents a simple method for finding phrases in text, shows that learning good vector representations for millions of phrases is possible, and describes a simple alternative to the hierarchical softmax called negative sampling.
Posted Content
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Sergey Ioffe, Christian Szegedy
TL;DR: Batch Normalization normalizes layer inputs for each training mini-batch to reduce internal covariate shift in deep neural networks, and achieves state-of-the-art performance on ImageNet.
Posted Content
Distributed Representations of Words and Phrases and their Compositionality
TL;DR: This paper presents extensions to the Skip-gram model, which learns high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships; the extensions improve both the quality of the vectors and the training speed.