Proceedings ArticleDOI
Deep Crossing: Web-Scale Modeling without Manually Crafted Combinatorial Features
Ying Shan, T. Ryan Hoens, Jian Jiao, Haijing Wang, Dong Yu, Jianchang Mao
pp. 255–262
TL;DR: The paper proposes Deep Crossing, a deep neural network that automatically combines features to produce superior models; two web-scale models for a major paid search engine, built from scratch, achieved superior results with only a subset of the features used in the production models.
Abstract: Manually crafted combinatorial features have been the "secret sauce" behind many successful models. For web-scale applications, however, the variety and volume of features make these manually crafted features expensive to create, maintain, and deploy. This paper proposes the Deep Crossing model, a deep neural network that automatically combines features to produce superior models. The input of Deep Crossing is a set of individual features that can be either dense or sparse. The important crossing features are discovered implicitly by the network, which comprises an embedding and stacking layer followed by a cascade of Residual Units. Deep Crossing is implemented with a modeling tool called the Computational Network Toolkit (CNTK), powered by a multi-GPU platform. It was able to build, from scratch, two web-scale models for a major paid search engine, and achieve superior results with only a subset of the features used in the production models. This demonstrates the potential of using Deep Crossing as a general modeling paradigm to improve existing products, as well as to speed up the development of new models with a fraction of the investment in feature engineering and acquisition of deep domain knowledge.
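The abstract describes the architecture only at a high level. A minimal sketch of the forward pass through an embedding layer, a stacking layer, and one Residual Unit might look like the following (all dimensions, weights, and helper names below are invented for illustration, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
relu = lambda x: np.maximum(x, 0.0)

def embedding_layer(x_sparse, W, b):
    # Single-layer embedding with ReLU, projecting a sparse input to a dense vector.
    return relu(x_sparse @ W + b)

def residual_unit(x, W0, b0, W1, b1):
    # Two fully connected layers; the input is added back before the final ReLU,
    # so the unit learns a residual on top of the identity mapping.
    h = relu(x @ W0 + b0)
    return relu(h @ W1 + b1 + x)

# Toy dimensions (illustrative only).
sparse_dim, embed_dim, dense_dim = 1000, 8, 4
x_sparse = np.zeros(sparse_dim)
x_sparse[17] = 1.0                        # one-hot sparse feature
x_dense = rng.normal(size=dense_dim)      # dense feature, used as-is

W_e = rng.normal(scale=0.1, size=(sparse_dim, embed_dim))
b_e = np.zeros(embed_dim)

# Stacking layer: concatenate the embedded sparse feature with the dense features.
stack = np.concatenate([embedding_layer(x_sparse, W_e, b_e), x_dense])

d = stack.size
W0, b0 = rng.normal(scale=0.1, size=(d, d)), np.zeros(d)
W1, b1 = rng.normal(scale=0.1, size=(d, d)), np.zeros(d)

out = residual_unit(stack, W0, b0, W1, b1)
print(out.shape)  # (12,)
```

Because the Residual Unit preserves the input dimensionality, such units can be cascaded to arbitrary depth, which is how the paper's "cascade of Residual Units" composes.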
Citations
Proceedings ArticleDOI
Deep Interest Network for Click-Through Rate Prediction
Guorui Zhou, Xiaoqiang Zhu, Chenru Song, Ying Fan, Han Zhu, Xiao Ma, Yanghui Yan, Junqi Jin, Han Li, Kun Gai
TL;DR: A novel model, the Deep Interest Network (DIN), is proposed which tackles the challenge of capturing diverse user interests by designing a local activation unit that adaptively learns the representation of user interests from historical behaviors with respect to a certain ad.
Journal ArticleDOI
Deep Learning Based Recommender System: A Survey and New Perspectives
TL;DR: This paper provides a comprehensive review of recent research efforts on deep learning-based recommender systems, along with a summary of the state of the art.
Proceedings ArticleDOI
Neural Factorization Machines for Sparse Predictive Analytics
Xiangnan He, Tat-Seng Chua
TL;DR: Neural Factorization Machines (NFM), as discussed by the authors, combine the linearity of FM in modelling second-order feature interactions with the non-linearity of neural networks in modelling higher-order feature interactions; FM can be seen as a special case of NFM without hidden layers.
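The second-order FM interaction that NFM builds on can be computed in linear time via the square-of-sums identity. A small numpy sketch (variable names are illustrative, not from the paper) with a brute-force check:

```python
import numpy as np

rng = np.random.default_rng(1)

def fm_second_order(x, V):
    """Pairwise term sum_{i<j} <v_i, v_j> x_i x_j via the square-of-sums
    identity, in O(n*k) time instead of the naive O(n^2 * k)."""
    xv = x[:, None] * V                              # (n, k): each row v_i scaled by x_i
    return 0.5 * np.sum(np.sum(xv, axis=0) ** 2 - np.sum(xv ** 2, axis=0))

n, k = 5, 3
x = rng.normal(size=n)        # feature values
V = rng.normal(size=(n, k))   # per-feature embedding vectors

# Brute-force enumeration of all pairs, to confirm the identity holds.
brute = sum(V[i] @ V[j] * x[i] * x[j] for i in range(n) for j in range(i + 1, n))
print(np.isclose(fm_second_order(x, V), brute))  # True
```

NFM's Bilinear Interaction pooling keeps the per-dimension vector (dropping the final sum over k) and feeds it into hidden layers, which is why FM falls out as the no-hidden-layer special case.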
Proceedings ArticleDOI
KGAT: Knowledge Graph Attention Network for Recommendation
TL;DR: Wang et al., as mentioned in this paper, propose the Knowledge Graph Attention Network (KGAT), which explicitly models the high-order connectivities in a knowledge graph in an end-to-end fashion and significantly outperforms state-of-the-art methods such as Neural FM and RippleNet.
References
Proceedings ArticleDOI
Deep Residual Learning for Image Recognition
TL;DR: In this article, the authors propose a residual learning framework to ease the training of networks that are substantially deeper than those used previously; their approach won 1st place in the ILSVRC 2015 classification task.
Proceedings Article
ImageNet Classification with Deep Convolutional Neural Networks
TL;DR: A deep convolutional neural network, consisting of five convolutional layers (some followed by max-pooling layers) and three fully connected layers with a final 1000-way softmax, achieved state-of-the-art performance on ImageNet classification, as discussed by the authors.
Journal ArticleDOI
Gradient-based learning applied to document recognition
Yann LeCun, Léon Bottou, Yoshua Bengio, Patrick Haffner
TL;DR: In this article, a graph transformer network (GTN) is proposed for handwritten character recognition; it can synthesize a complex decision surface that classifies high-dimensional patterns such as handwritten characters.
Proceedings Article
Distributed Representations of Words and Phrases and their Compositionality
TL;DR: This paper presents a simple method for finding phrases in text, shows that learning good vector representations for millions of phrases is possible, and describes a simple alternative to the hierarchical softmax called negative sampling.
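The phrase-finding method summarized above scores adjacent word pairs by their co-occurrence count, discounted so that rare pairs are not over-promoted. A minimal sketch (the function name, example text, and threshold choice are illustrative, not from the paper):

```python
from collections import Counter

def phrase_scores(tokens, delta=1.0):
    """Score each adjacent pair as (count(w1 w2) - delta) / (count(w1) * count(w2)).
    Pairs scoring above a chosen threshold can be merged into single phrase tokens;
    delta discounts pairs made of very infrequent words."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return {pair: (c - delta) / (unigrams[pair[0]] * unigrams[pair[1]])
            for pair, c in bigrams.items()}

tokens = "i love new york and new york loves me".split()
scores = phrase_scores(tokens, delta=1.0)
best = max(scores, key=scores.get)
print(best)  # ('new', 'york')
```

In the paper this scoring is applied in several passes with a decreasing threshold, so that longer phrases are built up from already-merged shorter ones.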
Proceedings ArticleDOI
Object recognition from local scale-invariant features
TL;DR: Experimental results show that robust object recognition can be achieved in cluttered, partially occluded images with a computation time of under 2 seconds.