scispace - formally typeset
Proceedings ArticleDOI

Deep Crossing: Web-Scale Modeling without Manually Crafted Combinatorial Features

Reads0
Chats0
TLDR
The Deep Crossing model is proposed which is a deep neural network that automatically combines features to produce superior models and was able to build, from scratch, two web-scale models for a major paid search engine, and achieve superior results with only a sub-set of the features used in the production models.
Abstract
Manually crafted combinatorial features have been the "secret sauce" behind many successful models. For web-scale applications, however, the variety and volume of features make these manually crafted features expensive to create, maintain, and deploy. This paper proposes the Deep Crossing model which is a deep neural network that automatically combines features to produce superior models. The input of Deep Crossing is a set of individual features that can be either dense or sparse. The important crossing features are discovered implicitly by the networks, which are comprised of an embedding and stacking layer, as well as a cascade of Residual Units. Deep Crossing is implemented with a modeling tool called the Computational Network Tool Kit (CNTK), powered by a multi-GPU platform. It was able to build, from scratch, two web-scale models for a major paid search engine, and achieve superior results with only a sub-set of the features used in the production models. This demonstrates the potential of using Deep Crossing as a general modeling paradigm to improve existing products, as well as to speed up the development of new models with a fraction of the investment in feature engineering and acquisition of deep domain knowledge.

read more

Citations
More filters
Proceedings ArticleDOI

Deep Interest Network for Click-Through Rate Prediction

TL;DR: A novel model: Deep Interest Network (DIN) is proposed which tackles this challenge by designing a local activation unit to adaptively learn the representation of user interests from historical behaviors with respect to a certain ad.
Journal ArticleDOI

Deep Learning Based Recommender System: A Survey and New Perspectives

TL;DR: A comprehensive review of recent research efforts on deep learning-based recommender systems is provided in this paper, along with a comprehensive summary of the state-of-the-art.
Proceedings ArticleDOI

Neural Factorization Machines for Sparse Predictive Analytics

TL;DR: Neural Factorization Machines (NFM) as discussed by the authors is a special case of NFM without hidden layers, which combines the linearity of FM in modelling second-order feature interactions and the non-linearity of neural network in modelling higher-order features.
Proceedings ArticleDOI

KGAT: Knowledge Graph Attention Network for Recommendation

TL;DR: Wang et al. as mentioned in this paper proposed a knowledge graph attention network (KGAT) which explicitly models the high-order connectivities in KG in an end-to-end fashion.
Proceedings ArticleDOI

KGAT: Knowledge Graph Attention Network for Recommendation

TL;DR: This work proposes a new method named Knowledge Graph Attention Network (KGAT), which explicitly models the high-order connectivities in KG in an end-to-end fashion and significantly outperforms state-of-the-art methods like Neural FM and RippleNet.
References
More filters
Proceedings ArticleDOI

Deep Residual Learning for Image Recognition

TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won the 1st place on the ILSVRC 2015 classification task.
Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

TL;DR: The state-of-the-art performance of CNNs was achieved by Deep Convolutional Neural Networks (DCNNs) as discussed by the authors, which consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax.
Journal ArticleDOI

Gradient-based learning applied to document recognition

TL;DR: In this article, a graph transformer network (GTN) is proposed for handwritten character recognition, which can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters.
Proceedings Article

Distributed Representations of Words and Phrases and their Compositionality

TL;DR: This paper presents a simple method for finding phrases in text, and shows that learning good vector representations for millions of phrases is possible and describes a simple alternative to the hierarchical softmax called negative sampling.
Proceedings ArticleDOI

Object recognition from local scale-invariant features

TL;DR: Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.
Related Papers (5)