AutoInt: Automatic Feature Interaction Learning via Self-Attentive Neural Networks
Weiping Song, Chence Shi, Zhiping Xiao, Zhijian Duan, Yewen Xu, Ming Zhang, Jian Tang
pp. 1161–1170
TLDR
AutoInt is proposed as an effective and efficient method to automatically learn high-order interactions of input features, mapping both numerical and categorical features into the same low-dimensional space.

Abstract:
Click-through rate (CTR) prediction, which aims to predict the probability of a user clicking on an ad or an item, is critical to many online applications such as online advertising and recommender systems. The problem is very challenging since (1) the input features (e.g., the user id, user age, item id, item category) are usually sparse and high-dimensional, and (2) an effective prediction relies on high-order combinatorial features (a.k.a. cross features), which are very time-consuming for domain experts to hand-craft and are impossible to enumerate. Therefore, there have been efforts to find low-dimensional representations of the sparse and high-dimensional raw features and their meaningful combinations. In this paper, we propose an effective and efficient method called AutoInt to automatically learn the high-order feature interactions of input features. Our proposed algorithm is very general and can be applied to both numerical and categorical input features. Specifically, we map both the numerical and categorical features into the same low-dimensional space. Afterwards, a multi-head self-attentive neural network with residual connections is proposed to explicitly model the feature interactions in the low-dimensional space. With different layers of the multi-head self-attentive neural networks, different orders of feature combinations of input features can be modeled. The whole model can be efficiently fit on large-scale raw data in an end-to-end fashion. Experimental results on four real-world datasets show that our proposed approach not only outperforms existing state-of-the-art approaches for prediction but also offers good explainability. Code is available at: https://github.com/DeepGraphLearning/RecommenderSystems.
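The abstract describes two steps: embed every field into a shared low-dimensional space, then apply multi-head self-attention with a residual connection to model interactions. Below is a minimal numpy sketch of one such "interacting" layer. All sizes, the random stand-in weights, and the example fields are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (illustrative assumptions): M = 4 feature fields,
# d = 8 embedding size, H = 2 attention heads.
M, d, H = 4, 8, 2
d_h = d // H  # per-head dimension

# Step 1: map every field -- categorical or numerical -- into the same
# d-dimensional space. A categorical field looks up an embedding row;
# a numerical field scales a learned vector by its value.
emb_table = rng.normal(size=(10, d))   # hypothetical vocab of 10 ids
num_vec = rng.normal(size=d)           # embedding vector for a numeric field

x = np.stack([
    emb_table[3],      # e.g. user id
    emb_table[7],      # e.g. item category
    emb_table[1],      # e.g. item id
    0.5 * num_vec,     # e.g. normalized user age = 0.5
])                     # shape (M, d)

# Step 2: one multi-head self-attention layer with a residual
# connection (weights are random stand-ins for learned parameters).
W_q, W_k, W_v = (rng.normal(size=(H, d, d_h)) for _ in range(3))
W_res = rng.normal(size=(d, H * d_h))

heads = []
for h in range(H):
    Q, K, V = x @ W_q[h], x @ W_k[h], x @ W_v[h]
    z = Q @ K.T                                # pairwise field similarities
    z -= z.max(axis=1, keepdims=True)          # numerical stability
    att = np.exp(z)
    att /= att.sum(axis=1, keepdims=True)      # row-wise softmax
    heads.append(att @ V)                      # (M, d_h) interaction output

# ReLU over (concatenated heads + projected residual), as in the paper.
out = np.maximum(np.concatenate(heads, axis=1) + x @ W_res, 0)
print(out.shape)  # (4, 8): each field now encodes 2nd-order interactions
```

Stacking more such layers lets the representation of each field mix with already-mixed representations, which is how higher-order combinations arise.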
Citations
Posted Content
TabNet: Attentive Interpretable Tabular Learning
Sercan O. Arik, Tomas Pfister
TL;DR: It is demonstrated that TabNet outperforms other neural network and decision tree variants on a wide range of non-performance-saturated tabular datasets and yields interpretable feature attributions plus insights into the global model behavior.
Proceedings ArticleDOI
S^3-Rec: Self-Supervised Learning for Sequential Recommendation with Mutual Information Maximization
Kun Zhou, Hui Wang, Wayne Xin Zhao, Yutao Zhu, Sirui Wang, Fuzheng Zhang, Zhongyuan Wang, Ji-Rong Wen
TL;DR: This work proposes the model S3-Rec, which stands for Self-Supervised learning for Sequential Recommendation, based on the self-attentive neural architecture, to utilize the intrinsic data correlation to derive self-supervision signals and enhance the data representations via pre-training methods for improving sequential recommendation.
Proceedings ArticleDOI
DCN V2: Improved Deep & Cross Network and Practical Lessons for Web-scale Learning to Rank Systems
TL;DR: This work proposes an improved framework DCN-V2, which is simple, can be easily adopted as building blocks, and has delivered significant offline accuracy and online business metrics gains across many web-scale learning to rank systems at Google.
Posted Content
RecBole: Towards a Unified, Comprehensive and Efficient Framework for Recommendation Algorithms
Wayne Xin Zhao, Shanlei Mu, Yupeng Hou, Zihan Lin, Kaiyuan Li, Yushuo Chen, Yujie Lu, Hui Wang, Changxin Tian, Xingyu Pan, Yingqian Min, Zhichao Feng, Xinyan Fan, Xu Chen, Pengfei Wang, Wendi Ji, Yaliang Li, Xiaoling Wang, Ji-Rong Wen
TL;DR: A unified, comprehensive and efficient recommender system library called RecBole (pronounced as [rEk'boUl@r]) is presented, which provides a unified framework to develop and reproduce recommendation algorithms for research purposes, along with a series of auxiliary functions, tools, and scripts to facilitate the use of the library.
References
Proceedings ArticleDOI
Deep Residual Learning for Image Recognition
TL;DR: In this article, the authors propose a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won 1st place on the ILSVRC 2015 classification task.
Proceedings Article
Adam: A Method for Stochastic Optimization
Diederik P. Kingma, Jimmy Ba
TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
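The "adaptive estimates of lower-order moments" in this TL;DR refers to running estimates of the gradient's mean and uncentered variance, with bias correction for their zero initialization. A hedged sketch of the update rule, applied to a toy quadratic objective (the objective, step count, and learning rate are illustrative choices, not from the paper):

```python
import numpy as np

def adam(grad_fn, theta, steps=200, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """Sketch of the Adam update; grad_fn returns the gradient at theta."""
    m = np.zeros_like(theta)  # first-moment estimate (mean of gradients)
    v = np.zeros_like(theta)  # second-moment estimate (uncentered variance)
    for t in range(1, steps + 1):
        g = grad_fn(theta)
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g * g
        m_hat = m / (1 - b1 ** t)   # bias correction for zero initialization
        v_hat = v / (1 - b2 ** t)
        theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta

# Minimize f(x) = ||x - 3||^2, whose gradient is 2 * (x - 3).
x = adam(lambda th: 2 * (th - 3.0), np.array([0.0, 10.0]))
print(np.round(x, 2))  # both coordinates end up close to 3
```

Note that the effective step size is roughly bounded by `lr` regardless of the gradient's scale, which is one reason Adam needs little tuning across problems.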
Proceedings Article
Attention is All you Need
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin
TL;DR: This paper proposes a simple network architecture based solely on an attention mechanism, dispensing with recurrence and convolutions entirely, and achieves state-of-the-art performance on English-to-French translation.
Journal Article
Dropout: a simple way to prevent neural networks from overfitting
TL;DR: It is shown that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.
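Dropout randomly zeroes units during training so the network cannot rely on any single co-adapted feature. A common formulation is "inverted" dropout, sketched below, where survivors are rescaled at training time so that inference needs no adjustment (the drop probability and array shapes are illustrative):

```python
import numpy as np

def dropout(x, p=0.5, training=True, rng=np.random.default_rng(0)):
    """Inverted dropout: zero each unit with probability p during training,
    scale survivors by 1/(1-p) so the expected activation is unchanged."""
    if not training or p == 0.0:
        return x  # dropout is a no-op at test time
    mask = rng.random(x.shape) >= p   # keep each unit with probability 1-p
    return x * mask / (1.0 - p)

a = np.ones((2, 4))
train_out = dropout(a, p=0.5)                   # entries are 0.0 or 2.0
test_out = dropout(a, p=0.5, training=False)    # identical to the input
print(np.allclose(test_out, a))  # True
```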
Posted Content
Neural Machine Translation by Jointly Learning to Align and Translate
TL;DR: In this paper, the authors propose a model that learns to (soft-)search for the parts of a source sentence relevant to predicting a target word, without having to form these parts as a hard segment explicitly.