Proceedings ArticleDOI

Probabilistic latent semantic user segmentation for behavioral targeted advertising

TL;DR: This work proposes a novel user segmentation algorithm named Probabilistic Latent Semantic User Segmentation (PLSUS), which adopts the probabilistic latent semantic analysis to mine the relationship between users and their behaviors so as to segment users in a semantic manner.
Abstract: Behavioral Targeting (BT), which aims to deliver the most appropriate advertisements to the most appropriate users, is attracting much attention in the online advertising market. A key challenge of BT is how to automatically segment users for ad delivery, and good user segmentation may significantly improve the ad click-through rate (CTR). Unlike classical user segmentation strategies, which rarely take the semantics of user behaviors into consideration, we propose in this paper a novel user segmentation algorithm named Probabilistic Latent Semantic User Segmentation (PLSUS). PLSUS adopts probabilistic latent semantic analysis to mine the relationship between users and their behaviors so as to segment users in a semantic manner. We perform experiments on the real-world ad click-through log of a commercial search engine. Compared with two classical clustering algorithms, K-Means and CLUTO, PLSUS can further improve the ad CTR by up to 100%. To the best of our knowledge, this work is an early semantic user segmentation study for BT in academia.
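To make the approach concrete, here is a minimal NumPy sketch of PLSA-style user segmentation: fit a latent-topic mixture over user-behavior co-occurrence counts with EM, then assign each user to the topic that best explains its counts. The count matrix, segment count, and iteration budget are hypothetical illustrations, not the paper's actual data or settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy user-behavior count matrix: rows are users, columns are behaviors
# (e.g. clicked query categories). The data and sizes are hypothetical.
X = np.array([
    [5, 3, 0, 0],
    [4, 4, 1, 0],
    [0, 1, 6, 4],
    [0, 0, 5, 5],
], dtype=float)
n_users, n_behaviors = X.shape
K = 2  # number of latent segments

# Random initialisation of P(z), P(u|z), P(b|z)
p_z = np.full(K, 1.0 / K)
p_u_z = rng.random((n_users, K)); p_u_z /= p_u_z.sum(axis=0)
p_b_z = rng.random((n_behaviors, K)); p_b_z /= p_b_z.sum(axis=0)

for _ in range(100):  # EM iterations
    # E-step: P(z|u,b) proportional to P(z) P(u|z) P(b|z)
    post = p_z[None, None, :] * p_u_z[:, None, :] * p_b_z[None, :, :]
    post /= post.sum(axis=2, keepdims=True) + 1e-12
    # M-step: re-estimate parameters from expected counts
    expected = X[:, :, None] * post
    p_u_z = expected.sum(axis=1); p_u_z /= p_u_z.sum(axis=0) + 1e-12
    p_b_z = expected.sum(axis=0); p_b_z /= p_b_z.sum(axis=0) + 1e-12
    p_z = expected.sum(axis=(0, 1)); p_z /= p_z.sum()

# Segment each user by the latent topic that explains it best
segments = p_u_z.argmax(axis=1)
print(segments)
```

On this block-structured toy matrix the first two users end up in one segment and the last two in the other, which is the "semantic" grouping the paper aims for.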
Citations
Proceedings ArticleDOI
11 Aug 2013
TL;DR: An empirical analysis and measurement of a production ad exchange is provided, observing that periodic patterns occur in various statistics including impressions, clicks, bids, and conversion rates, which suggest time-dependent models would be appropriate for capturing the repeated patterns in RTB.
Abstract: Real-time bidding (RTB), aka programmatic buying, has recently become the fastest growing area in online advertising. Instead of bulk buying and inventory-centric buying, RTB mimics stock exchanges and utilises computer algorithms to automatically buy and sell ads in real-time; it uses per-impression context and targets ads to specific people based on data about them, and hence dramatically increases the effectiveness of display advertising. In this paper, we provide an empirical analysis and measurement of a production ad exchange. Using data sampled from both the demand and supply side, we aim to provide first-hand insights into the emerging impression-selling infrastructure and its bidding behaviours, and to help identify research and design issues in such systems. From our study, we observed that periodic patterns occur in various statistics including impressions, clicks, bids, and conversion rates (both post-view and post-click), which suggests that time-dependent models would be appropriate for capturing the repeated patterns in RTB. We also found that, despite the claimed second-price auction, first-price payments in fact account for 55.4% of the total cost due to the arrangement of the soft floor price. As such, we argue that the setting of the soft floor price in current RTB systems puts advertisers in a less favourable position. Furthermore, our analysis of the conversion rates shows that the current bidding strategy is far from optimal, indicating a significant need for optimisation algorithms that incorporate factors such as temporal behaviours and the frequency and recency of ad displays, which have not been well considered in the past.
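The soft-floor arrangement the paper criticises can be illustrated with a simplified clearing-price rule. This is a sketch of one commonly described mechanism (a winning bid under the soft floor pays first price; above it, second price applies, bounded below by the soft floor), not necessarily the studied exchange's exact logic.

```python
def clearing_price(bids, hard_floor, soft_floor):
    """Clearing price in a simplified RTB auction with floors.

    Assumed mechanics (a common arrangement, not necessarily the
    studied exchange's exact rules): bids below the hard floor are
    rejected; if the winning bid is below the soft floor, the winner
    pays first price (its own bid); otherwise it pays second price,
    bounded below by the soft floor.
    """
    eligible = sorted(b for b in bids if b >= hard_floor)
    if not eligible:
        return None  # no sale
    win = eligible[-1]
    if win < soft_floor:
        return win  # first-price region: winner pays its own bid
    runner_up = eligible[-2] if len(eligible) > 1 else hard_floor
    return max(runner_up, min(soft_floor, win))

print(clearing_price([1.0, 2.0], 0.5, 3.0))  # 2.0  (first-price region)
print(clearing_price([1.0, 5.0], 0.5, 3.0))  # 3.0  (soft floor binds)
print(clearing_price([4.0, 5.0], 0.5, 3.0))  # 4.0  (plain second price)
```

The first case shows why a high soft floor pushes many transactions into first-price territory, which is how the paper's 55.4% first-price share can arise.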

230 citations


Cites methods from "Probabilistic latent semantic user ..."

  • ...Throughout the paper we used advertiser/bidder/buyer/demand side interchangeably to comply with the industrial tradition....


Posted Content
TL;DR: A comprehensive survey on Internet advertising is provided, discussing and classifying the research issues, identifying the recent technologies, and suggesting its future directions, to discover and aggregate the fundamental problems that characterise the newly-formed research field and capture its potential future prospects.
Abstract: Internet advertising is a fast-growing business which has proved to be significantly important in the digital economy. It is vitally important for both web search engines and online content providers and publishers, because web advertising provides them with major sources of revenue. Its presence is increasingly important for the whole media industry due to the influence of the Web. For advertisers, it is a smarter alternative to traditional marketing media such as TV and newspapers. As the web evolves and data collection continues, the design of methods for more targeted, interactive, and friendly advertising may have a major impact on the way our digital economy evolves and may aid societal development. Towards this goal, mathematically well-grounded Computational Advertising methods are becoming necessary and will continue to develop as a fundamental tool for the Web. As a vibrant new discipline, Internet advertising requires effort from different research domains including Information Retrieval, Machine Learning, Data Mining and Analytics, Statistics, Economics, and even Psychology to predict and understand user behaviours. In this paper, we provide a comprehensive survey on Internet advertising, discussing and classifying the research issues, identifying recent technologies, and suggesting future directions. To paint a comprehensive picture, we first start with a brief history, introduction, and classification of the industry and present a schematic view of the new advertising ecosystem. We then introduce the four major participants, namely advertisers, online publishers, ad exchanges and web users; by analysing and discussing the major research problems and existing solutions from their respective perspectives, we discover and aggregate the fundamental problems that characterise the newly-formed research field and capture its potential future prospects.

67 citations


Cites background or methods from "Probabilistic latent semantic user ..."

  • ...User segmentation Wu et al. (2009) Probabilistic latent semantic analysis 1 day’s commercial search engine ad data...


  • ...Users can then be clustered into profile groups, which can be targeted specifically by advertisers (Wu et al., 2009), for instance, demographics such as gender and age or interests such as sports or fashion....


Proceedings ArticleDOI
16 Apr 2012
TL;DR: This work proposes a novel approach for conquering the sparseness of behavior pattern space and makes it possible to discover similar mobile users with respect to their habits by leveraging behavior pattern mining.
Abstract: Discovering similar users with respect to their habits plays an important role in a wide range of applications, such as collaborative filtering for recommendation and user segmentation for market analysis. Recently, the growing ability of smart mobile devices to sense user contexts has made it possible to discover mobile users with similar habits by mining those habits from their devices. However, although some researchers have proposed effective methods for mining user habits, such as behavior pattern mining, how to leverage the mined results for discovering similar users remains less explored. To this end, we propose a novel approach for conquering the sparseness of the behavior pattern space, which makes it possible to discover similar mobile users with respect to their habits by leveraging behavior pattern mining. Specifically, we first normalize the raw context log of each user by transforming the location-based context data and user interaction records into more general representations. Second, we take advantage of a constraint-based Bayesian Matrix Factorization model to extract the latent common habits among behavior patterns and then transform behavior pattern vectors into vectors of mined common habits, which lie in a much denser space. Experiments conducted on real data sets show that our approach outperforms three baselines in terms of the effectiveness of discovering similar mobile users with respect to their habits.

58 citations


Cites methods from "Probabilistic latent semantic user ..."

  • ...segmentation for market analysis [27, 21], etc....


  • ...[21] modeled user habits as latent variables and extracted them from online behaviors through the probability mixture model....


BookDOI
01 Jan 2012
TL;DR: This tutorial chapter describes basic MLP and SVM concepts, under the CRISP-DM methodology, and shows how such learning tools can be applied to real-world classification and regression DM applications.
Abstract: Multilayer perceptrons (MLPs) and support vector machines (SVMs) are flexible machine learning techniques that can fit complex nonlinear mappings. MLPs are the most popular neural network type, consisting of a feedforward network of processing neurons that are grouped into layers and connected by weighted links. The SVM, on the other hand, transforms the input variables into a high-dimensional feature space and then finds the best hyperplane that models the data in that feature space. Both MLPs and SVMs are gaining increased attention within the data mining (DM) field and are particularly useful when simpler DM models fail to provide satisfactory predictive models. This tutorial chapter describes basic MLP and SVM concepts under the CRISP-DM methodology and shows how such learning tools can be applied to real-world classification and regression DM applications.

52 citations


Cites background from "Probabilistic latent semantic user ..."

  • ...Here, the target variables are the emissions that the actors wish to reduce and the required financial means act as additional environmental items [36, 39, 40, 41, 47, 59, 60, 61, 62]....


Book ChapterDOI
01 Jan 2012
TL;DR: This chapter surveys NMF in terms of the model formulation and its variations and extensions, algorithms and applications, as well as its relations with K-means and Probabilistic Latent Semantic Indexing (PLSI).
Abstract: In recent years, Nonnegative Matrix Factorization (NMF) has become a popular model in the data mining community. NMF aims to extract hidden patterns from a series of high-dimensional vectors automatically, and has been applied successfully to dimensionality reduction, unsupervised learning (clustering, semi-supervised clustering, co-clustering, etc.) and prediction. This chapter surveys NMF in terms of its model formulation, its variations and extensions, algorithms and applications, as well as its relations with K-means and Probabilistic Latent Semantic Indexing (PLSI). In summary, we draw the following conclusions: 1) NMF has good interpretability due to its nonnegativity constraints; 2) NMF is very flexible regarding the choice of its objective functions and the algorithms employed to solve it; 3) NMF has a variety of applications; 4) NMF has a solid theoretical foundation and a close relationship with existing state-of-the-art unsupervised learning models. However, as a new and developing technology, many interesting open issues remain unsolved and await research from theoretical and algorithmic perspectives.
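A minimal sketch of NMF itself, using the standard Lee-Seung multiplicative updates for the Frobenius objective on hypothetical data, shows the nonnegativity the chapter credits for interpretability: every update multiplies by a nonnegative ratio, so the factors never go negative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical nonnegative data matrix (e.g. documents x terms)
V = rng.random((6, 8))
K = 3  # number of hidden patterns to extract

# Lee-Seung multiplicative updates for the objective ||V - WH||_F^2;
# nonnegativity is preserved because each step is a nonnegative rescaling
W = rng.random((6, K))
H = rng.random((K, 8))
eps = 1e-12  # avoids division by zero
for _ in range(500):
    H *= (W.T @ V) / (W.T @ W @ H + eps)
    W *= (V @ H.T) / (W @ H @ H.T + eps)

rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(round(rel_err, 3))  # WH is a low-rank nonnegative approximation of V
```

The same factorisation underlies the chapter's connections to K-means and PLSI: with a KL-divergence objective instead of Frobenius, NMF becomes essentially equivalent to PLSI.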

33 citations

References
Journal ArticleDOI
TL;DR: This work proposes a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model.
Abstract: We describe latent Dirichlet allocation (LDA), a generative probabilistic model for collections of discrete data such as text corpora. LDA is a three-level hierarchical Bayesian model, in which each item of a collection is modeled as a finite mixture over an underlying set of topics. Each topic is, in turn, modeled as an infinite mixture over an underlying set of topic probabilities. In the context of text modeling, the topic probabilities provide an explicit representation of a document. We present efficient approximate inference techniques based on variational methods and an EM algorithm for empirical Bayes parameter estimation. We report results in document modeling, text classification, and collaborative filtering, comparing to a mixture of unigrams model and the probabilistic LSI model.
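A minimal sketch of LDA in practice, assuming scikit-learn's LatentDirichletAllocation and a hypothetical toy document-term matrix, illustrates the per-document topic mixtures the abstract describes:

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

# Hypothetical document-term count matrix with two evident topics
X = np.array([
    [4, 3, 0, 0],
    [3, 4, 1, 0],
    [0, 0, 4, 3],
    [0, 1, 3, 4],
])

lda = LatentDirichletAllocation(n_components=2, random_state=0, max_iter=50)
doc_topic = lda.fit_transform(X)  # each row is a document's topic mixture

print(np.round(doc_topic.sum(axis=1), 6))  # rows sum to 1: finite mixtures
labels = doc_topic.argmax(axis=1)
print(labels)
```

Each document is represented by its mixture proportions over topics, which is exactly the "explicit representation of a document" the abstract refers to.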

30,570 citations

Proceedings Article
03 Jan 2001
TL;DR: This paper proposed a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model, also known as probabilistic latent semantic indexing (pLSI).
Abstract: We propose a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams [6], and Hofmann's aspect model, also known as probabilistic latent semantic indexing (pLSI) [3]. In the context of text modeling, our model posits that each document is generated as a mixture of topics, where the continuous-valued mixture proportions are distributed as a latent Dirichlet random variable. Inference and learning are carried out efficiently via variational algorithms. We present empirical results on applications of this model to problems in text modeling, collaborative filtering, and text classification.

25,546 citations

Journal ArticleDOI
TL;DR: A new method for automatic indexing and retrieval to take advantage of implicit higher-order structure in the association of terms with documents (“semantic structure”) in order to improve the detection of relevant documents on the basis of terms found in queries.
Abstract: A new method for automatic indexing and retrieval is described. The approach is to take advantage of implicit higher-order structure in the association of terms with documents (“semantic structure”) in order to improve the detection of relevant documents on the basis of terms found in queries. The particular technique used is singular-value decomposition, in which a large term-by-document matrix is decomposed into a set of ca. 100 orthogonal factors from which the original matrix can be approximated by linear combination. Documents are represented by ca. 100-item vectors of factor weights. Queries are represented as pseudo-document vectors formed from weighted combinations of terms, and documents with supra-threshold cosine values are returned. Initial tests find this completely automatic method for retrieval to be promising.
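The procedure described above reduces to a truncated SVD plus query folding. A minimal NumPy sketch on a hypothetical (and tiny, rather than ca. 100-factor) term-document matrix:

```python
import numpy as np

# Hypothetical term-document count matrix (terms x documents)
A = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 1],
    [0, 0, 3, 1],
    [0, 1, 2, 2],
], dtype=float)

# LSI: keep only the k largest singular factors of the SVD
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]  # rank-k approximation of A

# Fold a query into the latent space as a pseudo-document:
# q_hat = q^T U_k S_k^{-1}
q = np.array([1, 1, 0, 0], dtype=float)  # query uses terms 0 and 1
q_hat = q @ U[:, :k] / s[:k]
docs = Vt[:k, :].T  # document coordinates in the latent space

# Rank documents by cosine similarity to the folded-in query
cos = docs @ q_hat / (np.linalg.norm(docs, axis=1) * np.linalg.norm(q_hat))
ranking = cos.argsort()[::-1]
print(ranking)
```

Documents whose cosine similarity exceeds a chosen threshold would be returned, matching the retrieval rule in the abstract.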

12,443 citations

Journal ArticleDOI
TL;DR: This paper summarizes the insights gained in automatic term weighting, and provides baseline single term indexing models with which other more elaborate content analysis procedures can be compared.
Abstract: The experimental evidence accumulated over the past 20 years indicates that text-indexing systems based on the assignment of appropriately weighted single terms produce retrieval results that are superior to those obtainable with other more elaborate text representations. These results depend crucially on the choice of effective term-weighting systems. This paper summarizes the insights gained in automatic term weighting and provides baseline single-term indexing models with which other more elaborate content analysis procedures can be compared.
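A minimal sketch of the classic single-term weighting this line of work established, tf-idf, on a hypothetical toy corpus (this is the plain form; many smoothed and length-normalised variants exist):

```python
import math

# Tiny hypothetical corpus; tokens are just whitespace-split words
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "the cats and the dogs",
]
tokenized = [d.split() for d in docs]
N = len(tokenized)

def tf_idf(term, doc):
    """Classic single-term weight: term frequency times inverse
    document frequency (the plain variant, without smoothing)."""
    tf = doc.count(term) / len(doc)          # within-document frequency
    df = sum(term in d for d in tokenized)   # documents containing the term
    idf = math.log(N / df)                   # rarer terms weigh more
    return tf * idf

weights = {t: round(tf_idf(t, tokenized[0]), 3) for t in tokenized[0]}
print(weights)  # "the" occurs in every document, so its weight is 0.0
```

The zero weight for "the" illustrates the core insight: terms that occur everywhere carry no discriminating power, however frequent they are.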

9,460 citations

Journal ArticleDOI
01 Aug 1999
TL;DR: Probabilistic Latent Semantic Indexing is a novel approach to automated document indexing which is based on a statistical latent class model for factor analysis of count data.
Abstract: Probabilistic Latent Semantic Indexing is a novel approach to automated document indexing which is based on a statistical latent class model for factor analysis of count data. Fitted from a training corpus of text documents by a generalization of the Expectation Maximization algorithm, the utilized model is able to deal with domain-specific synonymy as well as with polysemous words. In contrast to standard Latent Semantic Indexing (LSI) by Singular Value Decomposition, the probabilistic variant has a solid statistical foundation and defines a proper generative data model. Retrieval experiments on a number of test collections indicate substantial performance gains over direct term matching methods as well as over LSI. In particular, the combination of models with different dimensionalities has proven to be advantageous.

4,577 citations