Proceedings Article

Predicting demographic attributes from web usage: Purpose and methodologies

TL;DR: This paper presents a detailed review of the preprocessing, prediction and recommendation techniques used to predict users' demographic attributes through web usage mining.
Abstract: Web usage mining is a way of identifying and analyzing how users interact with a web site. The browsing data is collected as a web log, which is first preprocessed; data mining methods such as classification, clustering and association rule mining are then applied to find interesting patterns. Users' demographic information plays an important role in designing business strategies, advertising and similar activities. Previous research describes some techniques to predict these demographic attributes (age, gender, etc.). This paper presents a detailed review of various preprocessing, prediction and recommendation techniques.
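The preprocessing stage mentioned in the abstract can be illustrated with a minimal sketch. It assumes the web log is in Common Log Format and that a visit ends after 30 minutes of inactivity; the log format, the timeout, the rule that drops error responses, and the names `parse_line` and `sessionize` are all illustrative assumptions, not details taken from the paper.

```python
import re
from datetime import datetime, timedelta

# Assumed Common Log Format: ip ident user [timestamp] "METHOD url PROTO" status size
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) \S+'
)

def parse_line(line):
    """Return (ip, timestamp, url, status), or None if the line is malformed."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    ts = datetime.strptime(match.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
    return match.group("ip"), ts, match.group("url"), int(match.group("status"))

def sessionize(lines, timeout=timedelta(minutes=30)):
    """Group successful requests per IP into sessions split by an inactivity timeout."""
    hits_by_ip = {}
    for line in lines:
        record = parse_line(line)
        if record is None or record[3] >= 400:
            continue  # cleaning step: skip malformed lines and error responses
        ip, ts, url, _ = record
        hits_by_ip.setdefault(ip, []).append((ts, url))

    sessions = []
    for ip, hits in hits_by_ip.items():
        hits.sort()  # order each visitor's requests by time
        current = [hits[0][1]]
        for (prev_ts, _), (ts, url) in zip(hits, hits[1:]):
            if ts - prev_ts > timeout:
                sessions.append((ip, current))  # gap too long: close the session
                current = []
            current.append(url)
        sessions.append((ip, current))
    return sessions

# Hypothetical log lines, made up for illustration.
SAMPLE_LOG = [
    '1.2.3.4 - - [10/Oct/2000:13:55:36 -0700] "GET /a.html HTTP/1.0" 200 2326',
    '1.2.3.4 - - [10/Oct/2000:13:58:00 -0700] "GET /b.html HTTP/1.0" 200 100',
    '1.2.3.4 - - [10/Oct/2000:15:00:00 -0700] "GET /c.html HTTP/1.0" 200 100',
    '5.6.7.8 - - [10/Oct/2000:13:56:00 -0700] "GET /missing HTTP/1.0" 404 0',
]
SESSIONS = sessionize(SAMPLE_LOG)
```

The resulting per-session page lists are the kind of input that classification, clustering or association rule mining could then operate on. Identifying users by IP address alone is a simplification.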
Citations
Proceedings Article
11 Jul 2018
TL;DR: This study investigates individual mobility patterns and their relationship with socio-demographics, and extracts travel features from the raw smart card data, including spatial, temporal and travel mode features, which capture the travel variability of travellers.
Abstract: With the wide application of smart card technology in public transit systems, travellers' daily travel behaviour can now be observed. This study investigates individual mobility patterns and their relationship with socio-demographics. We first extract travel features from the raw smart card data, including spatial, temporal and travel mode features, which capture the travel variability of travellers. Then, the travel features are fed to various supervised machine learning models to predict individuals' demographic attributes, such as age group, gender, income level and car ownership. Finally, a case study based on London's Oyster Card data is presented, and the results show that mobility behaviour offers a promising basis for demographic study.

12 citations

Book Chapter
01 Jan 2020
TL;DR: A survey of different methodologies and parameters used in analyzing the behavior of a user through Clickstream data is presented, outlining the methods used so far for clustering the users based on mining their interests.
Abstract: Data stream mining has emerged as one of the most prominent research areas, with applications in network sensors, stock exchanges, meteorological research and e-commerce. In stream mining, data is generated continuously and in large amounts, and is dynamic, non-stationary, unstoppable and unbounded in nature. One kind of streaming data, generated by users' browsing behavior, is Clickstream data. Analyzing users' online behavior on e-commerce Web sites helps in drawing conclusions and making specific recommendations, both for the users and for the electronic commerce companies, which can improve their marketing strategies and increase transaction rates, leading to higher revenue. This paper presents a survey of the different methodologies and parameters used in analyzing the behavior of a user through Clickstream data. In more depth, it also outlines the methods used so far for clustering users based on mining their interests.

5 citations

Journal Article
TL;DR: In this article, the authors measured the relative importance of hotel website features based on users' perceptions and analyzed the impact of gender, age, and frequency of Internet access on the given importance of features.
Abstract: Purpose – This study measures the relative importance of hotel website features based on users’ perceptions and analyses the impact of gender, age, and frequency of Internet access on the importance given to each feature. Our study includes ten features and three hypotheses. Design/methodology/approach – A research questionnaire was developed and distributed to hotel guests. A total of 406 responses were collected. Statistical analysis included paired t-tests and one-way ANOVA. Findings – Results showed that users prioritized information about products and services, bookings and reservations, an easy-to-use website, and contact information. Privacy, design, and information on the surroundings were also important features. Customer feedback options, corporate information, and links to social media sites were ranked as significantly less important. Moreover, age and frequency of Internet access have a significant impact on the perceived importance of features, while no differences were found with regard to gender. Originality – Many studies have used web performance tools to measure the performance of hotel websites. However, these studies have not reported guests’ preferences or the perceived importance of website features. To our knowledge, no previous research has examined the effect of gender, age, and frequency of Internet access on the perceived importance of hotel website features.

5 citations

Proceedings Article
26 Apr 2019
TL;DR: This study proposes to use a convolutional neural network for automatic feature extraction and demographic prediction, including age group, gender, income level and car ownership, using smart card data and household survey to infer demographics of passengers.
Abstract: This study investigates the possibility of inferring the demographics of passengers using smart card data (SCD) and a household survey. We first represent SCD as a two-dimensional image to capture travel patterns. Then, we propose to use a convolutional neural network for automatic feature extraction and demographic prediction, including age group, gender, income level and car ownership. The household survey data is used to train the deep learning model. Finally, a case study using London’s Oyster Card and survey data is presented, and the results show that mobility behaviour offers a promising basis for demographic study.

3 citations


Cites background from "Predicting demographic attributes f..."

  • ...However, most literature focuses on the demographic prediction based on user’s activities in the virtual internet world (Hu et al., 2007; Saste et al., 2017), the discriminative power of mobility in the physical world has received much less attention....

    [...]

Journal Article
TL;DR: Knowledge obtained from web usage mining reveals hidden, important information about user behaviour, which can be used to facilitate more effective browsing, to enhance web design, to understand page surfing patterns and for various other purposes.
Abstract: Finding valuable knowledge in web data is known as web mining. With the development of internet technology, the growth of the World Wide Web has exceeded all expectations. This rapid growth has strongly affected both visitors and web site owners. Retrieving different information in different formats has become a very difficult task. One promising approach to this problem is web usage mining (WUM). Web mining that extracts patterns from user weblogs is known as web usage mining, which is an application of data mining. The goal of web usage mining is to understand the behaviour of web site users by mining web access data. The success of web usage mining depends on efficiently extracting knowledge from large amounts of raw log data. Knowledge obtained from web usage mining reveals hidden, important information about user behaviour, which can be used to facilitate more effective browsing, to enhance web design, to understand page surfing patterns and for various other purposes. In this paper, we provide a detailed review of work done on the different phases of web usage mining.

1 citation

References
Journal Article
TL;DR: This article introduces the modules that comprise a Web personalization system, emphasizing the Web usage mining module, and presents a review of the most common methods that are used as well as technical issues that occur.
Abstract: Web personalization is the process of customizing a Web site to the needs of specific users, taking advantage of the knowledge acquired from the analysis of the user's navigational behavior (usage data) in correlation with other information collected in the Web context, namely, structure, content, and user profile data. Due to the explosive growth of the Web, the domain of Web personalization has gained great momentum both in the research and commercial areas. In this article we present a survey of the use of Web mining for Web personalization. More specifically, we introduce the modules that comprise a Web personalization system, emphasizing the Web usage mining module. A review of the most common methods that are used as well as technical issues that occur is given, along with a brief overview of the most popular tools and applications available from software vendors. Moreover, the most important research initiatives in the Web usage mining and personalization areas are presented.

941 citations

Journal Article
TL;DR: This paper is a survey of recent work in the field of web usage mining for the benefit of research on the personalization of Web-based information services, focusing on the problems identified and the solutions that have been proposed.
Abstract: This paper is a survey of recent work in the field of web usage mining for the benefit of research on the personalization of Web-based information services. The essence of personalization is the adaptability of information systems to the needs of their users. This issue is becoming increasingly important on the Web, as non-expert users are overwhelmed by the quantity of information available online, while commercial Web sites strive to add value to their services in order to create loyal relationships with their visitors-customers. This article views Web personalization through the prism of personalization policies adopted by Web sites and implementing a variety of functions. In this context, the area of Web usage mining is a valuable source of ideas and methods for the implementation of personalization functionality. We therefore present a survey of the most recent work in the field of Web usage mining, focusing on the problems that have been identified and the solutions that have been proposed.

426 citations

Journal Article
TL;DR: Analysis indicates that perception and satisfaction differences exist between the cultural clusters and gender groups within those cultures --- Asia, Europe, Latin & South America, and North America and that females within certain cultures have widely different preferences regarding web site attributes.
Abstract: The growth of electronic commerce, in particular business-to-consumer, has been explosive during the last few years. Until recently, the Web community has been a male dominated western-oriented society, with the design of Web sites reflecting that homogenous audience. Using an adapted version of Hofstede's dimensions as a means of differentiation, this study explores the perception and satisfaction levels of one hundred and sixty subjects on four web sites. Analysis indicates that perception and satisfaction differences exist between the cultural clusters and gender groups within those cultures --- Asia, Europe, Latin & South America, and North America. In particular, the perceptions of the Asian and Latin/South American were found to be similar, as were the perceptions of the Europeans and North Americans. Qualitative analysis indicates that females within certain cultures have widely different preferences from their male counterparts regarding web site attributes.

386 citations

Proceedings Article
Jian Hu, Hua-Jun Zeng, Hua Li, Cheng Niu, Zheng Chen
08 May 2007
TL;DR: This paper makes a first attempt to predict users' gender and age from their Web browsing behaviors, in which the Webpage view information is treated as a hidden variable to propagate demographic information between different users.
Abstract: Demographic information plays an important role in personalized web applications. However, it is usually not easy to obtain personal data such as age and gender. In this paper, we make a first attempt to predict users' gender and age from their Web browsing behaviors, in which the Webpage view information is treated as a hidden variable to propagate demographic information between different users. There are three main steps in our approach. First, learning from the Webpage click-through data, Webpages are associated with users' (known) age and gender tendency through a discriminative model. Second, users' (unknown) age and gender are predicted from the demographic information of the associated Webpages through a Bayesian framework. Third, based on the fact that Webpages visited by similar users may be associated with similar demographic tendency, and users with similar demographic information would visit similar Webpages, a smoothing component is employed to overcome the data sparseness of the web click-through log. Experiments are conducted on a real web click-through log to demonstrate the effectiveness of the proposed approach. The experimental results show that the proposed algorithm can achieve improvements of up to 30.4% on gender prediction and 50.3% on age prediction in terms of macro F1, compared to baseline algorithms.

291 citations
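
The second step in the abstract above, predicting a user's unknown demographics from the tendencies of the pages they visited, can be sketched as follows. This is a simplified naive-Bayes-style combination that assumes page views are conditionally independent given the label; it is not the authors' exact model, and the function name `predict_user`, the page names and all probabilities are hypothetical.

```python
import math

def predict_user(visited_pages, page_probs, prior):
    """Combine per-page label probabilities into a posterior over labels.

    Assumes P(label | pages) is proportional to
    P(label) * product over pages of P(label | page) / P(label),
    i.e. page views are treated as independent evidence.
    All probabilities must be strictly positive.
    """
    scores = {}
    for label, p_label in prior.items():
        score = math.log(p_label)
        for page in visited_pages:
            probs = page_probs.get(page)
            if probs is None:
                continue  # page with no learned tendency contributes nothing
            score += math.log(probs[label]) - math.log(p_label)
        scores[label] = score

    # Normalize in a numerically stable way (softmax over log scores).
    top = max(scores.values())
    weights = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}

# Hypothetical page-level tendencies, as step one might produce them.
PAGE_PROBS = {
    "sports": {"male": 0.7, "female": 0.3},
    "fashion": {"male": 0.2, "female": 0.8},
}
PRIOR = {"male": 0.5, "female": 0.5}
POSTERIOR = predict_user(["sports", "sports", "fashion"], PAGE_PROBS, PRIOR)
```

The smoothing component of the third step is not modelled here; in this sketch a page without a learned tendency is simply skipped, which is exactly the sparseness problem that step is meant to address.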


"Predicting demographic attributes f..." refers background or methods in this paper

  • ...Do Viet and Tu Minh Phuong [19] suggested an improvement over the approach of Jian Hu, Hua-Jun Zeng et al. [18]....

    [...]

  • ...A similar content-based approach is used by Jian Hu, Hua-Jun Zeng et al. in their paper [18]....

    [...]

Journal Article
TL;DR: It is concluded that predicting a user’s personality profile can be applied to personalize content, optimize search results, and improve online advertising.
Abstract: Individual differences in personality affect users' online activities as much as they do in the offline world. This work, based on a sample of over a third of a million users, examines how users' behaviour in the online environment, captured by their website choices and Facebook profile features, relates to their personality, as measured by the standard Five Factor Model personality questionnaire. Results show that there are psychologically meaningful links between users' personalities, their website preferences and Facebook profile features. We show how website audiences differ in terms of their personality, present the relationships between personality and Facebook profile features, and show how an individual's personality can be predicted from Facebook profile features. We conclude that predicting a user's personality profile can be applied to personalize content, optimize search results, and improve online advertising.

211 citations