
Showing papers by "Nello Cristianini" published in 2013


Posted Content
TL;DR: This work proposes KernelUCB, a kernelised UCB algorithm, and gives a cumulative regret bound through a frequentist analysis; the analysis improves the regret bound of GP-UCB for the agnostic case, both in terms of the kernel-dependent quantity and the RKHS norm of the reward function.
Abstract: We tackle the problem of online reward maximisation over a large finite set of actions described by their contexts. We focus on the case when the number of actions is too big to sample all of them even once. However, we assume that we have access to the similarities between actions' contexts and that the expected reward is an arbitrary linear function of the contexts' images in the related reproducing kernel Hilbert space (RKHS). We propose KernelUCB, a kernelised UCB algorithm, and give a cumulative regret bound through a frequentist analysis. For contextual bandits, the related algorithm GP-UCB turns out to be a special case of our algorithm, and our finite-time analysis improves the regret bound of GP-UCB for the agnostic case, both in terms of the kernel-dependent quantity and the RKHS norm of the reward function. Moreover, for the linear kernel, our regret bound matches the lower bound for contextual linear bandits.
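
A minimal sketch of a kernelised UCB scoring rule of the kind described above (an illustration, not the authors' code): each candidate context receives a kernel ridge regression estimate of its reward plus an exploration width derived from the Gram matrix. The RBF kernel and the values of `eta` and `gamma` below are assumptions.

```python
# Kernelised UCB scores: regression estimate + exploration width.
import numpy as np

def rbf(a, b, width=1.0):
    """RBF kernel between the rows of a and the rows of b."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return np.exp(-d ** 2 / (2 * width ** 2))

def kernel_ucb_scores(X_hist, y_hist, X_cand, eta=1.0, gamma=1.0):
    """Upper-confidence scores for candidate contexts given past observations."""
    K_inv = np.linalg.inv(rbf(X_hist, X_hist) + gamma * np.eye(len(X_hist)))
    k_cand = rbf(X_cand, X_hist)                     # similarity to past contexts
    mean = k_cand @ K_inv @ y_hist                   # kernel ridge regression estimate
    var = rbf(X_cand, X_cand).diagonal() - np.einsum(
        "ij,jk,ik->i", k_cand, K_inv, k_cand)        # width of the confidence interval
    return mean + eta * np.sqrt(np.maximum(var, 0.0) / gamma)

# Each round: score all candidate contexts, play the argmax, and append the
# observed (context, reward) pair to X_hist / y_hist.
rng = np.random.default_rng(0)
X_hist, y_hist = rng.normal(size=(5, 3)), rng.normal(size=5)
print(kernel_ucb_scores(X_hist, y_hist, rng.normal(size=(4, 3))))
```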

159 citations


Journal ArticleDOI
TL;DR: This paper describes an approach that incorporates text-analysis technologies for the automation of some of these tasks, enabling us to analyse data sets that are many orders of magnitude larger than those normally used.
Abstract: News content analysis is usually preceded by a labour-intensive coding phase, where experts extract key information from news items. The cost of this phase imposes limitations on the sample sizes that can be processed, and therefore on the kinds of questions that can be addressed. In this paper we describe an approach that incorporates text-analysis technologies for the automation of some of these tasks, enabling us to analyse data sets that are many orders of magnitude larger than those normally used. The patterns detected by our method include: (1) similarities in writing style among several outlets, which reflect reader demographics; (2) gender imbalance in media content and its relation with topic; (3) the relationship between topic and popularity of articles.
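
As a rough illustration of how one of the coded quantities might be automated, the toy function below approximates gender balance by counting gendered pronouns per article; the word lists and the ratio are assumptions made here for illustration, not the paper's actual pipeline.

```python
# Crude proxy for gender imbalance: share of gendered pronouns that are male.
MALE = {"he", "him", "his"}
FEMALE = {"she", "her", "hers"}

def male_share(text):
    """Fraction of gendered pronouns that are male (0.5 = balanced, None = none found)."""
    tokens = [t.strip('.,;:!?"') for t in text.lower().split()]
    m = sum(t in MALE for t in tokens)
    f = sum(t in FEMALE for t in tokens)
    return m / (m + f) if (m + f) else None

print(male_share("She said that he and his colleague agreed with her."))  # 0.5
```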

87 citations


Proceedings Article
11 Aug 2013
TL;DR: In this paper, the authors propose KernelUCB, a kernelized UCB algorithm, and give a cumulative regret bound through a frequentist analysis for the case when the number of actions is too big to sample all of them even once.
Abstract: We tackle the problem of online reward maximisation over a large finite set of actions described by their contexts. We focus on the case when the number of actions is too big to sample all of them even once. However, we assume that we have access to the similarities between actions' contexts and that the expected reward is an arbitrary linear function of the contexts' images in the related reproducing kernel Hilbert space (RKHS). We propose KernelUCB, a kernelised UCB algorithm, and give a cumulative regret bound through a frequentist analysis. For contextual bandits, the related algorithm GP-UCB turns out to be a special case of our algorithm, and our finite-time analysis improves the regret bound of GP-UCB for the agnostic case, both in terms of the kernel-dependent quantity and the RKHS norm of the reward function. Moreover, for the linear kernel, our regret bound matches the lower bound for contextual linear bandits.

44 citations


Posted Content
TL;DR: This work presents a method for computing mood scores from text using affective word taxonomies and applies it to millions of tweets collected in the United Kingdom during the seasons of summer and winter, resulting in the detection of strong and statistically significant circadian patterns for all the investigated mood types.
Abstract: Social Media offer a vast amount of geo-located and time-stamped textual content directly generated by people. This information can be analysed to obtain insights about the general state of a large population of users and to address scientific questions from a diversity of disciplines. In this work, we estimate temporal patterns of mood variation through the use of emotionally loaded words contained in Twitter messages, possibly reflecting underlying circadian and seasonal rhythms in the mood of the users. We present a method for computing mood scores from text using affective word taxonomies, and apply it to millions of tweets collected in the United Kingdom during the seasons of summer and winter. Our analysis results in the detection of strong and statistically significant circadian patterns for all the investigated mood types. Seasonal variation does not seem to register any important divergence in the signals, but a periodic oscillation within a 24-hour period is identified for each mood type. The main common characteristic of all emotions is their mid-morning peak; however, their mood score patterns differ in the evenings.
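
A minimal sketch of lexicon-based mood scoring of the kind described above, assuming a taxonomy that maps each mood to a set of affective words (WordNet-Affect-style lists); the toy lexicon and the per-tweet normalisation are illustrative choices, not necessarily the authors' exact formula.

```python
# Hourly mood scores from a word-list taxonomy applied to tweets.
from collections import defaultdict

def mood_scores(tweets_by_hour, taxonomy):
    """Mean per-tweet fraction of affective words, for each mood and hour of day."""
    scores = defaultdict(dict)
    for hour, tweets in tweets_by_hour.items():
        for mood, words in taxonomy.items():
            fractions = []
            for text in tweets:
                tokens = text.lower().split()
                fractions.append(sum(t in words for t in tokens) / max(len(tokens), 1))
            scores[mood][hour] = sum(fractions) / max(len(fractions), 1)
    return scores

taxonomy = {"joy": {"happy", "glad", "delighted"},
            "sadness": {"sad", "gloomy", "tired"}}
tweets_by_hour = {9: ["so happy this morning", "glad it is sunny"],
                  23: ["tired and a bit sad tonight"]}
print(mood_scores(tweets_by_hour, taxonomy))
```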

29 citations


Journal ArticleDOI
TL;DR: It is shown that this method can predict which articles will become popular and can extract the keywords that most affect the appeal function, which also enables us to compare different outlets from the point of view of their readers’ preference patterns.
Abstract: We explore the problem of learning to predict the popularity of an article in online news media. By "popular" we mean an article that was among the "most read" articles of a given day in the news outlet that published it. We show that this cannot be modelled simply as the binary classification task of separating popular from unpopular articles, which would assume that popularity is an absolute property. Instead, we propose to view popularity from the perspective of a competitive situation where the popular articles are those which were the most appealing on that particular day. This leads to the notion of an "appeal" function, which we model as a linear function in the bag-of-words representation. The parameters of this linear function are learnt from a training set formed by pairs of documents, one of which was popular and the other of which appeared on the same page and date without becoming popular. To learn the appeal function we use Ranking Support Vector Machines, using data collected from six different outlets over a period of 1 year. We show that our method can predict which articles will become popular, as well as extract the keywords that most affect the appeal function. This also enables us to compare different outlets from the point of view of their readers' preference patterns. Remarkably, this is achieved using very limited information, namely the textual content of the title and description of each article, the page and date of publication, and whether it became popular.
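
A small sketch of the pairwise-ranking reduction, assuming scikit-learn: since a linear appeal function w should score the popular article of each pair above its non-popular companion, a linear SVM trained on difference vectors recovers ranking weights. The example pairs and parameters below are stand-ins, not data from the study.

```python
# Pairwise ranking via a linear SVM on bag-of-words difference vectors.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC

# (popular title, non-popular title from the same page and date)
pairs = [("markets rally as rates fall", "council meeting minutes published"),
         ("celebrity wedding surprises fans", "quarterly filing deadline nears")]

vec = CountVectorizer().fit([t for pair in pairs for t in pair])
diffs = np.vstack([vec.transform([p]).toarray()[0] - vec.transform([u]).toarray()[0]
                   for p, u in pairs])

# Symmetrise so that w . (x_popular - x_unpopular) > 0 for every pair.
X = np.vstack([diffs, -diffs])
y = np.array([1] * len(diffs) + [-1] * len(diffs))
w = LinearSVC(C=1.0, fit_intercept=False).fit(X, y).coef_.ravel()

# Words that most increase the learnt appeal score.
print([vec.get_feature_names_out()[i] for i in np.argsort(w)[-5:]])
```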

28 citations


Journal Article
TL;DR: This work presents a methodology for large scale quantitative narrative analysis of text data, which includes various recent ideas from text mining and pattern analysis in order to solve a problem arising in digital humanities and social sciences.
Abstract: We present a methodology for large scale quantitative narrative analysis (QNA) of text data, which includes various recent ideas from text mining and pattern analysis in order to solve a problem arising in digital humanities and social sciences. The key idea is to automatically transform the corpus into a network, by extracting the key actors and objects of the narration, linking them to form a network, and then analyzing this network to extract information about those actors. These actors can be characterized by: studying their position in the overall network of actors and actions; generating scatter plots describing the subject/object bias of each actor; and investigating the types of actions each actor is most associated with. The software pipeline is demonstrated on text obtained from three story books from the Gutenberg Project. Our analysis reveals that our approach correctly identifies the most central actors in a given narrative. We also find that the hero of a narrative always has the highest degree in the network. Heroes are most often the subjects of actions, but not the ones with the highest subject bias score. Our methodology is very scalable, and addresses specific research needs that are currently very labour intensive in social sciences and digital humanities.
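
A brief sketch of the actor-network step, assuming subject-verb-object triples have already been extracted by a parser; the toy triples and the bias formula below are illustrative, and networkx is used only as a convenient graph library rather than as part of the authors' pipeline.

```python
# Build an actor network from (subject, verb, object) triples and report
# each actor's degree and subject/object bias.
from collections import Counter
import networkx as nx

triples = [("Alice", "rescues", "Bob"),
           ("Alice", "defeats", "Dragon"),
           ("Dragon", "attacks", "Bob"),
           ("Bob", "thanks", "Alice")]

g = nx.DiGraph()
subj_counts, obj_counts = Counter(), Counter()
for subj, verb, obj in triples:
    g.add_edge(subj, obj, action=verb)   # actors linked by the actions between them
    subj_counts[subj] += 1
    obj_counts[obj] += 1

for actor in g.nodes:
    s, o = subj_counts[actor], obj_counts[actor]
    bias = (s - o) / (s + o)             # +1: always subject, -1: always object
    print(f"{actor}: degree={g.degree(actor)}, subject/object bias={bias:+.2f}")
```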

9 citations


Book ChapterDOI
01 Jan 2013
TL;DR: Machine Learning techniques are used to model the reading preferences of audiences of 14 online news outlets; content appeal is found to be related both to writing style, with more sentimentally charged language being preferred, and to topic, with “Public Affairs” content such as “Finance” and “Politics” being less preferred.
Abstract: We use Machine Learning techniques to model the reading preferences of audiences of 14 online news outlets. The models, describing the appeal of a given article to each audience, are formed by linear functions of word frequencies, and are obtained by comparing articles that became “Most Popular” on a given day in a given outlet with articles that did not. We make use of 2,432,148 such article pairs, collected over a period of more than 1.5 years. Those models are shown to be predictive of user choices, and they are then used to compare both the audiences and the contents of various news outlets. In the first case, we find that there is a significant correlation between demographic profiles of audiences and their preferences. In the second case, we find that content appeal is related both to writing style, with more sentimentally charged language being preferred, and to topic, with “Public Affairs” content such as “Finance” and “Politics” being less preferred.
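
One hedged way such per-outlet appeal models could be compared once learnt is cosine similarity between their weight vectors, as a proxy for how alike two audiences' preferences are. The random vectors below are stand-ins for learnt models, and this particular measure is an assumption made here, not necessarily the chapter's method.

```python
# Compare per-outlet appeal models by the cosine similarity of their weights.
import numpy as np

rng = np.random.default_rng(0)
outlets = {"outlet_a": rng.normal(size=1000),   # stand-ins for learnt appeal weights
           "outlet_b": rng.normal(size=1000),
           "outlet_c": rng.normal(size=1000)}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

names = list(outlets)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        print(a, "vs", b, round(cosine(outlets[a], outlets[b]), 3))
```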

4 citations


Book ChapterDOI
01 Jan 2013
TL;DR: This paper investigates a method based on the creation of a graph whose vertices are the documents and whose edges represent some notion of semantic similarity, and shows that relation-based classification is competitive with support vector machines, which can be considered state of the art.
Abstract: We are interested in the problem of automatically annotating a large, constantly expanding corpus, in the case where potentially neither the dataset nor the class of possible labels that can be used is static, and the annotation of the data needs to be efficient. This application is motivated by real-world scenarios of news content analysis and social-web content analysis. We investigate a method based on the creation of a graph, whose vertices are the documents and whose edges represent some notion of semantic similarity. In this graph, label propagation algorithms can be efficiently used to apply labels to documents based on the annotation of their neighbours. This paper presents experimental results on both the efficient creation of the graph and the propagation of the labels. We compare the effectiveness of various approaches to graph construction by building graphs of 800,000 vertices based on the Reuters corpus, showing that relation-based classification is competitive with support vector machines, which can be considered state of the art. We also show that combining our relation-based approach with support vector machines leads to an improvement over either method used individually.
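
A compact sketch of the graph-based annotation idea, assuming scikit-learn's LabelPropagation over a k-NN graph of TF-IDF vectors; the documents, labels, and graph construction below are toy stand-ins rather than the paper's own pipeline, with -1 marking unlabelled items.

```python
# Propagate labels through a k-NN similarity graph over document vectors.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.semi_supervised import LabelPropagation

docs = ["central bank raises interest rates",
        "football club wins the league title",
        "bank holds interest rates after the decision",
        "club striker scores twice in the league final"]
labels = [0, 1, -1, -1]                  # 0 = finance, 1 = sport, -1 = unlabelled

X = TfidfVectorizer().fit_transform(docs).toarray()
model = LabelPropagation(kernel="knn", n_neighbors=2).fit(X, labels)
print(model.transduction_)               # labels propagated to the unlabelled documents
```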