Showing papers by "Anthony J. Bagnall" published in 2005


Book ChapterDOI
18 May 2005
TL;DR: This work introduces a new technique based on a bit level approximation of the data that allows raw data to be directly compared to the reduced representation, while still guaranteeing lower bounds to Euclidean distance.
Abstract: Because time series are a ubiquitous and increasingly prevalent type of data, there has been much research effort devoted to time series data mining recently. As with all data mining problems, the key to effective and scalable algorithms is choosing the right representation of the data. Many high level representations of time series have been proposed for data mining. In this work, we introduce a new technique based on a bit level approximation of the data. The representation has several important advantages over existing techniques. One unique advantage is that it allows raw data to be directly compared to the reduced representation, while still guaranteeing lower bounds to Euclidean distance. This fact can be exploited to produce faster exact algorithms for similarity search. In addition, we demonstrate that our new representation allows time series clustering to scale to much larger datasets.

124 citations
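The core trick here, reducing each series to one bit per point while still retaining a lower bound on Euclidean distance, can be sketched in a few lines. The sketch below assumes z-normalised series clipped at zero and uses a simple threshold-gap bound; the exact representation and bound in the paper may differ, and the function names are illustrative only.

```python
import numpy as np

def clip(series, threshold=0.0):
    """Clip a series to bits: 1 where the value is above the threshold
    (the mean of a z-normalised series), 0 otherwise."""
    return (np.asarray(series, dtype=float) > threshold).astype(np.uint8)

def lower_bound_euclid(query, clipped_bits, threshold=0.0):
    """Lower bound on the Euclidean distance between a raw query and a series
    known only through its clipped form. Where the bit says the hidden value
    lies above the threshold but the query value lies below it (or vice
    versa), the pointwise difference is at least the gap between the query
    value and the threshold; elsewhere it may be zero."""
    q = np.asarray(query, dtype=float)
    above = clipped_bits.astype(bool)
    gap = np.zeros_like(q)
    gap[above] = np.maximum(threshold - q[above], 0.0)
    gap[~above] = np.maximum(q[~above] - threshold, 0.0)
    return float(np.sqrt(np.sum(gap ** 2)))

# The bound never exceeds the true Euclidean distance.
rng = np.random.default_rng(0)
x, y = rng.standard_normal(128), rng.standard_normal(128)
assert lower_bound_euclid(x, clip(y)) <= np.linalg.norm(x - y)
```

Because the bound is computed against the bit-level form only, candidates whose bound already exceeds the best distance found so far can be discarded without touching the raw data, which is where the speed-up in exact similarity search comes from.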


Journal ArticleDOI
TL;DR: An agent-based computational economics approach for studying the effect of alternative structures and mechanisms on behavior in electricity markets is described, and the potential benefit of an evolutionary economics approach to market modeling is demonstrated.
Abstract: The deregulation of electricity markets has continued apace around the globe. The best structure for deregulated markets is a subject of much debate, and the consequences of poor structural choices can be dramatic. Understanding the effect of structure on behavior is essential, but the traditional economics approaches of field studies and experimental studies are particularly hard to conduct in relation to electricity markets. This paper describes an agent based computational economics approach for studying the effect of alternative structures and mechanisms on behavior in electricity markets. Autonomous adaptive agents, using hierarchical learning classifier systems, learn through competition in a simulated model of the UK market in electricity generation. The complex agent structure was developed through a sequence of experimentation to test whether it was capable of meeting the following requirements: first, that the agents are able to learn optimal strategies when competing against nonadaptive agents; second, that the agents are able to learn strategies observable in the real world when competing against other adaptive agents; and third, that cooperation without explicit communication can evolve in certain market situations. The potential benefit of an evolutionary economics approach to market modeling is demonstrated by examining the effects of alternative payment mechanisms on the behavior of agents.

98 citations
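The "alternative payment mechanisms" examined in the paper can be illustrated with a toy one-shot generation auction that settles the same bids under uniform (system marginal price) and pay-as-bid pricing. This is purely illustrative: the generators, bids and clearing rule below are hypothetical and are not the paper's UK market simulator or its learning-classifier-system agents.

```python
# Illustrative only: a toy one-shot generation auction showing how two common
# payment mechanisms settle the same set of bids.

def clear_market(bids, demand):
    """bids: list of (generator, price, capacity); demand in MWh.
    Dispatch the cheapest bids first and return the accepted quantities
    plus the marginal (last accepted) bid price."""
    accepted, remaining, marginal_price = [], demand, 0.0
    for generator, price, capacity in sorted(bids, key=lambda b: b[1]):
        if remaining <= 0:
            break
        take = min(capacity, remaining)
        accepted.append((generator, price, take))
        marginal_price = price
        remaining -= take
    return accepted, marginal_price

def settle(accepted, marginal_price, mechanism):
    """Uniform pricing pays every accepted bid the marginal price;
    pay-as-bid pays each generator its own bid."""
    if mechanism == "uniform":
        return {g: marginal_price * q for g, _, q in accepted}
    return {g: p * q for g, p, q in accepted}

bids = [("A", 18.0, 400), ("B", 22.0, 300), ("C", 35.0, 500)]
accepted, smp = clear_market(bids, demand=600)
print(settle(accepted, smp, "uniform"))     # all accepted bids paid 22.0/MWh
print(settle(accepted, smp, "pay_as_bid"))  # each generator paid its own bid
```

Adaptive agents that learn bidding strategies face different incentives under the two settlement rules, which is the kind of effect the paper studies with its learning classifier system agents.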


Journal ArticleDOI
TL;DR: It is shown that the simple procedure of clipping the time series (discretising to above or below the median) reduces memory requirements and significantly speeds up clustering without decreasing clustering accuracy.
Abstract: Clustering time series is a problem that has applications in a wide variety of fields, and has recently attracted a large amount of research. Time series data are often large and may contain outliers. We show that the simple procedure of clipping the time series (discretising to above or below the median) reduces memory requirements and significantly speeds up clustering without decreasing clustering accuracy. We also demonstrate that clipping increases clustering accuracy when there are outliers in the data, thus serving as a means of outlier detection and a method of identifying model misspecification. We consider simulated data from polynomial, autoregressive moving average and hidden Markov models and show that the estimated parameters of the clipped data used in clustering tend, asymptotically, to those of the unclipped data. We also demonstrate experimentally that, if the series are long enough, the accuracy on clipped data is not significantly less than the accuracy on unclipped data, and if the series contain outliers then clipping results in significantly better clusterings. We then illustrate how using clipped series can be of practical benefit in detecting model misspecification and outliers on two real world data sets: an electricity generation bid data set and an ECG data set.

77 citations
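A minimal sketch of the clipping idea: discretise each series to one bit per point at its median, pack the bits to show the memory saving, and cluster the clipped vectors directly. The data generator and clustering choice below are illustrative only; the paper's experiments use polynomial, ARMA and hidden Markov model data and report accuracy comparisons that this sketch does not reproduce.

```python
import numpy as np
from sklearn.cluster import KMeans

def clip_to_bits(series):
    """Discretise a series to 1 where it is above its median, 0 otherwise."""
    series = np.asarray(series, dtype=float)
    return (series > np.median(series)).astype(np.uint8)

# Toy data: two groups of noisy sinusoids at different frequencies.
rng = np.random.default_rng(1)
t = np.linspace(0, 4 * np.pi, 256)
group_a = [np.sin(t) + rng.standard_normal(t.size) for _ in range(20)]
group_b = [np.sin(3 * t) + rng.standard_normal(t.size) for _ in range(20)]
raw = np.array(group_a + group_b)

clipped = np.array([clip_to_bits(s) for s in raw])

# The clipped series can be stored one bit per point instead of 8 bytes.
packed = np.packbits(clipped, axis=1)
print(raw.nbytes, "bytes raw vs", packed.nbytes, "bytes packed")

# Cluster the clipped series directly (k-means on the 0/1 vectors).
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(clipped)
print(labels)
```

Clipping also blunts the influence of extreme values, since an outlier contributes the same single bit as any other point above the median, which is why it doubles as a form of robustness to outliers.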


Book ChapterDOI
01 Jan 2005
TL;DR: A maze is a grid-like, two-dimensional area of any size, usually rectangular, in which the goal is to learn a policy to reach food as fast as possible from any square.
Abstract: A maze is a grid-like two-dimensional area of any size, usually rectangular. A maze consists of cells. A cell is an elementary maze item, a formally bounded space, interpreted as a single site. The maze may contain different obstacles in any quantity. Some may be significant for learning purposes, like virtual food. The agent is randomly placed in the maze on an empty cell. The agent is allowed to move in all directions, but only through empty space. The task is to learn a policy to reach food as fast as possible from any square. Once the food is reached, the agent's position is reset to a random one and the task is repeated.

42 citations
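A minimal grid-maze environment of the kind described above is easy to sketch: obstacles, empty cells, a food cell, random agent placement, movement in all directions through empty space, and a reset once food is reached. The layout and conventions below are hypothetical, not taken from the chapter.

```python
import random

# '#' obstacle, '.' empty cell, 'F' food. The border is solid so moves
# never leave the grid. (Illustrative layout only.)
GRID = [
    "#######",
    "#..#..#",
    "#..#.F#",
    "#.....#",
    "#######",
]
MOVES = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]

def empty_cells(grid):
    return [(r, c) for r, row in enumerate(grid)
            for c, ch in enumerate(row) if ch == "."]

def step(grid, pos, move):
    """Apply a move; moves into obstacles leave the agent in place.
    Returns (new_position, reached_food)."""
    r, c = pos[0] + move[0], pos[1] + move[1]
    if grid[r][c] == "#":
        return pos, False
    return (r, c), grid[r][c] == "F"

# One episode under a random policy: wander until the food cell is reached.
pos = random.choice(empty_cells(GRID))
steps, reached = 0, False
while not reached:
    pos, reached = step(GRID, pos, random.choice(MOVES))
    steps += 1
print("food reached after", steps, "random moves")
```

A learning system such as a classifier system would replace the random move choice with a learned policy, aiming to minimise the number of steps to food from any starting cell.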


Book ChapterDOI
18 May 2005
TL;DR: This paper describes an alternative distance measure based on the likelihood ratio statistic to test the hypothesis of difference between series, and compares the new distance measure to Euclidean distance on five types of data with varying levels of compression.
Abstract: Fast Fourier Transforms (FFTs) have been a popular transformation and compression technique in time series data mining since first being proposed for use in this context in [1]. The Euclidean distance between coefficients has been the most commonly used distance metric with FFTs. However, on many problems it is not the best measure of similarity available. In this paper we describe an alternative distance measure based on the likelihood ratio statistic to test the hypothesis of difference between series. We compare the new distance measure to Euclidean distance on five types of data with varying levels of compression. We show that the likelihood ratio measure is better at discriminating between series from different models and grouping series from the same model.

32 citations
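One way to contrast the two measures is sketched below: Euclidean distance between truncated FFT coefficients versus a likelihood-ratio statistic for the hypothesis that two truncated periodograms share the same spectrum, treating each periodogram ordinate as approximately exponentially distributed. The exact statistic and compression scheme used in the paper may differ; this is an assumption-laden illustration, not the published measure.

```python
import numpy as np

def truncated_periodograms(x, y, n_coeffs=16):
    """Periodogram ordinates from the first few FFT coefficients
    (excluding the DC term)."""
    fx = np.fft.rfft(np.asarray(x, dtype=float))[1:n_coeffs + 1]
    fy = np.fft.rfft(np.asarray(y, dtype=float))[1:n_coeffs + 1]
    return np.abs(fx) ** 2, np.abs(fy) ** 2

def euclidean_fft_distance(x, y, n_coeffs=16):
    fx = np.fft.rfft(np.asarray(x, dtype=float))[1:n_coeffs + 1]
    fy = np.fft.rfft(np.asarray(y, dtype=float))[1:n_coeffs + 1]
    return float(np.sqrt(np.sum(np.abs(fx - fy) ** 2)))

def likelihood_ratio_distance(x, y, n_coeffs=16, eps=1e-12):
    """-2 log(likelihood ratio) for the hypothesis that both truncated
    periodograms come from the same spectrum, with each ordinate treated
    as approximately exponential (a standard approximation; the statistic
    in the paper may differ in detail)."""
    ix, iy = truncated_periodograms(x, y, n_coeffs)
    ix, iy = ix + eps, iy + eps
    pooled = (ix + iy) / 2.0
    return float(np.sum(2.0 * np.log(pooled ** 2 / (ix * iy))))

rng = np.random.default_rng(2)
t = np.arange(256)
a = np.sin(0.2 * t) + 0.5 * rng.standard_normal(t.size)
b = np.sin(0.2 * t) + 0.5 * rng.standard_normal(t.size)   # same model as a
c = np.sin(0.35 * t) + 0.5 * rng.standard_normal(t.size)  # different model
print(likelihood_ratio_distance(a, b), likelihood_ratio_distance(a, c))
print(euclidean_fft_distance(a, b), euclidean_fft_distance(a, c))
```

The likelihood-ratio statistic is zero only when the two truncated periodograms agree exactly and grows as their per-frequency power diverges, which is what makes it a model-discriminating distance rather than a pointwise one.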


Proceedings ArticleDOI
12 Dec 2005
TL;DR: The results show that AgentP often outperforms (and always at least matches) the performance of other techniques and, on the large majority of mazes used, learns optimal or near optimal solutions with fewer trials and a smaller classifier population.
Abstract: Learning classifier systems belong to the class of algorithms based on the principle of self-organization and evolution and have frequently been applied to mazes, an important type of reinforcement learning problem. Mazes may contain aliasing cells, i.e. squares in different locations that look identical to an agent with limited perceptive power. Mazes with aliasing squares present a particularly difficult learning problem. As a possible approach to the problem, AgentP, a learning classifier system with associative perception, was recently introduced. AgentP is based on the psychological model of associative perception learning and operates on explicitly imprinted images of the environment states. Two types of learning mode are described: the first, self-adjusting AgentP, is more flexible and adapts rapidly to changing information; the second, gradual AgentP, is more conservative in drawing conclusions and rigid when it comes to revising its strategy. The performance of both systems is tested on existing and new aliasing environments. The results show that AgentP often outperforms (and always at least matches) the performance of other techniques and, on the large majority of mazes used, learns optimal or near optimal solutions with fewer trials and a smaller classifier population.

10 citations
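The difficulty the abstract turns on, aliasing cells, is easy to demonstrate: distinct maze positions whose local perception (here, the eight surrounding cells) is identical, so a purely reactive policy cannot tell them apart. The sketch below only illustrates that problem on a made-up maze; it is not an implementation of AgentP.

```python
# Two distinct positions are "aliasing" if the agent's limited perception
# (the eight surrounding cells) is identical for both. Hypothetical maze.
MAZE = [
    "#########",
    "#...#...#",
    "#.#...#.#",
    "#...#...#",
    "#########",
]

def perception(maze, r, c):
    """The eight neighbouring cells, read clockwise from the top-left."""
    return "".join(maze[r + dr][c + dc]
                   for dr, dc in [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                                  (1, 1), (1, 0), (1, -1), (0, -1)])

cells = [(r, c) for r, row in enumerate(MAZE)
         for c, ch in enumerate(row) if ch == "."]
seen = {}
for cell in cells:
    seen.setdefault(perception(MAZE, *cell), []).append(cell)
aliasing = {p: locs for p, locs in seen.items() if len(locs) > 1}
print(aliasing)  # groups of distinct positions that look identical
```

An agent that conditions only on the current perception must act identically in every cell of such a group, even when the optimal moves differ, which is why aliasing mazes require some form of memory or imprinted context.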


23 Apr 2005
TL;DR: Two refinements of FASBIR are proposed and evaluated on several very large data sets and shown to be feasible and effective.
Abstract: Filtered Attribute Subspace based Bagging with Injected Randomness (FASBIR) is a recently proposed algorithm for ensembles of k-nn classifiers [28]. FASBIR works by first performing a global filtering of attributes using information gain, then randomising the bagged ensemble with random subsets of the remaining attributes and random distance metrics. In this paper we propose two refinements of FASBIR and evaluate them on several very large data sets.

1 citation
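Read from the abstract, the FASBIR pipeline can be sketched as: filter attributes globally by information gain, then build a bagged ensemble of k-NN classifiers, each trained on a bootstrap sample, a random subset of the retained attributes and a randomly chosen distance metric. The code below is that reading, not the authors' implementation; it uses mutual information as a stand-in for information gain and omits the two refinements the paper proposes.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.neighbors import KNeighborsClassifier

class FasbirLikeEnsemble:
    """Sketch of a FASBIR-style ensemble: global attribute filtering,
    then bagging with injected randomness (attribute subsets and
    Minkowski metrics). Hypothetical class, for illustration only."""

    def __init__(self, n_members=25, keep_fraction=0.5, k=5, random_state=0):
        self.n_members = n_members
        self.keep_fraction = keep_fraction
        self.k = k
        self.rng = np.random.default_rng(random_state)

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        # Global filtering: keep the most informative attributes.
        scores = mutual_info_classif(X, y, random_state=0)
        n_keep = max(1, int(self.keep_fraction * X.shape[1]))
        self.kept_ = np.argsort(scores)[::-1][:n_keep]
        self.members_ = []
        for _ in range(self.n_members):
            rows = self.rng.integers(0, len(y), len(y))           # bootstrap sample
            cols = self.rng.choice(self.kept_, size=max(1, n_keep // 2),
                                   replace=False)                 # random attribute subset
            p = self.rng.choice([1, 2, 3])                        # random Minkowski metric
            knn = KNeighborsClassifier(n_neighbors=self.k, p=p)
            knn.fit(X[rows][:, cols], y[rows])
            self.members_.append((cols, knn))
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        votes = np.array([knn.predict(X[:, cols]) for cols, knn in self.members_])
        # Majority vote across ensemble members.
        return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)

# Usage example on a small public dataset.
from sklearn.datasets import load_iris
X, y = load_iris(return_X_y=True)
model = FasbirLikeEnsemble().fit(X[::2], y[::2])
print((model.predict(X[1::2]) == y[1::2]).mean())  # hold-out accuracy
```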