
Showing papers by James Bailey published in 2009


Proceedings ArticleDOI
14 Jun 2009
TL;DR: This paper derives the analytical formula for the expected mutual information value between a pair of clusterings, and proposes the adjusted version for several popular information theoretic based measures.
Abstract: Information theoretic based measures form a fundamental class of similarity measures for comparing clusterings, besides the classes of pair-counting based and set-matching based measures. In this paper, we discuss the necessity of correction for chance for information theoretic based measures for clustering comparison. We observe that the baseline for such measures, i.e. the average value between random partitions of a data set, does not take on a constant value, and tends to have larger variation when the ratio between the number of data points and the number of clusters is small. This effect is similar in some other non-information theoretic based measures such as the well-known Rand Index. Assuming a hypergeometric model of randomness, we derive the analytical formula for the expected mutual information value between a pair of clusterings, and then propose the adjusted version for several popular information theoretic based measures. Some examples are given to demonstrate the need and usefulness of the adjusted measures.
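As an editorial illustration (not taken from the paper itself): scikit-learn's adjusted_mutual_info_score implements this style of chance-corrected comparison, subtracting the expected mutual information under a hypergeometric null model before normalising. A minimal sketch, assuming scikit-learn is available:

```python
# Minimal sketch: compare two labelings with and without correction for chance.
from sklearn.metrics import mutual_info_score, adjusted_mutual_info_score

labels_a = [0, 0, 1, 1, 2, 2]   # hypothetical clustering of six points
labels_b = [0, 0, 1, 2, 2, 2]   # a second clustering of the same points

print("MI :", mutual_info_score(labels_a, labels_b))
print("AMI:", adjusted_mutual_info_score(labels_a, labels_b))
# AMI is close to 0 for independent random partitions and 1 for identical
# partitions, so scores stay comparable across different numbers of clusters.
```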

748 citations


Journal ArticleDOI
TL;DR: The challenges and opportunities of feedback control using nonnegative and compartmental system theory for the specific problem of closed-loop control of intensive care unit sedation are discussed.

33 citations


Proceedings Article
01 Dec 2009
TL;DR: This paper addresses the novel problem of querying evolving graphs using spatio-temporal patterns by answering selection queries, which can discover evolving subgraphs that satisfy both a temporal and a spatial predicate.
Abstract: An evolving graph is a graph that can change over time. Such graphs can be used to model a wide range of real-world phenomena, like computer networks, social networks and protein interaction networks. This paper addresses the novel problem of querying evolving graphs using spatio-temporal patterns. In particular, we focus on answering selection queries, which can discover evolving subgraphs that satisfy both a temporal and a spatial predicate. We investigate the efficient implementation of such queries and experimentally evaluate our techniques using real-world evolving graph datasets: Internet connectivity logs and the Enron email corpus. We show that it is possible to use queries to discover meaningful events hidden in this data and demonstrate that our implementation is scalable for very large evolving graphs.
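For illustration only (the paper's query engine is not shown here), a naive selection query over an evolving graph stored as time-stamped snapshots might combine a temporal window with a degree-based structural predicate; the function and threshold names below are hypothetical:

```python
# Naive sketch of a selection query over an evolving graph, assuming the graph
# is given as a dict of time -> networkx snapshot. Not the paper's method.
import networkx as nx

def selection_query(snapshots, t_start, t_end, min_degree):
    """Return (time, node) pairs satisfying both predicates."""
    hits = []
    for t, g in snapshots.items():
        if t_start <= t <= t_end:                 # temporal predicate
            for node, deg in g.degree():
                if deg >= min_degree:             # spatial (structural) predicate
                    hits.append((t, node))
    return hits

snapshots = {0: nx.path_graph(4), 1: nx.star_graph(4), 2: nx.cycle_graph(4)}
print(selection_query(snapshots, 0, 1, min_degree=3))   # [(1, 0)]
```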

16 citations


Journal ArticleDOI
TL;DR: It is shown that it is possible to obtain superior classification accuracy with this approach and obtain a compact gene set that is also biologically relevant and has good coverage of different biological processes.
Abstract: Microarray gene expression profiling has provided extensive datasets that can describe characteristics of cancer patients. An important challenge for this type of data is the discovery of gene sets which can be used as the basis of developing a clinical predictor for cancer. It is desirable that such gene sets be compact, give accurate predictions across many classifiers, be biologically relevant and have good biological process coverage. By using a new type of multiple classifier voting approach, we have identified gene sets that can predict breast cancer prognosis accurately, for a range of classification algorithms. Unlike a wrapper approach, our method is not specialised towards a single classification technique. Experimental analysis demonstrates higher prediction accuracies for our sets of genes compared to previous work in the area. Moreover, our sets of genes are generally more compact than those previously proposed. From a biological viewpoint, most of the genes in our sets are known from the literature to be strongly related to cancer. We show that it is possible to obtain superior classification accuracy with our approach and obtain a compact gene set that is also biologically relevant and has good coverage of different biological processes.
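As a rough editorial sketch of the general idea (not the authors' exact procedure): genes that several different selection criteria agree on can be combined into one compact set, so the result is not tailored to a single classifier. The criteria and thresholds below are illustrative only.

```python
# Hedged sketch: "vote" for genes across several selection criteria and keep
# genes supported by a majority of them.
import numpy as np
from sklearn.feature_selection import f_classif, mutual_info_classif
from sklearn.linear_model import LogisticRegression

def voted_gene_set(X, y, top_k=20):
    votes = np.zeros(X.shape[1])
    votes[np.argsort(f_classif(X, y)[0])[-top_k:]] += 1          # ANOVA F-score
    votes[np.argsort(mutual_info_classif(X, y))[-top_k:]] += 1   # mutual information
    w = np.abs(LogisticRegression(max_iter=1000).fit(X, y).coef_).ravel()
    votes[np.argsort(w)[-top_k:]] += 1                           # linear-model weights
    return np.where(votes >= 2)[0]    # indices of genes backed by at least 2 criteria
```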

15 citations


Proceedings Article
01 Dec 2009
TL;DR: This paper investigates the suitability of using a new feature weighting scheme for SVM kernel functions, based on receiver operating characteristics (ROC), and demonstrates that it can significantly and substantially boost classification performance, across a range of datasets.
Abstract: Support Vector Machines (SVMs) are a leading tool in classification and pattern recognition and the kernel function is one of its most important components. This function is used to map the input space into a high dimensional feature space. However, it can perform rather poorly when there are too many dimensions (e.g. for gene expression data) or when there is a lot of noise. In this paper, we investigate the suitability of using a new feature weighting scheme for SVM kernel functions, based on receiver operating characteristics (ROC). This strategy is clean, simple and surprisingly effective. We experimentally demonstrate that it can significantly and substantially boost classification performance, across a range of datasets.
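A minimal sketch of the idea, assuming two-class data and scikit-learn (the weighting rule below is an editorial simplification, not necessarily the paper's exact scheme): weight each feature by its stand-alone ROC performance, then train an RBF-kernel SVM on the rescaled features.

```python
# Hedged sketch: per-feature ROC weighting before an RBF-kernel SVM.
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.svm import SVC

def roc_weights(X, y):
    # |AUC - 0.5| is 0 for an uninformative feature, 0.5 for a perfect one
    return np.array([abs(roc_auc_score(y, X[:, j]) - 0.5) for j in range(X.shape[1])])

def fit_roc_weighted_svm(X, y):
    w = roc_weights(X, y)
    clf = SVC(kernel="rbf", gamma="scale").fit(X * w, y)
    return clf, w

# At prediction time the same weights must be reused: clf.predict(X_new * w)
```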

13 citations


Book ChapterDOI
06 Oct 2009
TL;DR: This paper describes an approach to improving the robustness of an agent system by augmenting its failure-handling capabilities, based on the concept of semantic compensation: "cleaning up" failed or canceled tasks can help agents behave more robustly and predictably at both an individual and system level.
Abstract: This paper describes an approach to improving the robustness of an agent system by augmenting its failure-handling capabilities. The approach is based on the concept of semantic compensation: "cleaning up" failed or canceled tasks can help agents behave more robustly and predictably at both an individual and system level. Our approach is goal-based, both with respect to defining failure-handling knowledge, and in specifying a failure-handling model that makes use of this knowledge. Because failure handling is abstracted above the level of specific actions or task implementations, the approach is not tied to specific agent architectures or task plans and is more widely applicable. The failure-handling knowledge is employed via a failure-handling support component associated with each agent through a goal-based interface. The use of this component decouples the specification and use of failure-handling information from the specification of the agent's domain problem-solving knowledge, and reduces the failure-handling information that an agent developer needs to provide.
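Purely as an illustration of the pairing between a goal and its semantic compensation (architecture-agnostic, not the paper's support component); all names here are hypothetical:

```python
# Illustrative sketch: each goal carries a compensation that "cleans up"
# whenever the goal fails or is canceled.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Goal:
    name: str
    achieve: Callable[[], bool]       # domain problem-solving; True on success
    compensate: Callable[[], None]    # declarative clean-up for this goal

def pursue(goal: Goal) -> bool:
    try:
        if goal.achieve():
            return True
    except Exception:
        pass
    goal.compensate()                 # failure or cancellation: restore a sane state
    return False
```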

13 citations


Journal ArticleDOI
TL;DR: In this paper, a direct adaptive disturbance rejection control framework for compartmental dynamical systems with exogenous bounded disturbances is proposed, which guarantees partial asymptotic stability with respect to part of the closed-loop system states associated with the plant dynamics.
Abstract: Compartmental system models involve dynamic states whose values are nonnegative. These models are widespread in biological and physiological sciences and play a key role in understanding these processes. In this paper, we develop a direct adaptive disturbance rejection control framework for compartmental dynamical systems with exogenous bounded disturbances. The proposed framework is Lyapunov based and guarantees partial asymptotic stability of the closed-loop system, that is, asymptotic stability with respect to part of the closed-loop system states associated with the plant dynamics. The remainder of the states associated with the adaptive controller gains are shown to be Lyapunov stable. In the case of bounded energy ℒ2 disturbances, the proposed approach guarantees a nonexpansivity constraint on the closed-loop input–output map between the plant disturbances and performance variables. Finally, a numerical example involving the infusion of the anesthetic drug propofol for maintaining a desired constant level of depth of anesthesia for surgery in the face of continuing hemorrhage and hemodilution is provided.
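A toy simulation sketch, for orientation only: a two-compartment model driven by a nonnegative infusion and a bounded sinusoidal disturbance, with a crude adaptive gain. The rates, update law and set-point are invented for illustration and are not the Lyapunov-based design of the paper.

```python
# Toy sketch (hypothetical parameters, not the paper's controller).
import numpy as np

def simulate(T=200.0, dt=0.01, x_ref=1.0):
    k12, k21, k10 = 0.3, 0.2, 0.1        # invented transfer/elimination rates
    x = np.zeros(2)                       # nonnegative compartment amounts
    k_hat, gamma = 0.5, 0.05              # adaptive gain and adaptation rate
    for step in range(int(T / dt)):
        t = step * dt
        d = 0.1 * np.sin(0.5 * t)         # exogenous bounded disturbance
        e = x[0] - x_ref
        u = max(0.0, -k_hat * e)          # infusion rate must stay nonnegative
        k_hat += gamma * e * e * dt       # crude gradient-like gain update
        dx = np.array([-(k12 + k10) * x[0] + k21 * x[1] + u + d,
                       k12 * x[0] - k21 * x[1]])
        x = np.maximum(x + dt * dx, 0.0)  # keep the state in the nonnegative orthant
    return x, k_hat
```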

13 citations


Book ChapterDOI
19 Apr 2009
TL;DR: This paper investigates whether expressive contrasts are beneficial for classification by adopting a statistical methodology for eliminating noisy patterns and identifying circumstances where expressive patterns can improve over previous contrast pattern based classifiers.
Abstract: Classification is an important task in data mining. Contrast patterns, such as emerging patterns, have been shown to be powerful for building classifiers, but they rarely exist in sparse data. Recently proposed disjunctive emerging patterns are highly expressive, and can potentially overcome this limitation. Simple contrast patterns allow only conjunctions, whereas disjunctive patterns additionally allow expressions of disjunctions. This paper investigates whether expressive contrasts are beneficial for classification. We adopt a statistical methodology for eliminating noisy patterns. Our experiments identify circumstances where expressive patterns can improve over previous contrast pattern based classifiers. We also present guidelines on i) when to use expressive patterns, based on the nature of the given data, and ii) how to choose between the different types of contrast patterns for building a classifier.
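For readers unfamiliar with the terminology, a small sketch of the support computations behind conjunctive versus disjunctive contrast patterns (the definitions follow common usage in the emerging-pattern literature, not this paper's specific algorithm):

```python
# Illustrative support functions for conjunctive and disjunctive patterns.
def supp_conjunction(pattern, dataset):
    """Fraction of transactions containing every item of the pattern."""
    return sum(pattern <= t for t in dataset) / len(dataset)

def supp_disjunction(groups, dataset):
    """Fraction of transactions containing at least one item from every group."""
    return sum(all(g & t for g in groups) for t in dataset) / len(dataset)

pos = [{"a", "b"}, {"a", "c"}, {"b", "c"}]
neg = [{"c"}, {"c", "d"}, {"d"}]
# A pattern's growth rate = support in positives / support in negatives.
print(supp_conjunction({"a"}, pos), supp_conjunction({"a"}, neg))                 # 0.67, 0.0
print(supp_disjunction([{"a", "b"}], pos), supp_disjunction([{"a", "b"}], neg))   # 1.0, 0.0
```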

12 citations


Book ChapterDOI
01 Jan 2009
TL;DR: A wide range of query languages for the Semantic Web exist, ranging from pure “selection languages” with only limited expressivity, to fully-fledged reasoning languages, to general purpose languages that support multiple data representation formats and allow simultaneous querying of data on both the standard andSemantic Web.
Abstract: DEFINITION. A number of formalisms have been proposed for representing data and metadata on the Semantic Web. In particular, RDF, Topic Maps and OWL allow one to describe relationships between data items, such as concept hierarchies and relations between the concepts. A key requirement for the Semantic Web is integrated access to data represented in any of these formalisms, as well as the ability to access data in the formalisms of the "standard Web", such as (X)HTML and XML. This data access is the objective of Semantic Web query languages. A wide range of query languages for the Semantic Web exist, ranging i) from pure "selection languages" with only limited expressivity to fully-fledged reasoning languages, and ii) from query languages restricted to a certain data representation format, such as XML or RDF, to general-purpose languages that support multiple data representation formats and allow simultaneous querying of data on both the standard Web and the Semantic Web.
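As a small, editor-supplied example of the "selection language" end of that spectrum, here is a SPARQL query over a tiny in-memory RDF graph using rdflib; the vocabulary URIs are made up for illustration.

```python
# Hedged example: a SPARQL selection query over an in-memory RDF graph.
from rdflib import Graph, Namespace, Literal

EX = Namespace("http://example.org/")
g = Graph()
g.add((EX.dataMining, EX.subConceptOf, EX.computerScience))
g.add((EX.dataMining, EX.label, Literal("Data Mining")))

results = g.query("""
    SELECT ?concept ?label WHERE {
        ?concept <http://example.org/subConceptOf> <http://example.org/computerScience> .
        ?concept <http://example.org/label> ?label .
    }
""")
for concept, label in results:
    print(concept, label)
```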

11 citations


Journal ArticleDOI
01 Oct 2009
TL;DR: This work investigates a new type of dynamic graph analysis, finding regions of a graph that are evolving in a similar manner and are topologically similar over a period of time, and proposes a new algorithm called regHunter, which treats the region discovery problem as a multi-objective optimisation problem and uses a multi-level graph partitioning algorithm to discover the regions of correlated change.
Abstract: There is growing interest in studying dynamic graphs, or graphs that evolve with time. In this work, we investigate a new type of dynamic graph analysis: finding regions of a graph that are evolving in a similar manner and are topologically similar over a period of time. For example, these regions can be used to group a set of changes having a common cause in event detection and fault diagnosis. Prior work [6] has proposed a greedy framework called cSTAG to find these regions. It was accurate in datasets where the regions are temporally and spatially well separated. However, in cases where the regions are not well separated, cSTAG produces incorrect groupings. In this paper, we propose a new algorithm called regHunter. It treats the region discovery problem as a multi-objective optimisation problem, and it uses a multi-level graph partitioning algorithm to discover the regions of correlated change. In addition, we propose an external clustering validation technique, and use several existing internal measures to evaluate the accuracy of regHunter. Using synthetic datasets, we found that regHunter is significantly more accurate than cSTAG in dynamic graphs that have regions with small separation. Using two real datasets, the access graph of the 1998 World Cup website and the BGP connectivity graph during the landfall of Hurricane Katrina, we found that regHunter obtained more accurate results than cSTAG. Furthermore, regHunter was able to discover two interesting regions for the World Cup access graph that cSTAG was not able to find.
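Not regHunter itself, but a naive illustration of the trade-off its multi-objective formulation balances: grouping change events by a weighted combination of temporal distance and topological (shortest-path) distance. The names and the clustering choice below are hypothetical.

```python
# Naive sketch: cluster (time, node) change events on a reference graph using
# a precomputed mix of temporal and hop distances. Not the paper's algorithm.
import itertools
import networkx as nx
import numpy as np
from sklearn.cluster import AgglomerativeClustering

def group_changes(graph, changes, alpha=0.5, n_regions=2):
    n = len(changes)
    dist = np.zeros((n, n))
    for i, j in itertools.combinations(range(n), 2):
        (ti, ui), (tj, uj) = changes[i], changes[j]
        hops = nx.shortest_path_length(graph, ui, uj)
        dist[i, j] = dist[j, i] = alpha * abs(ti - tj) + (1 - alpha) * hops
    model = AgglomerativeClustering(n_clusters=n_regions,
                                    metric="precomputed", linkage="average")
    return model.fit_predict(dist)   # one region label per change event
```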

7 citations


Journal ArticleDOI
TL;DR: This work claims that execution logging is essential to support agent system robustness, and that agents should have architectural-level support for logging and recovery methods, and describes an infrastructure-level, default methodology for agent problem-handling, based on logging, and supported by declaratively encoding domain-specific knowledge related to changes in goal status and semantic compensations.
Abstract: In an agent system, the ability to handle problems and recover from them is important in sustaining stability and providing robustness. We claim that execution logging is essential to support agent system robustness, and that agents should have architectural-level support for logging and recovery methods. We describe an infrastructure-level, default methodology for agent problem-handling, based on logging, and supported by declaratively encoding domain-specific knowledge related to changes in goal status and semantic compensations. Via logging, the approach allows repair of already-completed as well as current goals. We define a language, APLR, to support and constrain incremental specification of problem-handling information, with the agents' problem-handling behaviour increasing in sophistication as more knowledge is added to the system. The approach is implemented by mapping the methodology and domain knowledge to 3APL-like plan rules extended to support logging.
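Purely illustrative (not APLR or the paper's 3APL-style rules): the core data structure is an append-only log of goal-status changes, which is what makes already-completed goals visible for later repair. The class and method names below are invented.

```python
# Hypothetical sketch of an execution log supporting later repair of goals.
import time

class GoalLog:
    def __init__(self):
        self.events = []                      # append-only (timestamp, goal, status)

    def record(self, goal, status):
        self.events.append((time.time(), goal, status))

    def achieved_goals(self):
        """Goals whose most recent status is 'achieved'; candidates for
        compensation or repair if a later problem invalidates their effects."""
        latest = {}
        for _, goal, status in self.events:
            latest[goal] = status
        return [g for g, s in latest.items() if s == "achieved"]
```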