Author

Joby Boxall

Bio: Joby Boxall is an academic researcher at the University of Sheffield. The author has contributed to research on topics including water quality and biofilm. The author has an h-index of 34 and has co-authored 169 publications receiving 3,904 citations. Previous affiliations of Joby Boxall include the University of Warwick and the University of Cambridge.


Papers
Journal ArticleDOI
TL;DR: There are very few published practicable tools and techniques available to aid water companies in the planned management and control of discolouration problems, and this is an area in need of significant further practical research and development.

266 citations

Journal ArticleDOI
TL;DR: Bacteria inhabiting biofilms, predominantly species belonging to genera Pseudomonas, Zooglea and Janthinobacterium, have an enhanced ability to express extracellular polymeric substances to adhere to surfaces and to favour co-aggregation between cells than those found in the bulk water.

241 citations

Journal ArticleDOI
TL;DR: The currently available methods and emerging approaches for characterising microbial communities, including both planktonic and biofilm ways of life, are critically evaluated and will assist hydraulic engineers and microbial ecologists in choosing the most appropriate tools to assess drinking water microbiology and related aspects.

212 citations

Journal ArticleDOI
TL;DR: The objective of the work presented in this paper was to assess the online application and resulting benefits of an artificial intelligence system for detection of leaks/bursts at district meter area (DMA) level.
Abstract: Water lost through leakage from water distribution networks is often appreciable. As pressure increases on water resources, there is a growing emphasis for water service providers to minimize this loss. The objective of the work presented in this paper was to assess the online application and resulting benefits of an artificial intelligence system for detection of leaks/bursts at district meter area (DMA) level. An artificial neural network model, a mixture density network, was trained using a continually updated historic database that constructed a probability density model of the future flow profile. A fuzzy inference system was used for classification; it compared latest observed flow values with predicted flows over time windows such that in the event of abnormal flow conditions alerts are generated. From the probability density functions of predicted flows, the fuzzy inference system provides confidence intervals associated with each detection, these confidence values provide useful information for f...

177 citations
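
As a rough illustration of the detection idea in the abstract above (compare the latest observed flows with a predicted flow distribution over a time window and raise an alert on abnormal conditions), the following is a minimal sketch. It is not the authors' system: a plain Gaussian prediction interval stands in for the mixture density network, and a simple persistence rule stands in for the fuzzy inference system. The function name, the `z` threshold, the `window` length, and the sample flows are all hypothetical.

```python
def detect_abnormal_flow(observed, pred_mean, pred_std, z=3.0, window=3):
    """Return the indices at which an alert would be raised.

    Assumption: each predicted flow is summarised by a mean and a standard
    deviation (a stand-in for a full predicted probability density).
    An alert is raised once `window` consecutive observations fall
    outside mean +/- z * std.
    """
    alerts = []
    run = 0  # length of the current run of out-of-interval observations
    for i, (obs, mu, sigma) in enumerate(zip(observed, pred_mean, pred_std)):
        if abs(obs - mu) > z * sigma:
            run += 1
        else:
            run = 0
        if run >= window:
            alerts.append(i)
    return alerts


# Hypothetical DMA flows (arbitrary units): a sustained rise from index 4.
observed = [10.1, 9.8, 10.3, 10.0, 14.2, 14.5, 14.1, 14.4]
pred_mean = [10.0] * 8
pred_std = [0.5] * 8
alerts = detect_abnormal_flow(observed, pred_mean, pred_std)
print(alerts)
```

Requiring several consecutive out-of-interval readings is one simple way to avoid alerting on a single noisy sample; the paper's fuzzy classification over time windows serves a related purpose but works differently.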

Journal ArticleDOI
TL;DR: Support vector regression, whose robustness derives from its training error function, is used as a learning method for anomaly detection from water flow and pressure time series data and is applied to a case study.
Abstract: The sampling frequency and quantity of time series data collected from water distribution systems has been increasing in recent years, giving rise to the potential for improving system knowledge if suitable automated techniques can be applied, in particular, machine learning. Novelty (or anomaly) detection refers to the automatic identification of novel or abnormal patterns embedded in large amounts of “normal” data. When dealing with time series data (transformed into vectors), this means abnormal events embedded amongst many normal time series points. The support vector machine is a data-driven statistical technique that has been developed as a tool for classification and regression. The key features include statistical robustness with respect to non-Gaussian errors and outliers, the selection of the decision boundary in a principled way, and the introduction of nonlinearity in the feature space without explicitly requiring a nonlinear algorithm by means of kernel functions. In this research, support vector regression is used as a learning method for anomaly detection from water flow and pressure time series data. No use is made of past event histories collected through other information sources. The support vector regression methodology, whose robustness derives from the training error function, is applied to a case study.

148 citations
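
The abstract above attributes the robustness of support vector regression to its training error function. In the standard formulation this is the epsilon-insensitive loss, which ignores residuals smaller than a tolerance and grows only linearly beyond it. The sketch below compares it with a squared loss and flags points whose residual exceeds the tolerance. The flagging rule, the function names, and the sample values are illustrative assumptions, not the procedure used in the paper.

```python
def epsilon_insensitive_loss(residual, epsilon):
    """Standard SVR loss: zero inside the epsilon tube, linear outside."""
    return max(0.0, abs(residual) - epsilon)


def squared_loss(residual):
    """Least-squares loss, shown for comparison; grows quadratically."""
    return residual ** 2


def flag_novelties(observed, predicted, epsilon):
    """Hypothetical rule: flag indices whose residual leaves the tube."""
    return [i for i, (y, y_hat) in enumerate(zip(observed, predicted))
            if abs(y - y_hat) > epsilon]


# Made-up pressure readings and model predictions (arbitrary units).
observed = [30.0, 30.2, 29.9, 25.0, 30.1]
predicted = [30.0, 30.0, 30.0, 30.0, 30.0]
flags = flag_novelties(observed, predicted, epsilon=0.5)
print(flags)
print(epsilon_insensitive_loss(5.0, 0.5), squared_loss(5.0))
```

For the residual of 5.0, the epsilon-insensitive loss is 4.5 while the squared loss is 25.0, which shows why a single outlier pulls a least-squares fit much harder than an SVR fit.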


Cited by
Christopher M. Bishop
01 Jan 2006
TL;DR: The book covers probability distributions, linear models for regression and classification, neural networks, kernel methods, sparse kernel machines, graphical models, mixture models and EM, approximate inference, sampling methods, continuous latent variables, sequential data, and combining models.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

01 Jan 1990
TL;DR: An overview of the self-organizing map algorithm, on which the papers in this issue are based, is presented.
Abstract: An overview of the self-organizing map algorithm, on which the papers in this issue are based, is presented in this article.

2,933 citations

Book ChapterDOI
01 Jan 1997
TL;DR: The boundary layer equations for plane, incompressible, and steady flow are presented.
Abstract: The boundary layer equations for plane, incompressible, and steady flow are
$$\begin{aligned}
u\frac{\partial u}{\partial x} + v\frac{\partial u}{\partial y} &= -\frac{1}{\varrho}\frac{\partial p}{\partial x} + \nu\frac{\partial^2 u}{\partial y^2}, \\
0 &= \frac{\partial p}{\partial y}, \\
\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} &= 0.
\end{aligned}$$

2,598 citations

Journal Article
TL;DR: FastTree stores sequence profiles of internal nodes in the tree, uses them to implement neighbor-joining with heuristics that quickly identify candidate joins, and then uses nearest-neighbor interchanges to reduce the length of the tree.
Abstract: Gene families are growing rapidly, but standard methods for inferring phylogenies do not scale to alignments with over 10,000 sequences. We present FastTree, a method for constructing large phylogenies and for estimating their reliability. Instead of storing a distance matrix, FastTree stores sequence profiles of internal nodes in the tree. FastTree uses these profiles to implement neighbor-joining and uses heuristics to quickly identify candidate joins. FastTree then uses nearest-neighbor interchanges to reduce the length of the tree. For an alignment with N sequences, L sites, and a different characters, a distance matrix requires O(N^2) space and O(N^2 L) time, but FastTree requires just O( NLa + N sqrt(N) ) memory and O( N sqrt(N) log(N) L a ) time. To estimate the tree's reliability, FastTree uses local bootstrapping, which gives another 100-fold speedup over a distance matrix. For example, FastTree computed a tree and support values for 158,022 distinct 16S ribosomal RNAs in 17 hours and 2.4 gigabytes of memory. Just computing pairwise Jukes-Cantor distances and storing them, without inferring a tree or bootstrapping, would require 17 hours and 50 gigabytes of memory. In simulations, FastTree was slightly more accurate than neighbor joining, BIONJ, or FastME; on genuine alignments, FastTree's topologies had higher likelihoods. FastTree is available at http://microbesonline.org/fasttree.

2,436 citations
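
To make the scaling argument in the abstract above concrete, this sketch evaluates the two space terms it quotes, O(N^2) for a distance matrix and O(NLa + N sqrt(N)) for FastTree, with constant factors dropped. N is taken from the 16S example in the abstract. The alignment length L and alphabet size a are not stated there, so the values used below are placeholders, and the results are term counts rather than bytes.

```python
import math


def distance_matrix_terms(n):
    """Leading space term for a full distance matrix: N^2."""
    return n * n


def fasttree_terms(n, sites, alphabet):
    """Leading space terms quoted for FastTree: N*L*a + N*sqrt(N)."""
    return n * sites * alphabet + n * math.sqrt(n)


n = 158_022      # sequences, from the abstract's 16S example
sites = 1_000    # assumed alignment length L (not given in the abstract)
alphabet = 4     # assumed alphabet size a for nucleotides

matrix = distance_matrix_terms(n)
profiles = fasttree_terms(n, sites, alphabet)
print(matrix)              # 24970952484
print(matrix / profiles)   # ratio of the two leading terms
```

Under these placeholder values the quadratic term is already tens of times larger than the profile-based terms, and the gap widens as N grows because only the distance matrix scales with N^2.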