scispace - formally typeset
Journal ArticleDOI

Pattern Recognition and Machine Learning

01 Aug 2007-Technometrics (Taylor & Francis)-Vol. 49, Iss: 3, pp 366-366
TL;DR: This book offers a comprehensive treatment of pattern recognition and machine learning from a Bayesian perspective, covering topics such as graphical models, kernel methods, neural networks, and approximate inference, and has become a standard reference for researchers and graduate students in the field.
Abstract: (2007). Pattern Recognition and Machine Learning. Technometrics: Vol. 49, No. 3, pp. 366-366.
Citations
Journal ArticleDOI
TL;DR: A GP framework is presented to model RV time series jointly with ancillary activity indicators, allowing the activity component of RV time series to be constrained and disentangled from e.g. planetary components.
Abstract: To date, the radial velocity (RV) method has been one of the most productive techniques for detecting and confirming extrasolar planetary candidates. Unfortunately, stellar activity can induce RV variations which can drown out or even mimic planetary signals - and it is notoriously difficult to model and thus mitigate the effects of these activity-induced nuisance signals. This is expected to be a major obstacle to using next-generation spectrographs to detect lower mass planets, planets with longer periods, and planets around more active stars. Enter Gaussian processes (GPs) which, we note, have a number of attractive features that make them very well suited to disentangling stellar activity signals from planetary signals. We present here a GP framework we developed to model RV time series jointly with ancillary activity indicators (e.g. bisector velocity spans, line widths, chromospheric activity indices), allowing the activity component of RV time series to be constrained and disentangled from e.g. planetary components. We discuss the mathematical details of our GP framework, and present results illustrating its encouraging performance on both synthetic and real RV datasets, including the publicly-available Alpha Centauri B dataset.
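The core machinery here is standard GP regression. As a minimal sketch of that idea only (not the authors' framework: the squared-exponential kernel, hyperparameters, and synthetic "activity-like" data below are illustrative assumptions), the posterior mean at the training inputs can be computed in a few lines of numpy:

```python
import numpy as np

def rbf_kernel(x1, x2, amp=1.0, length=1.0):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    d = x1[:, None] - x2[None, :]
    return amp**2 * np.exp(-0.5 * (d / length)**2)

def gp_predict(x_train, y_train, x_test, noise=0.1):
    """Posterior mean of a zero-mean GP at x_test, given noisy observations."""
    K = rbf_kernel(x_train, x_train) + noise**2 * np.eye(len(x_train))
    K_s = rbf_kernel(x_test, x_train)
    return K_s @ np.linalg.solve(K, y_train)

# Toy quasi-smooth signal standing in for an activity-induced RV variation.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 50)
y = np.sin(t) + 0.1 * rng.standard_normal(50)
mean = gp_predict(t, y, t)
```

A joint model of the kind described in the abstract would extend this by sharing one latent GP (and its derivatives) across the RV series and each activity indicator, rather than fitting each series independently.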

288 citations

Journal ArticleDOI
TL;DR: In this article, a molecular dipole moment model based on environment-dependent neural network charges is proposed for the prediction of infrared spectra based on only a few hundred electronic-structure reference points.
Abstract: Machine learning has emerged as an invaluable tool in many research areas. In the present work, we harness this power to predict highly accurate molecular infrared spectra with unprecedented computational efficiency. To account for vibrational anharmonic and dynamical effects – typically neglected by conventional quantum chemistry approaches – we base our machine learning strategy on ab initio molecular dynamics simulations. While these simulations are usually extremely time-consuming even for small molecules, we overcome these limitations by leveraging the power of a variety of machine learning techniques, not only accelerating simulations by several orders of magnitude, but also greatly extending the size of systems that can be treated. To this end, we develop a molecular dipole moment model based on environment-dependent neural network charges and combine it with the neural network potential approach of Behler and Parrinello. Contrary to the prevalent big data philosophy, we are able to obtain very accurate machine learning models for the prediction of infrared spectra based on only a few hundred electronic-structure reference points. This is made possible through the use of molecular forces during neural network potential training and the introduction of a fully automated sampling scheme. We demonstrate the power of our machine learning approach by applying it to model the infrared spectra of a methanol molecule, n-alkanes containing up to 200 atoms and the protonated alanine tripeptide, which at the same time represents the first application of machine learning techniques to simulate the dynamics of a peptide. In all of these case studies we find an excellent agreement between the infrared spectra predicted via machine learning models and the respective theoretical and experimental spectra.
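The final step in this kind of pipeline is standard: an IR lineshape is obtained from the Fourier transform of the dipole trajectory along the dynamics (the Wiener–Khinchin route to the dipole autocorrelation spectrum). A hedged sketch of just that step, with a made-up single-mode dipole trajectory in place of a real MD run:

```python
import numpy as np

def ir_spectrum(dipole, dt):
    """Power spectrum of a dipole time series (shape: steps x 3).

    Proportional to the dipole autocorrelation spectrum by the
    Wiener-Khinchin theorem; prefactors are omitted in this sketch.
    """
    mu = dipole - dipole.mean(axis=0)            # remove the static dipole
    power = np.abs(np.fft.rfft(mu, axis=0))**2   # per Cartesian component
    freq = np.fft.rfftfreq(mu.shape[0], d=dt)
    return freq, power.sum(axis=1)               # sum over x, y, z

# Hypothetical trajectory: one vibrational mode at 0.1 fs^-1.
dt = 1.0                                         # time step in fs
t = np.arange(4096) * dt
dipole = np.stack([np.cos(2 * np.pi * 0.1 * t)] * 3, axis=1)
freq, spec = ir_spectrum(dipole, dt)
peak = freq[np.argmax(spec)]                     # recovers ~0.1 fs^-1
```

In the paper's setting, `dipole` would instead come from the neural-network charge model evaluated along a machine-learned MD trajectory.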

288 citations

Proceedings ArticleDOI
05 Sep 2012
TL;DR: A wearable acoustic sensor, called BodyScope, is developed to record the sounds produced in the user's throat area and classify them into user activities, such as eating, drinking, speaking, laughing, and coughing.
Abstract: Accurate activity recognition enables the development of a variety of ubiquitous computing applications, such as context-aware systems, lifelogging, and personal health systems. Wearable sensing technologies can be used to gather data for activity recognition without requiring sensors to be installed in the infrastructure. However, the user may need to wear multiple sensors for accurate recognition of a larger number of different activities. We developed a wearable acoustic sensor, called BodyScope, to record the sounds produced in the user's throat area and classify them into user activities, such as eating, drinking, speaking, laughing, and coughing. The F-measure of the Support Vector Machine classification of 12 activities using only our BodyScope sensor was 79.5%. We also conducted a small-scale in-the-wild study, and found that BodyScope was able to identify four activities (eating, drinking, speaking, and laughing) at 71.5% accuracy.
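The F-measure quoted above is the harmonic mean of precision and recall. As a quick reminder of the arithmetic (the confusion counts below are invented for illustration, not taken from the paper):

```python
def f_measure(tp, fp, fn):
    """F1 score: harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical per-class confusion counts for one activity:
score = f_measure(tp=80, fp=20, fn=22)   # roughly 0.79
```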

288 citations

Journal ArticleDOI
TL;DR: It is shown that the most accurate characterizations are achieved by using prior knowledge of where to expect neurodegeneration (hippocampus and parahippocampal gyrus) and that feature selection does improve the classification accuracies, but it depends on the method adopted.

288 citations


Cites methods from "Pattern Recognition and Machine Lea..."

  • ...The new samples were then drawn from the group with more samples using a variation of rejection sampling (Bishop, 2006)....

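The rejection sampling referred to here (Bishop, 2006, ch. 11) draws candidates from a proposal density q and accepts them with probability p(x) / (k·q(x)), where k bounds p from above by k·q. A minimal sketch with an illustrative triangular target (the densities and counts are chosen for the example only):

```python
import random

def rejection_sample(target_pdf, proposal_sample, proposal_pdf, k, n):
    """Draw n samples from target_pdf, assuming target_pdf(x) <= k * proposal_pdf(x)."""
    samples = []
    while len(samples) < n:
        x = proposal_sample()                    # candidate from the proposal
        u = random.uniform(0.0, k * proposal_pdf(x))
        if u <= target_pdf(x):                   # accept with prob p(x)/(k*q(x))
            samples.append(x)
    return samples

# Toy target: triangular density p(x) = 2x on [0, 1],
# with a uniform proposal q(x) = 1 and envelope constant k = 2.
random.seed(0)
xs = rejection_sample(lambda x: 2.0 * x, random.random,
                      lambda x: 1.0, k=2.0, n=5000)
```

The sample mean of `xs` should approach 2/3, the mean of the triangular density. The cited paper uses a variation of this scheme to oversample the smaller class, a detail not reproduced here.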

Journal ArticleDOI
TL;DR: This primer aims to introduce BNs to the computational biologist, focusing on the concepts behind methods for learning the parameters and structure of models, at a time when they are becoming the machine learning method of choice.
Abstract: Bayesian networks (BNs) provide a neat and compact representation for expressing joint probability distributions (JPDs) and for inference. They are becoming increasingly important in the biological sciences for the tasks of inferring cellular networks [1], modelling protein signalling pathways [2], systems biology, data integration [3], classification [4], and genetic data analysis [5]. The representation and use of probability theory makes BNs suitable for combining domain knowledge and data, expressing causal relationships, avoiding overfitting a model to training data, and learning from incomplete datasets. The probabilistic formalism provides a natural treatment for the stochastic nature of biological systems and measurements. This primer aims to introduce BNs to the computational biologist, focusing on the concepts behind methods for learning the parameters and structure of models, at a time when they are becoming the machine learning method of choice. There are many applications in biology where we wish to classify data; for example, gene function prediction. To solve such problems, a set of rules is required that can be used for prediction, but often such knowledge is unavailable, or in practice there turn out to be so many rules, or so many exceptions to them, that this approach produces poor results. Machine learning approaches often produce better results: a large number of examples (the training set) is used to adapt the parameters of a model, which can then be used for performing predictions or classifications on new data. There are many different types of models that may be required and many different approaches to training them, each with its pros and cons. An excellent overview of the topic can be found in [6] and [7].
Neural networks, for example, are often able to learn a model from training data, but it is often difficult to extract information about the learned model itself, information which with other methods can provide valuable insights into the data or problem being solved. A common problem in machine learning is overfitting, where the learned model is too complex and generalises poorly to unseen data. Increasing the size of the training dataset may reduce this; however, this assumes more training data is readily available, which is often not the case. In addition, it is often important to determine the uncertainty in the learned model parameters or even in the choice of model. This primer focuses on the use of BNs, which offer a solution to these issues. The use of Bayesian probability theory provides mechanisms for describing uncertainty and for adapting the number of parameters to the size of the data. Using a graphical representation provides a simple way to visualise the structure of a model. Inspection of models can provide valuable insights into the properties of the data and allow new models to be produced.
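The compactness the abstract describes comes from the JPD factorising over the graph: each variable needs only a conditional distribution given its parents. A toy illustration (the three-node network, variable names, and probabilities below are invented for the example, not taken from the primer), with posterior inference by enumeration:

```python
# Tiny BN: Gene -> Expression, Gene -> Phenotype, so the joint factorises as
#   P(g, e, p) = P(g) * P(e | g) * P(p | g)
p_g = {True: 0.3, False: 0.7}
p_e_given_g = {True: {True: 0.9, False: 0.1},
               False: {True: 0.2, False: 0.8}}
p_p_given_g = {True: {True: 0.8, False: 0.2},
               False: {True: 0.1, False: 0.9}}

def joint(g, e, p):
    """Joint probability from the factorised representation."""
    return p_g[g] * p_e_given_g[g][e] * p_p_given_g[g][p]

# Inference by enumeration: P(g = True | e = True),
# marginalising out the unobserved phenotype p.
num = sum(joint(True, True, p) for p in (True, False))
den = sum(joint(g, True, p) for g in (True, False) for p in (True, False))
posterior = num / den
```

With three binary variables the full JPD has 7 free parameters; the factorised form here needs only 5 (one for g, two each for the conditionals), a saving that grows dramatically with network size.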

287 citations