Showing papers by "James Bailey published in 2016"


Journal ArticleDOI
TL;DR: This paper solves the key technical challenge of analytically computing the expected value and variance of generalized IT measures and proposes guidelines for using ARI and AMI as external validation indices.
Abstract: Adjusted for chance measures are widely used to compare partitions/clusterings of the same data set. In particular, the Adjusted Rand Index (ARI) based on pair-counting, and the Adjusted Mutual Information (AMI) based on Shannon information theory, are very popular in the clustering community. Nonetheless, it remains an open problem which application scenarios are best suited to each measure, and guidelines in the literature for their usage are sparse, with the result that users often resort to using both. Generalized Information Theoretic (IT) measures based on the Tsallis entropy have been shown to link pair-counting and Shannon IT measures. In this paper, we aim to bridge the gap between adjustment of measures based on pair-counting and measures based on information theory. We solve the key technical challenge of analytically computing the expected value and variance of generalized IT measures. This allows us to propose adjustments of generalized IT measures, which reduce to well-known adjusted clustering comparison measures as special cases. Using the theory of generalized IT measures, we are able to propose the following guidelines for using ARI and AMI as external validation indices: ARI should be used when the reference clustering has large, equal-sized clusters; AMI should be used when the reference clustering is unbalanced and there exist small clusters.
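As a brief illustration (not drawn from the paper itself), both indices are available off the shelf in scikit-learn, so the guideline above can be applied directly when validating a candidate clustering against a reference:

```python
from sklearn.metrics import adjusted_rand_score, adjusted_mutual_info_score

reference = [0, 0, 0, 0, 1, 1, 1, 1]   # reference (ground-truth) clustering
candidate = [0, 0, 1, 1, 1, 1, 2, 2]   # candidate clustering to validate

# ARI: preferred when the reference has large, equal-sized clusters.
print(adjusted_rand_score(reference, candidate))
# AMI: preferred when the reference is unbalanced and contains small clusters.
print(adjusted_mutual_info_score(reference, candidate))
```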

123 citations


Proceedings ArticleDOI
13 Aug 2016
TL;DR: This work proposes an efficient online algorithm that can incrementally track the CP decompositions of dynamic tensors with an arbitrary number of dimensions and shows not only significantly better decomposition quality, but also better performance in terms of stability, efficiency and scalability.
Abstract: Tensors are a natural representation for multidimensional data. In recent years, CANDECOMP/PARAFAC (CP) decomposition, one of the most popular tools for analyzing multi-way data, has been extensively studied and widely applied. However, today's datasets are often dynamically changing over time. Tracking the CP decomposition for such dynamic tensors is a crucial but challenging task, due to the large scale of the tensor and the velocity of new data arriving. Traditional techniques, such as Alternating Least Squares (ALS), cannot be directly applied to this problem because of their poor scalability in terms of time and memory. Additionally, existing online approaches have only partially addressed this problem and can only be deployed on third-order tensors. To fill this gap, we propose an efficient online algorithm that can incrementally track the CP decompositions of dynamic tensors with an arbitrary number of dimensions. In terms of effectiveness, our algorithm demonstrates comparable results with the most accurate algorithm, ALS, whilst being computationally much more efficient. Specifically, on small and moderate datasets, our approach is tens to hundreds of times faster than ALS, while for large-scale datasets, the speedup can be more than 3,000 times. Compared to other state-of-the-art online approaches, our method shows not only significantly better decomposition quality, but also better performance in terms of stability, efficiency and scalability.
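For orientation, here is a minimal sketch of the batch ALS baseline that the online algorithm is compared against, written with the TensorLy library; the library choice is an assumption of this example, it is not the authors' implementation, and the online tracking itself is not shown:

```python
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac

# A synthetic third-order tensor; the paper's online method also handles
# tensors with an arbitrary number of dimensions.
tensor = tl.tensor(np.random.rand(30, 40, 50))

# Batch CP decomposition via Alternating Least Squares (ALS).
weights, factors = parafac(tensor, rank=5, n_iter_max=200, tol=1e-8)

# Relative reconstruction error as a simple measure of decomposition quality.
approx = tl.cp_to_tensor((weights, factors))
print(tl.norm(tensor - approx) / tl.norm(tensor))
```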

113 citations


Journal ArticleDOI
TL;DR: This paper identifies a set of assumptions under which the original high-dimensional mutual information based criterion can be decomposed into a set of low-dimensional MI quantities, and proposes a theoretically sound approach to extend current mutual information based feature selection methods to handle high-order dependencies.
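As a hedged sketch of how a selection criterion can be assembled from low-dimensional MI terms, the following uses an mRMR-style greedy rule (relevance minus pairwise redundancy); it is illustrative only and is not the criterion or the set of assumptions proposed in the paper:

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif, mutual_info_regression

def greedy_mi_selection(X, y, k=5):
    # Greedy selection built from low-dimensional MI quantities:
    # relevance I(X_i; y) minus mean pairwise redundancy I(X_i; X_j).
    relevance = mutual_info_classif(X, y, random_state=0)
    selected, remaining = [], list(range(X.shape[1]))
    while remaining and len(selected) < k:
        scores = []
        for i in remaining:
            redundancy = np.mean(
                [mutual_info_regression(X[:, [j]], X[:, i], random_state=0)[0]
                 for j in selected]) if selected else 0.0
            scores.append(relevance[i] - redundancy)
        best = remaining[int(np.argmax(scores))]
        selected.append(best)
        remaining.remove(best)
    return selected

X = np.random.rand(300, 10)
y = (X[:, 0] + X[:, 3] > 1.0).astype(int)
print(greedy_mi_selection(X, y, k=3))   # features 0 and 3 should rank highly
```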

104 citations


Journal ArticleDOI
TL;DR: This paper investigates several open challenges faced by existing outlying aspects mining techniques and proposes novel solutions, including how to design effective scoring functions that are unbiased with respect to dimensionality yet remain computationally efficient, and how to efficiently search through the exponentially large search space of all possible subspaces.
Abstract: We address the problem of outlying aspects mining: given a query object and a reference multidimensional data set, how can we discover what aspects (i.e., subsets of features or subspaces) make the query object most outlying? Outlying aspects mining can be used to explain any data point of interest, which itself might be an inlier or outlier. In this paper, we investigate several open challenges faced by existing outlying aspects mining techniques and propose novel solutions, including (a) how to design effective scoring functions that are unbiased with respect to dimensionality yet remain computationally efficient, and (b) how to efficiently search through the exponentially large search space of all possible subspaces. We formalize the concept of dimensionality unbiasedness, a desirable property of outlyingness measures. We then characterize existing scoring measures as well as our novel proposed ones in terms of efficiency, dimensionality unbiasedness and interpretability. Finally, we evaluate the effectiveness of different methods for outlying aspects discovery and demonstrate the utility of our proposed approach on both large real and synthetic data sets.
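A minimal sketch of the search-and-score idea, assuming a simple kernel-density Z-score as the dimensionality-unbiased outlyingness measure and brute-force enumeration of low-dimensional subspaces; the paper's efficient search strategies are omitted:

```python
import numpy as np
from itertools import combinations
from scipy.stats import gaussian_kde

def outlying_aspects(X, query, max_dim=2, top_k=3):
    # Score each subspace by the Z-score of the query's density: unusually
    # low density relative to the bulk of points means "more outlying", and
    # the Z-score keeps scores comparable across subspace dimensionalities.
    results = []
    for r in range(1, max_dim + 1):
        for dims in combinations(range(X.shape[1]), r):
            kde = gaussian_kde(X[:, dims].T)
            dens = kde(X[:, dims].T)            # densities of all points
            q_dens = kde(query[list(dims)])[0]  # density of the query object
            z = (q_dens - dens.mean()) / dens.std()
            results.append((z, dims))
    return sorted(results)[:top_k]              # most negative Z-scores first

X = np.random.rand(500, 5)
query = np.array([0.5, 1.5, 0.5, -0.5, 0.5])    # unusual only in features 1 and 3
print(outlying_aspects(X, query))
```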

79 citations


Journal ArticleDOI
01 Nov 2016
TL;DR: A rich understanding is offered of how technology can support sense-making of personal sleep data: SleepExplorer organizes a flux of sleep data into sleep structure, guides sleep-tracking activities, highlights connections between sleep and contributing factors, and supports individuals in taking actions.
Abstract: Getting enough quality sleep is a key part of a healthy lifestyle. Many people are tracking their sleep through mobile and wearable technology, together with contextual information that may influence sleep quality, like exercise, diet, and stress. However, there is limited support to help people make sense of this wealth of data, i.e., to explore the relationship between sleep data and contextual data. We strive to bridge this gap between sleep-tracking and sense-making through the design of SleepExplorer, a web-based tool that helps individuals understand sleep quality through multi-dimensional sleep structure and explore correlations between sleep data and contextual information. Based on a two-week field study with 12 participants, this paper offers a rich understanding of how technology can support sense-making of personal sleep data: SleepExplorer organizes a flux of sleep data into sleep structure, guides sleep-tracking activities, highlights connections between sleep and contributing factors, and supports individuals in taking actions. We discuss challenges and opportunities to inform the work of researchers and designers creating data-driven health and well-being applications.

60 citations


Posted Content
TL;DR: This work identifies a new type of bias arising from the distribution of the ground truth (reference) partition against which candidate partitions are compared, names it GT bias, and presents the first extensive study of such a property for external cluster validity indices.
Abstract: It has been noticed that some external cluster validity indices (CVIs) exhibit a preferential bias towards a larger or smaller number of clusters which is monotonic (directly or inversely) in the number of clusters in candidate partitions. This type of bias is caused by the functional form of the CVI model. For example, the popular Rand index (RI) exhibits a monotone increasing (NCinc) bias, while the Jaccard Index (JI) suffers from a monotone decreasing (NCdec) bias. This type of bias has been previously recognized in the literature. In this work, we identify a new type of bias arising from the distribution of the ground truth (reference) partition against which candidate partitions are compared. We call this new type of bias ground truth (GT) bias. This type of bias occurs if a change in the reference partition causes a change in the bias status (e.g., NCinc, NCdec) of a CVI. For example, NCinc bias in the RI can be changed to NCdec bias by skewing the distribution of clusters in the ground truth partition. It is important for users to be aware of this new type of biased behaviour, since it may affect the interpretations of CVI results. The objective of this article is to study the empirical and theoretical implications of GT bias. To the best of our knowledge, this is the first extensive study of such a property for external cluster validity indices.
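For context, here is a small illustration of the baseline NC bias described above (not of GT bias itself): as random candidate partitions use more and more clusters, the Rand index tends to rise while the Jaccard index tends to fall.

```python
import numpy as np
from itertools import combinations

def pair_counts(truth, cand):
    # Count pairs: a = together in both, b = together only in truth,
    # c = together only in candidate, d = apart in both.
    a = b = c = d = 0
    for i, j in combinations(range(len(truth)), 2):
        same_t, same_c = truth[i] == truth[j], cand[i] == cand[j]
        if same_t and same_c: a += 1
        elif same_t: b += 1
        elif same_c: c += 1
        else: d += 1
    return a, b, c, d

rng = np.random.default_rng(0)
truth = rng.integers(0, 4, size=200)            # reference with 4 balanced clusters
for k in (2, 4, 8, 16):
    cand = rng.integers(0, k, size=200)         # random candidate with k clusters
    a, b, c, d = pair_counts(truth, cand)
    ri = (a + d) / (a + b + c + d)
    ji = a / (a + b + c)
    print(k, round(ri, 3), round(ji, 3))        # RI drifts up, JI drifts down
```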

33 citations


Posted Content
TL;DR: A new parametrisation of the transition matrix is presented which allows efficient training of an RNN while ensuring that the matrix is always orthogonal, and gives similar benefits to the unitary constraint, without the time complexity limitations.
Abstract: The problem of learning long-term dependencies in sequences using Recurrent Neural Networks (RNNs) is still a major challenge. Recent methods have been suggested to solve this problem by constraining the transition matrix to be unitary during training which ensures that its norm is equal to one and prevents exploding gradients. These methods either have limited expressiveness or scale poorly with the size of the network when compared with the simple RNN case, especially when using stochastic gradient descent with a small mini-batch size. Our contributions are as follows: we first show that constraining the transition matrix to be unitary is a special case of an orthogonal constraint. Then we present a new parametrisation of the transition matrix which allows efficient training of an RNN while ensuring that the matrix is always orthogonal. Our results show that the orthogonal constraint on the transition matrix applied through our parametrisation gives similar benefits to the unitary constraint, without the time complexity limitations.
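A hedged sketch of one generic way to keep a transition matrix exactly orthogonal throughout training: parametrise it by an unconstrained skew-symmetric matrix and map it through the Cayley transform. This is a standard construction used here for illustration, not necessarily the parametrisation proposed in the paper:

```python
import numpy as np

def orthogonal_from_params(params, n):
    # Build a skew-symmetric matrix A from unconstrained parameters, then
    # map it to an orthogonal matrix via the Cayley transform:
    # W = (I - A)(I + A)^{-1} is orthogonal for any skew-symmetric A.
    A = np.zeros((n, n))
    A[np.triu_indices(n, k=1)] = params
    A = A - A.T
    I = np.eye(n)
    return (I - A) @ np.linalg.inv(I + A)

n = 6
W = orthogonal_from_params(np.random.randn(n * (n - 1) // 2), n)
print(np.allclose(W.T @ W, np.eye(n)))   # True: W is orthogonal by construction
```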

27 citations


Proceedings Article
09 Jul 2016
TL;DR: This paper proposes Elliptical Summary Randomisation (ESRand), an efficient domain generalisation approach comprising a randomised kernel and elliptical data summarisation, which learns a domain interdependent projection to a latent subspace that minimises the existing biases in the data while maintaining the functional relationship between domains.
Abstract: Many conventional statistical machine learning algorithms generalise poorly if distribution bias exists in the datasets. For example, distribution bias arises in the context of domain generalisation, where knowledge acquired from multiple source domains needs to be used in previously unseen target domains. We propose Elliptical Summary Randomisation (ESRand), an efficient domain generalisation approach that comprises a randomised kernel and elliptical data summarisation. ESRand learns a domain interdependent projection to a latent subspace that minimises the existing biases in the data while maintaining the functional relationship between domains. In the latent subspace, ellipsoidal summaries replace the samples to enhance the generalisation by further removing bias and noise in the data. Moreover, the summarisation enables large-scale data processing by significantly reducing the size of the data. Through comprehensive analysis, we show that our subspace-based approach outperforms state-of-the-art results on several activity recognition benchmark datasets, while keeping the computational complexity significantly low.

26 citations


Proceedings ArticleDOI
01 Dec 2016
TL;DR: This paper develops a dependency measure between variables based on an extreme-value theoretic treatment of intrinsic dimensionality, and theoretically proves a connection between information theory and intrinsic dimensionality theory.
Abstract: Measuring the amount of dependency among multiple variables is an important task in pattern recognition. In the last few years, many new dependency measures have been developed for the exploration of functional relationships. In this paper, we develop a dependency measure between variables based on an extreme-value theoretic treatment of intrinsic dimensionality. Our measure identifies variables with low intrinsic dimension — that is, those that support embeddings of the data within low-dimensional manifolds. To build a dependency measure on strong foundations, we theoretically prove a connection between information theory and intrinsic dimensionality theory. This also allows us to propose novel estimators of intrinsic dimensionality. Finally, we show that our dependency measure enables us to find patterns that cannot be found by other state-of-the-art measures on real and synthetic data.
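A small sketch of a nearest-neighbour maximum-likelihood estimator of intrinsic dimensionality (in the Levina-Bickel / Hill style), the kind of extreme-value-theoretic quantity such a measure builds on; the estimators actually proposed in the paper may differ:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def mle_intrinsic_dim(X, k=10):
    # Per-point MLE of intrinsic dimension from ratios of k-NN distances.
    dist, _ = NearestNeighbors(n_neighbors=k + 1).fit(X).kneighbors(X)
    dist = dist[:, 1:]                               # drop the self-distance
    log_ratios = np.log(dist[:, -1][:, None] / dist[:, :-1])
    return (k - 1) / log_ratios.sum(axis=1)

X = np.random.rand(2000, 2)                          # data on a 2-D manifold
print(mle_intrinsic_dim(X).mean())                   # close to 2
```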

24 citations


Proceedings ArticleDOI
01 Dec 2016
TL;DR: An optimization scheduling algorithm is proposed that maximizes profits for AaaS platforms and guarantees SLAs for query requests, performing significantly better in cost saving and profit enhancement than state-of-the-art scheduling algorithms.
Abstract: The value that can be extracted from big data greatly motivates organizations to explore data analytics technologies for better decision making and problem solving in a wide range of application domains. Cloud computing greatly eases and benefits big data analytics by offering on-demand and scalable computing infrastructures, platforms, and applications as services. Big data Analytics-as-a-Service (AaaS) platforms aim to deliver data analytics as consumable services in cloud computing environments in a pay-as-you-go model with Service Level Agreement (SLA) guarantees. Resource scheduling for AaaS platforms is significant as big data analytics requires large-scale computing, which can consume huge amounts of resources and incur high resource costs. Our research focuses on proposing automatic and scalable resource scheduling algorithms to maximize the profits for AaaS platforms while delivering AaaS services to users with SLA guarantees on budgets and deadlines to allow timely responses with controllable costs. In this paper, we model and formulate the profit optimization resource scheduling problem and propose an optimization scheduling algorithm that maximizes profits for AaaS platforms and guarantees SLAs for query requests. Experimental evaluations show that the profit optimization scheduling algorithm performs significantly better in cost saving and profit enhancement compared to the state-of-the-art scheduling algorithms.

Proceedings ArticleDOI
01 Dec 2016
TL;DR: This paper shows how robust neural networks can be trained using random projection: while random projection acts as a strong regularizer, boosting model accuracy similarly to other regularizers, it yields networks far more robust to adversarial noise and fooling samples.
Abstract: Regularization plays an important role in machine learning systems. We propose a novel methodology for model regularization using random projection. We demonstrate the technique on neural networks, since such models usually comprise a very large number of parameters, calling for strong regularizers. It has been shown recently that neural networks are sensitive to two kinds of samples: (i) adversarial samples, which are generated by imperceptible perturbations of previously correctly-classified samples—yet the network will misclassify them; and (ii) fooling samples, which are completely unrecognizable, yet the network will classify them with extremely high confidence. In this paper, we show how robust neural networks can be trained using random projection. We show that while random projection acts as a strong regularizer, boosting model accuracy similarly to other regularizers such as weight decay and dropout, it yields networks that are far more robust to adversarial noise and fooling samples. We further show that random projection also helps to improve the robustness of traditional classifiers, such as Random Forests and Gradient Boosting Machines.
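A hedged sketch of the idea applied to a traditional classifier: project the inputs through a Gaussian random projection before gradient boosting. This uses standard scikit-learn components for illustration; the paper's neural-network training procedure is not reproduced here:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.pipeline import make_pipeline
from sklearn.random_projection import GaussianRandomProjection

X, y = make_classification(n_samples=500, n_features=100, random_state=0)

# The random projection compresses the input to a lower-dimensional random
# subspace, acting as a regularizer for the downstream classifier.
model = make_pipeline(GaussianRandomProjection(n_components=30, random_state=0),
                      GradientBoostingClassifier(random_state=0))
model.fit(X, y)
print(model.score(X, y))
```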

Proceedings ArticleDOI
30 Jun 2016
TL;DR: In this paper, the authors propose a framework to adjust dependency measure estimates on finite samples, which are helpful in improving interpretability when quantifying dependency and in improving accuracy on the task of ranking dependencies.
Abstract: Estimating the strength of dependency between two variables is fundamental for exploratory analysis and many other applications in data mining. For example: non-linear dependencies between two continuous variables can be explored with the Maximal Information Coefficient (MIC); and categorical variables that are dependent on the target class are selected using Gini gain in random forests. Nonetheless, because dependency measures are estimated on finite samples, the interpretability of their quantification and the accuracy when ranking dependencies become challenging. Dependency estimates are not equal to 0 when variables are independent, cannot be compared if computed on different sample sizes, and they are inflated by chance on variables with more categories. In this paper, we propose a framework to adjust dependency measure estimates on finite samples. Our adjustments, which are simple and applicable to any dependency measure, are helpful in improving interpretability when quantifying dependency and in improving accuracy on the task of ranking dependencies. In particular, we demonstrate that our approach enhances the interpretability of MIC when used as a proxy for the amount of noise between variables, and improves accuracy when ranking variables during the splitting procedure in random forests.
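A minimal sketch of the adjustment idea: compare a raw dependency estimate against its permutation null, subtracting the null mean (the inflation expected by chance) and dividing by the null standard deviation. The measure, permutation count and standardisation choice are illustrative assumptions, not the paper's closed-form adjustments:

```python
import numpy as np
from sklearn.metrics import mutual_info_score

def adjusted_dependency(x, y, measure=mutual_info_score, n_perm=500, seed=0):
    # Estimate the dependency, then the null distribution of the same measure
    # under random permutations of y (i.e., under independence).
    rng = np.random.default_rng(seed)
    est = measure(x, y)
    null = np.array([measure(x, rng.permutation(y)) for _ in range(n_perm)])
    return (est - null.mean()) / null.std()

rng = np.random.default_rng(1)
x = rng.integers(0, 3, size=200)
y = rng.integers(0, 5, size=200)        # independent of x
print(adjusted_dependency(x, y))        # fluctuates around 0, unlike raw MI
```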

Journal ArticleDOI
TL;DR: This paper examines four methods for completing the input data with imputed values before imaging and chooses a best method using contaminated versions of the complete Iris data, for which the desired results are known.
Abstract: The iVAT (asiVAT) algorithms reorder symmetric (asymmetric) dissimilarity data so that an image of the data may reveal cluster substructure. Images formed from incomplete data don't offer a very rich interpretation of cluster structure. In this paper, we examine four methods for completing the input data with imputed values before imaging. We choose a best method using contaminated versions of the complete Iris data, for which the desired results are known. Then, we analyze two real world data sets from social networks that are incomplete using the best imputation method chosen in the juried trials with Iris: (i) Sampson's monastery data, an incomplete, asymmetric relation matrix; and (ii) the karate club data, comprising a symmetric similarity matrix that is about 86 percent incomplete.

Proceedings ArticleDOI
01 Jan 2016
TL;DR: The first unsupervised tensorial anomaly detection method, the One-class Support Tensor Machine (1STM), is presented along with a randomised version of the method; 1STM preserves the multiway structure of tensor data while achieving significant improvements in accuracy and efficiency over conventional vectorised methods.
Abstract: Identifying unusual or anomalous patterns in an underlying dataset is an important but challenging task in many applications. The focus of the unsupervised anomaly detection literature has mostly been on vectorised data. However, many applications are more naturally described using higher-order tensor representations. Approaches that vectorise tensorial data can destroy the structural information encoded in the high-dimensional space, and lead to the problem of the curse of dimensionality. In this paper we present the first unsupervised tensorial anomaly detection method, along with a randomised version of our method. Our anomaly detection method, the One-class Support Tensor Machine (1STM), is a generalisation of conventional one-class Support Vector Machines to higher-order spaces. 1STM preserves the multiway structure of tensor data, while achieving significant improvement in accuracy and efficiency over conventional vectorised methods. We then leverage the theory of nonlinear random projections to propose the Randomised 1STM (R1STM). Our empirical analysis on several real and synthetic datasets shows that our R1STM algorithm delivers comparable or better accuracy to a state-of-the-art deep learning method and traditional kernelised approaches for anomaly detection, while being approximately 100 times faster in training and testing.
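For contrast, a sketch of the conventional vectorised baseline that 1STM generalises: each tensor sample is flattened and fed to a standard one-class SVM. This illustrates the approach the paper improves on, not 1STM itself:

```python
import numpy as np
from sklearn.svm import OneClassSVM

X = np.random.rand(200, 8, 8, 4)          # 200 samples, each a third-order tensor
X_vec = X.reshape(len(X), -1)             # vectorisation discards the multiway structure

ocsvm = OneClassSVM(kernel="rbf", nu=0.05).fit(X_vec)
scores = ocsvm.decision_function(X_vec)   # lower scores indicate likely anomalies
print(scores[:5])
```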

Proceedings ArticleDOI
02 Nov 2016
TL;DR: This paper introduces a method of automatically segmenting a surgical trajectory into steps and shows that it is accurate and efficient, and proposes a pre-processing step that uses domain knowledge specific to the application to reduce the solution space.
Abstract: One of the roadblocks to the wide-spread use of virtual reality simulation as a surgical training platform is the need for expert supervision during training to ensure proper skill acquisition. To fully utilize the capacity of virtual reality in surgical training, it is imperative that the guidance process is automated. In this paper, we discuss a method of providing one aspect of performance guidance: advice on the steps of a surgery or procedural guidance. We manually segment the surgical trajectory of an expert surgeon into steps and present them one at a time to guide trainees through a surgical procedure. We show, using a randomized controlled trial, that this form of guidance is effective in moving trainee behavior towards an expert ideal. To support practice variation and different surgical styles adopted by experts, separate guidance templates have to be generated. To enable this, we introduce a method of automatically segmenting a surgical trajectory into steps. We propose a pre-processing step that uses domain knowledge specific to our application to reduce the solution space. We show how this can be incorporated into existing trajectory segmentation methods, as well as a greedy approach that we propose. We compare this segmentation method to existing techniques and show that it is accurate and efficient.

Proceedings ArticleDOI
30 Jun 2016
TL;DR: LCC is an iterative optimization procedure which incorporates dynamic penalties for violated constraints, and experiments show that LCC can outperform existing constrained clustering algorithms in scenarios where satisfying as many constraints as possible is required.
Abstract: Incorporating background knowledge in clustering problems has attracted wide interest. This knowledge can be represented as pairwise instance-level constraints. Existing techniques approach satisfaction of such constraints from a soft (discretionary) perspective, yet there exist scenarios for constrained clustering where satisfying as many constraints as possible is required. We present a new Lagrangian Constrained Clustering framework (LCC) for clustering in the presence of pairwise constraints which gives high priority to satisfying constraints. LCC is an iterative optimization procedure which incorporates dynamic penalties for violated constraints. Experiments show that LCC can outperform existing constrained clustering algorithms in scenarios where satisfying as many constraints as possible is required.

Journal ArticleDOI
TL;DR: CSMiner is presented, a mining method that uses kernel density estimation in conjunction with various pruning techniques and is experimentally investigated on a range of data sets, evaluating its efficiency, effectiveness, and stability and demonstrating it is substantially faster than a baseline method.
Abstract: We tackle the novel problem of mining contrast subspaces. Given a set of multidimensional objects in two classes C+ and C- and a query object o, we want to find the top-k subspaces that maximize the ratio of likelihood of o in C+ against that in C-. Such subspaces are very useful for characterizing an object and explaining how it differs between two classes. We demonstrate that this problem has important applications, and, at the same time, is very challenging, being MAX SNP-hard. We present CSMiner, a mining method that uses kernel density estimation in conjunction with various pruning techniques. We experimentally investigate the performance of CSMiner on a range of data sets, evaluating its efficiency, effectiveness, and stability and demonstrating it is substantially faster than a baseline method.
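A hedged sketch of the likelihood-contrast idea: score each candidate subspace by the ratio of kernel density estimates of the query object under the two classes. The pruning techniques that make CSMiner practical are omitted, and all names here are illustrative:

```python
import numpy as np
from itertools import combinations
from scipy.stats import gaussian_kde

def contrast_subspaces(X_pos, X_neg, query, max_dim=2, top_k=3):
    # Likelihood ratio of the query in C+ versus C-, computed per subspace.
    scores = []
    for r in range(1, max_dim + 1):
        for dims in combinations(range(X_pos.shape[1]), r):
            kde_pos = gaussian_kde(X_pos[:, dims].T)
            kde_neg = gaussian_kde(X_neg[:, dims].T)
            q = query[list(dims)]
            ratio = kde_pos(q)[0] / max(kde_neg(q)[0], 1e-12)
            scores.append((ratio, dims))
    return sorted(scores, reverse=True)[:top_k]

rng = np.random.default_rng(0)
X_pos = rng.normal(0.0, 1.0, size=(300, 4))
X_neg = rng.normal(0.5, 1.0, size=(300, 4))
query = np.array([-1.0, 0.0, 0.0, 0.0])
print(contrast_subspaces(X_pos, X_neg, query))
```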

Proceedings Article
12 Feb 2016
TL;DR: In this article, the authors adapt topic models to the psychometric testing of MOOC students based on their online forum postings to infer latent skill levels by relating them to individuals' observed responses on a series of items such as quiz questions.
Abstract: This paper adapts topic models to the psychometric testing of MOOC students based on their online forum postings. Measurement theory from education and psychology provides statistical models for quantifying a person's attainment of intangible attributes such as attitudes, abilities or intelligence. Such models infer latent skill levels by relating them to individuals' observed responses on a series of items such as quiz questions. The set of items can be used to measure a latent skill if individuals' responses on them conform to a Guttman scale. Such well-scaled items differentiate between individuals, and inferred levels span the entire range from the most basic to the most advanced. In practice, education researchers manually devise items (quiz questions) while optimising well-scaled conformance. Due to the costly nature and expert requirements of this process, psychometric testing has found limited use in everyday teaching. We aim to develop usable measurement models for highly-instrumented MOOC delivery platforms, by using participation in automatically-extracted online forum topics as items. The challenge is to formalise the Guttman scale educational constraint and incorporate it into topic models. To favour topics that automatically conform to a Guttman scale, we introduce a novel regularisation into non-negative matrix factorisation-based topic modelling. We demonstrate the suitability of our approach with both quantitative experiments on three Coursera MOOCs, and with a qualitative survey of topic interpretability on two MOOCs by domain expert interviews.

Book ChapterDOI
19 Sep 2016
TL;DR: This paper addresses the problem of anomaly detection in time-evolving graphs, where graphs are a natural representation for data in many types of applications, and proposes a pre-processing step, applied before any further analysis, that permutes the rows and columns of the adjacency matrix.
Abstract: Anomaly detection is a vital task for maintaining and improving any dynamic system. In this paper, we address the problem of anomaly detection in time-evolving graphs, where graphs are a natural representation for data in many types of applications. A key challenge in this context is how to process large volumes of streaming graphs. We propose a pre-processing step before running any further analysis on the data, where we permute the rows and columns of the adjacency matrix. This pre-processing step expedites graph mining techniques such as anomaly detection, PageRank, or graph coloring. In this paper, we focus on detecting anomalies in a sequence of graphs based on rank correlations of the reordered nodes. The merits of our approach lie in its simplicity and resilience to challenges such as unsupervised input, large volumes and high velocities of data. We evaluate the scalability and accuracy of our method on real graphs, where our method facilitates graph processing while producing more deterministic orderings. We show that the proposed approach is capable of revealing anomalies in a more efficient manner based on node rankings. Furthermore, our method can produce visual representations of graphs that are useful for graph compression.
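A small sketch of the rank-correlation idea, assuming each snapshot yields a score (or ranking) per node, aligned by node across snapshots; the node-ordering procedure, the scores and the threshold here are illustrative assumptions:

```python
import numpy as np
from scipy.stats import spearmanr

def snapshot_anomalies(node_scores, threshold=0.5):
    # Flag a transition as anomalous when the node ranking of one snapshot
    # correlates poorly with the ranking of the previous snapshot.
    flags = []
    for prev, curr in zip(node_scores, node_scores[1:]):
        rho, _ = spearmanr(prev, curr)
        flags.append(rho < threshold)
    return flags

rng = np.random.default_rng(0)
base = rng.random(100)                              # baseline per-node scores
snapshots = [base + 0.01 * rng.random(100) for _ in range(5)]
snapshots[3] = rng.random(100)                      # one disrupted snapshot
print(snapshot_anomalies(snapshots))                # flags transitions around it
```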

Proceedings Article
01 Jan 2016
TL;DR: A PageRank-based temporal influence ranking model (TIR) is proposed to identify influential users; experiments demonstrate that TIR performs better and is more stable than existing models in global influence ranking and friend recommendation.
Abstract: With the growing popularity of online social media, identifying influential users in these social networks has become very popular. Existing works have studied user attributes, network structure and user interactions when measuring user influence. In contrast to these works, we focus on user behavioural characteristics. We investigate the temporal dynamics of user activity patterns and how these patterns affect user interactions. We assimilate such characteristics into a PageRank-based temporal influence ranking model (TIR) to identify influential users. The transition probability in TIR is predicted by a logistic regression model, and the random walk is biased according to users' temporal activity patterns. Experiments demonstrate that TIR has better performance and is more stable than the existing models in global influence ranking and friend recommendation.

Proceedings ArticleDOI
25 Apr 2016
TL;DR: Two research teams from the University of Melbourne's Learning Analytics Research Group used validation as applied in educational measurement to provide a framework for collaboration on defining and building measures of learning capability of MOOC participants.
Abstract: Two research teams from the University of Melbourne's Learning Analytics Research Group used validation as applied in educational measurement to provide a framework for collaboration. One team was focussed on defining and building measures of learning capability of MOOC participants, and the other on using topic modelling to discover topics in MOOC forums. The collaboration explored the suitability of items discovered from MOOC forums using topic modelling as measures of learning capability of participants in MOOCs.


Proceedings ArticleDOI
28 Nov 2016
TL;DR: A novel algorithm is developed to automatically detect user POIs in near real time at room-level accuracy using only lightweight ambient environment sensors, and it outperforms existing approaches such as Google place search and Foursquare venue search.
Abstract: The advancement of sensor equipped smartphones provides tremendous opportunities for fine-grained monitoring of user Points of Interest (POIs) for a range of mobile applications such as place-based advertisement, personalised healthcare services, and location-based social networks. Existing systems, however, cannot infer both indoor and outdoor POIs using a single approach and typically require a mix of technologies for localization such as GPS, GSM or Wi-Fi. The accuracy of these techniques depends on the availability of local infrastructure, and they can normally retrieve POIs only at a coarse level, for example at the level of a building or region. We develop a novel algorithm to automatically detect user POIs in near real time at room-level accuracy using only lightweight ambient environment sensors. Our method can infer both indoor and outdoor POIs at a fine granularity without depending on local infrastructure or without using GPS or Wi-Fi. It works in an unsupervised manner using covariances of ambient sensor data to detect user visits to POIs. An experimental study with real-world data shows that our system can achieve an F1 score of approximately 80% for the top 3 retrieved locations and outperforms existing approaches such as Google place search and Foursquare venue search.


Book ChapterDOI
01 Jan 2016
TL;DR: This chapter uses dynamical system theory to develop a mechanistic mean field model for neural activity to study the anesthetic cascade and addresses the more general question of synchronization of neural activity without mean field assumptions.
Abstract: With the advances in biochemistry, molecular biology, and neurochemistry there has been impressive progress in understanding the molecular properties of anesthetic agents. However, there has been little focus on how the molecular properties of anesthetic agents lead to the observed macroscopic property that defines the anesthetic state—that is, lack of responsiveness to noxious stimuli. In this chapter we use dynamical system theory to develop a mechanistic mean field model for neural activity to study the anesthetic cascade. The proposed synaptic drive firing rate model predicts the conscious-unconscious transition as the implied anesthetic concentration increases, where excitatory neural activity is characterized by a Poincare-Andronov-Hopf bifurcation with the awake state transitioning to a stable limit cycle and then subsequently to an asymptotically stable unconscious equilibrium state. Furthermore, we address the more general question of synchronization of neural activity without mean field assumptions. We do this by focusing on a postulated subset of inhibitory neurons that are not themselves connected to other inhibitory neurons. Finally, several numerical experiments are presented to illustrate the different aspects of the proposed theory.

Posted Content
TL;DR: This paper combines the Rasch model with non-negative matrix factorisation based topic modelling, jointly fitting both models, and demonstrates the suitability of this approach with quantitative experiments on data from three Coursera MOOCs, and with qualitative survey results on topic interpretability on a Discrete Optimisation MOOC.
Abstract: This paper explores the suitability of using automatically discovered topics from MOOC discussion forums for modelling students' academic abilities. The Rasch model from psychometrics is a popular generative probabilistic model that relates latent student skill, latent item difficulty, and observed student-item responses within a principled, unified framework. According to scholarly educational theory, discovered topics can be regarded as appropriate measurement items if (1) students' participation across the discovered topics is well fit by the Rasch model, and if (2) the topics are interpretable to subject-matter experts as being educationally meaningful. Such Rasch-scaled topics, with associated difficulty levels, could be of potential benefit to curriculum refinement, student assessment and personalised feedback. The technical challenge that remains, is to discover meaningful topics that simultaneously achieve good statistical fit with the Rasch model. To address this challenge, we combine the Rasch model with non-negative matrix factorisation based topic modelling, jointly fitting both models. We demonstrate the suitability of our approach with quantitative experiments on data from three Coursera MOOCs, and with qualitative survey results on topic interpretability on a Discrete Optimisation MOOC.
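For reference, the Rasch response model at the core of this approach can be sketched in a few lines; the joint fit with non-negative matrix factorisation topic modelling is the paper's contribution and is not reproduced here:

```python
import numpy as np

def rasch_prob(theta, b):
    # Rasch model: probability that a student with latent ability theta
    # responds positively to an item (here, a forum topic) of difficulty b.
    return 1.0 / (1.0 + np.exp(-(theta - b)))

theta = np.array([-1.0, 0.0, 1.5])                  # three students' abilities
b = np.array([-0.5, 1.0])                           # two items' difficulties
print(rasch_prob(theta[:, None], b[None, :]))       # 3x2 response probabilities
```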