Author
Eric Järpe
Other affiliations: University of Gothenburg
Bio: Eric Järpe is an academic researcher from Halmstad University. The author has contributed to research in topics: Steganography & Markov renewal process. The author has an h-index of 8 and has co-authored 25 publications receiving 198 citations. Previous affiliations of Eric Järpe include University of Gothenburg.
Papers
TL;DR: Results show that spatial and temporal deviations can be revealed through analysis of a 2D map of high dimensional data and it is demonstrated that such a map is stable in terms of the number of clusters formed.
Abstract: A new approach to modelling human behaviour patterns in smart homes is presented. We examine detection of deviating human behaviour patterns such as falls. We analyse deviations in space, time and transitions between behaviour patterns. Spatial and temporal deviations can be found through analysis of a 2D map of data. A system for detecting deviating human behaviour in a smart home environment is the long-term goal of this work. Clearly, such systems will be very important in ambient assisted living services. A new approach to modelling human behaviour patterns is suggested in this paper. The approach reveals promising results in unsupervised modelling of human behaviour and detection of deviations by using such a model. Human behaviour/activity in a short time interval is represented in a novel fashion by responses of simple non-intrusive sensors. Deviating behaviour is revealed through data clustering and analysis of associations between clusters and data vectors representing adjacent time intervals (analysing transitions between clusters). To obtain clusters of human behaviour patterns, first, a random forest is trained without using beforehand defined teacher signals. Then information collected in the random forest data proximity matrix is mapped onto the 2D space and data clusters are revealed there by agglomerative clustering. Transitions between clusters are modelled by the third order Markov chain. Three types of deviations are considered: deviation in time, deviation in space and deviation in the transition between clusters of similar behaviour patterns. The proposed modelling approach does not make any assumptions about the position, type, and relationship of sensors but is nevertheless able to successfully create and use a model for deviation detection; this is claimed as a significant result in the area of expert and intelligent systems. Results show that spatial and temporal deviations can be revealed through analysis of a 2D map of high dimensional data. It is demonstrated that such a map is stable in terms of the number of clusters formed. We show that the data clusters can be understood/explored by finding the most important variables and by analysing the structure of the most representative tree.
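The transition-modelling step in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the cluster labels, the function names and the 0.05 threshold are all hypothetical, and the random forest and clustering stages are assumed to have already produced one cluster label per time interval. The sketch estimates third-order transition frequencies and flags transitions that were rare or unseen in the training sequence.

```python
from collections import Counter, defaultdict


def fit_markov(sequence, order=3):
    """Count transitions from each length-`order` context to the next label."""
    counts = defaultdict(Counter)
    for i in range(order, len(sequence)):
        context = tuple(sequence[i - order:i])
        counts[context][sequence[i]] += 1
    return counts


def transition_prob(counts, context, nxt):
    """Relative frequency of `nxt` after `context`; 0.0 for unseen contexts."""
    seen = counts.get(tuple(context))
    if not seen:
        return 0.0
    return seen[nxt] / sum(seen.values())


def flag_deviations(counts, sequence, order=3, threshold=0.05):
    """Return indices whose transition probability falls below `threshold`."""
    flagged = []
    for i in range(order, len(sequence)):
        p = transition_prob(counts, sequence[i - order:i], sequence[i])
        if p < threshold:
            flagged.append(i)
    return flagged


# Hypothetical cluster labels for consecutive time intervals.
normal = ["sleep", "kitchen", "living", "kitchen"] * 20
model = fit_markov(normal)
observed = ["sleep", "kitchen", "living", "sleep", "kitchen", "living", "kitchen"]
print(flag_deviations(model, observed))
```

Treating an unseen context as probability 0.0 is a deliberate simplification; a practical system would probably smooth the estimates so that a short training sequence does not flag every new context.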
40 citations
01 Oct 2016
TL;DR: By renaming the system tool that handles shadow copies it is possible to recover from infections from all four of the most common Crypto Ransomwares.
Abstract: Extortion using digital platforms is an increasing form of crime. A commonly seen problem is extortion in the form of an infection of a Crypto Ransomware that encrypts the files of the target and demands a ransom to recover the locked data. By analyzing the four most common Crypto Ransomwares at the time of writing, a clear vulnerability is identified: all infections rely on tools available on the target system to prevent a simple recovery after the attack has been detected. By renaming the system tool that handles shadow copies it is possible to recover from infections from all four of the most common Crypto Ransomwares. The solution is packaged in a single, easy to use script.
36 citations
TL;DR: The interaction parameter of the Ising model is considered in order to detect changes of spatial patterns.
Abstract: Surveillance to detect changes of spatial patterns is of interest in many areas such as environmental control and regional analysis. Here the interaction parameter of the Ising model is considered ...
34 citations
08 Sep 2014
TL;DR: Three methods that can be used to produce p-values are compared: one class support vector machine (OCSVM), conformal anomaly detection (CAD), and a simple "most central pattern" (MCP) algorithm. All three give reasonable results on the real life data set but have clear strengths and weaknesses on the synthetic data sets.
Abstract: Deviation detection is important for self-monitoring systems. To perform deviation detection well requires methods that, given only "normal" data from a distribution of unknown parametric form, can produce a reliable statistic for rejecting the null hypothesis, i.e. evidence for deviating data. One measure of the strength of this evidence based on the data is the p-value, but few deviation detection methods utilize p-value estimation. We compare three methods that can be used to produce p-values: one class support vector machine (OCSVM), conformal anomaly detection (CAD), and a simple "most central pattern" (MCP) algorithm. The SVM and the CAD method should be able to handle a distribution of any shape. The methods are evaluated on synthetic data sets to test and illustrate their strengths and weaknesses, and on data from a real life self-monitoring scenario with a city bus fleet in normal traffic. The OCSVM has a Gaussian kernel for the synthetic data and a Hellinger kernel for the empirical data. The MCP method uses the Mahalanobis metric for the synthetic data and the Hellinger metric for the empirical data. The CAD uses the same metrics as the MCP method and has a k-nearest neighbour (kNN) non-conformity measure for both sets. The conclusion is that all three methods give reasonable, and quite similar, results on the real life data set but that they have clear strengths and weaknesses on the synthetic data sets. The MCP algorithm is quick and accurate when the "normal" data distribution is unimodal and symmetric (with the chosen metric) but not otherwise. The OCSVM is a bit cumbersome to use to create (quantized) p-values but is accurate and reliable when the data distribution is multimodal and asymmetric. The CAD is also accurate for multimodal and asymmetric distributions. The experiment on the vehicle data illustrates how algorithms like these can be used in a self-monitoring system that uses a fleet of vehicles to conduct deviation detection without supervision and without prior knowledge about what is being monitored.
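As a rough illustration of how a conformal anomaly detector with a kNN non-conformity measure can yield a p-value, the sketch below scores a point by its mean distance to its k nearest "normal" samples and compares that score with leave-one-out scores of the normal data. It rests on several assumptions: the Euclidean metric is swapped in for the Mahalanobis and Hellinger metrics named in the abstract, and the function names, the value k=3 and the grid data are hypothetical rather than taken from the paper.

```python
import math


def knn_score(point, reference, k=3):
    """Non-conformity: mean Euclidean distance to the k nearest reference points."""
    dists = sorted(math.dist(point, r) for r in reference)
    return sum(dists[:k]) / k


def conformal_p_value(point, normal_data, k=3):
    """Share of samples (test point included) at least as non-conforming as `point`."""
    # Leave-one-out scores, so no normal sample is compared with itself.
    scores = [
        knn_score(x, normal_data[:i] + normal_data[i + 1:], k)
        for i, x in enumerate(normal_data)
    ]
    test_score = knn_score(point, normal_data, k)
    at_least = sum(1 for s in scores if s >= test_score)
    return (at_least + 1) / (len(normal_data) + 1)


# Hypothetical 2D "normal" data on a 5 x 5 grid.
normal_data = [(float(x), float(y)) for x in range(5) for y in range(5)]
print(conformal_p_value((2.0, 2.0), normal_data))    # central point
print(conformal_p_value((10.0, 10.0), normal_data))  # far from the grid
```

A small p-value is evidence against the point coming from the "normal" distribution. With n normal samples the smallest attainable value is 1/(n + 1), so the p-values are quantized by the sample size.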
20 citations
23 Mar 2015
TL;DR: Results suggest that the proposed approach increases the realism of simulated data; however, results also indicate that improvements could be achieved using the geometric distribution as a model for the number of PIR events during a time interval.
Abstract: Development, testing and validation of algorithms for smart home applications are often complex, expensive and tedious processes. Research on simulation of resident activity patterns in Smart Homes is an active research area and facilitates development of algorithms of smart home applications. However, the simulation of passive infrared (PIR) sensors is often used in a static fashion by generating equidistant events while an intended occupant is within sensor proximity. This paper suggests the combination of avatar-based control and probabilistic sampling in order to increase realism of the simulated data. The number of PIR events during a time interval is assumed to be Poisson distributed and this assumption is used in the simulation of Smart Home data. Results suggest that the proposed approach increases the realism of simulated data; however, results also indicate that improvements could be achieved using the geometric distribution as a model for the number of PIR events during a time interval.
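The Poisson assumption described here can be sketched as follows. This is only an illustration under stated assumptions: the rate of 2.0 events per interval, the presence pattern and the function names are hypothetical, and the avatar-based control from the paper is reduced to a list of booleans saying whether the occupant is within sensor range.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw a Poisson(rate) count using Knuth's multiplication method."""
    limit = math.exp(-rate)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_pir_counts(presence, rate, rng):
    """Events per interval: Poisson-distributed while the occupant is in range, else 0."""
    return [sample_poisson(rate, rng) if present else 0 for present in presence]


rng = random.Random(42)  # fixed seed so the run is repeatable
presence = [True] * 5 + [False] * 3 + [True] * 2
counts = simulate_pir_counts(presence, rate=2.0, rng=rng)
print(counts)
```

Unlike equidistant event generation, the count now varies from interval to interval. Replacing `sample_poisson` with a geometric sampler would be the natural way to try the alternative model the abstract mentions.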
16 citations
Cited by
2,730 citations
TL;DR: By using a space–time scan statistic, a system for regular time periodic disease surveillance to detect any currently ‘active’ geographical clusters of disease and which tests the statistical significance of such clusters adjusting for the multitude of possible geographical locations and sizes, time intervals and time periodic analyses is proposed.
Abstract: Most disease registries are updated at least yearly. If a geographically localized health hazard suddenly occurs, we would like to have a surveillance system in place that can pick up a new geographical disease cluster as quickly as possible, irrespective of its location and size. At the same time, we want to minimize the number of false alarms. By using a space-time scan statistic, we propose and illustrate a system for regular time periodic disease surveillance to detect any currently 'active' geographical clusters of disease and which tests the statistical significance of such clusters adjusting for the multitude of possible geographical locations and sizes, time intervals and time periodic analyses. The method is illustrated on thyroid cancer among men in New Mexico 1973-1992.
687 citations
01 Jan 1981
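TL;DR: A mathematical statistics text whose contents span probability, random variables, special distributions, estimation, hypothesis testing, inferences based on the normal distribution, two-sample inferences, goodness-of-fit tests, regression, the analysis of variance, randomized block designs and nonparametric statistics.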
Abstract:
1. Introduction 1.1 An Overview 1.2 Some Examples 1.3 A Brief History 1.4 A Chapter Summary
2. Probability 2.1 Introduction 2.2 Sample Spaces and the Algebra of Sets 2.3 The Probability Function 2.4 Conditional Probability 2.5 Independence 2.6 Combinatorics 2.7 Combinatorial Probability 2.8 Taking a Second Look at Statistics (Monte Carlo Techniques)
3. Random Variables 3.1 Introduction 3.2 Binomial and Hypergeometric Probabilities 3.3 Discrete Random Variables 3.4 Continuous Random Variables 3.5 Expected Values 3.6 The Variance 3.7 Joint Densities 3.8 Transforming and Combining Random Variables 3.9 Further Properties of the Mean and Variance 3.10 Order Statistics 3.11 Conditional Densities 3.12 Moment-Generating Functions 3.13 Taking a Second Look at Statistics (Interpreting Means) Appendix 3.A.1 MINITAB Applications
4. Special Distributions 4.1 Introduction 4.2 The Poisson Distribution 4.3 The Normal Distribution 4.4 The Geometric Distribution 4.5 The Negative Binomial Distribution 4.6 The Gamma Distribution 4.7 Taking a Second Look at Statistics (Monte Carlo Simulations) Appendix 4.A.1 MINITAB Applications Appendix 4.A.2 A Proof of the Central Limit Theorem
5. Estimation 5.1 Introduction 5.2 Estimating Parameters: The Method of Maximum Likelihood and the Method of Moments 5.3 Interval Estimation 5.4 Properties of Estimators 5.5 Minimum-Variance Estimators: The Cramér-Rao Lower Bound 5.6 Sufficient Estimators 5.7 Consistency 5.8 Bayesian Estimation 5.9 Taking A Second Look at Statistics (Beyond Classical Estimation) Appendix 5.A.1 MINITAB Applications
6. Hypothesis Testing 6.1 Introduction 6.2 The Decision Rule 6.3 Testing Binomial Data: H0: p = p0 6.4 Type I and Type II Errors 6.5 A Notion of Optimality: The Generalized Likelihood Ratio 6.6 Taking a Second Look at Statistics (Statistical Significance versus "Practical" Significance)
7. Inferences Based on the Normal Distribution 7.1 Introduction 7.2 Comparing (Ȳ − μ)/(σ/√n) and (Ȳ − μ)/(S/√n) 7.3 Deriving the Distribution of (Ȳ − μ)/(S/√n) 7.4 Drawing Inferences About μ 7.5 Drawing Inferences About σ² 7.6 Taking a Second Look at Statistics (Type II Error) Appendix 7.A.1 MINITAB Applications Appendix 7.A.2 Some Distribution Results for Ȳ and S² Appendix 7.A.3 A Proof that the One-Sample t Test is a GLRT Appendix 7.A.4 A Proof of Theorem 7.5.2
8. Types of Data: A Brief Overview 8.1 Introduction 8.2 Classifying Data 8.3 Taking a Second Look at Statistics (Samples Are Not "Valid"!)
9. Two-Sample Inferences 9.1 Introduction 9.2 Testing H0: μX = μY 9.3 Testing H0: σ²X = σ²Y: The F Test 9.4 Binomial Data: Testing H0: pX = pY 9.5 Confidence Intervals for the Two-Sample Problem 9.6 Taking a Second Look at Statistics (Choosing Samples) Appendix 9.A.1 A Derivation of the Two-Sample t Test (A Proof of Theorem 9.2.2) Appendix 9.A.2 MINITAB Applications
10. Goodness-of-Fit Tests 10.1 Introduction 10.2 The Multinomial Distribution 10.3 Goodness-of-Fit Tests: All Parameters Known 10.4 Goodness-of-Fit Tests: Parameters Unknown 10.5 Contingency Tables 10.6 Taking a Second Look at Statistics (Outliers) Appendix 10.A.1 MINITAB Applications
11. Regression 11.1 Introduction 11.2 The Method of Least Squares 11.3 The Linear Model 11.4 Covariance and Correlation 11.5 The Bivariate Normal Distribution 11.6 Taking a Second Look at Statistics (How Not to Interpret the Sample Correlation Coefficient) Appendix 11.A.1 MINITAB Applications Appendix 11.A.2 A Proof of Theorem 11.3.3
12. The Analysis of Variance 12.1 Introduction 12.2 The F Test 12.3 Multiple Comparisons: Tukey's Method 12.4 Testing Subhypotheses with Contrasts 12.5 Data Transformations 12.6 Taking a Second Look at Statistics (Putting the Subject of Statistics together: the Contributions of Ronald A. Fisher) Appendix 12.A.1 MINITAB Applications Appendix 12.A.2 A Proof of Theorem 12.2.2 Appendix 12.A.3 The Distribution of [SSTR/(k − 1)]/[SSE/(n − k)] When H1 is True
13. Randomized Block Designs 13.1 Introduction 13.2 The F Test for a Randomized Block Design 13.3 The Paired t Test 13.4 Taking a Second Look at Statistics (Choosing between a Two-Sample t Test and a Paired t Test) Appendix 13.A.1 MINITAB Applications
14. Nonparametric Statistics 14.1 Introduction 14.2 The Sign Test 14.3 Wilcoxon Tests 14.4 The Kruskal-Wallis Test 14.5 The Friedman Test 14.6 Testing for Randomness 14.7 Taking a Second Look at Statistics (Comparing Parametric and Nonparametric Procedures) Appendix 14.A.1 MINITAB Applications
Appendix: Statistical Tables. Answers to Selected Odd-Numbered Questions. Bibliography. Index.
524 citations
TL;DR: Applications of control charts in health-care monitoring and public-health surveillance are introduced to industrial practitioners, and ideas that arise from them and may be applicable in industrial monitoring are discussed.
Abstract: There are many applications of control charts in health-care monitoring and in public-health surveillance. We introduce these applications to industrial practitioners and discuss some of the ideas that arise that may be applicable in industrial monitoring. The advantages and disadvantages of the charting methods proposed in the health-care and public-health areas are considered. Some additional contributions in the industrial statistical process control literature relevant to this area are given. There are many application and research opportunities available in the use of control charts for health-related monitoring.
497 citations