# Solution of Linear and Non Linear Regression Problem by K Nearest Neighbour Approach: By Using Three Sigma Rule

02 Apr 2015, pp. 197-201

TL;DR: This paper applies KNN to function estimation for both linear and nonlinear regression problems, under the assumption that the given supervisor data are reliable.

Abstract: K Nearest Neighbor (KNN) is one of the simplest methods for classification as well as regression, which is why it is widely adopted. KNN is a supervised method that estimates a value from the values of neighboring points. Although KNN came into existence in the 1990s, it still demands improvements depending on the domain in which it is used. Researchers have since devised methods that combine multiple techniques in some order so that the advantages of each technique cover the weaknesses of the others; for example, KNN-kernel based algorithms are used for clustering. Despite its heavy use in classification problems, KNN is not used nearly as much in function estimation problems. This paper is an attempt at using KNN for function estimation. The approach is developed for linear as well as nonlinear regression problems. We assume that the supervisor data given are reliable. We consider two-dimensional data here to illustrate the idea, which is equally applicable to n-dimensional data for some large but finite n.
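The abstract does not say how the three-sigma rule from the title enters the estimator, so the following is only a minimal sketch under one assumed reading: predict at a query point by averaging the responses of its k nearest neighbours, after discarding any response more than three standard deviations from the neighbourhood mean. The function name `knn_regress`, the parameter `n_sigma`, and the toy data are hypothetical and do not come from the paper.

```python
from statistics import fmean, pstdev


def knn_regress(train, x0, k, n_sigma=3.0):
    """Estimate y at x0 from the k nearest (x, y) pairs in `train` (1-D inputs).

    Assumption (not from the paper): the three-sigma rule is applied to the
    neighbours' responses before averaging.
    """
    # Pick the k training points closest to x0 along the x axis.
    neighbours = sorted(train, key=lambda p: abs(p[0] - x0))[:k]
    ys = [y for _, y in neighbours]

    # Mean and population standard deviation of the neighbour responses.
    mu, sigma = fmean(ys), pstdev(ys)

    # Keep responses within n_sigma standard deviations of the mean.
    # If all responses are identical (sigma == 0) there is nothing to filter.
    if sigma > 0:
        kept = [y for y in ys if abs(y - mu) <= n_sigma * sigma]
    else:
        kept = ys
    return fmean(kept)


# Toy data: y = 2x on x = 0..19, with one corrupted response at x = 10.
train = [(x, 2.0 * x) for x in range(20)]
train[10] = (10, 1000.0)

print(knn_regress(train, 10, k=13))
```

With the population standard deviation, no single value among k points can lie more than sqrt(k - 1) standard deviations from their mean, so this particular filter can only reject a point when k is at least 11. That is why the toy example uses k = 13; with a small k such as 3 or 5 the function reduces to plain KNN averaging.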

##### Citations

TL;DR: A novel method for ball screw fault diagnosis is proposed that combines weighted data of the multiple sensors at different positions with convolutional neural network and it considers the sensitive index of different faults at different sensors for weight assignment.

Abstract: In real industrial applications, ball screw health condition monitoring and fault diagnosis still face many challenges. In some cases the rotating machinery has a long rotor, so multiple sensors need to be arranged at different positions of the system, and different faults are located at different positions of the system. The primary difficulty is to recognize multiple faults at different positions of the ball screw with high accuracy and feasibility. To overcome this problem, a novel method for ball screw fault diagnosis is proposed. The proposed method combines weighted data from multiple sensors at different positions with a convolutional neural network, and it considers the sensitive index of different faults at different sensors for weight assignment. The method consists of three main steps. First, a new data segmentation algorithm is proposed to obtain uniform data from the vibration signals. Second, a sensitive sensor data selection criterion based on the ball screw failure mechanism is developed to obtain the sensor importance factor. Finally, the weighted data are classified by the convolutional neural network. The effectiveness of the proposed method is verified by experiments on a ball screw test bed.

8 citations

TL;DR: The detailed analysis that is discussed in the present paper clearly shows that expanding image-derived LiDAR samples helps in refining the prediction of regional forest volume while using satellite data and nonparametric models.

Abstract: Accurate information regarding forest volume plays an important role in estimating afforestation, timber harvesting, and forest ecological services. Traditionally, estimating forest growing stock volume from field measurements is labor-intensive and time-consuming. Recently, remote sensing technology has emerged as a time- and cost-efficient method for forest inventory. In the present study, we adopted three procedures: sample expansion, feature selection, and results generation and evaluation. Extrapolating the samples from Light Detection and Ranging (LiDAR) scanning is the most important step in satisfying the sample-size requirement of nonparametric methods, and it results in improved accuracy. In addition, the mean decrease Gini (MDG) methodology embedded in the Random Forest (RF) algorithm served as the feature selector; afterwards, RF and K-Nearest Neighbor (KNN) were adopted for forest volume prediction. The results show that the retrieved forest volume over the entire area was in the range of 50–360 m³/ha, and the results from the two models show better consistency when using the sample combination extrapolated by the optimal threshold value (2 × 10⁻⁴), leading to the best performance of RF (R² = 0.618, root mean square error, RMSE = 43.641 m³/ha, mean absolute error, MAE = 33.016 m³/ha), followed by KNN (R² = 0.617, RMSE = 43.693 m³/ha, MAE = 32.534 m³/ha). The detailed analysis that is discussed in the present paper clearly shows that expanding image-derived LiDAR samples helps in refining the prediction of regional forest volume while using satellite data and nonparametric models.
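The three scores quoted in this abstract (R², RMSE, MAE) can be computed from paired observed and predicted values. The sketch below uses the standard definitions, with R² taken as the coefficient of determination 1 − SS_res / SS_tot; the abstract does not state which R² definition the study used, so that choice is an assumption, and the function name and sample values are hypothetical.

```python
from math import sqrt


def regression_metrics(y_true, y_pred):
    """Return (r2, rmse, mae) for paired observations and predictions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # Mean absolute error: average magnitude of the errors.
    mae = sum(abs(e) for e in errors) / n

    # Root mean square error: square root of the mean squared error.
    ss_res = sum(e * e for e in errors)
    rmse = sqrt(ss_res / n)

    # Coefficient of determination: 1 - residual SS / total SS.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot

    return r2, rmse, mae


# Hypothetical volumes in m³/ha, for illustration only.
r2, rmse, mae = regression_metrics([100.0, 200.0, 300.0], [110.0, 190.0, 300.0])
print(r2, rmse, mae)
```

RMSE and MAE carry the units of the target (here m³/ha), while R² is unitless, which is why the abstract reports units only for the first two.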

6 citations

TL;DR: It is shown that both Genetic Algorithms and Particle Swarm Optimization can be used as an alternative way for coefficients estimation of nonlinear regression models.

Abstract: Nonlinear regression is a type of regression used for modeling a relation between independent and dependent variables. Finding the proper regression model and coefficients is important for all disciplines. This study aims to find nonlinear model coefficients with two well-known population-based optimization algorithms: Genetic Algorithms (GA) and Particle Swarm Optimization (PSO) were used to find the coefficients of some nonlinear regression models. It is shown that both algorithms can be used as an alternative way to estimate the coefficients of nonlinear regression models.

6 citations

TL;DR: In this article, the applicability of the KNN algorithm in predicting the fracture toughness of polymer composites reinforced with silica particles was investigated; the proposed model predicts the results with an accuracy of about 96%, with a mean absolute percentage error of around 4%.

Abstract: The mechanical behavior of particle reinforced polymer composites depends largely on the properties of the particles used to reinforce them. Geometrical properties such as shape and size (aspect ratio) play a vital part in deciding the behavior of the composite material when it is subjected to impact loading. Generally, an increase in aspect ratio results in increased energy absorption capability, which in turn results in higher fracture toughness. However, investigating the fracture toughness of particle reinforced composites experimentally for varying aspect ratios is cumbersome. Therefore, the presented work focuses on investigating the applicability of the K-Nearest Neighbor (KNN) algorithm in predicting the fracture toughness of polymer composites reinforced with silica particles. The aim of this work is to predict the results with utmost accuracy with limited experimentation. The current approach uses four model parameters, viz. aspect ratio, time, volume fraction of the fillers, and elastic modulus, to predict the Stress Intensity Factor (SIF), which directly gives a measure of fracture toughness. KNN has been implemented to predict the fracture behavior of the composite for different values of aspect ratio. The proposed model predicts the results with an accuracy of ~96%, as the mean absolute percentage error was found to be around 4%. This work is an effort to expand the scope of applying machine learning techniques in the field of materials and design for structural parts subjected to impact loading.
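The abstract ties its accuracy figure to the mean absolute percentage error (MAPE), which suggests accuracy is being reported as 100 minus MAPE. A minimal sketch of that arithmetic follows; the function name and the sample values are hypothetical, and the "100 − MAPE" reading is inferred from the abstract's wording rather than stated as a formula there.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Assumes no true value is zero, since each error is divided by it.
    """
    ratios = [abs((t - p) / t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical stress intensity factor values, for illustration only.
observed = [2.0, 4.0]
predicted = [1.9, 4.2]

error_pct = mape(observed, predicted)
accuracy_pct = 100.0 - error_pct  # the reading implied by the abstract
print(error_pct, accuracy_pct)
```

Because MAPE divides by the observed value, it is undefined when an observation is zero and becomes unstable for observations close to zero.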

6 citations

##### References

01 Jan 1972

TL;DR: This completely revised second edition presents an introduction to statistical pattern recognition, which is appropriate as a text for introductory courses in pattern recognition and as a reference book for workers in the field.

Abstract: This completely revised second edition presents an introduction to statistical pattern recognition. Pattern recognition in general covers a wide range of problems: it is applied to engineering problems, such as character readers and wave form analysis, as well as to brain modeling in biology and psychology. Statistical decision and estimation, which are the main subjects of this book, are regarded as fundamental to the study of pattern recognition. This book is appropriate as a text for introductory courses in pattern recognition and as a reference book for workers in the field. Each chapter contains computer projects as well as exercises.

10,516 citations

TL;DR: This book covers background material, nonconstructive existence theorems, iterative methods, and local, semilocal, and global convergence analysis, and includes an annotated list of basic reference books.

Abstract (table of contents):

- Front matter: Preface to the Classics Edition; Preface; Acknowledgments; Glossary of Symbols; Introduction
- Part I. Background Material: 1. Sample Problems; 2. Linear Algebra; 3. Analysis
- Part II. Nonconstructive Existence Theorems: 4. Gradient Mappings and Minimization; 5. Contractions and the Continuation Property; 6. The Degree of a Mapping
- Part III. Iterative Methods: 7. General Iterative Methods; 8. Minimization Methods
- Part IV. Local Convergence: 9. Rates of Convergence-General; 10. One-Step Stationary Methods; 11. Multistep Methods and Additional One-Step Methods
- Part V. Semilocal and Global Convergence: 12. Contractions and Nonlinear Majorants; 13. Convergence under Partial Ordering; 14. Convergence of Minimization Methods
- Back matter: An Annotated List of Basic Reference Books; Bibliography; Author Index; Subject Index

7,605 citations

TL;DR: This article proposes three new heterogeneous distance functions, called the Heterogeneous Value Difference Metric (HVDM), the Interpolated Value Difference Metric (IVDM), and the Windowed Value Difference Metric (WVDM), to handle applications with nominal attributes, continuous attributes, or both.

Abstract: Instance-based learning techniques typically handle continuous and linear input values well, but often do not handle nominal input attributes appropriately. The Value Difference Metric (VDM) was designed to find reasonable distance values between nominal attribute values, but it largely ignores continuous attributes, requiring discretization to map continuous values into nominal values. This paper proposes three new heterogeneous distance functions, called the Heterogeneous Value Difference Metric (HVDM), the Interpolated Value Difference Metric (IVDM), and the Windowed Value Difference Metric (WVDM). These new distance functions are designed to handle applications with nominal attributes, continuous attributes, or both. In experiments on 48 applications the new distance metrics achieve higher classification accuracy on average than three previous distance functions on those datasets that have both nominal and continuous attributes.

1,245 citations

TL;DR: A local density based outlier detection method is formulated that provides an outlier "score" in the range [0, 1], directly interpretable as the probability that a data object is an outlier.

Abstract: Many outlier detection methods do not merely provide the decision for a single data object being or not being an outlier but give also an outlier score or "outlier factor" signaling "how much" the respective data object is an outlier. A major problem for any user not very acquainted with the outlier detection method in question is how to interpret this "factor" in order to decide for the numeric score again whether or not the data object indeed is an outlier. Here, we formulate a local density based outlier detection method providing an outlier "score" in the range of [0, 1] that is directly interpretable as a probability of a data object for being an outlier.

394 citations
