Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories.
First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules.
Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs.
Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules.
Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically.
Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

Machine learning

With the continuous expansion of data availability in many large-scale, complex, and networked systems, such as surveillance, security, Internet, and finance, it becomes critical to advance the fundamental understanding of knowledge discovery and analysis from raw data to support decision-making processes. Although existing knowledge discovery and data engineering techniques have shown great success in many real-world applications, the problem of learning from imbalanced data (the imbalanced learning problem) is a relatively new challenge that has attracted growing attention from both academia and industry. The imbalanced learning problem is concerned with the performance of learning algorithms in the presence of underrepresented data and severe class distribution skews. Due to the inherent complex characteristics of imbalanced data sets, learning from such data requires new understandings, principles, algorithms, and tools to transform vast amounts of raw data efficiently into information and knowledge representation. In this paper, we provide a comprehensive review of the development of research in learning from imbalanced data. Our focus is to provide a critical review of the nature of the problem, the state-of-the-art technologies, and the current assessment metrics used to evaluate learning performance under the imbalanced learning scenario. Furthermore, in order to stimulate future research in this field, we also highlight the major opportunities and challenges, as well as potential important research directions for learning from imbalanced data.

Learning from Imbalanced Data
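
The abstract above refers to assessment metrics for the imbalanced learning scenario without naming them. As a minimal sketch, assuming the commonly used confusion-matrix metrics (recall, precision, F1, and G-mean; this selection is an assumption, not taken from the paper), the Python snippet below illustrates why plain accuracy can mislead under a severe class skew. The function names and the toy data set with 5% positives are hypothetical.

```python
from math import sqrt


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for one positive class."""
    tp = fp = tn = fn = 0
    for truth, guess in zip(y_true, y_pred):
        if guess == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def imbalance_metrics(y_true, y_pred, positive=1):
    """Return accuracy plus metrics that are sensitive to the minority class."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # minority-class hit rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # majority-class hit rate
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    g_mean = sqrt(recall * specificity)
    return {
        "accuracy": accuracy,
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "g_mean": g_mean,
    }


# Hypothetical skewed data: 5 positives, 95 negatives.
y_true = [1] * 5 + [0] * 95
# A trivial classifier that always predicts the majority class.
always_negative = [0] * 100
metrics = imbalance_metrics(y_true, always_negative)
print(metrics)
```

On this toy data the trivial classifier reaches high accuracy while its recall, F1, and G-mean are all zero, which is the kind of gap that motivates skew-aware evaluation.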

A major new professional reference work on fingerprint security systems and technology from leading international researchers in the field. Handbook provides authoritative and comprehensive coverage of all major topics, concepts, and methods for fingerprint security systems. This unique reference work is an absolutely essential resource for all biometric security professionals, researchers, and systems administrators.

/pdf/handbook-of-fingerprint-recognition-16mh887j24.pdf

Handbook of Fingerprint Recognition

In the field of machine learning, semi-supervised learning (SSL) occupies the middle ground, between supervised learning (in which all training examples are labeled) and unsupervised learning (in which no label data are given). Interest in SSL has increased in recent years, particularly because of application domains in which unlabeled data are plentiful, such as images, text, and bioinformatics. This first comprehensive overview of SSL presents state-of-the-art algorithms, a taxonomy of the field, selected applications, benchmark experiments, and perspectives on ongoing and future research. Semi-Supervised Learning first presents the key assumptions and ideas underlying the field: smoothness, cluster or low-density separation, manifold structure, and transduction. The core of the book is the presentation of SSL methods, organized according to algorithmic strategies. After an examination of generative models, the book describes algorithms that implement the low-density separation assumption, graph-based methods, and algorithms that perform two-step learning. The book then discusses SSL applications and offers guidelines for SSL practitioners by analyzing the results of extensive benchmark experiments. Finally, the book looks at interesting directions for SSL research. The book closes with a discussion of the relationship between semi-supervised learning and transduction. Adaptive Computation and Machine Learning series

/pdf/semi-supervised-learning-scg1exixez.pdf

Semi-Supervised Learning
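
The description above lists graph-based methods among the SSL strategies covered by the book. As a rough illustration only (a generic label propagation scheme with clamped labels, not an algorithm taken from the book), the Python sketch below spreads two known labels over one-dimensional points through a Gaussian similarity graph. The function name, the `sigma` bandwidth, and the toy points are assumptions made for this example.

```python
from math import exp


def propagate_labels(points, known, sigma=0.5, iterations=200):
    """Binary label propagation on a fully connected similarity graph.

    points: list of floats (at least two points are assumed).
    known:  dict mapping a point index to its label, 0 or 1.
    Returns a list with a predicted 0/1 label for every point.
    """
    n = len(points)
    # Gaussian edge weights; no self-loops.
    weights = [
        [exp(-((points[i] - points[j]) ** 2) / (2 * sigma ** 2)) if i != j else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    # Labeled points start at their label, unlabeled points at 0.5.
    scores = [float(known.get(i, 0.5)) for i in range(n)]
    for _ in range(iterations):
        updated = scores[:]
        for i in range(n):
            if i in known:
                continue  # clamp labeled points to their given label
            total = sum(weights[i])
            # Each unlabeled score becomes the weighted mean of its neighbours.
            updated[i] = sum(w * s for w, s in zip(weights[i], scores)) / total
        scores = updated
    return [1 if s >= 0.5 else 0 for s in scores]


# Two well-separated clusters with a single labeled point in each.
points = [0.0, 0.2, 0.4, 3.0, 3.2, 3.4]
known = {0: 0, 5: 1}
predicted = propagate_labels(points, known)
print(predicted)
```

The sketch leans on the smoothness and cluster assumptions named in the description: points joined by heavy edges end up with the same label, so one label per cluster is enough here.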

Visual surveillance in dynamic scenes, especially for humans and vehicles, is currently one of the most active research topics in computer vision. It has a wide spectrum of promising applications, including access control in special areas, human identification at a distance, crowd flux statistics and congestion analysis, detection of anomalous behaviors, and interactive surveillance using multiple cameras. In general, the processing framework of visual surveillance in dynamic scenes includes the following stages: modeling of environments, detection of motion, classification of moving objects, tracking, understanding and description of behaviors, human identification, and fusion of data from multiple cameras. We review recent developments and general strategies of all these stages. Finally, we analyze possible research directions, e.g., occlusion handling, a combination of two- and three-dimensional tracking, a combination of motion analysis and biometrics, anomaly detection and behavior prediction, content-based retrieval of surveillance videos, behavior understanding and natural language description, fusion of information from multiple sensors, and remote surveillance.

/pdf/a-survey-on-visual-surveillance-of-object-motion-and-302zsaqo09.pdf

A survey on visual surveillance of object motion and behaviors

Support vector machines (SVM) have been recently proposed as a new technique for pattern recognition. SVM with a binary tree recognition strategy are used to tackle the face recognition problem. We illustrate the potential of SVM on the Cambridge ORL face database, which consists of 400 images of 40 individuals, containing quite a high degree of variability in expression, pose, and facial details. We also present the recognition experiment on a larger face database of 1079 images of 137 individuals. We compare the SVM-based recognition with the standard eigenface approach using the nearest center classification (NCC) criterion.

/pdf/face-recognition-by-support-vector-machines-dw7016g0xx.pdf

Face recognition by support vector machines
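
The abstract above uses the nearest center classification (NCC) criterion as the baseline for the eigenface approach. A minimal Python sketch of that criterion follows, assuming the usual reading (each class is represented by the mean of its training feature vectors, and a probe is assigned to the class with the closest mean in Euclidean distance). The function names and the two-dimensional toy features are hypothetical; in the eigenface setting the vectors would be projection coefficients rather than raw coordinates.

```python
from collections import defaultdict
from math import dist


def fit_class_centers(features, labels):
    """Compute the mean feature vector (center) of every class."""
    sums = {}
    counts = defaultdict(int)
    for vector, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vector)
        for i, value in enumerate(vector):
            sums[label][i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def predict_nearest_center(centers, vector):
    """Return the label whose center is closest to the given vector."""
    return min(centers, key=lambda label: dist(centers[label], vector))


# Hypothetical training features for two individuals.
features = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [12.0, 10.0]]
labels = ["person_a", "person_a", "person_b", "person_b"]
centers = fit_class_centers(features, labels)

print(predict_nearest_center(centers, [1.0, 1.0]))
print(predict_nearest_center(centers, [9.0, 9.0]))
```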

Support vector machines for face recognition

ECG signal conditioning by morphological filtering

Background
Detection of characteristic waves, such as the QRS complex, P wave, and T wave, is one of the essential tasks in recognizing cardiovascular arrhythmias from the electrocardiogram (ECG).

/pdf/characteristic-wave-detection-in-ecg-signal-using-1mc2nv9hiz.pdf

Characteristic wave detection in ECG signal using morphological transform
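
The listing gives only the title and background of this paper, so the following is a generic illustration of one-dimensional morphological operators, not the authors' transform. Assuming a flat structuring element of odd width, erosion is a sliding minimum and dilation is a sliding maximum; an opening followed by a closing suppresses narrow peaks and pits and leaves a slowly varying estimate that can be subtracted to emphasize sharp waves. All function names and the toy signal are hypothetical.

```python
def erode(signal, width):
    """Sliding minimum with a flat structuring element (odd width assumed)."""
    half = width // 2
    n = len(signal)
    return [min(signal[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


def dilate(signal, width):
    """Sliding maximum with a flat structuring element (odd width assumed)."""
    half = width // 2
    n = len(signal)
    return [max(signal[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


def opening(signal, width):
    """Erosion then dilation: removes positive peaks narrower than the element."""
    return dilate(erode(signal, width), width)


def closing(signal, width):
    """Dilation then erosion: fills negative pits narrower than the element."""
    return erode(dilate(signal, width), width)


# Toy signal: a flat line with one narrow positive peak and one narrow pit.
signal = [0, 0, 0, 5, 0, 0, 0, -4, 0, 0, 0]
opened = opening(signal, 3)       # the positive peak is removed
smoothed = closing(opened, 3)     # the pit is filled as well
sharp_part = [s - b for s, b in zip(signal, smoothed)]

print(opened)
print(smoothed)
print(sharp_part)
```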

In this paper, a method of harmonics extraction from Higher Order Statistics (HOS) is developed for texture decomposition. We show that the diagonal slice of the fourth-order cumulants is proportional to the autocorrelation of a related noiseless sinusoidal signal with identical frequencies. We propose to use this fourth-order cumulants slice to estimate a power spectrum from which the harmonic frequencies can be easily extracted. Hence, a texture can be decomposed into deterministic components and indeterministic components as in a unified texture model through a Wold-like decomposition procedure. The simulation and experimental results demonstrate that this method is effective for texture decomposition and that it performs better than traditional decomposition methods based on lower-order statistics.
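
The proportionality claimed in the abstract can be illustrated in a simplified setting. The derivation below assumes a one-dimensional, zero-mean, real harmonic signal with independent, uniformly distributed phases; the paper itself addresses textures, so this 1-D form and the symbols $a_k$, $\omega_k$, $\phi_k$, and $\tilde{x}$ are assumptions made for illustration.

```latex
x(n) = \sum_{k=1}^{p} a_k \cos(\omega_k n + \phi_k),
\qquad \phi_k \sim \mathcal{U}[0, 2\pi) \ \text{i.i.d.}

% Autocorrelation of x:
r_x(\tau) = E[x(n)\,x(n+\tau)]
          = \frac{1}{2} \sum_{k=1}^{p} a_k^{2} \cos(\omega_k \tau)

% Diagonal slice of the fourth-order cumulant of x:
c_{4x}(\tau,\tau,\tau)
  = E[x(n)\,x^{3}(n+\tau)] - 3\, r_x(0)\, r_x(\tau)
  = -\frac{3}{8} \sum_{k=1}^{p} a_k^{4} \cos(\omega_k \tau)

% Related signal with the same frequencies and squared amplitudes:
\tilde{x}(n) = \sum_{k=1}^{p} a_k^{2} \cos(\omega_k n + \phi_k),
\qquad
r_{\tilde{x}}(\tau) = \frac{1}{2} \sum_{k=1}^{p} a_k^{4} \cos(\omega_k \tau)

% Hence the proportionality:
c_{4x}(\tau,\tau,\tau) = -\frac{3}{4}\, r_{\tilde{x}}(\tau)
```

Under these assumptions the slice carries the same frequencies $\omega_k$ as the original signal. Because the fourth-order cumulants of additive Gaussian noise are zero, such noise does not contribute to the slice, which is consistent with the abstract's description of the related signal as noiseless.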

Kap Luk Chan

Papers

Face recognition by support vector machines

Support vector machines for face recognition

ECG signal conditioning by morphological filtering

Characteristic wave detection in ECG signal using morphological transform

Texture decomposition by harmonics extraction from higher order statistics