Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories.
First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules.
Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs.
Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules.
Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically.
Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

Machine learning

Road extraction from aerial images has been a hot research topic in the field of remote sensing image analysis. In this letter, a semantic segmentation neural network, which combines the strengths of residual learning and U-Net, is proposed for road area extraction. The network is built with residual units and has an architecture similar to that of U-Net. The benefits of this model are twofold: first, residual units ease the training of deep networks; second, the rich skip connections within the network facilitate information propagation, allowing us to design networks with fewer parameters but better performance. We test our network on a public road data set and compare it with U-Net and two other state-of-the-art deep-learning-based road extraction methods. The proposed approach outperforms all the compared methods, which demonstrates its superiority over recently developed state-of-the-art methods.

Road Extraction by Deep Residual U-Net
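The residual units mentioned in the abstract can be illustrated with a very small sketch. This is not the network from the letter: the residual function here is two toy linear layers on plain Python lists standing in for convolutional layers, and the names `relu`, `dense`, and `residual_unit` are hypothetical. It only shows the identity skip connection, i.e. the unit outputs x + F(x).

```python
def relu(v):
    """Element-wise rectified linear activation."""
    return [max(0.0, x) for x in v]


def dense(v, weights):
    """Multiply a weight matrix (list of rows) by the vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in weights]


def residual_unit(x, w1, w2):
    """Return x + F(x), where F is dense -> ReLU -> dense (a toy residual function)."""
    f = dense(relu(dense(x, w1)), w2)
    return [xi + fi for xi, fi in zip(x, f)]


if __name__ == "__main__":
    zeros = [[0.0, 0.0], [0.0, 0.0]]
    # With all-zero weights the unit reduces to the identity mapping.
    print(residual_unit([1.0, -2.0], zeros, zeros))
```

With zero weights the unit passes its input through unchanged, which is one common intuition for why stacking such units tends not to make optimization harder.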

Object detection in very high resolution optical remote sensing images is a fundamental problem in remote sensing image analysis. Owing to advances in powerful feature representations, machine-learning-based object detection is receiving increasing attention. Although numerous feature representations exist, most of them are handcrafted or shallow-learning-based features. As the object detection task becomes more challenging, their description capability becomes limited or even impoverished. More recently, deep learning algorithms, especially convolutional neural networks (CNNs), have shown much stronger feature representation power in computer vision. Despite the progress made on natural scene images, it is problematic to directly use CNN features for object detection in optical remote sensing images because it is difficult to deal effectively with object rotation variations. To address this problem, this paper proposes a novel and effective approach to learn a rotation-invariant CNN (RICNN) model for advancing the performance of object detection, which is achieved by introducing and learning a new rotation-invariant layer on the basis of existing CNN architectures. Unlike the training of traditional CNN models, which only optimizes the multinomial logistic regression objective, our RICNN model is trained by optimizing a new objective function that imposes a regularization constraint, which explicitly enforces the feature representations of the training samples before and after rotation to be mapped close to each other, hence achieving rotation invariance. To facilitate training, we first train the rotation-invariant layer and then domain-specifically fine-tune the whole RICNN network to further boost the performance. Comprehensive evaluations on a publicly available ten-class object detection data set demonstrate the effectiveness of the proposed method.

Learning Rotation-Invariant Convolutional Neural Networks for Object Detection in VHR Optical Remote Sensing Images
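The regularization constraint described in the abstract can be sketched as a penalty on the distance between a sample's features and the features of its rotated copies. The form below (squared distance to the mean feature of the rotated copies, averaged over samples) and the weighting parameter are assumptions made for illustration; the exact objective in the paper may differ, and the function names are hypothetical.

```python
def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def rotation_invariance_penalty(sample_features, rotated_features):
    """Assumed penalty: 1/(2N) * sum_i ||f_i - mean(features of rotated copies of i)||^2.

    sample_features: list of N feature vectors, one per training sample.
    rotated_features: list of N lists; entry i holds the feature vectors
        of the rotated copies of sample i.
    """
    n = len(sample_features)
    total = 0.0
    for f, rotated in zip(sample_features, rotated_features):
        m = mean_vector(rotated)
        total += sum((a - b) ** 2 for a, b in zip(f, m))
    return total / (2 * n)


def total_objective(classification_loss, penalty, weight=0.5):
    """Classification loss plus the weighted penalty (weight is a hypothetical hyperparameter)."""
    return classification_loss + weight * penalty
```

When the features of a sample and of all its rotated copies coincide, the penalty is zero, which is the behavior the constraint is meant to encourage.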

Object detection in optical remote sensing images, a fundamental but challenging problem in the field of aerial and satellite image analysis, plays an important role in a wide range of applications and has received significant attention in recent years. While numerous methods exist, a deep review of the literature concerning generic object detection is still lacking. This paper aims to provide a review of the recent progress in this field. Different from several previously published surveys that focus on a specific object class such as building and road, we concentrate on more generic object categories including, but not limited to, road, building, tree, vehicle, ship, airport, and urban area. Covering about 270 publications, we survey (1) template matching-based object detection methods, (2) knowledge-based object detection methods, (3) object-based image analysis (OBIA)-based object detection methods, (4) machine learning-based object detection methods, and (5) five publicly available datasets and three standard evaluation metrics. We also discuss the challenges of current studies and propose two promising research directions, namely deep learning-based feature representation and weakly supervised learning-based geospatial object detection. It is our hope that this survey will help researchers gain a better understanding of this research field.

A survey on object detection in optical remote sensing images

Substantial efforts have recently been devoted to presenting various methods for object detection in optical remote sensing images. However, existing surveys of datasets and deep learning based methods for object detection in optical remote sensing images are not adequate. Moreover, most of the existing datasets have some shortcomings: for example, the numbers of images and object categories are small, and the image diversity and variations are insufficient. These limitations greatly affect the development of deep learning based object detection methods. In this paper, we provide a comprehensive review of the recent deep learning based object detection progress in both the computer vision and earth observation communities. Then, we propose a large-scale, publicly available benchmark for object DetectIon in Optical Remote sensing images, which we name DIOR. The dataset contains 23,463 images and 192,472 instances, covering 20 object classes. The proposed DIOR dataset (1) is large-scale in the number of object categories, object instances, and total images; (2) has a large range of object size variations, not only in terms of spatial resolutions, but also in terms of inter- and intra-class size variability across objects; (3) holds large variations, as the images are obtained under different imaging conditions, weather, seasons, and image quality; and (4) has high inter-class similarity and intra-class diversity. The proposed benchmark can help researchers develop and validate their data-driven methods. Finally, we evaluate several state-of-the-art approaches on our DIOR dataset to establish a baseline for future research.

Object detection in optical remote sensing images: A survey and a new benchmark

The process of road extraction from high-resolution satellite images is complex, and most researchers have shown results on only a small selected set of images. The type of processing varies with the satellite data acquisition sensor and the geolocation of the region, and users tune several heuristic parameters to achieve a reasonable degree of accuracy. We exploit two salient features of roads, namely, distinct spectral contrast and locally linear trajectory, to design a multistage framework to extract roads from high-resolution multispectral satellite images. We trained four Probabilistic Support Vector Machines separately using four different categories of training samples extracted from urban/suburban areas. Dominant Singular Measure is used to detect locally linear edge segments as potential trajectories for roads. This complementary information is integrated using an optimization framework to obtain potential targets for roads. This provides decent results only when the roads have few obstacles (trees, large vehicles, and tall buildings). Linking of disjoint segments uses the local gradient functions at the adjacent pair of road endings. Region part segmentation uses curvature information to remove stray nonroad structures. Medial-Axis-Transform-based hypothesis verification eliminates connected nonroad structures to improve the accuracy in road detection. Results are evaluated with a large set of multispectral remotely sensed images and are compared against a few state-of-the-art methods to validate the superior performance of our proposed method.

Use of Salient Features for the Design of a Multistage Framework to Extract Roads From High-Resolution Multispectral Satellite Images

This paper presents a new approach for automated path planning of cooperative crane manipulators using a genetic algorithm (GA). The inverse kinematic problem, i.e., determining the joint angle configuration for the cooperative crane manipulator system in moving the object from pick location to place location, is defined as an optimization problem and solved using GA. For generating the collision-free path, GA with an interference detection algorithm is employed and search is made in the manipulator joint angle space (configuration space). The effectiveness of the proposed approach for automated path planning is demonstrated by comparing the performance of the present approach with the earlier heuristic search proposed by Sivakumar et al. The GA approach finds a near-optimal path with lower path cost and less computational time than earlier heuristic searches.

Collision free path planning of cooperative crane manipulators using genetic algorithm
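The idea of treating inverse kinematics as an optimization problem solved by a genetic algorithm can be sketched on a toy case. The example below is an illustration under strong assumptions, not the method of the paper: it uses a planar two-link arm instead of cooperative cranes, omits interference (collision) detection entirely, and the population size, mutation scale, and all function names are hypothetical choices.

```python
import math
import random


def end_effector(angles, lengths=(1.0, 1.0)):
    """Forward kinematics of a planar two-link arm (toy stand-in for a crane)."""
    t1, t2 = angles
    x = lengths[0] * math.cos(t1) + lengths[1] * math.cos(t1 + t2)
    y = lengths[0] * math.sin(t1) + lengths[1] * math.sin(t1 + t2)
    return x, y


def error(angles, target):
    """Distance between the arm tip and the target: the cost the GA minimizes."""
    x, y = end_effector(angles)
    return math.hypot(x - target[0], y - target[1])


def solve_ik_ga(target, pop_size=30, generations=60, seed=0):
    """Search joint-angle space with a simple elitist GA; returns (best_angles, history)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-math.pi, math.pi) for _ in range(2)] for _ in range(pop_size)]
    history = []  # best error at the start of each generation
    for _ in range(generations):
        pop.sort(key=lambda a: error(a, target))
        history.append(error(pop[0], target))
        elite = pop[: pop_size // 2]  # the better half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            child = [rng.choice(pair) for pair in zip(p1, p2)]  # uniform crossover
            child = [g + rng.gauss(0.0, 0.1) for g in child]    # Gaussian mutation
            children.append(child)
        pop = elite + children
    best = min(pop, key=lambda a: error(a, target))
    return best, history


if __name__ == "__main__":
    best, history = solve_ik_ga((1.0, 1.0))
    print(best, error(best, (1.0, 1.0)))
```

Because the better half of each generation is carried over unchanged, the best error never increases from one generation to the next. A collision-free planner would additionally need to reject or penalize configurations that an interference check flags.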

Recognizing the activities of workers helps to measure and control safety, productivity, and quality in construction sites. Automated activity recognition can enhance the efficiency of the measurement system. The present study investigates accelerometer-based activity classification for automating the work-sampling process. A methodology is developed for evaluating classifiers for recognizing activities based on the features generated from accelerometer data segments. An experimental study is carried out in instructed and uninstructed modes for classifying masonry activities by using accelerometers attached to the waist of the mason. Three types of classifiers were evaluated, and multilayer perceptron, a neural network classifier, gave the best results. A 50% overlap for data segments enhanced classifier performance. The study showed that the utilization of best features instead of all features did not affect the classification accuracy significantly but reduced the run time considerably. An accuracy of 8...

Accelerometer-Based Activity Recognition in Construction
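Segmenting an accelerometer stream into windows with 50% overlap, as mentioned in the abstract, can be sketched as follows. The window length and the particular features computed here (mean, variance, minimum, maximum) are assumptions for illustration; the abstract does not list the features used in the study, and the function names are hypothetical.

```python
def segment(signal, window, overlap=0.5):
    """Split a sequence into fixed-length windows; consecutive windows share `overlap` of their samples."""
    step = max(1, int(window * (1 - overlap)))
    return [signal[i:i + window] for i in range(0, len(signal) - window + 1, step)]


def features(seg):
    """Simple per-window statistics that a classifier could take as input."""
    mean = sum(seg) / len(seg)
    variance = sum((x - mean) ** 2 for x in seg) / len(seg)
    return {"mean": mean, "variance": variance, "min": min(seg), "max": max(seg)}


if __name__ == "__main__":
    readings = list(range(8))  # stand-in for one accelerometer axis
    for window in segment(readings, window=4):
        print(window, features(window))
```

With a window of 4 samples and 50% overlap, each window starts 2 samples after the previous one, so every sample away from the edges contributes to two windows.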

The use of cooperative cranes can improve the cost effectiveness of heavy lift operations. However, the complexity in developing a reliable lift plan prevents the widespread use of cooperative crane lifts. The availability of a computer-aided planning system can improve planning efficiency and reliability. Path planning is an important subtask of the lift planning process. This paper presents work done to develop a computer aided path planner for two crane lifts. Two heuristic search methods, hill climbing and A*, were implemented for automating the path-planning task. Search space was represented using the concept of configuration space. The effectiveness of the search methods was evaluated by solving three problems with increasing levels of complexity. The formulation of these problems was based on the type of movement of cooperative cranes (in synchronous or asynchronous manner) and the presence of trapping space. It was found that while the hill climbing approach found feasible paths in a few seconds or minutes, these paths were far from optimal in situations containing trapping space. In contrast, the A* search resulted in near optimal paths, but the execution time was of the order of hours.

Automated Path Planning of Cooperative Crane Lifts Using Heuristic Search
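The contrast between hill climbing and A* in a configuration space with trapping space can be sketched on a small grid. This is an illustration only, not the authors' implementation: the grid stands in for a discretized two-parameter configuration space (0 = free, 1 = interference), the Manhattan heuristic and unit step cost are assumptions, and the plain greedy hill climbing below simply stops at a local minimum, whereas the paper reports that its hill climbing still found feasible paths.

```python
import heapq


def neighbors(cell, grid):
    """Yield the free 4-connected neighbors of a cell."""
    r, c = cell
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]) and grid[nr][nc] == 0:
            yield (nr, nc)


def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def hill_climb(grid, start, goal):
    """Greedy descent on the heuristic; returns a path, or None if stuck in a local minimum."""
    path, current = [start], start
    while current != goal:
        best = min(neighbors(current, grid), key=lambda n: manhattan(n, goal), default=None)
        if best is None or manhattan(best, goal) >= manhattan(current, goal):
            return None  # no neighbor is closer to the goal: trapped
        current = best
        path.append(current)
    return path


def a_star(grid, start, goal):
    """A* with unit step cost and Manhattan heuristic; returns a shortest path or None."""
    open_heap = [(manhattan(start, goal), 0, start)]
    came_from, cost = {start: None}, {start: 0}
    while open_heap:
        _, g, current = heapq.heappop(open_heap)
        if current == goal:
            path = []
            while current is not None:
                path.append(current)
                current = came_from[current]
            return path[::-1]
        if g > cost[current]:
            continue  # stale heap entry
        for n in neighbors(current, grid):
            new_g = g + 1
            if new_g < cost.get(n, float("inf")):
                cost[n], came_from[n] = new_g, current
                heapq.heappush(open_heap, (new_g + manhattan(n, goal), new_g, n))
    return None


# A pocket of free cells that leads toward the goal but is closed off: a trapping space.
TRAP_GRID = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

if __name__ == "__main__":
    print(hill_climb(TRAP_GRID, (2, 0), (2, 4)))
    print(a_star(TRAP_GRID, (2, 0), (2, 4)))
```

On this grid the greedy search walks into the pocket and gets stuck, while A* backs out and goes around the obstacle, at the price of expanding more cells. In a realistic configuration space that extra expansion is what drives the much longer execution time.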

Planning the lift path for a mobile crane involves the generation and selection of a suitable lift path through three-dimensional space while considering the degrees of freedom of the crane, its lifting capacity and potential site obstacles. This paper presents work done toward applying configuration space (C-space) and search concepts to develop a tool to identify lift paths that satisfy the planning requirements. An interference detection technique is used to generate the C-space and two levels of heuristic search are performed within the C-space. The first search is a heuristic depth search to determine the obstacle-free lift paths. The second search performs a more detailed optimization of the path within a constrained search space. The tool developed here can be used within the AutoCAD environment and is based on program modules developed using AutoLisp and external programs. Test findings indicate that the approach is capable of generating good paths in complex situations within reasonable time. Directions for further research are indicated.

Koshy Varghese

Papers

Use of Salient Features for the Design of a Multistage Framework to Extract Roads From High-Resolution Multispectral Satellite Images

Collision free path planning of cooperative crane manipulators using genetic algorithm

Accelerometer-Based Activity Recognition in Construction

Automated Path Planning of Cooperative Crane Lifts Using Heuristic Search

Automated Path Planning for Mobile Crane Lifts