# Inferring customer occupancy status in for-hire vehicles using PU Learning

02 Jan 2021, pp. 290-298

TL;DR: The authors identify mislabeled trips in for-hire vehicle data by casting the problem as one of learning from Positive and Unlabeled instances (PU Learning).

Abstract: Data from Global Positioning Systems (GPS) and fare-meters in For-Hire vehicles (FHVs) have been used for various applications, both in research and in organizational decision-making. The utility of such exercises largely depends on the accuracy of the data. This study looks at an environment where the data is partially mislabeled. Specifically, we take a common real-world setting where vehicle operators choose to render transportation services to customers without the use of a fare-meter, often by negotiating a fixed rate with the customer. This practice, which has been observed and documented to different degrees across urban areas in the world, leads to various undesirable effects. In this study, we seek to identify cases of such behavior in the dataset. Typically, a supervised learning classifier could be built to predict the occupancy status from GPS traces, which can then be used to look for anomalies between the predicted and stated behaviors. However, in our case the training dataset also contains instances of incorrect tagging. We address this problem by casting it as one of learning from Positive and Unlabeled instances (PU Learning). This is owing to the fact that we observe the phenomenon of one-sided label noise, where trips tagged ‘vacant’ by the taximeter could be truly vacant or occupied, whereas trips tagged ‘occupied’ are expected to be occupied in reality as well. To support this novel formulation, we apply three state-of-the-art PU Learning algorithms on a real-world trajectory data set from an organization plying 170 active vehicles over a period of two months. We compare these to the baselines of standard supervised learning. Validation is carried out by the organization through alternate channels of investigation that are not indicated in the data set. The results show that the PU Learners provide a significant improvement in classification across a range of metrics when compared to the baseline approaches. This translates to a significant increase in identifying or reclassifying the mislabeled rides.
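
The one-sided label noise described above can be illustrated with a small sketch. The abstract does not name the three PU Learning algorithms used, so the code below is not the authors' method; it assumes one widely known approach (the label-frequency correction of Elkan and Noto, 2008), in which a classifier's score g(x) = P(tagged 'occupied' | x) is divided by c = P(tagged 'occupied' | truly occupied), with c estimated as the mean score on held-out trips tagged 'occupied'. The function names, scores, and the 0.5 threshold are all hypothetical.

```python
from statistics import mean


def estimate_label_frequency(holdout_positive_scores):
    """Estimate c as the mean score g(x) over held-out trips tagged 'occupied'."""
    if not holdout_positive_scores:
        raise ValueError("need at least one held-out positive score")
    c = mean(holdout_positive_scores)
    if c <= 0:
        raise ValueError("estimated label frequency must be positive")
    return c


def occupied_probability(score, c):
    """Convert g(x) into an estimate of P(truly occupied | x) = g(x) / c, clipped to [0, 1]."""
    return min(1.0, max(0.0, score / c))


def flag_suspect_trips(vacant_scores, c, threshold=0.5):
    """Return indices of 'vacant'-tagged trips whose corrected probability reaches the threshold."""
    return [
        i
        for i, score in enumerate(vacant_scores)
        if occupied_probability(score, c) >= threshold
    ]


# Hypothetical classifier scores (not data from the study).
holdout_positive_scores = [0.5, 0.7, 0.6]   # trips tagged 'occupied', held out from training
vacant_scores = [0.06, 0.45, 0.12, 0.36]    # trips tagged 'vacant' (the unlabeled set)

c = estimate_label_frequency(holdout_positive_scores)
suspects = flag_suspect_trips(vacant_scores, c)
print(suspects)
```

The correction relies on the assumption that tagged 'occupied' trips are a random sample of all truly occupied trips, which may not hold if off-meter rides differ systematically from metered ones.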

##### References

••

TL;DR: Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.

Abstract: LIBSVM is a library for Support Vector Machines (SVMs). We have been actively developing this package since the year 2000. The goal is to help users to easily apply SVM to their applications. LIBSVM has gained wide popularity in machine learning and many other areas. In this article, we present all implementation details of LIBSVM. Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.

40,826 citations

••

Bell Labs

TL;DR: High generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated and the performance of the support-vector network is compared to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.

Abstract: The support-vector network is a new learning machine for two-group classification problems. The machine conceptually implements the following idea: input vectors are non-linearly mapped to a very high-dimension feature space. In this feature space a linear decision surface is constructed. Special properties of the decision surface ensure high generalization ability of the learning machine. The idea behind the support-vector network was previously implemented for the restricted case where the training data can be separated without errors. We here extend this result to non-separable training data. High generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated. We also compare the performance of the support-vector network to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.

37,861 citations

••

01 Jul 1992

TL;DR: A training algorithm that maximizes the margin between the training patterns and the decision boundary is presented; it is applicable to a wide variety of classification functions, including Perceptrons, polynomials, and Radial Basis Functions.

Abstract: A training algorithm that maximizes the margin between the training patterns and the decision boundary is presented. The technique is applicable to a wide variety of classification functions, including Perceptrons, polynomials, and Radial Basis Functions. The effective number of parameters is adjusted automatically to match the complexity of the problem. The solution is expressed as a linear combination of supporting patterns. These are the subset of training patterns that are closest to the decision boundary. Bounds on the generalization performance based on the leave-one-out method and the VC-dimension are given. Experimental results on optical character recognition problems demonstrate the good generalization obtained when compared with other learning algorithms.

11,211 citations

••

TL;DR: In this article, the authors study the trajectory of 100,000 anonymized mobile phone users whose position is tracked for a six-month period and find that the individual travel patterns collapse into a single spatial probability distribution, indicating that humans follow simple reproducible patterns.

Abstract: The mapping of large-scale human movements is important for urban planning, traffic forecasting and epidemic prevention. Work in animals had suggested that their foraging might be explained in terms of a random walk, a mathematical rendition of a series of random steps, or a Levy flight, a random walk punctuated by occasional larger steps. The role of Levy statistics in animal behaviour is much debated (as explained in an accompanying News Feature), but the idea of extending it to human behaviour was boosted by a report in 2006 of Levy flight-like patterns in human movement tracked via dollar bills. A new human study, based on tracking the trajectory of 100,000 cell-phone users for six months, reveals behaviour close to a Levy pattern, but deviating from it as individual trajectories show a high degree of temporal and spatial regularity: work and other commitments mean we are not as free to roam as a foraging animal. But by correcting the data to accommodate individual variation, simple and predictable patterns in human travel begin to emerge. The cover photo (by Cesar Hidalgo) captures human mobility in New York's Grand Central Station. This study used a sample of 100,000 mobile phone users whose trajectory was tracked for six months to study human mobility patterns. Displacements across all users suggest behaviour close to the Levy-flight-like pattern observed previously based on the motion of marked dollar bills, but with a cutoff in the distribution. The origin of the Levy patterns observed in the aggregate data appears to be population heterogeneity and not Levy patterns at the level of the individual. Despite their importance for urban planning [1], traffic forecasting [2] and the spread of biological [3,4,5] and mobile viruses [6], our understanding of the basic laws governing human motion remains limited owing to the lack of tools to monitor the time-resolved location of individuals. Here we study the trajectory of 100,000 anonymized mobile phone users whose position is tracked for a six-month period. We find that, in contrast with the random trajectories predicted by the prevailing Levy flight and random walk models [7], human trajectories show a high degree of temporal and spatial regularity, each individual being characterized by a time-independent characteristic travel distance and a significant probability to return to a few highly frequented locations. After correcting for differences in travel distances and the inherent anisotropy of each trajectory, the individual travel patterns collapse into a single spatial probability distribution, indicating that, despite the diversity of their travel history, humans follow simple reproducible patterns. This inherent similarity in travel patterns could impact all phenomena driven by human mobility, from epidemic prevention to emergency response, urban planning and agent-based modelling.

5,514 citations

••

TL;DR: In this paper, the authors propose a method to estimate a function f that is positive on S and negative on the complement of S. The functional form of f is given by a kernel expansion in terms of a potentially small subset of the training data; it is regularized by controlling the length of the weight vector in an associated feature space.

Abstract: Suppose you are given some data set drawn from an underlying probability distribution P and you want to estimate a "simple" subset S of input space such that the probability that a test point drawn from P lies outside of S equals some a priori specified value between 0 and 1. We propose a method to approach this problem by trying to estimate a function f that is positive on S and negative on the complement. The functional form of f is given by a kernel expansion in terms of a potentially small subset of the training data; it is regularized by controlling the length of the weight vector in an associated feature space. The expansion coefficients are found by solving a quadratic programming problem, which we do by carrying out sequential optimization over pairs of input patterns. We also provide a theoretical analysis of the statistical performance of our algorithm. The algorithm is a natural extension of the support vector algorithm to the case of unlabeled data.

4,397 citations