Proceedings Article

Effects of Clustering Feature Vectors on Bus Travel Time Prediction: A Case Study

TLDR
In this article, the authors analyzed the use of different feature vectors for clustering and the effect on travel time predictions and showed that the prediction accuracy is highest when only travel times are used as a clustering feature vector.
Abstract
The accuracy of travel time predictions depends on both the inputs supplied and the prediction algorithm used. Clustering algorithms can identify patterns in the data, which can improve the inputs to the prediction algorithm. The feature vectors used for clustering strongly affect the clusters formed and, ultimately, the prediction performance. Because clustering is an unsupervised learning technique, the correctness of the clusters formed cannot be evaluated directly. A possible solution is to link the problem to prediction accuracy and choose the feature vector combination that yields the highest prediction accuracy. The present study analyses the use of different feature vectors for clustering and their effect on travel time predictions. Three cases are studied: (1) travel time alone; (2) travel time together with time of the day, section index, and day of the week, all treated as numerical features; and (3) the same features treated as a mix of categorical and numerical features. The effect of using each case as the clustering feature vector on travel time predictions is evaluated. It is observed that the prediction accuracy is highest when only travel times are used as the clustering feature vector. The study demonstrates the importance of choosing the correct feature vectors for clustering and its effect on a final application, namely, travel time prediction.
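The idea of judging a clustering feature vector by the prediction accuracy it leads to can be sketched as follows. This is a minimal illustration and not the authors' method: the abstract names neither the clustering algorithm, the predictor, nor the error metric, so plain k-means, a cluster-mean predictor, mean absolute percentage error (MAPE), the synthetic trip generator, and all function and field names (`make_trips`, `mape_for`, `prev_tt`, `next_tt`) are assumptions. Only the travel-time-only and all-numerical cases are shown; the mixed categorical and numerical case would need a mixed-data clustering method and is omitted.

```python
import math
import random


def nearest(p, centroids):
    """Index of the centroid closest to point p (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on lists of floats; returns the centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        for j, g in enumerate(groups):
            if g:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(g) for col in zip(*g)]
    return centroids


def make_trips(n, seed):
    """Synthetic trips (an assumption, not data from the study)."""
    rng = random.Random(seed)
    trips = []
    for _ in range(n):
        hour = rng.randrange(6, 22)
        section = rng.randrange(0, 10)
        dow = rng.randrange(0, 7)
        base = 60 + 8 * section  # nominal section travel time in seconds
        peak = 1.5 if hour in (8, 9, 17, 18) and dow < 5 else 1.0
        trips.append({
            "hour": hour, "section": section, "dow": dow,
            "prev_tt": base * peak * rng.uniform(0.9, 1.1),  # known input
            "next_tt": base * peak * rng.uniform(0.9, 1.1),  # target
        })
    return trips


def mape_for(features, train, test, k=6):
    """Cluster on `features`, predict with the cluster mean, return MAPE (%)."""
    X = [features(t) for t in train]
    centroids = kmeans(X, k)
    sums, counts = [0.0] * k, [0] * k
    for t, x in zip(train, X):
        j = nearest(x, centroids)
        sums[j] += t["next_tt"]
        counts[j] += 1
    overall = sum(t["next_tt"] for t in train) / len(train)
    means = [s / c if c else overall for s, c in zip(sums, counts)]
    errors = [
        abs(means[nearest(features(t), centroids)] - t["next_tt"]) / t["next_tt"]
        for t in test
    ]
    return 100 * sum(errors) / len(errors)


train = make_trips(600, seed=1)
test = make_trips(200, seed=2)

# Candidate clustering feature vectors (features are left unscaled for brevity).
cases = {
    "travel_time_only": lambda t: [t["prev_tt"]],
    "all_numeric": lambda t: [t["prev_tt"], float(t["hour"]),
                              float(t["section"]), float(t["dow"])],
}

results = {name: mape_for(f, train, test) for name, f in cases.items()}
for name, value in results.items():
    print(f"{name}: MAPE = {value:.2f}%")
```

The feature vector with the lowest error on held-out trips would be the one selected. Which case wins on this synthetic data says nothing about the study's field data.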

