Effects of Clustering Feature Vectors on Bus Travel Time Prediction: A Case Study

doi:10.1109/COMSNETS51098.2021.9352855

Proceedings Article

pp. 741-746

TLDR

The authors analyze how the choice of clustering feature vector affects bus travel time prediction and show that prediction accuracy is highest when travel time alone is used as the clustering feature vector.

Abstract:

The accuracy of travel time predictions depends on both the inputs provided and the prediction algorithm used. Clustering algorithms can identify patterns in the data, which can improve the inputs to the prediction algorithm. The feature vectors used for clustering greatly affect the clusters formed and, ultimately, the prediction performance. Because clustering is an unsupervised learning technique, the accuracy or correctness of the clusters formed cannot be evaluated directly. A possible solution is to link the problem to prediction accuracy and choose the feature vector combination that yields the highest prediction accuracy. The present study analyses the use of different feature vectors for clustering and their effect on travel time predictions. Three cases are studied: (1) travel time alone; (2) travel time together with features such as time of the day, section index, and day of the week, all treated as numerical features; and (3) the same features treated as a mix of categorical and numerical features. The effect of using each case as the clustering feature vector on travel time predictions is evaluated. Prediction accuracy is observed to be highest when only travel times are used as the clustering feature vector. The study demonstrates the importance of choosing the correct feature vectors for clustering and its effect on a final application, namely, travel time prediction.
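
The idea of scoring a clustering feature vector by its downstream prediction accuracy can be sketched roughly as follows. This is a minimal illustration, not the authors' method: the synthetic data, the field names (`prev_tt`, `next_tt`, `hour`), the plain k-means routine, the cluster-mean predictor, and the use of MAPE as the error measure are all assumptions made for the example. It covers only numerical feature vectors; the mixed categorical and numerical case would need a different distance measure.

```python
import random

def nearest(p, centroids):
    # Index of the centroid closest to p (squared Euclidean distance).
    return min(range(len(centroids)),
               key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))

def kmeans(points, k, iters=20):
    # Plain Lloyd's k-means with a deterministic start from evenly spaced sorted points.
    pts = sorted(points)
    step = max(1, len(pts) // k)
    centroids = [list(pts[min(i * step, len(pts) - 1)]) for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        for j, g in enumerate(groups):
            if g:  # keep the old centroid if a cluster becomes empty
                centroids[j] = [sum(col) / len(g) for col in zip(*g)]
    return centroids

def mape_for(feature_fn, train, test, k=3):
    # Cluster the training rows on the chosen features, then predict each test
    # row's next travel time as the mean next travel time of its cluster.
    feats = [feature_fn(r) for r in train]
    centroids = kmeans(feats, k)
    sums, counts = [0.0] * k, [0] * k
    for r, f in zip(train, feats):
        j = nearest(f, centroids)
        sums[j] += r["next_tt"]
        counts[j] += 1
    overall = sum(r["next_tt"] for r in train) / len(train)
    errors = []
    for r in test:
        j = nearest(feature_fn(r), centroids)
        pred = sums[j] / counts[j] if counts[j] else overall
        errors.append(abs(pred - r["next_tt"]) / r["next_tt"])
    return 100.0 * sum(errors) / len(errors)

def make_data(n, seed):
    # Hypothetical data: three traffic regimes with noisy travel times in seconds.
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        base = rng.choice([60.0, 120.0, 240.0])
        rows.append({"hour": rng.uniform(6, 22),
                     "prev_tt": base + rng.gauss(0, 5),
                     "next_tt": base + rng.gauss(0, 5)})
    return rows

# Candidate clustering feature vectors (illustrative, numerical only).
candidates = {
    "travel_time_only": lambda r: (r["prev_tt"],),
    "travel_time_and_hour": lambda r: (r["prev_tt"], r["hour"]),
}

train, test = make_data(300, seed=0), make_data(100, seed=1)
scores = {name: mape_for(fn, train, test) for name, fn in candidates.items()}
best = min(scores, key=scores.get)  # feature vector with the lowest prediction error
print(scores, best)
```

Which candidate scores best here depends entirely on the synthetic data and says nothing about the results reported in the study. The point is only the selection loop: cluster quality is never measured directly, and each feature vector is judged by the prediction error it leads to.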
