Author

Lalita Thakali

Bio: Lalita Thakali is an academic researcher at the University of Waterloo. The author has contributed to research on nonparametric statistics and kernel regression. The author has an h-index of 6 and has co-authored 15 publications that have received 182 citations.

Papers
Journal ArticleDOI
TL;DR: A case study using historical crash data collected between 2003 and 2007 in Hennepin County, Minnesota, found that the kriging method outperformed the KDE method in its ability to detect hotspots for all four tested groups of crash data covering different times of day.
Abstract: This paper presents a study comparing the outcomes of two geostatistical approaches, kernel density estimation (KDE) and kriging, for identifying crash hotspots in a road network. Because it locates high-risk sites for potential intervention, hotspot identification is an integral component of any comprehensive road safety management program. A case study was conducted with historical crash data collected between 2003 and 2007 in Hennepin County, Minnesota, U.S. The two methods were evaluated on the basis of a prediction accuracy index (PAI) and a comparison of hotspot rankings. Based on the PAI measure, the kriging method outperformed the KDE method in its ability to detect hotspots for all four tested groups of crash data covering different times of day. Furthermore, the lists of hotspots identified by the two methods were found to be moderately different, indicating the importance of selecting the right geostatistical method for hotspot identification. Although the comparison presented here is limited to one case study, the findings show the promise of the kriging technique for road safety analysis.
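The abstract above names the prediction accuracy index (PAI) but does not define it. As a rough illustration only, the sketch below assumes a commonly used definition: the share of crashes captured by the flagged hotspots divided by the share of the study area (or network length) that those hotspots cover. The paper's exact formulation may differ, and the function and parameter names are hypothetical.

```python
def prediction_accuracy_index(crashes_in_hotspots, total_crashes,
                              hotspot_size, total_size):
    """PAI under an ASSUMED definition: hit rate / size share.

    hotspot_size and total_size can be areas or road lengths,
    as long as both use the same unit.
    """
    if total_crashes <= 0 or total_size <= 0 or hotspot_size <= 0:
        raise ValueError("totals and hotspot size must be positive")
    hit_rate = crashes_in_hotspots / total_crashes  # share of crashes captured
    size_share = hotspot_size / total_size          # share of network flagged
    return hit_rate / size_share


# Hypothetical numbers: 30 of 100 crashes fall in hotspots covering 5% of the network.
pai = prediction_accuracy_index(30, 100, 5.0, 100.0)
```

Under this definition, a higher PAI means the method captures more crashes per unit of flagged network, which is the sense in which one method can be said to outperform another.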

112 citations

Journal ArticleDOI
TL;DR: A framework to benchmark different regions in terms of a selected safety performance measure and employs a multilevel Bayesian heteroskedastic Poisson lognormal model with grouped random parameters allowing heterogeneity in both mean and variance parameters to overcome unobserved heterogeneity.

46 citations

Journal ArticleDOI
TL;DR: The experimental results have shown that a DBN model could be trained with different crash datasets with prediction performance being at least comparable to that of the locally calibrated negative binomial (NB) model.
Abstract: This paper explores the idea of applying a machine learning approach to develop a global road safety performance function (SPF) that can be used to predict the expected crash frequencies of different highways from different regions. A deep belief network (DBN), one of the most popular deep learning models, is introduced as an alternative to the traditional regression models for crash modelling. An extensive empirical study is conducted using three real-world crash data sets covering six classes of highways as defined by location (urban vs. rural), number of lanes, access control, and region. The study involves a number of experiments addressing several critical questions about the performance of the DBN relative to the traditional regression models in terms of network structure, training method, data size, and generalization ability. The experimental results show that a DBN model can be trained with different crash datasets, with prediction performance at least comparable to that of the locally calibrated negative binomial (NB) model.

32 citations

01 Jan 2016
TL;DR: Two popular techniques are compared, negative binomial models for the parametric approach and kernel regression for the nonparametric counterpart; the kernel regression method outperforms the model-based approach in predictive performance, and that advantage increases noticeably as the data available for calibration grow.
Abstract: Crash data for road safety analysis and modeling are growing steadily in size and completeness owing to recent advances in information technologies. This increased availability of large datasets has generated renewed interest in applying data-driven nonparametric approaches as an alternative to the traditional parametric models for crash risk prediction. This paper investigates how the relative performance of these two approaches changes as crash data grow. The authors focus on comparing two popular techniques: negative binomial (NB) models for the parametric approach and kernel regression (KR) for the nonparametric counterpart. Using two large crash datasets, the authors investigate the performance of these two methods as a function of the amount of training data. Through a rigorous bootstrapping validation process, the study found that the two approaches exhibit strikingly different patterns, especially in their sensitivity to data size. The kernel regression method outperforms the model-based NB approach in predictive performance, and that advantage increases noticeably as the data available for calibration grow. With the arrival of the Big Data era, and the added benefits of enabling automated road safety analysis and improved responsiveness to the latest safety issues, nonparametric techniques (especially modern machine learning approaches) could be included among the important tools for road safety studies.
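For readers unfamiliar with kernel regression, the following is a minimal sketch of one standard form, the Nadaraya-Watson estimator with a Gaussian kernel. The abstract does not state which kernel, bandwidth, or covariates the authors used, so the single covariate, the bandwidth value, and all names here are illustrative assumptions rather than the paper's method.

```python
import math


def gaussian_kernel(u):
    """Unnormalized Gaussian kernel; the constant cancels in the ratio below."""
    return math.exp(-0.5 * u * u)


def kernel_regression(x_train, y_train, x_query, bandwidth):
    """Nadaraya-Watson estimate: a kernel-weighted average of observed y values."""
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    weights = [gaussian_kernel((x_query - x) / bandwidth) for x in x_train]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("query point is too far from all training points")
    return sum(w * y for w, y in zip(weights, y_train)) / total


# Hypothetical data: traffic volume (thousands of vehicles/day) vs. yearly crash count.
volumes = [5.0, 10.0, 15.0, 20.0]
crashes = [1.0, 3.0, 4.0, 8.0]
estimate = kernel_regression(volumes, crashes, x_query=12.0, bandwidth=4.0)
```

Because the prediction is a local average of nearby observations, no functional form is imposed, which is consistent with the abstract's point that such methods benefit as more calibration data become available.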

10 citations

Dissertation
16 Aug 2016

8 citations


Cited by
01 Jan 2005
TL;DR: The results illustrate that the Empirical Bayes technique significantly outperforms ranking and confidence interval techniques (with certain caveats) and false positives and negatives are inversely related.
Abstract: Identifying crash “hotspots”, “blackspots”, “sites with promise”, or “high risk” locations is standard practice in departments of transportation throughout the US. The literature is replete with the development and discussion of statistical methods for hotspot identification (HSID). Theoretical derivations and empirical studies have been used to weigh the benefits of various HSID methods; however, only a small number of studies have used controlled experiments to systematically assess various methods. Using experimentally derived simulated data, which are argued to be superior to empirical data, three hotspot identification methods observed in practice are evaluated: simple ranking, confidence interval, and Empirical Bayes. With simulated data, sites with promise are known a priori, in contrast to empirical data, where high risk sites are not known for certain. To conduct the evaluation, properties of observed crash data are used to generate simulated crash frequency distributions at hypothetical sites. A variety of factors are manipulated to simulate a host of ‘real world’ conditions. Various levels of confidence are explored, and false positives (identifying a safe site as high risk) and false negatives (identifying a high risk site as safe) are compared across methods. Finally, the effects of crash history duration in the three HSID approaches are assessed. The results illustrate that the Empirical Bayes technique significantly outperforms ranking and confidence interval techniques (with certain caveats). As found by others, false positives and negatives are inversely related. Three years of crash history appears, in general, to provide an appropriate crash history duration.
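The evaluation described above relies on knowing the truly high-risk sites in advance, so that each method's errors can be counted directly. A minimal sketch of that counting step, using the abstract's own definitions of false positives and false negatives, might look like the following; the site identifiers and function name are hypothetical, and the study's simulation design is not reproduced here.

```python
def false_positive_negative_counts(true_high_risk, flagged):
    """Count errors of a hotspot identification method against known truth.

    False positive: a safe site flagged as high risk.
    False negative: a high risk site not flagged.
    """
    true_set = set(true_high_risk)
    flagged_set = set(flagged)
    false_positives = len(flagged_set - true_set)
    false_negatives = len(true_set - flagged_set)
    return false_positives, false_negatives


# Hypothetical example: sites A, B, C are truly high risk; a method flags B, C, D.
fp, fn = false_positive_negative_counts({"A", "B", "C"}, {"B", "C", "D"})
```

Repeating this count across methods and confidence levels gives the kind of comparison the abstract reports.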

261 citations

Journal ArticleDOI
TL;DR: A new approach to quantify surface UHII (SUHII) using the relationship between MODIS land surface temperature (LST) and impervious surface areas (ISA) is proposed and verified against finer-resolution Landsat data to demonstrate its reliability.

185 citations

01 Jan 2007
TL;DR: In this paper, a series of models were compared using data collected on rural frontage roads in Texas and the results showed that both types of neural network models perform better than the NB regression model in terms of statistical fit and prediction.
Abstract: Statistical models have frequently been used in highway safety studies. They can be utilized for various purposes, including establishing relationships between variables, screening covariates and predicting values. Generalized linear models (GLM), and more recently hierarchical Bayes models (HBM), have been the most common types of model favored by transportation safety analysts. Over the last few years, researchers have proposed the back-propagation neural network (BPNN) model for modeling the phenomenon under study. Compared to GLMs and HBMs, BPNNs have received much less attention in highway safety modeling. The reasons are attributed to the complexity of estimating this kind of model as well as the problem of “over-fitting” the data. To circumvent the latter problem, some statisticians have proposed the use of Bayesian neural network (BNN) models. These models have been shown to perform better than BPNN models while at the same time reducing the difficulty associated with over-fitting the data. The objective of this study is to evaluate the application of BNN models for predicting motor vehicle crashes. To accomplish this objective, a series of models were estimated using data collected on rural frontage roads in Texas. Three types of models were compared: BPNN, BNN and the traditional Negative Binomial (NB) regression models. The results of this study show that both types of neural network models perform better than the NB regression model in terms of statistical fit and prediction. Although the BPNN model provides a better statistical fit than the other two models, its prediction performance is consistently worse than that of the BNN model, which suggests that the BNN model effectively alleviates the over-fitting problem and has better generalization abilities than the BPNN model.
The results also show that BNNs could be used for other useful analyses in highway safety, including the development of accident modification factors and for improving the prediction capabilities for evaluating different highway design alternatives.

178 citations

Journal ArticleDOI
TL;DR: The findings show that the automotive industry is leading the adoption of machine learning algorithms for risk assessment, and that artificial neural networks are the most widely applied machine learning method in engineering risk assessment.

162 citations

Journal ArticleDOI
TL;DR: This paper discusses the issues involved in this trade-off with regard to specific methodological alternatives and gives researchers a better understanding of the trade-offs they are often implicitly making in their analyses.

149 citations