
Showing papers by "Ramez Elmasri" published in 2020


Journal ArticleDOI
TL;DR: This paper reviews existing studies on storing, managing, and analyzing trajectory data with data warehouse technologies, proposes a framework that sets out the requirements for building a Trajectory Data Warehouse (TDW), and discusses applications that use the TDW.
Abstract: Advanced technologies in location acquisition allow us to track the movement of moving objects (people, planes, vehicles, animals, ships, ...) in geographical space. These technologies generate a vast amount of trajectory data (TD). Several applications in different fields can utilize such TD, for example, traffic management control, social behavior analysis, wildlife migrations and movements, ship trajectories, shoppers' behavior in a mall, facial nerve trajectory, location-based services, and many others. Trajectory data is mainly handled with either Moving Object Databases (MOD) or a Trajectory Data Warehouse (TDW). In this paper, we review existing studies on storing, managing, and analyzing TD using data warehouse technologies. We propose a framework that aims to provide the requirements for building the TDW. Furthermore, we discuss different applications using the TDW and how these applications utilize it. We address some issues with existing TDWs and discuss future work in this field.

17 citations


Journal ArticleDOI
17 Sep 2020
TL;DR: Using the authors' SQL-based spatio-temporal application, ‘hydrological rainstorm analysis’, as an original example, this paper shows how analysis and mining tasks can be performed on the conceptual storm stored in a spatio-temporal relational database (RDB).
Abstract: Spatio-temporal data serves as a foundation for most location-based applications nowadays. To handle spatio-temporal data, an appropriate methodology needs to be properly followed, in which space a...

9 citations


Proceedings ArticleDOI
30 Jun 2020
TL;DR: This paper proposes an approach for selecting a recidivism prediction model that increases accuracy while reducing race-based bias. Several models are built, each combining the offender's current-arrest criminal information with past criminal history accrued over a different number of prior arrest cycles.
Abstract: Recidivism, the propensity of convicts to reoffend after release from prison on parole, is a domain that has both benefited from machine-learning-based decision support systems and experienced race-based bias in the predictions. In this paper, we propose an approach to select a model to increase recidivism prediction accuracy while reducing race-based bias. We built several models that each use offenders' current-arrest criminal information and accrue past criminal history information based on different numbers of prior arrest cycles. We then monitor accuracy and inherent bias in the results for different subpopulations. Finally, from the various criminal history-based models developed, we use False Positive Rate Parity to select the one that offers minimal bias across subpopulations while increasing accuracy. The approach allows adaptation to the diversity in training data, different crime types, and varied length of prior arrest cycle history. We assessed model prediction performance using ten independent iterations of Monte Carlo cross-validation.

3 citations
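
The selection step described in the abstract above can be illustrated with a small sketch. It assumes that "False Positive Rate Parity" means choosing the candidate model whose false positive rates differ least across subpopulations, with accuracy as a tie-breaker; the abstract does not spell out the exact rule. The function names, model names, and toy data below are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence


def false_positive_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """FPR = FP / (FP + TN); returns 0.0 if the group has no negative examples."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return fp / (fp + tn) if (fp + tn) else 0.0


def fpr_gap(y_true: Sequence[int], y_pred: Sequence[int], groups: Sequence[str]) -> float:
    """Largest difference in FPR between any two subpopulations."""
    rates = []
    for g in set(groups):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        rates.append(false_positive_rate([y_true[i] for i in idx],
                                         [y_pred[i] for i in idx]))
    return max(rates) - min(rates)


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def select_model(predictions: Dict[str, List[int]],
                 y_true: Sequence[int],
                 groups: Sequence[str]) -> str:
    """Assumed rule: smallest FPR gap first, then highest accuracy."""
    return min(
        predictions,
        key=lambda name: (fpr_gap(y_true, predictions[name], groups),
                          -accuracy(y_true, predictions[name])),
    )


# Toy data (hypothetical): 1 = reoffended, 0 = did not reoffend.
y_true = [0, 0, 1, 1, 0, 0, 1, 1]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
predictions = {
    "history_1_cycle": [1, 0, 1, 1, 0, 0, 1, 0],   # one false positive, in group "a"
    "history_3_cycles": [0, 0, 1, 1, 0, 0, 1, 0],  # no false positives in either group
}
best = select_model(predictions, y_true, groups)
```

Here `best` is the model trained with the longer history, because its false positive rate is the same in both groups. In practice the per-group rates would be averaged over the cross-validation iterations rather than computed on a single split.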


Proceedings ArticleDOI
01 Jul 2020
TL;DR: SimsterQ is a clustering-based question-answering system that makes use of word vectors; it returned answers similar to the review snippets from the Amazon QA Dataset, as measured by cosine similarity.
Abstract: In recent years, there has been an increase in online shopping, resulting in an increased number of online reviews. Customers cannot delve into the huge amount of data when they are looking for specific aspects of a product. Some of these aspects can be extracted from the product reviews. In this paper, we introduce SimsterQ, a clustering-based system for answering questions that makes use of word vectors. Clustering was performed using cosine similarity scores between sentence vectors of reviews and questions. Two variants (Sim and Median), with and without stopwords, were evaluated against traditional methods that use term frequency. We also used an n-gram approach to study the effect of noise. We used the reviews in the Amazon Reviews dataset to pick the answers. Evaluation was performed both at the individual sentence level, using the top sentence from Okapi BM25 as the gold standard, and at the whole answer level, using review snippets as the gold standard. At the sentence level, our system performed slightly better than a more complicated deep learning method. Our system returned answers similar to the review snippets from the Amazon QA Dataset, as measured by cosine similarity. Analysis was also performed on the quality of the clusters generated by our system.

2 citations
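
The similarity scoring mentioned in the abstract above (cosine similarity between sentence vectors of reviews and questions) can be sketched as follows. This is only the scoring step, not SimsterQ's clustering. It assumes a sentence vector is the average of its word vectors, which the abstract does not confirm, and the tiny three-dimensional embedding table and function names are hypothetical.

```python
import math
from typing import Dict, List

# Hypothetical toy word vectors; a real system would load pretrained embeddings.
WORD_VECTORS: Dict[str, List[float]] = {
    "battery": [0.9, 0.1, 0.0],
    "life":    [0.8, 0.2, 0.1],
    "lasts":   [0.7, 0.3, 0.0],
    "long":    [0.6, 0.2, 0.1],
    "screen":  [0.0, 0.9, 0.2],
    "bright":  [0.1, 0.8, 0.3],
}
DIM = 3


def sentence_vector(sentence: str) -> List[float]:
    """Average the vectors of known words (assumed pooling); zero vector if none are known."""
    tokens = [t for t in sentence.lower().split() if t in WORD_VECTORS]
    if not tokens:
        return [0.0] * DIM
    return [sum(WORD_VECTORS[t][d] for t in tokens) / len(tokens) for d in range(DIM)]


def cosine_similarity(u: List[float], v: List[float]) -> float:
    """cos(u, v) = (u . v) / (|u| |v|); defined as 0.0 when either vector is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_reviews(question: str, reviews: List[str]) -> List[str]:
    """Order review sentences from most to least similar to the question."""
    q = sentence_vector(question)
    return sorted(reviews,
                  key=lambda r: cosine_similarity(q, sentence_vector(r)),
                  reverse=True)


question = "battery life"
reviews = ["screen bright", "battery lasts long"]
ranked = rank_reviews(question, reviews)
```

With these toy vectors, the review sentence about the battery ranks first for the battery question even though it shares only one word with it, which is the kind of match that term-frequency methods can miss.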


Journal ArticleDOI
14 Nov 2020
TL;DR: A fairness measure called the Bias Parity (BP) score is proposed to quantify the decrease in bias in prediction models; it leverages an existing intuition of bias awareness and summarizes it in a single measure.
Abstract: Machine learning-based decision support systems bring relief and support to the decision-maker in many domains such as loan application acceptance, dating, hiring, granting parole, insurance coverage, and medical diagnoses. These support systems facilitate processing tremendous amounts of data to decipher the patterns embedded in them. However, these decisions can also absorb and amplify bias embedded in the data. To address this, the work presented in this paper introduces a new fairness measure as well as an enhanced, feature-rich representation derived from the temporal aspects in the data set that permits the selection of the lowest bias model among the set of models learned on various versions of the augmented feature set. Specifically, our approach uses neural networks to forecast recidivism from many unique feature-rich models created from the same raw offender dataset. We create multiple records from one summarizing criminal record per offender in the raw dataset. This is achieved by grouping each set of arrest-to-release information into a unique record. We use offenders’ criminal history, substance abuse, and treatments taken during imprisonment in different numbers of past arrests to enrich the input feature vectors for the prediction models generated. We propose a fairness measure called Bias Parity (BP) score to quantify the decrease in bias in the prediction models. BP score leverages an existing intuition of bias awareness and summarizes it in a single measure. We demonstrate how BP score can be used to quantify bias for a variety of statistical quantities and how to associate disparate impact with this measure. By using our feature enrichment approach, we could increase the accuracy of predicting recidivism for the same dataset from 77.8% in another study to 89.2% in the current study while achieving an improved BP score computed for average accuracy of 99.4, where a value of 100 means no bias for the two subpopulation groups compared. Moreover, an analysis of the accuracy and BP scores for various levels of our feature augmentation method shows consistent trends among scores for a range of fairness measures, illustrating the benefit of the method for picking fairer models without significant loss of accuracy.
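
The abstract does not give the formula for the BP score; it states only that the score compares two subpopulation groups on some statistical quantity and that 100 means no bias. One reading consistent with that description, offered here as an assumption rather than the authors' definition, is the ratio of the smaller group metric to the larger one, scaled to 100. The function name and the example values below are hypothetical.

```python
def bias_parity_score(metric_a: float, metric_b: float) -> float:
    """Assumed form: 100 * min(a, b) / max(a, b) for a non-negative metric.

    Equal metrics give 100 (no measured bias); the score falls toward 0
    as the metric for one group drops relative to the other.
    """
    if metric_a < 0 or metric_b < 0:
        raise ValueError("metrics must be non-negative")
    larger = max(metric_a, metric_b)
    if larger == 0:
        return 100.0  # both groups score 0 on the metric, so they are equal
    return 100.0 * (min(metric_a, metric_b) / larger)


# Hypothetical per-group values of some metric (e.g. accuracy), not the paper's data.
equal_groups = bias_parity_score(0.9, 0.9)
unequal_groups = bias_parity_score(0.5, 1.0)
```

Under this assumed form, the same function can be applied to any non-negative per-group quantity, such as accuracy, false positive rate, or false negative rate, which matches the abstract's statement that the score can quantify bias for a variety of statistical quantities.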