Author

Fazle Karim

Bio: Fazle Karim is an academic researcher from the University of Illinois at Chicago. The author has contributed to research on topics including dynamic time warping and convolutional neural networks. The author has an h-index of 8 and has co-authored 22 publications receiving 947 citations. Previous affiliations of Fazle Karim include the University of Illinois at Urbana–Champaign.

Papers
Journal ArticleDOI
TL;DR: This work proposes augmenting fully convolutional networks with long short-term memory recurrent neural network (LSTM RNN) sub-modules for time series classification, along with an attention mechanism and refinement as methods to enhance the performance of trained models.
Abstract: Fully convolutional neural networks (FCNs) have been shown to achieve state-of-the-art performance on the task of classifying time series sequences. We propose the augmentation of fully convolutional networks with long short-term memory recurrent neural network (LSTM RNN) sub-modules for time series classification. Our proposed models significantly enhance the performance of fully convolutional networks with a nominal increase in model size and require minimal preprocessing of the data set. The proposed long short-term memory fully convolutional network (LSTM-FCN) achieves state-of-the-art performance compared with existing methods. We also explore the usage of an attention mechanism to improve time series classification with the attention long short-term memory fully convolutional network (ALSTM-FCN). The attention mechanism allows one to visualize the decision process of the LSTM cell. Furthermore, we propose refinement as a method to enhance the performance of trained models. An overall analysis of the performance of our model is provided and compared with other techniques.
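
The following is a minimal sketch of an LSTM-FCN-style classifier of the kind described above, assuming Keras/TensorFlow; the filter counts, kernel sizes, LSTM units, and dropout rate are illustrative assumptions rather than the paper's exact configuration.

from tensorflow.keras import layers, models

def build_lstm_fcn(n_timesteps, n_classes, lstm_units=8):
    inp = layers.Input(shape=(n_timesteps, 1))

    # LSTM branch: the series is dimension-shuffled and fed to a small LSTM sub-module.
    x = layers.Permute((2, 1))(inp)
    x = layers.LSTM(lstm_units)(x)
    x = layers.Dropout(0.8)(x)

    # Fully convolutional branch: stacked Conv1D blocks with batch normalization.
    y = inp
    for filters, kernel in [(128, 8), (256, 5), (128, 3)]:
        y = layers.Conv1D(filters, kernel, padding='same')(y)
        y = layers.BatchNormalization()(y)
        y = layers.Activation('relu')(y)
    y = layers.GlobalAveragePooling1D()(y)

    # Concatenate both branches and classify with a softmax layer.
    out = layers.Dense(n_classes, activation='softmax')(layers.concatenate([x, y]))
    return models.Model(inp, out)

model = build_lstm_fcn(n_timesteps=140, n_classes=5)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])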

851 citations

Journal ArticleDOI
TL;DR: In this paper, the authors propose transforming the existing univariate time series classification models, the long short-term memory fully convolutional network (LSTM-FCN) and the attention LSTM-FCN (ALSTM-FCN), into a multivariate time series classification model by augmenting the fully convolutional block with a squeeze-and-excitation block to further improve accuracy.
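
A minimal sketch of a squeeze-and-excitation block applied to a 1-D convolutional feature map, of the kind mentioned in the summary, assuming Keras/TensorFlow; the reduction ratio is an illustrative assumption.

from tensorflow.keras import layers

def squeeze_excite_1d(feature_map, reduction=16):
    channels = feature_map.shape[-1]
    # Squeeze: global average pooling over the time dimension.
    s = layers.GlobalAveragePooling1D()(feature_map)
    # Excite: two dense layers produce per-channel recalibration weights.
    s = layers.Dense(channels // reduction, activation='relu')(s)
    s = layers.Dense(channels, activation='sigmoid')(s)
    s = layers.Reshape((1, channels))(s)
    # Rescale the original feature map channel by channel.
    return layers.multiply([feature_map, s])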

509 citations

Journal ArticleDOI
TL;DR: In this paper, a series of ablation tests (3627 experiments) on the LSTM-FCN and ALSTM-FCN is performed to provide a better understanding of the model and each of its sub-modules.
Abstract: Long short-term memory fully convolutional neural networks (LSTM-FCNs) and the attention LSTM-FCN (ALSTM-FCN) have been shown to achieve state-of-the-art performance on the task of classifying time series signals on the old University of California, Riverside (UCR) time series repository. However, there has been no study of why LSTM-FCN and ALSTM-FCN perform well. In this paper, we perform a series of ablation tests (3627 experiments) on the LSTM-FCN and ALSTM-FCN to provide a better understanding of the model and each of its sub-modules. The results of the ablation tests on the ALSTM-FCN and LSTM-FCN show that the LSTM and FCN blocks perform better when applied in a conjoined manner. Two z-normalizing techniques, z-normalizing each sample independently and z-normalizing the whole dataset, are compared using a Wilcoxon signed-rank test to show a statistical difference in performance. In addition, we provide an understanding of the impact that dimension shuffle has on the LSTM-FCN by comparing its performance with that of the LSTM-FCN when no dimension shuffle is applied. Finally, we demonstrate the performance of the LSTM-FCN when the LSTM block is replaced by a gated recurrent unit (GRU), a basic recurrent neural network (RNN), and a dense block.
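
A minimal sketch of the two z-normalization schemes compared in the paper, together with a Wilcoxon signed-rank test over paired accuracies; the array shapes and the accuracy values are illustrative assumptions.

import numpy as np
from scipy.stats import wilcoxon

def z_norm_per_sample(X):
    # X has shape (n_samples, n_timesteps); each series is normalized independently.
    mu = X.mean(axis=1, keepdims=True)
    sigma = X.std(axis=1, keepdims=True)
    return (X - mu) / (sigma + 1e-8)

def z_norm_whole_dataset(X):
    # A single mean and standard deviation computed over the entire dataset.
    return (X - X.mean()) / (X.std() + 1e-8)

# Paired accuracies of the same model under the two schemes (hypothetical values).
acc_per_sample = np.array([0.91, 0.84, 0.77, 0.95, 0.88])
acc_whole_set = np.array([0.89, 0.85, 0.74, 0.94, 0.86])
stat, p_value = wilcoxon(acc_per_sample, acc_whole_set)
print(f"Wilcoxon statistic={stat:.3f}, p={p_value:.3f}")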

129 citations

Proceedings ArticleDOI
03 Apr 2014
TL;DR: A prediction model based on the Bayesian network framework is introduced to forecast the academic performance (SAP) of engineering students, and it is shown to outperform conventional models in grade prediction.
Abstract: Predicting students' academic performance (SAP) provides invaluable information for the authorities of educational institutions. This information offers numerous opportunities for instructors and decision makers to improve their quality of services and consequently help students succeed in their education. In this paper, we introduce a prediction model to forecast the SAP of engineering students. The model is based on the Bayesian network framework and is constructed using a database of undergraduate engineering students at the University of Illinois at Chicago (UIC). The specific objective of this model is to predict students' grades in three major courses that most students take in their second semester. The grades in these courses have a major impact on student retention rates, as many students receive low grades in them. Therefore, predicting students' grades in these courses can be used to identify the students who might receive low grades and hence need extra help from the educational authorities. The proposed model has been tested against conventional models proposed in the literature and is shown to outperform them in grade prediction.
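
A minimal sketch of grade prediction with a discrete Bayesian network, assuming the pgmpy library; the variables, network structure, and records are illustrative assumptions, not the structure or data used in the paper.

import pandas as pd
from pgmpy.models import BayesianNetwork
from pgmpy.estimators import MaximumLikelihoodEstimator
from pgmpy.inference import VariableElimination

# Hypothetical discretized student records: prior GPA, math placement, target course grade.
data = pd.DataFrame({
    'prior_gpa':   ['high', 'high', 'low', 'low', 'high', 'low'],
    'placement':   ['pass', 'fail', 'fail', 'pass', 'pass', 'fail'],
    'calc2_grade': ['A', 'B', 'C', 'B', 'A', 'C'],
})

# Structure: both predictors influence the course grade.
model = BayesianNetwork([('prior_gpa', 'calc2_grade'), ('placement', 'calc2_grade')])
model.fit(data, estimator=MaximumLikelihoodEstimator)

# Query the grade distribution for a new student.
infer = VariableElimination(model)
print(infer.query(['calc2_grade'], evidence={'prior_gpa': 'low', 'placement': 'pass'}))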

45 citations

Journal ArticleDOI
TL;DR: A new approximation method for reducing the length of the time series given as input to DTW, called control chart approximation (CCA) after a similar concept used in statistical quality control, shows similar or better accuracy on the long time series in the experiments.
Abstract: Throughout recent years, dynamic time warping (DTW) has remained a robust similarity measure in time series classification (TSC). The 1-nearest neighbor (1-NN) algorithm with DTW is the most widely used classification method on time series and serves as a benchmark. With the increasing demand for TSC on low-resource devices and the spread of wearable devices, the need for an efficient and accurate time series classifier has never been higher. Although 1-NN DTW attains accurate results, it suffers in efficiency due to its quadratic complexity in the length of the time series. In this paper, we propose a new approximation method for reducing the length of the time series given as input to DTW. We call it control chart approximation (CCA), after a similar concept used in statistical quality control. The CCA representation approximates raw time series by transforming them into a set of segments with aggregated values and durations, forming a reduced 3-D vector. We also propose an adaptation of DTW in 3-D space as a distance measure for the 1-NN classifier, and denote the method 1-NN 3-D DTW. Our experiments on 85 datasets from the UCR archive, including 28 long-length (>500 points) time series datasets, show up to two orders of magnitude gain in running time compared to the state-of-the-art 1-NN DTW implementation. Moreover, the method shows similar or better accuracy on the long time series in the experiments.
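
A minimal sketch of a control-chart-style segmentation followed by DTW over the reduced representation; the control-limit rule and the choice of per-segment features (mean, spread, duration) are assumptions, not necessarily the paper's exact scheme.

import numpy as np

def cca_segments(series, k=1.5):
    # Start a new segment whenever the next point falls outside mean +/- k*std
    # of the current segment (a simple control-chart rule).
    segments, current = [], [series[0]]
    for value in series[1:]:
        mu, sigma = np.mean(current), np.std(current) + 1e-8
        if abs(value - mu) > k * sigma:
            segments.append(current)
            current = [value]
        else:
            current.append(value)
    segments.append(current)
    # Each segment becomes a 3-D point: aggregated value, spread, and duration.
    return np.array([[np.mean(s), np.std(s), len(s)] for s in segments])

def dtw_3d(a, b):
    # Classic DTW recursion with Euclidean distance between 3-D segment vectors.
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

x = cca_segments(np.sin(np.linspace(0, 6, 300)) + 0.05 * np.random.randn(300))
y = cca_segments(np.cos(np.linspace(0, 6, 300)))
print("3-D DTW distance:", dtw_3d(x, y))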

35 citations


Cited by
Journal ArticleDOI
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, handwriting recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations

Journal ArticleDOI
TL;DR: Long short-term memory (LSTM) networks, a deep learning approach, are presented to forecast future COVID-19 cases; the model predicts that the possible ending point of the outbreak will be around June 2020, and the transmission rates of Canada are compared with those of Italy and the USA.
Abstract: On March 11th, 2020, the World Health Organization (WHO) declared the 2019 novel coronavirus a global pandemic. The coronavirus, also known as COVID-19, originated in Wuhan, Hubei province, China around December 2019 and spread all over the world within a few weeks. Based on the public datasets provided by Johns Hopkins University and the Canadian health authority, we have developed a forecasting model of the COVID-19 outbreak in Canada using state-of-the-art Deep Learning (DL) models. In this research, we evaluated the key features to predict the trends and possible stopping time of the current COVID-19 outbreak in Canada and around the world. In this paper, we present long short-term memory (LSTM) networks, a deep learning approach, to forecast future COVID-19 cases. Based on the results of our LSTM network, we predict that the possible ending point of this outbreak will be around June 2020. In addition, we compare the transmission rates of Canada with those of Italy and the USA. We also present the 2nd-, 4th-, 6th-, 8th-, 10th-, 12th-, and 14th-day predictions for 2 successive days. Our forecasts in this paper are based on the data available until March 31, 2020. To the best of our knowledge, this is one of the few studies to use LSTM networks to forecast infectious diseases.
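
A minimal sketch of a univariate LSTM forecaster for daily case counts, assuming Keras/TensorFlow; the window length, layer sizes, and the synthetic series are illustrative assumptions, not the paper's data or configuration.

import numpy as np
from tensorflow.keras import layers, models

def make_windows(series, window=7):
    # Build sliding windows of past values and the next value as the target.
    X, y = [], []
    for i in range(len(series) - window):
        X.append(series[i:i + window])
        y.append(series[i + window])
    return np.array(X)[..., None], np.array(y)

# Hypothetical logistic-shaped cumulative-case curve used only to exercise the pipeline.
t = np.arange(120, dtype=np.float32)
cases = 1000.0 / (1.0 + np.exp(-(t - 60) / 10))
scaled = cases / cases.max()

X, y = make_windows(scaled, window=7)
model = models.Sequential([
    layers.Input(shape=(7, 1)),
    layers.LSTM(32),
    layers.Dense(1),
])
model.compile(optimizer='adam', loss='mse')
model.fit(X, y, epochs=20, verbose=0)

# One-step-ahead forecast from the last observed window.
next_day = model.predict(scaled[-7:][None, :, None], verbose=0)[0, 0] * cases.max()
print("forecast for the next day:", next_day)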

673 citations

Journal ArticleDOI
TL;DR: This work reviews the recent status of methodologies and techniques related to the construction of digital twins, mostly from a modeling perspective, to provide detailed coverage of the current challenges and enabling technologies along with recommendations and reflections for various stakeholders.
Abstract: A digital twin can be defined as a virtual representation of a physical asset, enabled through data and simulators, for real-time prediction, optimization, monitoring, control, and improved decision making. Recent advances in computational pipelines, multiphysics solvers, artificial intelligence, big data cybernetics, and data processing and management tools bring the promise of digital twins and their impact on society closer to reality. Digital twinning is now an important and emerging trend in many applications. Also referred to as a computational megamodel, device shadow, mirrored system, avatar, or synchronized virtual prototype, there can be no doubt that a digital twin plays a transformative role not only in how we design and operate cyber-physical intelligent systems, but also in how we advance the modularity of multi-disciplinary systems to tackle fundamental barriers not addressed by the current, evolutionary modeling practices. In this work, we review the recent status of methodologies and techniques related to the construction of digital twins, mostly from a modeling perspective. Our aim is to provide detailed coverage of the current challenges and enabling technologies along with recommendations and reflections for various stakeholders.

660 citations

Journal ArticleDOI
TL;DR: In this paper, the authors propose transforming the existing univariate time series classification models, the long short-term memory fully convolutional network (LSTM-FCN) and the attention LSTM-FCN (ALSTM-FCN), into a multivariate time series classification model by augmenting the fully convolutional block with a squeeze-and-excitation block to further improve accuracy.

509 citations

Journal ArticleDOI
15 Jul 2021 - PLOS ONE
TL;DR: A taxonomy is proposed that outlines the four families of time series data augmentation (transformation-based methods, pattern mixing, generative models, and decomposition methods) and their application to time series classification with neural networks.
Abstract: In recent times, deep artificial neural networks have achieved many successes in pattern recognition. Part of this success can be attributed to the reliance on big data to increase generalization. However, in the field of time series recognition, many datasets are often very small. One method of addressing this problem is through the use of data augmentation. In this paper, we survey data augmentation techniques for time series and their application to time series classification with neural networks. We propose a taxonomy and outline four families of time series data augmentation: transformation-based methods, pattern mixing, generative models, and decomposition methods. Furthermore, we empirically evaluate 12 time series data augmentation methods on 128 time series classification datasets with six different types of neural networks. Through the results, we are able to analyze the characteristics, advantages, and disadvantages of each data augmentation method and provide recommendations for its use. This survey aims to help in the selection of time series data augmentation for neural network applications.
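
A minimal sketch of two transformation-based augmentations from the taxonomy, jittering and magnitude warping; the noise level and the number of warping knots are illustrative assumptions.

import numpy as np

def jitter(x, sigma=0.03):
    # Add small Gaussian noise to every time step.
    return x + np.random.normal(0.0, sigma, size=x.shape)

def magnitude_warp(x, sigma=0.2, knots=4):
    # Multiply the series by a smooth random curve interpolated through a few knots.
    knot_positions = np.linspace(0, len(x) - 1, knots + 2)
    knot_values = np.random.normal(1.0, sigma, size=knots + 2)
    warp_curve = np.interp(np.arange(len(x)), knot_positions, knot_values)
    return x * warp_curve

series = np.sin(np.linspace(0, 4 * np.pi, 200))
augmented = [jitter(series), magnitude_warp(series)]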

198 citations