scispace - formally typeset
Search or ask a question
Author

Goce Trajcevski

Bio: Goce Trajcevski is an academic researcher from Iowa State University. The author has contributed to research in topics: Wireless sensor network & Computer science. The author has an hindex of 26, co-authored 186 publications receiving 4694 citations. Previous affiliations of Goce Trajcevski include Northwestern University & University of Illinois at Chicago.


Papers
More filters
Journal ArticleDOI
01 Aug 2008
TL;DR: An extensive set of time series experiments are conducted re-implementing 8 different representation methods and 9 similarity measures and their variants and testing their effectiveness on 38 time series data sets from a wide variety of application domains to provide a unified validation of some of the existing achievements.
Abstract: The last decade has witnessed a tremendous growths of interests in applications that deal with querying and mining of time series data. Numerous representation methods for dimensionality reduction and similarity measures geared towards time series have been introduced. Each individual work introducing a particular method has made specific claims and, aside from the occasional theoretical justifications, provided quantitative experimental observations. However, for the most part, the comparative aspects of these experiments were too narrowly focused on demonstrating the benefits of the proposed methods over some of the previously introduced ones. In order to provide a comprehensive validation, we conducted an extensive set of time series experiments re-implementing 8 different representation methods and 9 similarity measures and their variants, and testing their effectiveness on 38 time series data sets from a wide variety of application domains. In this paper, we give an overview of these different techniques and present our comparative experimental findings regarding their effectiveness. Our experiments have provided both a unified validation of some of the existing achievements, and in some cases, suggested that certain claims in the literature may be unduly optimistic.

1,387 citations

Journal ArticleDOI
TL;DR: An extensive experimental study re-implementing eight different time series representations and nine similarity measures and their variants and testing their effectiveness on 38 time series data sets from a wide variety of application domains gives an overview of these different techniques and presents comparative experimental findings regarding their effectiveness.
Abstract: The previous decade has brought a remarkable increase of the interest in applications that deal with querying and mining of time series data. Many of the research efforts in this context have focused on introducing new representation methods for dimensionality reduction or novel similarity measures for the underlying data. In the vast majority of cases, each individual work introducing a particular method has made specific claims and, aside from the occasional theoretical justifications, provided quantitative experimental observations. However, for the most part, the comparative aspects of these experiments were too narrowly focused on demonstrating the benefits of the proposed methods over some of the previously introduced ones. In order to provide a comprehensive validation, we conducted an extensive experimental study re-implementing eight different time series representations and nine similarity measures and their variants, and testing their effectiveness on 38 time series data sets from a wide variety of application domains. In this article, we give an overview of these different techniques and present our comparative experimental findings regarding their effectiveness. In addition to providing a unified validation of some of the existing achievements, our experiments also indicate that, in some cases, certain claims in the literature may be unduly optimistic.

747 citations

Journal ArticleDOI
TL;DR: This article proposes to model an uncertain trajectory as a three-dimensional (3D) cylindrical body and introduces a set of novel but natural spatio-temporal operators which capture the uncertainty and are used to express spatio/temporal range queries.
Abstract: This article addresses the problem of managing Moving Objects Databases (MODs) which capture the inherent imprecision of the information about the moving object's location at a given time. We deal systematically with the issues of constructing and representing the trajectories of moving objects and querying the MOD. We propose to model an uncertain trajectory as a three-dimensional (3D) cylindrical body and we introduce a set of novel but natural spatio-temporal operators which capture the uncertainty and are used to express spatio-temporal range queries. We devise and analyze algorithms for processing the operators and demonstrate that the model incorporates the uncertainty in a manner which enables efficient querying, thus striking a balance between the modeling power and computational efficiency. We address some implementation aspects which we experienced in our DOMINO project, as a part of which the operators that we introduce have been implemented. We also report on some experimental observations of a practical relevance.

303 citations

Reference EntryDOI
15 Sep 2008
TL;DR: The event-condition-action (ECA) paradigm to specify reactive behavior and its enabling tools and triggers which are available in commercial DBMS are discussed and the need for formal reasoning about active database behavior is motivated.
Abstract: Active database systems extend traditional database systems with the capability to react to various events, which can be either generated internally or externally. We discuss the event-condition-action (ECA) paradigm to specify reactive behavior and its enabling tools and triggers which are available in commercial DBMS. We motivate the need for formal reasoning about active database behavior and present the desirable global properties of termination and confluence. Then we discuss the various semantic dimensions of triggers that relate to their instantiation, condition evaluation, scheduling, and execution and we discuss their coupling with the underlying transaction model. The main features of prototype active database systems and of commercially available DBMS are reviewed. We introduce some of the challenges that face active database systems in novel, highly heterogeneous, distributed, and dynamic driven application domains such as workflow systems, streaming database, and moving objects databases. Finally, we give a brief overview of a new paradigm, (ECA)'2, which is geared toward settings in which both the event detection and the condition evaluation are done in distributed fashion and are continuous in nature. Keywords: active databases; triggers; event-condition-action; semantic dimensions; composite events; coupling; conflict resolution; children triggers

191 citations

Journal ArticleDOI
01 Sep 2006
TL;DR: In this article, a line simplification technique was proposed to reduce the size of the trajectories of mobile devices by adopting existing linguistic constructs to manage the uncertainty introduced by the trajectory approximation.
Abstract: A common way of storing spatio-temporal information about mobile devices is in the form of a 3D (2D geography + time) trajectory. We argue that when cellular phones and Personal Digital Assistants become location-aware, the size of the spatio-temporal information generated may prohibit efficient processing. We propose to adopt a technique studied in computer graphics, namely line-simplification, as an approximation technique to solve this problem. Line simplification will reduce the size of the trajectories. Line simplification uses a distance function in producing the trajectory approximation. We postulate the desiderata for such a distance-function: it should be sound, namely the error of the answers to spatio-temporal queries must be bounded. We analyze several distance functions, and prove that some are sound in this sense for some types of queries, while others are not. A distance function that is sound for all common spatio-temporal query types is introduced and analyzed. Then we propose an aging mechanism which gradually shrinks the size of the trajectories as time progresses. We also propose to adopt existing linguistic constructs to manage the uncertainty introduced by the trajectory approximation. Finally, we analyze experimentally the effectiveness of line-simplification in reducing the size of a trajectories database.

187 citations


Cited by
More filters
Journal ArticleDOI
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations

Journal ArticleDOI
TL;DR: The primary objective of this paper is to serve as a glossary for interested researchers to have an overall picture on the current time series data mining development and identify their potential research direction to further investigation.

1,358 citations

Journal ArticleDOI
Yu Zheng1
TL;DR: A systematic survey on the major research into trajectory data mining, providing a panorama of the field as well as the scope of its research topics, and introduces the methods that transform trajectories into other data formats, such as graphs, matrices, and tensors.
Abstract: The advances in location-acquisition and mobile computing techniques have generated massive spatial trajectory data, which represent the mobility of a diversity of moving objects, such as people, vehicles, and animals. Many techniques have been proposed for processing, managing, and mining trajectory data in the past decade, fostering a broad range of applications. In this article, we conduct a systematic survey on the major research into trajectory data mining, providing a panorama of the field as well as the scope of its research topics. Following a road map from the derivation of trajectory data, to trajectory data preprocessing, to trajectory data management, and to a variety of mining tasks (such as trajectory pattern mining, outlier detection, and trajectory classification), the survey explores the connections, correlations, and differences among these existing techniques. This survey also introduces the methods that transform trajectories into other data formats, such as graphs, matrices, and tensors, to which more data mining and machine learning techniques can be applied. Finally, some public trajectory datasets are presented. This survey can help shape the field of trajectory data mining, providing a quick understanding of this field to the community.

1,289 citations

Book
11 Jan 2013
TL;DR: Outlier Analysis is a comprehensive exposition, as understood by data mining experts, statisticians and computer scientists, and emphasis was placed on simplifying the content, so that students and practitioners can also benefit.
Abstract: With the increasing advances in hardware technology for data collection, and advances in software technology (databases) for data organization, computer scientists have increasingly participated in the latest advancements of the outlier analysis field. Computer scientists, specifically, approach this field based on their practical experiences in managing large amounts of data, and with far fewer assumptions the data can be of any type, structured or unstructured, and may be extremely large. Outlier Analysisis a comprehensive exposition, as understood by data mining experts, statisticians and computer scientists. The book has been organized carefully, and emphasis was placed on simplifying the content, so that students and practitioners can also benefit. Chapters will typically cover one of three areas: methods and techniques commonly used in outlier analysis, such as linear methods, proximity-based methods, subspace methods, and supervised methods; data domains, such as, text, categorical, mixed-attribute, time-series, streaming, discrete sequence, spatial and network data; and key applications of these methods as applied to diverse domains such as credit card fraud detection, intrusion detection, medical diagnosis, earth science, web log analytics, and social network analysis are covered.

1,278 citations