
Showing papers by "Yannis Theodoridis" published in 2009


Journal ArticleDOI
01 Jul 2009
TL;DR: It is shown that the proposed scheme can be efficiently and effectively applied for medical image retrieval from large databases, providing unsupervised semantic interpretation of the results, which can be further extended by knowledge representation methodologies.
Abstract: In this paper, we propose a novel scheme for efficient content-based medical image retrieval, formalized according to the PAtterns for Next generation DAtabase systems (PANDA) framework for pattern representation and management. The proposed scheme involves block-based low-level feature extraction from images followed by the clustering of the feature space to form higher-level, semantically meaningful patterns. The clustering of the feature space is realized by an expectation-maximization algorithm that uses an iterative approach to automatically determine the number of clusters. Then, the 2-component property of PANDA is exploited: the similarity between two clusters is estimated as a function of the similarity of both their structures and the measure components. Experiments were performed on a large set of reference radiographic images, using different kinds of features to encode the low-level image content. Through this experimentation, it is shown that the proposed scheme can be efficiently and effectively applied for medical image retrieval from large databases, providing unsupervised semantic interpretation of the results, which can be further extended by knowledge representation methodologies.

94 citations


Proceedings ArticleDOI
06 Dec 2009
TL;DR: This paper proposes an intuitionistic point vector representation of trajectories that encompasses the underlying uncertainty and introduces an effective distance metric to cope with uncertainty in TD clustering.
Abstract: Mining Trajectory Databases (TD) has recently gained great interest due to the popularity of tracking devices. On the other hand, the inherent presence of uncertainty in TD (e.g., due to GPS errors) has not yet been taken into account during the mining process. In this paper, we study the effect of uncertainty in TD clustering and introduce a three-step approach to deal with it. First, we propose an intuitionistic point vector representation of trajectories that encompasses the underlying uncertainty and introduce an effective distance metric to cope with uncertainty. Second, we devise CenTra, a novel algorithm which tackles the problem of discovering the Centroid Trajectory of a group of movements. Third, we propose a variant of the Fuzzy C-Means (FCM) clustering algorithm, which embodies CenTra in its update procedure. The experimental evaluation over real-world TD demonstrates the efficiency and effectiveness of our approach.

86 citations


Book ChapterDOI
30 Jun 2009
TL;DR: This paper proposes solutions tackling the combined, map matched trajectory compression problem, the efficiency of which is demonstrated through an experimental evaluation using a real trajectory dataset.
Abstract: The wide usage of location-aware devices, such as GPS-enabled cellphones or PDAs, generates vast volumes of spatiotemporal streams modeling objects' movements, raising management challenges, such as efficient storage and querying. Therefore, compression techniques are also indispensable in the field of moving object databases. Moreover, due to erroneous measurements from GPS devices, the problem of matching the location recordings with the underlying traffic network has recently gained the attention of the research community. So far, the proposed compression techniques are not designed for network-constrained moving objects, while map matching algorithms do not consider compression issues. In this paper, we propose solutions tackling the combined, map-matched trajectory compression problem, the efficiency of which is demonstrated through an experimental evaluation using a real trajectory dataset.

45 citations


Proceedings ArticleDOI
24 Mar 2009
TL;DR: A framework for the challenges and the mining solutions for the geographic information collected by Moving Object Database (MOD) engines is established and a research agenda is proposed to identify areas where interdisciplinary studies are needed.
Abstract: A flood of data pertinent to moving objects is available today, and will be more in the near future, particularly due to the automated collection of privacy-sensitive telecom data from mobile phones and other location-aware devices. Such wealth of data, referenced both in space and time, may enable novel classes of applications of high societal and economic impact, provided that the discovery of consumable and concise knowledge out of these raw data is made possible. Recent research activities have developed theory, techniques and systems for geographic knowledge discovery and delivery, some of them based on privacy-preserving methods for extracting knowledge from large amounts of raw data referenced in space and time. All these efforts aim at devising knowledge discovery and analysis methods for trajectories of moving objects. The fundamental hypothesis is that it is possible, in principle, to aid citizens in their mobile activities by analysing the traces of their past activities by means of data mining techniques. For instance, behavioural patterns derived from mobile trajectories may allow inducing traffic flow information, capable to help people travel efficiently, to help public administrations in traffic-related decision making for sustainable mobility and security management, as well as to help mobile operators in optimising bandwidth and power allocation on the network. On the other hand, it is clear that the use of personal sensitive data arouses concerns about citizen's privacy rights. In this tutorial, we establish a framework for the challenges and the mining solutions for the geographic information collected by Moving Object Database (MOD) engines. We first discuss the challenges of collecting mobility data, and elaborate on the impact of trajectory data analysis in several modern applications. We then discuss methodologies and techniques to collect raw data, reconstruct trajectory information, and efficiently store it in MODs. We continue with an overview of knowledge discovery approaches for movement data. Finally, we propose a research agenda and identify areas where interdisciplinary studies are needed.

23 citations


Journal Article
TL;DR: The MONIC+ framework for cluster-type-specific transition modeling and detection encompasses a typification of clusters and cluster-type-specific transition indicators, exploiting cluster topology and cluster statistics for the transition detection process.
Abstract: Clustering algorithms detect groups of similar population members, like customers, news or genes. In many clustering applications the observed population evolves and changes over time, subject to internal and external factors. Detecting and understanding changes is important for decision support. In this work, we present the MONIC+ framework for cluster-type-specific transition modeling and detection. MONIC+ encompasses a typification of clusters and cluster-type-specific transition indicators, by exploiting cluster topology and cluster statistics for the transition detection process. Our experiments on both synthetic and real datasets demonstrate the usefulness and applicability of our framework. Keywords: dynamic environments, change detection, cluster transitions, transition indicators, cluster-type-specific indicators.

20 citations


Journal ArticleDOI
TL;DR: This paper presents the first theoretical analysis that estimates the average number of false hits introduced in the results of rectangular range queries in the case of data points uniformly distributed in 2D space, and then relaxes the original distribution assumptions, showing how to deal with arbitrarily distributed data points and more realistic location uncertainty distributions.
Abstract: An emerging topic in the field of spatial data management is the handling of location uncertainty of spatial objects, mainly due to inaccurate measurements. The literature on location uncertainty so far has focused on modifying traditional spatial search algorithms in order to handle the impact of objects' location uncertainty in query results. In this paper, we present the first, to the best of our knowledge, theoretical analysis that estimates the average number of false hits introduced in the results of rectangular range queries in the case of data points uniformly distributed in 2D space. Then, we relax the original distribution assumptions showing how to deal with arbitrarily distributed data points and more realistic location uncertainty distributions. The accuracy of the results of our analytical approach is demonstrated through an extensive experimental study using various synthetic and real datasets. Our proposal can be directly employed in spatial database systems in order to provide users with the accuracy of spatial query results based only on known dataset and query parameters.

18 citations
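
The quantity analysed in the abstract above, the average number of false hits in a rectangular range query, can also be approximated by simulation. The sketch below is only an illustration under stated assumptions and is not the paper's analytical model: true locations are uniform in the unit square, each recorded location is the true one shifted by independent uniform noise of at most `eps` per axis, and a false hit is a point reported inside the query whose true location lies outside. The function name and all parameters are hypothetical.

```python
import random


def estimate_false_hits(n_points, query, eps, trials=200, seed=0):
    """Monte Carlo estimate of the average number of false hits per query.

    Assumptions (illustrative, not taken from the paper): true locations are
    uniform in the unit square, and each recorded location is the true one
    shifted by independent uniform noise in [-eps, eps] on each axis.
    A false hit is a point whose recorded location falls inside the query
    rectangle while its true location lies outside it.
    """
    rng = random.Random(seed)
    x1, y1, x2, y2 = query

    def inside(x, y):
        return x1 <= x <= x2 and y1 <= y <= y2

    total = 0
    for _ in range(trials):
        for _ in range(n_points):
            # True position of the object.
            tx, ty = rng.random(), rng.random()
            # Recorded position, perturbed by measurement error.
            rx = tx + rng.uniform(-eps, eps)
            ry = ty + rng.uniform(-eps, eps)
            if inside(rx, ry) and not inside(tx, ty):
                total += 1
    return total / trials


if __name__ == "__main__":
    query = (0.4, 0.4, 0.6, 0.6)
    for eps in (0.0, 0.02, 0.08):
        print(eps, estimate_false_hits(1000, query, eps, trials=50))
```

Such a simulation is one way to sanity-check a closed-form estimate on synthetic data; with `eps = 0` it reports no false hits, and the count grows as the uncertainty grows.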


Journal ArticleDOI
01 Feb 2009
TL;DR: In Panda the problem of comparing complex patterns is decomposed into simpler sub-problems on the component patterns and so-obtained partial solutions are then smartly aggregated into an overall dissimilarity score, which grants Panda with a high flexibility and allows it to easily handle patterns with highly complex structures.
Abstract: Data Mining techniques are commonly used to extract patterns, like association rules and decision trees, from huge volumes of data. The comparison of patterns is a fundamental issue, which can be exploited, among others, to synthetically measure dissimilarities in evolving or different datasets and to compare the output produced by different data mining algorithms on a same dataset. In this paper, we present the Panda framework for computing the dissimilarity of both simple and complex patterns, defined upon raw data and other patterns, respectively. In Panda the problem of comparing complex patterns is decomposed into simpler sub-problems on the component (simple or complex) patterns and so-obtained partial solutions are then smartly aggregated into an overall dissimilarity score. This intrinsically recursive approach grants Panda with a high flexibility and allows it to easily handle patterns with highly complex structures. Panda is built upon a few basic concepts so as to be generic and clear to the end user. We demonstrate the generality and flexibility of Panda by showing how it can be easily applied to a variety of pattern types, including sets of itemsets and clusterings.

14 citations
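
The decomposition idea in the abstract above (compare component patterns, then aggregate the partial scores) can be sketched for the case of sets of itemsets. Everything concrete here is an assumption made for illustration: Jaccard distance for the structure component, absolute support difference for the measure component, a plain average to combine them, and closest-counterpart matching to aggregate. The framework itself leaves such choices configurable, and these are not claimed to be the ones used in the paper.

```python
def jaccard_distance(a, b):
    """Structure distance between two itemsets (assumed choice)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def simple_dissimilarity(p, q):
    """Dissimilarity of two simple patterns given as (itemset, support).

    Combines a structure score and a measure score with a plain average;
    the combining function is an illustrative assumption.
    """
    (items_p, supp_p), (items_q, supp_q) = p, q
    structure = jaccard_distance(items_p, items_q)
    measure = abs(supp_p - supp_q)
    return (structure + measure) / 2


def complex_dissimilarity(ps, qs):
    """Dissimilarity of two complex patterns (lists of simple patterns).

    Each component is matched to its closest counterpart in the other
    pattern, and the partial scores are averaged in both directions.
    """
    if not ps and not qs:
        return 0.0
    if not ps or not qs:
        return 1.0

    def directed(src, dst):
        return sum(min(simple_dissimilarity(p, q) for q in dst) for p in src) / len(src)

    return (directed(ps, qs) + directed(qs, ps)) / 2


if __name__ == "__main__":
    a = [(frozenset("ab"), 0.5), (frozenset("c"), 0.2)]
    b = [(frozenset("ab"), 0.3), (frozenset("cd"), 0.2)]
    print(complex_dissimilarity(a, b))
```

The same recursion could be applied one level up, for example to compare collections of such sets, by passing `complex_dissimilarity` in place of `simple_dissimilarity`.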


Proceedings Article
01 Jan 2009
TL;DR: This research investigates the extension of Data Warehousing and data mining technology so as to be applicable on mobility data, and presents the developed framework for analyzing mobility data and some preliminary results.
Abstract: The usage of location-aware devices, such as mobile phones and GPS-enabled devices, is widely spread nowadays, allowing access to large spatiotemporal datasets. The space-time nature of this kind of data results in the generation of huge amounts of mobility data and imposes new challenges regarding the analytical tools that can be used for transforming raw data to knowledge. In our research, we investigate the extension of Data Warehousing and data mining technology so as to be applicable on mobility data. In this paper, we present the framework developed so far for analyzing mobility data, along with some preliminary results.

12 citations



01 Jan 2009
TL;DR: A multidimensional handle controlled without displacement is used for precisely positioned control and input in an actuating rod.

5 citations




Book ChapterDOI
23 Apr 2009
TL;DR: GF-Miner is proposed, a genetic fuzzy classifier that improves Fuzzy Miner firstly by adopting a clustering method to achieve a more natural fuzzy partitioning of the input space, and secondly by optimizing the resulting fuzzy if-then rules with the use of genetic algorithms.
Abstract: Fuzzy logic and genetic algorithms are well-established computational techniques that have been employed to deal with the problem of classification as this is presented in the context of data mining. Based on Fuzzy Miner, which is a recently proposed state-of-the-art fuzzy rule-based system for numerical data, in this paper we propose GF-Miner, which is a genetic fuzzy classifier that improves Fuzzy Miner firstly by adopting a clustering method to achieve a more natural fuzzy partitioning of the input space, and secondly by optimizing the resulting fuzzy if-then rules with the use of genetic algorithms. More specifically, the membership functions of the fuzzy partitioning are extracted in an unsupervised way by using the fuzzy c-means clustering algorithm, while the extracted rules are optimized in terms of the volume of the rulebase and the size of each rule, using two appropriately designed genetic algorithms. The efficiency of our approach is demonstrated through an extensive experimental evaluation using the IRIS benchmark dataset.
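
The abstract above relies on fuzzy c-means to derive membership functions from numerical data. As a rough illustration, here is a minimal sketch of the standard fuzzy c-means iteration on one-dimensional data. It is the textbook algorithm, not GF-Miner itself; the evenly spread initial centers, the fixed iteration count and the function name are assumptions made for this example.

```python
def fuzzy_c_means(points, c, m=2.0, iterations=50):
    """Standard fuzzy c-means on 1-D data (textbook form, fuzzifier m > 1).

    Returns (centers, memberships). memberships[i][j] is the degree to which
    points[i] belongs to cluster j, as computed in the final iteration just
    before the last center update.
    """
    lo, hi = min(points), max(points)
    # Deterministic initialisation (an assumption of this sketch):
    # spread the initial centers evenly across the data range.
    centers = [lo + (hi - lo) * (i + 0.5) / c for i in range(c)]
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iterations):
        memberships = []
        for x in points:
            dists = [abs(x - v) for v in centers]
            if any(d == 0.0 for d in dists):
                # The point coincides with one or more centers:
                # split full membership among those centers.
                zeros = [1.0 if d == 0.0 else 0.0 for d in dists]
                row = [z / sum(zeros) for z in zeros]
            else:
                # u_ij = 1 / sum_k (d_ij / d_ik)^(2 / (m - 1))
                row = [1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
                       for d_i in dists]
            memberships.append(row)
        # Each center becomes the mean of the points weighted by u_ij^m.
        centers = [
            sum(u[j] ** m * x for u, x in zip(memberships, points))
            / sum(u[j] ** m for u in memberships)
            for j in range(c)
        ]
    return centers, memberships


if __name__ == "__main__":
    data = [0.0, 0.2, 0.4, 9.6, 9.8, 10.0]
    centers, memberships = fuzzy_c_means(data, c=2)
    print(centers)
```

Each row of `memberships` sums to 1, so a column of it can be read as a sampled membership function over the input attribute, which is the role the clustering step plays in the abstract.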