Author

Martin Atzmueller

Bio: Martin Atzmueller is an academic researcher from Tilburg University. The author has contributed to research in topics: Social media & Social network analysis. The author has an h-index of 28 and has co-authored 184 publications receiving 2609 citations. Previous affiliations of Martin Atzmueller include University of Osnabrück & University of Würzburg.


Papers
Journal ArticleDOI
TL;DR: Subgroup discovery is a broadly applicable descriptive data mining technique for identifying interesting subgroups according to some property of interest; this article summarizes its fundamentals and reviews algorithms, tools, and applications.
Abstract: Subgroup discovery is a broadly applicable descriptive data mining technique for identifying interesting subgroups according to some property of interest. This article summarizes fundamentals of subgroup discovery before reviewing algorithms and further advanced methodological issues. In addition, we briefly discuss tools and applications of subgroup discovery approaches. In that context, we also discuss experiences and lessons learned and outline some of the future directions in order to show the advantages and benefits of subgroup discovery. WIREs Data Mining Knowl Discov 2015, 5:35-49. doi: 10.1002/widm.1144

215 citations

Book ChapterDOI
18 Sep 2006
TL;DR: It is shown how SD-Map can handle missing values, and how the algorithm can identify all interesting subgroup patterns contained in a data set, in contrast to heuristic or sampling-based methods.
Abstract: In this paper we present the novel SD-Map algorithm for exhaustive but efficient subgroup discovery. SD-Map guarantees to identify all interesting subgroup patterns contained in a data set, in contrast to heuristic or sampling-based methods. The SD-Map algorithm utilizes the well-known FP-growth method for mining association rules with adaptations for the subgroup discovery task. We show how SD-Map can handle missing values, and provide an experimental evaluation of the performance of the algorithm using synthetic data.

171 citations

Journal ArticleDOI
TL;DR: This paper focuses on description-oriented community detection using subgroup discovery and proposes several optimistic estimates of standard community quality functions to be used for efficient pruning of the search space in an exhaustive branch-and-bound algorithm.

117 citations

Proceedings Article
30 Jul 2005
TL;DR: This paper categorizes several classes of background knowledge for subgroup discovery, presents how the necessary knowledge elements can be modelled, and shows how subgroup discovery methods benefit from the utilization of background knowledge.
Abstract: In general, knowledge-intensive data mining methods exploit background knowledge to improve the quality of their results. In knowledge-rich domains, the interestingness of the mined patterns can often be increased significantly. In this paper we categorize several classes of background knowledge for subgroup discovery, and present how the necessary knowledge elements can be modelled. Furthermore, we show how subgroup discovery methods benefit from the utilization of background knowledge, and discuss its application in an incremental process-model. The context of our work is to identify interesting diagnostic patterns to supplement a medical documentation and consultation system. We provide a case study in the medical domain, using a case base from a real-world application.

82 citations

Book ChapterDOI
27 Aug 2009
TL;DR: This paper proposes novel formalizations of effective pruning strategies for reducing the search space, and presents the SD-Map* algorithm that enables fast subgroup discovery for continuous target concepts.
Abstract: Subgroup discovery is a flexible data mining method for a broad range of applications. It considers a given property of interest (target concept), and aims to discover interesting subgroups with respect to this concept. In this paper, we especially focus on the handling of continuous target variables and describe an approach for fast and efficient subgroup discovery for such target concepts. We propose novel formalizations of effective pruning strategies for reducing the search space, and we present the SD-Map* algorithm that enables fast subgroup discovery for continuous target concepts. The approach is evaluated using real-world data from the industrial domain.

76 citations


Cited by
Journal ArticleDOI
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations

01 Jan 2002

9,314 citations

Proceedings ArticleDOI
22 Jan 2006
TL;DR: Some of the major results in random graphs and some of the more challenging open problems are reviewed, including those related to the WWW.
Abstract: We will review some of the major results in random graphs and some of the more challenging open problems. We will cover algorithmic and structural questions. We will touch on newer models, including those related to the WWW.

7,116 citations

01 Jan 2012

3,692 citations