scispace - formally typeset
Author

Janina Pohl

Bio: Janina Pohl is an academic researcher at the University of Münster. She has contributed to research on feature selection and the travelling salesman problem, has an h-index of 2, and has co-authored 4 publications receiving 11 citations.

Papers
Book Chapter
05 Sep 2020
TL;DR: This work focuses on the well-known Euclidean Traveling Salesperson Problem and two highly competitive inexact heuristic TSP solvers, and shows that a feature-free deep neural network approach based solely on a visual representation of the instances already matches classical algorithm selection (AS) model results, thus showing huge potential for future studies.
Abstract: In this work we focus on the well-known Euclidean Traveling Salesperson Problem (TSP) and two highly competitive inexact heuristic TSP solvers, EAX and LKH, in the context of per-instance algorithm selection (AS). We evolve instances with 1,000 nodes where the solvers show strongly different performance profiles. These instances serve as a basis for an exploratory study on the identification of well-discriminating problem characteristics (features). Our results in a nutshell: we show that even though (1) promising features exist, (2) these are in line with previous results from the literature, and (3) models trained with these features are more accurate than models adopting sophisticated feature selection methods, the resulting models do not come close to the virtual best solver in terms of penalized average runtime, and neither does the performance gain over the single best solver. However, we show that a feature-free deep neural network approach based solely on a visual representation of the instances already matches classical AS model results and thus shows huge potential for future studies.
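The evaluation metrics named in the abstract can be made concrete with a small sketch. The runtimes, the cutoff, and the two-solver setup below are invented for illustration; only the metric definitions (PAR10, virtual best solver, single best solver) follow standard algorithm-selection practice.

```python
# Illustrative sketch (hypothetical data): penalized average runtime (PAR10)
# and the virtual best solver (VBS) vs. single best solver (SBS) baselines
# used to judge per-instance algorithm selection models.

CUTOFF = 100.0  # per-instance time limit in seconds (assumed)

# runtimes[solver][i] = runtime of solver on instance i; None means a timeout
runtimes = {
    "EAX": [12.0, None, 30.0, 5.0],
    "LKH": [40.0, 8.0, None, 6.0],
}

def par10(times, cutoff=CUTOFF):
    """Penalized average runtime: each timeout counts as 10 * cutoff."""
    return sum(t if t is not None else 10 * cutoff for t in times) / len(times)

# Single best solver: the one solver with the best overall PAR10.
sbs = min(runtimes.values(), key=par10)

# Virtual best solver: an oracle picking the better solver on every instance.
n = len(next(iter(runtimes.values())))
vbs_times = [
    min((ts[i] if ts[i] is not None else 10 * CUTOFF) for ts in runtimes.values())
    for i in range(n)
]
```

An AS model's PAR10 falls between `par10(vbs_times)` (the oracle) and `par10(sbs)` (always picking one solver); the abstract's point is that the feature-based models land far from the oracle end of that interval.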

14 citations

Book Chapter
19 Jul 2020
TL;DR: A new two-phase framework that uses unsupervised stream clustering for detecting suspicious trends over time in a first step and traditional offline analyses are applied to distinguish between normal trend evolution and malicious manipulation attempts is proposed.
Abstract: The identification of coordinated campaigns within Social Media is a complex task that is often hindered by missing labels and large amounts of data that have to be processed. We propose a new two-phase framework that uses unsupervised stream clustering for detecting suspicious trends over time in a first step. Afterwards, traditional offline analyses are applied to distinguish between normal trend evolution and malicious manipulation attempts. We demonstrate the applicability of our framework in the context of the final days of the Brexit in 2019/2020.
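The two-phase idea can be sketched in miniature. The decayed-counter "phase 1" below is a hypothetical stand-in for the paper's unsupervised stream clustering, and the decay factor, threshold, and events are invented; only the structure (cheap online flagging, then offline inspection) follows the framework described above.

```python
# Minimal, hypothetical sketch of a two-phase pipeline: an online pass flags
# hashtags whose decayed frequency spikes; a second, offline pass would then
# inspect the flagged trends in detail.
from collections import defaultdict

DECAY = 0.9       # per-timestep multiplicative decay (assumed)
THRESHOLD = 3.0   # flag a trend once its decayed count exceeds this (assumed)

def detect_trends(stream):
    """Phase 1: single pass over (timestep, hashtag) events; returns flagged tags."""
    counts = defaultdict(float)
    flagged = set()
    last_t = None
    for t, tag in stream:
        if last_t is not None and t != last_t:
            for k in counts:          # decay all counters when time advances
                counts[k] *= DECAY
        last_t = t
        counts[tag] += 1.0
        if counts[tag] > THRESHOLD:
            flagged.add(tag)
    return flagged

# Phase 2 (offline) would distinguish organic trends from manipulation, e.g.
# via account-creation dates or retweet patterns of the involved users.
stream = [(0, "#brexit"), (0, "#brexit"), (1, "#brexit"), (1, "#brexit"),
          (1, "#brexit"), (1, "#news")]
flagged = detect_trends(stream)
```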

14 citations

Proceedings Article
06 Sep 2021
TL;DR: In this article, the authors proposed a normalization for two feature groups which stood out in multiple automated algorithm selection (AS) studies on the TSP: (a) features based on a minimum spanning tree (MST) and (b) a k-nearest neighbor graph (NNG) transformation of the input instance.
Abstract: Classic automated algorithm selection (AS) for (combinatorial) optimization problems heavily relies on so-called instance features, i.e., numerical characteristics of the problem at hand ideally extracted with computationally low-demanding routines. For the traveling salesperson problem (TSP) a plethora of features have been suggested. Most of these features are, if at all, only normalized imprecisely raising the issue of feature values being strongly affected by the instance size. Such artifacts may have detrimental effects on algorithm selection models. We propose a normalization for two feature groups which stood out in multiple AS studies on the TSP: (a) features based on a minimum spanning tree (MST) and (b) a k-nearest neighbor graph (NNG) transformation of the input instance. To this end we theoretically derive minimum and maximum values for properties of MSTs and k-NNGs of Euclidean graphs. We analyze the differences in feature space between normalized versions of these features and their unnormalized counterparts. Our empirical investigations on various TSP benchmark sets point out that the feature scaling succeeds in eliminating the effect of the instance size. Eventually, a proof-of-concept AS-study shows promising results: models trained with normalized features tend to outperform those trained with the respective vanilla features.
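To make the size-dependence problem tangible: a raw MST feature such as the mean edge length shrinks as instances grow, so unnormalized values mostly encode instance size. The sketch below uses a simple 1/sqrt(n) rescaling on the unit square as a placeholder; the paper's actual normalization rests on theoretically derived minimum and maximum values, which are not reproduced here.

```python
# Illustrative only: an MST-based TSP feature and a naive size rescaling.
# The sqrt(n) factor is a heuristic placeholder, not the paper's bounds.
import math
import random

def mst_mean_edge_length(points):
    """Prim's algorithm on the complete Euclidean graph; mean MST edge length."""
    n = len(points)
    dist = lambda a, b: math.hypot(a[0] - b[0], a[1] - b[1])
    best = {i: dist(points[0], points[i]) for i in range(1, n)}
    total = 0.0
    while best:
        j = min(best, key=best.get)   # attach the closest outside node
        total += best.pop(j)
        for k in best:
            d = dist(points[j], points[k])
            if d < best[k]:
                best[k] = d
    return total / (n - 1)

def normalized_mst_feature(points):
    # On the unit square, MST edge lengths shrink roughly like 1/sqrt(n);
    # multiplying by sqrt(n) makes the feature roughly size-independent.
    return mst_mean_edge_length(points) * math.sqrt(len(points))

random.seed(0)
small = [(random.random(), random.random()) for _ in range(50)]
large = [(random.random(), random.random()) for _ in range(500)]
```

The raw feature differs sharply between the two instance sizes, while the rescaled values land in the same narrow band — exactly the artifact the paper's normalization removes.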

2 citations

Journal Article
TL;DR: This survey explores, summarizes, and categorizes work within the domain of stream classification and identifies core research threads over the past few years, which are structured based on the stream classification process to facilitate coordination within this complex topic.
Abstract: Due to the rise of continuous data-generating applications, analyzing data streams has gained increasing attention over the past decades. A core research area in stream data is stream classification, which categorizes or detects data points within an evolving stream of observations. Areas of stream classification are diverse—ranging, e.g., from monitoring sensor data to analyzing a wide range of (social) media applications. Research in stream classification is related to developing methods that adapt to the changing and potentially volatile data stream. It focuses on individual aspects of the stream classification pipeline, e.g., designing suitable algorithm architectures, an efficient train and test procedure, or detecting so-called concept drifts. As a result of the many different research questions and strands, the field is challenging to grasp, especially for beginners. This survey explores, summarizes, and categorizes work within the domain of stream classification and identifies core research threads over the past few years. It is structured based on the stream classification process to facilitate coordination within this complex topic, including common application scenarios and benchmarking data sets. Thus, both newcomers to the field and experts who want to widen their scope can gain (additional) insight into this research area and find starting points and pointers to more in-depth literature on specific issues and research directions in the field.
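The "efficient train and test procedure" mentioned above is commonly the prequential (test-then-train) protocol: each arriving example is first used for testing, then for training. The toy majority-class learner and the synthetic stream below are invented to keep the sketch self-contained.

```python
# Prequential (test-then-train) evaluation sketch for stream classification.
from collections import Counter

class MajorityClassifier:
    """Toy learner: always predicts the most frequent label seen so far."""
    def __init__(self):
        self.counts = Counter()
    def predict(self, x):
        return self.counts.most_common(1)[0][0] if self.counts else None
    def learn(self, x, y):
        self.counts[y] += 1

def prequential_accuracy(stream, clf):
    correct = total = 0
    for x, y in stream:
        if clf.predict(x) == y:   # test on the example first ...
            correct += 1
        clf.learn(x, y)           # ... then train on that same example
        total += 1
    return correct / total

# A stream with a concept drift halfway through: the label flips from "a"
# to "b", dragging prequential accuracy down until the model re-adapts.
stream = [(i, "a") for i in range(100)] + [(i, "b") for i in range(100)]
acc = prequential_accuracy(stream, MajorityClassifier())
```

Because the majority learner never forgets, it keeps predicting "a" after the drift; drift-aware methods surveyed in the paper would reset or re-weight the model instead.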

1 citation


Cited by
Posted Content
TL;DR: In this paper, the authors discuss how bot capabilities can be extended and controlled by integrating humans into the process and reason that this is currently the most promising way to go in order to realize effective interactions with other humans.
Abstract: Social bots are currently regarded an influential but also somewhat mysterious factor in public discourse and opinion making. They are considered to be capable of massively distributing propaganda in social and online media and their application is even suspected to be partly responsible for recent election results. Astonishingly, the term `Social Bot' is not well defined and different scientific disciplines use divergent definitions. This work starts with a balanced definition attempt, before providing an overview of how social bots actually work (taking the example of Twitter) and what their current technical limitations are. Despite recent research progress in Deep Learning and Big Data, there are many activities bots cannot handle well. We then discuss how bot capabilities can be extended and controlled by integrating humans into the process and reason that this is currently the most promising way to go in order to realize effective interactions with other humans.

81 citations

01 Jan 2012
TL;DR: This paper takes a statistical approach and examines the features of TSP instances that make the problem either hard or easy to solve, using the approximation ratio that it achieves on a given instance as a measure of problem difficulty.
Abstract: With this paper we contribute to the understanding of the success of 2-opt based local search algorithms for solving the traveling salesman problem (TSP). Although 2-opt is widely used in practice, it is hard to understand its success from a theoretical perspective. We take a statistical approach and examine the features of TSP instances that make the problem either hard or easy to solve. As a measure of problem difficulty for 2-opt we use the approximation ratio that it achieves on a given instance. Our investigations point out important features that make TSP instances hard or easy to be approximated by 2-opt.
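The 2-opt local search studied above can be sketched in a few lines. The first-improvement strategy and the tiny instance are illustrative; the paper's difficulty measure would divide the resulting tour length by the optimal tour length to obtain the approximation ratio.

```python
# Minimal 2-opt local search for the Euclidean TSP: repeatedly reverse a
# tour segment whenever doing so shortens the tour.
import math

def tour_length(points, tour):
    return sum(math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def two_opt(points, tour):
    improved = True
    while improved:
        improved = False
        for i in range(1, len(tour) - 1):
            for j in range(i + 1, len(tour)):
                cand = tour[:i] + tour[i:j][::-1] + tour[j:]  # reverse segment
                if tour_length(points, cand) < tour_length(points, tour) - 1e-12:
                    tour, improved = cand, True
    return tour

# Four corners of a unit square: the self-crossing tour 0-2-1-3 is repaired
# into the optimal perimeter tour of length 4.
points = [(0, 0), (1, 0), (1, 1), (0, 1)]
best = two_opt(points, [0, 2, 1, 3])
```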

38 citations

Posted Content
TL;DR: In this paper, a network-based framework for uncovering and studying coordinated behaviors on social media is proposed to expose different coordination patterns and to estimate the degree of coordination that characterizes diverse communities.
Abstract: Coordinated online behaviors are an essential part of information and influence operations, as they allow a more effective spread of disinformation. Most studies on coordinated behaviors involved manual investigations, and the few existing computational approaches make bold assumptions or oversimplify the problem to make it tractable. Here, we propose a new network-based framework for uncovering and studying coordinated behaviors on social media. Our research extends existing systems and goes beyond limiting binary classifications of coordinated and uncoordinated behaviors. It allows us to expose different coordination patterns and to estimate the degree of coordination that characterizes diverse communities. We apply our framework to a dataset collected during the 2019 UK General Election, detecting and characterizing coordinated communities that participated in the electoral debate. Our work conveys both theoretical and practical implications and provides more nuanced and fine-grained results for studying online information manipulation.
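The core network construction in such frameworks can be sketched simply: link users who repeatedly share the same content, with edge weights capturing how often they co-share. The data, the threshold, and the function name below are hypothetical; the actual framework derives richer, graded coordination scores rather than a single cutoff.

```python
# Hypothetical sketch: build a weighted user-user coordination network from
# (user, shared item) events; edge weight = number of co-shared items.
from collections import defaultdict
from itertools import combinations

def coordination_edges(shares, min_shared=2):
    """shares: iterable of (user, item). Returns {frozenset({u, v}): weight}."""
    users_by_item = defaultdict(set)
    for user, item in shares:
        users_by_item[item].add(user)
    weights = defaultdict(int)
    for users in users_by_item.values():
        for u, v in combinations(sorted(users), 2):
            weights[frozenset((u, v))] += 1
    # keep only pairs that co-shared at least `min_shared` distinct items
    return {pair: w for pair, w in weights.items() if w >= min_shared}

shares = [("u1", "url1"), ("u2", "url1"), ("u1", "url2"), ("u2", "url2"),
          ("u3", "url1"), ("u3", "url3")]
edges = coordination_edges(shares)
```

Community detection on the resulting weighted graph then surfaces the coordinated communities the abstract describes.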

25 citations

Journal Article
TL;DR: It is shown that the more viral a stock is on Twitter, the more that virality is artificially caused by social bots, and two methods for detecting the presence and the extent of financial disinformation on Twitter are proposed.

13 citations

Journal Article
TL;DR: Wang et al. propose a scheduling framework for selecting scheduling algorithms (SFSSA) across different scheduling scenarios, motivated by the observation that no single algorithm suits all scenarios. To concretize SFSSA, they propose deep learning-based algorithm selectors (DLS) trained on labeled data and deep reinforcement learning-based algorithm selectors trained on feedback from dynamic scenarios, which perform the selection by treating the scheduling algorithms as selectable tools.

10 citations