Showing papers by "Simon Scheider" published in 2021


Journal Article
TL;DR: This paper investigates the scientific challenges of geo-analytical question answering, introduces the problems of unknown answers and indirect QA, and argues why core concepts of spatial information play an important role in addressing these challenges.
Abstract: Question Answering (QA), the process of computing valid answers to questions formulated in natural language, has recently gained attention in both industry and academia. Translating this idea to th...

30 citations


Journal Article
TL;DR: The quality of automatically synthesized workflows is measured against a benchmark generated from common data types, and the results show that CCD concepts significantly improve the precision of workflow synthesis.
Abstract: Loose programming enables analysts to program with concepts instead of procedural code. Data transformations are left underspecified, leaving out procedural details and exploiting knowledge about the applicability of functions to data types. To synthesize workflows of high quality for a geo-analytical task, the semantic type system needs to reflect knowledge of geographic information systems (GIS) at a level that is deep enough to capture geo-analytical concepts and intentions, yet shallow enough to generalize over GIS implementations. Recently, core concepts of spatial information and related geo-analytical concepts were proposed as a way to add the required abstraction level to current geodata models. The core concept data types (CCD) ontology is a semantic type system that can be used to constrain GIS functions for workflow synthesis. However, to date, it is unknown what gain in precision and workflow quality can be expected. In this article we synthesize workflows by annotating GIS tools with these types, specifying a range of common analytical tasks taken from an urban livability scenario. We measure the quality of automatically synthesized workflows against a benchmark generated from common data types. Results show that CCD concepts significantly improve the precision of workflow synthesis.
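The idea of constraining workflow synthesis with semantic types can be illustrated with a minimal sketch. It assumes a toy setting in which each tool declares the types it consumes and the type it produces, and a breadth-first search chains tools until the goal type is available. The tool names and type names (`PointMeasures`, `Field`, `Regions`, `RegionAverages`) are hypothetical placeholders, not terms from the CCD ontology, and the search is a generic forward-chaining strategy rather than the synthesis method used in the article.

```python
from collections import deque
from typing import NamedTuple, Optional


class Tool(NamedTuple):
    name: str
    inputs: tuple   # type names the tool requires
    output: str     # type name the tool produces


# Hypothetical toy annotations, for illustration only (not the CCD ontology).
TOOLS = [
    Tool("interpolate", ("PointMeasures",), "Field"),
    Tool("zonal_mean", ("Field", "Regions"), "RegionAverages"),
    Tool("buffer", ("PointMeasures",), "Regions"),
]


def synthesize(available, goal, tools, max_steps=5) -> Optional[list]:
    """Breadth-first search for a shortest tool sequence that yields `goal`.

    A tool is applicable only if all of its input types are already
    available, which is how type annotations prune the search space.
    """
    start = frozenset(available)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        types, plan = queue.popleft()
        if goal in types:
            return plan
        if len(plan) >= max_steps:
            continue
        for tool in tools:
            applicable = all(t in types for t in tool.inputs)
            if applicable and tool.output not in types:
                nxt = types | {tool.output}
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, plan + [tool.name]))
    return None  # no workflow found within max_steps


if __name__ == "__main__":
    print(synthesize({"PointMeasures"}, "RegionAverages", TOOLS))
```

In this sketch, more specific type annotations leave fewer applicable tools at each step, which is the intuition behind expecting a gain in precision from a richer type system.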

12 citations


Journal Article
TL;DR: In this paper, the authors investigated the relationship between running behavior and external situations for different types of users and found that specific temporal and environmental situations (hour in a day, day in a week, temperature, distance to residential areas, and population density) influence the running performance of users more than other situational features.
Abstract: Running is a popular form of physical activity. Personal, social, and environmental determinants influence the engagement of the individual. To gain insight into the relation between running behavior and external situations for different types of users, we carried out an extensive data mining study on large-scale datasets. We combined 4 years of historical running data (collected by a mobile exercise application from over 10K participants) with weather, topographical, and demographic datasets. We introduce weighted frequent item mining for the analysis of the data. In this way, we capture temporal and environmental situations that are frequently associated with different running performances. The results show that specific temporal and environmental situations (hour in a day, day in a week, temperature, distance to residential areas, and population density) influence the running performance of users more than other situational features. Hierarchical agglomerative clustering on the running data is used to split runners into two clusters (with sustained and less sustained running behavior). We compared the two groups of runners and found that runners with less sustained behavior are more sensitive to environmental situations (especially several weather- and location-related features, such as temperature, weather type, and distance to the nearest park) than regular runners. Further analysis focused on the situational features for the less sustained runners. Results show that specific feature values correspond to a better or worse running distance. We examined not only the influence of individual features but also the interplay between features. Our findings provide important empirical evidence that the role of external situations in the running behavior of individuals can be derived from analysis of the combined historical datasets.
This opens up a large potential to take those situations specifically into consideration when supporting individuals who show less sustained behavior.
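The abstract does not define weighted frequent item mining, so the following is only a rough sketch of one common formulation: each record (here, a run) carries a weight, and the support of an itemset is the summed weight of the records containing it divided by the total weight. The feature labels, the weights, and the function name are made up for illustration and are not taken from the study.

```python
from collections import defaultdict
from itertools import combinations


def weighted_frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: weighted support} for itemsets meeting min_support.

    transactions: list of (items, weight) pairs.
    Weighted support is an assumption of this sketch: the summed weight
    of transactions containing the itemset, divided by the total weight.
    """
    total = sum(weight for _, weight in transactions)
    if total <= 0:
        return {}
    support = defaultdict(float)
    for items, weight in transactions:
        unique = sorted(set(items))
        for size in range(1, max_size + 1):
            for combo in combinations(unique, size):
                support[combo] += weight
    return {
        itemset: value / total
        for itemset, value in support.items()
        if value / total >= min_support
    }


# Hypothetical runs: situational features plus a performance-based weight.
RUNS = [
    (["evening", "mild"], 2.0),
    (["evening", "cold"], 1.0),
    (["morning", "mild"], 1.0),
]

if __name__ == "__main__":
    for itemset, value in sorted(weighted_frequent_itemsets(RUNS, 0.5).items()):
        print(itemset, value)
```

Enumerating all combinations is only practical for small `max_size`; a real analysis would likely use an Apriori-style pruning step.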

9 citations


Journal Article
TL;DR: This study investigates how geo-analytical questions are structured syntactically and semantically, and how the structure may be interpreted by human analysts to compose workflows, and identifies analytical goals attributable to the notions of measure, support, and extent.
Abstract: This study investigates the GeoAnQu corpus of geo-analytical questions. Unlike other question corpora, the questions in this corpus imply analytical goals and are thus supposed to be answered with GIS workflows, not with the retrieval of geographic facts. We investigate how geo-analytical questions are structured syntactically and semantically, and how the structure may be interpreted by human analysts to compose workflows. Our question analysis model is based on the notions of a measure, support, and extent, which are inspired by Sinton’s three dimensions of spatial analysis. We use XPath queries to automatically extract syntactic patterns from constituency parse trees corresponding to these notions. Results show that geo-analytical questions are of considerable complexity, yet often have predictable syntactic patterns that can be reliably mapped to measures, supports, and extents. Furthermore, we identify analytical goals attributable to these notions. To our knowledge, this is the first reported systematic analysis of this kind. The findings open new opportunities in Natural Language Interpretation and query generation for the automated answering of geo-analytical questions. Additionally, our study shows that questions asked in a scientific context can be on different levels of concreteness. Therefore, we also discuss best practices for formulating questions clearly and concretely.
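Extracting syntactic patterns from constituency parse trees with XPath can be sketched as follows, assuming the parse tree has been serialized as XML with one element per constituent. The example question, the XML encoding, and the two patterns are hypothetical and chosen only to show the mechanism; they are not the patterns or data used in the study. The sketch relies on the limited XPath subset supported by Python's standard `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# Hypothetical constituency parse of a made-up question, encoded as XML:
# "What is the average temperature in Utrecht"
PARSE = (
    "<S><WHNP><WP>What</WP></WHNP>"
    "<SQ><VBZ>is</VBZ>"
    "<NP><NP><DT>the</DT><JJ>average</JJ><NN>temperature</NN></NP>"
    "<PP><IN>in</IN><NP><NNP>Utrecht</NNP></NP></PP></NP>"
    "</SQ></S>"
)

# Illustrative patterns (assumptions of this sketch):
# a noun phrase governed by the preposition "in" is read as an extent,
# a noun phrase containing the adjective "average" as a measure.
EXTENT_PATTERN = ".//PP[IN='in']/NP"
MEASURE_PATTERN = ".//NP[JJ='average']"


def words(node):
    """Join the leaf tokens below a constituent into a phrase."""
    return " ".join(leaf.text for leaf in node.iter() if len(leaf) == 0)


def extract(xml_tree, xpath):
    """Return the phrases of all constituents matching the XPath pattern."""
    root = ET.fromstring(xml_tree)
    return [words(node) for node in root.findall(xpath)]


if __name__ == "__main__":
    print("measure:", extract(PARSE, MEASURE_PATTERN))
    print("extent:", extract(PARSE, EXTENT_PATTERN))
```

A full XPath engine would allow richer patterns (for example, axes such as `following-sibling`), which `ElementTree` does not support.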

1 citation