Thomas Gottron

Researcher at University of Koblenz and Landau

Publications - 73
Citations - 1563

Thomas Gottron is an academic researcher from the University of Koblenz and Landau. The author has contributed to research on the topics of Linked data and RDF. The author has an h-index of 20 and has co-authored 71 publications receiving 1458 citations. Previous affiliations of Thomas Gottron include the University of Mainz and Harvard University.

Papers
Proceedings Article

Bad news travel fast: a content-based analysis of interestingness on Twitter

TL;DR: This paper analyzes a set of high- and low-level content-based features on several large collections of Twitter messages to obtain insights into what makes a message on Twitter worth retweeting and, thus, interesting.
Journal Article

SchemEX - Efficient construction of a data catalogue by stream-based indexing of linked data

TL;DR: The schema index provided by SchemEX can be used to locate distributed data sources in the LOD cloud. At a window size of 100K triples observed in the stream, it locates relevant data sources with a recall between 71% and 98% and a precision between 74% and 100%.
Proceedings Article

Searching microblogs: coping with sparsity and document quality

TL;DR: The results show that deliberately ignoring length normalization yields better retrieval results in general and that interestingness improves retrieval for underspecified queries.
Book Chapter

A comparison of language identification approaches on short, query-style texts

TL;DR: This work compares the performance of some typical approaches for language detection on very short, query-style texts. It shows that an accuracy of more than 80% can be achieved even for single words; for slightly longer texts, the authors observed accuracy values close to 100%.
Proceedings Article

Content Code Blurring: A New Approach to Content Extraction

TL;DR: This work introduces content code blurring, a novel content extraction algorithm that works iteratively, identifying the main content of a web document and/or removing its additional contents.