Thomas Gottron
Researcher at University of Koblenz and Landau
Publications - 73
Citations - 1563
Thomas Gottron is an academic researcher at the University of Koblenz and Landau. The author has contributed to research on the topics of linked data and RDF. The author has an h-index of 20 and has co-authored 71 publications receiving 1458 citations. Previous affiliations of Thomas Gottron include the University of Mainz and Harvard University.
Papers
Proceedings Article
Bad news travel fast: a content-based analysis of interestingness on Twitter
TL;DR: This paper analyzes a set of high- and low-level content-based features on several large collections of Twitter messages to obtain insights into what makes a message on Twitter worth retweeting and, thus, interesting.
Journal Article
SchemEX - Efficient construction of a data catalogue by stream-based indexing of linked data
TL;DR: The schema index provided by SchemEX can be used to locate distributed data sources in the LOD cloud. At a window size of 100 K triples observed in the stream, it locates relevant data sources with recall between 71% and 98% and precision between 74% and 100%.
Proceedings Article
Searching microblogs: coping with sparsity and document quality
TL;DR: The results show that deliberately ignoring length normalization yields better retrieval results in general and that interestingness improves retrieval for underspecified queries.
Book Chapter
A comparison of language identification approaches on short, query-style texts
Thomas Gottron, Nedim Lipka +1 more
TL;DR: This work compares the performance of some typical approaches for language detection on very short, query-style texts. It shows that an accuracy of more than 80% can be achieved even for single words, and that for slightly longer texts the observed accuracy approaches 100%.
Proceedings Article
Content Code Blurring: A New Approach to Content Extraction
TL;DR: This work introduces content code blurring, a novel content extraction algorithm that iteratively identifies the main content regions of a web document and/or removes the additional content.