Author

Ioannis Katakis

Bio: Ioannis Katakis is an academic researcher from the National and Kapodistrian University of Athens. The author has contributed to research in the topics of Sentiment analysis and Voting. The author has an h-index of 17 and has co-authored 49 publications receiving 5,465 citations. Previous affiliations of Ioannis Katakis include the Aristotle University of Thessaloniki and the University of Cyprus.


Papers
Book ChapterDOI
29 Oct 2006
TL;DR: The main advantages of PersoNews are the aggregation of many different news sources, machine learning filtering offering personalization not only per user but also for every feed a user is subscribed to, and finally the ability for every user to watch a more abstracted topic of interest by employing a simple form of semantic filtering through a taxonomy of topics.
Abstract: In this paper, we present a web-based, machine-learning enhanced news reader (PersoNews). The main advantages of PersoNews are the aggregation of many different news sources, machine learning filtering offering personalization not only per user but also for every feed a user is subscribed to, and finally the ability for every user to watch a more abstracted topic of interest by employing a simple form of semantic filtering through a taxonomy of topics.

28 citations
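The abstract above describes an architecture rather than code, so here is a minimal sketch, under assumptions of my own, of the two filtering stages it mentions: a per-feed learned interest filter and a simple taxonomy-based topic check. The taxonomy, the class and function names, and the choice of TF-IDF with Naive Bayes are illustrative, not taken from the paper.

```python
# Minimal sketch of per-feed interest filtering in the spirit of PersoNews.
# All names and modelling choices here are illustrative assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical taxonomy: abstract topic -> set of descendant topics (semantic filtering).
TAXONOMY = {
    "science": {"science", "physics", "biology"},
    "sports":  {"sports", "football", "tennis"},
}

def topic_match(item_topic: str, watched_topic: str) -> bool:
    """Accept an item if its topic falls under the watched (more abstract) topic."""
    return item_topic in TAXONOMY.get(watched_topic, {watched_topic})

class FeedFilter:
    """One classifier per (user, feed) pair, trained on that user's feedback."""
    def __init__(self):
        self.vec = TfidfVectorizer()
        self.clf = MultinomialNB()

    def fit(self, titles, liked):            # liked: 1 = interesting, 0 = not
        self.clf.fit(self.vec.fit_transform(titles), liked)

    def interesting(self, title) -> bool:
        return bool(self.clf.predict(self.vec.transform([title]))[0])

# Usage: train on past feedback for one feed, then filter new items.
f = FeedFilter()
f.fit(["new exoplanet found", "league final tonight", "gene editing breakthrough"], [1, 0, 1])
for title, topic in [("quantum computing milestone", "physics"), ("transfer rumours", "football")]:
    if topic_match(topic, "science") and f.interesting(title):
        print("show:", title)
```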

Journal ArticleDOI
TL;DR: A novel propagation model, namely the Dynamic Linear Threshold (DLT) model, is suggested that effectively captures the way contradictory information, i.e., misinformation and credible information, propagates in the network.

20 citations

Proceedings ArticleDOI
13 Jun 2016
TL;DR: A novel propagation model, namely the Dynamic Linear Threshold (DLT) model, is suggested that effectively captures the way contradictory information, i.e., misinformation and credible information, propagates in the network.
Abstract: Online Social Networks (OSNs) constitute one of the most important communication channels and are widely utilized as news sources. Information spreads widely and rapidly in OSNs through the word-of-mouth effect. However, it is not uncommon for misinformation to propagate in the network. Misinformation dissemination may lead to undesirable effects, especially in cases where the non-credible information concerns emergency events. Therefore, it is essential to limit the propagation of misinformation in a timely manner. Towards this goal, we suggest a novel propagation model, namely the Dynamic Linear Threshold (DLT) model, that effectively captures the way contradictory information, i.e., misinformation and credible information, propagates in the network. The DLT model considers the probability of a user alternating between competing beliefs, assisting in the propagation of either misinformation or credible news. Based on the DLT model, we formulate an optimization problem that aims at identifying the most appropriate subset of users to limit the spread of misinformation by initiating the propagation of credible information. Through extensive experimental evaluation, we demonstrate that our approach outperforms its competitors.

18 citations
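The DLT model itself is defined in the paper; as a rough illustration of the competing-cascade idea described above, here is a sketch of a linear-threshold-style simulation in which nodes can adopt, and switch between, a "misinformation" and a "credible" belief. The switching rule, the uniform edge weights, and the random thresholds are simplifying assumptions, not the paper's exact formulation.

```python
# Illustrative sketch of competing-cascade diffusion in the linear-threshold family.
# The actual DLT update rules are defined in the paper; the switching logic below
# is a simplifying assumption for illustration only.
import random
import networkx as nx

def simulate(G, seeds_mis, seeds_cred, steps=10, seed=0):
    rng = random.Random(seed)
    theta = {v: rng.random() for v in G}            # per-node activation threshold
    state = {v: None for v in G}                    # None / "mis" / "cred"
    for v in seeds_mis:
        state[v] = "mis"
    for v in seeds_cred:
        state[v] = "cred"

    for _ in range(steps):
        changed = False
        for v in G:
            if v in seeds_mis or v in seeds_cred:
                continue                             # seed nodes keep their belief
            nbrs = list(G.predecessors(v)) if G.is_directed() else list(G.neighbors(v))
            if not nbrs:
                continue
            w = 1.0 / len(nbrs)                      # uniform incoming edge weights
            infl = {"mis": 0.0, "cred": 0.0}
            for u in nbrs:
                if state[u] in infl:
                    infl[state[u]] += w
            # Assumed rule: a node adopts (or switches to) whichever belief's total
            # incoming influence is largest, once it exceeds the node's threshold.
            best = max(infl, key=infl.get)
            if infl[best] > 0 and infl[best] >= theta[v] and state[v] != best:
                state[v] = best
                changed = True
        if not changed:
            break
    return state

# Usage: count how far misinformation spreads with and without credible seeds.
G = nx.erdos_renyi_graph(200, 0.05, seed=1)
no_block = simulate(G, seeds_mis={0, 1}, seeds_cred=set())
blocked  = simulate(G, seeds_mis={0, 1}, seeds_cred={2, 3, 4})
print(sum(s == "mis" for s in no_block.values()), sum(s == "mis" for s in blocked.values()))
```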

Proceedings ArticleDOI
11 Aug 2014
TL;DR: This paper applies a plethora of analysis processes on two subsets of Twitter public data, obtained through the service's sampling APIs, and extensively evaluates their relative performance in numerous scenarios.
Abstract: Social media analysis constitutes a scientific field that is rapidly gaining ground due to its numerous research challenges and practical applications, as well as the unprecedented availability of data in real time. Several of these applications have significant social and economic impact, such as journalism, crisis management, advertising, etc. However, two issues regarding these applications have to be confronted. The first one is the financial cost. Despite the abundance of information, it typically comes at a premium price, and only a fraction is provided free of charge. For example, Twitter, a predominant social media online service, grants researchers and practitioners free access to only a small proportion (1%) of its publicly available stream. The second issue is the computational cost. Even when the full stream is available, off-the-shelf approaches are unable to operate in such settings due to the real-time computational demands. Consequently, real-world applications as well as research efforts that exploit such information are limited to utilizing only a subset of the available data. In this paper, we are interested in evaluating the extent to which analytical processes are affected by the aforementioned limitation. In particular, we apply a plethora of analysis processes on two subsets of Twitter public data, obtained through the service's sampling APIs. The first one is the default 1% sample, whereas the second is the Gardenhose sample that our research group has access to, which returns 10% of all public data. We extensively evaluate their relative performance in numerous scenarios.

18 citations
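As a concrete example of the kind of comparison the abstract describes, the sketch below checks how well a small sample preserves the top-hashtag ranking observed in a larger one. The tweet schema, the use of Kendall's tau, and the toy data standing in for the Spritzer (1%) and Gardenhose (10%) streams are assumptions for illustration, not the paper's actual evaluation pipeline.

```python
# Minimal sketch: how well does a 1% sample preserve the top-hashtag ranking
# seen in a 10% sample? Field names and the rank-correlation choice are assumed.
from collections import Counter
from scipy.stats import kendalltau

def hashtag_counts(tweets):
    """tweets: iterable of dicts with a 'text' field (assumed schema)."""
    c = Counter()
    for t in tweets:
        c.update(w.lower() for w in t["text"].split() if w.startswith("#"))
    return c

def ranking_agreement(sample_small, sample_large, k=100):
    small, large = hashtag_counts(sample_small), hashtag_counts(sample_large)
    top = [h for h, _ in large.most_common(k)]       # treat larger sample as reference
    tau, _ = kendalltau([large[h] for h in top], [small.get(h, 0) for h in top])
    return tau

# Usage with toy data standing in for the Spritzer (1%) and Gardenhose (10%) streams.
gardenhose = [{"text": "#olympics opening"}, {"text": "#olympics #athens"}, {"text": "#news flood"}]
spritzer   = [{"text": "#olympics opening"}]
print(ranking_agreement(spritzer, gardenhose, k=3))
```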

Proceedings ArticleDOI
03 Dec 2012
TL;DR: This research is based on new data gathered by the voting advice application Choose4Greece, which was widely used in the most recent Greek elections; the proposed method produces more effective recommendations, as measured by accuracy and weighted mean rank.
Abstract: Voting advice applications (VAAs) have recently been developed to aid users in deciding how to vote in elections. Every user is presented with a set of important issues and is asked to submit her opinion by selecting one of a predefined set of answers (e.g. agree/disagree). The VAA gathers the same information for all candidates that are about to compete in the elections. Hence, it can provide recommendations to users: the candidates that agree with the user on these selected issues. In this paper, we propose a collaborative filtering approach for providing such suggestions. Like-minded users are clustered together based on their profiles (views on the selected issues), and a voting recommendation is provided to a user by the members of the cluster nearest to her profile. We observe, using two different measures (accuracy and weighted mean rank), that this method produces more effective recommendations. Furthermore, the proposed method provides important insight and summary information about the electorate's opinion. This research is based on new data gathered by the voting advice application Choose4Greece, which was widely used in the most recent elections in Greece.

17 citations
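A minimal sketch of the clustering-based recommendation idea described above: encode each user's answers as a vector, cluster users with k-means, and rank candidates by how closely their stances match the centroid of the user's cluster. The answer encoding, the Euclidean agreement measure, and the toy profiles are assumptions; the paper's own pipeline and the Choose4Greece data are not reproduced here.

```python
# Sketch of clustering-based candidate recommendation for a VAA-style setting.
# Encoding and agreement measure are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans

# Rows = users, columns = issues; answers encoded as -1 (disagree) .. +1 (agree).
users = np.array([[ 1,  1, -1, 0],
                  [ 1,  1, -1, 1],
                  [-1, -1,  1, 0],
                  [-1,  0,  1, 1]])
candidates = {"A": np.array([1, 1, -1, 1]),
              "B": np.array([-1, -1, 1, 0])}

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(users)

def recommend(user_answers):
    # Assign the user to the nearest cluster of like-minded users.
    cluster = km.predict(user_answers.reshape(1, -1))[0]
    centroid = km.cluster_centers_[cluster]
    # Rank candidates by closeness of their stances to that cluster's centroid.
    scores = {name: -np.linalg.norm(centroid - ans) for name, ans in candidates.items()}
    return sorted(scores, key=scores.get, reverse=True)

print(recommend(np.array([1, 0, -1, 1])))   # likely ranks candidate "A" first
```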


Cited by

Proceedings ArticleDOI
13 Aug 2016
TL;DR: node2vec learns a mapping of nodes to a low-dimensional space of features that maximizes the likelihood of preserving network neighborhoods of nodes, using a biased random walk procedure.
Abstract: Prediction tasks over nodes and edges in networks require careful effort in engineering features used by learning algorithms. Recent research in the broader field of representation learning has led to significant progress in automating prediction by learning the features themselves. However, present feature learning approaches are not expressive enough to capture the diversity of connectivity patterns observed in networks. Here we propose node2vec, an algorithmic framework for learning continuous feature representations for nodes in networks. In node2vec, we learn a mapping of nodes to a low-dimensional space of features that maximizes the likelihood of preserving network neighborhoods of nodes. We define a flexible notion of a node's network neighborhood and design a biased random walk procedure, which efficiently explores diverse neighborhoods. Our algorithm generalizes prior work which is based on rigid notions of network neighborhoods, and we argue that the added flexibility in exploring neighborhoods is the key to learning richer representations. We demonstrate the efficacy of node2vec over existing state-of-the-art techniques on multi-label classification and link prediction in several real-world networks from diverse domains. Taken together, our work represents a new way for efficiently learning state-of-the-art task-independent representations in complex networks.

7,072 citations
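The biased random walk is the part of node2vec the abstract describes most concretely, so here is a small re-implementation of that walk (return parameter p, in-out parameter q). Edge weights are assumed uniform, alias sampling is omitted, and the subsequent skip-gram training over the walks is not shown.

```python
# Sketch of node2vec's biased second-order random walk: the previous step biases
# the next one via return parameter p and in-out parameter q.
import random
import networkx as nx

def node2vec_walk(G, start, length, p=1.0, q=1.0, rng=None):
    rng = rng or random.Random(0)
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        nbrs = list(G.neighbors(cur))
        if not nbrs:
            break
        if len(walk) == 1:
            walk.append(rng.choice(nbrs))            # first step is unbiased
            continue
        prev = walk[-2]
        weights = []
        for x in nbrs:
            if x == prev:                            # distance 0 from previous node
                weights.append(1.0 / p)
            elif G.has_edge(x, prev):                # distance 1
                weights.append(1.0)
            else:                                    # distance 2
                weights.append(1.0 / q)
        walk.append(rng.choices(nbrs, weights=weights, k=1)[0])
    return walk

# Usage: low q encourages outward (DFS-like) exploration, high q keeps walks local.
G = nx.karate_club_graph()
print(node2vec_walk(G, start=0, length=10, p=1.0, q=0.5))
```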

Journal ArticleDOI
TL;DR: This paper aims to provide a timely review of this area, with emphasis on state-of-the-art multi-label learning algorithms and relevant analyses and discussions.
Abstract: Multi-label learning studies the problem where each example is represented by a single instance while associated with a set of labels simultaneously. During the past decade, a significant amount of progress has been made toward this emerging machine learning paradigm. This paper aims to provide a timely review of this area with emphasis on state-of-the-art multi-label learning algorithms. Firstly, fundamentals of multi-label learning, including the formal definition and evaluation metrics, are given. Secondly and primarily, eight representative multi-label learning algorithms are scrutinized under common notations with relevant analyses and discussions. Thirdly, several related learning settings are briefly summarized. As a conclusion, online resources and open research problems on multi-label learning are outlined for reference purposes.

2,495 citations
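To make the problem setting concrete: each instance is associated with a set of labels rather than a single one. The sketch below uses the simple binary-relevance strategy (one binary classifier per label) and reports Hamming loss, one of the evaluation metrics such reviews cover; the synthetic dataset and the choice of logistic regression are arbitrary assumptions.

```python
# Minimal sketch of the multi-label setting: binary relevance (one classifier per
# label) on a synthetic dataset, evaluated with Hamming loss.
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import hamming_loss
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputClassifier

X, Y = make_multilabel_classification(n_samples=300, n_classes=5, random_state=0)
X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, random_state=0)

# Binary relevance: fit an independent binary classifier for each of the 5 labels.
clf = MultiOutputClassifier(LogisticRegression(max_iter=1000)).fit(X_tr, Y_tr)
Y_pred = clf.predict(X_te)

print("Hamming loss:", hamming_loss(Y_te, Y_pred))        # fraction of wrong label bits
print("Labels for first test instance:", np.flatnonzero(Y_pred[0]))
```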