Showing papers by "Emiliano De Cristofaro published in 2019"


Proceedings ArticleDOI
19 May 2019
TL;DR: Passive and active inference attacks are developed to exploit the leakage of information about participants' training data in collaborative and federated learning: an adversarial participant can infer the presence of exact data points in others' training data (membership inference), as well as properties that hold only for a subset of the training data and are independent of the properties the joint model aims to capture.
Abstract: Collaborative machine learning and related techniques such as federated learning allow multiple participants, each with his own training dataset, to build a joint model by training locally and periodically exchanging model updates. We demonstrate that these updates leak unintended information about participants' training data and develop passive and active inference attacks to exploit this leakage. First, we show that an adversarial participant can infer the presence of exact data points -- for example, specific locations -- in others' training data (i.e., membership inference). Then, we show how this adversary can infer properties that hold only for a subset of the training data and are independent of the properties that the joint model aims to capture. For example, he can infer when a specific person first appears in the photos used to train a binary gender classifier. We evaluate our attacks on a variety of tasks, datasets, and learning configurations, analyze their limitations, and discuss possible defenses.
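As a concrete illustration, here is a minimal sketch of the passive membership signal described above. It is not the paper's actual attack (which trains a classifier on gradient and embedding features); it only shows the core intuition that records in some participant's training set tend to have lower loss under the joint-model snapshots an adversarial participant observes each round. All names and data are placeholders.

```python
# Minimal sketch (not the paper's actual attack, which trains a classifier
# on gradient/embedding features) of the passive membership signal: records
# in some participant's training set tend to have lower loss under the
# joint-model snapshots observed across training rounds.
import numpy as np

def logistic_loss(w, x, y):
    """Cross-entropy loss of a linear model w on one example (x, y)."""
    p = 1.0 / (1.0 + np.exp(-x @ w))
    return -(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))

def membership_score(snapshots, x, y):
    """Average loss over observed snapshots; lower = more likely a member."""
    return np.mean([logistic_loss(w, x, y) for w in snapshots])

# Toy usage: `snapshots` stands for the joint-model parameters an
# adversarial participant sees at each federated-learning round.
rng = np.random.default_rng(0)
snapshots = [rng.normal(size=10) for _ in range(5)]
candidate_x, candidate_y = rng.normal(size=10), 1
print(membership_score(snapshots, candidate_x, candidate_y))
```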

1,084 citations


Journal ArticleDOI
01 Jan 2019
TL;DR: The first membership inference attacks against generative models are presented: given a data point, the adversary determines whether or not it was used to train the model. The attacks leverage Generative Adversarial Networks (GANs), which combine a discriminative and a generative model, using the discriminator's capacity to learn statistical differences in distributions to detect overfitting and recognize inputs that were part of the training dataset.
Abstract: Generative models estimate the underlying distribution of a dataset to generate realistic samples according to that distribution. In this paper, we present the first membership inference attacks against generative models: given a data point, the adversary determines whether or not it was used to train the model. Our attacks leverage Generative Adversarial Networks (GANs), which combine a discriminative and a generative model, to detect overfitting and recognize inputs that were part of training datasets, using the discriminator's capacity to learn statistical differences in distributions. We present attacks based on both white-box and black-box access to the target model, against several state-of-the-art generative models, over datasets of complex representations of faces (LFW), objects (CIFAR-10), and medical images (Diabetic Retinopathy). We also discuss the sensitivity of the attacks to different training parameters, and their robustness against mitigation strategies, finding that defenses are either ineffective or lead to significantly worse performance of the generative models in terms of training stability and/or sample quality.
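For intuition, the white-box variant of such an attack can be sketched in a few lines: score every candidate record with the trained discriminator and predict the highest-scoring ones as training members, since the discriminator assigns higher confidence to overfitted inputs. The discriminator below is a stand-in placeholder, not a real GAN.

```python
# Toy sketch of the white-box attack: rank candidate records by the
# trained discriminator's confidence and flag the top-n as training-set
# members, exploiting the discriminator's overfitting. The lambda below
# is a stand-in for a real GAN discriminator D(x).
import numpy as np

def membership_attack(discriminator, candidates, n_members):
    scores = np.array([discriminator(x) for x in candidates])
    top = np.argsort(scores)[::-1][:n_members]   # highest-scoring first
    predictions = np.zeros(len(candidates), dtype=bool)
    predictions[top] = True
    return predictions

rng = np.random.default_rng(0)
fake_discriminator = lambda x: float(x.sum())    # placeholder D(x)
candidates = [rng.random(2) for _ in range(10)]
print(membership_attack(fake_discriminator, candidates, n_members=3))
```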

266 citations


Journal ArticleDOI
09 Apr 2019
TL;DR: MAMADROID is a static-analysis-based system that abstracts an app's API calls to their class, package, or family, and models the call sequences obtained from the app's call graph as Markov chains.
Abstract: As Android has become increasingly popular, so has malware targeting it, thus motivating the research community to propose different detection techniques. However, the constant evolution of the Android ecosystem, and of malware itself, makes it hard to design robust tools that can operate for long periods of time without the need for modifications or costly re-training. Aiming to address this issue, we set out to detect malware from a behavioral point of view, modeled as the sequence of abstracted API calls. We introduce MAMADROID, a static-analysis-based system that abstracts an app's API calls to their class, package, or family, and models the call sequences obtained from the app's call graph as Markov chains. This ensures that the model is more resilient to API changes and that the feature set is of manageable size. We evaluate MAMADROID using a dataset of 8.5K benign and 35.5K malicious apps collected over a period of 6 years, showing that it effectively detects malware (with up to 0.99 F-measure) and keeps its detection capabilities for long periods of time (up to 0.87 F-measure 2 years after training). We also show that MAMADROID significantly outperforms DROIDAPIMINER, a state-of-the-art detection system that relies on the frequency of (raw) API calls. Aiming to assess whether MAMADROID's effectiveness mainly stems from the API abstraction or from the sequence modeling, we also evaluate a variant of it that uses the frequency (instead of the sequences) of abstracted API calls. We find that it is not as accurate, failing to capture maliciousness when trained on malware samples that include API calls that are equally or more frequently used by benign apps.
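A toy sketch of the feature construction described above may help: abstract each API call to a coarser label, then turn the call sequence into Markov-chain transition probabilities. The real system extracts call sequences from an app's call graph via static analysis and abstracts to family/package/class; this sketch only shows the chain-building step on made-up call names.

```python
# Illustrative sketch of MaMaDroid's core feature construction: abstract
# API calls, then model the resulting sequence as a Markov chain whose
# transition probabilities become the feature vector. Call-graph
# extraction is omitted; the calls below are made-up examples.
from collections import defaultdict

def abstract_call(api_call):
    """Abstract 'android.telephony.SmsManager.sendTextMessage()' to its
    package prefix; MaMaDroid abstracts to family/package/class."""
    return ".".join(api_call.split(".")[:2])

def markov_features(call_sequence):
    counts = defaultdict(lambda: defaultdict(int))
    abstracted = [abstract_call(c) for c in call_sequence]
    for src, dst in zip(abstracted, abstracted[1:]):
        counts[src][dst] += 1
    features = {}
    for src, targets in counts.items():
        total = sum(targets.values())   # row-normalise into probabilities
        features[src] = {dst: n / total for dst, n in targets.items()}
    return features

seq = ["android.telephony.SmsManager.sendTextMessage()",
       "java.lang.Runtime.exec()",
       "android.telephony.TelephonyManager.getDeviceId()"]
print(markov_features(seq))
```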

216 citations


Proceedings ArticleDOI
13 May 2019
TL;DR: The authors analyze 27K tweets posted by 1K Twitter users identified as having ties with Russia's Internet Research Agency, and thus likely state-sponsored trolls, finding that these trolls managed to stay active for long periods of time and to reach a substantial number of Twitter users with their tweets.
Abstract: Over the past couple of years, anecdotal evidence has emerged linking coordinated campaigns by state-sponsored actors with efforts to manipulate public opinion on the Web, often around major political events, through dedicated accounts, or "trolls." Although they are often involved in spreading disinformation on social media, there is little understanding of how these trolls operate, what type of content they disseminate, and, most importantly, their influence on the information ecosystem. In this paper, we shed light on these questions by analyzing 27K tweets posted by 1K Twitter users identified as having ties with Russia's Internet Research Agency and thus likely state-sponsored trolls. We compare their behavior to a random set of Twitter users, finding interesting differences in terms of the content they disseminate, the evolution of their accounts, as well as their general behavior and use of Twitter. Then, using Hawkes Processes, we quantify the influence that trolls had on the dissemination of news on social platforms like Twitter, Reddit, and 4chan. Overall, our findings indicate that Russian trolls managed to stay active for long periods of time and to reach a substantial number of Twitter users with their tweets. When looking at their ability to spread news content and make it viral, however, we find that their effect on social platforms was minor, with the significant exception of news published by the Russian state-sponsored news outlet RT (Russia Today).
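For reference, the influence analysis mentioned above relies on multivariate Hawkes processes. In the standard exponential-kernel formulation (the paper's exact parameterisation may differ), the event intensity on platform $i$ at time $t$ is:

```latex
\lambda_i(t) \;=\; \mu_i \;+\; \sum_{j=1}^{K} \sum_{t_k^j < t} \alpha_{j \to i}\, e^{-\beta\,(t - t_k^j)}
```

Here $\mu_i$ is the platform's background rate, and the fitted weight $\alpha_{j \to i}$ captures how many additional events on platform $i$ an event on platform $j$ is expected to trigger, which is how cross-platform influence can be quantified.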

119 citations


Journal ArticleDOI
TL;DR: This work presents a robust methodology to distinguish bullies and aggressors from normal Twitter users by considering text, user, and network-based attributes, and discusses the current status of Twitter user accounts marked as abusive by the methodology and the performance of potential mechanisms that can be used by Twitter to suspend users in the future.
Abstract: Cyberbullying and cyberaggression are increasingly worrisome phenomena affecting people across all demographics. More than half of young social media users worldwide have been exposed to such prolonged and/or coordinated digital harassment. Victims can experience a wide range of emotions, with negative consequences such as embarrassment, depression, and isolation from other community members, which carry the risk of escalating to even more critical consequences, such as suicide attempts. In this work, we take the first concrete steps to understand the characteristics of abusive behavior on Twitter, one of today's largest social media platforms. We analyze 1.2 million users and 2.1 million tweets, comparing users participating in discussions around seemingly normal topics like the NBA to those more likely to be hate-related, such as the Gamergate controversy or the gender pay inequality at the BBC. We also explore specific manifestations of abusive behavior, i.e., cyberbullying and cyberaggression, in one of the hate-related communities (Gamergate). We present a robust methodology to distinguish bullies and aggressors from normal Twitter users by considering text, user, and network-based attributes. Using various state-of-the-art machine-learning algorithms, we classify these accounts with over 90% accuracy and AUC. Finally, we discuss the current status of Twitter user accounts marked as abusive by our methodology and study the performance of potential mechanisms that can be used by Twitter to suspend users in the future.
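To make the classification setup concrete, here is a hedged sketch: text-, user-, and network-based attributes are combined into one feature vector per account and fed to an off-the-shelf classifier. The feature names and values are invented placeholders, and the paper evaluates several algorithms, not just a random forest.

```python
# Hedged sketch of the classification setup: combine text-, user-, and
# network-based attributes into one feature vector per account and train
# a standard classifier. Features and values are invented placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Each row: [avg_tweet_sentiment, hashtag_count, account_age_days,
#            follower_count, clustering_coefficient]   (example features)
X = np.array([[-0.8, 5.0,  120.0,  40.0, 0.02],   # aggressive-looking user
              [ 0.3, 1.0, 2000.0, 900.0, 0.15],   # normal user
              [-0.6, 7.0,   90.0,  25.0, 0.01],
              [ 0.4, 0.0, 1500.0, 700.0, 0.20]])
y = np.array([1, 0, 1, 0])   # 1 = bully/aggressor, 0 = normal

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(clf.predict([[-0.7, 6.0, 100.0, 30.0, 0.02]]))
```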

80 citations


Journal ArticleDOI
07 Nov 2019
TL;DR: This paper proposes an automated solution to identify YouTube videos that are likely to be targeted by coordinated harassers from fringe communities like 4chan, using an ensemble of classifiers to determine the likelihood that a video will be raided, with very good results (AUC up to 94%).
Abstract: Video sharing platforms like YouTube are increasingly targeted by aggression and hate attacks. Prior work has shown how these attacks often take place as a result of "raids," i.e., organized efforts by ad-hoc mobs coordinating from third-party communities. Despite the increasing relevance of this phenomenon, however, online services often lack effective countermeasures to mitigate it. Unlike well-studied problems like spam and phishing, coordinated aggressive behavior both targets and is perpetrated by humans, making defense mechanisms that look for automated activity unsuitable. Therefore, the de-facto solution is to reactively rely on user reports and human moderation. In this paper, we propose an automated solution to identify YouTube videos that are likely to be targeted by coordinated harassers from fringe communities like 4chan. First, we characterize and model YouTube videos along several axes (metadata, audio transcripts, thumbnails) based on a ground truth dataset of videos that were targeted by raids. Then, we use an ensemble of classifiers to determine the likelihood that a video will be raided with very good results (AUC up to 94%). Overall, our work provides an important first step towards deploying proactive systems to detect and mitigate coordinated hate attacks on platforms like YouTube.
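A minimal sketch of the ensemble idea follows: train one classifier per modality and combine them by soft voting over predicted probabilities. In the paper, each classifier sees features from its own modality (metadata, transcripts, thumbnails); here, for brevity, both see the same synthetic stand-in features.

```python
# Minimal sketch of the ensemble idea: several classifiers combined by
# soft voting. In the paper each classifier sees its own modality's
# features; here both see the same synthetic stand-ins.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.random((40, 6))            # stand-in per-video feature vectors
y = np.array([0, 1] * 20)          # 1 = video was raided (toy labels)

ensemble = VotingClassifier(
    estimators=[("meta", LogisticRegression(max_iter=1000)),
                ("text", RandomForestClassifier(n_estimators=50))],
    voting="soft")                 # average the predicted probabilities
ensemble.fit(X, y)
print(ensemble.predict_proba(X[:3])[:, 1])   # raid likelihood per video
```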

77 citations


Journal ArticleDOI
TL;DR: This paper presents a novel technique for privately releasing generative models and entire high-dimensional datasets produced by these models, and evaluates it on the MNIST dataset, showing that it produces realistic synthetic samples, which can also be used to accurately compute an arbitrary number of counting queries.
Abstract: Generative models are used in a wide range of applications building on large amounts of contextually rich information. Due to possible privacy violations of the individuals whose data is used to train these models, however, publishing or sharing generative models is not always viable. In this paper, we present a novel technique for privately releasing generative models and entire high-dimensional datasets produced by these models. We model the generator distribution of the training data with a mixture of $k$ generative neural networks. These are trained together and collectively learn the generator distribution of a dataset. Data is divided into $k$ clusters, using a novel differentially private kernel $k$-means, then each cluster is given to a separate generative neural network, such as a Restricted Boltzmann Machine or a Variational Autoencoder, which is trained only on its own cluster using differentially private gradient descent. We evaluate our approach using the MNIST dataset, as well as call detail records and transit datasets, showing that it produces realistic synthetic samples, which can also be used to accurately compute an arbitrary number of counting queries.
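One ingredient of the approach above can be sketched compactly: making centroid updates differentially private by adding calibrated noise to per-cluster sums and counts. This is only the noisy-centroid idea; the paper uses a kernel k-means variant and trains the per-cluster generative models with differentially private gradient descent, neither of which is shown.

```python
# Toy sketch of one ingredient: a differentially private centroid update
# that noises both the per-cluster sum and count with Gaussian noise.
# The paper uses a kernel k-means variant and DP gradient descent for
# the per-cluster generative models (not shown).
import numpy as np

def dp_centroid_update(X, assignments, k, sigma, rng):
    centroids = []
    for c in range(k):
        members = X[assignments == c]
        noisy_sum = members.sum(axis=0) + rng.normal(0, sigma, X.shape[1])
        noisy_count = max(len(members) + rng.normal(0, sigma), 1.0)
        centroids.append(noisy_sum / noisy_count)
    return np.stack(centroids)

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))
assignments = rng.integers(0, 3, size=100)   # stand-in cluster labels
print(dp_centroid_update(X, assignments, k=3, sigma=1.0, rng=rng))
```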

65 citations


Proceedings ArticleDOI
21 Oct 2019
TL;DR: This paper focuses on identifying key challenges that might disrupt continuing efforts to decentralise the web, and empirically highlight a number of properties that are creating natural pressures towards re-centralisation.
Abstract: The Decentralised Web (DW) has recently seen a renewed momentum, with a number of DW platforms like Mastodon, PeerTube, and Hubzilla gaining increasing traction. These offer alternatives to traditional social networks like Twitter, YouTube, and Facebook, by enabling the operation of web infrastructure and services without centralised ownership or control. Although their services differ greatly, modern DW platforms mostly rely on two key innovations: first, their open source software allows anybody to set up independent servers ("instances") that people can sign up to and use within a local community; and second, they build on top of federation protocols so that instances can mesh together, in a peer-to-peer fashion, to offer a globally integrated platform. In this paper, we present a measurement-driven exploration of these two innovations, using a popular DW microblogging platform (Mastodon) as a case study. We focus on identifying key challenges that might disrupt continuing efforts to decentralise the web, and empirically highlight a number of properties that are creating natural pressures towards re-centralisation. Finally, our measurements shed light on the behaviour of both administrators (i.e., people setting up instances) and regular users who sign up to the platforms, also discussing a few techniques that may address some of the issues observed.

36 citations


Posted Content
TL;DR: The authors present a measurement-driven exploration of the two key innovations underpinning the Decentralised Web, independent instances and federation, using a popular DW microblogging platform (Mastodon) as a case study, identifying key challenges that might disrupt continuing efforts to decentralise the web and empirically highlighting a number of properties that are creating natural pressures towards recentralisation.
Abstract: The Decentralised Web (DW) has recently seen a renewed momentum, with a number of DW platforms like Mastodon, PeerTube, and Hubzilla gaining increasing traction. These offer alternatives to traditional social networks like Twitter, YouTube, and Facebook, by enabling the operation of web infrastructure and services without centralised ownership or control. Although their services differ greatly, modern DW platforms mostly rely on two key innovations: first, their open source software allows anybody to set up independent servers ("instances") that people can sign up to and use within a local community; and second, they build on top of federation protocols so that instances can mesh together, in a peer-to-peer fashion, to offer a globally integrated platform. In this paper, we present a measurement-driven exploration of these two innovations, using a popular DW microblogging platform (Mastodon) as a case study. We focus on identifying key challenges that might disrupt continuing efforts to decentralise the web, and empirically highlight a number of properties that are creating natural pressures towards recentralisation. Finally, our measurements shed light on the behaviour of both administrators (i.e., people setting up instances) and regular users who sign up to the platforms, also discussing a few techniques that may address some of the issues observed.

36 citations


Posted Content
TL;DR: This paper presents a measurement study shedding light on how genetic testing is discussed in Web communities on Reddit and 4chan, using NLP and computer vision tools to identify trends, themes, and topics of discussion.
Abstract: Progress in genomics has enabled the emergence of a booming market for "direct-to-consumer" genetic testing. Nowadays, companies like 23andMe and AncestryDNA provide affordable health, genealogy, and ancestry reports, and have already tested tens of millions of customers. At the same time, alt- and far-right groups have also taken an interest in genetic testing, using it to attack minorities and prove their genetic "purity." In this paper, we present a measurement study shedding light on how genetic testing is being discussed in Web communities on Reddit and 4chan. We collect 1.3M comments posted over 27 months on the two platforms, using a set of 280 keywords related to genetic testing. We then use NLP and computer vision tools to identify trends, themes, and topics of discussion. Our analysis shows that genetic testing attracts a lot of attention on Reddit and 4chan, with discussions often including highly toxic language expressed through hateful, racist, and misogynistic comments. In particular, on 4chan's politically incorrect board (/pol/), content from genetic testing conversations involves several alt-right personalities and openly antisemitic rhetoric, often conveyed through memes. Finally, we find that discussions build around user groups, from technology enthusiasts to communities promoting fringe political views.
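The NLP side of such a study can be illustrated with a short topic-modeling sketch using LDA; the four comments below stand in for the 1.3M-comment corpus, and the paper's word-embedding and computer-vision analyses (e.g., of memes) are not shown.

```python
# Illustrative topic-modeling sketch: extract discussion topics from
# comments with LDA. The tiny corpus below is a stand-in for the real
# 1.3M Reddit/4chan comments.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

comments = ["got my 23andme ancestry results back today",
            "ancestry report says mostly northern european",
            "privacy risks of sharing raw dna data",
            "who owns your genetic data after testing"]

vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(comments)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
terms = vec.get_feature_names_out()
for topic in lda.components_:
    print([terms[i] for i in topic.argsort()[-4:]])  # top terms per topic
```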

29 citations


Journal ArticleDOI
01 Jan 2019
TL;DR: In this article, the authors contextualize and critically analyze the current knowledge on privacy-enhancing technologies used for testing, storing, and sharing genomic data, based on a representative sample of the work published in the past decade.
Abstract: Rapid advances in human genomics are enabling researchers to gain a better understanding of the role of the genome in our health and well-being, stimulating hope for more effective and cost-efficient healthcare. However, this also prompts a number of security and privacy concerns stemming from the distinctive characteristics of genomic data. To address them, a new research community has emerged and produced a large number of publications and initiatives. In this paper, we rely on a structured methodology to contextualize and provide a critical analysis of the current knowledge on privacy-enhancing technologies used for testing, storing, and sharing genomic data, using a representative sample of the work published in the past decade. We identify and discuss limitations, technical challenges, and issues faced by the community, focusing in particular on those that are inherently tied to the nature of the problem and are harder for the community alone to address. Finally, we report on the importance and difficulty of the identified challenges based on an online survey of genome data privacy experts.

Posted Content
TL;DR: The first study of images shared by state-sponsored accounts, based on a ground-truth dataset of 1.8M images posted to Twitter by accounts controlled by the Russian Internet Research Agency, shows that the trolls were more effective in disseminating politics-related imagery than other images.
Abstract: State-sponsored organizations are increasingly linked to efforts aimed to exploit social media for information warfare and manipulating public opinion. Typically, their activities rely on a number of social network accounts they control, aka trolls, that post and interact with other users disguised as "regular" users. These accounts often use images and memes, along with textual content, in order to increase the engagement and the credibility of their posts. In this paper, we present the first study of images shared by state-sponsored accounts by analyzing a ground truth dataset of 1.8M images posted to Twitter by accounts controlled by the Russian Internet Research Agency. First, we analyze the content of the images as well as their posting activity. Then, using Hawkes Processes, we quantify their influence on popular Web communities like Twitter, Reddit, 4chan's Politically Incorrect board (/pol/), and Gab, with respect to the dissemination of images. We find that the extensive image posting activity of Russian trolls coincides with real-world events (e.g., the Unite the Right rally in Charlottesville), and shed light on their targets as well as the content disseminated via images. Finally, we show that the trolls were more effective in disseminating politics-related imagery than other images.
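A common building block for analysing image posting at this scale is grouping near-duplicate images by perceptual hash. The sketch below is one plausible way to implement such a step, not necessarily the paper's pipeline; file paths are hypothetical, and it requires the Pillow and imagehash packages.

```python
# Hedged sketch: bucket near-duplicate images whose perceptual hashes
# (pHash) are within a small Hamming distance. Paths are hypothetical;
# requires the Pillow and imagehash packages.
from PIL import Image
import imagehash

def group_near_duplicates(paths, threshold=8):
    """Group images whose pHashes differ by <= threshold bits."""
    groups = []   # list of (representative_hash, [paths])
    for p in paths:
        h = imagehash.phash(Image.open(p))
        for rep, members in groups:
            if h - rep <= threshold:   # Hamming distance between hashes
                members.append(p)
                break
        else:
            groups.append((h, [p]))
    return groups

# Usage (hypothetical files):
# print(group_near_duplicates(["meme1.png", "meme2.png", "photo.jpg"]))
```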

Posted Content
TL;DR: It is shown that, while there is no silver bullet that enables arbitrary analysis, there are defenses that provide reasonable utility for particular tasks while reducing the extent of the inference.
Abstract: Aggregate location statistics are used in a number of mobility analytics to express how many people are in a certain location at a given time (but not who). However, prior work has shown that an adversary with some prior knowledge of a victim's mobility patterns can mount membership inference attacks to determine whether or not that user contributed to the aggregates. In this paper, we set out to understand why such inferences are successful and what can be done to mitigate them. We conduct an in-depth feature analysis, finding that the volume of data contributed and the regularity and particularity of mobility patterns play a crucial role in the attack. We then use these insights to adapt defenses proposed in the location privacy literature to the aggregate setting, and evaluate their privacy-utility trade-offs for common mobility analytics. We show that, while there is no silver bullet that enables arbitrary analysis, there are defenses that provide reasonable utility for particular tasks while reducing the extent of the inference.
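The basic shape of such a membership inference attack can be sketched with synthetic data: build aggregates with and without the victim's trace (the adversary's prior knowledge) and train a classifier to tell them apart. Everything below is a toy stand-in for real mobility data.

```python
# Minimal sketch of a membership-inference test on aggregate location
# counts: aggregates computed with vs. without the victim's trace, and a
# classifier trained to distinguish them. Synthetic Poisson "traces"
# stand in for real mobility data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n_users, n_bins = 50, 24                  # users x hourly count bins
traces = rng.poisson(1.0, size=(n_users, n_bins))
victim = traces[0]

def aggregate(include_victim):
    """Sum the traces of 30 random non-victim users (+ victim or not)."""
    others = traces[1:][rng.choice(n_users - 1, 30, replace=False)]
    agg = others.sum(axis=0)
    return agg + victim if include_victim else agg

X = np.array([aggregate(i % 2 == 0) for i in range(200)])
y = np.array([i % 2 == 0 for i in range(200)])
clf = LogisticRegression(max_iter=1000).fit(X, y)
print("in-sample attack accuracy:", clf.score(X, y))
```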

Posted Content
TL;DR: Measurements show that there does not exist a unique generic defense that can preserve the utility of the analytics for arbitrary applications, and provide useful insights regarding the disclosure of sanitized aggregate location time-series.
Abstract: While location data is extremely valuable for various applications, disclosing it prompts serious threats to individuals' privacy. To limit such concerns, organizations often provide analysts with aggregate time-series that indicate, e.g., how many people are in a location at a time interval, rather than raw individual traces. In this paper, we perform a measurement study to understand Membership Inference Attacks (MIAs) on aggregate location time-series, where an adversary tries to infer whether a specific user contributed to the aggregates. We find that the volume of contributed data, as well as the regularity and particularity of users' mobility patterns, play a crucial role in the attack's success. We experiment with a wide range of defenses based on generalization, hiding, and perturbation, and evaluate their ability to thwart the attack vis-a-vis the utility loss they introduce for various mobility analytics tasks. Our results show that some defenses fail across the board, while others work for specific tasks on aggregate location time-series. For instance, suppressing small counts can be used for ranking hotspots, data generalization for forecasting traffic, hotspot discovery, and map inference, while sampling is effective for location labeling and anomaly detection when the dataset is sparse. Differentially private techniques provide reasonable accuracy only in very specific settings, e.g., discovering hotspots and forecasting their traffic, and more so when using weaker privacy notions like crowd-blending privacy. Overall, our measurements show that there does not exist a unique generic defense that can preserve the utility of the analytics for arbitrary applications, and provide useful insights regarding the disclosure of sanitized aggregate location time-series.
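For instance, the small-count suppression defense mentioned above is straightforward; a minimal sketch:

```python
# Sketch of one defense evaluated above: suppress small counts in the
# released aggregate time-series (cells below k are zeroed), which the
# study finds adequate for tasks like ranking hotspots.
import numpy as np

def suppress_small_counts(aggregates, k=5):
    """Zero out any location/time cell whose count is below k."""
    out = aggregates.copy()
    out[out < k] = 0
    return out

agg = np.array([[12, 3, 0,  7],
                [ 1, 9, 4, 15]])   # rows: locations, cols: time slots
print(suppress_small_counts(agg, k=5))
```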

Proceedings ArticleDOI
13 May 2019
TL;DR: Private Data Donor (PDD) is presented, a decentralized and private-by-design platform providing crowd-sourced Web searches to researchers; it builds on a cryptographic protocol for privacy-preserving data aggregation and addresses practical challenges to add reliability into the system with regard to users disconnecting or stopping using the platform.
Abstract: Search engines play an important role on the Web, helping users find relevant resources and answers to their questions. At the same time, search logs can also be of great utility to researchers. For instance, a number of recent research efforts have relied on them to build prediction and inference models, for applications ranging from economics and marketing to public health surveillance. However, companies rarely release search logs, also due to the related privacy issues that ensue, as they are inherently hard to anonymize. As a result, it is very difficult for researchers to have access to search data, and even if they do, they are fully dependent on the company providing them. Aiming to overcome these issues, this paper presents Private Data Donor (PDD), a decentralized and private-by-design platform providing crowd-sourced Web searches to researchers. We build on a cryptographic protocol for privacy-preserving data aggregation, and address a few practical challenges to add reliability into the system with regard to users disconnecting or stopping using the platform. We discuss how PDD can be used to build a flu monitoring model, and evaluate the impact of the privacy-preserving layer on the quality of the results. Finally, we present the implementation of our platform, as a browser extension and a server, and report on a pilot deployment with real users.
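The kind of privacy-preserving aggregation PDD builds on can be illustrated with a toy pairwise-masking scheme: each pair of users shares a random mask that one adds and the other subtracts, so the aggregator sees only masked values, yet the masks cancel in the sum. Real protocols add key agreement and fault tolerance for dropouts, which are omitted here.

```python
# Toy sketch of pairwise-masked aggregation: every pair of users shares
# a random mask that one adds and the other subtracts (mod M), hiding
# individual contributions while all masks cancel in the total.
import secrets

M = 2**32   # modulus for blinded arithmetic

def masked_contributions(values):
    masked = [v % M for v in values]
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            mask = secrets.randbelow(M)
            masked[i] = (masked[i] + mask) % M   # user i adds the mask
            masked[j] = (masked[j] - mask) % M   # user j subtracts it
    return masked

counts = [3, 0, 5, 1]                 # e.g., per-user searches for "flu"
masked = masked_contributions(counts)
print(masked)                         # individually meaningless values
print(sum(masked) % M)                # aggregator recovers 9, not who
```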

Proceedings ArticleDOI
11 Nov 2019
TL;DR: The authors analyze the real-world security guarantees provided by GenoGuard (Huang et al., 2015), showing that if the adversary has access to side information in the form of partial information from the target sequence, access to a GenoGuard ciphertext does appreciably increase her power in determining the rest of the sequence.
Abstract: Due to its hereditary nature, genomic data is not only linked to its owner but to that of close relatives as well. As a result, its sensitivity does not really degrade over time; in fact, the relevance of a genomic sequence is likely to outlast the security provided by encryption. This prompts the need for specialized techniques providing long-term security for genomic data, yet the only available tool for this purpose is GenoGuard (Huang et al., 2015). By relying on Honey Encryption, GenoGuard is secure against an adversary that can brute force all possible keys; i.e., whenever an attacker tries to decrypt using an incorrect password, she will obtain an incorrect but plausible-looking decoy sequence. In this paper, we set out to analyze the real-world security guarantees provided by GenoGuard; specifically, we assess how much more information access to a ciphertext encrypted using GenoGuard yields, compared to one that was not. Overall, we find that, if the adversary has access to side information in the form of partial information from the target sequence, the use of GenoGuard does appreciably increase her power in determining the rest of the sequence. We show that, in the case of a sequence encrypted using an easily guessable (low-entropy) password, the adversary is able to rule out most decoy sequences, and obtain the target sequence with just 2.5% of it available as side information. In the case of a harder-to-guess (high-entropy) password, we show that the adversary still obtains, on average, better accuracy in guessing the rest of the target sequence than using state-of-the-art genomic sequence inference methods, obtaining up to a 15% improvement in accuracy.
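To see why a wrong password yields a plausible decoy, consider this toy illustration of the Honey Encryption idea. It is not GenoGuard's actual construction, which encodes sequences under a realistic genomic model; here the encoding is trivial, so every decryption, under any password, decodes to a valid-looking DNA string.

```python
# Toy Honey-Encryption illustration (NOT GenoGuard's construction): the
# plaintext is mapped to a seed, and decryption with ANY password decodes
# back into a valid-looking sequence, so wrong guesses yield decoys.
import hashlib

ALPHABET = "ACGT"

def pad(password, n):
    """Derive an n-byte keystream from the password (toy KDF)."""
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(password.encode() + bytes([counter])).digest()
        counter += 1
    return out[:n]

def encrypt(seq, password):
    seed = bytes(ALPHABET.index(c) for c in seq)            # trivial "DTE"
    return bytes(s ^ p for s, p in zip(seed, pad(password, len(seed))))

def decrypt(ct, password):
    seed = bytes(c ^ p for c, p in zip(ct, pad(password, len(ct))))
    return "".join(ALPHABET[b % 4] for b in seed)           # always decodes

ct = encrypt("ACGTACGT", "correct horse")
print(decrypt(ct, "correct horse"))   # ACGTACGT (the real sequence)
print(decrypt(ct, "wrong guess"))     # a decoy that still looks like DNA
```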

Journal ArticleDOI
28 Jan 2019
TL;DR: In this article, the authors present a measurement study of state-of-the-art collaborative predictive blacklisting (CPB) techniques, aiming to shed light on the actual impact of collaboration.
Abstract: Collaborative predictive blacklisting (CPB) enables forecasting future attack sources based on logs and alerts contributed by multiple organizations. Unfortunately, however, research on CPB has only focused on increasing the number of predicted attacks and has not considered the impact on false positives and false negatives. Moreover, sharing alerts is often hindered by confidentiality, trust, and liability issues, which motivates the need for privacy-preserving approaches to the problem. In this paper, we present a measurement study of state-of-the-art CPB techniques, aiming to shed light on the actual impact of collaboration. To this end, we reproduce and measure two systems: a non-privacy-friendly one that uses a trusted coordinating party with access to all alerts [12], and a peer-to-peer one using privacy-preserving data sharing [8]. We show that, while collaboration boosts the number of predicted attacks, it also yields high false positives, ultimately leading to poor accuracy. This motivates us to present a hybrid approach, using a semi-trusted central entity, aiming to increase utility from collaboration while, at the same time, limiting information disclosure and false positives. This leads to a better trade-off between true and false positive rates, while also addressing privacy concerns.
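The prediction side of CPB can be sketched with a simple time-decayed scoring of attack sources over the alert logs shared by collaborating organizations; the IPs and windows below are illustrative, not the systems' actual algorithms.

```python
# Minimal sketch of the time-series side of collaborative predictive
# blacklisting: score each attack source by an exponentially decayed
# count of past shared alerts, and blacklist the top-scoring sources.
from collections import defaultdict

def predict_blacklist(alert_windows, top_n=2, decay=0.5):
    """alert_windows: list (oldest..newest) of lists of attacker IPs."""
    scores = defaultdict(float)
    for age, window in enumerate(reversed(alert_windows)):
        for ip in window:
            scores[ip] += decay ** age     # newer alerts weigh more
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

shared_logs = [["1.2.3.4", "5.6.7.8"],     # day 1 (all orgs' alerts)
               ["1.2.3.4"],                # day 2
               ["9.9.9.9", "1.2.3.4"]]     # day 3
print(predict_blacklist(shared_logs))      # ['1.2.3.4', '9.9.9.9']
```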

Journal ArticleDOI
TL;DR: This work presents the design and implementation of SplitBox, a system for privacy-preserving processing of network functions outsourced to cloud middleboxes, i.e., without revealing the policies governing these functions, while providing provably secure guarantees.

Posted Content
TL;DR: Overall, it is found that, if the adversary has access to side information in the form of partial information from the target sequence, the use of GenoGuard does appreciably increase her power in determining the rest of the sequence.
Abstract: Due to its hereditary nature, genomic data is not only linked to its owner but to that of close relatives as well. As a result, its sensitivity does not really degrade over time; in fact, the relevance of a genomic sequence is likely to outlast the security provided by encryption. This prompts the need for specialized techniques providing long-term security for genomic data, yet the only available tool for this purpose is GenoGuard (Huang et al., 2015). By relying on Honey Encryption, GenoGuard is secure against an adversary that can brute force all possible keys; i.e., whenever an attacker tries to decrypt using an incorrect password, she will obtain an incorrect but plausible-looking decoy sequence. In this paper, we set out to analyze the real-world security guarantees provided by GenoGuard; specifically, we assess how much more information access to a ciphertext encrypted using GenoGuard yields, compared to one that was not. Overall, we find that, if the adversary has access to side information in the form of partial information from the target sequence, the use of GenoGuard does appreciably increase her power in determining the rest of the sequence. We show that, in the case of a sequence encrypted using an easily guessable (low-entropy) password, the adversary is able to rule out most decoy sequences, and obtain the target sequence with just 2.5% of it available as side information. In the case of a harder-to-guess (high-entropy) password, we show that the adversary still obtains, on average, better accuracy in guessing the rest of the target sequence than using state-of-the-art genomic sequence inference methods, obtaining up to a 15% improvement in accuracy.

Posted Content
TL;DR: This paper analyzes 1.2 million users and 2.1 million tweets, comparing users participating in discussions around seemingly normal topics like the NBA to those more likely to be hate-related, such as the Gamergate controversy or the gender pay inequality at the BBC.
Abstract: Cyberbullying and cyberaggression are increasingly worrisome phenomena affecting people across all demographics. More than half of young social media users worldwide have been exposed to such prolonged and/or coordinated digital harassment. Victims can experience a wide range of emotions, with negative consequences such as embarrassment, depression, and isolation from other community members, which carry the risk of escalating to even more critical consequences, such as suicide attempts. In this work, we take the first concrete steps to understand the characteristics of abusive behavior on Twitter, one of today's largest social media platforms. We analyze 1.2 million users and 2.1 million tweets, comparing users participating in discussions around seemingly normal topics like the NBA to those more likely to be hate-related, such as the Gamergate controversy or the gender pay inequality at the BBC. We also explore specific manifestations of abusive behavior, i.e., cyberbullying and cyberaggression, in one of the hate-related communities (Gamergate). We present a robust methodology to distinguish bullies and aggressors from normal Twitter users by considering text, user, and network-based attributes. Using various state-of-the-art machine learning algorithms, we classify these accounts with over 90% accuracy and AUC. Finally, we discuss the current status of Twitter user accounts marked as abusive by our methodology, and study the performance of potential mechanisms that can be used by Twitter to suspend users in the future.