
Showing papers by "Dawn Song published in 2012"


Proceedings ArticleDOI
20 May 2012
TL;DR: In over 20% of cases, the classifiers can correctly identify an anonymous author given a corpus of texts from 100,000 authors; in about 35% of cases the correct author is one of the top 20 guesses.
Abstract: We study techniques for identifying an anonymous author via linguistic stylometry, i.e., comparing the writing style against a corpus of texts of known authorship. We experimentally demonstrate the effectiveness of our techniques with as many as 100,000 candidate authors. Given the increasing availability of writing samples online, our result has serious implications for anonymity and free speech - an anonymous blogger or whistleblower may be unmasked unless they take steps to obfuscate their writing style. While there is a huge body of literature on authorship recognition based on writing style, almost none of it has studied corpora of more than a few hundred authors. The problem becomes qualitatively different at a large scale, as we show, and techniques from prior work fail to scale, both in terms of accuracy and performance. We study a variety of classifiers, both "lazy" and "eager," and show how to handle the huge number of classes. We also develop novel techniques for confidence estimation of classifier outputs. Finally, we demonstrate stylometric authorship recognition on texts written in different contexts. In over 20% of cases, our classifiers can correctly identify an anonymous author given a corpus of texts from 100,000 authors; in about 35% of cases the correct author is one of the top 20 guesses. If we allow the classifier the option of not making a guess, via confidence estimation we are able to increase the precision of the top guess from 20% to over 80% with only a halving of recall.
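
The precision-for-recall trade in the last sentence comes from letting the classifier abstain when its confidence is low. The paper develops several confidence estimators; the sketch below shows just one generic "gap" heuristic (all names are illustrative, not the authors' code):

```python
import numpy as np

def identify_with_abstention(scores, threshold):
    """Return the top candidate author, or None (abstain) when the gap
    between the best and second-best score is too small to trust."""
    order = np.argsort(scores)[::-1]              # authors ranked best-first
    gap = scores[order[0]] - scores[order[1]]
    return None if gap < threshold else int(order[0])

scores = np.array([0.61, 0.58, 0.12, 0.05])       # made-up per-author scores
print(identify_with_abstention(scores, 0.10))     # None: too close to call
print(identify_with_abstention(scores, 0.01))     # 0: confident enough to guess
```

Raising the threshold trades recall (fewer guesses) for precision (the guesses made are more often right), which is the 20%-to-80% effect reported above.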

290 citations


Proceedings ArticleDOI
20 May 2012
TL;DR: The design and evaluation of a new system, GUPT, that guarantees differential privacy to programs not developed with privacy in mind, makes no trust assumptions about the analysis program, and is secure against all known classes of side-channel attacks.
Abstract: It is often highly valuable for organizations to have their data analyzed by external agents. However, any program that computes on potentially sensitive data risks leaking information through its output. Differential privacy provides a theoretical framework for processing data while protecting the privacy of individual records in a dataset. Unfortunately, it has seen limited adoption because of the loss in output accuracy, the difficulty in making programs differentially private, lack of mechanisms to describe the privacy budget in a programmer's utilitarian terms, and the challenging requirement that data owners and data analysts manually distribute the limited privacy budget between queries. This paper presents the design and evaluation of a new system, GUPT, that overcomes these challenges. Unlike existing differentially private systems such as PINQ and Airavat, it guarantees differential privacy to programs not developed with privacy in mind, makes no trust assumptions about the analysis program, and is secure against all known classes of side-channel attacks. GUPT uses a new model of data sensitivity that degrades privacy of data over time. This enables efficient allocation of different levels of privacy for different user applications while guaranteeing an overall constant level of privacy and maximizing the utility of each application. GUPT also introduces techniques that improve the accuracy of output while achieving the same level of privacy. These approaches enable GUPT to easily execute a wide variety of data analysis programs while providing both utility and privacy.
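
GUPT treats the analysis program as an untrusted black box; a standard way to achieve differential privacy in that setting is the sample-and-aggregate framework, sketched below with toy parameters. This illustrates the general framework, not GUPT's actual implementation.

```python
import numpy as np

def sample_and_aggregate(data, analysis, lo, hi, epsilon, num_blocks):
    """Run an untrusted `analysis` on disjoint blocks, clamp each block's
    output to a known range [lo, hi], average, and add calibrated noise."""
    blocks = np.array_split(np.random.permutation(data), num_blocks)
    outputs = [min(max(float(analysis(b)), lo), hi) for b in blocks]
    # One record lives in one block, so it moves the clamped mean by at
    # most (hi - lo) / num_blocks: the sensitivity the noise must hide.
    sensitivity = (hi - lo) / num_blocks
    return float(np.mean(outputs)) + np.random.laplace(scale=sensitivity / epsilon)

# Differentially private mean of values known to lie in [0, 100].
data = np.random.uniform(0, 100, size=10_000)
print(sample_and_aggregate(data, np.mean, 0.0, 100.0, epsilon=0.5, num_blocks=25))
```

Because only the per-block outputs are touched, the analysis program itself needs no privacy reasoning, which is the property the abstract emphasizes.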

290 citations


Book ChapterDOI
26 Jul 2012
TL;DR: Juxtapp is proposed, a scalable infrastructure for code similarity analysis among Android applications that provides a key solution to a number of problems in Android security, including determining if apps contain copies of buggy code, have significant code reuse that indicates piracy, or are instances of known malware.
Abstract: Mobile application markets such as the Android Marketplace provide a centralized showcase of applications that end users can purchase or download for free onto their mobile phones. Despite the influx of applications to the markets, applications are cursorily reviewed by marketplace maintainers due to the vast number of submissions. User policing and reporting is the primary method to detect misbehaving applications. This reactive approach to application security, especially when programs can contain bugs, malware, or pirated (inauthentic) code, puts too much responsibility on the end users. In light of this, we propose Juxtapp, a scalable infrastructure for code similarity analysis among Android applications. Juxtapp provides a key solution to a number of problems in Android security, including determining if apps contain copies of buggy code, have significant code reuse that indicates piracy, or are instances of known malware. We evaluate our system using more than 58,000 Android applications and demonstrate that our system scales well and is effective. Our results show that Juxtapp is able to detect: 1) 463 applications with confirmed buggy code reuse that can lead to serious vulnerabilities in real-world apps, 2) 34 instances of known malware and variants (13 distinct variants of the GoldDream malware), and 3) pirated variants of a popular paid game.
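
A code-similarity pipeline of this kind can be approximated in a few lines: hash windows of k consecutive opcodes into a per-app feature set, then compare apps by Jaccard similarity. The feature choice below is an assumption for illustration; Juxtapp's actual features and clustering are more elaborate.

```python
import random

def kgram_features(opcodes, k=5, num_bits=1 << 16):
    """Hash every k-gram of an app's opcode sequence into a fixed-size
    feature space, making apps cheaply comparable as sets."""
    return {hash(tuple(opcodes[i:i + k])) % num_bits
            for i in range(len(opcodes) - k + 1)}

def jaccard(a, b):
    return len(a & b) / len(a | b) if (a or b) else 1.0

random.seed(0)
lib = [f"op{random.randrange(40)}" for _ in range(500)]    # shared code
a = kgram_features(lib + [f"op{random.randrange(40)}" for _ in range(50)])
b = kgram_features(lib + [f"op{random.randrange(40)}" for _ in range(50)])
print(jaccard(a, b))   # high (~0.8): most k-grams come from the shared code
```

At the scale of 58,000 apps one would bucket feature sets (e.g., via min-hashing) rather than compare all pairs.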

237 citations


Proceedings ArticleDOI
14 Nov 2012
TL;DR: A new generative model is developed to jointly reproduce the social structure and the node attributes of real social networks and it is demonstrated that the model provides more accurate predictions for practical application contexts.
Abstract: Understanding social network structure and evolution has important implications for many aspects of network and system design including provisioning, bootstrapping trust and reputation systems via social networks, and defenses against Sybil attacks. Several recent results suggest that augmenting the social network structure with user attributes (e.g., location, employer, communities of interest) can provide a more fine-grained understanding of social networks. However, there have been few studies to provide a systematic understanding of these effects at scale. We bridge this gap using a unique dataset collected as the Google+ social network grew over time since its release in late June 2011. We observe novel phenomena with respect to both standard social network metrics and new attribute-related metrics (that we define). We also observe interesting evolutionary patterns as Google+ went from a bootstrap phase to a steady invitation-only stage before a public release. Based on our empirical observations, we develop a new generative model to jointly reproduce the social structure and the node attributes. Using theoretical analysis and empirical evaluations, we show that our model can accurately reproduce the social and attribute structure of real social networks. We also demonstrate that our model provides more accurate predictions for practical application contexts.

212 citations


Book ChapterDOI
27 Feb 2012
TL;DR: In this article, the authors consider applications where an untrusted aggregator would like to collect privacy-sensitive data from users and compute aggregate statistics periodically, such as the total power consumption of a neighborhood every ten minutes, or the fraction of the population watching ESPN on an hourly basis.
Abstract: We consider applications where an untrusted aggregator would like to collect privacy-sensitive data from users and compute aggregate statistics periodically. For example, imagine a smart grid operator who wishes to aggregate the total power consumption of a neighborhood every ten minutes, or a market researcher who wishes to track the fraction of the population watching ESPN on an hourly basis.
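
One construction in this line of work has each user blind their reading with a per-round secret whose exponents sum to zero across all parties, so the untrusted aggregator can recover only the sum. Below is a toy sketch of that key-cancellation idea (assumed construction, insecure toy parameters, brute-forced small discrete log; illustration only):

```python
import hashlib, random

p = 2**127 - 1                       # toy prime; real schemes use proper groups
g = 3

def H(t):                            # hash the time period into the group
    return int(hashlib.sha256(str(t).encode()).hexdigest(), 16) % p

n = 5
user_keys = [random.randrange(p - 1) for _ in range(n)]
agg_key = -sum(user_keys) % (p - 1)  # all keys sum to zero mod the group order

def encrypt(x, s, t):                # blind reading x with H(t)^s
    return pow(g, x, p) * pow(H(t), s, p) % p

t, readings = 42, [random.randrange(50) for _ in range(n)]
ciphertexts = [encrypt(x, s, t) for x, s in zip(readings, user_keys)]

# The blinding factors H(t)^{s_i} cancel because the keys sum to zero, so
# the aggregator sees g^{sum} and learns nothing about individual readings.
prod = pow(H(t), agg_key, p)
for c in ciphertexts:
    prod = prod * c % p
total = next(v for v in range(n * 50) if pow(g, v, p) == prod)
assert total == sum(readings)
print(total)
```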

205 citations


Journal ArticleDOI
TL;DR: This research explores a new cloud platform architecture called Data Protection as a Service, which dramatically reduces the per-application development effort required to offer data protection, while still allowing rapid development and maintenance.
Abstract: Offering strong data protection to cloud users while enabling rich applications is a challenging task. Researchers explore a new cloud platform architecture called Data Protection as a Service, which dramatically reduces the per-application development effort required to offer data protection, while still allowing rapid development and maintenance.

154 citations


Proceedings Article
08 Aug 2012
TL;DR: This is the first attempt to study the security implications of consumer-grade BCI devices, and it shows that the entropy of the private information is decreased on average by approximately 15%-40% compared to random guessing attacks.
Abstract: Brain computer interfaces (BCI) are becoming increasingly popular in the gaming and entertainment industries. Consumer-grade BCI devices are available for a few hundred dollars and are used in a variety of applications, such as video games, hands-free keyboards, or as an assistant in relaxation training. There are application stores similar to the ones used for smart phones, where application developers have access to an API to collect data from the BCI devices. The security risks involved in using consumer-grade BCI devices have never been studied and the impact of malicious software with access to the device is unexplored. We take a first step in studying the security implications of such devices and demonstrate that this upcoming technology could be turned against users to reveal their private and secret information. We use inexpensive electroencephalography (EEG) based BCI devices to test the feasibility of simple, yet effective, attacks. The captured EEG signal could reveal the user's private information about, e.g., bank cards, PINs, area of residence, and familiarity with known persons. This is the first attempt to study the security implications of consumer-grade BCI devices. We show that the entropy of the private information is decreased on average by approximately 15%-40% compared to random guessing attacks.
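
The 15%-40% figure is a reduction in the Shannon entropy of the attacker's guess distribution relative to uniform guessing. A toy computation (the posterior below is invented for illustration):

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a guess distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

uniform = [1 / 10] * 10   # random guessing over ten candidate PIN digits
# Hypothetical attacker posterior after observing the EEG response.
posterior = [0.40, 0.20, 0.15, 0.10, 0.05, 0.04, 0.03, 0.02, 0.006, 0.004]

h0, h1 = entropy(uniform), entropy(posterior)
print(f"{h0:.2f} -> {h1:.2f} bits ({100 * (h0 - h1) / h0:.0f}% reduction)")
```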

137 citations


Proceedings ArticleDOI
10 Dec 2012
TL;DR: In this paper, the authors cluster a corpus of 188,389 Android applications and 27,029 Facebook applications to find patterns in permission requests, and find that Facebook permission requests follow a clear structure that can be fitted well with only five patterns, whereas Android applications demonstrate more complex permission requests.
Abstract: Android and Facebook provide third-party applications with access to users' private data and the ability to perform potentially sensitive operations (e.g., post to a user's wall or place phone calls). As a security measure, these platforms restrict applications' privileges with permission systems: users must approve the permissions requested by applications before the applications can make privacy- or security-relevant API calls. However, recent studies have shown that users often do not understand permission requests and are unsure of which permissions are typical for applications. As a first step towards simplifying permission systems, we cluster a corpus of 188,389 Android applications and 27,029 Facebook applications to find patterns in permission requests. Using a method for Boolean matrix factorization to find overlapping clusters of permissions, we find that Facebook permission requests follow a clear structure that can be fitted well with only five patterns, whereas Android applications demonstrate more complex permission requests. We also find that low-reputation applications often deviate from the permission request patterns that we identified for high-reputation applications, which suggests that permission request patterns can be indicative of user satisfaction or application quality.
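
Boolean matrix factorization represents the app-permission matrix as an OR of a few rank-one "patterns". As a crude stand-in that conveys the idea, one can count how much of a corpus the most frequent exact permission sets already cover (toy data below; the paper's method finds overlapping patterns, which is strictly more powerful):

```python
from collections import Counter

# Toy corpus: each application is the set of permissions it requests.
apps = [
    frozenset({"email"}),
    frozenset({"email"}),
    frozenset({"email"}),
    frozenset({"email", "publish_stream"}),
    frozenset({"email", "publish_stream"}),
    frozenset({"email", "publish_stream"}),
    frozenset({"email", "user_birthday", "publish_stream"}),
    frozenset({"publish_stream", "user_photos"}),
]

counts = Counter(apps)
top = counts.most_common(2)
for pattern, c in top:
    print(sorted(pattern), c)
print(f"two patterns already cover {sum(c for _, c in top)}/{len(apps)} apps")
```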

103 citations


Book ChapterDOI
10 Sep 2012
TL;DR: It is shown that any n-party protocol computing the sum with sparse communication graph must incur an additive error of $\Omega(\sqrt{n})$ with constant probability, in order to defend against potential coalitions of compromised users.
Abstract: We consider distributed private data analysis, where n parties each holding some sensitive data wish to compute some aggregate statistics over all parties' data. We prove a tight lower bound for the private distributed summation problem. Our lower bound is strictly stronger than the prior lower-bound result by Beimel, Nissim, and Omri published in CRYPTO 2008. In particular, we show that any n-party protocol computing the sum with sparse communication graph must incur an additive error of $\Omega(\sqrt{n})$ with constant probability, in order to defend against potential coalitions of compromised users. Furthermore, we show that in the client-server communication model, where all users communicate solely with an untrusted server, the additive error must be $\Omega(\sqrt{n})$, regardless of the number of messages or rounds. Both of our lower bounds, for the general setting and the client-to-server communication model, are strictly stronger than those of Beimel, Nissim and Omri, since we remove the assumption on the number of rounds (and also the number of messages in the client-to-server communication model). Our lower bounds generalize to the (ε, δ) differential privacy notion, for reasonably small values of δ.
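
The $\sqrt{n}$ scaling has a matching intuition on the upper-bound side: without a trusted party, each user must randomize their own contribution, and n independent constant-scale noise terms sum to total error on the order of $\sqrt{n}$. A quick numerical check of that intuition (a heuristic illustration, not the paper's proof):

```python
import numpy as np

n = 100_000
# Each user adds constant-scale noise (scale ~ 1/epsilon) to protect
# their own value against coalitions of the other parties.
per_user_noise = np.random.laplace(scale=1.0, size=n)
print(abs(per_user_noise.sum()))   # typically a few hundred, i.e. ~sqrt(n)
print((2 * n) ** 0.5)              # std dev of the summed noise: sqrt(2n)
```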

86 citations


Proceedings ArticleDOI
03 Mar 2012
TL;DR: This work proposes a technique to explore a high-fidelity emulator with symbolic execution and then lift the generated test cases to test a lower-fidelity emulator; the high-fidelity emulator serves as a proxy for the hardware specification.
Abstract: Processor emulators are widely used to provide isolation and instrumentation of binary software. However they have proved difficult to implement correctly: processor specifications have many corner cases that are not exercised by common workloads. It is untenable to base other system security properties on the correctness of emulators that have received only ad-hoc testing. To obtain emulators that are worthy of the required trust, we propose a technique to explore a high-fidelity emulator with symbolic execution, and then lift those test cases to test a lower-fidelity emulator. The high-fidelity emulator serves as a proxy for the hardware specification, but we can also further validate by running the tests on real hardware. We implement our approach and apply it to generate about 610,000 test cases; for about 95% of the instructions we achieve complete path coverage. The tests reveal thousands of individual differences; we analyze those differences to shed light on a number of root causes, such as atomicity violations and missing security features.
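
Once test cases exist, the cross-validation step is ordinary differential testing: run each case on both emulators and diff the resulting architectural state. A self-contained miniature, with two toy "emulators" of an 8-bit ADD that disagree on a corner case (hypothetical bug, for illustration):

```python
def ref_emulator(a, b):
    """High-fidelity model of 8-bit ADD: result plus carry/zero flags."""
    r = (a + b) & 0xFF
    return {"result": r, "carry": a + b > 0xFF, "zero": r == 0}

def fast_emulator(a, b):
    """Lower-fidelity emulator with a corner-case bug: it misses that the
    result can be zero exactly when the addition wraps around."""
    r = (a + b) & 0xFF
    return {"result": r, "carry": a + b > 0xFF, "zero": r == 0 and a + b <= 0xFF}

# Symbolic execution of the reference would emit one input per path,
# including the wrap-around case; replaying exposes the disagreement.
for a, b in [(0, 0), (0x80, 0x80), (0xFF, 1), (0x7F, 1)]:
    x, y = ref_emulator(a, b), fast_emulator(a, b)
    if x != y:
        print(f"ADD {a:#04x},{b:#04x}: ref={x} vs fast={y}")
```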

84 citations


Posted Content
TL;DR: This work proposes an algorithm that perturbs the structure of a social graph in order to provide link privacy, at the cost of a slight reduction in the utility of the social graph.
Abstract: A growing body of research leverages social network based trust relationships to improve the functionality of the system. However, these systems expose users' trust relationships, which are considered sensitive information in today's society, to an adversary. In this work, we make the following contributions. First, we propose an algorithm that perturbs the structure of a social graph in order to provide link privacy, at the cost of a slight reduction in the utility of the social graph. Second, we define general metrics for characterizing the utility and privacy of perturbed graphs. Third, we evaluate the utility and privacy of our proposed algorithm using real-world social graphs. Finally, we demonstrate the applicability of our perturbation algorithm on a broad range of secure systems, including Sybil defenses and secure routing.
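
One natural form of such a perturbation (details differ from the paper's algorithm) rewires each link to the endpoint of a short random walk: short walks roughly preserve degrees and community structure (the utility side), while an observed link no longer proves the original one existed (the privacy side).

```python
import random

def perturb(graph, walk_len=3):
    """Replace each edge (u, v) with (u, z), where z ends a short random
    walk started at v. Toy sketch over an undirected adjacency-set graph."""
    new_edges = set()
    for u in graph:
        for v in graph[u]:
            z = v
            for _ in range(walk_len):
                z = random.choice(tuple(graph[z]))
            if z != u:
                new_edges.add((u, z))
    out = {u: set() for u in graph}
    for u, z in new_edges:
        out[u].add(z)
        out[z].add(u)
    return out

g = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(perturb(g))   # same nodes, similar degrees, perturbed links
```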

Proceedings Article
08 Aug 2012
TL;DR: A new design for achieving effective privilege separation in HTML5 applications is proposed, showing how applications can cheaply create an arbitrary number of components while considerably improving auditability.
Abstract: The standard approach for privilege separation in web applications is to execute application components in different web origins. This limits the practicality of privilege separation since each web origin has financial and administrative cost. In this paper, we propose a new design for achieving effective privilege separation in HTML5 applications that shows how applications can cheaply create an arbitrary number of components. Our approach utilizes standardized abstractions already implemented in modern browsers. We do not advocate any changes to the underlying browser or require learning new high-level languages, which contrasts with prior approaches. We empirically show that we can retrofit our design to real-world HTML5 applications (browser extensions and rich client-side applications) and achieve a reduction of 6x to 10000x in TCB for our case studies. Our mechanism requires less than 13 lines of application-specific code changes and considerably improves auditability.

Proceedings Article
13 Jun 2012
TL;DR: A secure thin terminal that runs on standard PC hardware and provides a responsive interface to applications like banking, email, and document editing is implemented and it is shown that the cloud rendering engine can provide secure online banking for 5-10 cents per user per month.
Abstract: Current PC- and web-based applications provide insufficient security for the information they access, because vulnerabilities anywhere in a large client software stack can compromise confidentiality and integrity. We propose a new architecture for secure applications, Cloud Terminal, in which the only software running on the end host is a lightweight secure thin terminal, and most application logic is in a remote cloud rendering engine. The secure thin terminal has a very small TCB (23 KLOC) and no dependence on the untrusted OS, so it can be easily checked and remotely attested to. The terminal is also general-purpose: it simply supplies a secure display and input path to remote software. The cloud rendering engine runs an off-the-shelf application in a restricted VM hosted by the provider, but resource sharing between VMs lets one server support hundreds of users. We implement a secure thin terminal that runs on standard PC hardware and provides a responsive interface to applications like banking, email, and document editing. We also show that our cloud rendering engine can provide secure online banking for 5-10 cents per user per month.

Posted Content
TL;DR: In this article, it was shown that any n-party protocol computing the sum with sparse communication graph must incur an additive error of $\Omega(\sqrt{n})$ with constant probability, in order to defend against potential coalitions of compromised users.
Abstract: We consider distributed private data analysis, where n parties each holding some sensitive data wish to compute some aggregate statistics over all parties’ data. We prove a tight lower bound for the private distributed summation problem. Our lower bound is strictly stronger than the prior lower-bound result by Beimel, Nissim, and Omri published in CRYPTO 2008. In particular, we show that any n-party protocol computing the sum with sparse communication graph must incur an additive error of $\Omega(\sqrt{n})$ with constant probability, in order to defend against potential coalitions of compromised users. Furthermore, we show that in the client-server communication model, where all users communicate solely with an untrusted server, the additive error must be $\Omega(\sqrt{n})$, regardless of the number of messages or rounds. Both of our lower bounds, for the general setting and the client-to-server communication model, are strictly stronger than those of Beimel, Nissim and Omri, since we remove the assumption on the number of rounds (and also the number of messages in the client-to-server communication model). Our lower bounds generalize to the (ε, δ) differential privacy notion, for reasonably small values of δ.

Proceedings Article
01 Aug 2012
TL;DR: In this paper, the authors propose an algorithm that perturbs the structure of a social graph in order to provide link privacy, at the cost of a slight reduction in the utility of the social graph.
Abstract: A growing body of research leverages social network based trust relationships to improve the functionality of the system. However, these systems expose users’ trust relationships, which are considered sensitive information in today’s society, to an adversary. In this work, we make the following contributions. First, we propose an algorithm that perturbs the structure of a social graph in order to provide link privacy, at the cost of a slight reduction in the utility of the social graph. Second, we define general metrics for characterizing the utility and privacy of perturbed graphs. Third, we evaluate the utility and privacy of our proposed algorithm using real-world social graphs. Finally, we demonstrate the applicability of our perturbation algorithm on a broad range of secure systems, including Sybil defenses and secure routing.

Posted Content
TL;DR: In this article, a new generative model is developed to jointly reproduce the social structure and the node attributes of real social networks, and the model can accurately predict the social and attribute structure of real networks.
Abstract: Understanding social network structure and evolution has important implications for many aspects of network and system design including provisioning, bootstrapping trust and reputation systems via social networks, and defenses against Sybil attacks. Several recent results suggest that augmenting the social network structure with user attributes (e.g., location, employer, communities of interest) can provide a more fine-grained understanding of social networks. However, there have been few studies to provide a systematic understanding of these effects at scale. We bridge this gap using a unique dataset collected as the Google+ social network grew over time since its release in late June 2011. We observe novel phenomena with respect to both standard social network metrics and new attribute-related metrics (that we define). We also observe interesting evolutionary patterns as Google+ went from a bootstrap phase to a steady invitation-only stage before a public release. Based on our empirical observations, we develop a new generative model to jointly reproduce the social structure and the node attributes. Using theoretical analysis and empirical evaluations, we show that our model can accurately reproduce the social and attribute structure of real social networks. We also demonstrate that our model provides more accurate predictions for practical application contexts.

Journal ArticleDOI
TL;DR: The pigmentation patterns of shells in the genus Conus can be generated by a neural-network model of the mantle; the evolutionary history of the model's parameters is inferred and used to reconstruct the pigmentation patterns of ancestral species.
Abstract: The pigmentation patterns of shells in the genus Conus can be generated by a neural-network model of the mantle. We fit model parameters to the shell pigmentation patterns of 19 living Conus species for which a well resolved phylogeny is available. We infer the evolutionary history of these parameters and use these results to infer the pigmentation patterns of ancestral species. The methods we use allow us to characterize the evolutionary history of a neural network, an organ that cannot be preserved in the fossil record. These results are also notable because the inferred patterns of ancestral species sometimes lie outside the range of patterns of their living descendants, and illustrate how development imposes constraints on the evolution of complex phenotypes.

Proceedings ArticleDOI
25 Jun 2012
TL;DR: Opaak enables its users to establish identities with different online services while ensuring that these identities cannot be linked with each other or their real identity, allowing service providers to prevent abuse in the form of spam or Sybil attacks, which are prevalent in online services that offer anonymity.
Abstract: Trust and anonymity are both desirable properties on the Internet. However, online services and users often have to make a trade-off between trust and anonymity due to the lack of usable frameworks for achieving them both. We propose Opaak, a practical anonymous authentication framework. Opaak enables its users to establish identities with different online services while ensuring that these identities cannot be linked with each other or with their real identity. In addition, Opaak allows online service providers to control the rate at which users utilize their services while preserving their anonymity. This allows the service providers to prevent abuse in the form of spam or Sybil attacks, which are prevalent in online services that offer anonymity. Opaak leverages the mobile phone as a scarce resource, combined with anonymous credentials, in order to provide these features. We target two kinds of applications for Opaak and identify their requirements in order to achieve both trust and anonymity. We develop efficient protocols for these applications based on anonymous credentials. In addition, we design an architecture that facilitates integration with existing mobile and web applications and allows application developers to transparently utilize our protocols. We implement a prototype on Android and evaluate its performance to demonstrate the practicality of our approach.

Journal ArticleDOI
TL;DR: This model bounds the competitive ratio between a reactive defense algorithm (which is inspired by online learning theory) and the best fixed proactive defense, and shows that this reactive strategy is robust to a lack of information about the attacker's incentives and knowledge.
Abstract: Despite the conventional wisdom that proactive security is superior to reactive security, we show that reactive security can be competitive with proactive security as long as the reactive defender learns from past attacks instead of myopically overreacting to the last attack. Our game-theoretic model follows common practice in the security literature by making worst case assumptions about the attacker: we grant the attacker complete knowledge of the defender's strategy and do not require the attacker to act rationally. In this model, we bound the competitive ratio between a reactive defense algorithm (which is inspired by online learning theory) and the best fixed proactive defense. Additionally, we show that, unlike proactive defenses, this reactive strategy is robust to a lack of information about the attacker's incentives and knowledge.
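
The learning-based defender the abstract alludes to can be sketched as a multiplicative-weights allocator: split a fixed budget over attack surfaces, then multiplicatively boost whatever the attacker profited from. This is the generic online-learning pattern, not the paper's exact algorithm.

```python
def reactive_defense(rounds, budget=1.0, beta=0.5):
    """Yield a budget allocation each round, then update the weights
    multiplicatively based on the attacker's observed gains."""
    weights = [1.0] * len(rounds[0])
    for losses in rounds:              # losses[i]: attacker gain on surface i
        total = sum(weights)
        yield [budget * w / total for w in weights]
        weights = [w * (1 / beta) ** l for w, l in zip(weights, losses)]

# Surface 1 keeps paying off for the attacker; budget shifts toward it
# gradually, instead of overreacting to any single attack.
for alloc in reactive_defense([[0, 1, 0], [0, 1, 0], [0, 1, 0]]):
    print([round(x, 2) for x in alloc])
```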

Proceedings Article
01 Jan 2012
TL;DR: The FreeMarket attack is presented, which automatically identifies and exploits such insecure IAB coding practices and produces a rewritten application for which all in-app purchases succeed without any payment.
Abstract: Google recently launched Android Market In-app Billing (IAB), a service that allows developers to sell digital content in their Android applications by delegating the billing responsibilities to Google. This feature has already gained immense popularity with developers: 16 of the top 20 grossing apps in the Android Market rely on IAB for generating revenue. However, despite Google's recommendations for preventing attacks on IAB applications, the majority of applications do not use adequate security measures to authenticate IAB purchases. In this work we present the FreeMarket attack, which automatically identifies and exploits such insecure IAB coding practices. Our attack produces a rewritten application for which all in-app purchases succeed without any payment. The rewritten application retains the full functionality of the original and can be executed on unmodified Android devices. We show that at least 174 applications in the Android Market (more than 50% of the applications we tested) are vulnerable to this attack. As part of this work, we develop a translation tool named Deja, which converts the proprietary Dalvik bytecode used by Android applications to standard Java bytecode, enabling the use of the ASM bytecode rewriting library. Deja uses SSA-based dataflow analysis to infer the operand types, which must be explicitly specified in Java bytecode, and correctly reasons about important differences between the two formats (e.g., the bytecode verification process). In the IAB protocol, Google digitally signs the message notifying an application of a successful purchase. Although Google advises developers to verify this signature on a remote server before acknowledging the purchase, many applications either do not perform any verification or perform the verification on the device using the java.security.Signature.verify API. The FreeMarket attack exploits this behavior by rewrit-

Book ChapterDOI
21 May 2012
TL;DR: Policy-Enhanced Private Set Intersection (PPSI) as mentioned in this paper allows two parties to share information while enforcing the desired privacy policies, and is secure in the malicious model under the CBDH assumption, the random oracle model, and the assumption that the underlying PSI protocol is secure against malicious adversaries.
Abstract: Companies, organizations, and individuals often wish to share information to realize valuable social and economic goals. Unfortunately, privacy concerns often stand in the way of such information sharing and exchange. This paper proposes a novel cryptographic paradigm called Policy-Enhanced Private Set Intersection (PPSI), allowing two parties to share information while enforcing the desired privacy policies. Our constructions require minimal additional overhead over traditional Private Set Intersection (PSI) protocols, and yet we can handle rich policy semantics previously not possible with traditional PSI and Authorized Private Set Intersection (APSI) protocols. Our scheme involves running a standard PSI protocol over carefully crafted encodings of elements formed as part of a challenge-response mechanism. The structure of these encodings resembles techniques used for aggregating BLS signatures in bilinear groups. We prove that our scheme is secure in the malicious model, under the CBDH assumption, the random oracle model, and the assumption that the underlying PSI protocol is secure against malicious adversaries.

Proceedings Article
07 Aug 2012
TL;DR: Bubbles is introduced, a context-centric security system that explicitly captures users' privacy desires by allowing human contact lists to control access to data clustered by real-world events and prevents Android-style permission escalation attacks.
Abstract: Users today are unable to use the rich collection of third-party untrusted applications without risking significant privacy leaks. In this paper, we argue that current and proposed application- and data-centric security policies do not map well to users' expectations of privacy. In the eyes of a user, applications and peripheral devices exist merely to provide functionality and should have no place in controlling privacy. Moreover, most users cannot handle intricate security policies dealing with system concepts such as labeling of data, application permissions and virtual machines. Not only are current policies impenetrable to most users, they also lead to security problems such as privilege-escalation attacks and implicit information leaks. Our key insight is that users naturally associate data with real-world events, and want to control access at the level of human contacts. We introduce Bubbles, a context-centric security system that explicitly captures users' privacy desires by allowing human contact lists to control access to data clustered by real-world events. Bubbles infers information-flow rules from these simple context-centric access-control rules to enable secure use of untrusted applications on users' data. We also introduce a new programming model for untrusted applications that allows them to be functional while still upholding the users' privacy policies. We evaluate the model's usability by porting an existing medical application and writing a calendar app from scratch. Finally, we show the design of our system prototype running on Android that uses bubbles to automatically infer all dangerous permissions without any user intervention. Bubbles prevents Android-style permission escalation attacks without requiring users to specify complex information flow rules.

Book ChapterDOI
01 Jan 2012
TL;DR: This paper considers how eye tracking can enable the system to hypothesize if the user is familiar with the system he operates, or if he is an unfamiliar intruder, and investigates which conditions and measures are most suited for such an intrusion detection.
Abstract: User authentication is an important and usually final barrier to detect and prevent illicit access. Nonetheless, it can be broken or tricked, leaving the system and its data vulnerable to abuse. In this paper we consider how eye tracking can enable the system to hypothesize whether the user is familiar with the system he operates, or whether he is an unfamiliar intruder. Based on an eye tracking experiment conducted with 12 users and various stimuli, we investigate which conditions and measures are best suited for such intrusion detection. We model the user's gaze behavior as a selector for information flow via the relative conditional gaze entropy. We conclude that this feature provides the most discriminative results with static and repetitive stimuli.
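
A plausible reading of "relative conditional gaze entropy" (the paper's exact definition may differ): model the gaze as a sequence of fixated regions and measure how predictable the next region is given the current one, normalized by the maximum possible entropy.

```python
import math
from collections import Counter

def relative_conditional_entropy(seq):
    """H(next region | current region) / log2(#regions) for a gaze path."""
    pairs = Counter(zip(seq, seq[1:]))
    firsts = Counter(seq[:-1])
    n = len(seq) - 1
    h = -sum((c / n) * math.log2(c / firsts[a]) for (a, _), c in pairs.items())
    return h / math.log2(len(set(seq)))

print(relative_conditional_entropy(list("ABABABABAC")))  # low: routine scanpath
print(relative_conditional_entropy(list("ABCACBAACB")))  # higher: erratic scan
```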

Journal ArticleDOI
01 Aug 2012
TL;DR: A new approach to learning and generalizing from observed malware behaviors based on tree automata inference is proposed and it is shown how inferred automata can be used for malware recognition and classification.
Abstract: We explore how formal methods and tools of the verification trade could be used for malware detection and analysis. In particular, we propose a new approach to learning and generalizing from observed malware behaviors based on tree automata inference. Our approach infers k-testable tree automata from system call dataflow dependency graphs. We show how inferred automata can be used for malware recognition and classification.
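
The inference principle behind k-testable languages is simple: a behavior belongs to the learned family iff every local window of it was seen in training. The sketch below shows the string analogue over hypothetical syscall sequences; the paper works on trees of dataflow dependencies, which generalize this.

```python
def windows(seq, k):
    return [tuple(seq[i:i + k]) for i in range(len(seq) - k + 1)]

def learn_k_testable(samples, k=2):
    """Remember the length-(k-1) prefix/suffix sets and all length-k
    windows seen across the training behaviors."""
    pre = {tuple(s[:k - 1]) for s in samples}
    suf = {tuple(s[-(k - 1):]) for s in samples}
    mid = {w for s in samples for w in windows(s, k)}
    return pre, suf, mid

def accepts(model, s, k=2):
    pre, suf, mid = model
    return (tuple(s[:k - 1]) in pre and tuple(s[-(k - 1):]) in suf
            and all(w in mid for w in windows(s, k)))

# Hypothetical syscall behaviors of two samples of one malware family.
m = learn_k_testable([["open", "read", "write", "close"],
                      ["open", "read", "read", "write", "close"]])
# Generalization: new interleavings with familiar local windows are accepted.
print(accepts(m, ["open", "read", "read", "read", "write", "close"]))  # True
print(accepts(m, ["open", "write", "close"]))                          # False
```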

ReportDOI
13 Jan 2012
TL;DR: The creation of a security-enhancing hypervisor for PCs is proposed as a collaborative agenda for the research community by exploring how hypervisors can address end-user security issues and exploring how to architect a small, secure hypervisor that provides several of these facilities.
Abstract: : The purpose of this paper is to propose the creation of a security-enhancing hypervisor for PCs as a collaborative agenda for the research community. This agenda is not necessarily about answering fundamentally new research questions. Rather, it is a call to action about a rare chance for the community to have substantial impact. If researchers demonstrate compelling near-term benefits from a modest security layer, then OS vendors may adopt such a layer as a way to increase security without costly reengineering. The introduction of this secure foothold into the consumer software stack could then yield significant long-term benefits by providing a much better avenue for deploying security solutions. This agenda consists of two parts: (1) exploring how hypervisors can address end-user security issues and (2) exploring how to architect a small, secure hypervisor that provides several of these facilities. We believe that there are interesting and worthwhile challenges in both parts. The rest of this paper is organized as follows. We begin by explaining why hypervisors provide a highly attractive insertion point for security (Section 2) and summarizing work in this area (Section 3). We then discuss security facilities that a hypervisor can provide in Section 4, with a focus on trusted paths to online services. We conclude by discussing challenges associated with our proposal in Section 5.

Book
14 Sep 2012
TL;DR: Automatic Malware Analysis presents a virtualized malware analysis framework that addresses common challenges in malware analysis and captures intrinsic characteristics of malware, and is well suited for dealing with new malware samples and attack mechanisms.
Abstract: Malicious software (i.e., malware) has been a severe threat to interconnected computer systems for decades and causes billions of dollars in damage each year. A large number of new malware samples are discovered daily. Even worse, malware is rapidly evolving, becoming more sophisticated and evasive in order to strike against current malware analysis and defense systems. Automatic Malware Analysis presents a virtualized malware analysis framework that addresses common challenges in malware analysis. Within this framework, a series of analysis techniques for automatic malware analysis is developed. These techniques capture intrinsic characteristics of malware and are well suited for dealing with new malware samples and attack mechanisms.

Proceedings ArticleDOI
12 Dec 2012
TL;DR: It is validated that a politician's campaign donations are mainly influenced by his or her political power and affiliation with a political party, and a causal relationship between donations and votes cannot be identified.
Abstract: The USA is witnessing a heavy debate about the influence of political campaign contributions on votes cast on the floor of the United States Congress. We contribute quantitative arguments to this predominantly qualitative discussion by analyzing a dataset of political campaign contributions. We validate that a politician's campaign donations are mainly influenced by his or her political power and party affiliation. Approaching the question of whether donations influence votes, we employ supervised learning techniques to classify how a politician will vote based solely upon from whom he or she received donations. The statistical significance of the results is assessed within the context of the debate currently surrounding campaign finance reform. Our experimental findings exhibit a large predictive power of the donations, demonstrating that donations are highly informative with respect to voting outcomes. However, given the slightly superior accuracy of the party line as a predictor, a causal relationship between donations and votes cannot be identified.
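
The prediction experiment can be pictured as follows (all data here is synthetic and the feature layout is a guess at the setup, not the paper's dataset): one row per legislator-bill pair, features are donation totals by donor group, and the label is the vote cast.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_rows, n_groups = 1000, 20
donations = rng.gamma(2.0, 1000.0, size=(n_rows, n_groups))
# Synthetic ground truth: votes loosely track one donor group's money.
votes = (donations[:, 0] + rng.normal(0, 800, n_rows) > 2000).astype(int)

model = LogisticRegression(max_iter=1000)
acc = cross_val_score(model, np.log1p(donations), votes, cv=5).mean()
print(f"vote prediction accuracy from donations alone: {acc:.2f}")
# The abstract's caveat: if party line alone predicts slightly better,
# high accuracy here still does not establish causation.
```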

Journal ArticleDOI
TL;DR: In this paper, a set of 30 behavioral touch features that can be extracted from raw touchscreen logs and demonstrate that different users populate distinct subspaces of this feature space are used to authenticate users.
Abstract: We investigate whether a classifier can continuously authenticate users based on the way they interact with the touchscreen of a smart phone. We propose a set of 30 behavioral touch features that can be extracted from raw touchscreen logs and demonstrate that different users populate distinct subspaces of this feature space. In a systematic experiment designed to test how this behavioral pattern exhibits consistency over time, we collected touch data from users interacting with a smart phone using basic navigation maneuvers, i.e., up-down and left-right scrolling. We propose a classification framework that learns the touch behavior of a user during an enrollment phase and is able to accept or reject the current user by monitoring interaction with the touch screen. The classifier achieves a median equal error rate of 0% for intra-session authentication, 2%-3% for inter-session authentication and below 4% when the authentication test was carried out one week after the enrollment phase. While our experimental findings disqualify this method as a standalone authentication mechanism for long-term authentication, it could be implemented as a means to extend screen-lock time or as a part of a multi-modal biometric authentication system.
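
The reported numbers are equal error rates (EER): the operating point where the false-accept and false-reject rates coincide. Computing it from per-stroke anomaly scores is straightforward (synthetic scores below, for illustration):

```python
import numpy as np

def equal_error_rate(genuine, impostor):
    """Sweep the accept threshold over anomaly scores (lower = more owner-like)
    and return the point where false accepts and false rejects balance."""
    ts = np.sort(np.concatenate([genuine, impostor]))
    frr = np.array([np.mean(genuine > t) for t in ts])    # owner rejected
    far = np.array([np.mean(impostor <= t) for t in ts])  # stranger accepted
    i = np.argmin(np.abs(far - frr))
    return (far[i] + frr[i]) / 2

rng = np.random.default_rng(1)
genuine = rng.normal(0.2, 0.1, 500)    # enrolled user's scores
impostor = rng.normal(0.8, 0.2, 500)   # other users' scores
print(f"EER ~ {equal_error_rate(genuine, impostor):.3f}")
```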

01 Jan 2012
TL;DR: This work aims to understand how security policies and their systems are currently applied to web applications, and to advance the mechanisms used to apply policies to web applications.
Abstract: Web applications are generally more exposed to untrusted user content than traditional applications. Thus, web applications face a variety of new and unique threats, especially that of content injection. One method for preventing these types of attacks is web application security policies, which specify the behavior or structure of the web application. The goal of this work is twofold. First, we aim to understand how security policies and their systems are currently applied to web applications. Second, we aim to advance the mechanisms used to apply policies to web applications. We address the first goal through two studies examining two classes of current web application security policies, and the second by studying and working towards two new ways of applying policies. These areas advance the state of the art in understanding and building web application security policies and provide a foundation for future work in securing web applications.

Posted Content
TL;DR: This work clusters a corpus of 188,389 Android applications and 27,029 Facebook applications to find patterns in permission requests and finds that Facebook permission requests follow a clear structure that can be fitted well with only five patterns, whereas Android applications demonstrate more complex permission requests.
Abstract: Android and Facebook provide third-party applications with access to users’ private data and the ability to perform potentially sensitive operations (e.g., post to a user’s wall or place phone calls). As a security measure, these platforms restrict applications’ privileges with permission systems: users must approve the permissions requested by applications before the applications can make privacy- or security-relevant API calls. However, recent studies have shown that users often do not understand permission requests and lack a notion of typicality of requests. As a first step towards simplifying permission systems, we cluster a corpus of 188,389 Android applications and 27,029 Facebook applications to find patterns in permission requests.