
Showing papers by "Mikhail J. Atallah published in 2006"


Proceedings ArticleDOI
26 Sep 2006
TL;DR: A better way to use synonym substitution is proposed, one that is no longer entirely guided by the mark-insertion process, but is also guided by a resilience requirement, subject to a maximum allowed distortion constraint.
Abstract: Information-hiding in natural language text has mainly consisted of carrying out approximately meaning-preserving modifications on the given cover text until it encodes the intended mark. A major technique for doing so has been synonym substitution. In these previous schemes, synonym substitutions were done until the text "confessed", i.e., carried the intended mark message. We propose here a better way to use synonym substitution, one that is no longer entirely guided by the mark-insertion process: It is also guided by a resilience requirement, subject to a maximum allowed distortion constraint. Previous schemes for information hiding in natural language text did not use numeric quantification of the distortions introduced by transformations; they mainly used heuristic measures of quality based on conformity to a language model (and not in reference to the original cover text). When there are many alternatives to carry out a substitution on a word, we prioritize these alternatives according to a quantitative resilience criterion and use them in that order. In a nutshell, we favor the more ambiguous alternatives. In fact, not only do we attempt to achieve the maximum ambiguity, but we want to simultaneously be as close as possible to the above-mentioned distortion limit, as that prevents the adversary from doing further transformations without exceeding the damage threshold; that is, we continue to modify the document even after the text has "confessed" to the mark, for the dual purpose of maximizing ambiguity while deliberately getting as close as possible to the distortion limit. The quantification we use makes possible an application of the existing information-theoretic framework to the natural language domain, which has unique challenges not present in the image or audio domains. The resilience stems from both (i) the fact that the adversary does not know where the changes were made, and (ii) the fact that automated disambiguation is a major difficulty faced by any natural language processing system (what is bad news for the natural language processing area is good news for our scheme's resilience). In addition to the above-mentioned design and analysis, another contribution of this paper is the description of the implementation of the scheme and of the experimental data obtained.
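
As a rough illustration of the selection rule described above, the sketch below (with assumed placeholder scoring: a sense count as the ambiguity measure and a per-substitution distortion cost) ranks candidate substitutions by ambiguity and keeps applying them until the cumulative distortion approaches the allowed limit. It is a schematic reading of the abstract, not the paper's actual algorithm.

```python
# Schematic sketch (assumed scoring, not the paper's exact quantification):
# rank candidate synonym substitutions by ambiguity and apply them greedily
# until the cumulative distortion gets as close as possible to the limit.
def select_substitutions(candidates, distortion_limit):
    """candidates: list of (position, substitute, n_senses, distortion_cost)."""
    # favour the most ambiguous substitutes -- the hardest to automatically undo
    ranked = sorted(candidates, key=lambda c: c[2], reverse=True)
    chosen, used = [], 0.0
    for position, substitute, n_senses, cost in ranked:
        if used + cost <= distortion_limit:
            chosen.append((position, substitute))
            used += cost
    return chosen, used   # aim to land near, but not over, the limit

# select_substitutions([(3, "bright", 7, 0.2), (9, "clever", 3, 0.1)], 0.25)
# -> ([(3, "bright")], 0.2)
```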

149 citations


Journal ArticleDOI
TL;DR: In this article, the authors present protocols that protect both sensitive credentials and sensitive policies in trust negotiations in an open environment such as the Internet, where the decision to collaborate with a stranger (e.g., by granting access to a resource) is often based on the characteristics (rather than the identity) of the requester via digital credentials.
Abstract: In an open environment such as the Internet, the decision to collaborate with a stranger (e.g., by granting access to a resource) is often based on the characteristics (rather than the identity) of the requester, via digital credentials: access is granted if Alice's credentials satisfy Bob's access policy. The literature contains many scenarios in which it is desirable to carry out such trust negotiations in a privacy-preserving manner, i.e., so as to minimize the disclosure of credentials and/or of access policies. Elegant solutions were proposed for achieving various degrees of privacy preservation through minimal disclosure. In this paper, we present protocols that protect both sensitive credentials and sensitive policies. That is, Alice gets the resource only if she satisfies the policy, Bob does not learn anything about Alice's credentials (not even whether Alice got access), and Alice learns neither Bob's policy structure nor which credentials caused her to gain access. Our protocols are efficient in terms of communication and in rounds of interaction.
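
For concreteness, the sketch below shows only the ideal functionality these protocols realize (hypothetical credential names and policy; the cryptographic machinery that keeps both inputs hidden is the paper's actual contribution): Bob's policy is a predicate over credential types, and the only value computed is whether Alice's credential set satisfies it.

```python
# Ideal-functionality sketch only (assumed example policy and credentials):
# in the paper's protocols this predicate is evaluated obliviously, so Bob
# never sees Alice's credentials and does not even learn the outcome.
def bobs_policy(creds: set) -> bool:
    return "drivers_license" in creds and ("aaa_member" in creds or "veteran" in creds)

alice_credentials = {"drivers_license", "veteran"}
grant_access = bobs_policy(alice_credentials)   # True -> Alice obtains the resource
```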

126 citations


Proceedings ArticleDOI
01 Nov 2006
TL;DR: An efficient protocol for solving linear programming problems in the honest-but-curious model, such that neither party reveals anything about their private input to the other party (other than what can be inferred from the result).
Abstract: The growth of the Internet has created tremendous opportunities for online collaborations. These often involve collaborative optimizations where the two parties are, for example, jointly minimizing costs without violating their own particular constraints (e.g., one party may have too much inventory, another too little inventory but too much production capacity, etc.). Many of these optimizations can be formulated as linear programming problems, or, rather, as collaborative linear programming, in which two parties need to jointly optimize based on their own private inputs. It is often important to have online collaboration techniques and protocols that carry this out without either party revealing to the other anything about their own private inputs to the optimization (other than, unavoidably, what can be deduced from the collaboratively computed optimal solution). For example, two organizations that jointly invest in a project may want to minimize some linear objective function while satisfying both organizations' private and confidential constraints. Constraints are usually private when they reveal too much about an organization's financial health, its future business strategy, etc. Linear programming problems have been widely studied in the literature. However, the existing solutions (e.g., the simplex method) do not extend to the above-mentioned framework in which the linear constraints are shared by the two parties, who do not want to disclose their own constraints to the other party. In this paper, we give an efficient protocol for solving linear programming problems in the honest-but-curious model, such that neither party reveals anything about their private input to the other party (other than what can be inferred from the result). The amount of communication and computation done by our protocol is proportional to the time complexity of the simplex method, a widely used linear programming algorithm. We also provide a practical solution that prevents certain malicious behavior of the participants. The use of the known general circuit-simulation solutions to secure function evaluation is unacceptable for the simplex method, as it implies an exponential-size circuit.
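
The snippet below (with made-up toy constraints) shows what the two parties jointly compute in the ideal, non-private setting: their constraint rows are pooled and an ordinary LP solver is run. The paper's protocol produces the same optimum without either side ever handing its constraints to the other.

```python
# Ideal (non-private) joint optimization with assumed toy data; the paper's
# contribution is computing the same result WITHOUT pooling the constraints.
import numpy as np
from scipy.optimize import linprog

c = np.array([3.0, 5.0])                          # jointly agreed objective: minimize 3*x1 + 5*x2

# Party A's private constraint:  x1 + 2*x2 >= 8   ->  -x1 - 2*x2 <= -8
A_alice, b_alice = np.array([[-1.0, -2.0]]), np.array([-8.0])
# Party B's private constraint:  3*x1 + x2 >= 6   ->  -3*x1 - x2 <= -6
A_bob, b_bob = np.array([[-3.0, -1.0]]), np.array([-6.0])

res = linprog(c,
              A_ub=np.vstack([A_alice, A_bob]),
              b_ub=np.concatenate([b_alice, b_bob]),
              bounds=[(0, None), (0, None)])
print(res.x, res.fun)                             # the only values the parties should learn
```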

72 citations


Proceedings ArticleDOI
23 Apr 2006
TL;DR: The design and implementation of a scheme for hiding information in translated natural language text is described, and experimental results using the implemented system are presented.
Abstract: This paper describes the design and implementation of a scheme for hiding information in translated natural language text, and presents experimental results using the implemented system. Unlike the previous work, which required the presence of both the source and the translation, the protocol presented in this paper requires only the translated text for recovering the hidden message. This is a significant improvement, as transmitting the source text was both wasteful of resources and less secure. The security of the system is now improved not only because the source text is no longer available to the adversary, but also because a broader repertoire of defenses (such as mixing human and machine translation) can now be used.
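
A hypothetical sketch of the selection step follows (the real system works with machine-translation outputs and a richer protocol): among several acceptable translations of each source sentence, pick one whose keyed hash carries the next bit of the hidden message, so the receiver needs only the translated text and the key. For simplicity the sketch assumes each sentence has candidates for both bit values.

```python
# Hypothetical selection-step sketch (not the paper's actual protocol).
# Assumes every sentence has candidate translations for both bit values.
import hashlib

def sentence_bit(sentence: str, key: str) -> int:
    return hashlib.sha256((key + "|" + sentence).encode()).digest()[0] & 1

def embed(candidate_translations, message_bits, key):
    # pick, for each source sentence, a translation carrying the next bit
    return [next(s for s in alts if sentence_bit(s, key) == bit)
            for alts, bit in zip(candidate_translations, message_bits)]

def extract(translated_sentences, key):
    # the receiver needs only the translated text and the shared key
    return [sentence_bit(s, key) for s in translated_sentences]

# cover = [["It is raining.", "Rain is falling.", "It rains."],
#          ["He left early.", "He departed early."]]
# extract(embed(cover, [1, 0], "k"), "k") == [1, 0]   (when both bit values occur)
```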

68 citations


Proceedings ArticleDOI
28 Oct 2006
TL;DR: This paper shows how to use lower-level (in this case word-level) marking to improve the resilience and embedding properties of higher level schemes, and introduces a novel and powerful sentence level watermarking technique that relies on multiple features of each sentence and exploits the notion of orthogonality between features.
Abstract: Compared to other media, natural language text presents unique challenges for information hiding. These challenges require the design of a robust algorithm that can work under the following constraints: (i) low embedding bandwidth, i.e., the number of sentences is comparable with the message length; (ii) not all transformations can be applied to a given sentence; and (iii) the number of alternative forms for a sentence is relatively small, a limitation governed by the grammar and vocabulary of the natural language, as well as the requirement to preserve the style and fluency of the document. The adversary can carry out all the transformations used for embedding to remove the embedded message. In addition, the adversary can also permute the sentences, select and use a subset of sentences, and insert new sentences. We give a scheme that overcomes these challenges, together with a partial implementation and its evaluation for the English language. The present application of this scheme works at the sentence level while also using a word-level watermarking technique that was recently designed and built into a fully automatic system ("Equimark"). Unlike Equimark, whose resilience relied on the introduction of ambiguities, the present paper's sentence-level technique is more tuned to situations where very little change to the text is allowable (i.e., when style is important). Secondarily, this paper shows how to use lower-level (in this case word-level) marking to improve the resilience and embedding properties of higher-level (in this case sentence-level) schemes. We achieve this by using the word-based methods as a separate channel from the sentence-based methods, thereby improving the results of either one alone. The sentence-level watermarking technique we introduce is novel and powerful, as it relies on multiple features of each sentence and exploits the notion of orthogonality between features.
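
The fragment below is a loose illustration (invented features, not the paper's feature set) of reading watermark bits from several independently computed, roughly orthogonal sentence features, so that an attack perturbing one feature leaves the bits carried by the others intact.

```python
# Illustrative only: three crude, roughly independent "feature channels"
# per sentence, each contributing one keyed bit (the paper's features differ).
import hashlib

def feature_channels(sentence: str) -> dict:
    toks = sentence.lower().split()
    return {
        "length_parity": len(toks) % 2,          # a length/structure proxy
        "head_word": toks[0] if toks else "",    # a syntactic-structure proxy
        "voice_hint": int("by" in toks),         # a crude active/passive proxy
    }

def channel_bits(sentence: str, key: str) -> list:
    bits = []
    for name, value in sorted(feature_channels(sentence).items()):
        digest = hashlib.sha256(f"{key}|{name}|{value}".encode()).digest()
        bits.append(digest[0] & 1)               # one bit per feature channel
    return bits

# channel_bits("The budget was approved by the senate", "k1") -> three bits, one per channel
```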

63 citations


Proceedings Article
01 Jan 2006
TL;DR: This paper uses novel techniques to implement a non-standard trust negotiation strategy specifically suited to this framework, which is a substantial extension of the state-of-the-art in privacy-preserving trust negotiations.
Abstract: In an open environment such as the Internet, the decision to collaborate with a stranger (e.g., by granting access to a resource) is often based on the characteristics (rather than the identity) of the requester, via digital credentials: Access is granted if Alice’s credentials satisfy Bob’s access policy. The literature contains many examples where protecting the credentials and the access control policies is useful, and there are numerous protocols that achieve this. In many of these schemes, the server does not learn whether the client obtained access (e.g., to a message, or a service via an e-ticket). A consequence of this property is that the client can use all of her credentials without fear of “probing” attacks by the server, because the server cannot glean information about which credentials the client has (when this property is lacking, the literature uses a framework where the very use of a credential is subject to a policy specific to that credential). The main result of this paper is a protocol for negotiating trust between Alice and Bob without revealing either credentials or policies, when each credential has its own access policy associated with it (e.g., “a top-secret clearance credential can only be used when the other party is a government employee and has a top-secret clearance”). Our protocol carries out this privacy-preserving trust negotiation between Alice and Bob, while enforcing each credential’s policy (thereby protecting sensitive credentials). Note that there can be a deep nesting of dependencies between credential policies, and that there can be (possibly overlapping) policy cycles of these dependencies. Our result is not achieved through the routine use of standard techniques to implement, in this framework, one of the known strategies for trust negotiations (such as the “eager strategy”). Rather, this paper uses novel techniques to implement a non-standard trust negotiation strategy specifically suited to this framework (and in fact unusable outside of this framework, as will become clear). Our work is therefore a substantial extension of the state-of-the-art in privacy-preserving trust negotiations.
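
As background for the credential-policy dependencies mentioned above, here is a toy, non-private version of the standard "eager strategy" that the paper explicitly does not use: each credential becomes usable once its own policy is satisfied by the credentials the other party has already made usable, and the usable sets grow round by round to a fixpoint. A mutual cycle (each credential requiring the other to be shown first) stalls here, which is part of what the paper's privacy-preserving strategy is designed to handle.

```python
# Toy, non-private "eager" exchange, shown only to make credential-policy
# dependencies concrete; the paper replaces it with a cryptographic,
# privacy-preserving strategy.  Policies are predicates over the other
# party's currently usable credentials.
def eager_negotiation(alice_policies, bob_policies):
    usable = {"alice": set(), "bob": set()}
    changed = True
    while changed:
        changed = False
        for side, other, policies in (("alice", "bob", alice_policies),
                                      ("bob", "alice", bob_policies)):
            for cred, policy in policies.items():
                if cred not in usable[side] and policy(usable[other]):
                    usable[side].add(cred)
                    changed = True
    return usable

# Example with a nested dependency (hypothetical credential names):
# alice = {"ts_clearance": lambda theirs: "gov_employee" in theirs}
# bob   = {"gov_employee": lambda theirs: True,
#          "ts_clearance": lambda theirs: "ts_clearance" in theirs}
# eager_negotiation(alice, bob)["bob"] == {"gov_employee", "ts_clearance"}
```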

62 citations


Proceedings ArticleDOI
07 Jun 2006
TL;DR: The present paper presents efficient key derivation techniques for hierarchies that are not trees, using a scheme that is very different from the above-mentioned paper, and makes a novel use of the notion of the dimension d of an access graph.
Abstract: Access hierarchies are useful in many applications and are modeled as a set of access classes organized by a partial order. A user who obtains access to a class in such a hierarchy is entitled to access objects stored at that class, as well as objects stored at its descendant classes. Efficient schemes for this framework assign only one key to a class and use key derivation to permit access to descendant classes. Ideally, the key derivation uses simple primitives such as cryptographic hash computations and modular additions. A straightforward key derivation then takes time linear in the length of the path between the user's class and the class of the object that the user wants to access. Recently, work presented in [2] gave an efficient solution that significantly lowers this key derivation time, while using only hash functions and modular additions. Two fast key-derivation techniques in that paper were given for trees, achieving O(log log n) and O(1) key derivation times, respectively, where n is the number of access classes. The present paper presents efficient key derivation techniques for hierarchies that are not trees, using a scheme that is very different from that of the above-mentioned paper. The construction we give in the present paper is recursive and uses the one-dimensional case solution as its base. It makes novel use of the notion of the dimension d of an access graph, and provides a solution through which no key derivation requires more than 2d+1 hash function computations, even for "unbalanced" hierarchies whose depth is linear in their number of access classes n. The significance of this result is strengthened by the fact that many access graphs have a low d value (e.g., trees correspond to the case d = 2). Our scheme has the desirable property (as did [2] for trees) that addition and deletion of edges and nodes in the access hierarchy can be "contained".
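
To make the key-derivation primitive concrete, here is a minimal sketch (an illustrative construction, not the exact one in this paper or in [2]) of deriving a descendant's key from an ancestor's key, one cryptographic hash per edge, using a public per-edge label; the paper's shortcut edges ensure at most 2d+1 such steps are ever needed.

```python
# Minimal hash-based derivation along one edge (illustrative construction):
# the edge label is public, the class keys are not.
import hashlib, os

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def make_edge_label(parent_key: bytes, child_key: bytes, child_id: bytes) -> bytes:
    # label chosen so that  child_key = H(parent_key || child_id) XOR label
    mask = H(parent_key + child_id)
    return bytes(a ^ b for a, b in zip(mask, child_key))

def derive_child_key(parent_key: bytes, child_id: bytes, label: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(H(parent_key + child_id), label))

# one hash evaluation per edge on the path; shortcut edges keep paths short
k_root, k_child = os.urandom(32), os.urandom(32)
label = make_edge_label(k_root, k_child, b"class:engineering")
assert derive_child_key(k_root, b"class:engineering", label) == k_child
```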

44 citations


Proceedings ArticleDOI
02 Feb 2006
TL;DR: This paper gives an overview of the research and implementation challenges in building an end-to-end natural language processing based watermarking system and evaluates the quality of the watermarked text using an objective evaluation metric, the BLEU score.
Abstract: This paper gives an overview of the research and implementation challenges we encountered in building an end-to-end natural language processing based watermarking system. By natural language watermarking, we mean embedding the watermark into a text document, using the natural language components as the carrier, in such a way that the modifications are imperceptible to the readers and the embedded information is robust against possible attacks. Of particular interest is using the structure of the sentences in natural language text in order to insert the watermark. We evaluated the quality of the watermarked text using an objective evaluation metric, the BLEU score. BLEU scoring is commonly used in the statistical machine translation community. Our current system prototype achieves a BLEU score of 0.45 on a scale of [0, 1].
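
For readers unfamiliar with the metric, the snippet below shows how a BLEU score between an original sentence and its watermarked version can be computed with NLTK (illustrative sentences; the paper's test data and exact scoring setup are not reproduced here).

```python
# Illustrative BLEU computation (made-up sentence pair, not the paper's data);
# the original sentence serves as the reference for the watermarked version.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference   = "the senate approved the budget after a long debate".split()
watermarked = "after a long debate the budget was approved by the senate".split()

score = sentence_bleu([reference], watermarked,
                      smoothing_function=SmoothingFunction().method1)
print(round(score, 2))   # a value in [0, 1]; the prototype reports 0.45 on its own test set
```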

44 citations


Journal ArticleDOI
TL;DR: This work introduces the issue of rights protection for discrete streaming data through watermarking and proposes a solution and analyzes its resilience to various types of attacks as well as some of the important expected domain-specific transforms, such as sampling and summarization.
Abstract: Today's world of increasingly dynamic environments naturally results in more and more data being available as fast streams. Applications such as stock market analysis, environmental sensing, Web clicks, and intrusion detection are just a few of the examples where valuable data is streamed. Often, streaming information is offered on the basis of a nonexclusive, single-use customer license. One major concern, especially given the digital nature of the valuable stream, is the ability to easily record and potentially "replay" parts of it in the future. If there is value associated with such future replays, it could constitute enough incentive for a malicious customer (Mallory) to record and duplicate data segments, subsequently reselling them for profit. Being able to protect against such infringements becomes a necessity. In this work, we introduce the issue of rights protection for discrete streaming data through watermarking. This is a novel problem with many associated challenges, including operating in a finite-window, single-pass, (possibly) high-speed streaming model, and surviving natural domain-specific transforms and attacks (e.g., extreme sparse sampling and summarizations), while at the same time keeping data alterations within allowable bounds. We propose a solution and analyze its resilience to various types of attacks as well as some of the important expected domain-specific transforms, such as sampling and summarization. We implement proof-of-concept software (wms.*) and perform experiments on real sensor data from the NASA Infrared Telescope Facility at the University of Hawaii to assess encoding resilience levels in practice. Our solution proves to be well suited for this new domain. For example, we can recover a watermark with over 97 percent confidence from a highly down-sampled stream (e.g., down to less than 8 percent of the original), or survive stream summarization (e.g., 20 percent) and random alteration attacks with very high confidence levels, often above 99 percent.
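
The toy sketch below (an assumed scheme, not the paper's wms.* algorithm) conveys the flavor of keyed, single-pass embedding with bounded alteration: a secret key selects carrier items from the stream, each carrier's quantized value is nudged by at most a few quantization steps so that a keyed hash of it equals the watermark bit, and detection takes a majority vote over whatever carriers survive sampling or summarization.

```python
# Toy single-pass embedding/detection sketch (assumed scheme, not wms.*).
import hashlib

KEY, STEP = b"secret", 0.01          # STEP bounds the per-item alteration scale

def _h(*parts: bytes) -> int:
    return hashlib.sha256(KEY + b"|".join(parts)).digest()[0]

def bit_of(q: int) -> int:           # keyed bit carried by quantized value q
    return _h(b"bit", str(q).encode()) & 1

def is_carrier(q: int) -> bool:      # keyed selection of carrier items
    return _h(b"sel", str(q).encode()) % 4 == 0

def embed(stream, wm_bit, max_dq=3):
    out = []
    for x in stream:
        q = int(round(x / STEP))
        if is_carrier(q) and bit_of(q) != wm_bit:
            # smallest alteration (within max_dq steps) that stays a carrier
            for cand in sorted((q + d for d in range(-max_dq, max_dq + 1) if d),
                               key=lambda c: abs(c - q)):
                if is_carrier(cand) and bit_of(cand) == wm_bit:
                    q = cand
                    break
        out.append(q * STEP if is_carrier(q) else x)
    return out

def detect(stream):
    votes = [bit_of(int(round(x / STEP))) for x in stream
             if is_carrier(int(round(x / STEP)))]
    if not votes:
        return None, 0.0
    ones = sum(votes)
    return int(ones * 2 >= len(votes)), max(ones, len(votes) - ones) / len(votes)
```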

42 citations


Journal ArticleDOI
TL;DR: Findings from an empirical study are presented that measure and compare the accuracy and effectiveness of a suite of automatic event reconstruction techniques, quantifying the rates of false positives and false negatives as well as scalability in terms of both computational burden and memory usage.

28 citations


Book ChapterDOI
04 Dec 2006
TL;DR: This paper studies the notion of point-based policies for trust management, and gives protocols for realizing them in a disclosure-minimizing fashion by computing a subset of Alice's credentials without revealing any of the two parties' private information.
Abstract: This paper studies the notion of point-based policies for trust management, and gives protocols for realizing them in a disclosure-minimizing fashion. Specifically, Bob values each credential with a certain number of points, and requires a minimum total threshold of points before granting Alice access to a resource. In turn, Alice values each of her credentials with a privacy score that indicates her reluctance to reveal that credential. Bob's valuation of credentials and his threshold are private. Alice's privacy-valuation of her credentials is also private. Alice wants to find a subset of her credentials that achieves Bob's required threshold for access, yet is of as small a value to her as possible. We give protocols for computing such a subset of Alice's credentials without revealing any of the two parties' above-mentioned private information.
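
Underlying the protocols is the following (non-private) selection problem, shown here as a small dynamic-programming sketch with made-up numbers: pick the subset of Alice's credentials whose point total reaches Bob's threshold at minimum total privacy cost. The paper's contribution is computing this jointly while keeping the points, the privacy scores, and the threshold hidden.

```python
# Non-private core problem only (assumed toy values): a min-cost covering
# knapsack solved by dynamic programming; the paper's protocols compute the
# same answer without revealing points, privacy scores, or the threshold.
def cheapest_qualifying_subset(points, privacy_costs, threshold):
    INF = float("inf")
    # best[t] = (minimum privacy cost, chosen credential indices) reaching >= t points
    best = [(INF, None)] * (threshold + 1)
    best[0] = (0.0, frozenset())
    for i, (p, c) in enumerate(zip(points, privacy_costs)):
        for t in range(threshold, -1, -1):      # downward: each credential used at most once
            cost, subset = best[t]
            if subset is None:
                continue
            nt = min(threshold, t + p)
            if cost + c < best[nt][0]:
                best[nt] = (cost + c, subset | {i})
    return best[threshold]

# cheapest_qualifying_subset([5, 3, 4], [2.0, 0.5, 1.0], 7) -> (1.5, frozenset({1, 2}))
```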

Book ChapterDOI
27 Feb 2006
TL;DR: The present paper improves on the work of Teague by providing a protocol whose worst-case complexity is exponentially better, and which uses tools that are of independent interest.
Abstract: Participants in e-commerce and other forms of online collaborations tend to be selfish and rational, and therefore game theory has been recognized as particularly relevant to this area. In many common games, the joint strategy of the players is described by a list of pairs of actions, and one of those pairs is chosen according to a specified correlated probability distribution. In traditional game theory, a trusted third-party mediator carries out this random selection, and reveals to each player its recommended action. In such games that have a correlated equilibrium, each player follows the mediator's recommendation because deviating from it cannot increase a player's expected payoff. Dodis, Halevi, and Rabin [1] described a two-party protocol that eliminates, through cryptographic means, the third-party mediator. That protocol was designed and works well for a uniform distribution, but can be quite inefficient if applied to non-uniform distributions. Teague [2] has subsequently built on this work and extended it to the case where the probabilistic strategy no longer assigns equal probabilities to all the pairs of moves. Our present paper improves on the work of Teague by providing, for the same problem, a protocol whose worst-case complexity is exponentially better. The protocol also uses tools that are of independent interest.
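
To fix ideas, the snippet below shows what the trusted mediator does in the classical setting and what the two-party protocol must emulate cryptographically: sample one action pair from a non-uniform correlated distribution and tell each player only its own component. The "chicken" distribution used here is a textbook example, not one taken from the paper.

```python
# What the mediator does (and what the mediator-free protocol must emulate):
# sample from a correlated, possibly non-uniform distribution over action
# pairs and reveal each component only to its own player.  The example game
# and probabilities are illustrative.
import random

correlated_eq = {("Chicken", "Chicken"): 1/3,   # the pair ("Dare", "Dare")
                 ("Chicken", "Dare"):    1/3,   # is never recommended
                 ("Dare",    "Chicken"): 1/3}

def mediate(distribution):
    pairs, weights = zip(*distribution.items())
    a1, a2 = random.choices(pairs, weights=weights, k=1)[0]
    return a1, a2          # a1 goes only to player 1, a2 only to player 2

rec_p1, rec_p2 = mediate(correlated_eq)
```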

Patent
10 Nov 2006
TL;DR: A method and system for hiding an encryption key is proposed, in which a message enters a source vertex, flows along a path from the source vertex to a sink vertex, and leaves the sink vertex as an encrypted or decrypted version of the input message under the private encryption key.
Abstract: A method and system for hiding an encryption key. The method includes creating a directed graph having a plurality of vertices and edges, including source and sink vertices. Each vertex has a vertex value. The source vertices have a common source value, and the sink vertices have a common sink value, the sink value being a function of the source value and the encryption key. Each edge has an edge value that is a function of r(in)^{-1} and r(out), where r(in)^{-1} is the functional inverse of the vertex value of the predecessor vertex and r(out) is the vertex value of the successor vertex. A message enters a source vertex, flows along a path from the source vertex to a sink vertex, and leaves the sink vertex, where the output message is an encrypted or decrypted version of the input message using the private encryption key.
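
The toy below (an illustration of the path-composition idea with assumed parameters, not the patented construction) uses multiplicative blinding modulo a prime: the vertex values cancel telescopically along any source-to-sink path, so every path applies the same hidden key to the message even though no single edge value reveals it.

```python
# Toy path-composition illustration (assumed parameters, not the patented
# construction): vertex values telescope along any source-to-sink path,
# leaving only the hidden multiplicative key applied to the message.
import random

p = 2_147_483_647                 # prime modulus (toy size)
key = 123456789                   # hidden multiplicative "encryption key" (toy)

edges = [(0, 1), (0, 2), (1, 3), (2, 3)]       # small DAG: source 0, sink 3
r = {0: 1, 3: key}                # common source value and common sink value
for v in (1, 2):                  # interior vertices get random unit values
    r[v] = random.randrange(2, p - 1)

inv = lambda x: pow(x, p - 2, p)  # modular inverse (p is prime)

# edge value = r(in)^{-1} * r(out)  (mod p)
edge_val = {(u, v): inv(r[u]) * r[v] % p for (u, v) in edges}

def push(message, path):
    """Send a message along a source-to-sink path; vertex values cancel."""
    x = message * r[path[0]] % p
    for u, v in zip(path, path[1:]):
        x = x * edge_val[(u, v)] % p
    return x                      # = message * r[sink] = message * key (mod p)

m = 42
assert push(m, [0, 1, 3]) == push(m, [0, 2, 3]) == m * key % p
```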

Book ChapterDOI
15 Aug 2006
TL;DR: This work survey security technologies that mitigate this problem of reluctance to share data, and discuss research directions towards enforcing the data owner’s approved purposes on the data used in collaborative computing.
Abstract: Even though collaborative computing can yield substantial economic, social, and scientific benefits, a serious impediment to fully achieving that potential is a reluctance to share data, for fear of losing control over its subsequent dissemination and usage. An organization’s most valuable and useful data is often proprietary/confidential, or the law may forbid its disclosure or regulate the form of that disclosure. We survey security technologies that mitigate this problem, and discuss research directions towards enforcing the data owner’s approved purposes on the data used in collaborative computing. These include techniques for cooperatively computing answers without revealing any private data, even though the computed answers depend on all the participants’ private data. They also include computational outsourcing, where computationally weak entities use computationally powerful entities to carry out intensive computing tasks without revealing to them either their inputs or the computed outputs.

Journal ArticleDOI
01 Nov 2006
TL;DR: This paper presents and compares two schemes that address the problem of portable and flexible privacy-preserving access rights that permit access to a large collection of digital goods, and shows that much can be achieved if one allows for even a negligible amount of false positives.
Abstract: We explore the problem of portable and flexible privacy preserving access rights that permit access to a large collection of digital goods. Privacy-preserving access control means that the service provider can neither learn what access rights a customer has nor link a request to access an item to a particular customer, thus maintaining privacy of both customer activity and customer access rights. Flexible access rights allow a customer to choose a subset of items or groups of items from the repository, obtain access to and be charged only for the items selected. And portability of access rights means that the rights themselves can be stored on small devices of limited storage space and computational capabilities such as smartcards or sensors, and therefore the rights must be enforced using the limited resources available. In this paper, we present and compare two schemes that address the problem of such access rights. We show that much can be achieved if one allows for even a negligible amount of false positives – items that were not requested by the customer, but inadvertently were included in the customer access right representation due to constrained space resources. But minimizing false positives is one of many other desiderata that include protection against sharing of false positives information by unscrupulous users, providing the users with transaction untraceability and unlinkability, and forward compatibility of the scheme. Our first scheme does not place any constraints on the amount of space available on the limited-capacity storage device, and searches for the best representation that meets the requirements. The second scheme, on the other hand, has (modest) requirements on the storage space available, but guarantees a low rate of false positives: with O(mc) storage space available on the smartcard (where m is the number of items or groups of items included in the subscription and c is a selectable parameter), it achieves a rate of false positives of m −c .