Author

Dakshi Agrawal

Bio: Dakshi Agrawal is an academic researcher from IBM. The author has contributed to research in topics: Wireless network & Service provider. The author has an h-index of 33, co-authored 145 publications receiving 6,337 citations. Previous affiliations of Dakshi Agrawal include Bell Labs & University of Illinois at Urbana–Champaign.


Papers
Proceedings ArticleDOI
Dakshi Agrawal, Charu C. Aggarwal
01 May 2001
TL;DR: It is proved that the EM algorithm converges to the maximum likelihood estimate of the original distribution based on the perturbed data, and metrics for the quantification and measurement of privacy-preserving data mining algorithms are proposed.
Abstract: The increasing ability to track and collect large amounts of data with the use of current hardware technology has led to an interest in the development of data mining algorithms which preserve user privacy. A recently proposed technique addresses the issue of privacy preservation by perturbing the data and reconstructing distributions at an aggregate level in order to perform the mining. This method is able to retain privacy while accessing the information implicit in the original attributes. The distribution reconstruction process naturally leads to some loss of information which is acceptable in many practical situations. This paper discusses an Expectation Maximization (EM) algorithm for distribution reconstruction which is more effective than the currently available method in terms of the level of information loss. Specifically, we prove that the EM algorithm converges to the maximum likelihood estimate of the original distribution based on the perturbed data. We show that when a large amount of data is available, the EM algorithm provides robust estimates of the original distribution. We propose metrics for quantification and measurement of privacy-preserving data mining algorithms. Thus, this paper provides the foundations for measurement of the effectiveness of privacy preserving data mining algorithms. Our privacy metrics illustrate some interesting results on the relative effectiveness of different perturbing distributions.
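To make the reconstruction step concrete, here is a minimal sketch, not the paper's exact formulation: the observed values are originals plus Gaussian noise of known scale, the support of the original distribution is discretized into bins, and EM iteratively re-estimates the bin probabilities. All sizes and distributions below are illustrative assumptions.

```python
# EM reconstruction of a discretized original distribution from data that was
# perturbed with additive Gaussian noise of known standard deviation.
import numpy as np

rng = np.random.default_rng(0)

x = rng.normal(5.0, 1.0, size=10_000)        # original values (hidden from the miner)
sigma = 2.0
z = x + rng.normal(0.0, sigma, size=x.size)  # perturbed values (observed)

# Discretize the assumed support of the original distribution into bins.
edges = np.linspace(0.0, 10.0, 41)
mids = 0.5 * (edges[:-1] + edges[1:])
theta = np.full(mids.size, 1.0 / mids.size)  # initial uniform bin probabilities

# P(z | x in bin) for Gaussian noise; the normalizing constant cancels later.
lik = np.exp(-0.5 * ((z[:, None] - mids[None, :]) / sigma) ** 2)

for _ in range(200):                         # EM iterations
    post = lik * theta                       # E-step: unnormalized posteriors
    post /= post.sum(axis=1, keepdims=True)
    theta = post.mean(axis=0)                # M-step: update bin probabilities

# theta now approximates the histogram of the original x over `mids`.
```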

1,091 citations

Book ChapterDOI
13 Aug 2002
TL;DR: It is shown that not only can EM emanations be used to attack cryptographic devices where the power side-channel is unavailable, they can even be used to break power analysis countermeasures.
Abstract: We present results of a systematic investigation of leakage of compromising information via electromagnetic (EM) emanations from CMOS devices. These emanations are shown to consist of a multiplicity of signals, each leaking somewhat different information about the underlying computation. We show that not only can EM emanations be used to attack cryptographic devices where the power side-channel is unavailable, they can even be used to break power analysis countermeasures.
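The paper itself is experimental, but the style of analysis it builds on can be illustrated generically. Below is a self-contained, hypothetical sketch of correlation-based side-channel key recovery on simulated single-sample "traces"; the leakage model (Hamming weight of plaintext XOR key) and the noise level are illustrative assumptions, not the paper's measurement setup.

```python
# Correlation-based side-channel analysis on simulated leakage: guess each key
# byte, predict the leakage it would cause, and correlate against the traces.
import numpy as np

rng = np.random.default_rng(1)
HW = np.array([bin(v).count("1") for v in range(256)])  # Hamming weights 0..255

secret_key = 0x3C
plaintexts = rng.integers(0, 256, size=2000)
# One leaking sample per operation: HW(plaintext XOR key) plus Gaussian noise.
traces = HW[plaintexts ^ secret_key] + rng.normal(0, 1.0, size=plaintexts.size)

corrs = np.empty(256)
for guess in range(256):
    model = HW[plaintexts ^ guess]           # predicted leakage for this guess
    corrs[guess] = np.corrcoef(model, traces)[0, 1]

print(f"recovered key: {int(np.argmax(corrs)):#04x}")  # 0x3c with high probability
```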

778 citations

Proceedings ArticleDOI
20 May 2007
TL;DR: These results show that Trojans that are 3-4 orders of magnitude smaller than the main circuit can be detected by signal processing techniques and provide a starting point to address this important problem.
Abstract: Hardware manufacturers are increasingly outsourcing their IC fabrication work overseas due to their much lower cost structure. This poses a significant security risk for ICs used for critical military and business applications. Attackers can exploit this loss of control to substitute Trojan ICs for genuine ones or insert a Trojan circuit into the design or mask used for fabrication. We show that a technique borrowed from side-channel cryptanalysis can be used to mitigate this problem. Our approach uses noise modeling to construct a set of fingerprints for an IC family utilizing side-channel information such as power, temperature, and electromagnetic (EM) profiles. The set of fingerprints can be developed using a few ICs from a batch, and only these ICs would have to be invasively tested to ensure that they were all authentic. The remaining ICs are verified using statistical tests against the fingerprints. We describe the theoretical framework and show that this approach is viable by presenting preliminary results from power simulations performed on representative circuits with several different Trojan circuits. These results show that Trojans that are 3-4 orders of magnitude smaller than the main circuit can be detected by signal processing techniques. While scaling our technique to detect even smaller Trojans in complex ICs with tens or hundreds of millions of transistors would require certain modifications to the IC design process, our results provide a starting point to address this important problem.
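To illustrate the fingerprinting idea, the following hedged sketch characterizes a population of genuine ICs by the principal subspace of their simulated power traces and flags a chip whose residual energy outside that subspace exceeds a calibrated threshold. The traces, the Trojan's effect, and the threshold rule are all synthetic assumptions, not the paper's data.

```python
# Side-channel fingerprinting: learn the "genuine" trace subspace, then test
# new chips by the size of their residual outside that subspace.
import numpy as np

rng = np.random.default_rng(2)
T, N = 256, 40                                   # samples per trace, genuine chips
base = np.sin(np.linspace(0, 8 * np.pi, T))      # nominal power profile
genuine = base + 0.05 * rng.normal(size=(N, T))  # process variation as noise

# Fingerprint: mean trace plus the dominant principal components of the noise.
mean = genuine.mean(axis=0)
_, _, Vt = np.linalg.svd(genuine - mean, full_matrices=False)
subspace = Vt[:5]                                # keep 5 components

def residual(trace):
    d = trace - mean
    return np.linalg.norm(d - subspace.T @ (subspace @ d))

# Calibrate a simple threshold from the genuine population.
thresh = max(residual(t) for t in genuine) * 1.1

trojan = base + 0.05 * rng.normal(size=T)
trojan[100:130] += 0.2                           # small extra draw from a Trojan
print(residual(trojan) > thresh)                 # expected True: flagged as suspicious
```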

741 citations

Proceedings ArticleDOI
18 May 1998
TL;DR: A space-time coded orthogonal frequency division multiplexing (OFDM) modulated physical layer is designed that combines coding and modulation and is attractive for delay-sensitive applications.
Abstract: There has been an increasing interest in providing high data-rate services such as video-conferencing, multimedia Internet access and wide area network over wideband wireless channels. Wideband wireless channels available in the PCS band (2 GHz) have been envisioned to be used by mobile (high Doppler) and stationary (low Doppler) units in a variety of delay spread profiles. This is a challenging task, given the limited link budget and severity of wireless environment, and calls for the development of novel, robust, bandwidth-efficient techniques which work reliably at low SNRs. To this end, we design a space-time coded orthogonal frequency division multiplexing (OFDM) modulated physical layer. This combines coding and modulation. Space-time codes were previously proposed for narrowband wireless channels. These codes have high spectral efficiency and operate at very low SNR (within 2-3 dB of the capacity). On the other hand, OFDM has matured as a modulation scheme for wideband channels. We combine these two in a natural manner and propose a system achieving data rates of 1.5-3 Mbps over a 1 MHz bandwidth channel. This system requires 18-23 dB (resp. 9-14 dB) receive SNR at a frame error probability of 10^-2 with two transmit and one receive antennas (resp. two transmit and two receive antennas). As space-time coding does not require any form of interleaving, the proposed system is attractive for delay-sensitive applications.
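For a runnable flavor of combining space-time coding with OFDM, the sketch below applies the simpler Alamouti block code per subcarrier (the paper uses space-time trellis codes, a different construction). After the FFT, OFDM reduces the wideband channel to independent flat-fading subcarriers, which is what the per-subcarrier channel model here assumes; all sizes and SNRs are illustrative.

```python
# Alamouti space-time block coding applied independently on each OFDM subcarrier.
import numpy as np

rng = np.random.default_rng(3)
K = 64                                           # OFDM subcarriers (post-FFT model)

bits = rng.integers(0, 2, size=(4, K))
s1 = ((2 * bits[0] - 1) + 1j * (2 * bits[1] - 1)) / np.sqrt(2)  # QPSK, slot 1
s2 = ((2 * bits[2] - 1) + 1j * (2 * bits[3] - 1)) / np.sqrt(2)  # QPSK, slot 2

# Per-subcarrier flat Rayleigh fading from each of the two transmit antennas.
h1 = (rng.normal(size=K) + 1j * rng.normal(size=K)) / np.sqrt(2)
h2 = (rng.normal(size=K) + 1j * rng.normal(size=K)) / np.sqrt(2)
noise = lambda: 0.05 * (rng.normal(size=K) + 1j * rng.normal(size=K))

# Alamouti over two slots: slot 1 sends (s1, s2); slot 2 sends (-s2*, s1*).
r1 = h1 * s1 + h2 * s2 + noise()
r2 = -h1 * np.conj(s2) + h2 * np.conj(s1) + noise()

# Linear combining recovers each symbol with two-branch transmit diversity.
g = np.abs(h1) ** 2 + np.abs(h2) ** 2
s1_hat = (np.conj(h1) * r1 + h2 * np.conj(r2)) / g
s2_hat = (np.conj(h2) * r1 - h1 * np.conj(r2)) / g
print(np.max(np.abs(s1_hat - s1)), np.max(np.abs(s2_hat - s2)))  # noise-limited error
```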

599 citations

Journal ArticleDOI
25 Jun 2000
TL;DR: A numerical optimization procedure for finding good packings in the complex Grassmannian space is described and the best signal constellations found by this procedure improve significantly upon previously known results.
Abstract: In this correspondence, we show that the problem of designing efficient multiple-antenna signal constellations for fading channels can be related to the problem of finding packings with large minimum distance in the complex Grassmannian space. We describe a numerical optimization procedure for finding good packings in the complex Grassmannian space and report the best signal constellations found by this procedure. These constellations improve significantly upon previously known results.
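A minimal numerical-search sketch of the packing problem follows: N one-dimensional subspaces (lines) in C^m, with random local perturbations accepted whenever they improve the minimum pairwise chordal distance. The paper's optimizer is more sophisticated; dimensions, step sizes, and iteration counts here are illustrative assumptions.

```python
# Random local search for a packing of N lines in the complex Grassmannian G(m, 1),
# maximizing the minimum chordal distance sqrt(1 - |<v_i, v_j>|^2).
import numpy as np

rng = np.random.default_rng(4)
m, N = 4, 16                                  # ambient dimension, number of lines

def normalize(V):
    return V / np.linalg.norm(V, axis=0, keepdims=True)

def min_dist(V):
    G = np.abs(V.conj().T @ V) ** 2           # squared inner products between lines
    np.fill_diagonal(G, 0.0)
    return np.sqrt(1.0 - G.max())

V = normalize(rng.normal(size=(m, N)) + 1j * rng.normal(size=(m, N)))
best, step = min_dist(V), 0.3
for _ in range(20_000):
    W = normalize(V + step * (rng.normal(size=(m, N)) + 1j * rng.normal(size=(m, N))))
    d = min_dist(W)
    if d > best:                              # accept only improving perturbations
        V, best = W, d
    step *= 0.9998                            # slowly shrink the search radius
print(f"minimum chordal distance after search: {best:.4f}")
```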

163 citations


Cited by
Journal ArticleDOI
TL;DR: In this paper, the authors consider the design of channel codes for improving the data rate and/or the reliability of communications over fading channels using multiple transmit antennas and derive performance criteria for designing such codes under the assumption that the fading is slow and frequency nonselective.
Abstract: We consider the design of channel codes for improving the data rate and/or the reliability of communications over fading channels using multiple transmit antennas. Data is encoded by a channel code and the encoded data is split into n streams that are simultaneously transmitted using n transmit antennas. The received signal at each receive antenna is a linear superposition of the n transmitted signals perturbed by noise. We derive performance criteria for designing such codes under the assumption that the fading is slow and frequency nonselective. Performance is shown to be determined by matrices constructed from pairs of distinct code sequences. The minimum rank among these matrices quantifies the diversity gain, while the minimum determinant of these matrices quantifies the coding gain. The results are then extended to fast fading channels. The design criteria are used to design trellis codes for high data rate wireless communication. The encoding/decoding complexity of these codes is comparable to trellis codes employed in practice over Gaussian channels. The codes constructed here provide the best tradeoff between data rate, diversity advantage, and trellis complexity. Simulation results are provided for 4 and 8 PSK signal sets with data rates of 2 and 3 bits/symbol, demonstrating excellent performance that is within 2-3 dB of the outage capacity for these channels using only 64-state encoders.
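The rank and determinant criteria are easy to compute for a given codebook. The sketch below, on a toy two-antenna codebook (an illustrative assumption, not a code from the paper), forms the difference matrix B for every pair of distinct codewords and reports the minimum rank (diversity gain) and the minimum geometric mean of the nonzero eigenvalues of B B^H (coding gain).

```python
# Rank and determinant design criteria evaluated over all codeword pairs.
import numpy as np
from itertools import combinations

# Toy codebook: 4 codewords, 2 transmit antennas, 4 symbol periods (BPSK entries).
codebook = [np.array(c, dtype=complex) for c in (
    [[1, 1, 1, 1], [1, 1, 1, 1]],
    [[1, -1, 1, -1], [1, 1, -1, -1]],
    [[-1, 1, -1, 1], [1, -1, 1, -1]],
    [[-1, -1, -1, -1], [-1, 1, -1, 1]],
)]

min_rank, min_coding_gain = np.inf, np.inf
for C1, C2 in combinations(codebook, 2):
    B = C1 - C2                          # difference matrix for this pair
    A = B @ B.conj().T                   # Hermitian, n_tx x n_tx
    eig = np.linalg.eigvalsh(A)
    nz = eig[eig > 1e-9]                 # nonzero eigenvalues
    min_rank = min(min_rank, len(nz))
    min_coding_gain = min(min_coding_gain, np.prod(nz) ** (1 / len(nz)))

print(f"diversity gain (min rank): {int(min_rank)}")
print(f"coding gain: {min_coding_gain:.3f}")
```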

7,105 citations

Book ChapterDOI
04 Mar 2006
TL;DR: In this article, the authors show that for several particular applications substantially less noise is needed than was previously understood to be the case, and obtain separation results showing the increased value of interactive sanitization mechanisms over non-interactive ones.
Abstract: We continue a line of research initiated in [10,11] on privacy-preserving statistical databases. Consider a trusted server that holds a database of sensitive information. Given a query function f mapping databases to reals, the so-called true answer is the result of applying f to the database. To protect privacy, the true answer is perturbed by the addition of random noise generated according to a carefully chosen distribution, and this response, the true answer plus noise, is returned to the user. Previous work focused on the case of noisy sums, in which f = ∑ᵢ g(xᵢ), where xᵢ denotes the i-th row of the database and g maps database rows to [0,1]. We extend the study to general functions f, proving that privacy can be preserved by calibrating the standard deviation of the noise according to the sensitivity of the function f. Roughly speaking, this is the amount that any single argument to f can change its output. The new analysis shows that for several particular applications substantially less noise is needed than was previously understood to be the case. The first step is a very clean characterization of privacy in terms of indistinguishability of transcripts. Additionally, we obtain separation results showing the increased value of interactive sanitization mechanisms over non-interactive.
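The recipe in the abstract is what is now known as the Laplace mechanism: add noise whose scale is the query's sensitivity divided by the privacy parameter epsilon. A minimal sketch, with an illustrative database and counting query:

```python
# Laplace mechanism: privacy via noise calibrated to the query's sensitivity.
import numpy as np

rng = np.random.default_rng(5)

def laplace_mechanism(true_answer, sensitivity, epsilon):
    """Return a differentially private answer to a numeric query."""
    return true_answer + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

db = rng.integers(0, 2, size=1000)        # one sensitive bit per person
# A counting query changes by at most 1 when one row changes: sensitivity 1.
print(laplace_mechanism(db.sum(), sensitivity=1.0, epsilon=0.1))
```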

6,211 citations

ReportDOI
13 Aug 2004
TL;DR: This second-generation Onion Routing system addresses limitations in the original design by adding perfect forward secrecy, congestion control, directory servers, integrity checking, configurable exit policies, and a practical design for location-hidden services via rendezvous points.
Abstract: We present Tor, a circuit-based low-latency anonymous communication service. This second-generation Onion Routing system addresses limitations in the original design by adding perfect forward secrecy, congestion control, directory servers, integrity checking, configurable exit policies, and a practical design for location-hidden services via rendezvous points. Tor works on the real-world Internet, requires no special privileges or kernel modifications, requires little synchronization or coordination between nodes, and provides a reasonable tradeoff between anonymity, usability, and efficiency. We briefly describe our experiences with an international network of more than 30 nodes. We close with a list of open problems in anonymous communication.
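The layering idea behind onion routing can be shown in a few lines. The toy sketch below wraps a message in one encryption layer per relay and has each relay peel exactly one layer; it uses Fernet for brevity and omits Tor's actual circuit construction, key exchange, and cell format entirely.

```python
# Toy onion routing: one encryption layer per relay, peeled in path order.
from cryptography.fernet import Fernet

relay_keys = [Fernet.generate_key() for _ in range(3)]  # entry, middle, exit

# Client: encrypt for the exit relay first, then wrap for middle, then entry.
onion = b"GET example.com"
for key in reversed(relay_keys):
    onion = Fernet(key).encrypt(onion)

# Each relay removes one layer and forwards what remains to the next hop.
for key in relay_keys:
    onion = Fernet(key).decrypt(onion)

print(onion)  # b'GET example.com' emerges only after the exit relay's layer
```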

3,960 citations

Journal ArticleDOI
TL;DR: This paper shows with two simple attacks that a k-anonymized dataset has some subtle but severe privacy problems, and proposes a novel and powerful privacy definition called ℓ-diversity, which is practical and can be implemented efficiently.
Abstract: Publishing data about individuals without revealing sensitive information about them is an important problem. In recent years, a new definition of privacy called k-anonymity has gained popularity. In a k-anonymized dataset, each record is indistinguishable from at least k − 1 other records with respect to certain identifying attributes. In this article, we show using two simple attacks that a k-anonymized dataset has some subtle but severe privacy problems. First, an attacker can discover the values of sensitive attributes when there is little diversity in those sensitive attributes. This is a known problem. Second, attackers often have background knowledge, and we show that k-anonymity does not guarantee privacy against attackers using background knowledge. We give a detailed analysis of these two attacks, and we propose a novel and powerful privacy criterion called ℓ-diversity that can defend against such attacks. In addition to building a formal foundation for ℓ-diversity, we show in an experimental evaluation that ℓ-diversity is practical and can be implemented efficiently.
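Both definitions are straightforward to check on a toy table. The sketch below groups records by quasi-identifiers and tests whether each group has at least k rows (k-anonymity) and at least ℓ distinct sensitive values (distinct ℓ-diversity); the records are illustrative and mirror the homogeneity attack described above.

```python
# Checking k-anonymity and distinct l-diversity on a toy medical table.
from collections import defaultdict

# (zip_prefix, age_range) are quasi-identifiers; disease is the sensitive value.
records = [
    ("130**", "<30", "heart disease"),
    ("130**", "<30", "viral infection"),
    ("130**", "<30", "cancer"),
    ("148**", ">=40", "cancer"),
    ("148**", ">=40", "cancer"),
    ("148**", ">=40", "cancer"),
]

groups = defaultdict(list)
for zip_qi, age_qi, disease in records:
    groups[(zip_qi, age_qi)].append(disease)

def is_k_anonymous(groups, k):
    return all(len(g) >= k for g in groups.values())

def is_l_diverse(groups, ell):
    return all(len(set(g)) >= ell for g in groups.values())

print(is_k_anonymous(groups, k=3))   # True: every group has >= 3 records
print(is_l_diverse(groups, ell=2))   # False: the second group leaks "cancer"
```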

3,780 citations

Journal Article
TL;DR: The study is extended to general functions f, proving that privacy can be preserved by calibrating the standard deviation of the noise according to the sensitivity of the function f, which is the amount that any single argument to f can change its output.
Abstract: We continue a line of research initiated in [10, 11] on privacy-preserving statistical databases. Consider a trusted server that holds a database of sensitive information. Given a query function f mapping databases to reals, the so-called true answer is the result of applying f to the database. To protect privacy, the true answer is perturbed by the addition of random noise generated according to a carefully chosen distribution, and this response, the true answer plus noise, is returned to the user. Previous work focused on the case of noisy sums, in which f = ∑ᵢ g(xᵢ), where xᵢ denotes the i-th row of the database and g maps database rows to [0,1]. We extend the study to general functions f, proving that privacy can be preserved by calibrating the standard deviation of the noise according to the sensitivity of the function f. Roughly speaking, this is the amount that any single argument to f can change its output. The new analysis shows that for several particular applications substantially less noise is needed than was previously understood to be the case. The first step is a very clean characterization of privacy in terms of indistinguishability of transcripts. Additionally, we obtain separation results showing the increased value of interactive sanitization mechanisms over non-interactive.

3,629 citations