
Showing papers on "Entropy (information theory)" published in 2005


Proceedings ArticleDOI
07 Aug 2005
TL;DR: A mutual information criterion is proposed, it is proved that finding the configuration that maximizes mutual information is NP-complete, and a polynomial-time approximation within (1 - 1/e) of the optimum is described that exploits the submodularity of the criterion.
Abstract: When monitoring spatial phenomena, which are often modeled as Gaussian Processes (GPs), choosing sensor locations is a fundamental task. A common strategy is to place sensors at the points of highest entropy (variance) in the GP model. We propose a mutual information criterion and show that it produces better placements. Furthermore, we prove that finding the configuration that maximizes mutual information is NP-complete. To address this issue, we describe a polynomial-time approximation that is within (1 - 1/e) of the optimum, obtained by exploiting the submodularity of our criterion. This algorithm is extended to handle local structure in the GP, yielding significant speedups. We demonstrate the advantages of our approach on two real-world data sets.
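
The (1 - 1/e) guarantee corresponds to the standard greedy algorithm for monotone submodular maximization. Below is a minimal sketch of that greedy loop applied to the mutual-information gain of a zero-mean GP with a known covariance matrix K over candidate locations; function and variable names are illustrative, not taken from the paper.

```python
import numpy as np

def cond_var(K, i, cond):
    """Conditional variance of location i given observations at indices `cond`
    under a zero-mean GP with covariance matrix K."""
    if not cond:
        return K[i, i]
    Kcc = K[np.ix_(cond, cond)]
    kic = K[i, cond]
    return K[i, i] - kic @ np.linalg.solve(Kcc, kic)

def greedy_mi_placement(K, k):
    """Greedily pick k sensor locations maximizing the mutual-information gain
    H(y_c | y_selected) - H(y_c | y_unselected \\ {c}); for Gaussians these
    entropies reduce to 0.5*log(2*pi*e*variance), so only variances matter."""
    n = K.shape[0]
    selected = []
    for _ in range(k):
        best, best_gain = None, -np.inf
        for c in range(n):
            if c in selected:
                continue
            rest = [j for j in range(n) if j not in selected and j != c]
            gain = 0.5 * np.log(cond_var(K, c, selected) / cond_var(K, c, rest))
            if gain > best_gain:
                best, best_gain = c, gain
        selected.append(best)
    return selected
```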

537 citations


Journal ArticleDOI
01 Apr 2005-Oikos
TL;DR: A link is noted between a common family of diversity indices and non-additive statistical mechanics that makes the Shannon index and the Simpson diversity (or Gini coefficient) special cases of a more general index.
Abstract: Many indices for measuring species diversity have been proposed. In this article, a link is noted between a common family of diversity indices and non-additive statistical mechanics. This makes the Shannon index and the Simpson diversity (or Gini coefficient) special cases of a more general index. The general index includes a parameter q that can be interpreted from a statistical mechanics perspective for systems with an underlying (multi)fractal structure. A q-generalised version of the Zipf–Mandelbrot distribution sometimes used to characterise rank–abundance relationships may be obtained by maximising this entropy.
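
The family of indices referred to appears to be the Tsallis (HCDT) entropy with parameter q; under that reading, here is a short illustration of how the Shannon and Simpson indices emerge as special cases.

```python
import numpy as np

def tsallis_diversity(p, q):
    """Generalized diversity index S_q = (1 - sum p_i^q) / (q - 1).
    q -> 1 recovers the Shannon index, q = 2 gives the Simpson (Gini) index."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    if np.isclose(q, 1.0):
        return -np.sum(p * np.log(p))       # Shannon limit as q -> 1
    return (1.0 - np.sum(p ** q)) / (q - 1.0)

abund = np.array([50, 30, 15, 5]) / 100.0   # relative species abundances
print(tsallis_diversity(abund, 1.0))        # Shannon index
print(tsallis_diversity(abund, 2.0))        # Simpson diversity 1 - sum p^2
```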

377 citations


Journal ArticleDOI
TL;DR: In this article, the authors present exact results on universal quantities derived from the local density matrix ρ, for a free massive Dirac field in two dimensions, which can be written exactly in terms of the solutions of non-linear differential equations of the Painlevé V type.
Abstract: We present some exact results on universal quantities derived from the local density matrix ρ, for a free massive Dirac field in two dimensions. We first find tr ρ^n in a novel fashion, which involves the correlators of suitable operators in the sine–Gordon model. These, in turn, can be written exactly in terms of the solutions of non-linear differential equations of the Painlevé V type. Equipped with these results, we find the leading terms of the entanglement entropy for both short and long distances, and show that in the intermediate regime it can be expanded in a series of multiple integrals. The results have been checked by direct numerical calculations on the lattice, finding perfect agreement. Finally, we comment on a possible generalization of the entanglement entropy c-theorem to the alpha-entropies.

373 citations


Journal ArticleDOI
TL;DR: In this paper, a more rigorous and general mathematical derivation of MEP from MaxEnt is presented, and the relationship between MEP and the fluctuation theorem concerning the probability of second law violating phase-space paths is clarified.
Abstract: Recently the author used an information theoretical formulation of non-equilibrium statistical mechanics (MaxEnt) to derive the fluctuation theorem (FT) concerning the probability of second law violating phase-space paths. A less rigorous argument leading to the variational principle of maximum entropy production (MEP) was also given. Here a more rigorous and general mathematical derivation of MEP from MaxEnt is presented, and the relationship between MEP and the FT is thereby clarified. Specifically, it is shown that the FT allows a general orthogonality property of maximum information entropy to be extended to entropy production itself, from which MEP then follows. The new derivation highlights MEP and the FT as generic properties of MaxEnt probability distributions involving anti-symmetric constraints, independently of any physical interpretation. Physically, MEP applies to the entropy production of those macroscopic fluxes that are free to vary under the imposed constraints, and corresponds to selection of the most probable macroscopic flux configuration. In special cases MaxEnt also leads to various upper bound transport principles. The relationship between MaxEnt and previous theories of irreversible processes due to Onsager, Prigogine and Ziegler is also clarified in the light of these results.

347 citations


01 Jan 2005
TL;DR: The universality of the normal cloud model is proved; the model is simpler and more general, fits the fuzziness and gentleness of human cognitive processing, and is more applicable and universal in the representation of uncertain notions.
Abstract: The distribution function is an important tool in the study of stochastic variation, and the normal distribution is ubiquitous in nature and society. The idea of membership functions is the foundation of fuzzy set theory; yet, while the theory is widely used, the completely certain membership function, which has no fuzziness at all, has been a bottleneck in its applications. Cloud models are effective tools for transforming between qualitative concepts and their quantitative expressions: they can represent the fuzziness and randomness of uncertain concepts and the relations between them, and they can express concept granularity across multiple scales through the digital characteristic Entropy (En). The normal cloud model not only relaxes the conditions required for the normal distribution but also makes the normal membership function the expectation of the random membership degree. In this paper, the universality of the normal cloud model is proved; the model is simpler and more general, fits the fuzziness and gentleness of human cognitive processing, and is more applicable and universal in the representation of uncertain notions.
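
For orientation, the normal cloud model in this literature is usually specified by three digital characteristics, expectation Ex, entropy En, and hyper-entropy He, and cloud drops are produced by a forward cloud generator. The sketch below follows that standard formulation and is not code from this paper.

```python
import numpy as np

def normal_cloud_drops(Ex, En, He, n=1000, rng=None):
    """Forward normal cloud generator: for each drop, draw a perturbed
    entropy En' ~ N(En, He^2), then a value x ~ N(Ex, En'^2), and assign
    the membership degree mu = exp(-(x - Ex)^2 / (2 * En'^2))."""
    rng = np.random.default_rng(rng)
    En_prime = np.abs(rng.normal(En, He, size=n)) + 1e-12  # keep spread positive
    x = rng.normal(Ex, En_prime)
    mu = np.exp(-(x - Ex) ** 2 / (2.0 * En_prime ** 2))
    return x, mu

drops, memberships = normal_cloud_drops(Ex=25.0, En=3.0, He=0.5, n=500)
```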

309 citations


Proceedings ArticleDOI
13 Jun 2005
TL;DR: An entropy-based approach is developed that determines and reports the entropy contents of traffic parameters such as IP addresses; changes in these entropy contents indicate a massive network event.
Abstract: Detecting massive network events like worm outbreaks in fast IP networks such as Internet backbones, is hard. One problem is that the amount of traffic data does not allow real-time analysis of details. Another problem is that the specific characteristics of these events are not known in advance. There is a need for analysis methods that are real-time capable and can handle large amounts of traffic data. We have developed an entropy-based approach that determines and reports entropy contents of traffic parameters such as IP addresses. Changes in the entropy content indicate a massive network event. We give analyses on two Internet worms as proof-of-concept. While our primary focus is detection of fast worms, our approach should also be able to detect other network events. We discuss implementation alternatives and give benchmark results. We also show that our approach scales very well.
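
The underlying computation is simply the Shannon entropy of a traffic feature's empirical distribution within each time window; a sharp change between windows raises an alarm. A minimal sketch with illustrative field names and thresholds:

```python
import math
from collections import Counter

def shannon_entropy(counts):
    """Shannon entropy (bits) of an empirical distribution given by counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values() if c)

def entropy_per_window(packets, field="dst_ip"):
    """packets: list of per-packet dicts for one time window; returns the
    entropy of the chosen field. A worm scanning random targets typically
    drives destination-address entropy sharply up."""
    return shannon_entropy(Counter(p[field] for p in packets))

# usage idea: compare successive windows and alert on a large jump, e.g.
# if abs(entropy_per_window(win_now) - entropy_per_window(win_prev)) > threshold: alert()
```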

273 citations


Proceedings ArticleDOI
21 Aug 2005
TL;DR: In this article, the authors exploit an information-theoretic model that combines information theory with statistical techniques from the areas of text mining and natural language processing to identify the most interesting and important nodes in a graph.
Abstract: A major problem in social network analysis and link discovery is the discovery of hidden organizational structure and the selection of interesting, influential members based on low-level, incomplete, and noisy evidence data. To address this challenge, we exploit an information-theoretic model that combines information theory with statistical techniques from the areas of text mining and natural language processing. The entropy model identifies the most interesting and important nodes in a graph. We show how entropy models on graphs are relevant to the study of information flow in an organization. We review the results of two different experiments which are based on entropy models. The first version of this model has been successfully tested and evaluated on the Enron email dataset.

247 citations


Journal Article
TL;DR: In this article, it is shown that Shannon entropy can be generalized to smooth Rényi entropies, which give tight bounds for data compression and randomness extraction even without the assumption of independent repetitions.
Abstract: Shannon entropy is a useful and important measure in information processing, for instance, data compression or randomness extraction, under the assumption - which can typically safely be made in communication theory - that a certain random experiment is independently repeated many times. In cryptography, however, where a system's security has to be proven with respect to a malicious adversary, this assumption usually translates to a restriction on the latter's knowledge or behavior and is generally not satisfied. An example is quantum key agreement, where the adversary can attack each particle sent through the quantum channel differently or even carry out coherent attacks, combining a number of particles together. In information-theoretic key agreement, the central functionalities of information reconciliation and privacy amplification have, therefore, been extensively studied in the scenario of general distributions: partial solutions have been given, but the obtained bounds are arbitrarily far from tight, and a full analysis appeared to be rather involved. We show that, actually, the general case is not more difficult than the scenario of independent repetitions - in fact, given our new point of view, even simpler. When one analyzes the possible efficiency of data compression and randomness extraction in the case of independent repetitions, then Shannon entropy H is the answer. We show that H can, in these two contexts, be generalized to two very simple quantities, H_0^ε and H_∞^ε, called smooth Rényi entropies, which are tight bounds for data compression (hence, information reconciliation) and randomness extraction (privacy amplification), respectively. It is shown that the two new quantities, and related notions, not only extend Shannon entropy in the described contexts, but also share central properties of the latter, such as the chain rule as well as sub-additivity and monotonicity.
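
As a rough intuition, the smooth Rényi entropy of order zero can be read as the logarithm of the smallest support that still carries at least 1 - ε of the probability mass (precise definitions of the smoothing differ in the literature). A toy computation under that reading, together with the non-smooth min-entropy:

```python
import numpy as np

def smooth_H0(p, eps):
    """Approximate smooth Renyi entropy of order 0 (bits): log of the minimum
    number of outcomes whose total probability is at least 1 - eps.  This is
    the 'discard an eps-tail' intuition; exact definitions differ in how the
    smoothing ball is measured."""
    p = np.sort(np.asarray(p, dtype=float))[::-1]   # largest probabilities first
    k = int(np.searchsorted(np.cumsum(p), 1.0 - eps)) + 1
    return np.log2(min(k, p.size))

def H_min(p):
    """(Non-smooth) min-entropy H_inf = -log2 max_x P(x)."""
    return -np.log2(np.max(np.asarray(p, dtype=float)))
```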

229 citations


Journal ArticleDOI
TL;DR: The algorithms developed here extend standard active sensing methodology to dynamically evolving objects and continuous state spaces of high dimension, and yield more than a tenfold gain in sensor efficiency when compared to periodic scanning.

219 citations


Proceedings ArticleDOI
22 May 2005
TL;DR: This paper constructs schemes with which Alice and Bob can prevent an adversary from learning any useful information about W, and designs strong randomness extractors with the property that the source W can be recovered from the extracted randomness and any string W' which is close to W.
Abstract: This paper explores what kinds of information two parties must communicate in order to correct errors which occur in a shared secret string W. Any bits they communicate must leak a significant amount of information about W --- that is, from the adversary's point of view, the entropy of W will drop significantly. Nevertheless, we construct schemes with which Alice and Bob can prevent an adversary from learning any useful information about W. Specifically, if the entropy of W is sufficiently high, then there is no function f(W) which the adversary can learn from the error-correction information with significant probability. This leads to several new results: (a) the design of noise-tolerant "perfectly one-way" hash functions in the sense of Canetti et al. [7], which in turn leads to obfuscation of proximity queries for high entropy secrets W; (b) private fuzzy extractors [11], which allow one to extract uniformly random bits from noisy and nonuniform data W, while also ensuring that no sensitive information about W is leaked; and (c) noise tolerance and stateless key re-use in the Bounded Storage Model, resolving the main open problem of Ding [10]. The heart of our constructions is the design of strong randomness extractors with the property that the source W can be recovered from the extracted randomness and any string W' which is close to W.

213 citations


Journal ArticleDOI
TL;DR: An evolutionary model based on entropy as a selective criterion is formulated, and it is shown that it predicts the direction of changes in network structure over evolutionary time and accounts for the high degree of robustness and the heterogeneous connectivity distribution often observed in biological and technological networks.
Abstract: This article introduces the concept of network entropy as a characteristic measure of network topology. We provide computational and analytical support for the hypothesis that network entropy is a quantitative measure of robustness. We formulate an evolutionary model based on entropy as a selective criterion and show that (a) it predicts the direction of changes in network structure over evolutionary time and (b) it accounts for the high degree of robustness and the heterogeneous connectivity distribution which is often observed in biological and technological networks. Our model is based on Darwinian principles of evolution and preferentially selects networks according to a global fitness criterion, rather than the local preferences of classical models of network growth. We predict that the evolutionarily stable states of evolved networks will be characterized by extremal values of network entropy.
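
The abstract does not spell out the definition used; one common formalization of network entropy in this spirit is the entropy rate of a random walk on the graph, shown here only as an illustrative stand-in rather than the authors' exact quantity.

```python
import numpy as np

def random_walk_entropy_rate(A):
    """Entropy rate (nats per step) of the simple random walk on an undirected
    graph with adjacency matrix A (assumes every node has at least one edge):
    H = sum_i pi_i * H(row i), with stationary distribution pi_i = deg(i)/(2*|E|)."""
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=1)
    P = A / deg[:, None]                  # random-walk transition matrix
    pi = deg / deg.sum()                  # stationary distribution
    with np.errstate(divide="ignore", invalid="ignore"):
        logP = np.where(P > 0, np.log(P), 0.0)
    return -np.sum(pi[:, None] * P * logP)
```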

Journal ArticleDOI
TL;DR: The paper provides a general method to derive a channel model which is consistent with one's state of knowledge and useful both in terms of designing a system based on criteria such as quality of service and in optimizing transmissions in multiuser networks.
Abstract: We devise theoretical grounds for constructing channel models for multiple-input multiple-output (MIMO) systems based on information-theoretic tools. The paper provides a general method to derive a channel model which is consistent with one's state of knowledge. The framework we give here has already been fruitfully explored in the context of Bayesian spectrum analysis and parameter estimation. For each channel model, we conduct an asymptotic analysis (in the number of antennas) of the achievable transmission rate using tools from random matrix theory. A central limit theorem is provided on the asymptotic behavior of the mutual information and validated in the finite case by simulations. The results are useful both for designing a system based on criteria such as quality of service and for optimizing transmissions in multiuser networks.

Journal ArticleDOI
TL;DR: A simulation study indicates that the test involving the proposed entropy estimate has higher power than other well-known competitors under heavy-tailed alternatives, which are frequently used in many financial applications.
Abstract: This paper proposes a new class of estimators of the unknown entropy of a random vector. Its asymptotic unbiasedness and consistency are proved. Further, this class of estimators is used to build both goodness-of-fit and independence tests based on sample entropy. A simulation study indicates that the test involving the proposed entropy estimate has higher power than other well-known competitors under heavy-tailed alternatives, which are frequently used in many financial applications.
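
The abstract does not spell out the estimator's form; a widely used entropy estimator in this general spirit is the nearest-neighbour (Kozachenko-Leonenko) estimator, sketched below as an illustration rather than as the paper's exact construction.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import gamma

def kl_entropy(x):
    """Kozachenko-Leonenko nearest-neighbour estimate of differential entropy
    (nats) for an (n, d) sample matrix x; assumes no duplicate points."""
    x = np.asarray(x, dtype=float)
    n, d = x.shape
    tree = cKDTree(x)
    dist, _ = tree.query(x, k=2)          # column 0 is the point itself
    rho = dist[:, 1]                      # distance to the nearest other point
    vol_unit_ball = np.pi ** (d / 2) / gamma(d / 2 + 1)
    return (d * np.mean(np.log(rho))
            + np.log(vol_unit_ball)
            + np.log(n - 1)
            + np.euler_gamma)
```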

Journal ArticleDOI
TL;DR: This paper presents a new approach to traffic matrix estimation using a regularization based on "entropy penalization", and chooses the traffic matrix consistent with the measured data that is information-theoretically closest to a model in which source/destination pairs are stochastically independent.
Abstract: Traffic matrices are required inputs for many IP network management tasks, such as capacity planning, traffic engineering, and network reliability analysis. However, it is difficult to measure these matrices directly in large operational IP networks, so there has been recent interest in inferring traffic matrices from link measurements and other more easily measured data. Typically, this inference problem is ill-posed, as it involves significantly more unknowns than data. Experience in many scientific and engineering fields has shown that it is essential to approach such ill-posed problems via "regularization". This paper presents a new approach to traffic matrix estimation using a regularization based on "entropy penalization". Our solution chooses the traffic matrix consistent with the measured data that is information-theoretically closest to a model in which source/destination pairs are stochastically independent. It applies to both point-to-point and point-to-multipoint traffic matrix estimation. We use fast algorithms based on modern convex optimization theory to solve for our traffic matrices. We evaluate our algorithm with real backbone traffic and routing data, and demonstrate that it is fast, accurate, robust, and flexible.
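
In outline, the estimate solves a regularized inverse problem: fit the measured link loads y ≈ A x while penalizing the information divergence of the traffic matrix x from an independence (gravity) prior g. A simplified sketch of that penalized formulation follows; the variable names, the plain quadratic-plus-KL objective, and the generic solver are illustrative, whereas the paper uses more refined convex-optimization machinery.

```python
import numpy as np
from scipy.optimize import minimize

def estimate_traffic_matrix(A, y, g, lam=1.0):
    """Minimize ||y - A x||^2 + lam * sum_i x_i * log(x_i / g_i) over x >= 0,
    where A is the routing matrix, y the link loads, and g the gravity
    (independence) prior for the origin-destination flows."""
    g = np.asarray(g, dtype=float)

    def objective(x):
        x = np.maximum(x, 1e-12)          # keep the entropy term well defined
        fit = np.sum((y - A @ x) ** 2)
        penalty = np.sum(x * np.log(x / g))
        return fit + lam * penalty

    x0 = g.copy()                         # start from the gravity model
    res = minimize(objective, x0, bounds=[(0, None)] * len(g), method="L-BFGS-B")
    return res.x
```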

Journal ArticleDOI
TL;DR: Methods for selecting the appropriate granule size and for efficient computation of rough entropy are described; the result is minimization of roughness in both object and background regions, thereby determining the threshold of partitioning.

Journal ArticleDOI
TL;DR: It is shown how this generalization, which unifies Rényi and Tsallis entropy in a coherent picture, arises naturally if the q-formalism of generalized logarithm and exponential functions is used, and how, together with the Sharma–Mittal measure, another possible extension emerges which, however, does not obey a pseudo-additive law and lacks other properties relevant for a generalized thermostatistics.

Journal ArticleDOI
TL;DR: The complexity measure shows local minima at the closed-shell atoms indicating that for the above atoms complexity decreases with respect to neighboring atoms, and it is seen that complexity fluctuates around an average value, indicating that the atom cannot grow in complexity as Z increases.
Abstract: Shannon information entropies in position and momentum spaces and their sum are calculated as functions of Z (2 ≤ Z ≤ 54) in atoms. Roothaan-Hartree-Fock electron wave functions are used. The universal property S = a + b ln Z is verified. In addition, we calculate the Kullback-Leibler relative entropy, the Jensen-Shannon divergence, Onicescu's information energy, and a recently proposed complexity measure. Shell effects at closed-shell atoms are observed. The complexity measure shows local minima at the closed-shell atoms, indicating that for these atoms complexity decreases with respect to neighboring atoms. It is seen that complexity fluctuates around an average value, indicating that the atom cannot grow in complexity as Z increases. Onicescu's information energy is correlated with the ionization potential. The Kullback distance and Jensen-Shannon distance are employed to compare Roothaan-Hartree-Fock density distributions with densities from previous works.

Book
29 Sep 2005
TL;DR: This chapter discusses neural networks, Shannon's information theory, and applications to neural networks in the context of unsupervised and supervised learning.
Abstract:
I. INTRODUCTION TO NEURAL NETWORKS: 1. General introduction; 2. Layered networks; 3. Recurrent networks with binary neurons.
II. ADVANCED NEURAL NETWORKS: 4. Competitive unsupervised learning processes; 5. Bayesian techniques in supervised learning; 6. Gaussian processes; 7. Support vector machines for binary classification.
III. INFORMATION THEORY AND NEURAL NETWORKS: 8. Measuring information; 9. Identification of entropy as an information measure; 10. Building blocks of Shannon's information theory; 11. Information theory and statistical inference; 12. Applications to neural networks.
IV. MACROSCOPIC ANALYSIS OF DYNAMICS: 13. Network operation: macroscopic dynamics; 14. Dynamics of online learning in binary perceptrons; 15. Dynamics of online gradient descent learning.
V. EQUILIBRIUM STATISTICAL MECHANICS OF NEURAL NETWORKS: 16. Basics of equilibrium statistical mechanics; 17. Network operation: equilibrium analysis; 18. Gardner theory of task realizability.
APPENDICES: A. Historical and bibliographical notes; B. Probability theory in a nutshell; C. Conditions for the central limit theorem to apply; D. Some simple summation identities; E. Gaussian integrals and probability distributions; F. Matrix identities; G. The delta-distribution; H. Inequalities based on convexity; I. Metrics for parametrized probability distributions; J. Saddle-point integration.
REFERENCES

Journal ArticleDOI
TL;DR: The spherical confinement model leads to S_T values which satisfy the lower bound up to the limits of extreme confinement, with the interesting new result displaying regions over which a set of upper and lower bounds to the information entropy sum can be locally prescribed.
Abstract: The Shannon information entropies of the 1-normalized electron density in position and momentum space, S_r and S_p, and their sum S_T are reported for the ground-state H, He⁺, Li²⁺, H⁻, He, Li⁺, Li, and B atoms confined inside an impenetrable spherical boundary defined by radius R. We find new characteristic features in S_T, marked by a well-defined minimum and maximum as a function of confinement. The results are analyzed against the background of the irreducible lower bound stipulated by the entropy uncertainty principle [I. Bialynicki-Birula and J. Mycielski, Commun. Math. Phys. 44, 129 (1975)]. The spherical confinement model leads to S_T values which satisfy the lower bound up to the limits of extreme confinement, with the interesting new result displaying regions over which a set of upper and lower bounds to the information entropy sum can be locally prescribed. Similar calculations on the H atom in 2s excited states are presented and their novel characteristics are discussed.

Journal ArticleDOI
TL;DR: The maximum entropy regularization method (MEM) is regarded as a complement to Tikhonov regularization (TikR), securing a positive pair-distance distribution and enhancing the accuracy of TikR.

Journal ArticleDOI
TL;DR: A new measure of the class heterogeneity of intervals, defined from the viewpoint of class probability itself, is proposed; based on this definition of heterogeneity, a new criterion for evaluating a discretization scheme is presented and its properties are analyzed theoretically.
Abstract: Discretization, as a preprocessing step for data mining, is a process of converting the continuous attributes of a data set into discrete ones so that they can be treated as nominal features by machine learning algorithms. The various discretization methods that use entropy-based criteria form a large class of algorithms. However, as a measure of class homogeneity, entropy cannot always accurately reflect the degree of class homogeneity of an interval. Therefore, in this paper, we propose a new measure of the class heterogeneity of intervals from the viewpoint of class probability itself. Based on this definition of heterogeneity, we present a new criterion to evaluate a discretization scheme and analyze its properties theoretically. Also, a heuristic method is proposed to find an approximately optimal discretization scheme. Finally, our method is compared, in terms of predictive error rate and tree size, with Ent-MDLC, a representative entropy-based discretization method well known for its good performance. Our method is shown to produce better results than Ent-MDLC, although the improvement is not significant. It can be a good alternative to entropy-based discretization methods.
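
For context, the entropy-based criterion that methods such as Ent-MDLC rely on scores a candidate cut point by the class-weighted entropy of the two induced intervals. A compact sketch of that baseline criterion follows; it illustrates the standard entropy measure being critiqued, not the heterogeneity measure proposed in this paper.

```python
import numpy as np

def class_entropy(labels):
    """Shannon entropy (bits) of the class distribution in one interval."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def best_entropy_cut(values, labels):
    """Return the cut point on a continuous attribute that minimizes the
    weighted class entropy of the two induced intervals (the usual step of
    recursive entropy-based discretization)."""
    order = np.argsort(values)
    values, labels = np.asarray(values)[order], np.asarray(labels)[order]
    n, best_cut, best_score = len(values), None, np.inf
    for i in range(1, n):
        if values[i] == values[i - 1]:
            continue                      # cut points lie between distinct values
        score = (i / n) * class_entropy(labels[:i]) + ((n - i) / n) * class_entropy(labels[i:])
        if score < best_score:
            best_cut, best_score = (values[i - 1] + values[i]) / 2.0, score
    return best_cut, best_score
```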

Journal ArticleDOI
TL;DR: In this paper, the inequalities above are extended to Rényi entropy, the pth moment, and generalized Fisher information; generalized Gaussian random densities are introduced and shown to be the extremal densities for the new inequalities.
Abstract: The moment-entropy inequality shows that a continuous random variable with given second moment and maximal Shannon entropy must be Gaussian. Stam's inequality shows that a continuous random variable with given Fisher information and minimal Shannon entropy must also be Gaussian. The Cramér-Rao inequality is a direct consequence of these two inequalities. In this paper, the inequalities above are extended to Rényi entropy, the pth moment, and generalized Fisher information. Generalized Gaussian random densities are introduced and shown to be the extremal densities for the new inequalities. An extension of the Cramér-Rao inequality is derived as a consequence of these moment and Fisher information inequalities.
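
For reference, the classical one-dimensional statements being generalized can be written as follows; these are standard textbook forms, not reproduced from the paper.

```latex
% Moment-entropy inequality (the Gaussian maximizes entropy at fixed variance):
h(X) \;\le\; \tfrac{1}{2}\log\!\bigl(2\pi e\,\operatorname{Var}(X)\bigr)

% Stam's inequality, with entropy power N(X) and Fisher information J(X):
N(X)\,J(X) \;\ge\; 1, \qquad N(X) = \frac{1}{2\pi e}\,e^{2h(X)}

% Combining the two yields the Cram\'er-Rao inequality:
\operatorname{Var}(X) \;\ge\; N(X) \;\ge\; \frac{1}{J(X)}
```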

Proceedings ArticleDOI
20 Jun 2005
TL;DR: A novel unsupervised, information-theoretic, adaptive filter (UINTA) that improves the predictability of pixel intensities from their neighborhoods by decreasing the joint entropy between them and can thereby restore a wide spectrum of images and applications.
Abstract: The restoration of images is an important and widely studied problem in computer vision and image processing. Various image filtering strategies have been effective, but invariably make strong assumptions about the properties of the signal and/or degradation. Therefore, these methods typically lack the generality to be easily applied to new applications or diverse image collections. This paper describes a novel unsupervised, information-theoretic, adaptive filter (UINTA) that improves the predictability of pixel intensities from their neighborhoods by decreasing the joint entropy between them. Thus UINTA automatically discovers the statistical properties of the signal and can thereby restore a wide spectrum of images and applications. This paper describes the formulation required to minimize the joint entropy measure, presents several important practical considerations in estimating image-region statistics, and then presents results on both real and synthetic data.

Journal ArticleDOI
TL;DR: It is demonstrated that the elimination of categories is avoided when quadratic entropy is applied to ultrametric dissimilarities: all categories are retained in order to reach its maximal value.

Journal ArticleDOI
TL;DR: A distributed architecture is introduced, within a probabilistic framework for vision-based 3-D mapping, whereby each robot is committed to cooperating with the other robots through information sharing while being prevented from overwhelming communication resources with redundant information.

Journal ArticleDOI
Murali Rao1
TL;DR: Some further mathematical properties of CRE are developed, its relation to the L log L class is shown, and the Weibull distribution is characterized; these properties should be of use in statistical estimation.
Abstract: An alternative notion of entropy called CRE is proposed in [Ra1] Rao et al. (IEEE Trans. Inf. Theory 50, 2004). This preserves many of the properties of Shannon Entropy and possesses mathematical properties, which we hope will be of use in statistical estimates. In this article, we develop some more mathematical properties of CRE, show its relation to the L log L class, and characterize among others the Weibull distribution.

Journal ArticleDOI
TL;DR: In this article, the Rényi entropy was extracted from local measurements on two pairs of polarization-entangled photons and used to demonstrate the violation of entropic inequalities, whose power exceeds that of linear tests based on the Bell-Clauser-Horne-Shimony-Holt inequalities.
Abstract: Nonlinear properties of quantum states, such as entropy or entanglement, quantify important physical resources and are frequently used in quantum-information science. They are usually calculated from a full description of a quantum state, even though they depend only on a small number of parameters that specify the state. Here we extract a nonlocal and a nonlinear quantity, namely, the Renyi entropy, from local measurements on two pairs of polarization-entangled photons. We also introduce a "phase marking" technique which allows the selection of uncorrupted outcomes even with nondeterministic sources of entangled photons. We use our experimental data to demonstrate the violation of entropic inequalities. They are examples of nonlinear entanglement witnesses and their power exceeds all linear tests for quantum entanglement based on all possible Bell-Clauser-Horne-Shimony-Holt inequalities.

Proceedings ArticleDOI
01 Jan 2005
TL;DR: This paper discusses several important factors which can influence the accuracy of the results obtained from application of the well-known approximate entropy method and the more recently developed sample entropy method to fast neurophysiological signals.
Abstract: This paper discusses several important factors which can influence the accuracy of the results obtained from application of the well-known approximate entropy (ApEn) method and the more recently developed sample entropy (SampEn) method to fast neurophysiological signals. Based on the performance of these methods on both computer-simulated and experimental data, parameter selection criteria are suggested for application to fast dynamic signals.
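
As a concrete reference point, sample entropy for a series u with embedding dimension m and tolerance r is SampEn = -ln(A/B), where B and A count template pairs of length m and m+1 that match within r, excluding self-matches. A straightforward, non-optimized sketch follows; the default r = 0.2 times the standard deviation is a conventional choice, not a prescription from this paper.

```python
import numpy as np

def sample_entropy(u, m=2, r=None):
    """SampEn(m, r) of a 1-D series u: -ln(A/B), where B counts template pairs
    of length m within Chebyshev distance r, and A does the same for length
    m+1. Self-matches are excluded, which distinguishes SampEn from ApEn."""
    u = np.asarray(u, dtype=float)
    if r is None:
        r = 0.2 * np.std(u)

    def count_matches(length):
        templates = np.array([u[i:i + length] for i in range(len(u) - length + 1)])
        count = 0
        for i in range(len(templates)):
            d = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
            count += np.sum(d <= r)
        return count

    B = count_matches(m)
    A = count_matches(m + 1)
    return -np.log(A / B) if A > 0 and B > 0 else np.inf
```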

Journal ArticleDOI
TL;DR: It is shown that a $\gamma$-multiplicative approximation to the entropy can be obtained in $O(n^{(1+\eta)/\gamma^2} \log n)$ time for distributions with entropy $\Omega(\gamma/\eta)$, where $n$ is the size of the domain of the distribution and $\eta$ is an arbitrarily small positive constant.
Abstract: We consider the problem of approximating the entropy of a discrete distribution under several different models of oracle access to the distribution. In the evaluation oracle model, the algorithm is given access to the explicit array of probabilities specifying the distribution. In this model, linear time in the size of the domain is both necessary and sufficient for approximating the entropy. In the generation oracle model, the algorithm has access only to independent samples from the distribution. In this case, we show that a $\gamma$-multiplicative approximation to the entropy can be obtained in $O(n^{(1+\eta)/\gamma^2} \log n)$ time for distributions with entropy $\Omega(\gamma/\eta)$, where $n$ is the size of the domain of the distribution and $\eta$ is an arbitrarily small positive constant. We show that this model does not permit a multiplicative approximation to the entropy in general. For the class of distributions to which our upper bound applies, we obtain a lower bound of $\Omega(n^{1/(2\gamma^2)})$. We next consider a combined oracle model in which the algorithm has access to both the generation and the evaluation oracles of the distribution. In this model, significantly greater efficiency can be achieved: we present an algorithm for $\gamma$-multiplicative approximation to the entropy that runs in $O((\gamma^2 \log^2{n})/(h^2 (\gamma-1)^2))$ time for distributions with entropy $\Omega(h)$; for such distributions, we also show a lower bound of $\Omega((\log n)/(h(\gamma^2-1)+\gamma^2))$. Finally, we consider two special families of distributions: those in which the probabilities of the elements decrease monotonically with respect to a known ordering of the domain, and those that are uniform over a subset of the domain. In each case, we give more efficient algorithms for approximating the entropy.
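
The natural generation-oracle baseline against which such guarantees are stated is the plug-in (empirical) entropy of the sample, optionally with the Miller-Madow bias correction. A small sketch of that baseline is given below; it is not the paper's algorithm.

```python
import numpy as np
from collections import Counter

def plugin_entropy(samples, miller_madow=True):
    """Empirical (plug-in) Shannon entropy in bits from i.i.d. samples, with
    the optional Miller-Madow bias correction of (k - 1) / (2n) nats, where k
    is the number of distinct observed symbols and n the sample size."""
    n = len(samples)
    counts = np.array(list(Counter(samples).values()), dtype=float)
    p = counts / n
    h = -np.sum(p * np.log(p))               # plug-in estimate in nats
    if miller_madow:
        h += (len(counts) - 1) / (2.0 * n)   # first-order bias correction
    return h / np.log(2)                     # convert to bits
```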

Posted Content
TL;DR: It is shown that any property-testing algorithm in the combined oracle model for calculating a permutation-invariant function can be simulated in the random-order model in a single pass, addressing a question raised by Feigenbaum et al. regarding the relationship between property testing and streaming algorithms.
Abstract: In many problems in data mining and machine learning, data items that need to be clustered or classified are not points in a high-dimensional space, but are distributions (points on a high dimensional simplex). For distributions, natural measures of distance are not the $\ell_p$ norms and variants, but information-theoretic measures like the Kullback-Leibler distance, the Hellinger distance, and others. Efficient estimation of these distances is a key component in algorithms for manipulating distributions. Thus, sublinear resource constraints, either in time (property testing) or space (streaming) are crucial. We start by resolving two open questions regarding property testing of distributions. Firstly, we show a tight bound for estimating bounded, symmetric f-divergences between distributions in a general property testing (sublinear time) framework (the so-called combined oracle model). This yields optimal algorithms for estimating such well known distances as the Jensen-Shannon divergence and the Hellinger distance. Secondly, we close a $(\log n)/H$ gap between upper and lower bounds for estimating entropy $H$ in this model. In a stream setting (sublinear space), we give the first algorithm for estimating the entropy of a distribution. Our algorithm runs in polylogarithmic space and yields an asymptotic constant factor approximation scheme. We also provide other results along the space/time/approximation tradeoff curve.