Approximating the number of differences between remote sets
Citations
Collaborative data gathering in wireless sensor networks using measurement co-occurrence
Nye's Trie and Floret Estimators: Techniques for Detecting and Repairing Divergence in the SCADS Distributed Storage Toolkit
References
Space/time trade-offs in hash coding with allowable errors
The String-to-String Correction Problem
Summary cache: a scalable wide-area web cache sharing protocol
On the resemblance and containment of documents
Some complexity questions related to distributive computing (Preliminary Report)
Related Papers (5)
Statistical network protocol identification with unknown pattern extraction
Frequently Asked Questions (10)
Q2. What is the effect of false positives on the filter?
A false positive can prevent a valid set element (i.e., an element that is in the set intersection) from appearing in the resulting filter by reducing one of the valid element's hash locations to zero (or causing it to reach zero at some later time).
Q3. How is the entropy upper bounded to prove the theorem?
Using the normal distribution to upper bound this entropy gives H(pᵢ) ≤ ½ log( 2πe ( ∑_{i=0}^{∞} i² pᵢ − ( ∑_{i=0}^{∞} i pᵢ )² + 1/12 ) ), (9), which can be manipulated to prove the theorem.
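Inequality (9) is the standard maximum-entropy bound: the entropy of an integer-valued distribution is at most the differential entropy of a Gaussian with the same variance, plus a 1/12 correction for the unit-width support. A minimal numeric check (using an arbitrary binomial distribution as the example, not a distribution from the paper):

```python
import math

def entropy_bits(p):
    """Shannon entropy H(p) = -sum p_i log2 p_i, in bits."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def normal_upper_bound(p):
    """Gaussian bound of Eq. (9): H(p) <= 1/2 * log2(2*pi*e*(Var + 1/12)),
    where Var = sum i^2 p_i - (sum i p_i)^2 over the integer support."""
    mean = sum(i * pi for i, pi in enumerate(p))
    second = sum(i * i * pi for i, pi in enumerate(p))
    var = second - mean ** 2
    return 0.5 * math.log2(2 * math.pi * math.e * (var + 1 / 12))

# Example distribution: Binomial(20, 0.3) on {0, ..., 20}
w, q = 20, 0.3
p = [math.comb(w, i) * q**i * (1 - q)**(w - i) for i in range(w + 1)]
```

For any such distribution, `entropy_bits(p)` stays below `normal_upper_bound(p)`, which is what makes the bound usable in the proof.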
Q4. How many bits of communication are needed to partition f?
Yao showed that at least log2(d(f)) − 2 bits of communication are needed to correctly communicate f , with d(f) being the minimum number of monochromatic rectangles needed to partition f on M × N .
Q5. How many bits does the compressed form of a length-m wrapped filter require?
The compressed size of a length-m wrapped filter with k hash functions encoding n elements is (asymptotically) at most 1.42(1 − 1/m)kn + 0.12m bits.
Q6. What is the probability of a false positive of a Bloom filter?
The probability of a false positive of a Bloom filter for a set S is denoted Pf (S) and depends on the number of elements in the set |S|, the length of the Bloom filter m, and the number of (independent) hash functions k used to compute the Bloom filter.
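The dependence on |S|, m, and k is commonly captured by the standard approximation P_f ≈ (1 − e^{−kn/m})^k; a small sketch using that well-known formula (the exact expression in the paper may differ):

```python
import math

def bloom_false_positive(n, m, k):
    """Standard approximation of the Bloom filter false-positive
    probability: P_f(S) ~ (1 - e^{-kn/m})^k, for n = |S| elements,
    filter length m, and k independent hash functions."""
    return (1.0 - math.exp(-k * n / m)) ** k

# 1000 elements in a 10000-bit filter with k = 7 (near the optimum (m/n) ln 2)
p = bloom_false_positive(1000, 10000, 7)
```

Increasing m or tuning k toward (m/n) ln 2 drives this probability down; here `p` is a bit under 1%.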
Q7. What is the probability of a given wrapped filter having weight i?
Proof: Given an initial weight w = kn, the probability of a given wrapped filter location having weight i is given by a binomial distribution pᵢ = C(w, i) (1 − 1/m)^{w−i} (1/m)^i. Utilizing any scheme of entropy coding, the authors compress the average filter element to its entropy rate, H(p) = −∑_{i=0}^{w} pᵢ log(pᵢ).
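This per-cell distribution and its entropy rate can be computed directly; a minimal sketch with illustrative parameters (m, k, n chosen arbitrarily, not taken from the paper):

```python
import math

def weight_distribution(m, k, n):
    """Binomial distribution of a single wrapped-filter cell's weight:
    p_i = C(w, i) * (1 - 1/m)^(w - i) * (1/m)^i, with total weight w = k*n."""
    w = k * n
    return [math.comb(w, i) * (1 - 1/m) ** (w - i) * (1/m) ** i
            for i in range(w + 1)]

def entropy_bits(p):
    """Entropy rate H(p) = -sum p_i log2 p_i: the average compressed
    size of one filter cell under ideal entropy coding, in bits."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

# Example: m = 100 cells, k = 4 hash functions, n = 50 elements
dist = weight_distribution(m=100, k=4, n=50)
H = entropy_bits(dist)
```

With these parameters the average cell weight is kn/m = 2, and the per-cell entropy `H` comes out to roughly 2.5 bits, far below the log(kn) worst case.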
Q8. What is the probability of two sets of vertices not corresponding to an edge?
The authors can compute the probability q_{G′}(P) of two randomly chosen vertices not corresponding to an edge as follows: q_{G′}(P) = ∑_{i=0}^{2k} C(n, i) αⁱ (1 − α)^{n−i}, where two sets contain a given element with probability α = p² + (1 − p)².
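The sum is a binomial tail, so it evaluates directly; a minimal sketch with illustrative n, k, and p (not values from the paper):

```python
import math

def q_no_edge(n, k, alpha):
    """Probability that two randomly chosen vertices do not correspond
    to an edge: q = sum_{i=0}^{2k} C(n, i) * alpha^i * (1 - alpha)^(n - i),
    where alpha is the probability both sets agree on a given element."""
    return sum(math.comb(n, i) * alpha ** i * (1 - alpha) ** (n - i)
               for i in range(2 * k + 1))

p = 0.5
alpha = p**2 + (1 - p)**2   # = 0.5 when p = 0.5
q = q_no_edge(n=20, k=3, alpha=alpha)
```

With n = 20, k = 3, and α = 0.5 this is the probability that a Binomial(20, 0.5) variable is at most 6, about 5.8%.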
Q9. How many bits of communication does a Bloom filter require?
The price for this feature is that each entry can now take any of kn values (where n = |S| is the size of the set being wrapped), requiring a worst case of m log(kn) bits of storage memory and communication for a filter of size m; in contrast, Bloom filters require only m bits of communication.
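The worst-case overhead relative to a plain Bloom filter follows directly from that cell-value range; a small sketch with illustrative parameters (not taken from the paper):

```python
import math

def wrapped_filter_bits(m, k, n):
    """Worst-case storage for a length-m wrapped filter: each of the m
    cells can take any of k*n values, so m * log2(kn) bits in total."""
    return m * math.log2(k * n)

m, k, n = 1024, 4, 100
wrapped = wrapped_filter_bits(m, k, n)
bloom = m  # a plain Bloom filter of the same length: one bit per cell
```

For these parameters the uncompressed wrapped filter costs about 8.6x the Bloom filter's m bits, which is what motivates the entropy-coded compression discussed above.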
Q10. What are the qualities of a wrapped filter?
All these qualities make wrapped filters particularly suitable for the many network applications where there is a need to quickly and efficiently measure the consistency of distributed information.