
Showing papers by "Abraham D. Flaxman" published in 2009


Journal ArticleDOI
TL;DR: A new method for obtaining hierarchical clustering based on the optimization of a cost function over trees of limited depth is proposed, and a message-passing method is derived that allows one to use it efficiently.
Abstract: We propose a new method for hierarchical clustering based on the optimisation of a cost function over trees of limited depth, and we derive a message-passing method that allows one to solve it efficiently. The method and algorithm can be interpreted as a natural interpolation between two well-known approaches, namely single linkage and the recently presented Affinity Propagation. Using this general scheme, we analyze three biological/medical structured datasets (human population based on genetic information, proteins based on sequences, and verbal autopsies) and show that the interpolation technique provides new insight.

12 citations


Journal ArticleDOI
TL;DR: In this article, a method for hierarchical clustering based on the optimization of a cost function over trees of limited depth is proposed, and a message-passing algorithm is derived to solve it efficiently.
Abstract: We propose a new method for obtaining hierarchical clustering based on the optimization of a cost function over trees of limited depth, and we derive a message-passing method that allows one to use it efficiently. The method and the associated algorithm can be interpreted as a natural interpolation between two well-known approaches, namely that of single linkage and the recently presented affinity propagation. Using this general scheme, we analyse three biological/medical structured data sets (human population based on genetic information, proteins based on sequences, and verbal autopsies) and show that the interpolation technique provides new insight.

10 citations


Posted Content
TL;DR: In this article, it is shown that for satisfiable random k-CNF formulas at all densities above a density slightly above the satisfiability threshold (more precisely, at ratio (1+\eps) 2^k \ln 2, with \eps=\eps(k) tending to 0 as k grows), the diameter of the solution space is almost surely O(k 2^{-k} n).
Abstract: It is known that random k-CNF formulas have a so-called satisfiability threshold at a density (namely, clause-variable ratio) of roughly 2^k \ln 2: at densities slightly below this threshold almost all k-CNF formulas are satisfiable, whereas slightly above this threshold almost no k-CNF formula is satisfiable. In the current work we consider satisfiable random formulas and inspect another parameter: the diameter of the solution space (that is, the maximal Hamming distance between a pair of satisfying assignments). It was previously shown that for all densities up to a density slightly below the satisfiability threshold the diameter is almost surely at least roughly n/2 (and n at much lower densities). At densities very much higher than the satisfiability threshold, the diameter is almost surely zero (a very dense satisfiable formula is expected to have only one satisfying assignment). In this paper we show that for all densities above a density that is slightly above the satisfiability threshold (more precisely, at ratio (1+\eps) 2^k \ln 2, with \eps=\eps(k) tending to 0 as k grows) the diameter is almost surely O(k 2^{-k} n). This shows that a relatively small change in the density around the satisfiability threshold (a multiplicative (1+\eps) factor) makes a dramatic change in the diameter. This drop in the diameter cannot be attributed to the fact that a larger fraction of the formulas are not satisfiable (and hence have diameter 0), because the non-satisfiable formulas are excluded from consideration by our conditioning that the formula is satisfiable.
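The diameter defined in the abstract above can be illustrated with a brute-force sketch for tiny formulas. This is not the paper's method, only a direct reading of the definition; the function names, the signed-integer clause encoding (literal `v` means variable `v` is true, `-v` means it is false), and the choice to return `None` for unsatisfiable formulas are all assumptions made for illustration.

```python
from itertools import combinations, product


def satisfies(assignment, clauses):
    """Return True if the assignment (tuple of bools) satisfies every clause."""
    return all(
        any((lit > 0) == assignment[abs(lit) - 1] for lit in clause)
        for clause in clauses
    )


def solution_space_diameter(n, clauses):
    """Maximal Hamming distance between two satisfying assignments.

    Enumerates all 2**n assignments, so it is only usable for small n.
    Returns None when the formula is unsatisfiable (illustrative convention).
    """
    solutions = [
        a for a in product([False, True], repeat=n) if satisfies(a, clauses)
    ]
    if not solutions:
        return None
    # With a single solution there are no pairs, so the diameter is 0.
    return max(
        (sum(x != y for x, y in zip(a, b)) for a, b in combinations(solutions, 2)),
        default=0,
    )
```

For example, the single clause (x1 OR x2) is encoded as `[(1, 2)]`, while `[(1,), (2,)]` forces both variables to be true and so has exactly one satisfying assignment.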

1 citation