Showing papers by "Yehuda Afek" published in 2011


Journal ArticleDOI
14 Jan 2011-Science
TL;DR: Modeling of development in the fruit fly yields an algorithm useful in designing wireless communication networks that combines two attractive features, and suggests that simple and efficient algorithms can be developed on the basis of biologically derived insights.
Abstract: Computational and biological systems are often distributed so that processors (cells) jointly solve a task, without any of them receiving all inputs or observing all outputs. Maximal independent set (MIS) selection is a fundamental distributed computing procedure that seeks to elect a set of local leaders in a network. A variant of this problem is solved during the development of the fly's nervous system, when sensory organ precursor (SOP) cells are chosen. By studying SOP selection, we derived a fast algorithm for MIS selection that combines two attractive features. First, processors do not need to know their degree; second, it has an optimal message complexity while only using one-bit messages. Our findings suggest that simple and efficient algorithms can be developed on the basis of biologically derived insights.
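
The abstract does not spell out the algorithm, so the following is only a minimal sketch of a generic randomized MIS selection that uses one-bit signals and no degree information. It should not be read as the paper's procedure. The function names, the fixed signaling probability `p`, and the two-exchange round structure are all assumptions made for illustration.

```python
import random


def select_mis(adj, p=0.5, seed=0, max_rounds=10_000):
    """Simulate a generic one-bit randomized MIS selection.

    adj maps each node to the set of its neighbors (undirected graph).
    The fixed probability p is an illustrative assumption.
    """
    rng = random.Random(seed)
    active = set(adj)
    mis = set()
    for _ in range(max_rounds):
        if not active:
            break
        # Exchange 1: each active node independently decides to signal.
        signaling = {v for v in active if rng.random() < p}
        # A signaling node wins only if no neighbor signaled as well.
        winners = {v for v in signaling if not (adj[v] & signaling)}
        # Exchange 2: winners announce themselves; they and their
        # neighbors leave the competition.
        mis |= winners
        retired = set(winners)
        for v in winners:
            retired |= adj[v]
        active -= retired
    return mis


def is_mis(adj, chosen):
    """Check that `chosen` is independent and maximal in `adj`."""
    independent = all(not (adj[v] & chosen) for v in chosen)
    maximal = all(v in chosen or (adj[v] & chosen) for v in adj)
    return independent and maximal
```

On a small graph, `is_mis(adj, select_mis(adj))` can be used to confirm that the selected leaders are pairwise non-adjacent and that every other node has a leader as a neighbor.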

177 citations


Book ChapterDOI
20 Sep 2011
TL;DR: This work considers the problem of computing a maximal independent set (MIS) in an extremely harsh broadcast model that relies only on carrier sensing, and proves a lower bound that shows that in this model, it is not possible to locally converge to an MIS in sub-polynomial time.
Abstract: We consider the problem of computing a maximal independent set (MIS) in an extremely harsh broadcast model that relies only on carrier sensing. The model consists of an anonymous broadcast network in which nodes have no knowledge about the topology of the network or even an upper bound on its size. Furthermore, it is assumed that nodes wake up asynchronously. At each time slot a node can either beep (i.e., emit a signal) or be silent. At a particular time slot, beeping nodes receive no feedback, while silent nodes can only differentiate between none of their neighbors beeping and at least one neighbor beeping. We start by proving a lower bound that shows that in this model, it is not possible to locally converge to an MIS in sub-polynomial time. We then study four different relaxations of the model which allow us to circumvent the lower bound and compute an MIS in polylogarithmic time. First, we show that if a polynomial upper bound on the network size is known, it is possible to find an MIS in O(log^3 n) time. Second, if sleeping nodes are awoken by neighboring beeps, then we can also find an MIS in O(log^3 n) time. Third, if in addition to this wakeup assumption we allow beeping nodes to receive feedback to identify if at least one neighboring node is beeping concurrently (i.e., sender-side collision detection) we can find an MIS in O(log^2 n) time. Finally, if instead we endow nodes with synchronous clocks, it is also possible to compute an MIS in O(log^2 n) time.
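
The feedback rule of the beep model described in this abstract can be sketched as a small function. This is an illustrative encoding only: the name `slot_feedback`, the dictionary-based graph, and the use of `None` for "no feedback" are assumptions, and the optional flag models the sender-side collision detection relaxation.

```python
def slot_feedback(adj, beeping, sender_collision_detection=False):
    """Return what each node perceives in a single time slot.

    adj maps each node to the set of its neighbors; `beeping` is the set
    of nodes that beep in this slot.
    Silent nodes learn True if at least one neighbor beeped, else False.
    Beeping nodes learn nothing (None) unless sender-side collision
    detection is enabled, in which case they learn the same one bit.
    """
    feedback = {}
    for node, neighbors in adj.items():
        heard = bool(neighbors & beeping)
        if node in beeping and not sender_collision_detection:
            feedback[node] = None
        else:
            feedback[node] = heard
    return feedback


# Path 0 - 1 - 2 - 3 with only node 0 beeping.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(slot_feedback(path, {0}))
# {0: None, 1: True, 2: False, 3: False}
```

Note that a silent node cannot tell how many neighbors beeped or which ones, which is what makes the model so restrictive.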

54 citations


Proceedings ArticleDOI
04 Jun 2011
TL;DR: It is shown that a simple adjustment within the allocator to control the spacing of blocks can provide better index coverage, which in turn reduces the superfluous conflict miss rate in various applications, improving performance with no observed negative consequences.
Abstract: Poor placement of data blocks in memory may negatively impact application performance because of an increase in the cache conflict miss rate [18]. For dynamically allocated structures this placement is typically determined by the memory allocator. Cache index-oblivious allocators may inadvertently place blocks on a restricted fraction of the available cache indexes, artificially and needlessly increasing the conflict miss rate. While some allocators are less vulnerable to this phenomenon, no general-purpose malloc allocator is index-aware and methodologically addresses this concern. We demonstrate that many existing state-of-the-art allocators are index-oblivious, admitting performance pathologies for certain block sizes. We show that a simple adjustment within the allocator to control the spacing of blocks can provide better index coverage, which in turn reduces the superfluous conflict miss rate in various applications, improving performance with no observed negative consequences. The result is an index-aware allocator. Our technique is general and can easily be applied to most memory allocators and to various processor architectures. Furthermore, we can reduce inter-thread and inter-process conflict misses for processors where threads concurrently share the level-1 cache such as the Sun UltraSPARC-T2™ and Intel "Nehalem" by coloring the placement of blocks so that allocations for different threads and processes start on different cache indexes.
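
To see why block spacing affects index coverage, the following sketch computes which cache indexes a run of equally spaced blocks falls on. It is not the paper's allocator. The cache geometry (64-byte lines, 128 sets), the function names, and the choice of padding the spacing by one line are hypothetical values picked for illustration.

```python
def cache_index(addr, line_size=64, num_sets=128):
    """Map a byte address to its cache set index (assumed geometry)."""
    return (addr // line_size) % num_sets


def indexes_covered(base, spacing, count, line_size=64, num_sets=128):
    """Return the set of cache indexes hit by `count` blocks that start
    at `base` and are placed `spacing` bytes apart."""
    return {
        cache_index(base + i * spacing, line_size, num_sets)
        for i in range(count)
    }


# Spacing equal to line_size * num_sets: every block starts on index 0.
print(len(indexes_covered(0, 8192, 64)))       # 1
# Padding the spacing by one cache line spreads the blocks out.
print(len(indexes_covered(0, 8192 + 64, 64)))  # 64
```

With the first spacing all 64 blocks compete for a single index, while the padded spacing lets each block start on its own index.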

31 citations


Book ChapterDOI
13 Dec 2011
TL;DR: It is shown empirically that the COP approach can enhance a software transactional memory (STM) framework to deliver more efficient concurrent data structures from serial source code and deliver performance comparable to that of more complex fine-grained structures.
Abstract: It is well known that guaranteeing program consistency when accessing shared data comes at the price of degraded performance and scalability. This paper initiates the investigation of consistency oblivious programming (COP). In COP, sections of concurrent code that meet certain criteria are executed without checking for consistency. However, checkpoints are added before any shared data modification to verify the algorithm was on the right track, and if not, it is re-executed in a more conservative and expensive consistent way. We show empirically that the COP approach can enhance a software transactional memory (STM) framework to deliver more efficient concurrent data structures from serial source code. In some cases the COP code delivers performance comparable to that of more complex fine-grained structures.
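
The pattern described in this abstract (an unchecked read phase, a checkpoint before any shared modification, and a conservative re-execution on failure) can be sketched on a sorted linked list. This is an illustration under stated assumptions, not the paper's implementation: a single global lock stands in for the STM, and the class and method names are hypothetical.

```python
import threading


class Node:
    def __init__(self, key, nxt=None):
        self.key = key
        self.next = nxt
        self.removed = False


class SortedList:
    """Sorted set of finite keys with sentinel head and tail nodes."""

    def __init__(self):
        self.head = Node(float("-inf"), Node(float("inf")))
        self.lock = threading.Lock()  # stand-in for the STM (assumption)

    def _search(self, key):
        pred, curr = self.head, self.head.next
        while curr.key < key:
            pred, curr = curr, curr.next
        return pred, curr

    def _locate(self, key):
        """Call with the lock held, after an unchecked search."""
        raise NotImplementedError

    def insert(self, key):
        # Read phase: traverse with no consistency checks at all.
        pred, curr = self._search(key)
        with self.lock:
            # Checkpoint: was the unchecked traversal on the right track?
            if pred.removed or pred.next is not curr:
                # Conservative path: redo the search under the lock.
                pred, curr = self._search(key)
            if curr.key == key:
                return False
            pred.next = Node(key, curr)
            return True

    def remove(self, key):
        pred, curr = self._search(key)
        with self.lock:
            if pred.removed or pred.next is not curr:
                pred, curr = self._search(key)
            if curr.key != key:
                return False
            curr.removed = True  # lets later checkpoints detect staleness
            pred.next = curr.next
            return True

    def to_list(self):
        out, curr = [], self.head.next
        while curr.key != float("inf"):
            out.append(curr.key)
            curr = curr.next
        return out
```

Removed nodes keep their `next` pointer, so an unchecked traversal that wanders onto one still reaches the tail sentinel, and the checkpoint then rejects the stale result.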

27 citations


01 Dec 2011
TL;DR: Proceedings of the 15th International Conference on Principles of Distributed Systems (OPODIS 2011), held in Toulouse, France, December 13-16, 2011.
Abstract: 15th International Conference, OPODIS 2011, Toulouse, France, December 13-16, 2011. Proceedings

26 citations


Proceedings ArticleDOI
06 Jun 2011
TL;DR: This paper considers the power of objects in the unbounded concurrency shared memory model, where there is an infinite set of processes and the number of processes active concurrently may increase without bound, and divides the infinite-consensus class of objects into two, those that can solve consensus for unbounded concurrency, and those that cannot.
Abstract: We consider the power of objects in the unbounded concurrency shared memory model, where there is an infinite set of processes and the number of processes active concurrently may increase without bound. By studying this model we obtain new results and observations that are relevant and meaningful to the standard bounded concurrency model. First we resolve an open problem from 2006 and provide, contrary to what was conjectured, an unbounded concurrency wait-free implementation of a swap object from 2-consensus objects. This construction resolves another puzzle that has eluded us for a long time, that of considerably simplifying a 16-year-old complicated bounded concurrency swap construction. A further insight into the traditional bounded concurrency model that we obtain by studying the unbounded concurrency model is a refinement of the top level of the wait-free hierarchy, the class of infinite-consensus number objects. First we resolve an open question of Merritt and Taubenfeld from 2003, showing that having n-consensus objects for all n does not imply consensus under unbounded concurrency. I.e., consensus alone, treated as a black box, cannot be "boosted" in this way. We continue to show an infinite-number consensus object that while able to perform consensus for any n-bounded concurrency (n unknown in advance) cannot solve consensus in the face of unbounded concurrency. This divides the infinite-consensus class of objects into two, those that can solve consensus for unbounded concurrency, and those that cannot.

10 citations


Book ChapterDOI
20 Sep 2011
TL;DR: A new highly scalable, high throughput asymmetric rendezvous system that outperforms prior synchronous queue and elimination array implementations under both symmetric and asymmetric workloads is presented.
Abstract: In an asymmetric rendezvous system, such as an unfair synchronous queue and an elimination array, threads of two types, consumers and producers, show up and are matched, each with a unique thread of the other type. Here we present a new highly scalable, high throughput asymmetric rendezvous system that outperforms prior synchronous queue and elimination array implementations under both symmetric and asymmetric workloads (more operations of one type than the other). Consequently, we also present a highly scalable elimination-based stack.

9 citations


Book ChapterDOI
09 May 2011
TL;DR: New algorithms and techniques are introduced that drastically reduce this space requirement by over 80%, with only a slight increase in the time overhead, thus making real-time compressed traffic inspection a viable option for network devices.
Abstract: Compressing web traffic using standard GZIP is becoming both popular and challenging due to the huge increase in wireless web devices, where bandwidth is limited. Security and other content-based networking devices are required to decompress the traffic of tens of thousands of concurrent connections in order to inspect the content for different signatures. The major limiting factor in this process is the high memory requirement of 32KB per connection, which leads to hundreds of megabytes to gigabytes of main memory consumption. This requirement inhibits most devices from handling compressed traffic, which in turn either limits traffic compression or introduces security holes and other dysfunctionalities. In this paper we introduce new algorithms and techniques that drastically reduce this space requirement by over 80%, with only a slight increase in the time overhead, thus making real-time compressed traffic inspection a viable option for network devices.
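
The memory figures in this abstract follow from simple arithmetic, sketched below. The 32 KB per connection and the 80% reduction come from the abstract; the connection count of 50,000 is a hypothetical stand-in for "tens of thousands", and the function name is an assumption.

```python
WINDOW_BYTES = 32 * 1024  # per-connection decompression state (from the abstract)


def inspection_memory_mib(connections, bytes_per_connection=WINDOW_BYTES):
    """Total memory, in MiB, needed to keep state for every connection."""
    return connections * bytes_per_connection / (1024 * 1024)


baseline = inspection_memory_mib(50_000)  # hypothetical connection count
print(baseline)                           # 1562.5
# A reduction of 80% leaves one fifth of the baseline, roughly 312 MiB.
print(round(baseline * 0.2))
```

At this scale the baseline already exceeds 1.5 GiB, which is consistent with the "hundreds of megabytes to gigabytes" range quoted in the abstract.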

7 citations


Posted Content
TL;DR: Oblivious protocols are introduced, a new framework for distributed computation with limited communication that can be extended to the well-known Adaptive Renaming problem, using a name-space that is as small as that of the optimal nonoblivious protocol.
Abstract: Communication is a crucial ingredient in every kind of collaborative work. But what is the least possible amount of communication required for a given task? We formalize this question by introducing a new framework for distributed computation, called oblivious protocols. We investigate the power of this model by considering two concrete examples, the musical chairs task $MC(n,m)$ and the well-known Renaming problem. The $MC(n,m)$ game is played by $n$ players (processors) with $m$ chairs. Players can occupy chairs, and the game terminates as soon as each player occupies a unique chair. Thus we say that player $P$ is in conflict if some other player $Q$ is occupying the same chair, i.e., termination means there are no conflicts. By known results from distributed computing, if $m \le 2n-2$, no strategy of the players can guarantee termination. However, there is a protocol with $m = 2n-1$ chairs that always terminates. Here we consider an oblivious protocol where in every time step the only communication is this: an adversarial scheduler chooses an arbitrary nonempty set of players, and for each of them provides only one bit of information, specifying whether the player is currently in conflict or not. A player notified not to be in conflict halts and never changes its chair, whereas a player notified to be in conflict changes its chair according to its deterministic program. Remarkably, even with this minimal communication, termination can be guaranteed with only $m=2n-1$ chairs. Likewise, we obtain an oblivious protocol for the Renaming problem whose name-space is as small as that of the optimal nonoblivious distributed protocol. Other aspects suggest themselves, such as the efficiency (program length) of our protocols. We make substantial progress here as well, though many interesting questions remain open.

2 citations


Proceedings ArticleDOI
30 May 2011
TL;DR: Lock Stealing is presented, a novel contention management algorithm for minimizing the effect of context switches by enabling threads to acquire locks which are held by other threads.
Abstract: Lock-based software transactional memory algorithms do not perform well in workloads with a high rate of context switches, which is caused for example by scheduling events or page faults. This occurs since threads that are switched-out by the operating system while holding locks block other threads from progressing, causing their transactions to abort repeatedly. We present here Lock Stealing, a novel contention management algorithm for minimizing the effect of context switches by enabling threads to acquire locks which are held by other threads. While some methods addressing this problem exist (e.g., schedctl in Solaris) they are best effort and only cover scheduling related context switches. In addition, they are platform specific and thus are not suitable or available in managed runtimes such as Java or .NET. In contrast, our approach is solely based on user-level code and is de-coupled from specific operating system events. We evaluate the performance of our approach on a set of benchmarks and observe improvements in both micro benchmarks and more elaborate test applications.

1 citation


20 Sep 2011
TL;DR: In this article, the authors consider the musical chairs task MC(n,m), where each player can only observe whether the player itself is in conflict or not, and nothing else, and show that even with minimal communication termination can be guaranteed with only m = 2n-1 chairs.
Abstract: We introduce oblivious protocols, a new framework for distributed computation with limited communication. Within this model we consider the musical chairs task MC(n,m), involving n players (processors) and m chairs. Initially, players occupy arbitrary chairs. Two players are in conflict if they both occupy the same chair. The task terminates when there are no conflicts and each player occupies a different chair. Our oblivious protocols use only limited communication, and do so in an asynchronous fashion. Essentially, a player can only observe whether the player itself is in conflict or not, and nothing else. A player observing no conflict halts and never changes its chair, whereas a player observing a conflict changes its chair according to its deterministic program. Known results imply that even with more general communication primitives, no strategy of the players can guarantee termination if m < 2n-1. We show that even with this minimal communication, termination can be guaranteed with only m = 2n-1 chairs. Our oblivious protocol can be extended to the well-known Adaptive Renaming problem, using a name-space that is as small as that of the optimal nonoblivious protocol. We also make substantial progress in optimizing other parameters (such as program length) for our protocols, though many interesting questions remain open.
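
The single bit each player can observe in MC(n,m), and the termination condition, can be sketched as follows. The deterministic chair-changing programs are the paper's contribution and are not reproduced here. The function names and the list encoding of chair positions are assumptions made for illustration.

```python
from collections import Counter


def conflict_bits(chairs):
    """chairs[i] is the chair currently occupied by player i.

    Returns one boolean per player: True if some other player occupies
    the same chair. This bit is all a player can observe.
    """
    occupancy = Counter(chairs)
    return [occupancy[chair] > 1 for chair in chairs]


def terminated(chairs):
    """The task is done when no player is in conflict."""
    return not any(conflict_bits(chairs))


# n = 3 players, m = 2n - 1 = 5 chairs numbered 0..4.
print(conflict_bits([0, 0, 3]))  # [True, True, False]
print(terminated([0, 2, 3]))     # True
```

In the example, player 2 observes no conflict and would halt on chair 3, while players 0 and 1 would each move according to their own program.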

Posted Content
TL;DR: A simple new randomized distributed MIS algorithm which uses only 1 bit unary messages, allows for asynchronous wake up, does not assume any knowledge of the network topology, and assumes only a loose bound on the network size is presented.
Abstract: Humans are very good at optimizing solutions for specific problems. Biological processes, on the other hand, have evolved to handle multiple constrained distributed environments and so they are robust and adaptable. Inspired by observations made in a biological system we have recently presented a simple new randomized distributed MIS algorithm \cite{ZScience}. Here we extend these results by removing a number of strong assumptions that we made, making the algorithms more practical. Specifically we present an $O(\log^2 n)$ rounds synchronous randomized MIS algorithm which uses only 1-bit unary messages (a beeping signal with collision detection), allows for asynchronous wake up, does not assume any knowledge of the network topology, and assumes only a loose bound on the network size. We also present an extension with no collision detection in which the round complexity increases to $O(\log^3 n)$. Finally, we show that our algorithm is optimal under some restriction, by presenting a tight lower bound of $\Omega(\log^2 n)$ on the number of rounds required to construct an MIS for a restricted model.