Showing papers by "Yehuda Afek" published in 2012


Book ChapterDOI
16 Oct 2012
TL;DR: The CBTree is presented, a new counting-based self-adjusting binary search tree that moves more frequently accessed nodes closer to the root and improves performance over existing concurrent search trees on non-uniform access sequences derived from real workloads.
Abstract: We present the CBTree, a new counting-based self-adjusting binary search tree that, like splay trees, moves more frequently accessed nodes closer to the root. After m operations on n items, c of which access some item v, an operation on v traverses a path of length $\mathcal{O}(\log\dfrac{m}{c})$ while performing few if any rotations. In contrast to the traditional self-adjusting splay tree in which each accessed item is moved to the root through a sequence of tree rotations, the CBTree performs rotations infrequently (an amortized subconstant o(1) per operation if m≫n), mostly at the bottom of the tree. As a result, the CBTree scales with the amount of concurrency. We adapt the CBTree to a multicore setting and show experimentally that it improves performance compared to existing concurrent search trees on non-uniform access sequences derived from real workloads.
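The counting idea in this abstract can be illustrated with a small sequential sketch. It is not the CBTree algorithm itself: the promotion rule (`child.weight > 2 * rest`), the choice to count an insertion as one access, and all names below are assumptions made for illustration, and the sketch ignores concurrency entirely. It only shows how per-subtree access counters can trigger occasional rotations that pull a hot key toward the root.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.self_count = 1  # assumption: the insertion counts as one access
        self.weight = 1      # total accesses recorded in this subtree


def weight(node):
    return node.weight if node else 0


def refresh(node):
    # Recompute the subtree counter from the node's own count and its children.
    node.weight = node.self_count + weight(node.left) + weight(node.right)


def insert(root, key):
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    refresh(root)
    return root


def rotate_up(parent, child):
    # Single rotation that makes `child` the root of this subtree.
    if parent.left is child:
        parent.left, child.right = child.right, parent
    else:
        parent.right, child.left = child.left, parent
    refresh(parent)
    refresh(child)
    return child


def access(root, key):
    """Record one access to `key`; return the (possibly new) subtree root."""
    if root is None:
        raise KeyError(key)
    if key == root.key:
        root.self_count += 1
        root.weight += 1
        return root
    if key < root.key:
        root.left = child = access(root.left, key)
    else:
        root.right = child = access(root.right, key)
    root.weight += 1
    # Hypothetical rule: rotate only when the child's subtree is clearly
    # hotter than everything else below `root`.
    rest = root.weight - child.weight
    if child.weight > 2 * rest:
        return rotate_up(root, child)
    return root


def depth(root, key):
    d = 0
    while root.key != key:
        root = root.left if key < root.key else root.right
        d += 1
    return d


def inorder(root):
    return [] if root is None else inorder(root.left) + [root.key] + inorder(root.right)
```

Starting from a balanced tree on keys 1 to 7 and repeatedly calling `root = access(root, 7)`, key 7 climbs from depth 2 to the root while most accesses perform no rotation at all.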

64 citations


Book ChapterDOI
16 Oct 2012
TL;DR: Pessimistic lock-elision (PLE) is introduced, a new approach for non-speculatively replacing read-write locks with pessimistic software transactional code that allows read-write concurrency even for contended code and even if the code includes system calls.
Abstract: Read-write locks are one of the most prevalent lock forms in concurrent applications because they allow read accesses to locked code to proceed in parallel. However, they do not offer any parallelism between reads and writes. This paper introduces pessimistic lock-elision (PLE), a new approach for non-speculatively replacing read-write locks with pessimistic (i.e. non-aborting) software transactional code that allows read-write concurrency even for contended code and even if the code includes system calls. On systems with hardware transactional support, PLE will allow failed transactions, or ones that contain system calls, to preserve read-write concurrency. Our PLE algorithm is based on a novel encounter-order design of a fully pessimistic STM system that in a variety of benchmarks spanning from counters to trees, even when up to 40% of calls are mutating the locked structure, provides up to 5 times the performance of a state-of-the-art read-write lock.

32 citations


01 Jan 2012
TL;DR: In this article, the authors propose a new approach for non-speculatively replacing read-write locks with pessimistic (i.e., non-aborting) software transactional code that allows read-write concurrency even for contended code and even if the code includes system calls.
Abstract: Read-write locks are one of the most prevalent lock forms in concurrent applications because they allow read accesses to locked code to proceed in parallel. However, they do not offer any parallelism between reads and writes. This paper introduces pessimistic lock-elision (PLE), a new approach for non-speculatively replacing read-write locks with pessimistic (i.e. non-aborting) software transactional code that allows read-write concurrency even for contended code and even if the code includes system calls. On systems with hardware transactional support, PLE will allow failed transactions, or ones that contain system calls, to preserve read-write concurrency. Our PLE algorithm is based on a novel encounter-order design of a fully pessimistic STM system that in a variety of benchmarks spanning from counters to trees, even when up to 40% of calls are mutating the locked structure, provides up to 5 times the performance of a state-of-the-art read-write lock.

30 citations


Posted Content
TL;DR: In this paper, the authors investigate the minimal message survivability on a per-round basis that allows for the minimal global cooperation, i.e., that allows solving any task that is wait-free read-write solvable.
Abstract: We consider synchronous dynamic networks which, like radio networks, may have asymmetric communication links, and are affected by communication rather than processor failures. In this paper we investigate the minimal message survivability on a per-round basis that allows for the minimal global cooperation, i.e., allows solving any task that is wait-free read-write solvable. The paper completely characterizes this survivability requirement. Message survivability is formalized by considering adversaries that have a limited power to remove messages in a round. Removal of a message on a link in one direction does not necessarily imply the removal of the message on that link in the other direction. Surprisingly, there exists a single strongest adversary which solves any wait-free read/write task. Any different adversary that solves any wait-free read/write task is weaker, and any stronger adversary will not solve any wait-free read/write task. ABD \cite{ABD}, who considered processor failure, arrived at an adversary that is $n/2$ resilient, and consequently can solve tasks, such as $n/2$-set-consensus, which are not read/write wait-free solvable. With message adversaries, we arrive at an adversary which has exactly the read-write wait-free power. Furthermore, this adversary allows for a considerably simpler (simplest that we know of) proof that the protocol complex of any read/write wait-free task is a subdivided simplex, finally making this proof accessible for students with no algebraic-topology prerequisites, and alternatively dispensing with the assumption that the Immediate Snapshot complex is a subdivided simplex.

26 citations


Journal ArticleDOI
TL;DR: This paper introduces new algorithms and techniques that drastically reduce the space requirement for bump-in-the-wire devices such as security and other content-based networking tools, thus making real-time compressed traffic inspection a viable option for networking devices.

12 citations


Proceedings ArticleDOI
29 Oct 2012
TL;DR: The emerging multi-core computer architecture is used to design a general framework for mitigating network-based complexity attacks, called MCA2—Multi-Core Architecture for Mitigating Complexity Attacks.
Abstract: This paper takes advantage of the emerging multi-core computer architecture to design a general framework for mitigating network-based complexity attacks. In complexity attacks, an attacker carefully crafts "heavy" messages (or packets) such that each heavy message consumes substantially more resources than a normal message. Then, it sends a sufficient number of heavy messages to bring the system to a crawl at best. In our architecture, called MCA2 (Multi-Core Architecture for Mitigating Complexity Attacks), cores quickly identify such suspicious messages and divert them to a fraction of the cores that are dedicated to handle all the heavy messages. This keeps the rest of the cores relatively unaffected and free to provide the legitimate traffic the same quality of service as if no attack takes place. We demonstrate the effectiveness of our architecture by examining cache-miss complexity attacks against Deep Packet Inspection (DPI) engines. For example, for the Snort DPI engine, an attack in which 30% of the packets are malicious degrades the system throughput by over 50%, while with MCA2 the throughput drops by either 20% when no packets are dropped or by 10% in case dropping of heavy packets is allowed. At 60% malicious packets, the corresponding numbers are 70%, 40% and 23%.
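The diversion step described in this abstract can be sketched as a simple dispatcher. This is an illustration under assumptions, not the MCA2 implementation: the function name `dispatch`, the round-robin placement, and the caller-supplied `is_heavy` predicate are all hypothetical, and the abstract does not say how suspicious messages are detected or balanced across cores.

```python
from itertools import cycle


def dispatch(packets, num_cores, num_dedicated, is_heavy):
    """Assign packets to per-core queues, isolating heavy ones.

    The last `num_dedicated` cores receive only packets flagged by
    `is_heavy` (a hypothetical detector supplied by the caller); all other
    packets are spread round-robin over the remaining cores.
    """
    if not 0 < num_dedicated < num_cores:
        raise ValueError("need at least one regular and one dedicated core")
    queues = [[] for _ in range(num_cores)]
    regular = cycle(range(num_cores - num_dedicated))
    dedicated = cycle(range(num_cores - num_dedicated, num_cores))
    for packet in packets:
        core = next(dedicated) if is_heavy(packet) else next(regular)
        queues[core].append(packet)
    return queues
```

With four cores, one of them dedicated, and a predicate that flags costly packets, every flagged packet lands in the last queue and the first three queues see only normal traffic.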

10 citations


Journal ArticleDOI
TL;DR: This paper presents a highly scalable wait-free implementation of a concurrent size() operation based on a new lock-free interrupting snapshots algorithm, which scales well and significantly outperforms existing implementations.

6 citations


Journal ArticleDOI
TL;DR: It is derived that in a restricted class of eventual failure detectors there does not exist a single weakest oracle, but a weakest family of oracles $\zeta_n$, and every oracle that allows for solving renaming provides at least as much information about failures as one of the oracles in $\zeta_n$.
Abstract: We address the question of the weakest failure detector to circumvent the impossibility of $(2n-2)$-renaming in a system of up to $n$ participating processes. We derive that in a restricted class of eventual failure detectors there does not exist a single weakest oracle, but a weakest family of oracles $\zeta_n$: every two oracles in $\zeta_n$ are incomparable, and every oracle that allows for solving renaming provides at least as much information about failures as one of the oracles in $\zeta_n$. As a by-product, we obtain one more piece of evidence that renaming is strictly easier to solve than set agreement.

6 citations


Posted Content
TL;DR: In this paper, the authors consider the problem of computing a maximal independent set (MIS) in an extremely harsh broadcast model that relies only on carrier sensing, and prove a lower bound showing that it is not possible to locally converge to an MIS in sub-polynomial time.
Abstract: We consider the problem of computing a maximal independent set (MIS) in an extremely harsh broadcast model that relies only on carrier sensing. The model consists of an anonymous broadcast network in which nodes have no knowledge about the topology of the network or even an upper bound on its size. Furthermore, it is assumed that an adversary chooses at which time slot each node wakes up. At each time slot a node can either beep, that is, emit a signal, or be silent. At a particular time slot, beeping nodes receive no feedback, while silent nodes can only differentiate between none of their neighbors beeping and at least one of their neighbors beeping. We start by proving a lower bound that shows that in this model, it is not possible to locally converge to an MIS in sub-polynomial time. We then study four different relaxations of the model which allow us to circumvent the lower bound and find an MIS in polylogarithmic time. First, we show that if a polynomial upper bound on the network size is known, it is possible to find an MIS in $O(\log^3 n)$ time. Second, if we assume sleeping nodes are awoken by neighboring beeps, then we can also find an MIS in $O(\log^3 n)$ time. Third, if in addition to this wakeup assumption we allow sender-side collision detection, that is, beeping nodes can distinguish whether at least one neighboring node is beeping concurrently or not, we can find an MIS in $O(\log^2 n)$ time. Finally, if instead we endow nodes with synchronous clocks, it is also possible to find an MIS in $O(\log^2 n)$ time.
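The feedback rule of the beeping model, and the MIS condition the algorithms must reach, are small enough to write down directly. The sketch below covers only those two definitions and does not implement any of the paper's algorithms; the adjacency-dict graph representation and the function names are assumptions.

```python
def beep_round(adj, beepers):
    """Simulate one time slot of the beeping model.

    `adj` maps each node to the set of its neighbors and `beepers` is the
    set of nodes that beep in this slot. Beeping nodes receive no feedback,
    so they are absent from the result. Each silent node learns a single
    bit: whether at least one neighbor beeped.
    """
    return {
        v: any(u in beepers for u in adj[v])
        for v in adj
        if v not in beepers
    }


def is_mis(adj, chosen):
    """Check that `chosen` is a maximal independent set of the graph."""
    independent = all(u not in chosen for v in chosen for u in adj[v])
    maximal = all(v in chosen or any(u in chosen for u in adj[v]) for v in adj)
    return independent and maximal
```

On the path 0-1-2-3, if only node 0 beeps, node 1 hears a beep while nodes 2 and 3 hear silence, and neither of them can tell how far away the beeper is.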

4 citations


Posted Content
TL;DR: It is proved that for every series of tournaments that the adversary selects, it is still true that after two rounds of communication, the initial input of at least one processor reaches everyone.
Abstract: We think of a tournament T = ([n], E) as a communication network where in each round of communication processor Pi sends its information to Pj, for every directed edge ij ∈ E(T). By Landau’s theorem (1953) there is a King in T, i.e., a processor whose initial input reaches every other processor in two rounds or less. Namely, a processor P such that after two rounds of communication along T’s edges, the initial information of P reaches all other processors. Here we consider a more general scenario where an adversary selects an arbitrary series of tournaments T1, T2, ..., so that in each round s = 1, 2, ..., communication is governed by the corresponding tournament Ts. We prove that for every series of tournaments that the adversary selects, it is still true that after two rounds of communication, the initial input of at least one processor reaches everyone. Concretely, we show that for every two tournaments T1, T2 there is a vertex in [n] that can reach all vertices via (i) a step in T1, or (ii) a step in T2, or (iii) a step in T1 followed by a step in T2.
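The two-tournament statement can be checked exhaustively for very small n. The sketch below is only a brute-force sanity check, not the paper's proof; the edge-set representation and the function names are assumptions. A processor holds any information it has received, so after round 1 (along T1) and round 2 (along T2) the informed set covers exactly cases (i) to (iii).

```python
from itertools import combinations, product


def all_tournaments(n):
    """Yield every tournament on vertices 0..n-1 as a set of directed edges."""
    pairs = list(combinations(range(n), 2))
    for bits in product((0, 1), repeat=len(pairs)):
        yield {(i, j) if b else (j, i) for (i, j), b in zip(pairs, bits)}


def informed_after_two_rounds(t1, t2, source):
    """Vertices holding `source`'s input after one round on t1, then one on t2."""
    informed = {source}
    for edges in (t1, t2):
        # Every currently informed vertex forwards along its outgoing edges.
        informed |= {j for (i, j) in edges if i in informed}
    return informed


def has_two_round_king(n, t1, t2):
    """True if some vertex's input reaches all n vertices within two rounds."""
    everyone = set(range(n))
    return any(informed_after_two_rounds(t1, t2, v) == everyone for v in range(n))
```

Taking `t1 == t2` recovers the single-tournament king of Landau's theorem as a special case.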

1 citation


Posted Content
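TL;DR: In this paper, it was shown that in the Musical Chairs game with $n$ players and $m$ chairs the team has a winning strategy for $m \ge 2n-1$, and that this bound is tight: for $m \le 2n-2$ the scheduler has a strategy that is guaranteed to make the game continue indefinitely and thus win.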
Abstract: In the Musical Chairs game $MC(n,m)$, a team of $n$ players plays against an adversarial scheduler. The scheduler wins if the game proceeds indefinitely, while termination after a finite number of rounds is declared a win of the team. At each round of the game each player occupies one of the $m$ available chairs. Termination (and a win of the team) is declared as soon as each player occupies a unique chair. Two players that simultaneously occupy the same chair are said to be in conflict. In other words, termination (and a win for the team) is reached as soon as there are no conflicts. The only means of communication throughout the game is this: at every round of the game, the scheduler selects an arbitrary nonempty set of players who are currently in conflict, and notifies each of them separately that it must move. A player who is thus notified changes its chair according to its deterministic program. As we show, for $m\ge 2n-1$ chairs the team has a winning strategy. Moreover, using topological arguments we show that this bound is tight. For $m\leq 2n-2$ the scheduler has a strategy that is guaranteed to make the game continue indefinitely and thus win. We also have some results on additional interesting questions. For example, if $m \ge 2n-1$ (so that the team can win), how quickly can they achieve victory?
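The rules of the game can be made concrete with a small simulator. This is a sketch of the game mechanics only, not the winning strategy from the paper. It assumes a simplified form of "deterministic program": each player cycles through a fixed list of chairs, advancing one position each time it is told to move. The names `in_conflict` and `play`, and the `max_rounds` cutoff standing in for "indefinitely", are hypothetical.

```python
from collections import Counter


def in_conflict(positions):
    """Return the set of players that share a chair with another player."""
    counts = Counter(positions)
    return {p for p, chair in enumerate(positions) if counts[chair] > 1}


def play(programs, scheduler, max_rounds=1000):
    """Run the game and return the number of rounds until no conflict remains.

    `programs[p]` is the cyclic chair sequence of player p (an assumed,
    simplified form of a deterministic program). `scheduler` receives the
    set of conflicted players and must return a nonempty subset of it.
    Returns None if conflicts persist after `max_rounds` rounds.
    """
    step = [0] * len(programs)
    positions = [program[0] for program in programs]
    for rounds in range(max_rounds):
        conflicted = in_conflict(positions)
        if not conflicted:
            return rounds
        chosen = set(scheduler(conflicted))
        if not chosen or not chosen <= conflicted:
            raise ValueError("scheduler must pick a nonempty set of conflicted players")
        for p in chosen:
            step[p] = (step[p] + 1) % len(programs[p])
            positions[p] = programs[p][step[p]]
    return None if in_conflict(positions) else max_rounds
```

For example, two players with programs `[0, 1]` and `[0, 2]` separate after a single round whichever conflicted players are notified, whereas two players who both run `[0, 1]` on two chairs stay in conflict forever if the scheduler always moves both.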

01 Apr 2012
TL;DR: This paper presents a highly scalable wait-free implementation of a concurrent size() operation based on a new lock-free interrupting snapshots algorithm for the classical atomic snapshot problem, significantly outperforming existing implementations.
Abstract: The Java™ developers kit requires a size() operation for all objects. Unfortunately, the best known solution, available in the Java concurrency package, has a blocking concurrent implementation that does not scale. This paper presents a highly scalable wait-free implementation of a concurrent size() operation based on a new lock-free interrupting snapshots algorithm for the classical atomic snapshot problem. This is perhaps the first example of the potential benefit from using atomic snapshots in real industrial code (the concurrency package is currently deployed on over 10 million desktops). The key idea behind the new algorithm is to allow snapshot scans to interrupt each other until they agree on a shared linearization point with respect to updates, rather than trying, as was done in the past, to have them coordinate the collecting of a shared global view. As we show, the new algorithm scales well, significantly outperforming existing implementations.
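For context on the atomic snapshot problem mentioned here, the sketch below shows the classic double-collect scan applied to per-thread counters. It is explicitly not the interrupting snapshots algorithm of the paper: a double-collect scan simply retries until two consecutive collects agree, so it can be delayed for as long as updates keep arriving and is not wait-free. The class name `StripedSize` and the (count, version) cell layout are assumptions for illustration.

```python
class StripedSize:
    """Per-thread counters whose sum is the collection size (illustrative)."""

    def __init__(self, num_threads):
        # One single-writer cell per thread: (count, version).
        self.cells = [(0, 0)] * num_threads

    def update(self, thread_id, delta):
        # Called by thread `thread_id` after it adds (+1) or removes (-1) an item.
        count, version = self.cells[thread_id]
        self.cells[thread_id] = (count + delta, version + 1)

    def collect(self):
        # Read all cells one after another; on its own this is not atomic.
        return list(self.cells)

    def size(self):
        # Double collect: two identical collects mean no update landed in
        # between, so the view is a consistent snapshot of all cells.
        while True:
            first = self.collect()
            second = self.collect()
            if first == second:
                return sum(count for count, _ in first)
```

The version number in each cell is what lets the scan notice an update that leaves the count unchanged, such as an add followed by a remove by the same thread.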