Author

S. Yemini

Bio: S. Yemini is an academic researcher. The author has contributed to research in topics: Event correlation & Complex event processing. The author has an h-index of 3, co-authored 3 publications receiving 659 citations.

Papers
Journal Article
TL;DR: The authors describe a network management system and illustrate its application to managing a distributed database application on a complex enterprise network.
Abstract: The authors describe a network management system and illustrate its application to managing a distributed database application on a complex enterprise network.

404 citations

Book Chapter
S. Klinger, S. Yemini, Yechiam Yemini, D. Ohsie, Salvatore J. Stolfo
01 Jan 1995
TL;DR: Preliminary benchmarks of the SEMS demonstrate that the coding approach provides a speedup of at least two orders of magnitude over other published correlation systems, and scales well to very large domains involving thousands of problems.
Abstract: This paper describes a novel approach to event correlation in networks based on coding techniques. Observable symptom events are viewed as a code that identifies the problems that caused them; correlation is performed by decoding the set of observed symptoms. The coding approach has been implemented in the SMARTS Event Management System (SEMS), as a server running under Sun Solaris 2.3. Preliminary benchmarks of the SEMS demonstrate that the coding approach provides a speedup of at least two orders of magnitude over other published correlation systems. In addition, it is resilient to high rates of symptom loss and false alarms. Finally, the coding approach scales well to very large domains involving thousands of problems.

239 citations
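
The decoding idea in the abstract above can be illustrated with a minimal sketch. The abstract does not state how SEMS performs the decoding, so the choice of minimum Hamming distance is an assumption here, and the codebook, problem names, and symptom names are all hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, Tuple

# Hypothetical codebook: each problem is identified by the set of
# symptoms it is expected to cause (its "code").
CODEBOOK: Dict[str, FrozenSet[str]] = {
    "link_down": frozenset({"s1", "s2", "s4"}),
    "router_overload": frozenset({"s2", "s3", "s5"}),
    "server_crash": frozenset({"s1", "s3", "s6"}),
}


def hamming_distance(a: FrozenSet[str], b: FrozenSet[str]) -> int:
    """Number of symptoms present in exactly one of the two sets."""
    return len(a ^ b)


def decode(observed: Iterable[str],
           codebook: Dict[str, FrozenSet[str]] = CODEBOOK) -> Tuple[str, int]:
    """Return the problem whose code is closest to the observed symptoms.

    Assumption: nearest-codeword decoding by Hamming distance. Ties are
    broken alphabetically so the result is deterministic.
    """
    seen = frozenset(observed)
    best = min(sorted(codebook),
               key=lambda problem: hamming_distance(seen, codebook[problem]))
    return best, hamming_distance(seen, codebook[best])


if __name__ == "__main__":
    print(decode({"s1", "s2", "s4"}))        # all expected symptoms observed
    print(decode({"s1", "s4"}))              # one symptom lost
    print(decode({"s2", "s3", "s5", "s6"}))  # one spurious symptom
```

Because the hypothetical codes differ from one another in several symptoms, a single lost or spurious symptom still decodes to the intended problem, which is one way to read the abstract's claim of resilience to symptom loss and false alarms.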

Book Chapter
16 May 1997
TL;DR: The MODEL language, which comprises the event modeling component of SMARTS’ InCharge™ event correlation system, is introduced; it is demonstrated that MODEL generalizes the capabilities of other systems and is more flexible.
Abstract: Event modeling is an essential component of event correlation systems; this paper introduces the MODEL language, which comprises the event modeling component of SMARTS’ InCharge™ event correlation system. We demonstrate the features of the MODEL language through examples from the multimedia Quality of Service (QoS) domain. In addition, we provide a comparison of MODEL with the event modeling capabilities of other event correlation systems; we demonstrate that MODEL generalizes the capabilities of other systems and is more flexible.

22 citations


Cited by
Proceedings Article
23 Jun 2002
TL;DR: This work presents a dynamic analysis methodology that automates problem determination in these environments by coarse-grained tagging of numerous real client requests as they travel through the system and using data mining techniques to correlate the believed failures and successes of these requests to determine which components are most likely to be at fault.
Abstract: Traditional problem determination techniques rely on static dependency models that are difficult to generate accurately in today's large, distributed, and dynamic application environments such as e-commerce systems. We present a dynamic analysis methodology that automates problem determination in these environments by 1) coarse-grained tagging of numerous real client requests as they travel through the system and 2) using data mining techniques to correlate the believed failures and successes of these requests to determine which components are most likely to be at fault. To validate our methodology, we have implemented Pinpoint, a framework for root cause analysis on the J2EE platform that requires no knowledge of the application components. Pinpoint consists of three parts: a communications layer that traces client requests, a failure detector that uses traffic-sniffing and middleware instrumentation, and a data analysis engine. We evaluate Pinpoint by injecting faults into various application components and show that Pinpoint identifies the faulty components with high accuracy and produces few false positives.

910 citations
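
The two steps described in the abstract above (tag requests with the components they touch, then correlate components with request outcomes) can be sketched roughly as follows. This is not Pinpoint's analysis engine: the abstract only says "data mining techniques", so the per-component failure rate used here is an assumed stand-in, and the trace format and component names are hypothetical.

```python
from collections import defaultdict
from typing import Iterable, List, Set, Tuple

# A trace is (components the request passed through, whether it failed).
Trace = Tuple[Set[str], bool]


def rank_components(traces: Iterable[Trace]) -> List[Tuple[str, float]]:
    """Rank components by the fraction of their requests that failed.

    Assumption: a plain failure-rate score, chosen for illustration only.
    Ties are broken alphabetically by component name.
    """
    used = defaultdict(int)         # requests that touched the component
    failed_with = defaultdict(int)  # of those, how many failed
    for components, failed in traces:
        for component in set(components):
            used[component] += 1
            if failed:
                failed_with[component] += 1
    scores = {c: failed_with[c] / used[c] for c in used}
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    example = [
        ({"web", "db"}, False),
        ({"web", "cart"}, True),
        ({"web", "cart", "db"}, True),
        ({"web", "search"}, False),
    ]
    for name, score in rank_components(example):
        print(f"{name}: {score:.2f}")
```

In this toy data every request that touched the hypothetical `cart` component failed, so it ranks first, while components shared by both failing and succeeding requests rank lower.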

Proceedings Article
01 Jan 2002
TL;DR: The EMERALD (Event Monitoring Enabling Responses to Anomalous Live Disturbances) environment is a distributed scalable tool suite for tracking malicious activity through and across large networks.
Abstract: The EMERALD (Event Monitoring Enabling Responses to Anomalous Live Disturbances) environment is a distributed scalable tool suite for tracking malicious activity through and across large networks. EMERALD introduces a highly distributed, building-block approach to network surveillance, attack isolation, and automated response. It combines models from research in distributed high-volume event-correlation methodologies with over a decade of intrusion detection research and engineering experience. The approach is novel in its use of highly distributed, independently tunable, surveillance and response monitors that are deployable polymorphically at various abstract layers in a large network. These monitors contribute to a streamlined event-analysis system that combines signature analysis with statistical profiling to provide localized real-time protection of the most widely used network services on the Internet. Equally important, EMERALD introduces a recursive framework for coordinating the dissemination of analyses from the distributed monitors to provide a global detection and response capability that can counter attacks occurring across an entire network enterprise. Further, EMERALD introduces a versatile application programmers' interface that enhances its ability to integrate with heterogeneous target hosts and provides a high degree of interoperability with third-party tool suites.

729 citations

Patent
24 May 1995
TL;DR: In this paper, a computer implemented method on a computer readable media is provided for determining the source of a problem in a complex system of managed components based upon symptoms, where the problem source identification process is split into different activities.
Abstract: A computer implemented method on a computer readable media is provided for determining the source of a problem in a complex system of managed components based upon symptoms. The problem source identification process is split into different activities. Explicit configuration non-specific representations of types of managed components, their problems, symptoms and the relations along which the problems or symptoms propagate are created that can be manipulated by executable computer code. A data structure is produced for determining the source of a problem by combining one or more of the representations based on information of specific instances of managed components in the system. Computer code is then executed which uses the data structure to determine the source of the problem from one or more symptoms.

487 citations

Journal ArticleDOI
Klaus Julisch
TL;DR: A novel alarm-clustering method is proposed that supports the human analyst in identifying root causes and shows that the alarm load decreases quite substantially if the identified root causes are eliminated so that they can no longer trigger alarms in the future.
Abstract: It is a well-known problem that intrusion detection systems overload their human operators by triggering thousands of alarms per day. This paper presents a new approach for handling intrusion detection alarms more efficiently. Central to this approach is the notion that each alarm occurs for a reason, which is referred to as the alarm's root cause. This paper observes that a few dozen rather persistent root causes generally account for over 90% of the alarms that an intrusion detection system triggers. Therefore, we argue that alarms should be handled by identifying and removing the most predominant and persistent root causes. To make this paradigm practicable, we propose a novel alarm-clustering method that supports the human analyst in identifying root causes. We present experiments with real-world intrusion detection alarms to show how alarm clustering helped us identify root causes. Moreover, we show that the alarm load decreases quite substantially if the identified root causes are eliminated so that they can no longer trigger alarms in the future.

481 citations

01 Jan 2002
TL;DR: Recovery Oriented Computing (ROC) takes the perspective that hardware faults, software bugs, and operator errors are facts to be coped with, not problems to be solved, and thus offers higher availability.
Abstract: It is time to broaden our performance-dominated research agenda. A four-order-of-magnitude increase in performance since the first ASPLOS in 1982 means that few outside the CS&E research community believe that speed is the only problem of computer hardware and software. Current systems crash and freeze so frequently that people become violent. Fast but flaky should not be our 21st century legacy. Recovery Oriented Computing (ROC) takes the perspective that hardware faults, software bugs, and operator errors are facts to be coped with, not problems to be solved. By concentrating on Mean Time to Repair (MTTR) rather than Mean Time to Failure (MTTF), ROC reduces recovery time and thus offers higher availability. Since a large portion of system administration is dealing with failures, ROC may also reduce total cost of ownership. One to two orders of magnitude reduction in cost mean that the purchase price of hardware and software is now a small part of the total cost of ownership. In addition to giving the motivation and definition of ROC, we introduce failure data for Internet sites that shows that the leading cause of outages is operator error. We also demonstrate five ROC techniques in five case studies, which we hope will influence designers of architectures and operating systems. If we embrace availability and maintainability, systems of the future may compete on recovery performance rather than just SPEC performance, and on total cost of ownership rather than just system price. Such a change may restore our pride in the architectures and operating systems we craft.

470 citations
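
The link between MTTR and availability in the abstract above can be made concrete with a small calculation. The abstract does not give a formula; the sketch below assumes the standard steady-state definition, availability = MTTF / (MTTF + MTTR), and the hour values are hypothetical.

```python
def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability, assuming MTTF / (MTTF + MTTR)."""
    if mttf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTTF must be positive and MTTR non-negative")
    return mttf_hours / (mttf_hours + mttr_hours)


if __name__ == "__main__":
    # Same failure rate, faster repair: availability rises.
    print(f"{availability(1000, 10):.6f}")  # slower recovery
    print(f"{availability(1000, 1):.6f}")   # faster recovery
```

Under this definition, shortening repair time raises availability without any change in how often failures occur.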