Topic

Downtime

About: Downtime is a research topic. Over its lifetime, 4,391 publications have been published within this topic, receiving 50,420 citations.


Papers
Proceedings Article

[...]

27 Jun 1995
TL;DR: A model for analyzing software rejuvenation in continuously-running applications is presented; downtime and the costs due to downtime during rejuvenation are expressed in terms of the model's parameters, and threshold conditions for rejuvenation to be beneficial are derived.
Abstract: Software rejuvenation is the concept of gracefully terminating an application and immediately restarting it at a clean internal state. In a client-server type of application where the server is intended to run perpetually to provide a service to its clients, rejuvenating the server process periodically during the most idle time of the server increases the availability of that service. In a long-running computation-intensive application, rejuvenating the application periodically and restarting it at a previous checkpoint increases the likelihood of successfully completing the application execution. We present a model for analyzing software rejuvenation in such continuously-running applications and express downtime and costs due to downtime during rejuvenation in terms of the parameters in that model. Threshold conditions for rejuvenation to be beneficial are also derived. We implemented a reusable module to perform software rejuvenation. That module can be embedded in any existing application on a UNIX platform with minimal effort. Experiences with software rejuvenation in a billing data collection subsystem of a telecommunications operations system and other continuously-running systems and scientific applications in AT&T are described.

883 citations
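
The trade-off described in the abstract above (scheduled rejuvenation downtime versus unscheduled failure downtime) can be illustrated with a small steady-state calculation. This is only a sketch under stated assumptions: the four states, the rate names (`r2`, `lam`, `r1`, `r3`, `r4`), and all numeric values are hypothetical and chosen for illustration; the abstract does not give the paper's actual model or parameters.

```python
def steady_state(r2, lam, r1, r3, r4):
    """Steady-state probabilities of a hypothetical 4-state Markov model.

    States: robust -> failure-probable (rate r2),
            failure-probable -> failed (rate lam),
            failure-probable -> rejuvenating (rate r4, 0 disables rejuvenation),
            failed -> robust (repair rate r1),
            rejuvenating -> robust (restart rate r3).
    Returns (p_robust, p_probable, p_failed, p_rejuvenating).
    """
    # Solve the balance equations relative to the robust state.
    rel_probable = r2 / (lam + r4)
    rel_failed = rel_probable * lam / r1
    rel_rejuv = rel_probable * r4 / r3
    total = 1.0 + rel_probable + rel_failed + rel_rejuv
    return (1.0 / total, rel_probable / total,
            rel_failed / total, rel_rejuv / total)


def downtime_and_cost(probs, hours, cost_failure, cost_rejuvenation):
    """Expected downtime (hours) and downtime cost over an interval."""
    _, _, p_failed, p_rejuv = probs
    downtime = (p_failed + p_rejuv) * hours
    cost = (p_failed * cost_failure + p_rejuv * cost_rejuvenation) * hours
    return downtime, cost


if __name__ == "__main__":
    # Assumed rates, per hour (illustrative only).
    rates = dict(r2=1 / 240, lam=1 / 720, r1=2.0, r3=6.0)
    for label, r4 in (("no rejuvenation", 0.0), ("rejuvenate ~every 2 weeks", 1 / 336)):
        probs = steady_state(r4=r4, **rates)
        downtime, cost = downtime_and_cost(probs, 8760, 1000.0, 10.0)
        print(f"{label}: downtime={downtime:.2f} h/yr, cost={cost:.0f}")
```

With these assumed numbers, rejuvenation raises total downtime slightly but lowers its cost, because scheduled downtime is assumed to be much cheaper per hour than an unplanned failure. Whether rejuvenation pays off depends on such parameters, which is the kind of threshold condition the abstract refers to.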

Journal Article

[...]

TL;DR: This review synthesizes the growing body of publications on rotating machinery prognostics and places it in context, identifying the merits and weaknesses of the techniques covered. It then discusses the identified challenges and, in doing so, alerts researchers to opportunities for advanced research in the field.
Abstract: Machinery prognosis is the forecast of the remaining operational life, future condition, or probability of reliable operation of a piece of equipment based on the acquired condition monitoring data. This approach to modern maintenance practice promises to reduce downtime, spares inventory, maintenance costs, and safety hazards. Given the significance of prognostics capabilities and the maturity of condition monitoring technology, there has been an increasing number of publications on rotating machinery prognostics in the past few years. These publications covered a wide spectrum of prognostics techniques. This review article first synthesises and places these individual pieces of information in context, while identifying their merits and weaknesses. It then discusses the identified challenges, and in doing so, alerts researchers to opportunities for conducting advanced research in the field. Current methods for predicting rotating machinery failures are summarised and classified as conventional reliability models, condition-based prognostics models and models integrating reliability and prognostics. Areas in need of development or improvement include the integration of condition monitoring and reliability, utilisation of incomplete trending data, consideration of effects from maintenance actions and variable operating conditions, derivation of the non-linear relationship between measured data and actual asset health, consideration of failure interactions, practicability of requirements and assumptions, as well as development of performance evaluation frameworks.

842 citations

Proceedings Article

[...]

16 Apr 2008
TL;DR: Remus is a high availability service that allows existing, unmodified software to be protected from the failure of the physical machine on which it runs. It encapsulates protected software in a virtual machine and asynchronously propagates changed state to a backup host at frequencies as high as forty times a second.
Abstract: Allowing applications to survive hardware failure is an expensive undertaking, which generally involves reengineering software to include complicated recovery logic as well as deploying special-purpose hardware; this represents a severe barrier to improving the dependability of large or legacy applications. We describe the construction of a general and transparent high availability service that allows existing, unmodified software to be protected from the failure of the physical machine on which it runs. Remus provides an extremely high degree of fault tolerance, to the point that a running system can transparently continue execution on an alternate physical host in the face of failure with only seconds of downtime, while completely preserving host state such as active network connections. Our approach encapsulates protected software in a virtual machine, asynchronously propagates changed state to a backup host at frequencies as high as forty times a second, and uses speculative execution to concurrently run the active VM slightly ahead of the replicated system state.

715 citations
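
The checkpoint-and-replicate idea in the abstract above can be sketched as a toy model. This is not Remus's implementation: the `Primary` and `Backup` classes and their methods are hypothetical, replication here is synchronous for simplicity (the abstract describes it as asynchronous), and holding externally visible output until the backup acknowledges a checkpoint is an assumption added so that a failover never exposes state the backup has not seen.

```python
from dataclasses import dataclass, field


@dataclass
class Backup:
    """Hypothetical backup host holding the last replicated state."""
    state: dict = field(default_factory=dict)

    def apply(self, delta: dict) -> bool:
        self.state.update(delta)
        return True  # acknowledgement that the checkpoint was received


@dataclass
class Primary:
    """Hypothetical protected machine that runs ahead of its backup."""
    backup: Backup
    state: dict = field(default_factory=dict)
    dirty: set = field(default_factory=set)
    pending_output: list = field(default_factory=list)
    released_output: list = field(default_factory=list)

    def write(self, key, value):
        # Speculative execution: mutate local state and track what changed.
        self.state[key] = value
        self.dirty.add(key)

    def emit(self, message):
        # Assumption: output is held back until the next checkpoint is acked.
        self.pending_output.append(message)

    def checkpoint(self):
        # Send only the state changed since the previous checkpoint.
        delta = {key: self.state[key] for key in self.dirty}
        self.dirty.clear()
        outputs, self.pending_output = self.pending_output, []
        if self.backup.apply(delta):
            self.released_output.extend(outputs)


if __name__ == "__main__":
    backup = Backup()
    primary = Primary(backup)
    primary.write("balance", 100)
    primary.emit("ACK deposit")
    primary.checkpoint()  # in a real system this would run many times a second
    print(backup.state, primary.released_output)
```

Sending only the dirty keys keeps each checkpoint small, which is what would make a high checkpoint frequency plausible in a design of this shape.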

Journal Article

[...]

TL;DR: This paper reviews the state of the art in the condition monitoring of wind turbines, describing the different maintenance strategies and condition monitoring techniques and methods, and highlighting in a table the various combinations of these that have been reported in the literature.
Abstract: Wind Turbines (WT) are one of the fastest growing sources of power production in the world today and there is a constant need to reduce the costs of operating and maintaining them. Condition monitoring (CM) is a tool commonly employed for the early detection of faults/failures so as to minimise downtime and maximize productivity. This paper provides a review of the state-of-the-art in the CM of wind turbines, describing the different maintenance strategies, CM techniques and methods, and highlighting in a table the various combinations of these that have been reported in the literature. Future research opportunities in fault diagnostics are identified using a qualitative fault tree analysis.

685 citations

Proceedings Article

[...]

10 Apr 2005
TL;DR: A system that uses virtual machine technology to provide fast, transparent application migration; the authors describe it as the first system that can migrate unmodified applications on unmodified mainstream Intel x86-based operating systems, including Microsoft Windows, Linux, Novell NetWare and others.
Abstract: This paper describes the design and implementation of a system that uses virtual machine technology [1] to provide fast, transparent application migration. This is the first system that can migrate unmodified applications on unmodified mainstream Intel x86-based operating systems, including Microsoft Windows, Linux, Novell NetWare and others. Neither the application nor any clients communicating with the application can tell that the application has been migrated. Experimental measurements show that for a variety of workloads, application downtime caused by migration is less than a second.

587 citations

Network Information
Related Topics (5)
Software: 130.5K papers, 2M citations, 78% related
Optimization problem: 96.4K papers, 2.1M citations, 75% related
Image processing: 229.9K papers, 3.5M citations, 74% related
Information system: 107.5K papers, 1.8M citations, 74% related
Cloud computing: 156.4K papers, 1.9M citations, 73% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    269
2022    578
2021    276
2020    337
2019    343
2018    286