scispace - formally typeset
Topic

Backup

About: Backup is a research topic. Over its lifetime, 31,383 publications have been published within this topic, receiving 318,087 citations. The topic is also known as: back up and backup copy.


Papers
Journal ArticleDOI
TL;DR: To achieve efficient data dynamics, the existing proof of storage models are improved by manipulating the classic Merkle Hash Tree construction for block tag authentication, and an elegant verification scheme is constructed for the seamless integration of these two salient features in the protocol design.
Abstract: Cloud Computing has been envisioned as the next-generation architecture of IT Enterprise. It moves the application software and databases to the centralized large data centers, where the management of the data and services may not be fully trustworthy. This unique paradigm brings about many new security challenges, which have not been well understood. This work studies the problem of ensuring the integrity of data storage in Cloud Computing. In particular, we consider the task of allowing a third party auditor (TPA), on behalf of the cloud client, to verify the integrity of the dynamic data stored in the cloud. The introduction of TPA eliminates the involvement of the client through the auditing of whether his data stored in the cloud are indeed intact, which can be important in achieving economies of scale for Cloud Computing. The support for data dynamics via the most general forms of data operation, such as block modification, insertion, and deletion, is also a significant step toward practicality, since services in Cloud Computing are not limited to archive or backup data only. While prior works on ensuring remote data integrity often lack the support of either public auditability or dynamic data operations, this paper achieves both. We first identify the difficulties and potential security problems of direct extensions with fully dynamic data updates from prior works and then show how to construct an elegant verification scheme for the seamless integration of these two salient features in our protocol design. In particular, to achieve efficient data dynamics, we improve the existing proof of storage models by manipulating the classic Merkle Hash Tree construction for block tag authentication. To support efficient handling of multiple auditing tasks, we further explore the technique of bilinear aggregate signature to extend our main result into a multiuser setting, where TPA can perform multiple auditing tasks simultaneously. Extensive security and performance analyses show that the proposed schemes are highly efficient and provably secure.

1,307 citations
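
The abstract above builds on the classic Merkle Hash Tree. As a rough illustration only, the following is a minimal sketch of a static Merkle tree with membership proofs. It is not the paper's protocol: block tags, dynamic updates, and bilinear aggregate signatures are omitted, and SHA-256, the padding rule, and the function names `build_levels`, `prove`, and `verify` are assumptions made for this example.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Hash helper (SHA-256 is an assumption, not taken from the paper)."""
    return hashlib.sha256(data).digest()


def build_levels(blocks):
    """Return every level of the tree, leaf hashes first, root level last."""
    level = [h(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            # Pad odd levels by repeating the last node (one common convention).
            level = level + [level[-1]]
            levels[-1] = level
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def prove(levels, index):
    """Collect the sibling hashes on the path from leaf `index` to the root."""
    path = []
    for level in levels[:-1]:
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))
        index //= 2
    return path


def verify(block, path, root):
    """Recompute the root from one block and its sibling path."""
    node = h(block)
    for sibling, sibling_is_left in path:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


blocks = [b"block-0", b"block-1", b"block-2"]
levels = build_levels(blocks)
root = levels[-1][0]
print(verify(blocks[1], prove(levels, 1), root))
```

In this sketch a verifier that holds only the root can check a single block using a number of sibling hashes that grows logarithmically with the number of blocks, which is the property that makes the structure attractive for auditing.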

Patent
21 Oct 1996
TL;DR: This patent describes a system for backing up files from disk volumes on multiple nodes of a computer network to a common random-access backup storage means, where duplicate files (or portions of files) may be identified across nodes, so that only a single copy of the contents of the duplicate files or portions thereof is stored in the backup storage means.
Abstract: A system for backing up files from disk volumes on multiple nodes of a computer network to a common random-access backup storage means. As part of the backup process, duplicate files (or portions of files) may be identified across nodes, so that only a single copy of the contents of the duplicate files (or portions thereof) is stored in the backup storage means. For each backup operation after the initial backup on a particular volume, only those files which have changed since the previous backup are actually read from the volume and stored on the backup storage means. In addition, differences between a file and its version in the previous backup may be computed so that only the changes to the file need to be written on the backup storage means. All of these enhancements significantly reduce both the amount of storage and the amount of network bandwidth required for performing the backup. Even when the backup data is stored on a shared-file server, data privacy can be maintained by encrypting each file using a key generated from a fingerprint of the file contents, so that only users who have a copy of the file are able to produce the encryption key and access the file contents. To view or restore files from a backup, a user may mount the backup set as a disk volume with a directory structure identical to that of the entire original disk volume at the time of the backup.

1,113 citations
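
The cross-node duplicate elimination described in the abstract above can be sketched as a content-addressed store. This is a hedged illustration, not the patented design: the class name `DedupStore`, the use of SHA-256 as the fingerprint, and whole-file (rather than partial-file) matching are assumptions, and the fingerprint-derived encryption step is left out.

```python
import hashlib


class DedupStore:
    """Toy backup store that keeps one copy of each distinct file content."""

    def __init__(self):
        self.blobs = {}     # fingerprint -> file contents, stored once
        self.catalogs = {}  # node name -> {path: fingerprint}

    def backup(self, node: str, path: str, data: bytes) -> bool:
        """Record a file for a node; return True only if new data was stored."""
        fingerprint = hashlib.sha256(data).hexdigest()
        is_new = fingerprint not in self.blobs
        if is_new:
            self.blobs[fingerprint] = data
        self.catalogs.setdefault(node, {})[path] = fingerprint
        return is_new

    def restore(self, node: str, path: str) -> bytes:
        """Look up a file through the node's catalog."""
        return self.blobs[self.catalogs[node][path]]


store = DedupStore()
store.backup("node-a", "/etc/hosts", b"127.0.0.1 localhost\n")
store.backup("node-b", "/etc/hosts", b"127.0.0.1 localhost\n")
print(len(store.blobs))  # number of distinct contents held by the store
```

Each node keeps its own catalog, so restores stay per-node even though identical contents share one stored copy.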

Journal ArticleDOI
TL;DR: The pages of the index are organized in a special data structure, the B-tree, and an index of size 15000 (100000) can be maintained with an average of 9 (at least 4) transactions per second on an IBM 360/44 with a 2311 disc.
Abstract: Organization and maintenance of an index for a dynamic random access file is considered. It is assumed that the index must be kept on some pseudo random access backup store like a disc or a drum. The index organization described allows retrieval, insertion, and deletion of keys in time proportional to log_k I, where I is the size of the index and k is a device dependent natural number such that the performance of the scheme becomes near optimal. Storage utilization is at least 50% but generally much higher. The pages of the index are organized in a special data structure, so-called B-trees. The scheme is analyzed, performance bounds are obtained, and a near optimal k is computed. Experiments have been performed with indexes up to 100000 keys. An index of size 15000 (100000) can be maintained with an average of 9 (at least 4) transactions per second on an IBM 360/44 with a 2311 disc.

1,034 citations
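
To make the log_k I claim in the abstract above concrete, the sketch below computes the range of possible tree heights for a given index size. It assumes the usual B-tree convention that every page except the root holds between k and 2k keys and that the root holds at least one key; the value k = 60 in the usage line is purely illustrative and does not come from the abstract.

```python
def min_keys(k: int, height: int) -> int:
    """Fewest keys a B-tree of this height can hold (root with a single key)."""
    return 2 * (k + 1) ** (height - 1) - 1


def max_keys(k: int, height: int) -> int:
    """Most keys a B-tree of this height can hold (every page full at 2k keys)."""
    return (2 * k + 1) ** height - 1


def height_range(num_keys: int, k: int):
    """Return (lowest, highest) height possible for an index of num_keys keys."""
    lowest = 1
    while max_keys(k, lowest) < num_keys:
        lowest += 1
    highest = lowest
    while min_keys(k, highest + 1) <= num_keys:
        highest += 1
    return lowest, highest


# Hypothetical page parameter k = 60 with the 100000-key index size from the abstract.
print(height_range(100000, 60))
```

Because each lookup reads one page per level, the height is the number of page accesses per retrieval, so a larger k gives a shallower tree and fewer accesses to the backup store.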

Patent
26 Nov 1996
TL;DR: This patent presents a highly reliable computer memory storage system that is divided into subsystems, each of which is provided in triplicate: a primary subsystem, a backup subsystem, and a spare subsystem.
Abstract: A highly reliable computer memory storage system that is divided into subsystems, each of which is provided in triplicate: a primary subsystem, a backup subsystem and a spare subsystem. Upon detection of a non-recoverable failure in a primary subsystem, the backup subsystem substantially immediately assumes the tasks of the primary subsystem while the spare subsystem is integrated into the operation of the computer memory storage system. The triple replication of all subsystems and mechanisms for detecting failures in at least the primary and secondary subsystems provides an overall memory system which is highly reliable and substantially never requires servicing. In an alternative embodiment, three subsystems can share a load equally, for example a cooling or power supply load requirement. Each subsystem is built with sufficient extra capacity that, upon failure of any one or two of the three redundant subsystems, the remaining subsystem(s) can still supply the total power or cooling requirements of the system.

951 citations

Proceedings ArticleDOI
03 Nov 2013
TL;DR: D-Streams enable a parallel recovery mechanism that improves efficiency over traditional replication and backup schemes, and tolerates stragglers, and can easily be composed with batch and interactive query models like MapReduce, enabling rich applications that combine these modes.
Abstract: Many "big data" applications must act on data in real time. Running these applications at ever-larger scales requires parallel platforms that automatically handle faults and stragglers. Unfortunately, current distributed stream processing models provide fault recovery in an expensive manner, requiring hot replication or long recovery times, and do not handle stragglers. We propose a new processing model, discretized streams (D-Streams), that overcomes these challenges. D-Streams enable a parallel recovery mechanism that improves efficiency over traditional replication and backup schemes, and tolerates stragglers. We show that they support a rich set of operators while attaining high per-node throughput similar to single-node systems, linear scaling to 100 nodes, sub-second latency, and sub-second fault recovery. Finally, D-Streams can easily be composed with batch and interactive query models like MapReduce, enabling rich applications that combine these modes. We implement D-Streams in a system called Spark Streaming.

941 citations


Network Information
Related Topics (5)
Wireless: 133.4K papers, 1.9M citations, 82% related
Software: 130.5K papers, 2M citations, 81% related
The Internet: 213.2K papers, 3.8M citations, 80% related
Network packet: 159.7K papers, 2.2M citations, 80% related
Information system: 107.5K papers, 1.8M citations, 80% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2022    6
2021    510
2020    1,188
2019    1,941
2018    2,420
2017    2,382