Showing papers on "Data Corruption published in 2009"

PDF

Open Access

Proceedings Article•DOI•

Error propagation analysis for file systems

[...]

Cindy Rubio-González¹, Haryadi S. Gunawi¹, Ben Liblit¹, Remzi H. Arpaci-Dusseau¹, Andrea C. Arpaci-Dusseau¹ - Show less +1 more•Institutions (1)

University of Wisconsin-Madison¹

15 Jun 2009

TL;DR: This work proposes an interprocedural static analysis that tracks errors as they propagate through file system code, and detects overwritten, out-of-scope, and unsaved unchecked errors.

...read moreread less

Abstract: Unchecked errors are especially pernicious in operating system file management code. Transient or permanent hardware failures are inevitable, and error-management bugs at the file system layer can cause silent, unrecoverable data corruption. We propose an interprocedural static analysis that tracks errors as they propagate through file system code. Our implementation detects overwritten, out-of-scope, and unsaved unchecked errors. Analysis of four widely-used Linux file system implementations (CIFS, ext3, IBM JFS and ReiserFS), a relatively new file system implementation (ext4), and shared virtual file system (VFS) code uncovers 312 error propagation bugs. Our flow- and context-sensitive approach produces more precise results than related techniques while providing better diagnostic information, including possible execution paths that demonstrate each bug found.

...read moreread less

80 citations

Journal Article•DOI•

DRAM errors in the wild

[...]

SchroederBianca, PinheiroEduardo, WeberWolf-Dietrich

15 Jun 2009-Sigmetrics Performance Evaluation Review

TL;DR: This research presents a meta-modelling architecture that automates the very labor-intensive and therefore time-heavy and therefore expensive process of manually cataloging and reprograming DRAM modules for use in compute clusters.

...read moreread less

Abstract: Errors in dynamic random access memory (DRAM) are a common form of hardware failure in modern compute clusters. Failures are costly both in terms of hardware replacement costs and service disruptio...

...read moreread less

48 citations

Proceedings Article•DOI•

Evaluating the impact of Undetected Disk Errors in RAID systems

[...]

Eric W. D. Rozier¹, Wendy A. Belluomini¹, Veera W. Deenadhayalan¹, J. L. Hafner¹, KK Rao¹, Pin Zhou¹ - Show less +2 more•Institutions (1)

IBM¹

29 Sep 2009

TL;DR: Results indicate that corruption from UDEs is a significant problem in the absence of protection schemes and that such schemes dramatically decrease the rate of undetected data corruption.

...read moreread less

Abstract: Despite the reliability of modern disks, recent studies have made it clear that a new class of faults, UndetectedDisk Errors (UDEs) also known as silent data corruption events, become a real challenge as storage capacity scales. While RAID systems have proven effective in protecting data from traditional disk failures, silent data corruption events remain a significant problem unaddressed by RAID. We present a fault model for UDEs, and a hybrid framework for simulating UDEs in large-scale systems. The framework combines a multi-resolution discrete event simulator with numerical solvers. Our implementation enables us to model arbitrary storage systems and workloads and estimate the rate of undetected data corruptions. We present results for several systems and workloads, from gigascale to petascale. These results indicate that corruption from UDEs is a significant problem in the absence of protection schemes and that such schemes dramatically decrease the rate of undetected data corruption.

...read moreread less

32 citations

Flicker: Saving Refresh-Power in Mobile Devices through Critical Data Partitioning

[...]

Song Liu, Karthik Pattabiraman, Thomas Moscibroda, Benjamin G. Zorn¹•Institutions (1)

Microsoft¹

07 Oct 2009

TL;DR: Flicker explores a novel and interesting trade-off between energy consumption and hardware correctness, and shows that many mobile applications are naturally tolerant to errors in the non-critical data, and in the vast majority of cases, the errors have little or no impact on the application’s final outcome.

...read moreread less

Abstract: Mobile devices are left in sleep mode for long periods of time But even while in sleep mode, the contents of DRAM memory need to be periodically refreshed, which consumes a significant fraction of power in mobile devices This paper introduces Flicker, an application-level technique to reduce refresh power in DRAM memories Flicker enables developers to specify critical and non-critical data in programs and the runtime system allocates this data in separate parts of memory The portion of memory containing critical data is refreshed at the regular refresh-rate, while the portion containing non-critical data is refreshed at substantially lower rates This saves energy at the cost of a modest increase in data corruption in the non-critical data Flicker thus explores a novel and interesting trade-off between energy consumption and hardware correctness We show that many mobile applications are naturally tolerant to errors in the non-critical data, and in the vast majority of cases, the errors have little or no impact on the application’s final outcome We also find that Flicker can save between 20-25% of the power consumed by the memory subsystem in a mobile device, with negligible impact on application performance Flicker is implemented almost entirely in software, and requires only modest changes to the application, operating system and hardware

...read moreread less

30 citations

Patent•

Method, apparatus, and device for protecting against programming attacks and/or data corruption

[...]

Scott A. Krig¹•Institutions (1)

Advanced Micro Devices¹

27 Oct 2009

TL;DR: In this article, signature verification policy information associated with each of the plurality of target memory segments is evaluated during run-time, and the signature verification is then repeatedly performed, during runtime, on each of a plurality of memory segments based on the programmed signature verification policies associated with a target memory segment.

...read moreread less

Abstract: The method and accompanying apparatus and device protects against programming attacks and/or data corruption by computer viruses, malicious code, or other types of corruption. In one example, signature verification policy information that identifies a plurality of policies associated with a plurality of target memory segments is programmed during a secure boot process. The programmed signature verification policy information associated with each of the plurality of target memory segments is then evaluated during run-time. Signature verification is then repeatedly performed, during run-time, on each of the plurality of target memory segments based on the programmed signature verification policy information associated with each target memory segment.

...read moreread less

25 citations

Proceedings Article•DOI•

Multiscale electrophysiology format: An open-source electrophysiology format using data compression, encryption, and cyclic redundancy check

[...]

Benjamin H. Brinkmann¹, Mark R. Bower¹, Keith A. Stengel, Gregory A. Worrell¹, Matt Stead¹ - Show less +1 more•Institutions (1)

University of Rochester¹

13 Nov 2009

TL;DR: A novel file format that employs range encoding to provide a high degree of data compression, a three-tiered 128-bit encryption system for patient information and data security, and a 32-bit cyclic redundancy check to verify the integrity of compressed data blocks is presented.

...read moreread less

Abstract: Continuous, long-term (up to 10 days) electrophysiological monitoring using hybrid intracranial electrodes is an emerging tool for presurgical epilepsy evaluation and fundamental investigations of seizure generation. Detection of high-frequency oscillations and microseizures could provide valuable insights into causes and therapies for the treatment of epilepsy, but requires high spatial and temporal resolution. Our group is currently using hybrid arrays composed of up to 320 micro- and clinical macroelectrode arrays sampled at 32 kHz per channel with 18-bits of A/D resolution. Such recordings produce approximately 3 terabytes of data per day. Existing file formats have limited data compression capabilities, and do not offer mechanisms for protecting patient identifying information or detecting data corruption during transmission or storage. We present a novel file format that employs range encoding to provide a high degree of data compression, a three-tiered 128-bit encryption system for patient information and data security, and a 32-bit cyclic redundancy check to verify the integrity of compressed data blocks. Open-source software to read, write, and process these files are provided.

...read moreread less

16 citations

Proceedings Article•DOI•

Uncovering errors: the cost of detecting silent data corruption

[...]

Sumit Narayan¹, John A. Chandy¹, Samuel Lang², Philip Carns², Robert Ross² - Show less +1 more•Institutions (2)

University of Connecticut¹, Argonne National Laboratory²

14 Nov 2009

TL;DR: This paper assesses the cost of providing data integrity on a parallel file system and presents an approach that provides this capability with as low as 5% overhead for writes and 22% overheadFor reads for aligned requests and some additional cost for unaligned requests.

...read moreread less

Abstract: Data integrity is pivotal to the usefulness of any storage system. It ensures that the data stored is free from any modification throughout its existence on the storage medium. Hash functions such as cyclic redundancy checks or check-sums are frequently used to detect data corruption during its transmission to permanent storage or its stay there. Without these checks, such data errors usually go undetected and unreported to the system and hence are not communicated to the application. They are referred as "silent data corruption." When an application reads corrupted or malformed data, it leads to incorrect results or a failed system. Storage arrays in leadership computing facilities comprise several thousands of drives, thus increasing the likelihood of such failures. These environments mandate a file system capable of detecting data corruption. Parallel file systems have traditionally ignored providing integrity checks because of the high computational cost, particularly in dealing with unaligned data request from scientific applications. In this paper, we assess the cost of providing data integrity on a parallel file system. We present an approach that provides this capability with as low as 5% overhead for writes and 22% overhead for reads for aligned requests and some additional cost for unaligned requests.

...read moreread less

12 citations

Book Chapter•DOI•

Just one bit in a million: on the effects of data corruption in files

[...]

Volker Heydegger¹•Institutions (1)

University of Cologne¹

27 Sep 2009

TL;DR: The research work is based on a study on the status quo of file format robustness for various file formats from the image domain, and a controlled test corpus was built which comprises files with different format characteristics.

...read moreread less

Abstract: So far little attention has been paid to file format robustness, ie, a file formats capability for keeping its information as safe as possible in spite of data corruption The paper on hand reports on the first comprehensive research on this topic The research work is based on a study on the status quo of file format robustness for various file formats from the image domain A controlled test corpus was built which comprises files with different format characteristics The files are the basis for data corruption experiments which are reported on and discussed

...read moreread less

8 citations

Patent•

Data corruption detection

[...]

Judy Lynn Westby¹, Rodney Blake¹•Institutions (1)

Seagate Technology¹

26 Aug 2009

TL;DR: In this paper, a data storage device may include a first error-related code generating unit configured to generate first error related code based on received data and combine the first errorrelated code and the received data to generate a first data stream.

...read moreread less

Abstract: In general, this disclosure relates to various techniques for detecting corrupt bits in a data stream. The techniques may allow a data storage device to detect corrupt bits prior to transformation of the data stream and subsequent to transformation of the data stream. A data storage device may include a first error-related code generating unit configured to generate a first error-related code based on received data and combine the first error-related code and the received data to generate a first data stream. The data storage device may further include a transform unit configured to transform the first data stream to a transformed data stream. The data storage device may also include a second error-related code generating unit configured to generate a second error-related code based on the transformed data stream.

...read moreread less

4 citations

Patent•

System and method for highly reliable data replication

[...]

HaiHong Wang¹•Institutions (1)

EMC Corporation¹

19 Feb 2009

TL;DR: The reliable protocol as discussed by the authors is a protocol capable of detecting most but not all data corruption introduced by the communication channel and undetected by the reliable protocol at the destination data replication device.

...read moreread less

Abstract: Data replication includes generating replication data that is part of a replicated file system to be sent over a communication channel to a destination replication device; adding additional verification information to at least a portion of the replication data to prevent data corruption; and sending the replication data and the additional verification information over the communication channel to the destination replication device. The replication data with additional verification information is sent over the communication channel using a reliable protocol that allows the replication data to be verified by the reliable protocol at the destination replication device. The reliable protocol is a protocol capable of detecting most but not all data corruption introduced by the communication channel. The additional verification information includes information for verifying that replication data sent using the reliable protocol does not include data corruption that was introduced by the communication channel and undetected by the reliable protocol.

...read moreread less

4 citations

Reliable audiovisual archiving using unreliable storage technology and services

[...]

Matthew Addis, Richard Lowe, Nicola Salvo, Lee Middleton

01 Sep 2009

TL;DR: This paper presents ongoing work in the UK AVATAR-m project and in the recently started EC PrestoPrime project on a framework for storing large audiovisual files on heterogeneous and distributed storage infrastructures that allows various strategies for content replication, integrity monitoring and repair to be developed and tested.

...read moreread less

Abstract: The drive for online access to archive content within ‘tapeless’ workflows means that mass-storage technology is an inevitable part of modern archive solutions, either in-house or provided as services by third-parties. But are these solutions safe? Can they assure the data integrity needed for long-term preservation of Petabyte volumes of data? The answer is no. Field studies reveal data corruption can take place silently without detection or correction, including in 'enterprise class' systems explicitly designed to prevent data loss. The reality is that data loss is inevitable to some degree or another from hardware failures, software bugs, and human errors. This paper presents ongoing work in the UK AVATAR-m project and in the recently started EC PrestoPrime project on a framework for storing large audiovisual files on heterogeneous and distributed storage infrastructures that allows various strategies for content replication, integrity monitoring and repair to be developed and tested.

...read moreread less

Patent•

System and method for refreshing data corruption of certain raw device

[...]

Lei He, Xuantong Chen, Wenhan Liu

04 Feb 2009

TL;DR: In this paper, a system and a method for refreshing the specified raw device dirty data, which are applied to a Linux system, is presented, which can write the dirty data of a single raw device into a disk according to the requirement under the uninterrupted service situation, so as to provide convenient, efficient and safe dirty data disk writing.

...read moreread less

Abstract: Disclosed are a system and a method for refreshing the specified raw device dirty data, which are applied to a Linux system The invention sends the command parameter in the correct format to the Linux kernel through judging and refreshing the command parameter format of the specified raw device dirty data (Dirty Data) The data structure of the specified raw device is searched according to the command parameter to obtain the quick search tree of the specified raw device; finally, all the dirty data pages of the specified raw device are searched out from the quick search tree; the dirty data pages are refreshed to a disk in the synchronous or asynchronous mode The system and the method can write the dirty data of a single raw device into a disk according to the requirement under the uninterrupted service situation, so as to provide the convenient, efficient and safe dirty data disk writing

...read moreread less

Patent•

Display apparatus performing correction of data corruption and method thereof

[...]

Eun-Kyung Park¹, Bong-geun Lee¹•Institutions (1)

Samsung¹

11 Aug 2009

TL;DR: In this article, a broadcast signal containing a main image and a sub-image is received and a restriction area where motion estimation is restricted in the sub image is set. And the broadcast signal is displayed after motion estimation.

...read moreread less

Abstract: A display method and display apparatus are provided. The display method includes receiving a broadcast signal that contains a main image and a sub-image, setting a restriction area where motion estimation is restricted in the sub-image, performing motion estimation in areas corresponding to the broadcast signal other than the set restriction area, and displaying the broadcast signal obtained after the motion estimation is performed.

...read moreread less

Book Chapter•DOI•

ITRA under partitions

[...]

Aviv Dagan¹, Eliezer Dekel²•Institutions (2)

University of Haifa¹, IBM²

16 Jun 2009

TL;DR: This paper proposes an extension to ITRA that supports continuous availability under partitions in multi-tier environments using the collaboration of neighboring tiers and discusses a unique approach, discussed in this paper.

...read moreread less

Abstract: In Service Oriented Architecture (SOA), web services may span several sites or logical tiers, each responsible for some part of the service. Most services need to be highly reliable and should allow no data corruption. A known problem in distributed systems that may lead to data corruption or inconsistency is the partition problem, also known as the split-brain phenomena. A split-brain occurs when a network, hardware, or software malfunction breaks a cluster of computer into several separate sub-clusters that reside side by side and are not aware of each other. When, during a session, two or more of these sub-clusters serve the same client, the data may become inconsistent or corrupted. ITRA - Inter Tier Relationship Architecture [1] enables web services to transparently recover from multiple failures in a multi-tier environment and to achieve continuous availability. However, the ITRA protocol does not handle partitions. In this paper we propose an extension to ITRA that supports continuous availability under partitions. Our unique approach, discussed in this paper, deals with partitions in multi-tier environments using the collaboration of neighboring tiers.

...read moreread less

Book Chapter•DOI•

Design and Implementation of High Performance Viterbi Decoder for Mobile Communication Data Security

[...]

T. Menakadevi¹, M. Madheswaran¹•Institutions (1)

Anna University Chennai - Regional Office, Coimbatore¹

01 Jan 2009

TL;DR: The design and implementation of a reduced complexity decode approach along with minimum power dissipation FPGAs for Mobile Communication data security is described.

...read moreread less

Abstract: With the ever increasing growth of data communication in the field of e-commerce transactions and mobile communication data security has gained utmost importance. However the conflicting requirements of power, area and throughput of such applications make hardware cryptography an ideal choice. Dedicated hardware devices such as FPGAs can run encryption routines concurrently with the host computer which runs other applications. The use of error correcting code has proven to be are effective way to overcome data corruption in digital communication channel. In this paper, we describe the design and implementation of a reduced complexity decode approach along with minimum power dissipation FPGAs for Mobile Communication data security.

...read moreread less

Proceedings Article•DOI•

A method of improving the performance of continuous data protection system

[...]

Daoan Huo¹, Qiang Cao¹, Changsheng Xie¹, Jing Yang¹•Institutions (1)

Huazhong University of Science and Technology¹

24 Aug 2009

TL;DR: This paper discusses a method based buffer chains that can reduce the extra I/O operations from the disk and presents an implementation in the Linux kernel which provides continuous data protection service with higher performance under some buffer chains strategy.

...read moreread less

Abstract: Data is the core resource of the information system, so that data corruption is one of the key problems which are on top of the radar screen of most information system administrators. Continuous Data Protection (CDP) technologies help them deal with data corruption by providing timely recovery to any point-in-time. But in this CDP process, the CDP system needs record every data changes, and then the extra system I/O operations increase greatly, the performance of information systems will be lower and the system cannot provide the best service. This paper discusses a method based buffer chains that can reduce the extra I/O operations from the disk. We have presented an implementation in the Linux kernel which provides continuous data protection service with higher performance under some buffer chains strategy.

...read moreread less