Author

Zachary J. Mark

Bio: Zachary J. Mark is an academic researcher from IBM. The author has contributed to research in topics: Computer data storage & Server. The author has an h-index of 18 and has co-authored 27 publications receiving 4458 citations.

Papers
Patent
22 Mar 2007
TL;DR: An information dispersal system is described in which original data to be stored is separated into a number of data "slices" in such a manner that the data in each subset is less usable or less recognizable or completely unusable or completely unrecognizable by itself except when combined with some or all of the other data subsets.
Abstract: Briefly, the present invention relates to an information dispersal system in which original data to be stored is separated into a number of data 'slices' in such a manner that the data in each subset is less usable or less recognizable or completely unusable or completely unrecognizable by itself except when combined with some or all of the other data subsets. These data subsets are stored on separate storage devices as a way of increasing privacy and security. In accordance with an important aspect of the invention, a metadata management system stores and indexes user files across all of the storage nodes. A number of applications run on the servers supporting these storage nodes and are responsible for controlling the metadata. Metadata is the information about the data, the data slices or data subsets and the way in which these data subsets are dispersed among different storage nodes running over the network. As used herein, metadata includes data source names, their size, last modification date, authentication information, etc. This information is required to keep track of dispersed data subsets among all the nodes in the system. Every time new data subsets are stored and old ones are removed from the storage nodes, the metadata is updated. In accordance with an important aspect of the invention, the metadata management system stores metadata for dispersed data where: the dispersed data is in several pieces; the metadata is in a separate dataspace from the dispersed data. Accordingly, the metadata management system is able to manage the metadata in a manner that is computationally efficient relative to known systems in order to enable broad use of the invention using the types of computers generally used by businesses, consumers and other organizations currently.

947 citations

Patent
22 Mar 2007
TL;DR: In this article, a digital data file storage system is disclosed in which original data files to be stored are dispersed using some form of information dispersal algorithm into a number of file subsets in such a manner that the data in each file share is less usable or less recognizable or completely unusable or completely unrecognizable by itself except when combined with some or all of the other file shares.
Abstract: A digital data file storage system is disclosed in which original data files to be stored are dispersed using some form of information dispersal algorithm into a number of file “slices” or subsets in such a manner that the data in each file share is less usable or less recognizable or completely unusable or completely unrecognizable by itself except when combined with some or all of the other file shares. These file shares are stored on separate digital data storage devices as a way of increasing privacy and security. As dispersed file shares are being transferred to or stored on a grid of distributed storage locations, various grid resources may become non-operational or may operate at a less than optimal level. When dispersed file shares are being written to a dispersed storage grid that is not available, the grid client designates the dispersed data shares that could not be written at that time on a Rebuild List. In addition, when grid resources already storing dispersed data become non-available, a process within the dispersed storage grid designates the dispersed data shares that need to be recreated on the Rebuild List. At other points in time, a separate process reads the set of Rebuild Lists used to create the corresponding dispersed data and stores that data on available grid resources.

938 citations

Patent
22 Mar 2007
TL;DR: In this paper, the original data to be stored is separated into a number of data 'slices' or shares (22, 24, 26, 28, 30, and 32) and stored on separate digital data storage devices (34, 36, 38, 40, 42, and 44) as a way of increasing privacy and security.
Abstract: A billing process is disclosed for an information dispersal system or digital data storage system. The original data to be stored is separated into a number of data 'slices' or shares (22, 24, 26, 28, 30, and 32). These data subsets are stored on separate digital data storage devices (34, 36, 38, 40, 42, and 44) as a way of increasing privacy and security. A set of metadata tables are created, separate from the dispersed file share storage, to maintain information about the original data size of each block, file or set of file shares dispersed on the grid.

936 citations

Patent
26 Apr 2011
TL;DR: In this article, a system, method, and apparatus for implementing a plurality of dispersed data storage networks using a set of slice servers are disclosed; a plurality of information records is maintained, with each information record corresponding to a dispersed data storage network.
Abstract: A system, method, and apparatus for implementing a plurality of dispersed data storage networks using a set of slice servers are disclosed. A plurality of information records are maintained, with each information record corresponding to a dispersed data storage network. The information record maintains what slice servers are used to implement the dispersed data storage network, as well as other information needed to administer a DDSN, such as the information dispersal algorithm used, how data is stored, and whether data is compressed or encrypted.

916 citations

Patent
Greg R. Dhuse, Andrew D. Baptist, Zachary J. Mark, Jason K. Resch, Ilya Volvovski
26 Apr 2010
TL;DR: In this paper, a data slice is rebuilt by combining in any order slice partials generated from at least a threshold number T of the plurality of data slices in a distributed storage system.
Abstract: A dispersed storage system includes a plurality of storage units that each include a partial rebuild grid module. The partial rebuild grid module includes partial rebuilding functionality to reconstruct one of a plurality of encoded data slices wherein the plurality of encoded data slices are generated from a data segment based on an error encoding dispersal function. In the partial rebuilding process, a data slice is rebuilt by combining in any order slice partials generated from at least a threshold number T of the plurality of data slices.

105 citations
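
The order-independent combination of slice partials described in the partial rebuilding abstract can be illustrated with a minimal sketch. It assumes a single XOR parity code as a stand-in for the error encoding dispersal function, which the abstract does not specify; with that code the threshold T equals the number of data slices, and each partial is simply the surviving slice itself. The names `encode`, `make_partial`, and `rebuild` are hypothetical and not taken from the patent.

```python
from functools import reduce
from itertools import permutations


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data_slices):
    """Assumed stand-in code: append one XOR parity slice to the data slices."""
    parity = reduce(xor_bytes, data_slices)
    return list(data_slices) + [parity]


def make_partial(slice_bytes: bytes) -> bytes:
    """Partial computed locally by a storage unit.

    With single XOR parity the coefficient is 1, so the partial is the
    slice itself. A more general linear code would scale the slice here.
    """
    return slice_bytes


def rebuild(partials) -> bytes:
    """Combine partials with XOR, which is commutative and associative."""
    return reduce(xor_bytes, partials)


slices = encode([b"abcd", b"efgh", b"ijkl"])  # 3 data slices + 1 parity
lost_index = 1                                # pretend slice 1 was lost
survivors = [s for i, s in enumerate(slices) if i != lost_index]
partials = [make_partial(s) for s in survivors]

# Every ordering of the partials yields the same rebuilt slice.
rebuilt_results = {rebuild(list(order)) for order in permutations(partials)}
```

Because XOR is commutative and associative, the partials can arrive and be folded together in any order, which is the property the abstract relies on.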


Cited by
Patent
22 Mar 2007
TL;DR: An information dispersal system is described in which original data to be stored is separated into a number of data "slices" in such a manner that the data in each subset is less usable or less recognizable or completely unusable or completely unrecognizable by itself except when combined with some or all of the other data subsets.
Abstract: Briefly, the present invention relates to an information dispersal system in which original data to be stored is separated into a number of data 'slices' in such a manner that the data in each subset is less usable or less recognizable or completely unusable or completely unrecognizable by itself except when combined with some or all of the other data subsets. These data subsets are stored on separate storage devices as a way of increasing privacy and security. In accordance with an important aspect of the invention, a metadata management system stores and indexes user files across all of the storage nodes. A number of applications run on the servers supporting these storage nodes and are responsible for controlling the metadata. Metadata is the information about the data, the data slices or data subsets and the way in which these data subsets are dispersed among different storage nodes running over the network. As used herein, metadata includes data source names, their size, last modification date, authentication information, etc. This information is required to keep track of dispersed data subsets among all the nodes in the system. Every time new data subsets are stored and old ones are removed from the storage nodes, the metadata is updated. In accordance with an important aspect of the invention, the metadata management system stores metadata for dispersed data where: the dispersed data is in several pieces; the metadata is in a separate dataspace from the dispersed data. Accordingly, the metadata management system is able to manage the metadata in a manner that is computationally efficient relative to known systems in order to enable broad use of the invention using the types of computers generally used by businesses, consumers and other organizations currently.

947 citations

Patent
09 Oct 2007
TL;DR: In this paper, an improved system for accessing data within a distributed data storage network (DDSN) is disclosed, in which traffic is routed to individual slice servers within the DDSN in accordance with objective criteria as well as user-defined policies.
Abstract: An improved system for accessing data within a distributed data storage network (“DDSN”) is disclosed. In a system implementing the disclosed invention, traffic is routed to individual slice servers within the DDSN in accordance with objective criteria as well as user-defined policies. In accordance with one aspect of the disclosed invention, when a data segment is written to a DDSN, the segment is divided into multiple data slices, which are simultaneously transmitted to different slice servers. In accordance with another aspect of the disclosed invention, when a data segment is read from a DDSN, a list of slice servers, each containing a data slice that could be used to reconstruct the requested data segment, is assembled, and sorted in accordance with a preference rating assigned to each of the slice servers. Sufficient data slices to reconstruct the data segment are then read in accordance with the preference ranking of the slice servers.

941 citations
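
The read path in the DDSN abstract above (assemble candidate slice servers, sort them by preference rating, read until enough slices are in hand) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: `SliceServer`, `read_segment_slices`, the numeric `preference` field (higher is preferred), and the `read_slice` callback that returns `None` on failure are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class SliceServer:
    name: str
    preference: float  # assumed rating: higher means more preferred


def read_segment_slices(
    servers: List[SliceServer],
    threshold: int,
    read_slice: Callable[[SliceServer], Optional[bytes]],
) -> Dict[str, bytes]:
    """Read slices in preference order until `threshold` slices are obtained."""
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    ranked = sorted(servers, key=lambda s: s.preference, reverse=True)
    slices: Dict[str, bytes] = {}
    for server in ranked:
        data = read_slice(server)  # hypothetical network read
        if data is not None:
            slices[server.name] = data
        if len(slices) == threshold:
            return slices
    raise RuntimeError("not enough slices available to reconstruct the segment")


servers = [
    SliceServer("a", 0.2),
    SliceServer("b", 0.9),
    SliceServer("c", 0.5),
    SliceServer("d", 0.7),
]


def fake_read(server: SliceServer) -> Optional[bytes]:
    """Simulated read in which server "c" is unreachable."""
    return None if server.name == "c" else f"slice-{server.name}".encode()


two = read_segment_slices(servers, 2, fake_read)
three = read_segment_slices(servers, 3, fake_read)
```

With a threshold of 2 the two highest-rated servers suffice; with a threshold of 3 the sketch skips the unreachable server and falls back to the next one in the ranking.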

Patent
22 Mar 2007
TL;DR: In this article, a digital data file storage system is disclosed in which original data files to be stored are dispersed using some form of information dispersal algorithm into a number of file subsets in such a manner that the data in each file share is less usable or less recognizable or completely unusable or completely unrecognizable by itself except when combined with some or all of the other file shares.
Abstract: A digital data file storage system is disclosed in which original data files to be stored are dispersed using some form of information dispersal algorithm into a number of file “slices” or subsets in such a manner that the data in each file share is less usable or less recognizable or completely unusable or completely unrecognizable by itself except when combined with some or all of the other file shares. These file shares are stored on separate digital data storage devices as a way of increasing privacy and security. As dispersed file shares are being transferred to or stored on a grid of distributed storage locations, various grid resources may become non-operational or may operate at a less than optimal level. When dispersed file shares are being written to a dispersed storage grid that is not available, the grid client designates the dispersed data shares that could not be written at that time on a Rebuild List. In addition, when grid resources already storing dispersed data become non-available, a process within the dispersed storage grid designates the dispersed data shares that need to be recreated on the Rebuild List. At other points in time, a separate process reads the set of Rebuild Lists used to create the corresponding dispersed data and stores that data on available grid resources.

938 citations

Patent
22 Mar 2007
TL;DR: In this paper, the original data to be stored is separated into a number of data 'slices' or shares (22, 24, 26, 28, 30, and 32) and stored on separate digital data storage devices (34, 36, 38, 40, 42, and 44) as a way of increasing privacy and security.
Abstract: A billing process is disclosed for an information dispersal system or digital data storage system. The original data to be stored is separated into a number of data 'slices' or shares (22, 24, 26, 28, 30, and 32). These data subsets are stored on separate digital data storage devices (34, 36, 38, 40, 42, and 44) as a way of increasing privacy and security. A set of metadata tables are created, separate from the dispersed file share storage, to maintain information about the original data size of each block, file or set of file shares dispersed on the grid.

936 citations

Patent
19 Nov 2010
TL;DR: In this paper, a block-based interface to a dispersed data storage network is disclosed, which accepts read and write commands from a file system resident on a user's computer and generates network commands that are forwarded to slice servers.
Abstract: A block-based interface to a dispersed data storage network is disclosed. The disclosed interface accepts read and write commands from a file system resident on a user's computer and generates network commands that are forwarded to slice servers that form the storage component of the dispersed data storage network. The slice servers then fulfill the read and write commands.

929 citations