Author

Neil Thomas

Bio: Neil Thomas is an academic researcher from Storage Technology Corporation. The author has contributed to research in topics: Data file & Tape drive. The author has an h-index of 2 and has co-authored 3 publications receiving 139 citations.

Papers
Patent
09 Feb 1989
TL;DR: An adaptive data compression apparatus efficiently compresses a user data file received from a host computer into a bit oriented compressed format for storage on the magnetic tape that is loaded in the tape transport.
Abstract: The adaptive data compression apparatus is located within a tape drive control unit which is interposed between one or more host computers and one or more tape transports. The adaptive data compression apparatus functions to efficiently compress a user data file received from a host computer into a bit oriented compressed format for storage on the magnetic tape that is loaded in the tape transport. The data compression apparatus divides each block of an incoming user data file into predetermined sized segments, each of which is compressed independently without reference to any other segment in the user data file. The data compression apparatus concurrently uses a plurality of data compression algorithms to adapt the data compression operation to the particular data stored in the user data file. A cyclic redundancy check circuit is used to compute a predetermined length CRC code from all of the incoming user data bytes before they are compressed. The computed CRC code is appended to the end of the compressed data block. The data compression apparatus operates by converting bytes and strings of bytes into shorter bit string codes called reference values. The reference values replace the bytes and strings of bytes when recorded on the magnetic tape. The byte strings have two forms, a run length form for characters that are repeated three or more times, and a string form that recognizes character patterns of two or more characters.

137 citations

Patent
08 Feb 1990
TL;DR: The adaptive data compression apparatus (100) is located within a tape drive control unit between one or more host computers and tape transports; the data file is divided into predetermined sized segments, each of which is compressed independently of any other segment.
Abstract: The adaptive data compression apparatus (100) is located within a tape drive control unit between one or more host computers and tape transports. The apparatus (100) efficiently compresses a data file into a bit oriented compressed format for storage. An input data file is divided into predetermined sized segments which are compressed independently of any other segment. The apparatus (100) uses a plurality of compression algorithms (105) best suited to the data files. A cyclic redundancy check circuit (104, 206) computes a predetermined length CRC code for all incoming data bytes before compression. The CRC code is appended to the end of the compressed data block. The apparatus (100) compresses bytes and strings of bytes into shorter bit string codes called reference values for recording. A run length form for characters repeated three or more times and a string form that recognizes patterns of two or more characters are used.

Cited by
Patent
15 Feb 1996
TL;DR: In this paper, a method and apparatus for detecting common spans within one or more data blocks by partitioning the blocks into subblocks and searching the group of subblocks (or their corresponding hashes) for duplicates is presented.
Abstract: This invention provides a method and apparatus for detecting common spans within one or more data blocks by partitioning the blocks (figure 4) into subblocks and searching the group of subblocks (figure 12) (or their corresponding hashes (figure 13)) for duplicates. Blocks can be partitioned into subblocks using a variety of methods, including methods that place subblock boundaries at fixed positions (figure 3), methods that place subblock boundaries at data-dependent positions (figure 3), and methods that yield multiple overlapping subblocks (figure 6). By comparing the hashes of subblocks, common spans of one or more blocks can be identified without ever having to compare the blocks or subblocks themselves (figure 13). This leads to several applications including an incremental backup system that backs up changes rather than changed files (figure 25), a utility that determines the similarities and differences between two files (figure 13), a file system that stores each unique subblock at most once (figure 26), and a communications system that eliminates the need to transmit subblocks already possessed by the receiver (figure 19).

385 citations
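
The idea of placing subblock boundaries at data-dependent positions and comparing hashes can be sketched in Python as follows. This is only an illustration under stated assumptions: the boundary rule (sum of the last `WINDOW` bytes divisible by `DIVISOR`), both constants, and the function names are hypothetical, and SHA-256 is an arbitrary choice of hash.

```python
import hashlib

WINDOW = 4   # hypothetical number of trailing bytes examined for a boundary
DIVISOR = 8  # hypothetical; a boundary falls where the window sum is divisible by this


def partition(block: bytes):
    """Split a block into subblocks at data-dependent positions."""
    subblocks, start = [], 0
    for i in range(WINDOW, len(block) + 1):
        if sum(block[i - WINDOW:i]) % DIVISOR == 0:
            subblocks.append(block[start:i])
            start = i
    if start < len(block):
        subblocks.append(block[start:])  # trailing bytes after the last boundary
    return subblocks


def subblock_hashes(block: bytes):
    """Hash every subblock so spans can be compared without comparing the data itself."""
    return [hashlib.sha256(s).hexdigest() for s in partition(block)]


def common_subblocks(a: bytes, b: bytes):
    """Return the subblocks of b whose hashes also occur among the subblocks of a."""
    seen = set(subblock_hashes(a))
    return [s for s in partition(b) if hashlib.sha256(s).hexdigest() in seen]
```

Since the boundaries depend only on nearby bytes, inserting a byte at the front of a block changes the first subblock but leaves later subblocks, and their hashes, unchanged. Fixed-position boundaries would instead shift every subblock.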

Patent
08 Apr 2006
TL;DR: A method for compressing data comprises the steps of: analyzing a data block of an input data stream to identify a data type of the data block, the input data stream comprising a plurality of disparate data types; performing content dependent data compression on the data block if the data type is identified; and performing content independent data compression if the data type is not identified.
Abstract: Systems and methods for providing fast and efficient data compression using a combination of content independent data compression and content dependent data compression. In one aspect, a method for compressing data comprises the steps of: analyzing a data block of an input data stream to identify a data type of the data block, the input data stream comprising a plurality of disparate data types; performing content dependent data compression on the data block, if the data type of the data block is identified; performing content independent data compression on the data block, if the data type of the data block is not identified.

304 citations
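
The dispatch between content dependent and content independent compression described above might look like the following Python sketch. Everything specific here is an assumption made for illustration: the ASCII test used to identify a data type, the use of `bz2` as the type-specific encoder, `zlib` as the content independent fallback, and all function names.

```python
import bz2
import zlib


def identify_type(block: bytes):
    """Hypothetical type detection: report 'text' for pure ASCII, otherwise unidentified."""
    try:
        block.decode("ascii")
        return "text"
    except UnicodeDecodeError:
        return None


# Hypothetical table of type-specific (content dependent) encoder/decoder pairs.
CONTENT_DEPENDENT = {"text": (bz2.compress, bz2.decompress)}


def compress_typed_block(block: bytes):
    """Use the type-specific encoder when the type is identified, else the generic one."""
    kind = identify_type(block)
    if kind in CONTENT_DEPENDENT:
        return kind, CONTENT_DEPENDENT[kind][0](block)
    return "generic", zlib.compress(block)


def decompress_typed_block(tag: str, payload: bytes):
    """Reverse compress_typed_block using the tag stored alongside the payload."""
    if tag in CONTENT_DEPENDENT:
        return CONTENT_DEPENDENT[tag][1](payload)
    return zlib.decompress(payload)
```

The tag returned with each payload records which path was taken, so a stream made of disparate data types can be decoded block by block.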

Patent
14 Feb 2001
TL;DR: In this paper, the hash file system of the present invention utilizes hash values for computer files or file pieces which may be produced by a checksum generating program, engine or algorithm such as industry standard MD4, MD5, SHA or SHA-1 algorithms.
Abstract: A system and method for a computer file system that is based and organized upon hashes and/or strings of digits of certain, different, or changing lengths and which is capable of eliminating or screening redundant copies of aggregate blocks of data (or parts of data blocks) from the system. The hash file system of the present invention utilizes hash values for computer files or file pieces which may be produced by a checksum generating program, engine or algorithm such as industry standard MD4, MD5, SHA or SHA-1 algorithms. Alternatively, the hash values may be generated by a checksum program, engine, algorithm or other means that produces an effectively unique hash value for a block of data of indeterminate size based upon a non-linear probabilistic mathematical algorithm.

297 citations
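
A file system organized around hashes of file pieces can be sketched as a small in-memory store in Python. This is a simplified illustration, not the patented system: the class name `HashStore`, the fixed `PIECE_SIZE`, and the dictionary layout are hypothetical, and SHA-1 is used because the abstract names it as one possible algorithm.

```python
import hashlib

PIECE_SIZE = 4  # hypothetical fixed piece size, kept tiny for the example


class HashStore:
    """Stores each unique piece once, keyed by its SHA-1 hash."""

    def __init__(self):
        self.pieces = {}  # hash -> piece bytes
        self.files = {}   # file name -> ordered list of piece hashes

    def put(self, name: str, data: bytes):
        """Split data into pieces and store only the pieces not already present."""
        hashes = []
        for k in range(0, len(data), PIECE_SIZE):
            piece = data[k:k + PIECE_SIZE]
            digest = hashlib.sha1(piece).hexdigest()
            self.pieces.setdefault(digest, piece)  # redundant copies are screened out here
            hashes.append(digest)
        self.files[name] = hashes
        return hashes

    def get(self, name: str) -> bytes:
        """Reassemble a file from its list of piece hashes."""
        return b"".join(self.pieces[h] for h in self.files[name])
```

Two files that share pieces then cost only the storage of their distinct pieces plus one hash list per file.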

Patent
22 Aug 2002
TL;DR: In this paper, a reversible wavelet filter is used to generate coefficients from input data, such as image data, and an entropy coder performs entropy coding on the embedded codestream to produce the compressed data stream.
Abstract: A compression and decompression system in which a reversible wavelet filter is used to generate coefficients from input data, such as image data. The reversible wavelet filter is an efficient transform implemented with integer arithmetic that has exact reconstruction. The present invention uses the reversible wavelet filter in a lossless system (or lossy system) in which an embedded codestream is generated from the coefficients produced by the filter. An entropy coder performs entropy coding on the embedded codestream to produce the compressed data stream.

218 citations
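
To show what "integer arithmetic that has exact reconstruction" means, here is a minimal Python sketch of the S-transform, a well-known integer variant of the Haar wavelet. It is a stand-in chosen for brevity: the abstract does not say which reversible filter the patent uses, and the function names are hypothetical.

```python
def forward(samples):
    """One level of the integer S-transform on an even-length list of ints.

    Each pair (a, b) yields a smooth coefficient floor((a + b) / 2)
    and a detail coefficient a - b.
    """
    if len(samples) % 2:
        raise ValueError("expected an even number of samples")
    smooth, detail = [], []
    for k in range(0, len(samples), 2):
        a, b = samples[k], samples[k + 1]
        smooth.append((a + b) // 2)
        detail.append(a - b)
    return smooth, detail


def inverse(smooth, detail):
    """Exactly undo forward() using only integer arithmetic."""
    samples = []
    for s, d in zip(smooth, detail):
        a = s + (d + 1) // 2  # floor division recovers the bit lost in (a + b) // 2
        b = a - d
        samples.extend((a, b))
    return samples
```

The rounding in the smooth coefficient loses no information because the parity of `a + b` equals the parity of the detail coefficient `a - b`, which is why the inverse is exact.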

Patent
19 Oct 2006
TL;DR: Methods for accelerated loading of operating system and application programs upon system boot or application launch, which comprise: maintaining a list of boot data (or of application data associated with an application program); preloading that data upon system initialization or application launch; and servicing requests for the data from the computer system using the preloaded data.
Abstract: Systems and methods for providing accelerated loading of operating system and application programs upon system boot or application launch are disclosed. In one aspect, a method for providing accelerated loading of an operating system comprises the steps of: maintaining a list of boot data used for booting a computer system; preloading the boot data upon initialization of the computer system; and servicing requests for boot data from the computer system using the preloaded boot data. In another aspect, a method for providing accelerated launching of an application program comprises the steps of: maintaining a list of application data associated with an application program; preloading the application data upon launching the application program; and servicing requests for application data from a computer system using the preloaded application data.

207 citations
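
The three steps named in the abstract above (maintain a list, preload, service requests from the preloaded data) can be sketched in Python as follows. The class `PreloadingReader`, its attributes, and the dictionary standing in for slow storage are all hypothetical, chosen only to illustrate the flow.

```python
class PreloadingReader:
    """Records which blocks are requested, then preloads them on the next start."""

    def __init__(self, storage):
        self.storage = storage  # dict: block id -> bytes, a stand-in for slow storage
        self.boot_list = []     # maintained list of data needed at boot or launch
        self.cache = {}         # preloaded data
        self.hits = 0
        self.misses = 0

    def preload(self):
        """Load every listed block into the cache before requests arrive."""
        for block_id in self.boot_list:
            if block_id in self.storage:
                self.cache[block_id] = self.storage[block_id]

    def read(self, block_id):
        """Service a request from preloaded data when possible, else from storage."""
        if block_id in self.cache:
            self.hits += 1
            return self.cache[block_id]
        self.misses += 1
        if block_id not in self.boot_list:
            self.boot_list.append(block_id)  # remember it for the next preload
        return self.storage[block_id]
```

On a first run every request misses and extends the list; after `preload()` the same requests are served from the cache.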