Author

Barry Perlman

Bio: Barry Perlman is an academic researcher. The author has contributed to research in topics: Data structure & Disk storage. The author has an h-index of 1 and has co-authored 1 publication receiving 146 citations.

Papers
Patent
06 Apr 2010
TL;DR: A high-performance dictionary data structure is defined for storing data in a disk storage system; it supports full transactional semantics, concurrent access from multiple transactions, and logging and recovery.
Abstract: A method, apparatus and computer program product for storing data in a disk storage system is presented. A high-performance dictionary data structure is defined. The dictionary data structure is stored on a disk storage system. Key-value pairs can be inserted into and deleted from the dictionary data structure. Updates run faster than one insertion per disk-head movement. The structure can also be stored on any system with two or more levels of memory. The dictionary is high performance and supports full transactional semantics, concurrent access from multiple transactions, and logging and recovery. Keys can be looked up with only a logarithmic number of transfers, even for keys that have been recently inserted or deleted. Queries can be performed on ranges of key-value pairs, including recently inserted or deleted pairs, at a constant fraction of the bandwidth of the disk.

146 citations


Cited by
Patent
31 Jul 2013
TL;DR: A method and system for indexing, searching, and retrieving information from timed media files based on relevance intervals is presented; it returns a portion of a timed media file selected specifically to be relevant to the given information representations.
Abstract: A method and system for indexing, searching, and retrieving information from timed media files based upon relevance intervals. The method and system for indexing, searching, and retrieving this information is based upon relevance intervals so that a portion of a timed media file is returned, which is selected specifically to be relevant to the given information representations, thereby eliminating the need for a manual determination of the relevance and avoiding missing relevant portions. The timed media includes streaming audio, streaming video, timed HTML, animations such as vector-based graphics, slide shows, other timed media, and combinations thereof.

276 citations

Patent
13 Jun 2011
TL;DR: A method for maintaining an index in a multi-tier data structure includes providing a plurality of storage devices forming the multi-tier data structure and caching an index of key-value pairs across the multi-tier data structure.
Abstract: A method for maintaining an index in a multi-tier data structure includes providing a plurality of storage devices forming the multi-tier data structure, caching an index of key-value pairs across the multi-tier data structure, wherein each of the key-value pairs includes a key, and one of a data value and a data pointer, the key-value pairs stored in the multi-tier data structure, providing a journal for interfacing with the multi-tier data structure, providing a plurality of zone allocators recording which zones of the multi-tier data structure are in use, and providing a plurality of zone managers for controlling access to cache lines of the multi-tier data structure through the journal and zone allocators, wherein each zone manager maintains a header object pointing to data to be stored in an allocated zone.

115 citations

Patent
Alexander Gorelik, Lingling Yan
06 Oct 2011
TL;DR: An apparatus and method are described for the discovery of semantics, relationships, and mappings between data in different software applications, databases, files, reports, messages, or systems.
Abstract: An apparatus and method are described for the discovery of semantics, relationships and mappings between data in different software applications, databases, files, reports, messages, or systems. In one aspect, semantics and relationships and mappings are identified between a first and a second data source. A binding condition is discovered between portions of data in the first and the second data source. The binding condition is used to discover correlations between portions of data in the first and the second data source. The binding condition and the correlations are used to discover a transformation function between portions of data in the first and the second data source.

100 citations

Patent
12 May 2010
TL;DR: A disambiguation process is used to reduce the potential matches for a candidate entity to a single known entity, in one example by ranking the potentially matching known entities according to a hierarchy of criteria.
Abstract: Tagging of content items and entities identified therein may include a matching process, a classification process and a disambiguation process. Matching may include the identification of potential matching candidate entities in a content item whereas the classification process may categorize or group identified candidate entities according to known entities to which they are likely a match. In some instances, a candidate entity may be categorized with multiple known entities. Accordingly, a disambiguation process may be used to reduce the potential matches to a single known entity. In one example, the disambiguation process may include ranking potentially matching known entities according to a hierarchy of criteria.

75 citations

Patent
30 Jun 2010
TL;DR: A topic-specific language model is created by performing an initial pass on an audio signal using a generic or basis language model, determining topics relating to the audio signal based on the words identified in the initial pass, and retrieving a corpus of text relating to those topics, which is then used to create the model.
Abstract: Speech recognition may be improved by generating and using a topic specific language model. A topic specific language model may be created by performing an initial pass on an audio signal using a generic or basis language model. A speech recognition device may then determine topics relating to the audio signal based on the words identified in the initial pass and retrieve a corpus of text relating to those topics. Using the retrieved corpus of text, the speech recognition device may create a topic specific language model. In one example, the speech recognition device may adapt or otherwise modify the generic language model based on the retrieved corpus of text.

61 citations