Institution
IBM
Company • Armonk, New York, United States
About: IBM is a company based in Armonk, New York, United States. It is known for research contributions in the topics Layer (electronics) and Cache. The organization has 134567 authors who have published 253905 publications receiving 7458795 citations. The organization is also known as International Business Machines Corporation and Big Blue.
Papers
••
IBM1
TL;DR: This paper describes a number of statistical models for use in speech recognition, with special attention to determining the parameters for such models from sparse data, and describes two decoding methods appropriate for constrained artificial languages and one appropriate for more realistic decoding tasks.
Abstract: Speech recognition is formulated as a problem of maximum likelihood decoding. This formulation requires statistical models of the speech production process. In this paper, we describe a number of statistical models for use in speech recognition. We give special attention to determining the parameters for such models from sparse data. We also describe two decoding methods, one appropriate for constrained artificial languages and one appropriate for more realistic decoding tasks. To illustrate the usefulness of the methods described, we review a number of decoding results that have been obtained with them.
1,637 citations
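The decoding formulation named in the abstract above is commonly written as below. The symbols are notational assumptions rather than part of this listing: $W$ stands for a candidate word sequence and $A$ for the observed acoustic evidence.

```latex
\hat{W} \;=\; \arg\max_{W} P(W \mid A)
        \;=\; \arg\max_{W} \frac{P(A \mid W)\, P(W)}{P(A)}
        \;=\; \arg\max_{W} P(A \mid W)\, P(W)
```

The last step drops $P(A)$ because it does not depend on $W$. The two remaining factors correspond to the statistical models the abstract refers to, whose parameters must be estimated from sparse data.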
••
IBM1
TL;DR: This paper is a compendium of data base management operating systems folklore and focuses on particular issues unique to the transaction management component, especially locking and recovery.
Abstract: This paper is a compendium of data base management operating systems folklore. It is an early paper and is still in draft form. It is intended as a set of course notes for a class on data base operating systems. After a brief overview of what a data management system is, it focuses on particular issues unique to the transaction management component, especially locking and recovery.
1,635 citations
••
IBM1
TL;DR: The proposed hash-based algorithm generates a number of candidate 2-itemsets that is orders of magnitude smaller than that of previous methods, resolving the performance bottleneck; the smaller candidate sets also allow the transaction database to be trimmed at a much earlier stage of the iterations, significantly reducing the computational cost of later iterations.
Abstract: In this paper, we examine the issue of mining association rules among items in a large database of sales transactions. The mining of association rules can be mapped into the problem of discovering large itemsets, where a large itemset is a group of items which appear in a sufficient number of transactions. The problem of discovering large itemsets can be solved by constructing a candidate set of itemsets first and then identifying, within this candidate set, those itemsets that meet the large itemset requirement. Generally, this is done iteratively for each large k-itemset in increasing order of k, where a large k-itemset is a large itemset with k items. Determining large itemsets from a huge number of candidate large itemsets in early iterations is usually the dominating factor for the overall data mining performance. To address this issue, we propose an effective hash-based algorithm for the candidate set generation. Explicitly, the number of candidate 2-itemsets generated by the proposed algorithm is, in orders of magnitude, smaller than that by previous methods, thus resolving the performance bottleneck. Note that the generation of smaller candidate sets enables us to effectively trim the transaction database size at a much earlier stage of the iterations, thereby reducing the computational cost for later iterations significantly. An extensive simulation study is conducted to evaluate the performance of the proposed algorithm.
1,625 citations
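The hash-based candidate generation described in the abstract above can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the CRC32 bucket hash, and the bucket count are all assumptions chosen for illustration. The idea shown is that every pair is hashed into a bucket during the first pass, and a pair is kept as a candidate only if its bucket count reaches the support threshold.

```python
import zlib
from collections import Counter
from itertools import combinations


def bucket_of(pair, num_buckets):
    """Map a pair to a bucket with a stable hash (hypothetical choice: CRC32)."""
    return zlib.crc32(repr(pair).encode("utf-8")) % num_buckets


def frequent_pairs_hashed(transactions, min_support, num_buckets=101):
    """Return (frequent 2-itemsets, naive candidate count, hashed candidate count)."""
    item_counts = Counter()
    bucket_counts = [0] * num_buckets

    # Pass 1: count single items and hash every pair into a bucket.
    for t in transactions:
        items = sorted(set(t))
        item_counts.update(items)
        for pair in combinations(items, 2):
            bucket_counts[bucket_of(pair, num_buckets)] += 1

    frequent_items = {i for i, c in item_counts.items() if c >= min_support}

    # Without hashing, every pair of frequent items would be a candidate.
    naive = list(combinations(sorted(frequent_items), 2))

    # A bucket count is an upper bound on the count of each pair in it,
    # so dropping pairs in light buckets never discards a frequent pair.
    candidates = {
        p for p in naive
        if bucket_counts[bucket_of(p, num_buckets)] >= min_support
    }

    # Pass 2: count only the surviving candidates exactly.
    pair_counts = Counter()
    for t in transactions:
        items = sorted(set(t) & frequent_items)
        for pair in combinations(items, 2):
            if pair in candidates:
                pair_counts[pair] += 1

    frequent = {p: c for p, c in pair_counts.items() if c >= min_support}
    return frequent, len(naive), len(candidates)


if __name__ == "__main__":
    baskets = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "d"], ["a", "b", "d"]]
    pairs, n_naive, n_hashed = frequent_pairs_hashed(baskets, min_support=2)
    print(pairs, n_naive, n_hashed)
```

How much the candidate set shrinks depends on the number of buckets and on the data; with few buckets, collisions let more infrequent pairs through. The sketch covers only 2-itemsets and omits the transaction-trimming step mentioned in the abstract.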
••
TL;DR: It is proposed that interneuron network oscillations, in conjunction with intrinsic membrane resonances and long-loop (such as thalamocortical) interactions, contribute to 40-Hz rhythms in vivo.
Abstract: Partially synchronous 40-Hz oscillations of cortical neurons have been implicated in cognitive function. Specifically, coherence of these oscillations between different parts of the cortex may provide conjunctive properties to solve the 'binding problem': associating features detected by the cortex into unified perceived objects. Here we report an emergent 40-Hz oscillation in networks of inhibitory neurons connected by synapses using GABAA (gamma-aminobutyric acid) receptors in slices of rat hippocampus and neocortex. These network inhibitory postsynaptic potential oscillations occur in response to the activation of metabotropic glutamate receptors. The oscillations can entrain pyramidal cell discharges. The oscillation frequency is determined both by the net excitation of interneurons and by the kinetics of the inhibitory postsynaptic potentials between them. We propose that interneuron network oscillations, in conjunction with intrinsic membrane resonances and long-loop (such as thalamocortical) interactions, contribute to 40-Hz rhythms in vivo.
1,625 citations
••
04 Jan 2001
TL;DR: This paper examines the behavior of the commonly used Lk norm and shows that the problem of meaningfulness in high dimensionality is sensitive to the value of k, which means that the Manhattan distance metric is consistently preferable to the Euclidean distance metric for high dimensional data mining applications.
Abstract: In recent years, the effect of the curse of high dimensionality has been studied in great detail on several problems such as clustering, nearest neighbor search, and indexing. In high dimensional space the data becomes sparse, and traditional indexing and algorithmic techniques fail from an efficiency and/or effectiveness perspective. Recent research results show that in high dimensional space, the concept of proximity, distance or nearest neighbor may not even be qualitatively meaningful. In this paper, we view the dimensionality curse from the point of view of the distance metrics which are used to measure the similarity between objects. We specifically examine the behavior of the commonly used Lk norm and show that the problem of meaningfulness in high dimensionality is sensitive to the value of k. For example, this means that the Manhattan distance metric (L1 norm) is consistently preferable to the Euclidean distance metric (L2 norm) for high dimensional data mining applications. Using the intuition derived from our analysis, we introduce and examine a natural extension of the Lk norm to fractional distance metrics. We show that the fractional distance metric provides more meaningful results from both the theoretical and empirical perspectives. The results show that fractional distance metrics can significantly improve the effectiveness of standard clustering algorithms such as the k-means algorithm.
1,614 citations
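The Lk norm and its fractional extension discussed in the abstract above can be tried out with a short sketch. The function names and the relative-contrast measure `(d_max - d_min) / d_min` are assumptions made for illustration; the listing does not state which measure the paper uses. Values of `k` below 1 give the fractional case.

```python
import random


def lk_distance(x, y, k):
    """Lk distance (sum |x_i - y_i|^k)^(1/k); 0 < k < 1 is the fractional case."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(abs(a - b) ** k for a, b in zip(x, y)) ** (1.0 / k)


def relative_contrast(points, query, k):
    """Spread between farthest and nearest point, relative to the nearest.

    Assumes the query does not coincide with any point (d_min > 0).
    """
    dists = [lk_distance(p, query, k) for p in points]
    d_min, d_max = min(dists), max(dists)
    return (d_max - d_min) / d_min


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    dim, n_points = 50, 200
    points = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    query = [rng.random() for _ in range(dim)]
    for k in (0.5, 1, 2):
        print(f"k={k}: relative contrast = {relative_contrast(points, query, k):.3f}")
```

Running the script with different values of `dim` and `k` gives a rough feel for how the separation between nearest and farthest neighbors changes. Note that for `k < 1` the function does not satisfy the triangle inequality, so it is a distance measure in a loose sense only.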
Authors
Showing all 134658 results
Name | H-index | Papers | Citations |
---|---|---|---|
Zhong Lin Wang | 245 | 2529 | 259003 |
Anil K. Jain | 183 | 1016 | 192151 |
Hyun-Chul Kim | 176 | 4076 | 183227 |
Rodney S. Ruoff | 164 | 666 | 194902 |
Tobin J. Marks | 159 | 1621 | 111604 |
Jean M. J. Fréchet | 154 | 726 | 90295 |
Albert-László Barabási | 152 | 438 | 200119 |
György Buzsáki | 150 | 446 | 96433 |
Stanislas Dehaene | 149 | 456 | 86539 |
Philip S. Yu | 148 | 1914 | 107374 |
James M. Tour | 143 | 859 | 91364 |
Thomas P. Russell | 141 | 1012 | 80055 |
Naomi J. Halas | 140 | 435 | 82040 |
Steven G. Louie | 137 | 777 | 88794 |
Daphne Koller | 135 | 367 | 71073 |