Institution

IBM

Company, Armonk, New York, United States
About: IBM is a company based in Armonk, New York, United States. It is known for research contributions in the topics Layer (electronics) and Cache. The organization has 134,567 authors who have published 253,905 publications receiving 7,458,795 citations. The organization is also known as International Business Machines Corporation and Big Blue.


Papers
Journal Article
TL;DR: This paper describes a number of statistical models for use in speech recognition, with special attention to determining the parameters for such models from sparse data, and describes two decoding methods appropriate for constrained artificial languages and one appropriate for more realistic decoding tasks.
Abstract: Speech recognition is formulated as a problem of maximum likelihood decoding. This formulation requires statistical models of the speech production process. In this paper, we describe a number of statistical models for use in speech recognition. We give special attention to determining the parameters for such models from sparse data. We also describe two decoding methods, one appropriate for constrained artificial languages and one appropriate for more realistic decoding tasks. To illustrate the usefulness of the methods described, we review a number of decoding results that have been obtained with them.

1,637 citations
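
The decoding formulation mentioned in the abstract above can be written compactly. The following is a sketch of the standard form, not a quotation from the paper: the symbols $A$ (observed acoustic evidence) and $W$ (a candidate word sequence) are notation assumed here for illustration.

```latex
\hat{W} \;=\; \arg\max_{W} P(W \mid A)
        \;=\; \arg\max_{W} \frac{P(W)\, P(A \mid W)}{P(A)}
        \;=\; \arg\max_{W} P(W)\, P(A \mid W)
```

The last step holds because $P(A)$ does not depend on $W$. Under this reading, $P(W)$ plays the role of a language model and $P(A \mid W)$ the role of a model of speech production, both of which would have to be estimated from data.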

Book Chapter
Jim Gray
01 Jan 1978
TL;DR: This paper is a compendium of data base management operating systems folklore and focuses on particular issues unique to the transaction management component especially locking and recovery.
Abstract: This paper is a compendium of data base management operating systems folklore. It is an early paper and is still in draft form. It is intended as a set of course notes for a class on data base operating systems. After a brief overview of what a data management system is it focuses on particular issues unique to the transaction management component especially locking and recovery.

1,635 citations

Proceedings Article
22 May 1995
TL;DR: The proposed hash-based algorithm generates orders of magnitude fewer candidate 2-itemsets than previous methods, resolving the performance bottleneck. The smaller candidate sets also allow the transaction database to be trimmed at a much earlier stage of the iterations, significantly reducing the computational cost of later iterations.
Abstract: In this paper, we examine the issue of mining association rules among items in a large database of sales transactions. The mining of association rules can be mapped into the problem of discovering large itemsets where a large itemset is a group of items which appear in a sufficient number of transactions. The problem of discovering large itemsets can be solved by constructing a candidate set of itemsets first and then, identifying, within this candidate set, those itemsets that meet the large itemset requirement. Generally this is done iteratively for each large k-itemset in increasing order of k where a large k-itemset is a large itemset with k items. To determine large itemsets from a huge number of candidate large itemsets in early iterations is usually the dominating factor for the overall data mining performance. To address this issue, we propose an effective hash-based algorithm for the candidate set generation. Explicitly, the number of candidate 2-itemsets generated by the proposed algorithm is, in orders of magnitude, smaller than that by previous methods, thus resolving the performance bottleneck. Note that the generation of smaller candidate sets enables us to effectively trim the transaction database size at a much earlier stage of the iterations, thereby reducing the computational cost for later iterations significantly. Extensive simulation study is conducted to evaluate performance of the proposed algorithm.

1,625 citations
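
The abstract above does not spell out the hashing scheme, so the following is only a minimal sketch of one way hash-based pruning of candidate 2-itemsets can work. The function names (`bucket_of`, `candidate_pairs`, `frequent_pairs`), the CRC32 hash, and the bucket count are assumptions made for illustration, not the paper's design. While counting single items, every pair in each transaction is also hashed into a small table of bucket counters; a pair can only be frequent if its bucket count reaches the support threshold, so pairs in light buckets are dropped before the second pass.

```python
from collections import Counter
from itertools import combinations
import zlib


def bucket_of(pair, num_buckets):
    # Hypothetical hash function: CRC32 of the joined item names.
    # It is deterministic, so results do not vary between runs.
    key = "\x00".join(pair).encode("utf-8")
    return zlib.crc32(key) % num_buckets


def candidate_pairs(transactions, min_support, num_buckets=7):
    """Return (naive, pruned) candidate 2-itemsets after one pass."""
    item_counts = Counter()
    bucket_counts = [0] * num_buckets
    for t in transactions:
        items = sorted(set(t))
        item_counts.update(items)
        # Hash every pair of this transaction into a bucket counter.
        for pair in combinations(items, 2):
            bucket_counts[bucket_of(pair, num_buckets)] += 1
    frequent_items = sorted(
        i for i, c in item_counts.items() if c >= min_support
    )
    # Naive candidates: every pair of frequent single items.
    naive = list(combinations(frequent_items, 2))
    # Pruned candidates: additionally require a heavy enough bucket.
    pruned = [
        p for p in naive
        if bucket_counts[bucket_of(p, num_buckets)] >= min_support
    ]
    return naive, pruned


def frequent_pairs(transactions, candidates, min_support):
    """Second pass: count only the surviving candidates."""
    wanted = set(candidates)
    counts = Counter()
    for t in transactions:
        for pair in combinations(sorted(set(t)), 2):
            if pair in wanted:
                counts[pair] += 1
    return {p for p, c in counts.items() if c >= min_support}
```

Because a bucket's count is at least the support of any pair hashed into it, the pruning never discards a truly frequent pair; it only removes pairs that cannot reach the threshold. How much it removes depends on the data and on the number of buckets.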

Journal Article
16 Feb 1995 - Nature
TL;DR: It is proposed that interneuron network oscillations, in conjunction with intrinsic membrane resonances and long-loop (such as thalamocortical) interactions, contribute to 40-Hz rhythms in vivo.
Abstract: Partially synchronous 40-Hz oscillations of cortical neurons have been implicated in cognitive function. Specifically, coherence of these oscillations between different parts of the cortex may provide conjunctive properties to solve the 'binding problem': associating features detected by the cortex into unified perceived objects. Here we report an emergent 40-Hz oscillation in networks of inhibitory neurons connected by synapses using GABAA (gamma-aminobutyric acid) receptors in slices of rat hippocampus and neocortex. These network inhibitory postsynaptic potential oscillations occur in response to the activation of metabotropic glutamate receptors. The oscillations can entrain pyramidal cell discharges. The oscillation frequency is determined both by the net excitation of interneurons and by the kinetics of the inhibitory postsynaptic potentials between them. We propose that interneuron network oscillations, in conjunction with intrinsic membrane resonances and long-loop (such as thalamocortical) interactions, contribute to 40-Hz rhythms in vivo.

1,625 citations

Book Chapter
04 Jan 2001
TL;DR: This paper examines the behavior of the commonly used Lk norm and shows that the problem of meaningfulness in high dimensionality is sensitive to the value of k, which means that the Manhattan distance metric is consistently preferable to the Euclidean distance metric for high dimensional data mining applications.
Abstract: In recent years, the effect of the curse of high dimensionality has been studied in great detail on several problems such as clustering, nearest neighbor search, and indexing. In high dimensional space the data becomes sparse, and traditional indexing and algorithmic techniques fail from an efficiency and/or effectiveness perspective. Recent research results show that in high dimensional space, the concept of proximity, distance or nearest neighbor may not even be qualitatively meaningful. In this paper, we view the dimensionality curse from the point of view of the distance metrics which are used to measure the similarity between objects. We specifically examine the behavior of the commonly used Lk norm and show that the problem of meaningfulness in high dimensionality is sensitive to the value of k. For example, this means that the Manhattan distance metric (L1 norm) is consistently more preferable than the Euclidean distance metric (L2 norm) for high dimensional data mining applications. Using the intuition derived from our analysis, we introduce and examine a natural extension of the Lk norm to fractional distance metrics. We show that the fractional distance metric provides more meaningful results both from the theoretical and empirical perspective. The results show that fractional distance metrics can significantly improve the effectiveness of standard clustering algorithms such as the k-means algorithm.

1,614 citations
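
To make the Lk and fractional distances from the abstract above concrete, here is a small sketch. The helper names (`lk_distance`, `relative_contrast`, `demo`), the uniform random data, and the choice of the origin as query point are assumptions for illustration. The contrast measure used, (farthest - nearest) / nearest, is one common way to quantify how distinguishable neighbors are, assumed here rather than taken from the abstract.

```python
import random


def lk_distance(x, y, k):
    # (sum_i |x_i - y_i|^k)^(1/k); k = 1 is Manhattan, k = 2 is Euclidean,
    # and 0 < k < 1 gives a fractional distance.
    return sum(abs(a - b) ** k for a, b in zip(x, y)) ** (1.0 / k)


def relative_contrast(points, query, k):
    # (farthest - nearest) / nearest distance from the query point.
    dists = [lk_distance(p, query, k) for p in points]
    return (max(dists) - min(dists)) / min(dists)


def demo(dim=100, n_points=200, seed=0):
    # Uniform random points in the unit cube; the query is the origin.
    rng = random.Random(seed)
    points = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    query = [0.0] * dim
    return {k: relative_contrast(points, query, k) for k in (0.5, 1.0, 2.0)}


if __name__ == "__main__":
    for k, contrast in demo().items():
        print(f"k={k}: relative contrast = {contrast:.4f}")
```

Running `demo` with different `dim` values lets one compare how the contrast behaves for each k as dimensionality grows. Note that for k < 1 the function no longer satisfies the triangle inequality, so it is a dissimilarity measure rather than a metric in the strict sense.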


Authors

Showing all 134658 results

Name                     H-index  Papers  Citations
Zhong Lin Wang           245      2529    259003
Anil K. Jain             183      1016    192151
Hyun-Chul Kim            176      4076    183227
Rodney S. Ruoff          164      666     194902
Tobin J. Marks           159      1621    111604
Jean M. J. Fréchet       154      726     90295
Albert-László Barabási   152      438     200119
György Buzsáki           150      446     96433
Stanislas Dehaene        149      456     86539
Philip S. Yu             148      1914    107374
James M. Tour            143      859     91364
Thomas P. Russell        141      1012    80055
Naomi J. Halas           140      435     82040
Steven G. Louie          137      777     88794
Daphne Koller            135      367     71073

Network Information
Related Institutions (5)
Carnegie Mellon University
104.3K papers, 5.9M citations

93% related

Georgia Institute of Technology
119K papers, 4.6M citations

92% related

Bell Labs
59.8K papers, 3.1M citations

90% related

Microsoft
86.9K papers, 4.1M citations

89% related

Massachusetts Institute of Technology
268K papers, 18.2M citations

88% related

Performance
Metrics
No. of papers from the Institution in previous years
Year  Papers
2023  30
2022  137
2021  3,163
2020  6,336
2019  6,427
2018  6,278