Institution

Carnegie Mellon University

Education•Pittsburgh, Pennsylvania, United States•

About: Carnegie Mellon University is an education organization based in Pittsburgh, Pennsylvania, United States. It is known for research contributions in the topics Computer science and Robot. The organization has 36317 authors who have published 104359 publications receiving 5975734 citations. It is also known as CMU and Carnegie Mellon.

Topics: Computer science, Robot, Context (language use), Population, Mobile robot ...

Papers published on a yearly basis

Papers

Journal Article•DOI•

Speaker-independent phone recognition using hidden Markov models

Kai-Fu Lee¹, H.-W. Hon¹•Institutions (1)

Carnegie Mellon University¹

01 Nov 1989-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: The authors introduce the co-occurrence smoothing algorithm, which enables accurate recognition even with very limited training data. Because the results were evaluated on a standard database, they can be used as benchmarks to evaluate future systems.

Abstract: Hidden Markov modeling is extended to speaker-independent phone recognition. Using multiple codebooks of various linear-predictive-coding (LPC) parameters and discrete hidden Markov models (HMMs) the authors obtain a speaker-independent phone recognition accuracy of 58.8-73.8% on the TIMIT database, depending on the type of acoustic and language models used. In comparison, the performance of expert spectrogram readers is only 69% without use of higher level knowledge. The authors introduce the co-occurrence smoothing algorithm, which enables accurate recognition even with very limited training data. Since the results were evaluated on a standard database, they can be used as benchmarks to evaluate future systems.

895 citations

Proceedings Article•

Disk failures in the real world: what does an MTTF of 1,000,000 hours mean to you?

Bianca Schroeder¹, Garth A. Gibson¹•Institutions (1)

Carnegie Mellon University¹

13 Feb 2007

TL;DR: In this article, the authors present and analyze field-gathered disk replacement data from a number of large production systems, including high-performance computing sites and internet services sites, and find that in the field, annual disk replacement rates typically exceed 1%, with 2-4% common and up to 13% observed on some systems.

Abstract: Component failure in large-scale IT installations is becoming an ever larger problem as the number of components in a single cluster approaches a million. In this paper, we present and analyze field-gathered disk replacement data from a number of large production systems, including high-performance computing sites and internet services sites. About 100,000 disks are covered by this data, some for an entire lifetime of five years. The data include drives with SCSI and FC, as well as SATA interfaces. The mean time to failure (MTTF) of those drives, as specified in their datasheets, ranges from 1,000,000 to 1,500,000 hours, suggesting a nominal annual failure rate of at most 0.88%. We find that in the field, annual disk replacement rates typically exceed 1%, with 2-4% common and up to 13% observed on some systems. This suggests that field replacement is a fairly different process than one might predict based on datasheet MTTF. We also find evidence, based on records of disk replacements in the field, that failure rate is not constant with age, and that, rather than a significant infant mortality effect, we see a significant early onset of wearout degradation. That is, replacement rates in our data grew constantly with age, an effect often assumed not to set in until after a nominal lifetime of 5 years. Interestingly, we observe little difference in replacement rates between SCSI, FC and SATA drives, potentially an indication that disk-independent factors, such as operating conditions, affect replacement rates more than component specific factors. On the other hand, we see only one instance of a customer rejecting an entire population of disks as a bad batch, in this case because of media error rates, and this instance involved SATA disks. Time between replacement, a proxy for time between failure, is not well modeled by an exponential distribution and exhibits significant levels of correlation, including autocorrelation and long-range dependence.

894 citations
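
The disk-failure abstract above derives a nominal annual failure rate of at most 0.88% from the datasheet MTTF. A minimal sketch of that arithmetic, assuming a constant failure rate and 8,760 powered-on hours per year (the function name `nominal_afr` is illustrative and not taken from the paper):

```python
# Assumption: drives run 24 hours a day, 365 days a year, with a constant failure rate.
HOURS_PER_YEAR = 24 * 365  # 8,760 hours


def nominal_afr(mttf_hours: float) -> float:
    """Return the nominal annual failure rate implied by a datasheet MTTF."""
    return HOURS_PER_YEAR / mttf_hours


# The two datasheet MTTF values quoted in the abstract.
for mttf in (1_000_000, 1_500_000):
    print(f"MTTF {mttf:,} h -> nominal AFR {nominal_afr(mttf):.2%}")
# MTTF 1,000,000 h -> nominal AFR 0.88%
# MTTF 1,500,000 h -> nominal AFR 0.58%
```

Under these assumptions, the 2-4% replacement rates reported in the field are roughly two to five times the 0.88% datasheet upper bound.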

Proceedings Article•

Federated multi-task learning

Virginia Smith¹, Chao-Kai Chiang², Maziar Sanjabi², Ameet Talwalkar³•Institutions (3)

Stanford University¹, University of Southern California², Carnegie Mellon University³

04 Dec 2017

TL;DR: In this paper, the authors propose a novel systems-aware optimization method, MOCHA, that is robust to practical systems issues, such as high communication cost, stragglers, and fault tolerance for distributed multi-task learning.

Abstract: Federated learning poses new statistical and systems challenges in training machine learning models over distributed networks of devices. In this work, we show that multi-task learning is naturally suited to handle the statistical challenges of this setting, and propose a novel systems-aware optimization method, MOCHA, that is robust to practical systems issues. Our method and theory for the first time consider issues of high communication cost, stragglers, and fault tolerance for distributed multi-task learning. The resulting method achieves significant speedups compared to alternatives in the federated setting, as we demonstrate through simulations on real-world federated datasets.

894 citations

Book Chapter•DOI•

The evolution of management accounting

Robert S. Kaplan¹,²•Institutions (2)

Carnegie Mellon University¹, Harvard University²

01 Jan 1984

TL;DR: This paper surveys cost accounting and managerial control practices, assesses their relevance to the changing nature of industrial competition in the 1980s, and advocates a return to field-based research to discover the innovative practices being introduced by organizations successfully adapting to the new organization and technology of manufacturing.

Abstract: This paper surveys the development of cost accounting and managerial control practices and assesses their relevance to the changing nature of industrial competition in the 1980s. The paper starts with a review of cost accounting developments from 1850 through 1915, including the demands imposed by the origin of the railroad and steel enterprises and the subsequent activity from the scientific management movement. The DuPont Corporation (1903) and the reorganization of General Motors (1920) provided the opportunity for major innovations in the management control of decentralized operations, including the ROI criterion for evaluation of performance and formal budgeting and incentive plans. More recent developments have included discounted cash flow analysis and the application of management science and multiperson decision theory models. The cost accounting and management control procedures developed more than 60 years ago for the mass production of standard products with high direct labor content may no longer be appropriate for the planning and control decisions of contemporary organizations. Also, problems with using profits as the prime criterion for motivating and evaluating short-term performance are becoming apparent. This paper advocates a return to field-based research to discover the innovative practices being introduced by organizations successfully adapting to the new organization and technology of manufacturing.

893 citations

Proceedings Article•DOI•

Polygraph: automatically generating signatures for polymorphic worms

James Newsome¹, Brad Karp¹, Dawn Song¹•Institutions (1)

Carnegie Mellon University¹

08 May 2005

TL;DR: This paper presents Polygraph, a signature generation system that produces signatures matching polymorphic worms by using multiple disjoint content substrings, which typically correspond to protocol framing, return addresses, and poorly obfuscated code.

Abstract: It is widely believed that content-signature-based intrusion detection systems (IDS) are easily evaded by polymorphic worms, which vary their payload on every infection attempt. In this paper, we present Polygraph, a signature generation system that successfully produces signatures that match polymorphic worms. Polygraph generates signatures that consist of multiple disjoint content substrings. In doing so, Polygraph leverages our insight that for a real-world exploit to function properly, multiple invariant substrings must often be present in all variants of a payload; these substrings typically correspond to protocol framing, return addresses, and in some cases, poorly obfuscated code. We contribute a definition of the polymorphic signature generation problem; propose classes of signature suited for matching polymorphic worm payloads; and present algorithms for automatic generation of signatures in these classes. Our evaluation of these algorithms on a range of polymorphic worms demonstrates that Polygraph produces signatures for polymorphic worms that exhibit low false negatives and false positives.

893 citations
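
The Polygraph abstract above describes signatures made of multiple disjoint content substrings. The sketch below illustrates only how such a signature could be matched against a payload, either ignoring token order or requiring it; it is not the authors' generation algorithm, and the function names, tokens, and payloads are hypothetical.

```python
def matches_conjunction(payload: bytes, tokens: list[bytes]) -> bool:
    """True if every token occurs somewhere in the payload, in any order."""
    return all(tok in payload for tok in tokens)


def matches_subsequence(payload: bytes, tokens: list[bytes]) -> bool:
    """True if the tokens occur in the given order without overlapping."""
    pos = 0
    for tok in tokens:
        idx = payload.find(tok, pos)
        if idx < 0:
            return False
        pos = idx + len(tok)  # continue searching after this token
    return True


# Hypothetical invariant substrings: protocol framing plus part of a return address.
signature = [b"GET ", b"HTTP/1.1\r\n", b"\xff\xbf"]

# Two made-up variants whose filler bytes differ but whose invariants are shared.
variant_a = b"GET /a?x=AAAA HTTP/1.1\r\nHost: h\r\n\r\n\x90\x90\xff\xbf"
variant_b = b"GET /b?y=ZZZZZZ HTTP/1.1\r\nHost: h\r\n\r\n\x41\x42\xff\xbf"
benign = b"GET /index.html HTTP/1.1\r\nHost: h\r\n\r\n"

for name, payload in (("variant_a", variant_a), ("variant_b", variant_b), ("benign", benign)):
    print(name, matches_conjunction(payload, signature), matches_subsequence(payload, signature))
```

Requiring the tokens in order is stricter than requiring only their presence, so it accepts fewer payloads for the same set of tokens.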

Authors

Showing all 36645 results

Name	H-index	Papers	Citations
Yi Chen	217	4342	293080
Rakesh K. Jain	200	1467	177727
Robert C. Nichol	187	851	162994
Michael I. Jordan	176	1016	216204
Jasvinder A. Singh	176	2382	223370
J. N. Butler	172	2525	175561
P. Chang	170	2154	151783
Krzysztof Matyjaszewski	169	1431	128585
Yang Yang	164	2704	144071
Geoffrey E. Hinton	157	414	409047
Herbert A. Simon	157	745	194597
Yongsun Kim	156	2588	145619
Terrence J. Sejnowski	155	845	117382
John B. Goodenough	151	1064	113741
Scott Shenker	150	454	118017

Network Information

Related Institutions (5)

Massachusetts Institute of Technology

268K papers, 18.2M citations

95% related

University of Maryland, College Park

155.9K papers, 7.2M citations

225.1K papers, 10.1M citations

93% related

IBM

253.9K papers, 7.4M citations

93% related

Princeton University

146.7K papers, 9.1M citations

92% related

Performance

Metrics

Papers: 104,917

Citations: 6,710,469

No. of papers from the Institution in previous years
Year	Papers
2023	120
2022	499
2021	4,981
2020	5,375
2019	5,420
2018	4,972