Institution

University of California, Berkeley

Education
Berkeley, California, United States
About: University of California, Berkeley is an education organization based in Berkeley, California, United States. It is known for research contributions in the topics Population and Galaxy. The organization has 128244 authors who have published 265680 publications receiving 16824174 citations. The organization is also known as University of California Berkeley and UC Berkeley.
Topics: Population, Galaxy, Poison control, Gene, Supernova


Papers
Journal Article
01 Oct 2001
TL;DR: Internal estimates monitor error, strength, and correlation; they are used to show the response to increasing the number of features used in the splitting and to measure variable importance. The ideas also apply to regression.
Abstract: Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error for forests converges a.s. to a limit as the number of trees in the forest becomes large. The generalization error of a forest of tree classifiers depends on the strength of the individual trees in the forest and the correlation between them. Using a random selection of features to split each node yields error rates that compare favorably to Adaboost (Y. Freund & R. Schapire, Machine Learning: Proceedings of the Thirteenth International conference, aaa, 148–156), but are more robust with respect to noise. Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the splitting. Internal estimates are also used to measure variable importance. These ideas are also applicable to regression.

58,232 citations
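
The two sources of randomness named in the abstract above (an independent random draw per tree, and a random selection of features for splitting) can be illustrated with a small sketch. This is not the paper's implementation: it assumes one-split "stumps" instead of full trees, picks one feature subset per tree rather than per node, and uses a simple out-of-bag vote as the internal error estimate. All function names here are hypothetical.

```python
import random
from collections import Counter

def bootstrap_indices(n, rng):
    """Draw n row indices with replacement; return (in_bag, out_of_bag)."""
    in_bag = [rng.randrange(n) for _ in range(n)]
    chosen = set(in_bag)
    return in_bag, [i for i in range(n) if i not in chosen]

def fit_stump(X, y, rows, features):
    """Find the single-threshold split with the fewest training errors."""
    best = None
    for f in features:
        for t in sorted({X[i][f] for i in rows}):
            left = [y[i] for i in rows if X[i][f] <= t]
            right = [y[i] for i in rows if X[i][f] > t]
            if not left or not right:
                continue
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = sum(v != l_lab for v in left) + sum(v != r_lab for v in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    if best is None:  # no valid split: always predict the majority class
        maj = Counter(y[i] for i in rows).most_common(1)[0][0]
        return (0, float("inf"), maj, maj)
    return best[1:]

def predict_stump(stump, x):
    f, t, l_lab, r_lab = stump
    return l_lab if x[f] <= t else r_lab

def fit_forest(X, y, n_trees=25, n_features=1, seed=0):
    """Each stump sees its own bootstrap sample and random feature subset."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        in_bag, oob = bootstrap_indices(len(X), rng)
        feats = rng.sample(range(len(X[0])), n_features)
        forest.append((fit_stump(X, y, in_bag, feats), set(oob)))
    return forest

def predict(forest, x):
    """Majority vote over all stumps."""
    votes = Counter(predict_stump(s, x) for s, _ in forest)
    return votes.most_common(1)[0][0]

def oob_error(forest, X, y):
    """Error rate when each row is voted on only by stumps that did not train on it."""
    wrong = counted = 0
    for i, (x, label) in enumerate(zip(X, y)):
        votes = Counter(predict_stump(s, x) for s, oob in forest if i in oob)
        if votes:
            counted += 1
            wrong += votes.most_common(1)[0][0] != label
    return wrong / counted if counted else float("nan")

# Toy data (made up for illustration): both features separate the classes.
X = [[0, 20], [1, 21], [2, 22], [3, 23], [10, 5], [11, 6], [12, 7], [13, 8]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
forest = fit_forest(X, y)
print(predict(forest, [-1, 30]), predict(forest, [20, 0]), oob_error(forest, X, y))
```

Raising `n_features` makes individual stumps stronger but more alike, which is the strength and correlation trade-off the abstract refers to.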

Book Chapter
TL;DR: In this article, the product-limit (PL) estimator was proposed to estimate the proportion of items in the population whose lifetimes would exceed t (in the absence of such losses), without making any assumption about the form of the function P(t).
Abstract: In lifetesting, medical follow-up, and other fields the observation of the time of occurrence of the event of interest (called a death) may be prevented for some of the items of the sample by the previous occurrence of some other event (called a loss). Losses may be either accidental or controlled, the latter resulting from a decision to terminate certain observations. In either case it is usually assumed in this paper that the lifetime (age at death) is independent of the potential loss time; in practice this assumption deserves careful scrutiny. Despite the resulting incompleteness of the data, it is desired to estimate the proportion P(t) of items in the population whose lifetimes would exceed t (in the absence of such losses), without making any assumption about the form of the function P(t). The observation for each item of a suitable initial event, marking the beginning of its lifetime, is presupposed. For random samples of size N the product-limit (PL) estimate can be defined as follows: L...

51,084 citations
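
The abstract above is cut off before the definition of the PL estimate, and the sketch below does not attempt to restore that text. It shows the commonly taught form of the product-limit estimate, in which the estimated survival proportion is multiplied by (at risk minus deaths) / (at risk) at each distinct death time. The function name, the input layout, and the convention that items lost at time t still count as at risk at t are assumptions made for illustration.

```python
from collections import Counter

def product_limit(times, observed):
    """Return [(t, estimated P(t))] at each distinct death time.

    times:    follow-up time of each item
    observed: True if the item's death was observed, False if it was a loss
    """
    deaths = Counter(t for t, d in zip(times, observed) if d)
    surviving = 1.0
    curve = []
    for t in sorted(deaths):
        # Items still under observation just before time t (assumed convention:
        # an item lost exactly at t is still counted as at risk at t).
        at_risk = sum(1 for u in times if u >= t)
        surviving *= (at_risk - deaths[t]) / at_risk
        curve.append((t, surviving))
    return curve

# Made-up sample: five items, two of them lost before their death was seen.
for t, p in product_limit([1, 2, 2, 3, 4], [True, True, False, True, False]):
    print(t, round(p, 3))
```

Losses never lower the estimate directly; they only shrink the at-risk count used at later death times.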

Journal Article
TL;DR: This work proposes a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model.
Abstract: We describe latent Dirichlet allocation (LDA), a generative probabilistic model for collections of discrete data such as text corpora. LDA is a three-level hierarchical Bayesian model, in which each item of a collection is modeled as a finite mixture over an underlying set of topics. Each topic is, in turn, modeled as an infinite mixture over an underlying set of topic probabilities. In the context of text modeling, the topic probabilities provide an explicit representation of a document. We present efficient approximate inference techniques based on variational methods and an EM algorithm for empirical Bayes parameter estimation. We report results in document modeling, text classification, and collaborative filtering, comparing to a mixture of unigrams model and the probabilistic LSI model.

27,392 citations

Proceedings Article
03 Jan 2001
TL;DR: This paper proposed a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model, also known as probabilistic latent semantic indexing (pLSI).
Abstract: We propose a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams [6], and Hofmann's aspect model, also known as probabilistic latent semantic indexing (pLSI) [3]. In the context of text modeling, our model posits that each document is generated as a mixture of topics, where the continuous-valued mixture proportions are distributed as a latent Dirichlet random variable. Inference and learning are carried out efficiently via variational algorithms. We present empirical results on applications of this model to problems in text modeling, collaborative filtering, and text classification.

25,546 citations
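
The generative story in the abstract above (per-document topic proportions drawn from a Dirichlet, then each word drawn from one of the topics) can be sketched as follows. This covers only the sampling direction, not the variational inference the paper describes. The toy topics, vocabulary, and function names are assumptions made for illustration.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw from Dirichlet(alpha) by normalizing independent gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(alpha, topics, n_words, rng):
    """Sample one document: topic proportions first, then a topic and a word per position."""
    theta = sample_dirichlet(alpha, rng)          # document-level mixture proportions
    words = []
    for _ in range(n_words):
        z = rng.choices(range(len(theta)), weights=theta)[0]   # topic for this word
        vocab = list(topics[z])
        w = rng.choices(vocab, weights=[topics[z][v] for v in vocab])[0]
        words.append(w)
    return theta, words

# Two made-up topics, each a distribution over a tiny vocabulary.
topics = [
    {"galaxy": 0.5, "supernova": 0.4, "gene": 0.1},
    {"gene": 0.6, "population": 0.3, "galaxy": 0.1},
]
theta, words = generate_document([0.5, 0.5], topics, n_words=8, rng=random.Random(0))
print([round(p, 2) for p in theta], words)
```

Because `theta` is drawn once per document, words within a document share topic proportions, which is what separates this model from a single mixture of unigrams.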

Journal Article

TL;DR: The dynamic capabilities framework analyzes the sources and methods of wealth creation and capture by private enterprise firms operating in environments of rapid technological change, and suggests that private wealth creation in such regimes depends in large measure on honing internal technological, organizational, and managerial processes inside the firm.
Abstract: The dynamic capabilities framework analyzes the sources and methods of wealth creation and capture by private enterprise firms operating in environments of rapid technological change. The competitive advantage of firms is seen as resting on distinctive processes (ways of coordinating and combining), shaped by the firm's (specific) asset positions (such as the firm's portfolio of difficult-to-trade knowledge assets and complementary assets), and the evolution path(s) it has adopted or inherited. The importance of path dependencies is amplified where conditions of increasing returns exist. Whether and how a firm's competitive advantage is eroded depends on the stability of market demand, and the ease of replicability (expanding internally) and imitatability (replication by competitors). If correct, the framework suggests that private wealth creation in regimes of rapid technological change depends in large measure on honing internal technological, organizational, and managerial processes inside the firm. In short, identifying new opportunities and organizing effectively and efficiently to embrace them are generally more fundamental to private wealth creation than is strategizing, if by strategizing one means engaging in business conduct that keeps competitors off balance, raises rival's costs, and excludes new entrants. © 1997 by John Wiley & Sons, Ltd.

25,469 citations


Authors

Name                  H-index  Papers  Citations
Shizuo Akira          261      1308    320561
Michael Grätzel       248      1423    303599
Michael Karin         236      704     226485
Yi Cui                220      1015    199725
Yi Chen               217      4342    293080
Fred H. Gage          216      967     185732
Rob Knight            201      1061    253207
Hongjie Dai           197      570     182579
Martin White          196      2038    232387
Michael Marmot        193      1147    170338
David J. Schlegel     193      600     193972
Simon D. M. White     189      795     231645
George Efstathiou     187      637     156228
Michael A. Strauss    185      1688    208506
David H. Weinberg     183      700     171424

Network Information
Related Institutions (5)
Massachusetts Institute of Technology
268K papers, 18.2M citations

96% related

Cornell University
235.5K papers, 12.2M citations

94% related

Columbia University
224K papers, 12.8M citations

94% related

Stanford University
320.3K papers, 21.8M citations

94% related

University of California, San Diego
204.5K papers, 12.3M citations

94% related

Performance
Metrics
No. of papers from the Institution in previous years
Year  Papers
2022  126
2021  11,131
2020  11,774
2019  10,932
2018  9,781
2017  9,516