Institution
University of California, Berkeley
Education•Berkeley, California, United States•
About: University of California, Berkeley is a education organization based out in Berkeley, California, United States. It is known for research contribution in the topics: Population & Galaxy. The organization has 128244 authors who have published 265680 publications receiving 16824174 citations. The organization is also known as: University of California Berkeley & UC Berkeley.
Topics: Population, Galaxy, Poison control, Gene, Supernova
Papers published on a yearly basis
Papers
More filters
[...]
TL;DR: Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the forest, and are also applicable to regression.
Abstract: Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error for forests converges a.s. to a limit as the number of trees in the forest becomes large. The generalization error of a forest of tree classifiers depends on the strength of the individual trees in the forest and the correlation between them. Using a random selection of features to split each node yields error rates that compare favorably to Adaboost (Y. Freund & R. Schapire, Machine Learning: Proceedings of the Thirteenth International conference, aaa, 148–156), but are more robust with respect to noise. Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the splitting. Internal estimates are also used to measure variable importance. These ideas are also applicable to regression.
58,232 citations
[...]
TL;DR: In this article, the product-limit (PL) estimator was proposed to estimate the proportion of items in the population whose lifetimes would exceed t (in the absence of such losses), without making any assumption about the form of the function P(t).
Abstract: In lifetesting, medical follow-up, and other fields the observation of the time of occurrence of the event of interest (called a death) may be prevented for some of the items of the sample by the previous occurrence of some other event (called a loss). Losses may be either accidental or controlled, the latter resulting from a decision to terminate certain observations. In either case it is usually assumed in this paper that the lifetime (age at death) is independent of the potential loss time; in practice this assumption deserves careful scrutiny. Despite the resulting incompleteness of the data, it is desired to estimate the proportion P(t) of items in the population whose lifetimes would exceed t (in the absence of such losses), without making any assumption about the form of the function P(t). The observation for each item of a suitable initial event, marking the beginning of its lifetime, is presupposed. For random samples of size N the product-limit (PL) estimate can be defined as follows: L...
51,084 citations
[...]
TL;DR: This work proposes a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model.
Abstract: We describe latent Dirichlet allocation (LDA), a generative probabilistic model for collections of discrete data such as text corpora. LDA is a three-level hierarchical Bayesian model, in which each item of a collection is modeled as a finite mixture over an underlying set of topics. Each topic is, in turn, modeled as an infinite mixture over an underlying set of topic probabilities. In the context of text modeling, the topic probabilities provide an explicit representation of a document. We present efficient approximate inference techniques based on variational methods and an EM algorithm for empirical Bayes parameter estimation. We report results in document modeling, text classification, and collaborative filtering, comparing to a mixture of unigrams model and the probabilistic LSI model.
27,392 citations
Proceedings Article•
[...]
TL;DR: This paper proposed a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams, and Hof-mann's aspect model, also known as probabilistic latent semantic indexing (pLSI).
Abstract: We propose a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams [6], and Hof-mann's aspect model, also known as probabilistic latent semantic indexing (pLSI) [3]. In the context of text modeling, our model posits that each document is generated as a mixture of topics, where the continuous-valued mixture proportions are distributed as a latent Dirichlet random variable. Inference and learning are carried out efficiently via variational algorithms. We present empirical results on applications of this model to problems in text modeling, collaborative filtering, and text classification.
25,546 citations
[...]
TL;DR: The dynamic capabilities framework as mentioned in this paper analyzes the sources and methods of wealth creation and capture by private enterprise firms operating in environments of rapid technological change, and suggests that private wealth creation in regimes of rapid technology change depends in large measure on honing intemal technological, organizational, and managerial processes inside the firm.
Abstract: The dynamic capabilities framework analyzes the sources and methods of wealth creation and capture by private enterprise firms operating in environments of rapid technological change. The competitive advantage of firms is seen as resting on distinctive processes (ways of coordinating and combining), shaped by the firm's (specific) asset positions (such as the firm's portfolio of difftcult-to- trade knowledge assets and complementary assets), and the evolution path(s) it has aflopted or inherited. The importance of path dependencies is amplified where conditions of increasing retums exist. Whether and how a firm's competitive advantage is eroded depends on the stability of market demand, and the ease of replicability (expanding intemally) and imitatability (replication by competitors). If correct, the framework suggests that private wealth creation in regimes of rapid technological change depends in large measure on honing intemal technological, organizational, and managerial processes inside the firm. In short, identifying new opportunities and organizing effectively and efficiently to embrace them are generally more fundamental to private wealth creation than is strategizing, if by strategizing one means engaging in business conduct that keeps competitors off balance, raises rival's costs, and excludes new entrants. © 1997 by John Wiley & Sons, Ltd.
25,469 citations
Authors
Showing all 128244 results
Name | H-index | Papers | Citations |
---|---|---|---|
Shizuo Akira | 261 | 1308 | 320561 |
Michael Grätzel | 248 | 1423 | 303599 |
Michael Karin | 236 | 704 | 226485 |
Yi Cui | 220 | 1015 | 199725 |
Yi Chen | 217 | 4342 | 293080 |
Fred H. Gage | 216 | 967 | 185732 |
Rob Knight | 201 | 1061 | 253207 |
Hongjie Dai | 197 | 570 | 182579 |
Martin White | 196 | 2038 | 232387 |
Michael Marmot | 193 | 1147 | 170338 |
David J. Schlegel | 193 | 600 | 193972 |
Simon D. M. White | 189 | 795 | 231645 |
George Efstathiou | 187 | 637 | 156228 |
Michael A. Strauss | 185 | 1688 | 208506 |
David H. Weinberg | 183 | 700 | 171424 |