arXiv: Applications

About: arXiv: Applications is an academic journal. The journal publishes mainly in the areas of Population and Estimator. Over its lifetime, 6,700 publications have been published, receiving 47,774 citations.

Topics: Population, Estimator, Bayesian probability, Bayesian inference, Inference

Papers

Journal Article•DOI•

Random survival forests

Hemant Ishwaran¹, Udaya B. Kogalur¹, Eugene H. Blackstone, Michael S. Lauer

University of Miami¹

11 Nov 2008-arXiv: Applications

TL;DR: Random survival forests (RSF) is a random forests method for the analysis of right-censored survival data. The paper introduces new survival splitting rules, a missing-data imputation algorithm, and a conservation-of-events principle that is used to define ensemble mortality.

Abstract: We introduce random survival forests, a random forests method for the analysis of right-censored survival data. New survival splitting rules for growing survival trees are introduced, as is a new missing data algorithm for imputing missing data. A conservation-of-events principle for survival forests is introduced and used to define ensemble mortality, a simple interpretable measure of mortality that can be used as a predicted outcome. Several illustrative examples are given, including a case study of the prognostic implications of body mass for individuals with coronary artery disease. Computations for all examples were implemented using the freely available R-software package, randomSurvivalForest.
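
The conservation-of-events idea can be illustrated in a simplified, single-sample setting. The Python sketch below is not the authors' forest algorithm and does not use the randomSurvivalForest package. It assumes the Nelson-Aalen estimator of the cumulative hazard and hypothetical toy data, and checks that summing the estimated cumulative hazard at each individual's observed time recovers the total number of deaths.

```python
def nelson_aalen(times, events):
    """Return H(t), the Nelson-Aalen cumulative hazard estimate.

    times  : observed times (event or censoring)
    events : 1 if the time is a death, 0 if right-censored
    """
    event_times = sorted({t for t, d in zip(times, events) if d == 1})
    increments = []
    for tj in event_times:
        deaths = sum(1 for t, d in zip(times, events) if t == tj and d == 1)
        at_risk = sum(1 for t in times if t >= tj)
        increments.append((tj, deaths / at_risk))

    def cumulative_hazard(t):
        # Sum the hazard increments at all event times up to and including t.
        return sum(inc for tj, inc in increments if tj <= t)

    return cumulative_hazard


# Hypothetical toy data: seven individuals, five deaths, two censored.
times = [2, 3, 3, 5, 8, 8, 9]
events = [1, 0, 1, 1, 0, 1, 1]

H = nelson_aalen(times, events)
total_hazard = sum(H(t) for t in times)
total_deaths = sum(events)
print(total_hazard, total_deaths)
```

In this simplified setting the two totals agree up to floating-point error, which is why a sum of cumulative hazard values can be read as an expected number of deaths.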

1,562 citations

Journal Article•DOI•

A correlated topic model of Science

David M. Blei, John Lafferty

27 Aug 2007-arXiv: Applications

TL;DR: The correlated topic model (CTM) is developed, in which the topic proportions exhibit correlation via the logistic normal distribution, and its use as an exploratory tool for large document collections is demonstrated.

Abstract: Topic models, such as latent Dirichlet allocation (LDA), can be useful tools for the statistical analysis of document collections and other discrete data. The LDA model assumes that the words of each document arise from a mixture of topics, each of which is a distribution over the vocabulary. A limitation of LDA is the inability to model topic correlation even though, for example, a document about genetics is more likely to also be about disease than X-ray astronomy. This limitation stems from the use of the Dirichlet distribution to model the variability among the topic proportions. In this paper we develop the correlated topic model (CTM), where the topic proportions exhibit correlation via the logistic normal distribution [J. Roy. Statist. Soc. Ser. B 44 (1982) 139--177]. We derive a fast variational inference algorithm for approximate posterior inference in this model, which is complicated by the fact that the logistic normal is not conjugate to the multinomial. We apply the CTM to the articles from Science published from 1990--1999, a data set that comprises 57M words. The CTM gives a better fit of the data than LDA, and we demonstrate its use as an exploratory tool of large document collections.
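
The role of the logistic normal distribution can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' variational inference algorithm: it assumes a hypothetical three-topic mean vector and covariance matrix, draws a Gaussian vector, and maps it to the simplex with a softmax so that the resulting topic proportions can be correlated through the covariance.

```python
import math
import random


def cholesky(cov):
    """Lower-triangular factor L with L L^T = cov (cov must be symmetric positive definite)."""
    n = len(cov)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(cov[i][i] - s)
            else:
                low[i][j] = (cov[i][j] - s) / low[j][j]
    return low


def softmax(eta):
    """Map a real vector to the probability simplex."""
    m = max(eta)  # subtract the maximum for numerical stability
    exps = [math.exp(e - m) for e in eta]
    total = sum(exps)
    return [e / total for e in exps]


def sample_topic_proportions(mu, cov, rng):
    """Draw eta ~ N(mu, cov) and return softmax(eta)."""
    low = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in mu]
    eta = [mu[i] + sum(low[i][k] * z[k] for k in range(i + 1))
           for i in range(len(mu))]
    return softmax(eta)


# Hypothetical parameters: topics 0 and 1 are positively correlated.
mu = [0.0, 0.0, 0.0]
cov = [[1.0, 0.8, 0.0],
       [0.8, 1.0, 0.0],
       [0.0, 0.0, 1.0]]

rng = random.Random(0)
theta = sample_topic_proportions(mu, cov, rng)
print(theta)
```

A Dirichlet draw has no parameter that plays the role of the off-diagonal entries of `cov`, which is the limitation of LDA that the abstract describes.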

1,100 citations

Journal Article•DOI•

A Tutorial on Regularized Partial Correlation Networks

Sacha Epskamp¹, Eiko I. Fried¹

University of Amsterdam¹

05 Jul 2016-arXiv: Applications

TL;DR: In this article, the authors describe how regularization techniques can be used to efficiently estimate a parsimonious and interpretable network structure in psychological data, and demonstrate the method in an empirical example on post-traumatic stress disorder data.

Abstract: Recent years have seen an emergence of network modeling applied to moods, attitudes, and problems in the realm of psychology. In this framework, psychological variables are understood to directly affect each other rather than being caused by an unobserved latent entity. In this tutorial, we introduce the reader to estimating the most popular network model for psychological data: the partial correlation network. We describe how regularization techniques can be used to efficiently estimate a parsimonious and interpretable network structure in psychological data. We show how to perform these analyses in R and demonstrate the method in an empirical example on post-traumatic stress disorder data. In addition, we discuss the effect of the hyperparameter that needs to be manually set by the researcher, how to handle non-normal data, how to determine the required sample size for a network analysis, and provide a checklist with potential solutions for problems that can arise when estimating regularized partial correlation networks.
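
As a rough illustration of what a partial correlation network encodes, the Python sketch below computes partial correlations from the inverse of a covariance matrix. It is a minimal, unregularized sketch under stated assumptions: the tutorial itself works in R and applies regularization, which is omitted here, and the three-variable correlation matrix is hypothetical.

```python
import math


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def partial_correlations(cov):
    """Partial correlation of i and j given all other variables.

    Uses rho_ij = -P_ij / sqrt(P_ii * P_jj), where P is the inverse covariance.
    """
    prec = invert(cov)
    n = len(cov)
    return [[1.0 if i == j
             else -prec[i][j] / math.sqrt(prec[i][i] * prec[j][j])
             for j in range(n)]
            for i in range(n)]


# Hypothetical chain X -> Y -> Z: X and Z correlate only through Y.
corr = [[1.00, 0.50, 0.25],
        [0.50, 1.00, 0.50],
        [0.25, 0.50, 1.00]]

pcor = partial_correlations(corr)
for row in pcor:
    print([round(v, 3) for v in row])
```

In this example X and Z have a marginal correlation of 0.25 but a partial correlation of zero, so the network would contain no edge between them. Regularized estimators aim to set such small edges to exactly zero when working from noisy sample data.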

839 citations

Journal Article•DOI•

Homophily and Contagion Are Generically Confounded in Observational Social Network Studies

Cosma Rohilla Shalizi¹, Andrew C. Thomas¹

Carnegie Mellon University¹

27 Apr 2010-arXiv: Applications

TL;DR: The authors demonstrate, with simple examples, that asymmetries in regression coefficients cannot identify causal effects and that very simple models of imitation can produce substantial correlations between an individual’s enduring traits and his or her choices, even when there is no intrinsic affinity between them.

Abstract: We consider processes on social networks that can potentially involve three factors: homophily, or the formation of social ties due to matching individual traits; social contagion, also known as social influence; and the causal effect of an individual's covariates on their behavior or other measurable responses. We show that, generically, all of these are confounded with each other. Distinguishing them from one another requires strong assumptions on the parametrization of the social process or on the adequacy of the covariates used (or both). In particular we demonstrate, with simple examples, that asymmetries in regression coefficients cannot identify causal effects, and that very simple models of imitation (a form of social contagion) can produce substantial correlations between an individual's enduring traits and their choices, even when there is no intrinsic affinity between them. We also suggest some possible constructive responses to these results.

763 citations

Journal Article•DOI•

Measuring reproducibility of high-throughput experiments

Qiang Li, James B. Brown¹, Haiyan Huang¹, Peter J. Bickel¹

University of California, Berkeley¹

21 Oct 2011-arXiv: Applications

TL;DR: This work proposes a unified approach to measure the reproducibility of findings identified from replicate experiments and to identify putative discoveries using reproducibility. The approach creates a curve that quantitatively assesses when the findings are no longer consistent across replicates.

Abstract: Reproducibility is essential to reliable scientific discovery in high-throughput experiments. In this work we propose a unified approach to measure the reproducibility of findings identified from replicate experiments and identify putative discoveries using reproducibility. Unlike the usual scalar measures of reproducibility, our approach creates a curve, which quantitatively assesses when the findings are no longer consistent across replicates. Our curve is fitted by a copula mixture model, from which we derive a quantitative reproducibility score, which we call the "irreproducible discovery rate" (IDR) analogous to the FDR. This score can be computed at each set of paired replicate ranks and permits the principled setting of thresholds both for assessing reproducibility and combining replicates. Since our approach permits an arbitrary scale for each replicate, it provides useful descriptive measures in a wide variety of situations to be explored. We study the performance of the algorithm using simulations and give a heuristic analysis of its theoretical properties. We demonstrate the effectiveness of our method in a ChIP-seq experiment.

733 citations

Network Information

Related Journals (5)

Journal of Statistical Software

1.4K papers, 352.7K citations

1.2K papers, 201K citations

6.1K papers, 193.5K citations

90% related

Annals of Statistics

5.6K papers, 653K citations

89% related

arXiv: Machine Learning

12.4K papers, 260.6K citations

89% related

Performance

Metrics

Papers: 6,700

Citations: 62,903

No. of papers from the Journal in previous years
Year	Papers
2021	795
2020	974
2019	873
2018	734
2017	597
2016	523