Book Chapter
Estimating the Number of Unseen Species: How Many Words did Shakespeare Know?
Peter McCullagh
- pp 104-118
TLDR
This chapter discusses Efron and Thisted's study of the frequency distribution of words in the Shakespearean canon, including the expected number of words that occur x ≥ 1 times in a large sample of n words.
Abstract:
This paper is the first of two written by Brad Efron and Ron Thisted studying the frequency distribution of words in the Shakespearean canon. The key idea, due to Fisher in the context of sampling of species, is simple and elegant. When applied to Shakespeare, the idea appears to be preposterous: an author has a personal vocabulary of word species represented by a distribution G, and text is generated by sampling from this distribution. Most results do not require successive words to be sampled independently, which leaves room for individual style and context, but stationarity is needed for prediction and inference. The expected number of words that occur x ≥ 1 times in a large sample of n words is …
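The sampling model in the abstract can be illustrated with a small simulation. This is a minimal sketch, not the method of Efron and Thisted: the Zipf-like weights, the vocabulary size `V`, the sample size, and the function names are all hypothetical choices made for illustration, and words are drawn independently only for simplicity (the abstract notes that most results do not require independence). It tabulates how many word types occur exactly x times in a sample of n words, and how many vocabulary words remain unseen.

```python
import random
from collections import Counter


def sample_text(vocab_weights, n, seed=0):
    """Draw n word tokens from a fixed distribution G.

    Assumption: tokens are drawn independently, which is a
    simplification of the model described in the abstract.
    """
    rng = random.Random(seed)
    words = list(vocab_weights)
    weights = [vocab_weights[w] for w in words]
    return rng.choices(words, weights=weights, k=n)


def frequency_of_frequencies(tokens):
    """Map x -> number of distinct word types occurring exactly x times."""
    type_counts = Counter(tokens)          # word -> number of occurrences
    return Counter(type_counts.values())   # x -> number of words seen x times


# Hypothetical vocabulary G with Zipf-like (1/rank) weights.
V = 5000
G = {f"w{i}": 1.0 / i for i in range(1, V + 1)}

n = 20000
tokens = sample_text(G, n)
m = frequency_of_frequencies(tokens)

seen = sum(m.values())   # distinct word types observed in the sample
unseen = V - seen        # known here only because G was simulated

print("types seen once:", m[1])
print("types seen twice:", m[2])
print("distinct types seen:", seen, "unseen:", unseen)
```

In a real corpus the vocabulary size is unknown, so `unseen` cannot be computed directly; only the counts in `m` are observed, and they are the input to any estimate of the unseen species.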
Citations
Journal Article
Comprehensive assessment of T-cell receptor β-chain diversity in αβ T cells
Harlan Robins, Paulo Vidal Campregher, Santosh Srivastava, Abigail Wacher, Cameron J. Turtle, Orsalem J. Kahsai, Stanley R. Riddell, Edus H. Warren, Christopher S. Carlson
TL;DR: A novel experimental and computational approach is developed to measure TCR CDR3 diversity based on single-molecule DNA sequencing. Total TCRβ receptor diversity is found to be at least 4-fold higher than previous estimates, and the diversity in the subset of CD45RO+ antigen-experienced αβ T cells is at least 10-fold higher than previous estimates.
Proceedings Article
A large-scale study of web password habits
Dinei Florencio, Cormac Herley
TL;DR: The study involved half a million users over a three-month period and yielded extremely detailed data on password strength, the types and lengths of passwords chosen, and how they vary by site.
Journal Article
VDJtools: Unifying post-analysis of T cell receptor repertoires
Mikhail Shugay, Dmitriy V. Bagaev, Maria A. Turchaninova, Dmitriy A. Bolotin, Olga V. Britanova, Ekaterina V. Putintseva, Mikhail V. Pogorelyy, Vadim I. Nazarov, Ivan V. Zvyagin, Vitalina I. Kirgizova, Kirill Kirgizov, E. V. Skorobogatova, Dmitriy M. Chudakov
TL;DR: VDJtools is reported, a complementary software suite that solves a wide range of T cell receptor (TCR) repertoire post-analysis tasks, provides detailed tabular output and publication-ready graphics, and is built on top of a flexible API.
Book
Handbook of Capture-Recapture Analysis
TL;DR: This book aims to bridge the gap between field-based biologists and statisticians as new methods are developed to deal with more complex data by helping biologists understand state-of-the-art statistical methods for capture–recapture analysis.
Journal Article
Age-Related Decrease in TCR Repertoire Diversity Measured with Deep and Normalized Sequence Profiling
Olga V. Britanova, Ekaterina V. Putintseva, Mikhail Shugay, Ekaterina M. Merzlyak, Maria A. Turchaninova, Dmitriy B. Staroverov, Dmitriy A. Bolotin, Sergey Lukyanov, Ekaterina A. Bogdanova, Ilgar Z. Mamedov, Y. B. Lebedev, Dmitriy M. Chudakov
TL;DR: It is demonstrated that TCR β diversity per 10^6 T cells decreases roughly linearly with age, with a significant reduction already apparent by age 40. The percentage of naive T cells showed a strong correlation with measured TCR diversity and decreased linearly up to age 70.
References
Journal Article
The Relation Between the Number of Species and the Number of Individuals in a Random Sample of an Animal Population
TL;DR: It is shown that in a large collection of Lepidoptera captured in Malaya the frequency of the number of species represented by different numbers of individuals fitted somewhat closely to a hyperbola type of curve, so long as only the rarer species were considered.
Journal Article
The sampling theory of selectively neutral alleles.
TL;DR: This paper considers deductive and subsequently inductive questions relating to a sample of genes from a selectively neutral locus, and the test of the hypothesis that the alleles being sampled are indeed selectively neutral will be considered.
Book
Combinatorial Stochastic Processes
TL;DR: In this paper, the Brownian forest and the additive coalescent were constructed for random walks and random forests, respectively, and the Bessel process was used for random mappings.
Journal Article
Comprehensive assessment of T-cell receptor β-chain diversity in αβ T cells
Harlan Robins, Paulo Vidal Campregher, Santosh Srivastava, Abigail Wacher, Cameron J. Turtle, Orsalem J. Kahsai, Stanley R. Riddell, Edus H. Warren, Christopher S. Carlson
TL;DR: A novel experimental and computational approach is developed to measure TCR CDR3 diversity based on single-molecule DNA sequencing. Total TCRβ receptor diversity is found to be at least 4-fold higher than previous estimates, and the diversity in the subset of CD45RO+ antigen-experienced αβ T cells is at least 10-fold higher than previous estimates.
Proceedings Article
A large-scale study of web password habits
Dinei Florencio, Cormac Herley
TL;DR: The study involved half a million users over a three-month period and yielded extremely detailed data on password strength, the types and lengths of passwords chosen, and how they vary by site.