Author

Erkan Ozge Buzbas

Bio: Erkan Ozge Buzbas is an academic researcher at the University of Idaho. The author has contributed to research on the topics Population and Approximate Bayesian computation. The author has an h-index of 9 and has co-authored 25 publications receiving 562 citations. Previous affiliations of Erkan Ozge Buzbas include the University of Michigan and Stanford University.

Papers
Journal ArticleDOI
TL;DR: A statistical test based on a measure of haplotype homozygosity (H12) is developed that detects both hard and soft sweeps with similar power; H12 is then used to identify multiple genomic regions that have undergone recent and strong adaptation in a large population sample of fully sequenced Drosophila melanogaster strains from the DGRP.
Abstract: Adaptation from standing genetic variation or recurrent de novo mutation in large populations should commonly generate soft rather than hard selective sweeps. In contrast to a hard selective sweep, in which a single adaptive haplotype rises to high population frequency, in a soft selective sweep multiple adaptive haplotypes sweep through the population simultaneously, producing distinct patterns of genetic variation in the vicinity of the adaptive site. Current statistical methods were expressly designed to detect hard sweeps and most lack power to detect soft sweeps. This is particularly unfortunate for the study of adaptation in species such as Drosophila melanogaster, where all three confirmed cases of recent adaptation resulted in soft selective sweeps and where there is evidence that the effective population size relevant for recent and strong adaptation is large enough to generate soft sweeps even when adaptation requires mutation at a specific single site at a locus. Here, we develop a statistical test based on a measure of haplotype homozygosity (H12) that is capable of detecting both hard and soft sweeps with similar power. We use H12 to identify multiple genomic regions that have undergone recent and strong adaptation in a large population sample of fully sequenced Drosophila melanogaster strains from the Drosophila Genetic Reference Panel (DGRP). Visual inspection of the top 50 candidates reveals that in all cases multiple haplotypes are present at high frequencies, consistent with signatures of soft sweeps. We further develop a second haplotype homozygosity statistic (H2/H1) that, in combination with H12, is capable of differentiating hard from soft sweeps. Surprisingly, we find that the H12 and H2/H1 values for all top 50 peaks are much more easily generated by soft rather than hard sweeps. We discuss the implications of these results for the study of adaptation in Drosophila and in species with large census population sizes.

394 citations
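
The abstract above names the H12 and H2/H1 statistics without writing them out. As a minimal sketch, assuming the definitions commonly given for these statistics (H1 is the sum of squared haplotype frequencies in a window, H12 pools the two most frequent haplotypes before squaring, and H2 is H1 without the most frequent haplotype), they can be computed as follows. The function name and the toy haplotype labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def haplotype_homozygosity(haplotypes):
    """Return (H1, H12, H2/H1) for the haplotypes observed in one window.

    `haplotypes` is a list of hashable haplotype labels, one per sampled
    chromosome. This is an illustrative sketch, not the authors' code.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")

    # Haplotype frequencies, sorted from most to least common.
    freqs = sorted((c / n for c in Counter(haplotypes).values()), reverse=True)
    p1 = freqs[0]
    p2 = freqs[1] if len(freqs) > 1 else 0.0

    h1 = sum(p * p for p in freqs)   # standard haplotype homozygosity
    h12 = h1 + 2.0 * p1 * p2         # equals (p1 + p2)^2 + sum of p_i^2 for i > 2
    h2 = h1 - p1 * p1                # homozygosity without the top haplotype
    return h1, h12, h2 / h1


# Toy windows: one dominant haplotype versus two co-dominant haplotypes.
hard_like = ["A"] * 8 + ["B", "C"]
soft_like = ["A"] * 4 + ["B"] * 4 + ["C", "D"]

hard_stats = haplotype_homozygosity(hard_like)
soft_stats = haplotype_homozygosity(soft_like)
```

In the toy data both windows have elevated H12, but H2/H1 is small when one haplotype dominates and larger when two haplotypes share high frequency, which is the contrast the abstract describes for separating hard from soft sweeps.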

Journal ArticleDOI
15 May 2019-PLOS ONE
TL;DR: It is shown that the scientific process may not converge to truth even if scientific results are reproducible and that irreproducible results do not necessarily imply untrue results.
Abstract: Consistent confirmations obtained independently of each other lend credibility to a scientific result. We refer to results satisfying this consistency as reproducible and assume that reproducibility is a desirable property of scientific discovery. Yet seemingly science also progresses despite irreproducible results, indicating that the relationship between reproducibility and other desirable properties of scientific discovery is not well understood. These properties include early discovery of truth, persistence on truth once it is discovered, and time spent on truth in a long-term scientific inquiry. We build a mathematical model of scientific discovery that presents a viable framework to study its desirable properties including reproducibility. In this framework, we assume that scientists adopt a model-centric approach to discover the true model generating data in a stochastic process of scientific discovery. We analyze the properties of this process using Markov chain theory, Monte Carlo methods, and agent-based modeling. We show that the scientific process may not converge to truth even if scientific results are reproducible and that irreproducible results do not necessarily imply untrue results. The proportion of different research strategies represented in the scientific population, scientists’ choice of methodology, the complexity of truth, and the strength of signal contribute to this counter-intuitive finding. Important insights include that innovative research speeds up the discovery of scientific truth by facilitating the exploration of model space and epistemic diversity optimizes across desirable properties of scientific discovery.

51 citations

Posted ContentDOI
28 Apr 2020-bioRxiv
TL;DR: A formal statistical analysis of three popular claims in the metascientific literature is presented, showing how the use and benefits of such formalism can inform and shape debates about such methodological claims.
Abstract: Current attempts at methodological reform in sciences come in response to an overall lack of rigor in methodological and scientific practices in experimental sciences. However, most methodological reform attempts suffer from similar mistakes and over-generalizations to the ones they aim to address. We argue that this can be attributed in part to lack of formalism and first principles. Considering the costs of allowing false claims to become canonized, we argue for formal statistical rigor and scientific nuance in methodological reform. To attain this rigor and nuance, we propose a five-step formal approach for solving methodological problems. To illustrate the use and benefits of such formalism, we present a formal statistical analysis of three popular claims in the metascientific literature: (a) that reproducibility is the cornerstone of science; (b) that data must not be used twice in any analysis; and (c) that exploratory projects imply poor statistical practice. We show how our formal approach can inform and shape debates about such methodological claims.

51 citations

Journal ArticleDOI
TL;DR: To the Editor: Human embryonic stem-cell research may lead to new methods of drug discovery, insights into mechanisms of disease, and eventually, cellular therapies, but investigators have been unable to target their research to diverse subgroups of existing lines or to ensure the inclusion of lines from the human populations most relevant to their diseases of interest.
Abstract: To the Editor: Human embryonic stem-cell research may lead to new methods of drug discovery, insights into mechanisms of disease, and eventually, cellular therapies. The potential benefit to patient populations may depend partially on the diversity of the stem-cell lines that are available for research and clinical use. However, investigators have been unable to target their research to diverse subgroups of existing lines or to ensure the inclusion of lines from the human populations most relevant to their diseases of interest, because almost no information has been available on the human population origin of existing stem-cell lines. Therefore, with the . . .

46 citations

Journal ArticleDOI
TL;DR: This work presents "approximate approximate Bayesian computation" (AABC), a class of computationally fast inference methods that extends ABC to models in which simulating data is expensive, and demonstrates the performance of AABC on a population-genetic model of natural selection, as well as on a model of the admixture history of hybrid populations.

36 citations
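
For context on what AABC extends, the sketch below shows plain rejection ABC, not AABC itself: draw a parameter from the prior, simulate data, and keep the parameter only if a summary of the simulated data is close to the observed summary. The toy model (a normal mean with known standard deviation), the uniform prior, the tolerance, and all function names are assumptions chosen for illustration.

```python
import random
import statistics


def abc_rejection(observed, prior_sampler, simulator, summary, distance,
                  tolerance, n_draws, rng):
    """Basic rejection ABC: return the prior draws whose simulated
    summaries fall within `tolerance` of the observed summary."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sampler(rng)                # draw from the prior
        s_sim = summary(simulator(theta, rng))    # simulate and summarise
        if distance(s_sim, s_obs) <= tolerance:   # accept if close enough
            accepted.append(theta)
    return accepted


# Toy problem (assumed for illustration): infer the mean of a normal
# distribution with known standard deviation 1 from 50 observations.
rng = random.Random(0)
observed = [rng.gauss(2.0, 1.0) for _ in range(50)]

accepted = abc_rejection(
    observed=observed,
    prior_sampler=lambda r: r.uniform(-5.0, 5.0),
    simulator=lambda theta, r: [r.gauss(theta, 1.0) for _ in range(50)],
    summary=statistics.fmean,
    distance=lambda a, b: abs(a - b),
    tolerance=0.1,
    n_draws=5000,
    rng=rng,
)

posterior_mean = statistics.fmean(accepted)
```

Every draw here costs one full simulation, and most draws are rejected. That cost is the situation the TL;DR describes as the motivation for AABC, where simulating data is expensive.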


Cited by
Journal Article
Fumio Tajima1
30 Oct 1989-Genomics
TL;DR: It is suggested that the natural selection against large insertion/deletion is so weak that a large amount of variation is maintained in a population.

11,521 citations

Journal ArticleDOI
19 Jan 2012-Nature
TL;DR: The ability to restore pluripotency to somatic cells through the ectopic co-expression of reprogramming factors has created powerful new opportunities for modelling human diseases and offers hope for personalized regenerative cell therapies.
Abstract: The field of stem-cell biology has been catapulted forward by the startling development of reprogramming technology. The ability to restore pluripotency to somatic cells through the ectopic co-expression of reprogramming factors has created powerful new opportunities for modelling human diseases and offers hope for personalized regenerative cell therapies. While the field is racing ahead, some researchers are pausing to evaluate whether induced pluripotent stem cells are indeed the true equivalents of embryonic stem cells and whether subtle differences between these types of cell might affect their research applications and therapeutic potential.

1,064 citations

Book ChapterDOI
01 Jan 2004
TL;DR: To study the operational behaviour of λ-terms, this work will use the denotational (mathematical) approach to choose a space of semantics values, or denotations, where terms are to be interpreted.
Abstract: To study the operational behaviour of λ-terms, we will use the denotational (mathematical) approach. A denotational semantics for a language is based on the choice of a space of semantics values, or denotations, where terms are to be interpreted. Choosing a space with nice mathematical properties can help in proving the semantic properties of terms, since to this aim standard mathematical techniques can be used.

880 citations

01 Jan 2013
TL;DR: Four rationales for sharing data are examined, drawing examples from the sciences, social sciences, and humanities: to reproduce or to verify research, to make results of publicly funded research available to the public, to enable others to ask new questions of extant data, and to advance the state of research and innovation.
Abstract: We must all accept that science is data and that data are science, and thus provide for, and justify the need for the support of, much-improved data curation. (Hanson, Sugden, & Alberts) Researchers are producing an unprecedented deluge of data by using new methods and instrumentation. Others may wish to mine these data for new discoveries and innovations. However, research data are not readily available as sharing is common in only a few fields such as astronomy and genomics. Data sharing practices in other fields vary widely. Moreover, research data take many forms, are handled in many ways, using many approaches, and often are difficult to interpret once removed from their initial context. Data sharing is thus a conundrum. Four rationales for sharing data are examined, drawing examples from the sciences, social sciences, and humanities: (1) to reproduce or to verify research, (2) to make results of publicly funded research available to the public, (3) to enable others to ask new questions of extant data, and (4) to advance the state of research and innovation. These rationales differ by the arguments for sharing, by beneficiaries, and by the motivations and incentives of the many stakeholders involved. The challenges are to understand which data might be shared, by whom, with whom, under what conditions, why, and to what effects. Answers will inform data policy and practice. © 2012 Wiley Periodicals, Inc.

634 citations

Journal ArticleDOI
Katherine Amps1, Peter W. Andrews1, George Anyfantis2, Lyle Armstrong2, Stuart Avery3, Hossein Baharvand4, Julie C. Baker5, Duncan Baker6, Maria D. Barbadillo Muñoz7, Stephen J. Beil8, Nissim Benvenisty9, Dalit Ben-Yosef10, Juan Carlos Biancotti11, Alexis Bosman12, Romulo M. Brena8, Daniel R. Brison13, Gunilla Caisander, María V. Camarasa14, Jieming Chen15, Eric Chiao5, Young Min Choi16, Andre Choo, D.M. Collins, Alan Colman3, Jeremy M. Crook3, George Q. Daley17, Anne Dalton6, Paul A. De Sousa18, Chris Denning7, J.M. Downie, Petr Dvorak19, Karen Dyer Montgomery20, Anis Feki, Angela Ford1, Victoria Fox8, Ana Maria Fraga21, Tzvia Frumkin10, Lin Ge22, Paul J. Gokhale1, Tamar Golan-Lev9, Hamid Gourabi4, Michal Gropp, Lu GuangXiu22, Aleš Hampl19, Katie Harron23, Lyn Healy, Wishva Herath15, Frida Holm24, Outi Hovatta24, Johan Hyllner, Maneesha S. Inamdar25, Astrid K. Irwanto15, Tetsuya Ishii26, Marisa Jaconi12, Ying Jin27, Susan J. Kimber14, Sergey Kiselev28, Barbara B. Knowles3, Oded Kopper9, Valeri Kukharenko, Anver Kuliev, Maria A. Lagarkova29, Peter W. Laird8, Majlinda Lako2, Andrew L. Laslett, Neta Lavon11, Dong Ryul Lee, Jeoung Eun Lee, Chunliang Li27, Linda S. Lim15, Tenneille Ludwig20, Yu Ma27, Edna Maltby6, Ileana Mateizel30, Yoav Mayshar9, Maria Mileikovsky, Stephen L. Minger31, Takamichi Miyazaki26, Shin Yong Moon16, Harry Moore1, Christine L. Mummery32, Andras Nagy, Norio Nakatsuji26, Kavita Narwani11, Steve Oh, Sun Kyung Oh16, Cia Olson33, Timo Otonkoski33, Fei Pan8, In-Hyun Park34, Steve Pells18, Martin F. Pera8, Lygia da Veiga Pereira21, Ouyang Qi22, Grace Selva Raj3, Benjamin Reubinoff, Alan Robins, Paul Robson15, Janet Rossant35, Ghasem Hosseini Salekdeh4, Thomas C. Schulz, Karen Sermon30, Jameelah Sheik Mohamed15, Hui Shen8, Eric S Sherrer, Kuldip S. Sidhu36, Shirani Sivarajah3, Heli Skottman37, Claudia Spits30, Glyn Stacey, Raimund Strehl, Nick Strelchenko, Hirofumi Suemori26, Bowen Sun27, Riitta Suuronen37, Kazutoshi Takahashi26, Timo Tuuri33, Parvathy Venu25, Yuri Verlinsky, Dorien Ward-van Oostwaard32, Daniel J. Weisenberger8, Yue Wu31, Shinya Yamanaka26, Lorraine E. Young7, Qi Zhou38
TL;DR: Single-nucleotide polymorphism analysis revealed that the analyzed cell lines included representatives of most major ethnic groups; of the three genes in a minimal amplicon on chromosome 20q11.21 (ID1, BCL2L1 and HM13), BCL2L1 is a strong candidate for driving culture adaptation of ES cells.
Abstract: The International Stem Cell Initiative analyzed 125 human embryonic stem (ES) cell lines and 11 induced pluripotent stem (iPS) cell lines, from 38 laboratories worldwide, for genetic changes occurring during culture. Most lines were analyzed at an early and late passage. Single-nucleotide polymorphism (SNP) analysis revealed that they included representatives of most major ethnic groups. Most lines remained karyotypically normal, but there was a progressive tendency to acquire changes on prolonged culture, commonly affecting chromosomes 1, 12, 17 and 20. DNA methylation patterns changed haphazardly with no link to time in culture. Structural variants, determined from the SNP arrays, also appeared sporadically. No common variants related to culture were observed on chromosomes 1, 12 and 17, but a minimal amplicon in chromosome 20q11.21, including three genes expressed in human ES cells, ID1, BCL2L1 and HM13, occurred in >20% of the lines. Of these genes, BCL2L1 is a strong candidate for driving culture adaptation of ES cells.

506 citations