Author

Bronwyn Carlisle

Bio: Bronwyn Carlisle is an academic researcher from the University of Otago. The author has contributed to research in the topics Canopy clustering algorithm and Determining the number of clusters in a data set. The author has an h-index of 3 and has co-authored 3 publications receiving 303 citations.

Papers
Journal ArticleDOI
TL;DR: To overcome the limitations of hard clustering, this work applied soft clustering, which offers several advantages for researchers, including greater robustness to noise and no need for a priori pre-filtering of genes.
Abstract: Clustering is an important tool in microarray data analysis. This unsupervised learning technique is commonly used to reveal structures hidden in large gene expression data sets. The vast majority of clustering algorithms applied so far produce hard partitions of the data, i.e. each gene is assigned exactly to one cluster. Hard clustering is favourable if clusters are well separated. However, this is generally not the case for microarray time-course data, where gene clusters frequently overlap. Additionally, hard clustering algorithms are often highly sensitive to noise. To overcome the limitations of hard clustering, we applied soft clustering which offers several advantages for researchers. First, it generates accessible internal cluster structures, i.e. it indicates how well corresponding clusters represent genes. This can be used for the more targeted search for regulatory elements. Second, the overall relation between clusters, and thus a global clustering structure, can be defined. Additionally, soft clustering is more noise robust and a priori pre-filtering of genes can be avoided. This prevents the exclusion of biologically relevant genes from the data analysis. Soft clustering was implemented here using the fuzzy c-means algorithm. Procedures to find optimal clustering parameters were developed. A software package for soft clustering has been developed based on the open-source statistical language R. The package called Mfuzz is freely available.

375 citations
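
The abstract above names fuzzy c-means as the soft clustering algorithm. As a rough illustration only (this is not the Mfuzz implementation, and the function names, the farthest-point initialisation, and the toy data are all assumptions made for this sketch), the alternating update of memberships and cluster centers can be written in plain Python as:

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def memberships(points, centers, m):
    """Membership of every point in every cluster; each row sums to 1."""
    rows = []
    for p in points:
        d2 = [sq_dist(p, ctr) for ctr in centers]
        if min(d2) == 0.0:
            # The point coincides with a center: full membership there.
            hits = [1.0 if d == 0.0 else 0.0 for d in d2]
            rows.append([h / sum(hits) for h in hits])
        else:
            # Standard FCM rule expressed with squared distances.
            inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
            total = sum(inv)
            rows.append([v / total for v in inv])
    return rows


def fuzzy_c_means(points, c, m=2.0, max_iter=100, tol=1e-9):
    """Return (centers, memberships) for c clusters with fuzzifier m > 1."""
    # Assumption: deterministic farthest-point initialisation, chosen here
    # only to keep the sketch reproducible.
    centers = [list(points[0])]
    while len(centers) < c:
        far = max(points, key=lambda p: min(sq_dist(p, ctr) for ctr in centers))
        centers.append(list(far))

    dim = len(points[0])
    for _ in range(max_iter):
        u = memberships(points, centers, m)
        new_centers = []
        for j in range(c):
            # Each center is the mean of all points weighted by membership^m.
            w = [row[j] ** m for row in u]
            total = sum(w)
            new_centers.append(
                [sum(wi * p[d] for wi, p in zip(w, points)) / total for d in range(dim)]
            )
        moved = max(sq_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if moved < tol:
            break
    return centers, memberships(points, centers, m)


# Toy data: two tight groups plus one point exactly halfway between them.
points = [(0, 0), (0.2, 0), (0, 0.2), (5, 5), (4.8, 5), (5, 4.8), (2.5, 2.5)]
centers, u = fuzzy_c_means(points, c=2)
for p, row in zip(points, u):
    print(p, [round(v, 3) for v in row])
```

In this toy run the points in each tight group receive memberships close to 1 for their own cluster, while the halfway point is split about evenly between the two. That graded membership is the kind of internal cluster structure that a hard partition would hide.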

Journal ArticleDOI
TL;DR: Structures consistent in size, shape and character with various stages of a Lentivirus replicative cycle were observed by electron microscopy in 12-day peripheral-blood lymphocyte cultures from 10 of 17 Chronic Fatigue Syndrome patients and not in controls.

13 citations

Journal ArticleDOI
TL;DR: The establishment of ToothPrint, a proteomic database for dental tissues accessed at http://toothprint.otago.ac.nz is reported, which should prove to be an effective bioinformatic resource for investigations of dental biology.
Abstract: Increasing demand exists to disseminate and integrate proteomic data as proteome analysis assumes a commanding role in the postgenome era. Databases on the World Wide Web are an effective means to share information obtained from two-dimensional gels and allied proteomic approaches. Here we report the establishment of ToothPrint, a proteomic database for dental tissues accessed at http://toothprint.otago.ac.nz. Using developing rat enamel as a prototype, ToothPrint provides a variety of functionally relevant data (ligand binding, subcellular localisation, developmental regulation) in addition to protein identification maps. Features designed to enhance usability of the website and simplify its computing requirements are also outlined. Customized for mineralizing tissues, ToothPrint should prove to be an effective bioinformatic resource for investigations of dental biology.

11 citations


Cited by
Journal ArticleDOI
03 Nov 2006-Cell
TL;DR: A general mass spectrometric technology is developed and applied for identification and quantitation of phosphorylation sites as a function of stimulus, time, and subcellular location to provide a missing link in a global, integrative view of cellular regulation.

3,404 citations

Journal ArticleDOI
TL;DR: An R package termed Mfuzz is constructed implementing soft clustering tools for microarray data analysis, which can overcome shortcomings of conventional hard clustering techniques and offer further advantages.
Abstract: For the analysis of microarray data, clustering techniques are frequently used. Most such methods are based on hard clustering of data, wherein one gene (or sample) is assigned to exactly one cluster. Hard clustering, however, suffers from several drawbacks such as sensitivity to noise and information loss. In contrast, soft clustering methods can assign a gene to several clusters. They can overcome shortcomings of conventional hard clustering techniques and offer further advantages. Thus, we constructed an R package termed Mfuzz implementing soft clustering tools for microarray data analysis. The additional package Mfuzzgui provides a convenient TclTk based graphical user interface. Availability: The R packages Mfuzz and Mfuzzgui are available at http://itb1.biologie.hu-berlin.de/~futschik/software/R/Mfuzz/index.html. Their distribution is subject to the GPL version 2 license.

828 citations

Journal ArticleDOI
Francine E. Garrett-Bakelman, Manjula Darshi, Stefan J. Green, Ruben C. Gur, Ling Lin, Brandon R. Macias, Miles J. McKenna, Cem Meydan, Tejaswini Mishra, Jad Nasrini, Brian D. Piening, Lindsay F. Rizzardi, Kumar Sharma, Jamila H. Siamwala, Lynn Taylor, Martha Hotz Vitaterna, Maryam Afkarian, Ebrahim Afshinnekoo, Sara Ahadi, Aditya Ambati, Maneesh Arya, Daniela Bezdan, Colin M. Callahan, Songjie Chen, Augustine M.K. Choi, George E. Chlipala, Kévin Contrepois, Marisa Covington, Brian Crucian, Immaculata De Vivo, David F. Dinges, Douglas J. Ebert, Jason I. Feinberg, Jorge Gandara, Kerry George, John Goutsias, George Grills, Alan R. Hargens, Martina Heer, Ryan P. Hillary, Andrew N. Hoofnagle, Vivian Hook, Garrett Jenkinson, Peng Jiang, Ali Keshavarzian, Steven S. Laurie, Brittany Lee-McMullen, Sarah B. Lumpkins, Matthew MacKay, Mark Maienschein-Cline, Ari Melnick, Tyler M. Moore, Kiichi Nakahira, Hemal H. Patel, Robert Pietrzyk, Varsha Rao, Rintaro Saito, Denis Salins, Jan M. Schilling, Dorothy D. Sears, Caroline Sheridan, Michael B. Stenger, Rakel Tryggvadottir, Alexander E. Urban, Tomas Vaisar, Benjamin Van Espen, Jing Zhang, Michael G. Ziegler, Sara R. Zwart, John B. Charles, Craig E. Kundrot, Graham B. I. Scott, Susan M. Bailey, Mathias Basner, Andrew P. Feinberg, Stuart M. C. Lee, Christopher E. Mason, Emmanuel Mignot, Brinda K. Rana, Scott M. Smith, Michael Snyder, Fred W. Turek
12 Apr 2019-Science
TL;DR: Given that the majority of the biological and human health variables remained stable, or returned to baseline, after a 340-day space mission, these data suggest that human health can be mostly sustained over this duration of spaceflight.
Abstract: INTRODUCTION To date, 559 humans have been flown into space, but long-duration (>300 days) missions are rare (n = 8 total). Long-duration missions that will take humans to Mars and beyond are planned by public and private entities for the 2020s and 2030s; therefore, comprehensive studies are needed now to assess the impact of long-duration spaceflight on the human body, brain, and overall physiology. The space environment is made harsh and challenging by multiple factors, including confinement, isolation, and exposure to environmental stressors such as microgravity, radiation, and noise. The selection of one of a pair of monozygotic (identical) twin astronauts for NASA’s first 1-year mission enabled us to compare the impact of the spaceflight environment on one twin to the simultaneous impact of the Earth environment on a genetically matched subject. RATIONALE The known impacts of the spaceflight environment on human health and performance, physiology, and cellular and molecular processes are numerous and include bone density loss, effects on cognitive performance, microbial shifts, and alterations in gene regulation. However, previous studies collected very limited data, did not integrate simultaneous effects on multiple systems and data types in the same subject, or were restricted to 6-month missions. Measurement of the same variables in an astronaut on a year-long mission and in his Earth-bound twin indicated the biological measures that might be used to determine the effects of spaceflight. Presented here is an integrated longitudinal, multidimensional description of the effects of a 340-day mission onboard the International Space Station. RESULTS Physiological, telomeric, transcriptomic, epigenetic, proteomic, metabolomic, immune, microbiomic, cardiovascular, vision-related, and cognitive data were collected over 25 months. 
Some biological functions were not significantly affected by spaceflight, including the immune response (T cell receptor repertoire) to the first test of a vaccination in flight. However, significant changes in multiple data types were observed in association with the spaceflight period; the majority of these eventually returned to a preflight state within the time period of the study. These included changes in telomere length, gene regulation measured in both epigenetic and transcriptional data, gut microbiome composition, body weight, carotid artery dimensions, subfoveal choroidal thickness and peripapillary total retinal thickness, and serum metabolites. In addition, some factors were significantly affected by the stress of returning to Earth, including inflammation cytokines and immune response gene networks, as well as cognitive performance. For a few measures, persistent changes were observed even after 6 months on Earth, including some genes’ expression levels, increased DNA damage from chromosomal inversions, increased numbers of short telomeres, and attenuated cognitive function. CONCLUSION Given that the majority of the biological and human health variables remained stable, or returned to baseline, after a 340-day space mission, these data suggest that human health can be mostly sustained over this duration of spaceflight. The persistence of the molecular changes (e.g., gene expression) and the extrapolation of the identified risk factors for longer missions (>1 year) remain estimates and should be demonstrated with these measures in future astronauts. Finally, changes described in this study highlight pathways and mechanisms that may be vulnerable to spaceflight and may require safeguards for longer space missions; thus, they serve as a guide for targeted countermeasures or monitoring during future missions.

538 citations

Journal ArticleDOI
TL;DR: Cellular events underlying the pluripotency of human embryonic stem cells (hESCs) are elucidated and a core hESC phosphoproteome of sites with similar robust changes in response to the two distinct treatments is identified.
Abstract: To elucidate cellular events underlying the pluripotency of human embryonic stem cells (hESCs), we performed parallel quantitative proteomic and phosphoproteomic analyses of hESCs during differentiation initiated by a diacylglycerol analog or transfer to media that had not been conditioned by feeder cells. We profiled 6521 proteins and 23,522 phosphorylation sites, of which almost 50% displayed dynamic changes in phosphorylation status during 24 hours of differentiation. These data are a resource for studies of the events associated with the maintenance of hESC pluripotency and those accompanying their differentiation. From these data, we identified a core hESC phosphoproteome of sites with similar robust changes in response to the two distinct treatments. These sites exhibited distinct dynamic phosphorylation patterns, which were linked to known or predicted kinases on the basis of the matching sequence motif. In addition to identifying previously unknown phosphorylation sites on factors associated with differentiation, such as kinases and transcription factors, we observed dynamic phosphorylation of DNA methyltransferases (DNMTs). We found a specific interaction of DNMTs during early differentiation with the PAF1 (polymerase-associated factor 1) transcriptional elongation complex, which binds to promoters of the pluripotency and known DNMT target genes encoding OCT4 and NANOG, thereby providing a possible molecular link for the silencing of these genes during differentiation.

450 citations

Journal ArticleDOI
TL;DR: This paper compares the efficacy of three different implementations of techniques aimed to extend fuzzy c-means (FCM) clustering to VL data and concludes by demonstrating the VL algorithms on a dataset with 5 billion objects and presenting a set of recommendations regarding the use of different VL FCM clustering schemes.
Abstract: Very large (VL) data or big data are any data that you cannot load into your computer's working memory. This is not an objective definition, but a definition that is easy to understand and one that is practical, because there is a dataset too big for any computer you might use; hence, this is VL data for you. Clustering is one of the primary tasks used in the pattern recognition and data mining communities to search VL databases (including VL images) in various applications, and so, clustering algorithms that scale well to VL data are important and useful. This paper compares the efficacy of three different implementations of techniques aimed to extend fuzzy c-means (FCM) clustering to VL data. Specifically, we compare methods that are based on 1) sampling followed by noniterative extension; 2) incremental techniques that make one sequential pass through subsets of the data; and 3) kernelized versions of FCM that provide approximations based on sampling, including three proposed algorithms. We use both loadable and VL datasets to conduct the numerical experiments that facilitate comparisons based on time and space complexity, speed, quality of approximations to batch FCM (for loadable data), and assessment of matches between partitions and ground truth. Empirical results show that random sampling plus extension FCM, bit-reduced FCM, and approximate kernel FCM are good choices to approximate FCM for VL data. We conclude by demonstrating the VL algorithms on a dataset with 5 billion objects and presenting a set of recommendations regarding the use of different VL FCM clustering schemes.

424 citations
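
The first family of methods in the abstract above, sampling followed by noniterative extension, can be illustrated with a small sketch. It is not the paper's algorithm or code: the function names, the restriction to scalar data and two clusters, the min/max initialisation, and the synthetic data are all assumptions made to keep the example short. The idea shown is to run batch FCM on a random sample only, then compute memberships for the full data set in a single pass from the fitted centers.

```python
import random


def memberships_1d(xs, centers, m=2.0):
    """FCM membership rows for scalar data, given fixed centers."""
    rows = []
    for x in xs:
        # Floor the squared distance so a point sitting on a center is safe.
        inv = [max((x - c) ** 2, 1e-12) ** (-1.0 / (m - 1.0)) for c in centers]
        total = sum(inv)
        rows.append([v / total for v in inv])
    return rows


def fcm_1d(xs, centers, m=2.0, max_iter=100, tol=1e-9):
    """Batch FCM on scalar data, starting from the given centers."""
    for _ in range(max_iter):
        u = memberships_1d(xs, centers, m)
        new = []
        for j in range(len(centers)):
            w = [row[j] ** m for row in u]
            new.append(sum(wi * x for wi, x in zip(w, xs)) / sum(w))
        moved = max(abs(a - b) for a, b in zip(centers, new))
        centers = new
        if moved < tol:
            break
    return centers


def sample_then_extend(xs, sample_size, seed=0):
    """Fit two centers on a random sample, then extend to all of xs."""
    sample = random.Random(seed).sample(xs, sample_size)
    # Assumption: two clusters, initialised at the sample extremes.
    centers = fcm_1d(sample, [min(sample), max(sample)])
    # Noniterative extension: one membership pass over the full data.
    return centers, memberships_1d(xs, centers)


# Synthetic stand-in for a large data set: one group near 0, one near 10.
data = [i * 0.001 for i in range(1000)] + [10 + i * 0.001 for i in range(1000)]
centers, u = sample_then_extend(data, sample_size=50)
print("centers:", [round(c, 3) for c in centers])
print("membership of data[0]:", [round(v, 3) for v in u[0]])
```

Only the 50 sampled values enter the iterative loop; the remaining points are touched once, which is what makes this style of approach attractive when the full data set cannot be held in working memory.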