Author

Pengyi Yang

Bio: Pengyi Yang is an academic researcher from the University of Sydney. The author has contributed to research in topics: Computer science & Biology. The author has an h-index of 25 and has co-authored 103 publications receiving 2916 citations. Previous affiliations of Pengyi Yang include Garvan Institute of Medical Research & Southwest University.


Papers
Journal Article
TL;DR: This article provides a review of the most widely used ensemble learning methods and their application in various bioinformatics problems, including the main topics of gene expression, mass spectrometry-based proteomics, gene-gene interaction identification from genome-wide association studies, and prediction of regulatory elements from DNA and protein sequences.
Abstract: Ensemble learning is an intensively studied technique in machine learning and pattern recognition. Recent work in computational biology has seen an increasing use of ensemble learning methods due to their unique advantages in dealing with small sample sizes, high dimensionality, and complex data structures. The aim of this article is two-fold. First, it is to provide a review of the most widely used ensemble learning methods and their application in various bioinformatics problems, including the main topics of gene expression, mass spectrometry-based proteomics, gene-gene interaction identification from genome-wide association studies, and prediction of regulatory elements from DNA and protein sequences. Second, we try to identify and summarize future trends of ensemble methods in bioinformatics. Promising directions such as ensembles of support vector machines, meta-ensembles, and ensemble-based feature selection are discussed.

436 citations

Journal Article
TL;DR: The dynamic phosphoproteome described here contains numerous phosphorylation sites on proteins involved in diverse molecular functions and should serve as a useful functional resource for cell biologists.

333 citations

Journal Article
TL;DR: A global analysis of protein phosphorylation in human skeletal muscle biopsies from untrained healthy males before and after a single high-intensity exercise bout revealed 1,004 unique exercise-regulated phosphosites on 562 proteins, exposing the unexplored complexity of acute exercise signaling.

293 citations

Journal Article
TL;DR: An essential role for DNMT1 is uncovered in MaSC and CSC maintenance and the DNMT1-ISL1 axis is identified as a potential therapeutic target for breast cancer treatment.
Abstract: Mammary stem/progenitor cells (MaSCs) maintain self-renewal of the mammary epithelium during puberty and pregnancy. DNA methylation provides a potential epigenetic mechanism for maintaining cellular memory during self-renewal. Although DNA methyltransferases (DNMTs) are dispensable for embryonic stem cell maintenance, their role in maintaining MaSCs and cancer stem cells (CSCs) in constantly replenishing mammary epithelium is unclear. Here we show that DNMT1 is indispensable for MaSC maintenance. Furthermore, we find that DNMT1 expression is elevated in mammary tumours, and mammary gland-specific DNMT1 deletion protects mice from mammary tumorigenesis by limiting the CSC pool. Through genome-scale methylation studies, we identify ISL1 as a direct DNMT1 target, hypermethylated and downregulated in mammary tumours and CSCs. DNMT inhibition or ISL1 expression in breast cancer cells limits the CSC population. Altogether, our studies uncover an essential role for DNMT1 in MaSC and CSC maintenance and identify the DNMT1-ISL1 axis as a potential therapeutic target for breast cancer treatment.

200 citations

Journal Article
TL;DR: It is shown that an mRNA 3′ processing factor, Fip1, is essential for embryonic stem cell self‐renewal and somatic cell reprogramming and mechanistic insight is provided on APA regulation in development and an important function for APA in cell fate specification is established.
Abstract: mRNA alternative polyadenylation (APA) plays a critical role in post-transcriptional gene control and is highly regulated during development and disease. However, the regulatory mechanisms and functional consequences of APA remain poorly understood. Here, we show that an mRNA 3′ processing factor, Fip1, is essential for embryonic stem cell (ESC) self-renewal and somatic cell reprogramming. Fip1 promotes stem cell maintenance, in part, by activating the ESC-specific APA profiles to ensure the optimal expression of a specific set of genes, including critical self-renewal factors. Fip1 expression and the Fip1-dependent APA program change during ESC differentiation and are restored to an ESC-like state during somatic reprogramming. Mechanistically, we provide evidence that the specificity of Fip1-mediated APA regulation depends on multiple factors, including Fip1-RNA interactions and the distance between APA sites. Together, our data highlight the role for post-transcriptional control in stem cell self-renewal, provide mechanistic insight on APA regulation in development, and establish an important function for APA in cell fate specification.

145 citations


Cited by
Journal Article
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. 
Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations

01 Aug 2000
TL;DR: A Bioentrepreneur course on the assessment of medical technology in the context of commercialization, covering many issues unique to biomedical products.
Abstract: BIOE 402. Medical Technology Assessment. 2 or 3 hours. Bioentrepreneur course. Assessment of medical technology in the context of commercialization. Objectives, competition, market share, funding, pricing, manufacturing, growth, and intellectual property; many issues unique to biomedical products. Course Information: 2 undergraduate hours. 3 graduate hours. Prerequisite(s): Junior standing or above and consent of the instructor.

4,833 citations

Journal Article
TL;DR: How AMPK functions as a central mediator of the cellular response to energetic stress and mitochondrial insults and coordinates multiple features of autophagy and mitochondrial biology is discussed.
Abstract: Cells constantly adapt their metabolism to meet their energy needs and respond to nutrient availability. Eukaryotes have evolved a very sophisticated system to sense low cellular ATP levels via the serine/threonine kinase AMP-activated protein kinase (AMPK) complex. Under conditions of low energy, AMPK phosphorylates specific enzymes and growth control nodes to increase ATP generation and decrease ATP consumption. In the past decade, the discovery of numerous new AMPK substrates has led to a more complete understanding of the minimal number of steps required to reprogramme cellular metabolism from anabolism to catabolism. This energy switch controls cell growth and several other cellular processes, including lipid and glucose metabolism and autophagy. Recent studies have revealed that one ancestral function of AMPK is to promote mitochondrial health, and multiple newly discovered targets of AMPK are involved in various aspects of mitochondrial homeostasis, including mitophagy. This Review discusses how AMPK functions as a central mediator of the cellular response to energetic stress and mitochondrial insults and coordinates multiple features of autophagy and mitochondrial biology.

1,873 citations

Journal Article
TL;DR: An in-depth review of rare event detection from an imbalanced learning perspective and a comprehensive taxonomy of the existing application domains of imbalanced learning are provided.
Abstract: 527 articles related to imbalanced data and rare events are reviewed. Viewing reviewed papers from both technical and practical perspectives. Summarizing existing methods and corresponding statistics by a new taxonomy idea. Categorizing 162 application papers into 13 domains and giving introduction. Some opening questions are discussed at the end of this manuscript. Rare events, especially those that could potentially negatively impact society, often require human decision-making responses. Detecting rare events can be viewed as a prediction task in data mining and machine learning communities. As these events are rarely observed in daily life, the prediction task suffers from a lack of balanced data. In this paper, we provide an in-depth review of rare event detection from an imbalanced learning perspective. Five hundred and seventeen related papers that have been published in the past decade were collected for the study. The initial statistics suggested that rare event detection and imbalanced learning are of concern across a wide range of research areas from management science to engineering. We reviewed all collected papers from both a technical and a practical point of view. Modeling methods discussed include techniques such as data preprocessing, classification algorithms and model evaluation. For applications, we first provide a comprehensive taxonomy of the existing application domains of imbalanced learning, and then we detail the applications for each category. Finally, some suggestions from the reviewed papers are incorporated with our experiences and judgments to offer further research directions for the imbalanced learning and rare event detection fields.

1,448 citations