Author

Liping Wei

Bio: Liping Wei is an academic researcher from Peking University. The author has contributed to research on the topics Gene and Genome. The author has an h-index of 37 and has co-authored 96 publications receiving 11373 citations. Previous affiliations of Liping Wei include University of Illinois at Chicago and Peking Union Medical College.


Papers
Journal ArticleDOI
TL;DR: A KO-Based Annotation System (KOBAS) is developed that can automatically annotate a set of sequences with KO terms and identify both the most frequent and the statistically significantly enriched pathways.
Abstract: Motivation: High-throughput technologies such as DNA sequencing and microarrays have created the need for automated annotation of large sets of genes, including whole genomes, and automated identification of pathways. Ontologies, such as the popular Gene Ontology (GO), provide a common controlled vocabulary for these types of automated analysis. Yet, while GO offers tremendous value, it also has certain limitations such as the lack of direct association with pathways. Results: We demonstrated the use of the KEGG Orthology (KO), part of the KEGG suite of resources, as an alternative controlled vocabulary for automated annotation and pathway identification. We developed a KO-Based Annotation System (KOBAS) that can automatically annotate a set of sequences with KO terms and identify both the most frequent and the statistically significantly enriched pathways. Results from both whole genome and microarray gene cluster annotations with KOBAS are comparable and complementary to known annotations. KOBAS is a freely available standalone Python program that can contribute significantly to genome annotation and microarray analysis. Availability: Supplementary data and the KOBAS system are available at http://genome.cbi.pku.edu.cn/download.html Contact: weilp@mail.cbi.pku.edu.cn

2,595 citations
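
The KOBAS abstract above mentions identifying statistically significantly enriched pathways but does not say which test is used. As a minimal sketch only, assuming a one-sided hypergeometric test (a common choice for enrichment analysis, not confirmed by the abstract), the calculation could look like the following. The function name and all parameter names are hypothetical and are not part of KOBAS.

```python
from math import comb


def hypergeom_enrichment_p(background_size, pathway_size, input_size, hits):
    """One-sided hypergeometric p-value, P(X >= hits).

    Hypothetical helper, not the KOBAS implementation.
    background_size: number of annotated genes in the background set
    pathway_size:    number of background genes assigned to the pathway
    input_size:      number of genes in the input set
    hits:            number of input genes assigned to the pathway
    """
    upper = min(pathway_size, input_size)
    total = comb(background_size, input_size)
    # Sum the probability of seeing `hits` or more pathway genes by chance.
    tail = sum(
        comb(pathway_size, i) * comb(background_size - pathway_size, input_size - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Illustrative numbers only: 20000 background genes, a 100-gene pathway,
    # 200 input genes, 8 of which fall in the pathway.
    p = hypergeom_enrichment_p(20000, 100, 200, 8)
    print(f"enrichment p-value: {p:.3g}")
```

In practice the p-values for all tested pathways would also need a multiple-testing correction before any pathway is called enriched.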

Journal ArticleDOI
Lei Kong1, Yong Zhang1, Zhi-Qiang Ye1, Xiaoqiao Liu1, Shuqi Zhao1, Liping Wei1, Ge Gao1 
TL;DR: A support vector machine-based classifier, named Coding Potential Calculator (CPC), assesses the protein-coding potential of a transcript based on six biologically meaningful sequence features and can discriminate coding from noncoding transcripts with high accuracy.
Abstract: Recent transcriptome studies have revealed that a large number of transcripts in mammals and other organisms do not encode proteins but function as noncoding RNAs (ncRNAs) instead. As millions of transcripts are generated by large-scale cDNA and EST sequencing projects every year, there is a need for automatic methods to distinguish protein-coding RNAs from noncoding RNAs accurately and quickly. We developed a support vector machine-based classifier, named Coding Potential Calculator (CPC), to assess the protein-coding potential of a transcript based on six biologically meaningful sequence features. Tenfold cross-validation on the training dataset and further testing on several large datasets showed that CPC can discriminate coding from noncoding transcripts with high accuracy. Furthermore, CPC also runs an order-of-magnitude faster than a previous state-of-the-art tool and has higher accuracy. We developed a user-friendly web-based interface of CPC at http://cpc.cbi.pku.edu.cn. In addition to predicting the coding potential of the input transcripts, the CPC web server also graphically displays detailed sequence features and additional annotations of the transcript that may facilitate users’ further investigation.

2,168 citations

Journal ArticleDOI
13 Nov 2014-Nature
TL;DR: It is estimated that LGD mutation in about 400 genes can contribute to the joint class of affected females and males of lower IQ, with an overlapping and similar number of genes vulnerable to contributory missense mutation.
Abstract: Whole exome sequencing has proven to be a powerful tool for understanding the genetic architecture of human disease. Here we apply it to more than 2,500 simplex families, each having a child with an autistic spectrum disorder. By comparing affected to unaffected siblings, we show that 13% of de novo missense mutations and 43% of de novo likely gene-disrupting (LGD) mutations contribute to 12% and 9% of diagnoses, respectively. Including copy number variants, coding de novo mutations contribute to about 30% of all simplex and 45% of female diagnoses. Almost all LGD mutations occur opposite wild-type alleles. LGD targets in affected females significantly overlap the targets in males of lower intelligence quotient (IQ), but neither overlaps significantly with targets in males of higher IQ. We estimate that LGD mutation in about 400 genes can contribute to the joint class of affected females and males of lower IQ, with an overlapping and similar number of genes vulnerable to contributory missense mutation. LGD targets in the joint class overlap with published targets for intellectual disability and schizophrenia, and are enriched for chromatin modifiers, FMRP-associated genes and embryonically expressed genes. Most of the significance for the latter comes from affected females.

2,124 citations

Journal ArticleDOI
21 Nov 2013-Cell
TL;DR: Coexpression networks are constructed from the hcASD "seed" genes, leveraging a rich expression data set encompassing multiple human brain regions across human development and into adulthood, and they demonstrate a key point of convergence in midfetal layer 5/6 cortical projection neurons.

810 citations

Journal ArticleDOI
Yu-Jian Kang1, De-Chang Yang1, Lei Kong1, Mei Hou1, Yu-Qi Meng1, Liping Wei1, Ge Gao1 
TL;DR: The coding potential calculator CPC1 is upgraded to CPC2, which runs ∼1000 times faster than CPC1 and exhibits superior accuracy compared with CPC1, especially for long non-coding transcripts.
Abstract: With advances in next-generation sequencing technologies, numerous novel transcripts in a large number of organisms have been identified. With the goal of fast, accurate assessment of the coding ability of RNA transcripts, we upgraded the coding potential calculator CPC1 to CPC2. CPC2 runs ∼1000 times faster than CPC1 and exhibits superior accuracy compared with CPC1, especially for long non-coding transcripts. Moreover, the model of CPC2 is species-neutral, making it feasible for ever-growing non-model organism transcriptomes. A mobile-friendly web server, as well as a downloadable standalone package, is freely available at http://cpc2.cbi.pku.edu.cn.

789 citations


Cited by
Journal ArticleDOI
06 Jun 1986-JAMA
TL;DR: The editors have done a masterful job of weaving together the biologic, the behavioral, and the clinical sciences into a single tapestry in which everyone from the molecular biologist to the practicing psychiatrist can find and appreciate his or her own research.
Abstract: I have developed "tennis elbow" from lugging this book around the past four weeks, but it is worth the pain, the effort, and the aspirin. It is also worth the (relatively speaking) bargain price. Including appendixes, this book contains 894 pages of text. The entire panorama of the neural sciences is surveyed and examined, and it is comprehensive in its scope, from genomes to social behaviors. The editors explicitly state that the book is designed as "an introductory text for students of biology, behavior, and medicine," but it is hard to imagine any audience, interested in any fragment of neuroscience at any level of sophistication, that would not enjoy this book. The editors have done a masterful job of weaving together the biologic, the behavioral, and the clinical sciences into a single tapestry in which everyone from the molecular biologist to the practicing psychiatrist can find and appreciate his or

7,563 citations

01 Feb 2015
TL;DR: In this article, the authors describe the integrative analysis of 111 reference human epigenomes generated as part of the NIH Roadmap Epigenomics Consortium, profiled for histone modification patterns, DNA accessibility, DNA methylation and RNA expression.
Abstract: The reference human genome sequence set the stage for studies of genetic variation and its association with human disease, but epigenomic studies lack a similar reference. To address this need, the NIH Roadmap Epigenomics Consortium generated the largest collection so far of human epigenomes for primary cells and tissues. Here we describe the integrative analysis of 111 reference human epigenomes generated as part of the programme, profiled for histone modification patterns, DNA accessibility, DNA methylation and RNA expression. We establish global maps of regulatory elements, define regulatory modules of coordinated activity, and their likely activators and repressors. We show that disease- and trait-associated genetic variants are enriched in tissue-specific epigenomic marks, revealing biologically relevant cell types for diverse human traits, and providing a resource for interpreting the molecular basis of human disease. Our results demonstrate the central role of epigenomic information for understanding gene regulation, cellular differentiation and human disease.

4,409 citations

Journal ArticleDOI
TL;DR: The Blast2GO framework is used to carry out a detailed analysis of annotation behaviour through homology transfer and its impact in functional genomics research to offer biologists useful information to take into account when addressing the task of functionally characterizing their sequence data.
Abstract: Functional genomics technologies have been widely adopted in the biological research of both model and non-model species. An efficient functional annotation of DNA or protein sequences is a major requirement for the successful application of these approaches, as functional information on gene products is often the key to the interpretation of experimental results. Therefore, there is an increasing need for bioinformatics resources that are able to cope with large amounts of sequence data, produce valuable annotation results and are easily accessible to laboratories where functional genomics projects are being undertaken. We present the Blast2GO suite as an integrated and biologist-oriented solution for the high-throughput and automatic functional annotation of DNA or protein sequences based on the Gene Ontology vocabulary. The most outstanding Blast2GO features are: (i) the combination of various annotation strategies and tools controlling type and intensity of annotation, (ii) the numerous graphical features such as the interactive GO-graph visualization for gene-set function profiling or descriptive charts, (iii) the general sequence management features and (iv) high-throughput capabilities. We used the Blast2GO framework to carry out a detailed analysis of annotation behaviour through homology transfer and its impact in functional genomics research. Our aim is to offer biologists useful information to take into account when addressing the task of functionally characterizing their sequence data.

3,306 citations

Journal ArticleDOI
TL;DR: A web server, KOBAS 2.0, is reported, which annotates an input set of genes with putative pathways and disease relationships based on mapping to genes with known annotations, which allows for both ID mapping and cross-species sequence similarity mapping.
Abstract: High-throughput experimental technologies often identify dozens to hundreds of genes related to, or changed in, a biological or pathological process. From these genes one wants to identify biological pathways that may be involved and diseases that may be implicated. Here, we report a web server, KOBAS 2.0, which annotates an input set of genes with putative pathways and disease relationships based on mapping to genes with known annotations. It allows for both ID mapping and cross-species sequence similarity mapping. It then performs statistical tests to identify statistically significantly enriched pathways and diseases. KOBAS 2.0 incorporates knowledge across 1327 species from 5 pathway databases (KEGG PATHWAY, PID, BioCyc, Reactome and Panther) and 5 human disease databases (OMIM, KEGG DISEASE, FunDO, GAD and NHGRI GWAS Catalog). KOBAS 2.0 can be accessed at http://kobas.cbi.pku.edu.cn.

3,293 citations