Author

Yang Liao

Walter and Eliza Hall Institute of Medical Research

Other affiliations: University of Melbourne

Bio: Yang Liao is an academic researcher at the Walter and Eliza Hall Institute of Medical Research. The author has contributed to research on topics including transcription factors and cellular differentiation, has an h-index of 25, and has co-authored 45 publications that have received 14,202 citations. Previous affiliations of Yang Liao include the University of Melbourne.

Topics: Transcription factor, Cellular differentiation, Biology, T cell, Medicine

[Chart: papers published on a yearly basis, 2010 to 2023]

Papers

Journal Article•DOI•

featureCounts: an efficient general-purpose program for assigning sequence reads to genomic features

Yang Liao¹, Gordon K. Smyth¹, Wei Shi¹•Institutions (1)

Walter and Eliza Hall Institute of Medical Research¹

01 Apr 2014-Bioinformatics

TL;DR: featureCounts is a read summarization program suitable for counting reads generated from either RNA or genomic DNA sequencing experiments; it implements highly efficient chromosome hashing and feature blocking techniques.

Abstract: MOTIVATION: Next-generation sequencing technologies generate millions of short sequence reads, which are usually aligned to a reference genome. In many applications, the key information required for downstream analysis is the number of reads mapping to each genomic feature, for example to each exon or each gene. The process of counting reads is called read summarization. Read summarization is required for a great variety of genomic analyses but has so far received relatively little attention in the literature. RESULTS: We present featureCounts, a read summarization program suitable for counting reads generated from either RNA or genomic DNA sequencing experiments. featureCounts implements highly efficient chromosome hashing and feature blocking techniques. It is considerably faster than existing methods (by an order of magnitude for gene-level summarization) and requires far less computer memory. It works with either single or paired-end reads and provides a wide range of options appropriate for different sequencing applications. AVAILABILITY AND IMPLEMENTATION: featureCounts is available under GNU General Public License as part of the Subread (http://subread.sourceforge.net) or Rsubread (http://www.bioconductor.org) software packages.
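
To make the idea of read summarization concrete, the following is a minimal Python sketch that counts aligned reads per gene. It is a toy illustration, not the featureCounts implementation: it uses a plain per-chromosome list with a linear scan rather than the chromosome hashing and feature blocking described in the abstract. The function name `count_reads`, the tuple layouts, and the policy of skipping reads that overlap more than one gene are all assumptions made for this example.

```python
from collections import Counter


def count_reads(features, reads):
    """Count reads per gene (toy read summarization).

    features: list of (gene_id, chrom, start, end), inclusive coordinates.
    reads:    list of (chrom, start, end), inclusive coordinates.
    Assumed policy: a read overlapping more than one gene is ambiguous
    and is not counted; a read overlapping no gene is ignored.
    """
    # Group features by chromosome so each read is only compared
    # against features on its own chromosome.
    by_chrom = {}
    for gene_id, chrom, start, end in features:
        by_chrom.setdefault(chrom, []).append((start, end, gene_id))

    # Start every gene at zero so genes without reads still appear.
    counts = Counter({gene_id: 0 for gene_id, *_ in features})

    for chrom, r_start, r_end in reads:
        # Two inclusive intervals overlap when each starts before the other ends.
        hits = {
            gene_id
            for f_start, f_end, gene_id in by_chrom.get(chrom, ())
            if r_start <= f_end and f_start <= r_end
        }
        if len(hits) == 1:
            counts[hits.pop()] += 1
    return dict(counts)


# Hypothetical annotation and alignments, for illustration only.
features = [
    ("geneA", "chr1", 100, 200),
    ("geneB", "chr1", 180, 300),
    ("geneC", "chr2", 50, 150),
]
reads = [
    ("chr1", 110, 150),  # geneA only
    ("chr1", 190, 210),  # overlaps geneA and geneB: skipped as ambiguous
    ("chr1", 250, 280),  # geneB only
    ("chr2", 140, 160),  # geneC only
    ("chr3", 1, 50),     # no annotated feature on this chromosome
]
print(count_reads(features, reads))
```

The linear scan over all features on a chromosome is the part a production tool would replace with an indexed lookup; the overlap test and the per-gene tally stay the same.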

14,103 citations

Journal Article•DOI•

The Subread aligner: fast, accurate and scalable read mapping by seed-and-vote

Yang Liao¹, Gordon K. Smyth¹, Wei Shi¹•Institutions (1)

Walter and Eliza Hall Institute of Medical Research¹

01 May 2013-Nucleic Acids Research

TL;DR: This article proposes an elegantly simple multi-seed strategy, called seed-and-vote, for mapping reads to a reference genome, which uses a relatively large number of short seeds extracted from each read and allows all the seeds to vote on the optimal location.

Abstract: Read alignment is an ongoing challenge for the analysis of data from sequencing technologies. This article proposes an elegantly simple multi-seed strategy, called seed-and-vote, for mapping reads to a reference genome. The new strategy chooses the mapped genomic location for the read directly from the seeds. It uses a relatively large number of short seeds (called subreads) extracted from each read and allows all the seeds to vote on the optimal location. When the read length is <160 bp, overlapping subreads are used. More conventional alignment algorithms are then used to fill in detailed mismatch and indel information between the subreads that make up the winning voting block. The strategy is fast because the overall genomic location has already been chosen before the detailed alignment is done. It is sensitive because no individual subread is required to map exactly, nor are individual subreads constrained to map close by other subreads. It is accurate because the final location must be supported by several different subreads. The strategy extends easily to find exon junctions, by locating reads that contain sets of subreads mapping to different exons of the same gene. It scales up efficiently for longer reads.
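
The voting step described in this abstract can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the Subread aligner itself: the seed length, the dictionary-based index, and the function names `build_index` and `seed_and_vote` are hypothetical, and the later step that fills in mismatch and indel detail is omitted. Each short seed taken from the read is looked up exactly in the reference, and every hit votes for the read start position it implies.

```python
from collections import Counter, defaultdict


def build_index(reference, seed_len):
    """Map every substring of length seed_len to its start positions."""
    index = defaultdict(list)
    for pos in range(len(reference) - seed_len + 1):
        index[reference[pos:pos + seed_len]].append(pos)
    return index


def seed_and_vote(read, index, seed_len, step=1):
    """Return (best_read_start, votes) or (None, 0) if no seed matches.

    Overlapping seeds are extracted every `step` bases. A seed found at
    reference position p, taken from read offset o, votes for the read
    starting at p - o. Seeds that do not match anywhere cast no vote,
    so a mismatch only silences the seeds that span it.
    """
    votes = Counter()
    for offset in range(0, len(read) - seed_len + 1, step):
        seed = read[offset:offset + seed_len]
        for pos in index.get(seed, ()):
            votes[pos - offset] += 1
    if not votes:
        return None, 0
    return votes.most_common(1)[0]


# Hypothetical reference and read, for illustration only.
reference = "ACGTTGCAAGGCTTACCGATAGGCTAACGT"
# The read is reference[10:22] with one substituted base at read offset 5.
read = "GCTTATCGATAG"

index = build_index(reference, seed_len=4)
start, n_votes = seed_and_vote(read, index, seed_len=4)
print(start, n_votes)
```

In this toy case the seeds that span the substituted base find no exact match and simply abstain, while the remaining seeds agree on the same start position, which is why no individual seed is required to map exactly.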

2,228 citations

Journal Article•DOI•

The R package Rsubread is easier, faster, cheaper and better for alignment and quantification of RNA sequencing reads.

Yang Liao¹,², Gordon K. Smyth¹,², Wei Shi¹,²•Institutions (2)

Walter and Eliza Hall Institute of Medical Research¹, University of Melbourne²

07 May 2019-Nucleic Acids Research

TL;DR: Rsubread is a Bioconductor software package that provides high-performance alignment and read counting functions for RNA-seq reads; it integrates read mapping and quantification in a single package and has no software dependencies other than R itself.

Abstract: We present Rsubread, a Bioconductor software package that provides high-performance alignment and read counting functions for RNA-seq reads. Rsubread is based on the successful Subread suite with the added ease-of-use of the R programming environment, creating a matrix of read counts directly as an R object ready for downstream analysis. It integrates read mapping and quantification in a single package and has no software dependencies other than R itself. We demonstrate Rsubread's ability to detect exon-exon junctions de novo and to quantify expression at the level of either genes, exons or exon junctions. The resulting read counts can be input directly into a wide range of downstream statistical analyses using other Bioconductor packages. Using SEQC data and simulations, we compare Rsubread to TopHat2, STAR and HTSeq as well as to counting functions in the Bioconductor infrastructure packages. We consider the performance of these tools on the combined quantification task starting from raw sequence reads through to summary counts, and in particular evaluate the performance of different combinations of alignment and counting algorithms. We show that Rsubread is faster and uses less memory than competitor tools and produces read count summaries that more accurately correlate with true values.

1,420 citations

Journal Article•DOI•

A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the Sequencing Quality Control Consortium

Zhenqiang Su, Paweł P. Łabaj¹, Sheng Li², Jean Thierry-Mieg³ +161 more•Institutions (54)

01 Sep 2014-Nature Biotechnology

TL;DR: The complete SEQC data sets, comprising >100 billion reads, provide unique resources for evaluating RNA-seq analyses for clinical and regulatory settings. Measurement performance depends on the platform and data analysis pipeline, and variation is large for transcript-level profiling.

Abstract: We present primary results from the Sequencing Quality Control (SEQC) project, coordinated by the US Food and Drug Administration. Examining Illumina HiSeq, Life Technologies SOLiD and Roche 454 platforms at multiple laboratory sites using reference RNA samples with built-in controls, we assess RNA sequencing (RNA-seq) performance for junction discovery and differential expression profiling and compare it to microarray and quantitative PCR (qPCR) data using complementary metrics. At all sequencing depths, we discover unannotated exon-exon junctions, with >80% validated by qPCR. We find that measurements of relative expression are accurate and reproducible across sites and platforms if specific filters are used. In contrast, RNA-seq and microarrays do not provide accurate absolute measurements, and gene-specific biases are observed for all examined platforms, including qPCR. Measurement performance depends on the platform and data analysis pipeline, and variation is large for transcript-level profiling. The complete SEQC data sets, comprising >100 billion reads (10Tb), provide unique resources for evaluating RNA-seq analyses for clinical and regulatory settings.

853 citations

Journal Article•DOI•

Hobit and Blimp1 instruct a universal transcriptional program of tissue residency in lymphocytes.

Laura K. Mackay¹,², Martina Minnich³, Natasja A. M. Kragten⁴, Yang Liao²,⁵, Benjamin Nota⁴, Cyril Seillet²,⁵, Ali Zaid², Kevin Man²,⁵, Simon Preston²,⁵, David Freestone², Asolina Braun², Erica Wynne-Jones², Felix M. Behr, Regina Stark⁴, Daniel G. Pellicci¹,², Dale I. Godfrey¹,², Gabrielle T. Belz²,⁵, Marc Pellegrini²,⁵, Thomas Gebhardt², Meinrad Busslinger³, Wei Shi²,⁵, Francis R. Carbone², René A. W. van Lier⁴, Axel Kallies²,⁵, Klaas P. J. M. van Gisbergen•Institutions (5)

Australian Research Council¹, University of Melbourne², Research Institute of Molecular Pathology³, University of Amsterdam⁴, Walter and Eliza Hall Institute of Medical Research⁵

22 Apr 2016-Science

TL;DR: In this paper, the authors identify Hobit and Blimp1 as central regulators of a universal program that instructs tissue retention in diverse tissue-resident lymphocyte populations, including NKT cells and liver-resident NK cells.

Abstract: Tissue-resident memory T (Trm) cells permanently localize to portals of pathogen entry, where they provide immediate protection against reinfection. To enforce tissue retention, Trm cells up-regulate CD69 and down-regulate molecules associated with tissue egress; however, a Trm-specific transcriptional regulator has not been identified. Here, we show that the transcription factor Hobit is specifically up-regulated in Trm cells and, together with related Blimp1, mediates the development of Trm cells in skin, gut, liver, and kidney in mice. The Hobit-Blimp1 transcriptional module is also required for other populations of tissue-resident lymphocytes, including natural killer T (NKT) cells and liver-resident NK cells, all of which share a common transcriptional program. Our results identify Hobit and Blimp1 as central regulators of this universal program that instructs tissue retention in diverse tissue-resident lymphocyte populations.

660 citations

Cited by

Journal Article•DOI•

Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2

Michael I. Love¹,², Wolfgang Huber, Simon Anders•Institutions (2)

Harvard University¹, Max Planck Society²

05 Dec 2014-Genome Biology

TL;DR: This work presents DESeq2, a method for differential analysis of count data, using shrinkage estimation for dispersions and fold changes to improve stability and interpretability of estimates, which enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression.

Abstract: In comparative high-throughput sequencing assays, a fundamental task is the analysis of count data, such as read counts per gene in RNA-seq, for evidence of systematic changes across experimental conditions. Small replicate numbers, discreteness, large dynamic range and the presence of outliers require a suitable statistical approach. We present DESeq2, a method for differential analysis of count data, using shrinkage estimation for dispersions and fold changes to improve stability and interpretability of estimates. This enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression. The DESeq2 package is available at http://www.bioconductor.org/packages/release/bioc/html/DESeq2.html .

47,038 citations

Journal Article•DOI•

limma powers differential expression analyses for RNA-sequencing and microarray studies

Matthew E. Ritchie¹, Belinda Phipson², Di Wu³, Yifang Hu¹, Charity W. Law⁴, Wei Shi¹, Gordon K. Smyth¹,⁵•Institutions (5)

Walter and Eliza Hall Institute of Medical Research¹, Royal Children's Hospital², Harvard University³, University of Zurich⁴, University of Melbourne⁵

20 Apr 2015-Nucleic Acids Research

TL;DR: The philosophy and design of the limma package is reviewed, summarizing both new and historical features, with an emphasis on recent enhancements and features that have not been previously described.

Abstract: limma is an R/Bioconductor software package that provides an integrated solution for analysing data from gene expression experiments. It contains rich features for handling complex experimental designs and for information borrowing to overcome the problem of small sample sizes. Over the past decade, limma has been a popular choice for gene discovery through differential expression analyses of microarray and high-throughput PCR data. The package contains particularly strong facilities for reading, normalizing and exploring such data. Recently, the capabilities of limma have been significantly expanded in two important directions. First, the package can now perform both differential expression and differential splicing analyses of RNA sequencing (RNA-seq) data. All the downstream analysis tools previously restricted to microarray data are now available for RNA-seq as well. These capabilities allow users to analyse both RNA-seq and microarray data with very similar pipelines. Second, the package is now able to go past the traditional gene-wise expression analyses in a variety of ways, analysing expression profiles in terms of co-regulated sets of genes or in terms of higher-order expression signatures. This provides enhanced possibilities for biological interpretation of gene expression differences. This article reviews the philosophy and design of the limma package, summarizing both new and historical features, with an emphasis on recent enhancements and features that have not been previously described.

22,147 citations

Posted Content•DOI•

Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2

Michael I. Love¹, Wolfgang Huber, Simon Anders•Institutions (1)

Harvard University¹

17 Nov 2014-bioRxiv

Abstract: In comparative high-throughput sequencing assays, a fundamental task is the analysis of count data, such as read counts per gene in RNA-Seq data, for evidence of systematic changes across experimental conditions. Small replicate numbers, discreteness, large dynamic range and the presence of outliers require a suitable statistical approach. We present DESeq2, a method for differential analysis of count data. DESeq2 uses shrinkage estimation for dispersions and fold changes to improve stability and interpretability of the estimates. This enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression and facilitates downstream tasks such as gene ranking and visualization. DESeq2 is available as an R/Bioconductor package.

17,014 citations

Journal Article•DOI•

HTSeq—a Python framework to work with high-throughput sequencing data

Simon Anders, Paul Theodor Pyl, Wolfgang Huber

15 Jan 2015-Bioinformatics

TL;DR: This work presents HTSeq, a Python library to facilitate the rapid development of custom scripts for high-throughput sequencing data analysis, and htseq-count, a tool developed with HTSeq that preprocesses RNA-Seq data for differential expression analysis by counting the overlap of reads with genes.

Abstract: Motivation: A large choice of tools exists for many standard tasks in the analysis of high-throughput sequencing (HTS) data. However, once a project deviates from standard workflows, custom scripts are needed. Results: We present HTSeq, a Python library to facilitate the rapid development of such scripts. HTSeq offers parsers for many common data formats in HTS projects, as well as classes to represent data, such as genomic coordinates, sequences, sequencing reads, alignments, gene model information and variant calls, and provides data structures that allow for querying via genomic coordinates. We also present htseq-count, a tool developed with HTSeq that preprocesses RNA-Seq data for differential expression analysis by counting the overlap of reads with genes. Availability and implementation: HTSeq is released as open-source software under the GNU General Public Licence and available from http://www-huber.embl.de/HTSeq or from the Python Package Index at https://pypi.python.org/pypi/HTSeq. Contact: sanders@fs.tum.de

15,744 citations