Efficient assembly and annotation of the transcriptome of catfish by RNA-Seq analysis of a doubled haploid homozygote
Shikai Liu, Yu Zhang, Zunchun Zhou, Geoff Waldbieser, Fanyue Sun, Jianguo Lu, Jiaren Zhang, Yanliang Jiang, Hao Zhang, Xiuli Wang, K.V. Rajendran, Lester H. Khoo, Huseyin Kucuktas, Eric Peatman, Zhanjiang Liu
TLDR
The large set of transcripts assembled in this study is the most comprehensive set of genome resources ever developed from catfish, which will provide the much needed resources for functional genome research in catfish, serving as a reference transcriptome for genome annotation, analysis of gene duplication, gene family structures, and digital gene expression analysis.
Abstract:
Once a whole genome has been sequenced, thorough annotation that associates genome sequences with biological meaning is essential. Genome annotation depends on the availability of transcript information as well as orthology information. In teleost fish, genome annotation is seriously hindered by genome duplication: because of gene duplications, orthologies cannot be established simply by homology comparisons. Rather, intensive phylogenetic or structural analysis of orthologies is required to identify genes, and such analyses require full-length transcripts. Generating large numbers of full-length transcripts by traditional transcript sequencing is very difficult and extremely costly. In this work, we took advantage of a doubled haploid catfish, which has two identical sets of chromosomes and therefore, in theory, no allelic variation. As such, transcript sequences generated by next-generation sequencing can be favorably assembled into full-length transcripts.
Deep sequencing of the doubled haploid channel catfish transcriptome was performed on the Illumina HiSeq 2000 platform, yielding over 300 million high-quality trimmed reads totaling 27 Gbp. Assembly of these reads generated 370,798 non-redundant transcript-derived contigs. Functional annotation of the assembly identified 25,144 unique protein-encoding genes. A total of 2,659 unique genes were identified as putative duplicated genes in the catfish genome because the assemblies of the corresponding transcripts harbored paralogous sequence variants (PSVs) or multisite variants (MSVs), which appear as pseudo-SNPs in the assembly. Of the 25,144 contigs with unique protein hits, around 20,000 matched at least 50% of the length of their reference proteins, and over 14,000 transcripts were identified as full-length with complete open reading frames.
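To make the notion of a "full-length transcript with a complete open reading frame" concrete, the following is a minimal sketch and not the pipeline used in the study. It assumes the standard genetic code, scans only the forward strand, and treats a transcript as full-length when it contains an in-frame ATG followed by a stop codon; the function names and the `min_codons` cutoff are hypothetical choices made for illustration.

```python
# Illustrative sketch only: names and thresholds are assumptions, not the authors' method.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def longest_complete_orf(seq):
    """Return (start, end) of the longest ATG...stop ORF on the forward strand.

    Coordinates are 0-based, end-exclusive, and include the stop codon.
    Returns None when no ORF has both a start and an in-frame stop.
    """
    seq = seq.upper()
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                end = i + 3
                if best is None or end - start > best[1] - best[0]:
                    best = (start, end)
                start = None  # ORF closed; look for the next start
    return best


def is_full_length(seq, min_codons=50):
    """Hypothetical rule: a complete ORF of at least `min_codons` codons."""
    orf = longest_complete_orf(seq)
    return orf is not None and (orf[1] - orf[0]) // 3 >= min_codons


if __name__ == "__main__":
    contig = "CC" + "ATG" + "GCT" * 3 + "TAA" + "GG"
    print(longest_complete_orf(contig))          # (2, 17)
    print(is_full_length(contig, min_codons=5))  # True
    print(longest_complete_orf("ATGGCTGCT"))     # None (no stop codon)
```

A contig that lacks either the start or the in-frame stop codon returns `None` and would be counted as partial under this rule. A real analysis would also consider the reverse strand and alignment against reference proteins.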
Characterization of the consensus sequences surrounding the start and stop codons confirmed the correct assembly of the full-length transcripts. The large set of transcripts assembled in this study is the most comprehensive set of genome resources developed from catfish to date. It provides much-needed resources for functional genome research in catfish and serves as a reference transcriptome for genome annotation, analysis of gene duplication and gene family structures, and digital gene expression analysis. The putative set of duplicated genes provides a starting point for genome-scale analysis of gene duplication in the catfish genome and should be a valuable resource for comparative genome analysis, genome evolution, and genome function studies.
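The reasoning behind the putative duplicated genes can be sketched in a few lines. Because a doubled haploid should carry no allelic variation, a position in an assembled contig where the mapped reads consistently disagree is more likely to reflect collapsed paralogs (a pseudo-SNP) than a true SNP. The example below is a rough illustration under stated assumptions: the per-position base counts (`pileup`), the function name, and the `min_depth` and `min_minor_fraction` thresholds are all hypothetical and are not taken from the study.

```python
# Illustrative sketch only: input format and thresholds are assumptions.
def flag_pseudo_snps(pileup, min_depth=10, min_minor_fraction=0.2):
    """Return 0-based contig positions that look like pseudo-SNPs.

    `pileup` is a list with one dict per contig position, mapping each
    observed base to the number of reads supporting it.
    """
    flagged = []
    for pos, counts in enumerate(pileup):
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too few reads to judge this position
        ranked = sorted(counts.values(), reverse=True)
        minor = ranked[1] if len(ranked) > 1 else 0  # second most common base
        if minor / depth >= min_minor_fraction:
            flagged.append(pos)
    return flagged


if __name__ == "__main__":
    pileup = [
        {"A": 20},           # uniform: no variation
        {"A": 12, "G": 8},   # two well-supported bases: candidate pseudo-SNP
        {"C": 19, "T": 1},   # a single discordant read: likely sequencing error
        {"G": 3, "T": 3},    # below the depth cutoff: skipped
    ]
    print(flag_pseudo_snps(pileup))  # [1]
```

Requiring a minimum depth and a minimum minor-base fraction is one simple way to separate sequencing errors from consistent variants; contigs with one or more flagged positions would then be candidates for harboring PSVs or MSVs.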