scispace - formally typeset
Search or ask a question
Author

Yijun Xiao

Bio: Yijun Xiao is an academic researcher from State Oceanic Administration. The author has contributed to research in topics: Genome & Sequence assembly. The author has an hindex of 1, co-authored 1 publications receiving 15 citations.

Papers
More filters
Journal ArticleDOI
TL;DR: The genome resource in this work will be used for the conservation and population genetics of the yellowbelly pufferfish, as well as in vertebrate chromosome evolution studies.
Abstract: Pufferfish are ideal models for vertebrate chromosome evolution studies. The yellowbelly pufferfish, Takifugu flavidus, is an important marine fish species in the aquaculture industry and ecology of East Asia. The chromosome assembly of the species could facilitate the study of chromosome evolution and functional gene mapping. To this end, 44, 27 and 50 Gb reads were generated for genome assembly using Illumina, PacBio and Hi-C sequencing technologies, respectively. More than 13 Gb full-length transcripts were sequenced on the PacBio platform. A 366 Mb genome was obtained with the contig of 4.4 Mb and scaffold N50 length of 15.7 Mb. 266 contigs were reliably assembled into 22 chromosomes, representing 95.9% of the total genome. A total of 29,416 protein-coding genes were predicted and 28,071 genes were functionally annotated. More than 97.7% of the BUSCO genes were successfully detected in the genome. The genome resource in this work will be used for the conservation and population genetics of the yellowbelly pufferfish, as well as in vertebrate chromosome evolution studies. Machine-accessible metadata file describing the reported data: https://doi.org/10.6084/m9.figshare.10008740

21 citations


Cited by
More filters
Journal ArticleDOI
TL;DR: This review assesses the availability of complete genomes of aquaculture animals and then briefly discusses the sequencing technologies and SNP array for SNPs genotyping, and summarizes the current status of genetic linkage map construction, QTL mapping, GWAS, and GS in aquatic animals.

70 citations

Journal ArticleDOI
TL;DR: In this article, a haplotype-resolved genome assembly of a zig-zag eel (Mastacembelus armatus) at the chromosomal scale is presented.
Abstract: The origin of sex chromosomes requires the establishment of recombination suppression between the proto-sex chromosomes. In many fish species, the sex chromosome pair is homomorphic with a recent origin, providing species for studying how and why recombination suppression evolved in the initial stages of sex chromosome differentiation, but this requires accurate sequence assembly of the X and Y (or Z and W) chromosomes, which may be difficult if they are recently diverged. Here we produce a haplotype-resolved genome assembly of zig-zag eel (Mastacembelus armatus), an aquaculture fish, at the chromosomal scale. The diploid assembly is nearly gap-free, and in most chromosomes, we resolve the centromeric and subtelomeric heterochromatic sequences. In particular, the Y chromosome, including its highly repetitive short arm, has zero gaps. Using resequencing data, we identify a ~7 Mb fully sex-linked region (SLR), spanning the sex chromosome centromere and almost entirely embedded in the pericentromeric heterochromatin. The SLRs on the X and Y chromosomes are almost identical in sequence and gene content, but both are repetitive and heterochromatic, consistent with zero or low recombination. We further identify an HMG-domain containing gene HMGN6 in the SLR as a candidate sex-determining gene that is expressed at the onset of testis development. Our study supports the idea that preexisting regions of low recombination, such as pericentromeric regions, can give rise to SLR in the absence of structural variations between the proto-sex chromosomes.

26 citations

Journal ArticleDOI
TL;DR: This genome assembly can serve as a valuable genetic resource for exploring fugu‐specific compact genome characteristics, and will provide essential genomic information for understanding molecular adaptations to salinity fluctuations and the evolution of osmoregulatory mechanisms.
Abstract: The Tetraodontidae family are known to have relatively small and compact genomes compared to other vertebrates. The obscure puffer fish Takifugu obscurus is an anadromous species that migrates to freshwater from the sea for spawning. Thus the euryhaline characteristics of T. obscurus have been investigated to gain understanding of their survival ability, osmoregulation, and other homeostatic mechanisms in both freshwater and seawater. In this study, a high quality chromosome-level reference genome for T. obscurus was constructed using long-read Pacific Biosciences (PacBio) Sequel sequencing and a Hi-C-based chromatin contact map platform. The final genome assembly of T. obscurus is 381 Mb, with a contig N50 length of 3,296 kb and longest length of 10.7 Mb, from a total of 62 Gb of raw reads generated using single-molecule real-time sequencing technology from a PacBio Sequel platform. The PacBio data were further clustered into chromosome-scale scaffolds using a Hi-C approach, resulting in a 373 Mb genome assembly with a contig N50 length of 15.2 Mb and and longest length of 28 Mb. When we directly compared the 22 longest scaffolds of T. obscurus to the 22 chromosomes of the tiger puffer Takifugu rubripes, a clear one-to-one orthologous relationship was observed between the two species, supporting the chromosome-level assembly of T. obscurus. This genome assembly can serve as a valuable genetic resource for exploring fugu-specific compact genome characteristics, and will provide essential genomic information for understanding molecular adaptations to salinity fluctuations and the evolution of osmoregulatory mechanisms.

20 citations

Journal ArticleDOI
TL;DR: In this article, the authors used long-read sequencing to improve the contiguity of the threespine stickleback fish (Gasterosteus aculeatus) genome, a prominent genetic model species.
Abstract: While the cost and time for assembling a genome has drastically decreased, it still remains a challenge to assemble a highly contiguous genome. These challenges are rapidly being overcome by the integration of long-read sequencing technologies. Here, we use long-read sequencing to improve the contiguity of the threespine stickleback fish (Gasterosteus aculeatus) genome, a prominent genetic model species. Using Pacific Biosciences sequencing, we assembled a highly contiguous genome of a freshwater fish from Paxton Lake. Using contigs from this genome, we were able to fill over 76.7% of the gaps in the existing reference genome assembly, improving contiguity over fivefold. Our gap filling approach was highly accurate, validated by 10X Genomics long-distance linked-reads. In addition to closing a majority of gaps, we were able to assemble segments of telomeres and centromeres throughout the genome. This highlights the power of using long sequencing reads to assemble highly repetitive and difficult to assemble regions of genomes. This latest genome build has been released through a newly designed community genome browser that aims to consolidate the growing number of genomics datasets available for the threespine stickleback fish.

17 citations

Journal ArticleDOI
TL;DR: Adding more data to genome assemblies does not always result in better assemblies, so it is important to understand the nuances of genomic data integration explained here, in order to obtain cost-effective value for money when sequencing genomes.
Abstract: Background Whilst much sequencing effort has focused on key mammalian model organisms such as mouse and human, little is known about the relationship between genome sequencing techniques for non-model mammals and genome assembly quality. This is especially relevant to non-model mammals, where the samples to be sequenced are often degraded and of low quality. A key aspect when planning a genome project is the choice of sequencing data to generate. This decision is driven by several factors, including the biological questions being asked, the quality of DNA available, and the availability of funds. Cutting-edge sequencing technologies now make it possible to achieve highly contiguous, chromosome-level genome assemblies, but rely on high-quality high molecular weight DNA. However, funding is often insufficient for many independent research groups to use these techniques. Here we use a range of different genomic technologies generated from a roadkill European polecat (Mustela putorius) to assess various assembly techniques on this low-quality sample. We evaluated different approaches for de novo assemblies and discuss their value in relation to biological analyses. Results Generally, assemblies containing more data types achieved better scores in our ranking system. However, when accounting for misassemblies, this was not always the case for Bionano and low-coverage 10x Genomics (for scaffolding only). We also find that the extra cost associated with combining multiple data types is not necessarily associated with better genome assemblies. Conclusions The high degree of variability between each de novo assembly method (assessed from the 7 key metrics) highlights the importance of carefully devising the sequencing strategy to be able to carry out the desired analysis. Adding more data to genome assemblies does not always result in better assemblies, so it is important to understand the nuances of genomic data integration explained here, in order to obtain cost-effective value for money when sequencing genomes.

16 citations