Author

Sangrea Shim

Bio: Sangrea Shim is an academic researcher from Seoul National University. The author has contributed to research on the topics Gene and Genome. The author has an h-index of 10 and has co-authored 20 publications receiving 703 citations.

Papers
Journal Article
TL;DR: A draft genome sequence of mungbean is constructed to facilitate genome research into the subgenus Ceratotropis, which includes several important dietary legumes in Asia, and to enable a better understanding of the evolution of leguminous species.
Abstract: Mungbean (Vigna radiata) is a fast-growing, warm-season legume crop that is primarily cultivated in developing countries of Asia. Here we construct a draft genome sequence of mungbean to facilitate genome research into the subgenus Ceratotropis, which includes several important dietary legumes in Asia, and to enable a better understanding of the evolution of leguminous species. Based on the de novo assembly of additional wild mungbean species, the divergence of what was eventually domesticated and the sampled wild mungbean species appears to have predated domestication. Moreover, the de novo assembly of a tetraploid Vigna species (V. reflexo-pilosa var. glabra) provides genomic evidence of a recent allopolyploid event. The species tree is constructed using de novo RNA-seq assemblies of 22 accessions of 18 Vigna species and protein sets of Glycine max. The present assembly of V. radiata var. radiata will facilitate genome research and accelerate molecular breeding of the subgenus Ceratotropis.

397 citations

Journal Article
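TL;DR: Differential NBS-LRR gene expression between resistant and susceptible near-isogenic lines after inoculation with Xanthomonas axonopodis pv. glycines supports the conjecture that NBS-LRR genes have disease resistance functions in the soybean genome, and these genes are suggested to contribute to disease resistance in soybean.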
Abstract: R genes are a key component of genetic interactions between plants and biotrophic bacteria and are known to regulate resistance against bacterial invasion. The most common R proteins contain a nucleotide-binding site and a leucine-rich repeat (NBS-LRR) domain. Some NBS-LRR genes in the soybean genome have also been reported to function in disease resistance. In this study, the number of NBS-LRR genes was found to correlate with the number of disease resistance quantitative trait loci (QTL) that flank these genes in each chromosome. NBS-LRR genes co-localized with disease resistance QTL. The study also addressed the functional redundancy of disease resistance on recently duplicated regions that harbor NBS-LRR genes and NBS-LRR gene expression in the bacterial leaf pustule (BLP)-induced soybean transcriptome. A total of 319 genes were determined to be putative NBS-LRR genes in the soybean genome. The number of NBS-LRR genes on each chromosome was highly correlated with the number of disease resistance QTL in the 2-Mb flanking regions of NBS-LRR genes. In addition, the recently duplicated regions contained duplicated NBS-LRR genes and duplicated disease resistance QTL, and possessed either an uneven or even number of NBS-LRR genes on each side. The significant difference in NBS-LRR gene expression between a resistant near-isogenic line (NIL) and a susceptible NIL after inoculation of Xanthomonas axonopodis pv. glycines supports the conjecture that NBS-LRR genes have disease resistance functions in the soybean genome. The number of NBS-LRR genes and disease resistance QTL in the 2-Mb flanking regions of each chromosome was significantly correlated, and several recently duplicated regions that contain NBS-LRR genes harbored disease resistance QTL for both sides. In addition, NBS-LRR gene expression was significantly different between the BLP-resistant NIL and the BLP-susceptible NIL in response to bacterial infection. From these observations, NBS-LRR genes are suggested to contribute to disease resistance in soybean. Moreover, we propose models for how NBS-LRR genes were duplicated, and apply Ks values for each NBS-LRR gene cluster.

149 citations

Journal Article
TL;DR: The present genome assembly will accelerate the genomics-assisted breeding of adzuki bean and reveal 26,857 high confidence protein-coding genes evidenced by RNAseq of different tissues.
Abstract: Adzuki bean (Vigna angularis var. angularis) is a dietary legume crop in East Asia. The presumed progenitor (Vigna angularis var. nipponensis) is widely found in East Asia, suggesting speciation and domestication in these temperate climate regions. Here, we report a draft genome sequence of adzuki bean. The genome assembly covers 75% of the estimated genome and was mapped to 11 pseudo-chromosomes. Gene prediction revealed 26,857 high confidence protein-coding genes evidenced by RNAseq of different tissues. Comparative gene expression analysis with V. radiata showed that the tissue specificity of orthologous genes was highly conserved. Additional re-sequencing of wild adzuki bean, V. angularis var. nipponensis and V. nepalensis, was performed to analyze the variations between cultivated and wild adzuki bean. The determined divergence time of adzuki bean and the wild species predated archaeology-based domestication time. The present genome assembly will accelerate the genomics-assisted breeding of adzuki bean.

143 citations

Journal Article
TL;DR: The gene list catalogued in this study provides primary insight for understanding the regulation of flowering time and maturity in soybean.
Abstract: Soybean genome sequences were searched by BLAST using Arabidopsis thaliana regulatory genes involved in photoperiod-dependent flowering. This approach enabled the identification of 118 genes involved in the flowering pathway. Two genome sequences of cultivated (Williams 82) and wild (IT182932) soybeans were employed to survey functional DNA variations in the flowering-related homologs. Forty genes exhibiting nonsynonymous substitutions between G. max and G. soja were catalogued. In addition, 22 genes were found to co-localize with QTLs for six traits including flowering time, first flower, pod maturity, beginning of pod, reproductive period, and seed filling period. Among the genes overlapping the QTL regions, two LHY/CCA1 genes, GI and SFR6 contained amino acid changes. The recently duplicated sequence regions of the soybean genome were used as additional criteria for the speculation of the putative function of the homologs. Two duplicated regions showed redundancy of both flowering-related genes and QTLs. ID 12398025, which contains the homeologous regions between chr 7 and chr 16, was redundant for the LHY/CCA1 and SPA1 homologs and the QTLs. Retention of the CRY1 gene and the pod maturity QTLs was observed in the duplicated region of ID 23546507 on chr 4 and chr 6. Functional DNA variation of the LHY/CCA1 gene (Glyma07g05410) was present in a counterpart of the duplicated region on chr 7, while the gene (Glyma16g01980) present in the other portion of the duplicated region on chr 16 did not show a functional sequence change. The gene list catalogued in this study provides primary insight for understanding the regulation of flowering time and maturity in soybean.

48 citations

Journal Article
TL;DR: The results will help researchers and breeders increase energy efficiency of this important oil seed crop by improving yield and oil content, and eliminating toxic compound in seed cake for animal feed.
Abstract: Jatropha curcas (physic nut), a non‐edible oilseed crop, represents one of the most promising alternative energy sources due to its high seed oil content, rapid growth and adaptability to various environments. We report ~339 Mbp draft whole genome sequence of J. curcas var. Chai Nat using both the PacBio and Illumina sequencing platforms. We identified and categorized differentially expressed genes related to biosynthesis of lipid and toxic compound among four stages of seed development. Triacylglycerol (TAG), the major component of seed storage oil, is mainly synthesized by phospholipid:diacylglycerol acyltransferase in Jatropha, and continuous high expression of homologs of oleosin over seed development contributes to accumulation of high level of oil in kernels by preventing the breakdown of TAG. A physical cluster of genes for diterpenoid biosynthetic enzymes, including casbene synthases highly responsible for a toxic compound, phorbol ester, in seed cake, was syntenically highly conserved between Jatropha and castor bean. Transcriptomic analysis of female and male flowers revealed the up‐regulation of a dozen family of TFs in female flower. Additionally, we constructed a robust species tree enabling estimation of divergence times among nine Jatropha species and five commercial crops in Malpighiales order. Our results will help researchers and breeders increase energy efficiency of this important oil seed crop by improving yield and oil content, and eliminating toxic compound in seed cake for animal feed.

47 citations


Cited by
Journal Article
22 May 2017 - Nature
TL;DR: It is found that the genomic architecture of flowering time has been shaped by the most recent whole-genome duplication, which suggests that ancient paralogues can remain in the same regulatory networks for dozens of millions of years.
Abstract: The domesticated sunflower, Helianthus annuus L., is a global oil crop that has promise for climate change adaptation, because it can maintain stable yields across a wide variety of environmental conditions, including drought. Even greater resilience is achievable through the mining of resistance alleles from compatible wild sunflower relatives, including numerous extremophile species. Here we report a high-quality reference for the sunflower genome (3.6 gigabases), together with extensive transcriptomic data from vegetative and floral organs. The genome mostly consists of highly similar, related sequences and required single-molecule real-time sequencing technologies for successful assembly. Genome analyses enabled the reconstruction of the evolutionary history of the Asterids, further establishing the existence of a whole-genome triplication at the base of the Asterids II clade and a sunflower-specific whole-genome duplication around 29 million years ago. An integrative approach combining quantitative genetics, expression and diversity data permitted development of comprehensive gene networks for two major breeding traits, flowering time and oil metabolism, and revealed new candidate genes in these networks. We found that the genomic architecture of flowering time has been shaped by the most recent whole-genome duplication, which suggests that ancient paralogues can remain in the same regulatory networks for dozens of millions of years. This genome represents a cornerstone for future research programs aiming to exploit genetic diversity to improve biotic and abiotic stress resistance and oil production, while also considering agricultural constraints and human nutritional needs.

497 citations

Journal Article
TL;DR: Intergenomic comparisons identified lineage-specific genes and genes with copy number variation or large-effect mutations, some of which show evidence of positive selection and may contribute to variation of agronomic traits such as biotic resistance, seed composition, flowering and maturity time, organ size and final biomass.
Abstract: Wild relatives of crops are an important source of genetic diversity for agriculture, but their gene repertoire remains largely unexplored. We report the establishment and analysis of a pan-genome of Glycine soja, the wild relative of cultivated soybean Glycine max, by sequencing and de novo assembly of seven phylogenetically and geographically representative accessions. Intergenomic comparisons identified lineage-specific genes and genes with copy number variation or large-effect mutations, some of which show evidence of positive selection and may contribute to variation of agronomic traits such as biotic resistance, seed composition, flowering and maturity time, organ size and final biomass. Approximately 80% of the pan-genome was present in all seven accessions (core), whereas the rest was dispensable and exhibited greater variation than the core genome, perhaps reflecting a role in adaptation to diverse environments. This work will facilitate the harnessing of untapped genetic diversity from wild soybean for enhancement of elite cultivars.

485 citations

Journal Article
TL;DR: A comprehensive landscape of different modes of gene duplication across the plant kingdom is identified by comparing 141 genomes, which provides a solid foundation for further investigation of the dynamic evolution of duplicate genes.
Abstract: The sharp increase of plant genome and transcriptome data provide valuable resources to investigate evolutionary consequences of gene duplication in a range of taxa, and unravel common principles underlying duplicate gene retention. We survey 141 sequenced plant genomes to elucidate consequences of gene and genome duplication, processes central to the evolution of biodiversity. We develop a pipeline named DupGen_finder to identify different modes of gene duplication in plants. Genes derived from whole-genome, tandem, proximal, transposed, or dispersed duplication differ in abundance, selection pressure, expression divergence, and gene conversion rate among genomes. The number of WGD-derived duplicate genes decreases exponentially with increasing age of duplication events—transposed duplication- and dispersed duplication-derived genes declined in parallel. In contrast, the frequency of tandem and proximal duplications showed no significant decrease over time, providing a continuous supply of variants available for adaptation to continuously changing environments. Moreover, tandem and proximal duplicates experienced stronger selective pressure than genes formed by other modes and evolved toward biased functional roles involved in plant self-defense. The rate of gene conversion among WGD-derived gene pairs declined over time, peaking shortly after polyploidization. To provide a platform for accessing duplicated gene pairs in different plants, we constructed the Plant Duplicate Gene Database. We identify a comprehensive landscape of different modes of gene duplication across the plant kingdom by comparing 141 genomes, which provides a solid foundation for further investigation of the dynamic evolution of duplicate genes.

461 citations

Journal Article
TL;DR: PLAZA 4.0 is presented, the latest iteration of the PLAZA framework, providing a large increase in newly available species, and offers access to updated and newly implemented tools and visualizations, helping users with the ever-increasing demands for complex and in-depth analyses.
Abstract: PLAZA (https://bioinformatics.psb.ugent.be/plaza) is a plant-oriented online resource for comparative, evolutionary and functional genomics. The PLAZA platform consists of multiple independent instances focusing on different plant clades, while also providing access to a consistent set of reference species. Each PLAZA instance contains structural and functional gene annotations, gene family data and phylogenetic trees and detailed gene colinearity information. A user-friendly web interface makes the necessary tools and visualizations accessible, specific for each data type. Here we present PLAZA 4.0, the latest iteration of the PLAZA framework. This version consists of two new instances (Dicots 4.0 and Monocots 4.0) providing a large increase in newly available species, and offers access to updated and newly implemented tools and visualizations, helping users with the ever-increasing demands for complex and in-depth analyses. The total number of species across both instances nearly doubles from 37 species in PLAZA 3.0 to 71 species in PLAZA 4.0, with a much broader coverage of crop species (e.g. wheat, palm oil) and species of evolutionary interest (e.g. spruce, Marchantia). The new PLAZA instances can also be accessed by a programming interface through a RESTful web service, thus allowing bioinformaticians to optimally leverage the power of the PLAZA platform.

378 citations

Journal Article
TL;DR: It is proposed that future practical breeding platforms should adopt automated genotyping technologies, either array or sequencing based, target functional polymorphisms underpinning economic traits, and provide desirable prediction accuracy for quantitative traits, with universal applications under wide genetic backgrounds in crops.

338 citations