Showing papers by "Qijian Song" published in 2022


Journal ArticleDOI
TL;DR: This paper shows that a causative TE insertion in the CCT gene POWR1 truncates the gene's CCT domain and substantially increases seed oil content, weight, and yield while decreasing protein content, which significantly contributes to shaping soybean with increased yield/seed weight/oil but reduced protein content.
Abstract: Seed protein, oil content and yield are highly correlated agronomically important traits that essentially account for the economic value of soybean. The underlying molecular mechanisms and selection of these correlated seed traits during soybean domestication are, however, less known. Here, we demonstrate that a CCT gene, POWR1, underlies a large-effect protein/oil QTL. A causative TE insertion truncates its CCT domain and substantially increases seed oil content, weight, and yield while decreasing protein content. POWR1 pleiotropically controls these traits, likely through regulating seed nutrient transport and lipid metabolism genes. POWR1 is also a domestication gene. We hypothesize that the TE insertion allele is exclusively fixed in cultivated soybean due to selection for larger seeds during domestication, which significantly contributes to shaping soybean with increased yield/seed weight/oil but reduced protein content. This study provides insights into soybean domestication and is significant in improving seed quality and yield in soybean and other crop species.

19 citations


Journal ArticleDOI
TL;DR: In this paper, the raw resequencing data of 1465 publicly available soybean genomes and 91 newly sequenced, highly diverse wild genomes were consolidated and annotated with 30 categories of structural and functional information.
Abstract: With advances in next-generation sequencing technologies, an unprecedented number of soybean accessions have been sequenced by many individual studies and made available as raw sequencing reads for post-genomic research. To develop a consolidated and user-friendly genomic resource for post-genomic research, we consolidated the raw resequencing data of 1465 publicly available soybean genomes and 91 newly sequenced, highly diverse wild soybean genomes. Together these provided a collection of 1556 sequenced genomes of 1501 diverse accessions (1.5 K). The collection comprises wild accessions, landraces and elite cultivars of soybean that were grown in East Asia or major soybean cultivating areas around the world. Our extensive sequence analysis discovered 32 million single nucleotide polymorphisms (32mSNPs) and revealed a SNP density of 30 SNPs/kb and 12 non-synonymous SNPs/gene, reflecting the high structural and functional genomic diversity of the new collection. Each SNP was annotated with 30 categories of structural and/or functional information. We further identified paired accessions between the 1.5 K and the 20,087 (20 K) accessions in the US collection as genomic "equivalent" accessions sharing the highest genomic identity, to minimize the barriers in soybean germplasm exchange between countries. We also exemplified the utility of the 32mSNPs in enhancing post-genomics research through in-silico genotyping, high-resolution GWAS, discovering and/or characterizing genes and alleles/mutations, and identifying germplasms containing beneficial alleles that are potentially experiencing artificial selection. The comprehensive analysis of publicly available large-scale genome sequencing data of diverse cultivated accessions and the newly in-house sequenced wild accessions greatly increased the soybean genome-wide variation resolution. This could facilitate a variety of genetic and molecular-level analyses in soybean.
The 32mSNPs and 1.5 K accessions with their comprehensive annotation have been made available at the SoyBase and Ag Data Commons. The dataset could further serve as a versatile and expandable core resource for exploring the exponentially increasing genome sequencing data for a variety of post-genomic research.
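The counts quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check on the published figures; the "implied span" is a derived quantity (total SNPs divided by SNP density), not a number stated by the authors, and the variable names are illustrative.

```python
# Figures quoted in the abstract.
public_genomes = 1465          # publicly available resequenced genomes
new_wild_genomes = 91          # newly sequenced wild genomes
reported_total_genomes = 1556  # total stated in the abstract
total_snps = 32_000_000        # "32mSNPs"
snps_per_kb = 30               # reported SNP density

# The two genome sources should add up to the reported total.
genomes_sum = public_genomes + new_wild_genomes

# Derived (assumption: density is averaged over the whole analysed sequence):
# the sequence span implied by the SNP count and density, in gigabases.
implied_span_gb = total_snps / snps_per_kb / 1_000_000

print(f"genomes: {genomes_sum} (reported {reported_total_genomes})")
print(f"implied span: {implied_span_gb:.2f} Gb")
```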

8 citations


Journal ArticleDOI
TL;DR: Zhang et al. mapped the quantitative trait locus QNE1 (QTL near E1) for flowering time to the region proximal to E1 on chromosome 6 in two mapping populations.
Abstract: The soybean E1 gene is a major regulator that plays an important role in flowering time and maturity. However, it remains unclear how cultivars carrying the dominant E1 allele adapt to the higher latitudinal areas of northern China. We mapped the novel quantitative trait locus QNE1 (QTL near E1) for flowering time to the region proximal to E1 on chromosome 6 in two mapping populations. Positional cloning revealed Glyma.06G204300, encoding a TCP-type transcription factor, as a strong candidate gene for QNE1. Association analysis further confirmed that functional single nucleotide polymorphisms (SNPs) at nucleotides 686 and 1,063 in the coding region of Glyma.06G204300 were significantly associated with flowering time. The protein encoded by the candidate gene is localized primarily to the nucleus. Furthermore, soybean and Brassica napus plants overexpressing Glyma.06G204300 exhibited early flowering. We conclude that despite their similar effects on flowering time, QNE1 and E4 may control flowering time through different regulatory mechanisms, based on expression studies and weighted gene co-expression network analysis of flowering time-related genes. Deciphering the molecular basis of QNE1 control of flowering time enriches our knowledge of flowering gene networks in soybean and will facilitate breeding soybean cultivars with broader latitudinal adaptation.

5 citations


Journal ArticleDOI
TL;DR: In this paper, quantitative trait loci (QTL) were mapped using PI 90763 through two biparental F3:4 recombinant inbred line (RIL) populations segregating for rhg1-a and rhg1-b alleles against a SCN HG type 1.2.5.7 (Race 2) population.
Abstract: An epistatic interaction between SCN resistance loci rhg1-a and rhg2 in PI 90763 imparts resistance against virulent SCN populations, which can be employed to diversify SCN resistance in soybean cultivars. With more than 95% of the $46.1B soybean market dominated by a single type of genetic resistance, breeding for soybean cyst nematode (SCN)-resistant soybean that can effectively combat the widespread increase in virulent SCN populations presents a significant challenge. Rhg genes (for Resistance to Heterodera glycines) play a key role in resistance to SCN; however, their deployment beyond the use of the rhg1-b allele has been limited. In this study, quantitative trait loci (QTL) were mapped using PI 90763 through two biparental F3:4 recombinant inbred line (RIL) populations segregating for rhg1-a and rhg1-b alleles against a SCN HG type 1.2.5.7 (Race 2) population. QTL located on chromosome 18 (rhg1-a) and chromosome 11 (rhg2) were determined to confer SCN resistance in PI 90763. The rhg2 gene was fine-mapped to a 169-Kbp region, pinpointing GmSNAP11 as the strongest candidate gene. We demonstrated a unique epistatic interaction between the rhg1-a and rhg2 loci that confers resistance to multiple virulent SCN populations. Further, we showed that pyramiding rhg2 with the conventional mode of resistance, rhg1-b, is ineffective against these virulent SCN populations. This highlights the importance of pyramiding rhg1-a and rhg2 to maximize the impact of gene pyramiding strategies toward management of SCN populations virulent on rhg1-b sources of resistance. Our results lay the foundation for the next generation of soybean resistance breeding to combat the number one pathogen of soybean.

5 citations


Journal ArticleDOI
TL;DR: In this article, quantitative trait loci (QTL) controlling protein content were explored via genome-wide association studies (GWAS) and linkage mapping approaches based on 284 soybean accessions and 180 recombinant inbred lines (RILs), respectively, which were evaluated for protein content for 4 years.
Abstract: Soybean is a primary source of meal protein for human consumption and for poultry and livestock feed. In this study, quantitative trait loci (QTL) controlling protein content were explored via genome-wide association studies (GWAS) and linkage mapping approaches based on 284 soybean accessions and 180 recombinant inbred lines (RILs), respectively, which were evaluated for protein content for 4 years. A total of 22 single nucleotide polymorphisms (SNPs) associated with protein content were detected using mixed linear model (MLM) and general linear model (GLM) methods in Tassel, and 5 QTLs were detected using Bayesian interval mapping (IM), single-trait multiple interval mapping (SMIM), single-trait composite interval mapping maximum likelihood estimation (SMLE), and single marker regression (SMR) models in Q-Gene and IciMapping. Major QTLs were detected on chromosomes 6 and 20 in both populations. The new QTL genomic region on chromosome 6 (Chr6_18844283–19315351) included 7 candidate genes, and the Hap.XAA at the Chr6_19172961 position was associated with high protein content. Genomic selection (GS) of protein content was performed using Bayesian Lasso (BL) and ridge regression best linear unbiased prediction (rrBLUP) based on all the SNPs and on the SNPs significantly associated with protein content resulting from GWAS. The results showed that BL and rrBLUP performed similarly; GS accuracy was dependent on the SNP set and training population size. GS efficiency was higher for the SNPs derived from GWAS than for random SNPs and reached a plateau when the number of markers was >2,000. The SNP markers identified in this study and other information are essential in establishing efficient marker-assisted selection (MAS) and GS pipelines for improving soybean protein content.

5 citations


Journal ArticleDOI
TL;DR: It is concluded that the genomic diversity shaped by multiple selective breeding programs can result in gene pools of highly productive elite lines with similar allelic compositions in a genome-wide perspective.

3 citations


Journal ArticleDOI
TL;DR: The results provide better insights into utilizing wild soybean as a source of genetic diversity for soybean cultivar improvement using native traits, and the study developed a new Kompetitive allele-specific polymerase chain reaction (KASP) assay that will be useful for introgression of this trait into modern elite G. max cultivars.
Abstract: Modern soybean [Glycine max (L.) Merr] cultivars have low overall genetic variation due to repeated bottleneck events that arose during domestication and from selection strategies typical of many soybean breeding programs. In both public and private soybean breeding programs, the introgression of wild soybean (Glycine soja Siebold and Zucc.) alleles is a viable option to increase genetic diversity and identify new sources for traits of value. The objectives of our study were to examine the genetic architecture responsible for seed protein and oil using a recombinant inbred line (RIL) population derived from hybridizing a G. max line (‘Osage’) with a G. soja accession (PI 593983). Linkage mapping identified a total of seven significant quantitative trait loci on chromosomes 14 and 20 for seed protein and on chromosome 8 for seed oil with LOD scores ranging from 5.3 to 31.7 for seed protein content and from 9.8 to 25.9 for seed oil content. We analyzed 3,015 single F4:9 soybean plants to develop two residual heterozygotes derived near isogenic lines (RHD-NIL) populations by targeting nine SNP markers from genotype-by-sequencing, which corresponded to two novel quantitative trait loci (QTL) derived from G. soja: one for a novel seed oil QTL on chromosome 8 and another for a novel protein QTL on chromosome 14. Single marker analysis and linkage analysis using 50 RHD-NILs validated the chromosome 14 protein QTL, and whole genome sequencing of RHD-NILs allowed us to reduce the QTL interval from ∼16.5 to ∼4.6 Mbp. We identified two genomic regions based on recombination events which had significant increases of 0.65 and 0.72% in seed protein content without a significant decrease in seed oil content. A new Kompetitive allele-specific polymerase chain reaction (KASP) assay, which will be useful for introgression of this trait into modern elite G. max cultivars, was developed in one region. 
Within the significantly associated genomic regions, a total of eight genes are considered as candidate genes, based on the presence of gene annotations associated with the protein or amino acid metabolism/movement. Our results provide better insights into utilizing wild soybean as a source of genetic diversity for soybean cultivar improvement utilizing native traits.

3 citations


Journal ArticleDOI
TL;DR: The diagnostic SNP markers developed for each gene in the current study will facilitate marker-assisted selections of resistance genes in sunflower breeding programs.
Abstract: Rust and downy mildew (DM) are two important sunflower diseases that lead to significant yield losses globally. The use of resistant hybrids to control rust and DM in sunflower has a long history. The rust resistance genes, R13a and R16, were previously mapped to a 3.4 Mb region at the lower end of sunflower chromosome 13, while the DM resistance gene, Pl33, was previously mapped to a 4.2 Mb region located at the upper end of chromosome 4. High-resolution fine mapping was conducted using whole genome sequencing of HA-R6 (R13a) and TX16R (R16 and Pl33) and large segregated populations. R13a and R16 were fine mapped to a 0.48 cM region in chromosome 13 corresponding to a 790 kb physical interval on the XRQr1.0 genome assembly. Four disease defense-related genes with nucleotide-binding leucine-rich repeat (NLR) motifs were found in this region from XRQr1.0 gene annotation as candidate genes for R13a and R16. Pl33 was fine mapped to a 0.04 cM region in chromosome 4 corresponding to a 63 kb physical interval. One NLR gene, HanXRQChr04g0095641, was predicted as the candidate gene for Pl33. The diagnostic SNP markers developed for each gene in the current study will facilitate marker-assisted selections of resistance genes in sunflower breeding programs.

3 citations


Journal ArticleDOI
TL;DR: In this article, the authors compared the imputation performance of the software BEAGLE 5.0, IMPUTE 5 and AlphaPlantImpute and tested software parameters that may help to improve imputation accuracy in soybean populations.
Abstract: Key message: Software for high imputation accuracy in soybean was identified. The imputed dataset could significantly reduce the interval of genomic regions controlling traits, thus greatly improving the efficiency of candidate gene identification. Genotype imputation is a strategy to increase marker density of existing datasets without additional genotyping. We compared the imputation performance of the software BEAGLE 5.0, IMPUTE 5 and AlphaPlantImpute and tested software parameters that may help to improve imputation accuracy in soybean populations. Several factors, including marker density, extent of linkage disequilibrium (LD), and minor allele frequency (MAF), were examined for their effects on imputation accuracy across the different software. Our results showed that AlphaPlantImpute had a higher imputation accuracy than BEAGLE 5.0 or IMPUTE 5 in each soybean family tested, especially if the study progeny were genotyped with an extremely low number of markers. LD extent, MAF and reference panel size were positively correlated with imputation accuracy; a minimum of 50 markers per chromosome and a MAF of SNPs > 0.2 in the soybean lines were required to avoid a significant loss of imputation accuracy. Using the software, we imputed 5176 soybean lines in the soybean nested association mapping (NAM) population with high-density markers of the 40 parents. The dataset containing 423,419 markers for 5176 lines and 40 parents was deposited at SoyBase. The imputed NAM dataset was further examined for the improvement of mapping quantitative trait loci (QTL) controlling soybean seed protein content. Most of the QTL identified were at identical or similar positions based on the initial and imputed datasets; however, QTL intervals were greatly narrowed. The resulting genotypic dataset of the NAM population will facilitate QTL mapping of traits and downstream applications. The information will also help to improve genotype imputation accuracy in self-pollinated crops.
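The two thresholds reported in this abstract (at least 50 markers per chromosome, MAF > 0.2) and the idea of imputation accuracy can be illustrated with a minimal sketch. It does not reproduce BEAGLE, IMPUTE or AlphaPlantImpute. It assumes genotypes coded as alternate-allele counts (0/1/2, `None` for missing) and measures accuracy as simple concordance between masked true calls and imputed calls; the abstract does not say which accuracy metric the authors used, and all function names here are hypothetical.

```python
from collections import Counter

MIN_MARKERS_PER_CHROM = 50  # threshold quoted in the abstract
MIN_MAF = 0.2               # threshold quoted in the abstract


def minor_allele_frequency(genotypes):
    """MAF for one SNP; genotypes are alt-allele counts 0/1/2, None = missing."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1 - alt_freq)


def concordance(true_calls, imputed_calls):
    """Fraction of known (masked) genotypes that imputation recovered exactly."""
    pairs = [(t, i) for t, i in zip(true_calls, imputed_calls) if t is not None]
    if not pairs:
        raise ValueError("no genotypes to compare")
    return sum(t == i for t, i in pairs) / len(pairs)


def chromosomes_below_marker_minimum(markers, genotypes_by_marker):
    """Chromosomes with too few markers left after the MAF filter.

    markers: list of (marker_id, chromosome) tuples.
    genotypes_by_marker: dict mapping marker_id to its genotype list.
    """
    kept = Counter()
    chromosomes = set()
    for marker_id, chrom in markers:
        chromosomes.add(chrom)
        if minor_allele_frequency(genotypes_by_marker[marker_id]) > MIN_MAF:
            kept[chrom] += 1
    return sorted(c for c in chromosomes if kept[c] < MIN_MARKERS_PER_CHROM)
```

In use, one would mask a subset of known genotypes, run the imputation tool of choice, and pass the masked truth and the imputed calls to `concordance`.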

2 citations


Journal ArticleDOI
TL;DR: The study identified seven quantitative trait loci (QTLs) with additive effects and indicated that the plant–pathogen interaction played an important role in the SCN resistance of Handou 10; the novel resistance gene will be a source for improving soybean resistance to SCN.
Abstract: Soybean cyst nematode (SCN; Heterodera glycines Ichinohe) is a highly destructive pathogen for soybean production worldwide. The use of resistant varieties is the most effective way of preventing yield loss. Handou 10 is a commercial soybean variety with desirable agronomic traits and SCN resistance; however, the genes underlying the SCN resistance in the variety are unknown. An F2:8 recombinant inbred line (RIL) population derived from a cross between Zheng 9525 (susceptible) and Handou 10 was developed, and its resistance to SCN HG type 2.5.7 (race 1) and 1.2.5.7 (race 2) was evaluated. We identified seven quantitative trait loci (QTLs) with additive effects. Among these, three QTLs on chromosomes 7, 8, and 18 conferred resistance to both races. These QTLs could explain 1.91–7.73% of the phenotypic variation in the SCN female index. The QTLs on chromosomes 8 and 18 have already been reported and most likely overlap with the rhg1 and Rhg4 loci, respectively. However, the QTL on chromosome 7 was novel. Candidate genes for the three QTLs were predicted through gene functional analysis and transcriptome analysis of infected roots of Handou 10 vs. Zheng 9525. The transcriptome analysis also indicated that the plant–pathogen interaction played an important role in the SCN resistance of Handou 10. The information will facilitate the cloning of SCN resistance genes, and the novel resistance gene will be a source for improving soybean resistance to SCN.

Journal ArticleDOI
TL;DR: The accuracy of MIPs combined with its low per sample cost makes it a powerful tool to enable genomic selection within soybean breeding programs.
Abstract: Increasing the rate of genetic gain for key agronomic traits through genomic selection requires the development of new molecular methods to assay genome-wide single nucleotide polymorphisms (SNPs). The main limitation of current methods is that their cost is too high to screen breeding populations. Molecular inversion probes (MIPs) are a targeted genotyping-by-sequencing method that could be used for soybean; the method is cost-effective and high-throughput and provides high data quality for screening breeders' germplasm for genomic selection. A 1K MIP SNP set was developed for soybean with uniformly distributed markers across the genome. The SNPs were selected to maximize the number of informative markers in germplasm being tested in soybean breeding programs located in the North Central and Mid-South regions of the United States. The 1K SNP MIP set was tested on diverse germplasm and a recombinant inbred line population. Targeted sequencing with MIPs obtained an 85% enrichment for the targeted SNPs. MIP genotyping accuracy was 93% overall, while homozygous call accuracy was 98% with less than 10% missing data. The accuracy of MIPs combined with their low per-sample cost makes them a powerful tool to enable genomic selection within soybean breeding programs.

Journal ArticleDOI
TL;DR: USDA-N7005 is an early maturity group VII, F4-derived germplasm with excellent yield potential that traces 62.5% of its pedigree to Japanese accessions that are not part of the historical genetic base of U.S. soybean breeding.
Abstract: USDA-N7005 soybean [Glycine max (L.) Merr.] (Reg. no. GP-457, PI 699962) was released by the USDA-ARS and the North Carolina Agricultural Research Service in August 2021. USDA-N7005 is an early-maturity group VII, F4-derived germplasm with excellent yield potential that traces 62.5% of its pedigree to Japanese accessions that are not part of the historical genetic base of U.S. soybean breeding. Currently, Japanese germplasm constitutes only a small portion of the U.S. soybean base. USDA-N7005 was derived from the cross of cultivar ‘USDA-N7002’ × Japanese cultivar ‘Tamahikari’. USDA-N7005 traces 12.5% of its ancestry to Japanese landrace PI 416937 via USDA-N7002 and 50% to Japanese cultivar Tamahikari. USDA-N7005 is the second public release in the United States derived from Tamahikari. Over 13 environments of the United Soybean Board Southern Diversity Yield Trials and 19 test environments of the USDA Uniform Soybean Tests-Southern States, USDA-N7005 yielded 108% (p < .05) and 101% of the adapted parent cultivar USDA-N7002, respectively, and matured 2–3 d earlier. The new release also yielded 100% of check cultivars ‘NC-Roy’ and ‘USDA-N7003CN’ and 96% of cultivar ‘NC-Dilday’ in the Uniform Tests. USDA-N7005 was similar in height and lodging to parent USDA-N7002 but exhibited elevated seed oil content and larger seed size. USDA-N7005 was resistant to root-knot nematode and stem canker, with resistance comparable to that of resistant parent USDA-N7002. The superior agronomic performance and diverse pedigree of USDA-N7005 make it desirable parental stock for broadening the base of U.S. soybean breeding.
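The pedigree percentages in this abstract follow from the usual expectation that each parent contributes half of a progeny's ancestry. The sketch below reproduces the 62.5% figure under that assumption. The 25% share of PI 416937 in USDA-N7002 is not stated in the abstract; it is inferred here from the reported 12.5% that reaches USDA-N7005 via USDA-N7002, and the `"other"` label is a placeholder for all remaining ancestry.

```python
from fractions import Fraction


def expected_ancestry(parent_a, parent_b):
    """Expected ancestry of a cross: the average of the two parents' ancestry."""
    sources = set(parent_a) | set(parent_b)
    return {s: (parent_a.get(s, 0) + parent_b.get(s, 0)) / 2 for s in sources}


# Assumption: USDA-N7002 traces 1/4 of its pedigree to PI 416937
# (inferred from the 12.5% reported for USDA-N7005, not stated directly).
usda_n7002 = {"PI 416937": Fraction(1, 4), "other": Fraction(3, 4)}
tamahikari = {"Tamahikari": Fraction(1)}

usda_n7005 = expected_ancestry(usda_n7002, tamahikari)

# Japanese ancestry = landrace PI 416937 plus cultivar Tamahikari.
japanese_share = usda_n7005["PI 416937"] + usda_n7005["Tamahikari"]
print(f"Japanese ancestry: {float(japanese_share):.1%}")
```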

Journal ArticleDOI
TL;DR: In this paper, the effect of freeze-thaw on isoflavone composition in germinated soybeans was studied, particularly the conversion of aglycones, one of the monomers with high biological activity.

Journal ArticleDOI
TL;DR: In this paper, genomic selection (GS) was performed on soybean protein and oil content using the Ridge Regression Best Linear Unbiased Predictor (RR-BLUP) based on 1,007 soybean accessions.
Abstract: Introduction: Genomic selection (GS) is a potential breeding approach for soybean improvement. Methods: In this study, GS was performed on soybean protein and oil content using the Ridge Regression Best Linear Unbiased Predictor (RR-BLUP) based on 1,007 soybean accessions. The SoySNP50K SNP dataset of the accessions was obtained from the USDA-ARS, Beltsville, MD lab, and the protein and oil content of the accessions were obtained from GRIN. Results: Our results showed that the prediction accuracy of oil content was higher than that of protein content. When the training population size was 100, the prediction accuracies for protein content and oil content were 0.60 and 0.79, respectively. The prediction accuracy increased with the size of the training population. Training populations with similar phenotypes or with close genetic relationships to the prediction population exhibited better prediction accuracy. The greatest prediction accuracy for both protein and oil content was observed when approximately 3,000 markers with -log10(P) greater than 1 were included. Discussion: This information will help improve GS efficiency and facilitate the application of GS.
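The following sketch illustrates the general idea behind ridge-regression genomic prediction on simulated data; it is not the authors' pipeline and uses no SoySNP50K or GRIN data. Marker effects are estimated by solving the ridge normal equations with a fixed penalty `lam` (RR-BLUP proper derives the penalty from variance components, which is skipped here). Prediction accuracy is taken to be the Pearson correlation between predicted and observed values in a held-out set, a common convention that the abstract does not spell out. All names and simulation settings are assumptions.

```python
import random
from statistics import mean


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ridge_marker_effects(Z, y, lam):
    """Estimate marker effects u from (Z'Z + lam*I) u = Z'(y - mean(y))."""
    p = len(Z[0])
    mu = mean(y)
    lhs = [[sum(row[i] * row[j] for row in Z) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    rhs = [sum(row[i] * (yi - mu) for row, yi in zip(Z, y)) for i in range(p)]
    return mu, solve(lhs, rhs)


def predict(Z, mu, u):
    """Predicted phenotype: training mean plus the sum of marker effects."""
    return [mu + sum(z * e for z, e in zip(row, u)) for row in Z]


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


def demo_accuracy(seed=1, n_train=60, n_test=40, p=20, lam=1.0):
    """Simulate inbred lines (markers coded -1/1) and return prediction accuracy."""
    rng = random.Random(seed)
    true_u = [rng.gauss(0, 1) for _ in range(p)]

    def simulate(n):
        Z = [[rng.choice((-1, 1)) for _ in range(p)] for _ in range(n)]
        y = [sum(z * e for z, e in zip(row, true_u)) + rng.gauss(0, 0.5)
             for row in Z]
        return Z, y

    Z_train, y_train = simulate(n_train)
    Z_test, y_test = simulate(n_test)
    mu, u = ridge_marker_effects(Z_train, y_train, lam)
    return pearson(predict(Z_test, mu, u), y_test)
```

Calling `demo_accuracy()` with a smaller `n_train` gives a quick feel for the abstract's observation that accuracy depends on training population size, although the simulated trait here is far simpler than real protein or oil content.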

Journal ArticleDOI
17 Sep 2022-bioRxiv
TL;DR: The characterization of recombination hotspots in these two large soybean bi-parental populations demonstrates that hotspots do occur throughout the soybean genome and are enriched for specific motifs but their locations may not be conserved between different populations.
Abstract: Recombination allows for the exchange of genetic material between two parents, which plant breeders exploit to make new and improved cultivars. This recombination is not distributed evenly across the chromosome. In crops, recombination mostly occurs in euchromatic regions of the genome, and even then, recombination is focused into clusters of crossovers termed recombination hotspots. Understanding the distribution of these hotspots along with the sequence motifs associated with them may lead to methods that enable breeders to better exploit recombination in breeding. To map recombination hotspots and identify sequence motifs associated with hotspots in soybean [Glycine max (L.) Merr.], two bi-parental recombinant inbred line (RIL) populations were genotyped with 50,000 SNP markers using the SoySNP50k Illumina Infinium assay. A total of 451 recombination hotspots were identified in the two populations. Despite being half-sib populations, only 18 hotspots were in common between the two populations. While pericentromeric regions did exhibit extreme suppression of recombination, twenty-seven percent of the detected hotspots were located in the pericentromeric regions of the chromosomes. Two genomic motifs associated with hotspots are similar to those found in human, dog, rice, wheat, Drosophila, and Arabidopsis. These motifs were a CCN repeat motif and a poly-A motif. Genomic regions spanning other hotspots were significantly enriched with the Tourist family of miniature inverted-repeat transposable elements (MITEs), which resides in less than 0.34% of the soybean genome. The characterization of recombination hotspots in these two large soybean bi-parental populations demonstrates that hotspots do occur throughout the soybean genome and are enriched for specific motifs, but their locations may not be conserved between different populations.