Showing papers on "Microsatellite" published in 2018


Journal ArticleDOI
TL;DR: This review attempts to give an account of different molecular markers currently available for genome mapping and for tagging different traits - restriction fragment length polymorphisms (RFLPs), random amplified polymorphic DNAs (RAPDs), amplified fragment length polymorphisms (AFLPs) and microsatellites.
Abstract: In recent years, molecular markers have been developed based on the more detailed knowledge of genome structure. Considerable emphasis has been laid on the use of molecular markers in practical breeding and genotype identification. This review attempts to give an account of different molecular markers currently available for genome mapping and for tagging different traits - restriction fragment length polymorphisms (RFLPs), random amplified polymorphic DNAs (RAPDs), amplified fragment length polymorphisms (AFLPs) and microsatellites. Other markers, expressed sequence tags (ESTs) and single nucleotide polymorphisms (SNPs) are also mentioned. The importance of structural, functional genomic and comparative mapping is also discussed.

73 citations


Journal ArticleDOI
TL;DR: This work presents a double‐digest restriction site‐associated DNA sequencing (ddRAD‐seq) analysis pipeline that simultaneously achieves the SNP discovery and genotyping steps and which is optimized to return a statistically powerful set of SNP markers from large numbers of individuals.
Abstract: Information on genetic relationships among individuals is essential to many studies of the behaviour and ecology of wild organisms. Parentage and relatedness assays based on large numbers of single nucleotide polymorphism (SNP) loci hold substantial advantages over the microsatellite markers traditionally used for these purposes. We present a double-digest restriction site-associated DNA sequencing (ddRAD-seq) analysis pipeline that, as such, simultaneously achieves the SNP discovery and genotyping steps and which is optimized to return a statistically powerful set of SNP markers (typically 150-600 after stringent filtering) from large numbers of individuals (up to 240 per run). We explore the trade-offs inherent in this approach through a set of experiments in a species with a complex social system, the variegated fairy-wren (Malurus lamberti) and further validate it in a phylogenetically broad set of other bird species. Through direct comparisons with a parallel data set from a robust panel of highly variable microsatellite markers, we show that this ddRAD-seq approach results in substantially improved power to discriminate among potential relatives and considerably more precise estimates of relatedness coefficients. The pipeline is designed to be universally applicable to all bird species (and with minor modifications to many other taxa), to be cost- and time-efficient, and to be replicable across independent runs such that genotype data from different study periods can be combined and analysed as field samples are accumulated.

65 citations


Journal ArticleDOI
TL;DR: The objective of this review is to provide a comprehensive summary of the updated knowledge of colorectal cancer classification and diagnostic features of microsatellite instability.
Abstract: Colorectal cancer (CRC) is a heterogeneous disease that is caused by the interaction of genetic and environmental factors. Although it is one of the most common cancers worldwide, CRC would be one of the most curable cancers if it were detected in the early stages. Molecular changes that occur in colorectal cancer may be categorized into three main groups: 1) Chromosomal Instability (CIN), 2) Microsatellite Instability (MSI), and 3) CpG Island Methylator phenotype (CIMP). Microsatellites, also known as Short Tandem Repeats (STRs), are small (1-6 base pairs) repeating stretches of DNA scattered throughout the entire genome and account for approximately 3% of the human genome. Due to their repeated structure, microsatellites are prone to high mutation rates. Microsatellite instability (MSI) is a unique molecular alteration and hyper-mutable phenotype, which is the result of a defective DNA mismatch repair (MMR) system, and can be defined as the presence of alternate sized repetitive DNA sequences which are not present in the corresponding germ line DNA. The presence of MSI is found in sporadic colon, gastric, sporadic endometrial and the majority of other cancers. Approximately 15-20% of colorectal cancers display MSI. Determination of MSI status in CRC has prognostic and therapeutic implications. As well, detecting MSI is used diagnostically for tumor detection and classification. For these reasons, microsatellite instability analysis is becoming more and more important in colorectal cancer patients. The objective of this review is to provide a comprehensive summary of the updated knowledge of colorectal cancer classification and diagnostic features of microsatellite instability.

65 citations
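
Note: the definition of MSI in the abstract above (allele sizes present in tumour DNA but absent from the matched germline DNA) can be illustrated with a minimal sketch. The locus names are borrowed from the NaME-PrO entry further down this list; the allele sizes, the function name and the data layout are hypothetical, and real MSI calling must also deal with stutter peaks, which this sketch ignores.

```python
def unstable_loci(germline, tumour):
    """Return loci where the tumour shows allele sizes absent from germline.

    Both arguments map a locus name to a set of observed allele sizes (bp).
    """
    return sorted(
        locus
        for locus, tumour_sizes in tumour.items()
        if tumour_sizes - germline.get(locus, set())
    )


# Hypothetical allele sizes in base pairs, for illustration only.
germline = {"BAT25": {124}, "BAT26": {120}, "NR21": {103}}
tumour = {"BAT25": {124, 118}, "BAT26": {120}, "NR21": {103, 99}}

shifted = unstable_loci(germline, tumour)
fraction_unstable = len(shifted) / len(tumour)
print(shifted, round(fraction_unstable, 2))
```

How many unstable loci are needed to call a sample MSI-positive depends on the marker panel and guideline in use, so the sketch reports only the fraction.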


Journal ArticleDOI
TL;DR: This work suggests that the development and application of large sequenced microsatellite panels presents great potential for stock resolution in Atlantic salmon and more broadly in other exploited anadromous and marine species.
Abstract: Individual assignment and genetic mixture analysis are commonly utilized in contemporary wildlife and fisheries management. Although microsatellite loci provide unparalleled numbers of alleles per locus, their use in assignment applications is increasingly limited. However, next-generation sequencing, in conjunction with novel bioinformatic tools, allows large numbers of microsatellite loci to be simultaneously genotyped, presenting new opportunities for individual assignment and genetic mixture analysis. Here, we scanned the published Atlantic salmon genome to identify 706 microsatellite loci, from which we developed a final panel of 101 microsatellites distributed across the genome (average 3.4 loci per chromosome). Using samples from 35 Atlantic salmon populations (n = 1,485 individuals) from coastal Labrador, Canada, a region characterized by low levels of differentiation in this species, this panel identified 844 alleles (average of 8.4 alleles per locus). Simulation-based evaluations of assignment and mixture identification accuracy revealed unprecedented resolution, clearly identifying 26 rivers or groups of rivers spanning 500 km of coastline. This baseline was used to examine the stock composition of 696 individuals harvested in the Labrador Atlantic salmon fishery and revealed that coastal fisheries largely targeted regional groups (<300 km). This work suggests that the development and application of large sequenced microsatellite panels presents great potential for stock resolution in Atlantic salmon and more broadly in other exploited anadromous and marine species.

56 citations


Journal ArticleDOI
21 Feb 2018
TL;DR: Identification of diverse parents based on cluster analysis can be effectively done with EST-SSR as the genetic similarity estimates are based on functional attributes related to morphological/agronomical traits.
Abstract: Twenty-five primer pairs developed from genomic simple sequence repeats (SSR) were compared with 25 expressed sequence tags (EST) SSRs to evaluate the efficiency of these two sets of primers using 59 sugarcane genetic stocks. The mean polymorphism information content (PIC) of genomic SSR was higher (0.72) compared to the PIC value recorded by the EST-SSR markers (0.62). The relatively low level of polymorphism in EST-SSR markers may be due to the location of these markers in more conserved and expressed sequences compared to genomic sequences which are spread throughout the genome. Dendrograms based on the genomic SSR and EST-SSR marker data showed differences in grouping of genotypes. A total of 59 sugarcane accessions were grouped into 6 and 4 clusters using genomic SSR and EST-SSR, respectively. The highly efficient genomic SSR could subcluster the genotypes of some of the clusters formed by EST-SSR markers. The difference in dendrograms observed was probably due to the variation in the number of markers produced by genomic SSR and EST-SSR and the different portions of the genome amplified by the two marker types. The combined dendrogram (genomic SSR and EST-SSR) more clearly showed the genetic relationship among the sugarcane genotypes by forming four clusters. The mean genetic similarity (GS) value obtained using EST-SSR among 59 sugarcane accessions was 0.70, whereas the mean GS obtained using genomic SSR was 0.63. Although a relatively lower level of polymorphism was displayed by the EST-SSR markers, the genetic diversity shown by the EST-SSR was found to be promising as they are functional markers. The high PIC and low genetic similarity values of genomic SSR may be more useful in DNA fingerprinting, selection of true hybrids, identification of variety specific markers and genetic diversity analysis. 
Identification of diverse parents based on cluster analysis can be effectively done with EST-SSR as the genetic similarity estimates are based on functional attributes related to morphological/agronomical traits.

55 citations
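
Note: the abstract above reports PIC values without defining the statistic. The sketch below uses the widely cited formula for codominant markers, PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2. Whether the authors used this exact estimator is an assumption (sugarcane is polyploid, and band-based scoring often uses a different form), and the allele frequencies are hypothetical.

```python
from itertools import combinations


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    correction = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - correction


# Hypothetical frequencies: four equally common alleles vs. one dominant allele.
even = pic([0.25, 0.25, 0.25, 0.25])
skewed = pic([0.85, 0.05, 0.05, 0.05])
print(round(even, 3), round(skewed, 3))
```

The comparison shows why loci with many evenly distributed alleles score higher: the same number of alleles yields a much lower PIC when one allele dominates.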


Journal ArticleDOI
TL;DR: A mini-review puts recent developments in age estimation via (epi)genetic methods in the context of the requirements and goals of forensic genetics and highlights paths to follow in the future of forensic genomics.
Abstract: Forensic genetics developed from protein-based techniques a quarter of a century ago and became famous as "DNA fingerprinting," this being based on restriction fragment length polymorphisms (RFLPs) of high-molecular-weight DNA. The amplification of much smaller short tandem repeat (STR) sequences using the polymerase chain reaction soon replaced RFLP analysis and advanced to become the gold standard in genetic identification. Meanwhile, STR multiplexes have been developed and made commercially available which simultaneously amplify up to 30 STR loci from as little as 15 cells or fewer. The enormous information content that comes with the large variety of observed STR genotypes allows for genetic individualisation (with the exception of identical twins). Carefully selected core STR loci form the basis of intelligence-led DNA databases that provide investigative leads by linking unsolved crime scenes and criminals through their matched STR profiles. Nevertheless, the success of modern DNA fingerprinting depends on the availability of reference material from suspects. In order to provide new investigative leads in cases where such reference samples are absent, forensic scientists started to explore the prediction of phenotypic traits from the DNA of the evidentiary sample. This paradigm change now uses DNA and epigenetic markers to forecast characteristics that are useful to triage further investigative work. So far, the best investigated externally visible characteristics are eye, hair and skin colour, as well as geographic ancestry and age. Information on the chronological age of a stain donor (or any sample donor) is elemental for forensic investigations in a number of aspects and has, therefore, been explored by researchers in some detail. 
Among different methodological approaches tested to date, the methylation-sensitive analysis of carefully selected DNA markers (CpG sites) has brought the most promising results by providing prediction accuracies of ±3-4 years, which can be comparable to, or even surpass those from, eyewitness reports. This mini-review puts recent developments in age estimation via (epi)genetic methods in the context of the requirements and goals of forensic genetics and highlights paths to follow in the future of forensic genomics.

54 citations


Journal ArticleDOI
TL;DR: This genetic map would provide a basis for genome assembly and comparative genomics studies, and those QTL-derived candidate genes and genetic markers are useful genomic resources for marker-assisted selection (MAS) of growth-related traits in the Yangtze River common carp.
Abstract: A high-density genetic linkage map is essential for QTL fine mapping, comparative genome analysis, identification of candidate genes and marker-assisted selection for economic traits in aquaculture species. The Yangtze River common carp (Cyprinus carpio haematopterus) is one of the most important aquacultured strains in China. However, quite limited genetics and genomics resources have been developed for genetic improvement of economic traits in such strain. A high-resolution genetic linkage map was constructed by using 7820 2b-RAD (2b-restriction site-associated DNA) and 295 microsatellite markers in a F2 family of the Yangtze River common carp (C. c. haematopterus). The length of the map was 4586.56 cM with an average marker interval of 0.57 cM. Comparative genome mapping revealed that a high proportion (70%) of markers with disagreed chromosome location was observed between C. c. haematopterus and another common carp strain (subspecies) C. c. carpio. A clear 2:1 relationship was observed between C. c. haematopterus linkage groups (LGs) and zebrafish (Danio rerio) chromosomes. Based on the genetic map, 21 QTLs for growth-related traits were detected on 12 LGs, and contributed values of phenotypic variance explained (PVE) ranging from 16.3 to 38.6%, with LOD scores ranging from 4.02 to 11.13. A genome-wide significant QTL (LOD = 10.83) and three chromosome-wide significant QTLs (mean LOD = 4.84) for sex were mapped on LG50 and LG24, respectively. A 1.4 cM confidence interval of QTL for all growth-related traits showed conserved synteny with a 2.06 M segment on chromosome 14 of D. rerio. Five potential candidate genes were identified by blast search in this genomic region, including a well-studied multi-functional growth related gene, Apelin. We mapped a set of suggestive and significant QTLs for growth-related traits and sex based on a high-density genetic linkage map using SNP and microsatellite markers for Yangtze River common carp. 
Several candidate growth genes were also identified from the QTL regions by comparative mapping. This genetic map would provide a basis for genome assembly and comparative genomics studies, and those QTL-derived candidate genes and genetic markers are useful genomic resources for marker-assisted selection (MAS) of growth-related traits in the Yangtze River common carp.

48 citations
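
Note: the average marker interval quoted in the abstract above can be reproduced from the other figures it gives, assuming that every marker was placed on the map and that the interval is simply map length divided by marker count. The abstract does not state how the value was computed, so both points are assumptions, and the function name is illustrative.

```python
def mean_marker_interval(map_length_cm, n_markers):
    """Average spacing in cM, taken here as map length / marker count."""
    return map_length_cm / n_markers


# Figures quoted in the abstract above.
n_markers = 7820 + 295            # 2b-RAD markers plus microsatellites
interval = mean_marker_interval(4586.56, n_markers)
print(n_markers, round(interval, 2))
```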


Journal ArticleDOI
TL;DR: The identified genome-wide SSRs and newly developed SSR markers will provide a powerful means for genetic research in tea plant, including genetic diversity and evolutionary origin analysis, fingerprinting, QTL mapping, and marker-assisted selection for breeding.
Abstract: The tea plant (Camellia sinensis (L.) O. Kuntze) is one of the most popular non-alcoholic beverage crops worldwide. The availability of complete genome sequences for the Camellia sinensis var. ‘Shuchazao’ has provided the opportunity to identify all types of simple sequence repeat (SSR) markers by genome-wide scan. In this study, a total of 667,980 SSRs were identified in the ~3.08 Gb genome, with an overall density of 216.88 SSRs/Mb. Dinucleotide repeats were predominant among microsatellites (72.25%), followed by trinucleotide repeats (15.35%), while the remaining SSRs accounted for less than 13%. The motifs AG/CT (49.96%) and AT/TA (40.14%) were the most and the second most abundant among all identified SSR motifs, respectively; meanwhile, AAT/ATT (41.29%) and AAAT/ATTT (67.47%) were the most common among trinucleotides and tetranucleotides, respectively. A total of 300 primer pairs were designed to screen six tea cultivars for polymorphisms of SSR markers using the five selected repeat types of microsatellite sequences. The resulting 96 SSR markers that yielded polymorphic and unambiguous bands were further deployed on 47 tea cultivars for genetic diversity assessment, demonstrating high polymorphism of these SSR markers. Remarkably, the dendrogram revealed that the phylogenetic relationships among these tea cultivars are highly consistent with their genetic backgrounds or places of origin. The identified genome-wide SSRs and newly developed SSR markers will provide a powerful means for genetic research in tea plant, including genetic diversity and evolutionary origin analysis, fingerprinting, QTL mapping, and marker-assisted selection for breeding.

47 citations
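
Note: a genome-wide SSR scan of the kind described in the abstract above can be sketched with regular expressions. This is not the tool or the thresholds used in the study, which the abstract does not name; the function names, the minimum repeat counts and the test sequence are all hypothetical. The last lines reproduce the reported density, assuming 1 Gb = 1,000 Mb.

```python
import re


def is_primitive(motif):
    """True if motif is not itself a repeat of a shorter unit (e.g. 'AGAG')."""
    return (motif + motif).find(motif, 1) == len(motif)


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) for perfect tandem repeats.

    min_repeats maps motif length to the minimum number of repeat units;
    the defaults are illustrative, not the thresholds of the study.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5, 4: 4}
    seq = seq.upper()
    hits = []
    for k, n in min_repeats.items():
        # A k-mer followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


# Hypothetical test sequence with one (AG)8 and one (AAT)5 repeat.
seq = "TTGC" + "AG" * 8 + "CCGTC" + "AAT" * 5 + "GGC"
print(find_ssrs(seq))

# Density check using the figures quoted in the abstract.
density = 667_980 / (3.08 * 1000)   # SSRs per Mb
print(round(density, 2))
```

The primitivity check stops the (AG)8 run from being reported a second time as an (AGAG)4 tetranucleotide repeat.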


Journal ArticleDOI
TL;DR: A probe capture assay targeting forensically relevant nuclear SNP markers for clonal and massively parallel sequencing (MPS) of degraded and limited DNA samples as well as mixtures using next-generation sequencing (NGS) technologies is designed and tested.
Abstract: DNA from biological forensic samples can be highly fragmented and present in limited quantity. When DNA is highly fragmented, conventional PCR based Short Tandem Repeat (STR) analysis may fail as primer binding sites may not be present on a single template molecule. Single Nucleotide Polymorphisms (SNPs) can serve as an alternative type of genetic marker for analysis of degraded samples because the targeted variation is a single base. However, conventional PCR based SNP analysis methods still require intact primer binding sites for target amplification. Recently, probe capture methods for targeted enrichment have shown success in recovering degraded DNA as well as DNA from ancient bone samples using next-generation sequencing (NGS) technologies. The goal of this study was to design and test a probe capture assay targeting forensically relevant nuclear SNP markers for clonal and massively parallel sequencing (MPS) of degraded and limited DNA samples as well as mixtures. A set of 411 polymorphic markers totaling 451 nuclear SNPs (375 SNPs and 36 microhaplotype markers) was selected for the custom probe capture panel. The SNP markers were selected for a broad range of forensic applications including human individual identification, kinship, and lineage analysis as well as for mixture analysis. Performance of the custom SNP probe capture NGS assay was characterized by analyzing read depth and heterozygote allele balance across 15 samples at 25 ng input DNA. Performance thresholds were established based on read depth ≥500X and heterozygote allele balance within ±10% deviation from 50:50, which was observed for 426 out of 451 SNPs. These 426 SNPs were analyzed in size selected samples (at ≤75 bp, ≤100 bp, ≤150 bp, ≤200 bp, and ≤250 bp) as well as mock degraded samples fragmented to an average of 150 bp. Samples selected for ≤75 bp exhibited 99-100% reportable SNPs across varied DNA amounts and as low as 0.5 ng. 
Mock degraded samples at 1 ng and 10 ng exhibited >90% reportable SNPs. Finally, two-person male-male mixtures were tested at 10 ng in varying contributor ratios. Overall, 85-100% of alleles unique to the minor contributor were observed at all mixture ratios. Results from these studies using the SNP probe capture NGS system demonstrate proof of concept for application to forensically relevant degraded and mixed DNA samples.

44 citations
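
Note: the two performance thresholds named in the abstract above (read depth of at least 500X and heterozygote allele balance within ±10% of 50:50) can be written as a simple filter. Reading "±10%" as a reference-allele share between 0.40 and 0.60 is an assumption, as are the function name and the read counts.

```python
def passes_thresholds(ref_reads, alt_reads, min_depth=500, max_dev=0.10):
    """Check read depth and heterozygote allele balance for one SNP.

    Balance is taken here as the reference-allele share of reads, which
    should fall within 0.50 +/- max_dev for a heterozygote.
    """
    depth = ref_reads + alt_reads
    if depth < min_depth:
        return False
    balance = ref_reads / depth
    return abs(balance - 0.5) <= max_dev


# Hypothetical read counts (reference, alternate) for three heterozygous SNPs.
examples = {"snp_a": (520, 480), "snp_b": (700, 300), "snp_c": (200, 190)}
results = {name: passes_thresholds(*reads) for name, reads in examples.items()}
print(results)
```

Here snp_b fails on allele balance and snp_c fails on depth, the two failure modes the thresholds are meant to catch.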


Journal ArticleDOI
TL;DR: The genome data obtained in this research provided a large amount of gene resources for further investigating Pennisetum species and indicated a highly heterozygous genome.
Abstract: Elephant grass (Pennisetum purpureum) is a perennial grass in the Poaceae family with high tolerance and one of the best forage plants. Despite its economic importance, the inheritance information of P. purpureum has remained largely unknown. To obtain the whole reference genome, we first conducted a genome survey of P. purpureum. Next-generation sequencing (NGS) was used to perform the de novo whole genome sequencing. As a result, the estimated genome size of elephant grass was 2.01 Gb, with 71.36% repetitive elements. The heterozygosity was 1.02%, which indicates a highly heterozygous genome. The retroelements (9.36%) were the most repetitive elements, followed by DNA transposons (3.66%). In the meantime, 83,706 high-quality genomic simple sequence repeat (SSR) markers, in which the greatest SSR unit length was 3, were developed. Thirty pairs of SSR markers were randomly selected to verify the efficiency and all of them yielded clear amplification products, among which 28 pairs (93.3%) of the primers showed polymorphism. The genome data obtained in this research provided a large amount of gene resources for further investigating Pennisetum species.

43 citations


Journal ArticleDOI
TL;DR: The information obtained by next‐generation sequencing offers a better resolution than the traditional way of SSR genotyping and allows for more accurate evolutionary interpretations.
Abstract: Microsatellites (or simple sequence repeats, SSR) are widely used markers in population genetics. Traditionally, genotyping was and still is carried out through recording fragment length. Now, next-generation sequencing (NGS) makes it easy to also obtain sequence information for the loci of interest. This avoids misinterpretations that otherwise could arise due to size homoplasy. Here, an NGS strategy is described that allows hundreds of individuals to be genotyped at many custom-designed SSR loci simultaneously, combining multiplex PCR, barcoding, and Illumina sequencing. We created three different datasets for which alleles were coded according to (a) length of the repetitive region, (b) total fragment length, and (c) sequence identity, in order to evaluate the eventual benefits from having sequence data at hand, not only fragment length data. For each dataset, genetic diversity statistics, as well as F_ST and R_ST values, were calculated. The number of alleles per locus, as well as observed and expected heterozygosity, was highest in the sequence identity dataset, because of single-nucleotide polymorphisms and insertions/deletions in the flanking regions of the SSR motif. Size homoplasy was found to be very common, amounting to 44.7%-63.5% (mean over all loci) in the three study species. Thus, the information obtained by next-generation sequencing offers a better resolution than the traditional way of SSR genotyping and allows for more accurate evolutionary interpretations.
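
Note: the effect of size homoplasy described in the abstract above can be shown on a toy locus by coding the same alleles once by fragment length and once by sequence identity. The sequences are hypothetical, and the heterozygosity estimator is the simple 1 - sum(p_i^2) form without small-sample correction, which may differ from the one the authors used.

```python
from collections import Counter


def expected_heterozygosity(alleles):
    """He = 1 - sum(p_i^2) over the allele frequencies in the sample."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


# Hypothetical amplicon sequences at one locus (flank + repeat + flank).
reads = [
    "GAT" + "CA" * 6 + "TTC",
    "GAT" + "CA" * 6 + "TTC",
    "GAC" + "CA" * 6 + "TTC",   # same length, SNP in the flank
    "GAT" + "CA" * 7 + "TTC",
]

by_length = [len(s) for s in reads]   # traditional fragment-length coding
by_sequence = reads                   # sequence-identity coding

he_length = expected_heterozygosity(by_length)
he_sequence = expected_heterozygosity(by_sequence)
print(len(set(by_length)), he_length)
print(len(set(by_sequence)), he_sequence)
```

The flanking SNP is invisible to length-based coding, so sequence coding recovers one more allele and a higher expected heterozygosity from the same reads.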

Journal ArticleDOI
09 Apr 2018
TL;DR: The performance of the MiSeq FGx™ Forensic Genomics System was high, however, locus and allele drop-outs were relatively frequent at six loci due to low read depth or skewed heterozygote balances, and the stutter ratios were larger than those observed with conventional STR genotyping methods.
Abstract: The MiSeq FGx™ Forensic Genomics System types 231 genetic markers in one multiplex polymerase chain reaction (PCR) assay. The markers include core forensic short tandem repeats (STRs) as well as identity, ancestry and phenotype informative single nucleotide polymorphisms (SNPs). In this work, the MiSeq FGx™ Forensic Genomics System was evaluated by analysing reproducibility, sensitivity, mixture identification and forensic phenotyping capabilities of the assay. Furthermore, the genotype calling of the ForenSeq™ Universal Analysis Software was verified by analysing fastq.gz files from the MiSeq FGx™ platform using the software tools STRinNGS and GATK. Overall, the performance of the MiSeq FGx™ Forensic Genomics System was high. However, locus and allele drop-outs were relatively frequent at six loci (two STRs and four human identification SNPs) due to low read depth or skewed heterozygote balances, and the stutter ratios were larger than those observed with conventional STR genotyping methods. The risk of locus and allele drop-outs increased dramatically when the amount of DNA in the first PCR was lower than 250 pg. Two-person 50:1 mixtures were identified as mixtures, whereas 100:1 and 1 000:1 mixtures were not. Y-chromosomal short tandem repeat (Y-STR) alleles were detected in the 100:1 and 1 000:1 female/male mixtures. The ForenSeq™ Universal Analysis Software provided the data analyst with useful alerts that simplified the analysis of the large number of markers. Many of the alerts were due to user-defined, locus-specific criteria. The results shown here indicated that the default settings should be altered for some loci. Also, recommended changes to the assay and software are discussed.

Journal ArticleDOI
TL;DR: A nuclease-based approach that uses overlapping oligonucleotides to eliminate unaltered micro-satellites at the genomic DNA level, prior to PCR is described, improving detection sensitivity by 500–1000-fold relative to current HRM approaches.
Abstract: Detection of microsatellite-instability in colonoscopy-obtained polyps, as well as in plasma-circulating DNA, is frequently confounded by sensitivity issues due to co-existing excessive amounts of wild-type DNA. While also an issue for point mutations, this is particularly problematic for microsatellite changes, due to the high false-positive artifacts generated by polymerase slippage (stutter-bands). Here, we describe a nuclease-based approach, NaME-PrO, that uses overlapping oligonucleotides to eliminate unaltered micro-satellites at the genomic DNA level, prior to PCR. By appropriate design of the overlapping oligonucleotides, NaME-PrO eliminates WT alleles in long single-base homopolymers ranging from 10 to 27 nucleotides in length, while sparing targets containing variable-length indels at any position within the homopolymer. We evaluated 5 MSI targets individually or simultaneously, NR27, NR21, NR24, BAT25 and BAT26 using DNA from cell-lines, biopsies and circulating-DNA from colorectal cancer patients. NaME-PrO enriched altered microsatellites and detected alterations down to 0.01% allelic-frequency using high-resolution-melting, improving detection sensitivity by 500-1000-fold relative to current HRM approaches. Capillary-electrophoresis also demonstrated enhanced sensitivity and enrichment of indels 1-16 bases long. We anticipate application of this highly-multiplex-able method either with standard 5-plex reactions in conjunction with HRM/capillary electrophoresis or massively-parallel-sequencing-based detection of MSI on numerous targets for sensitive MSI-detection.

Journal ArticleDOI
TL;DR: The mechanisms and significance of inflammatory-associated microsatellite alterations are described, and three areas to deeply explore the consequences and prevention of inflammation’s effect upon the DNA MMR system are proposed.
Abstract: Microsatellite alterations within genomic DNA frameshift as a result of defective DNA mismatch repair (MMR). About 15% of sporadic colorectal cancers (CRCs) manifest hypermethylation of the DNA MMR gene MLH1, resulting in mono- and di-nucleotide frameshifts to classify it as microsatellite instability-high (MSI-H) and hypermutated, and due to frameshifts at coding microsatellites generating neo-antigens, produce a robust protective immune response that can be enhanced with immune checkpoint blockade. More commonly, approximately 50% of sporadic non-MSI-H CRCs demonstrate frameshifts at di- and tetra-nucleotide microsatellites to classify it as MSI-low/elevated microsatellite alterations at selected tetranucleotide repeats (EMAST) as a result of functional somatic inactivation of the DNA MMR protein MSH3 via a nuclear-to-cytosolic displacement. The trigger for MSH3 displacement appears to be inflammation and/or oxidative stress, and unlike MSI-H CRC patients, patients with MSI-L/EMAST CRCs show poor prognosis. These inflammatory-associated microsatellite alterations are a consequence of the local tumor microenvironment, and in theory, if the microenvironment is manipulated to lower inflammation, the microsatellite alterations and MSH3 dysfunction should be corrected. Here we describe the mechanisms and significance of inflammatory-associated microsatellite alterations, and propose three areas to deeply explore the consequences and prevention of inflammation's effect upon the DNA MMR system.

Journal ArticleDOI
TL;DR: The highest density genetic map ever reported was constructed since the largest mapping population of tea plants was adopted in present study, and novel QTLs related to flavonoids and CAF were identified based on the new high-density genetic map.
Abstract: Flavonoids are important components that confer upon tea plants a unique flavour and health functions. However, the traditional breeding method for selecting a cultivar with a high or unique flavonoid content is time consuming and labour intensive. High-density genetic map construction associated with quantitative trait locus (QTL) mapping provides an effective way to facilitate trait improvement in plant breeding. In this study, an F1 population (LJ43×BHZ) was genotyped using 2b-restriction site-associated DNA (2b-RAD) sequencing to obtain massive single nucleotide polymorphism (SNP) markers to construct a high-density genetic map for a tea plant. Furthermore, QTLs related to flavonoids were identified using our new genetic map. A total of 13,446 polymorphic SNP markers were developed using 2b-RAD sequencing, and 4,463 of these markers were available for constructing the genetic linkage map. A 1,678.52-cM high-density map at an average interval of 0.40 cM with 4,217 markers, including 427 frameset simple sequence repeats (SSRs) and 3,800 novel SNPs, mapped into 15 linkage groups was successfully constructed. After QTL analysis, a total of 27 QTLs related to flavonoids or caffeine content (CAF) were mapped to 8 different linkage groups, LG01, LG03, LG06, LG08, LG10, LG11, LG12, and LG13, with an LOD from 3.14 to 39.54, constituting 7.5% to 42.8% of the phenotypic variation. To our knowledge, the highest density genetic map ever reported was constructed since the largest mapping population of tea plants was adopted in present study. Moreover, novel QTLs related to flavonoids and CAF were identified based on the new high-density genetic map. In addition, two markers were located in candidate genes that may be involved in flavonoid metabolism. The present study provides valuable information for gene discovery, marker-assisted selection breeding and map-based cloning for functional genes that are related to flavonoid content in tea plants.

Journal ArticleDOI
TL;DR: A large number of SNPs will supply ample choices of DNA markers in analysing the genetic diversity, population structure and evolution of oil palm, and will contribute to mapping quantitative trait loci (QTL) for important traits, thus accelerating oil palm genetic improvement.
Abstract: Oil palm (Elaeis guineensis Jacq.) is the leading oil-producing crop and the most important edible oil resource worldwide. DNA markers and genetic linkage maps are essential resources for marker-assisted selection to accelerate genetic improvement. We conducted RAD-seq on an Illumina NextSeq500 to discover genome-wide SNPs, and used the SNPs to construct a linkage map for an oil palm (Tenera) population derived from a cross between a Deli Dura and an AVROS Pisifera. The RAD-seq produced 1,076 million single-end reads across the breeding population containing 155 trees. Mining this dataset detected 510,251 loci. After filtering out loci with low accuracy and more than 20% missing data, 11,394 SNPs were retained. Using these SNPs, in combination with 188 anchor SNPs and 123 microsatellites, we constructed a linkage map containing 10,023 markers covering 16 chromosomes. The map length is 2,938.2 cM with an average marker space of 0.29 cM. The large number of SNPs will supply ample choices of DNA markers in analysing the genetic diversity, population structure and evolution of oil palm. This high-density linkage map will contribute to mapping quantitative trait loci (QTL) for important traits, thus accelerating oil palm genetic improvement.

Journal ArticleDOI
18 Dec 2018-Genes
TL;DR: The shotgun data reveal that nuclear DNA is present in shed hair and surprisingly abundant relative to mitochondrial DNA, even in the most distal fragments.
Abstract: While shed hairs are one of the most commonly encountered evidence types, they are among the most limited in terms of DNA quantity and quality. As a result, nuclear DNA short tandem repeat (STR) profiling is generally unsuccessful and DNA testing of shed hair is instead performed by targeting the mitochondrial DNA control region. Although the high copy number of mitochondrial DNA relative to nuclear DNA routinely permits the recovery of mitochondrial DNA (mtDNA) data in these cases, mtDNA profiles do not offer the discriminatory power of nuclear DNA profiles. In order to better understand the total content and degradation state of DNA in single shed hairs and assess the feasibility of recovering highly discriminatory nuclear DNA data from this common evidence type, high throughput shotgun sequencing was performed on both recently collected and aged (approximately 50-year-old) hair samples. The data reflect trends that have been demonstrated previously with other technologies, namely that mtDNA quantity and quality decrease along the length of the hair shaft. In addition, the shotgun data reveal that nuclear DNA is present in shed hair and surprisingly abundant relative to mitochondrial DNA, even in the most distal fragments. Nuclear DNA comprised, at minimum, 88% of the total human reads in any given sample, and generally more than 95%. Here, we characterize both the nuclear and mitochondrial DNA content of shed hairs and discuss the implications of these data for forensic investigations.

Journal ArticleDOI
TL;DR: The results of the present study confirm that the analysed set of 11 microsatellite markers recommended by ISAG is suitable for paternity testing in Yugoslav Pied cattle in Serbia.
Abstract: Eleven microsatellite loci (TGLA227, BM2113, TGLA53, ETH10, SPS115, TGLA126, TGLA122, INRA023, ETH3, ETH225, BM1824) were evaluated for their use in paternity testing in the Yugoslav Pied cattle (YU Simmental cattle) population in Serbia. A total of 40 animals were tested. At the 11 tested loci, a total of 91 alleles were detected. The mean number of alleles per locus was 8.273. Polymorphism information content (PIC) values ranged from 0.58 to 0.88 with the mean value of 0.72. The most informative loci were: TGLA53 (14 alleles, PIC = 0.88), TGLA227 (11 alleles, PIC = 0.82), INRA023 (11 alleles, PIC = 0.86), BM2113 (9 alleles, PIC = 0.80). Combined power of discrimination (CPD) for the 11 microsatellite loci was 0.999. The results of the present study confirm that the analysed set of 11 microsatellite markers recommended by ISAG is suitable for paternity testing in Yugoslav Pied cattle in Serbia.
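For readers unfamiliar with the PIC statistic reported in the cattle abstract, the sketch below computes it from allele frequencies with the commonly used formula PIC = 1 − Σ pᵢ² − Σ_{i<j} 2 pᵢ² pⱼ². The abstract does not say which formula or software the authors used, so this is an assumption. The allele frequencies in the example are hypothetical and are not data from the study.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism information content for one locus.

    Uses PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2.
    `allele_freqs` must sum to 1 (within a small tolerance).
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in allele_freqs)
    cross_term = sum(2 * (p * p) * (q * q)
                     for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - cross_term


# Hypothetical frequencies for illustration only (not from the study).
print(round(pic([0.5, 0.5]), 3))            # two equally common alleles
print(round(pic([0.4, 0.3, 0.2, 0.1]), 3))  # four alleles, uneven frequencies

# Mean number of alleles per locus, from the abstract: 91 alleles over 11 loci.
print(round(91 / 11, 3))
```

The last line reproduces the abstract's mean of 8.273 alleles per locus. A locus with many alleles at similar frequencies gives a PIC close to 1, which is consistent with TGLA53 (14 alleles) having the highest PIC in the abstract.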

Journal ArticleDOI
TL;DR: It is found that DNA sequences identified from scat samples through the DArTseq™ process can provide genetic identification of koala diet species, bacterial and viral pathogens, and parasitic organisms, and a subset of 209 conserved loci can accurately identify individual koalas, even from older scat samples.
Abstract: Maintaining genetic diversity is a crucial component in conserving threatened species. For the iconic Australian koala, there is little genetic information on wild populations that is not either skewed by biased sampling methods (e.g., sampling effort skewed toward urban areas) or of limited usefulness due to low numbers of microsatellites used. The ability to genotype DNA extracted from koala scats using next-generation sequencing technology will not only help resolve location sample bias but also improve the accuracy and scope of genetic analyses (e.g., neutral vs. adaptive genetic diversity, inbreeding, and effective population size). Here, we present the successful SNP genotyping (1272 SNP loci) of koala DNA extracted from scat, using a proprietary DArTseq™ protocol. We compare genotype results from two-day-old scat DNA and 14-day-old scat DNA to a blood DNA template, to test accuracy of scat genotyping. We find that DNA from fresher scat results in fewer loci with missing information than DNA from older scat; however, 14-day-old scat can still provide useful genetic information, depending on the research question. We also find that a subset of 209 conserved loci can accurately identify individual koalas, even from older scat samples. In addition, we find that DNA sequences identified from scat samples through the DArTseq™ process can provide genetic identification of koala diet species, bacterial and viral pathogens, and parasitic organisms.

Journal ArticleDOI
TL;DR: This study performs genomic and floral transcriptomic sequencing of modern rose and describes a reproducible organ-specific strategy for molecular marker development and selection in plants, which can be applied to other crops.
Abstract: Rosa hybrida is a valuable ornamental, food and medicinal crop worldwide, but with relatively limited molecular marker resources, especially for flower-specific markers. In this study, we performed genomic and floral transcriptomic sequencing of modern rose. We obtained comprehensive nucleotide information, from which numerous potential simple sequence repeat (SSR) markers were identified but were found to have high rates of amplification failure and PCR product redundancy. We applied a filtering strategy for BLAST analysis with the assembled genomic sequence and identified 124,591 genomic and 2,292 EST markers with unique annealing sites. These markers had much greater reliability than those obtained before filtering. Additional BLAST analysis against the transcriptomic sequences uncovered 5225 genomic SSRs associated with 4100 transcripts, 2138 of which were associated with functional genes that were annotated against the non-redundant database. More than 90% of these newly developed molecular markers were polymorphic, based on PCR using a subset of SSRs to analyze tetraploid modern rose accessions, diploid Rosa species and one strawberry accession. The relationships among Rosa species determined by cluster analysis (based on these results) were in agreement with modern rose breeding history, whereas strawberry was isolated in a separate cluster, as expected. Our results provide valuable molecular-genetic tools for rose flower trait improvement, breeding and taxonomy. Importantly, we describe a reproducible organ-specific strategy for molecular marker development and selection in plants, which can be applied to other crops.

Journal ArticleDOI
TL;DR: The integration of genotypic and phenotypic data was useful for implementing guidelines for precision hybrid breeding schemes in fennel and the degree of homozygosity for the individual inbred lines was calculated.
Abstract: The development of F1 hybrid varieties benefits from the synergistic effect of conventional and molecular marker-assisted breeding schemes. A sequencing run was carried out in Foeniculum vulgare (2n = 2x = 22) to develop the first genome draft and to identify microsatellites suitable for implementing multilocus SSR marker assays. A preliminary cytometric analysis allowed us to estimate the genome size (2C = 264–286 pg), equal to about 134 Mbp for 1C genome, and to calculate the sequencing coverage (53×). The genome draft assembly into 300,408 scaffolds and its bioinformatic analysis enabled the annotation of coding and non-coding regions across the genome, including 103,306 SSR elements. A total of 100 microsatellites were randomly chosen among those with dinucleotide and trinucleotide repeat motifs and with a repeat motif length ≥ 25 times and were preliminarily tested. Of these, 27 SSR markers, classified as suitable for genetic diversity analyses, were efficiently organized in five PCR multiplex assays and validated using a core collection of 100 fennel individuals potentially useful for the development of inbred lines and F1 hybrids. All SSR loci were found to be polymorphic, scoring an observed number of marker alleles Na = 207 and an average polymorphism information content PIC = 0.69. The SSR data were used to calculate (i) the degree of homozygosity for the individual inbred lines (0.35 < Ho < 0.96), to eventually plan additional selfing or sibling cycles, and (ii) the degree of genetic similarity for all possible pair-wise comparisons between parental inbred lines (GS = 0.55–0.77), to identify the most divergent combinations for the constitution of experimental F1 hybrids. The integration of genotypic and phenotypic data was useful for implementing guidelines for precision hybrid breeding schemes in fennel.

Journal ArticleDOI
TL;DR: The sequence variation within the autosomal STR marker SE33 was evaluated using a customized bioinformatic approach to identify and characterize the locus in the 1036 data set and resulted in 100% concordance.
Abstract: A set of 1036 U.S. Population Samples were sequenced using the Illumina ForenSeq DNA Signature Prep Kit. This sample set has been highly characterized using a variety of marker systems for human identification. The FASTQ files obtained from a ForenSeq DNA Signature Prep Kit experiment include several STR loci that are not reported in the associated software. These include SE33, DXS8377, DXS10148, DYS456, and DYS461. The sequence variation within the autosomal STR marker SE33 was evaluated using a customized bioinformatic approach to identify and characterize the locus in the 1036 data set. The analysis identified 53 unique alleles by length and 264 by sequence. An additional 10 alleles were detected when selected extended flanking regions were examined to resolve discordances. Allele frequencies and SE33 sequence motif patterns are reported for the 1036 data set. The comparison of numerical allele calls derived from sequence data to the allele calls obtained from commercial capillary electrophoresis-based STR typing kits resulted in 100% concordance, after manual data review and confirmation sequencing of three flanking region deletions. The analysis of this data set involved significant manual sequence curation and information support from length-based genotypes to ensure high confidence in the sequence-based allele calls. The challenges of interpreting the sequence data for SE33 consisted of high sequence noise, allele-size dependent variance in coverage, and heterozygote imbalance. As allele length increased, sequence depth of coverage and quality decreased at the terminal end. Accordingly, heterozygous genotype imbalance increased in proportion to increased distance between alleles.

Journal ArticleDOI
TL;DR: There was high within population genetic variations in all the studied sheep populations, poor level of population differentiations and high levels of inbreeding, and low estimates of heterozygosities and mean number of alleles were observed in some of the studies.
Abstract: Microsatellites have been widely accepted and employed as useful molecular markers for measuring genetic diversity and divergence within and among populations. The various parameters developed so f...

Journal ArticleDOI
TL;DR: A dendrogram based on genetic analysis suggests a high level of similarity between some of the accessions presumed to be distant and, at the same time, genetic variability between accessions of the same or similar name.
Abstract: Genetic variability among 41 accessions of red pepper (Capsicum annuum L.) was assessed using eight microsatellite markers. Three of the microsatellite markers (Hpms 1-1, Hpms 1-168, and Hpms 1-274) had uniform spectra in all the analyzed plants. Two to eight alleles were detected for the remaining loci. In total, 28 alleles were detected, i.e. 3.5 alleles per microsatellite locus on average. The highest number of different alleles was detected with Hpms 1-5 (8 alleles) and Hpms 2-21 primers (7 alleles). Molecular data were complemented with morphological measurements according to the descriptor list for the genus Capsicum. A dendrogram based on our genetic analysis suggests a high level of similarity between some of the accessions presumed to be distant and, at the same time, genetic variability between accessions of the same or similar name. These results show the possibility of duplicities in the cur -

Journal ArticleDOI
02 Jan 2018-Fly
TL;DR: Reducing the number of microsatellites to the minimum necessary to correctly detect the population structure of two Drosophila nigrosparsa populations is presented, demonstrating that more than 95% of the individuals can still be correctly assigned when using eight loci and that the major population structure is still visible when using two highly polymorphic loci.
Abstract: Small, isolated populations are constantly threatened by loss of genetic diversity due to drift. Such situations are found, for instance, in laboratory culturing. In guarding against diversity loss, monitoring of potential changes in population structure is paramount; this monitoring is most often achieved using microsatellite markers, which can be costly in terms of time and money when many loci are scored in large numbers of individuals. Here, we present a case study reducing the number of microsatellites to the minimum necessary to correctly detect the population structure of two Drosophila nigrosparsa populations. The number of loci was gradually reduced from 11 to 1, using the Allelic Richness (AR) and Private Allelic Richness (PAR) as criteria for locus removal. The effect of each reduction step was evaluated by the number of genetic clusters detectable from the data and by the allocation of individuals to the clusters; in the latter, excluding ambiguous individuals was tested to reduce the rate of incorrect assignments. We demonstrate that more than 95% of the individuals can still be correctly assigned when using eight loci and that the major population structure is still visible when using two highly polymorphic loci. The differences between sorting the loci by AR and PAR were negligible. The method presented here will most efficiently reduce genotyping costs when small sets of loci ("core sets") for long-time use in large-scale population screenings are compiled.

Journal ArticleDOI
Jiantao Zhao1, Yao Xu1, Linjie Xi1, Junwei Yang1, Hongwu Chen1, Jing Zhang1 
TL;DR: In this article, the authors sequenced and compared the chloroplast genome of Acer miaotaiense from five ecological regions in the Qingling and Bashan regions of China.
Abstract: Acer miaotaiense is an endangered species within the Aceraceae family, and has only a few small natural distributions in China's Qingling Mountains and Bashan Mountains. Comparative analyses of the complete chloroplast genome could provide useful knowledge on the diversity and evolution of this species in different environments. In this study, we sequenced and compared the chloroplast genome of Acer miaotaiense from five ecological regions in the Qingling and Bashan regions of China. The size of the chloroplast genome ranged from 156,204 bp to 156,260 bp, including two inverted repeat regions, a small single-copy region, and a large single-copy region. Across the whole chloroplast genome, there were 130 genes in total, and 92 of them were protein-coding genes. We observed four genes with non-synonymous mutations involving post-transcriptional modification (matK), photosynthesis (atpI), and self-replication (rps4 and rpl20). A total of 415 microsatellite loci were identified, and the dominant microsatellite types were composed of dinucleotide and trinucleotide motifs. The dominant repeat units were AT and AG, accounting for 37.92% and 31.16% of the total microsatellite loci, respectively. A phylogenetic analysis showed that samples with the same altitude (Xunyangba, Ningshan county, and Zhangliangmiao, Liuba county) had a strong bootstrap value (88%), while the remaining ones shared a similar longitude. These results provide clues about the importance of longitude/altitude for the genetic diversity of Acer miaotaiense. This information will be useful for the conservation and improved management of this endangered species.
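To illustrate what identifying microsatellite loci with dinucleotide motifs such as AT or AG involves, the sketch below scans a sequence for perfect tandem repeats using a regular expression. The abstract does not state which tool or thresholds the authors used, so the function name, the minimum repeat count of 5, and the toy sequence are all assumptions made for illustration.

```python
import re


def find_ssrs(seq, motif_len=2, min_repeats=5):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    `motif_len` and `min_repeats` are illustrative thresholds, not values
    taken from the study. Homopolymer runs (e.g. AAAAAAAAAA read as "AA"
    repeats) are skipped so that only true di-/trinucleotide motifs remain.
    """
    # One captured motif followed by at least (min_repeats - 1) more copies.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue
        hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits


# Toy sequence (hypothetical): an (AT)5 repeat flanked by other bases.
print(find_ssrs("GGATATATATATCC"))
print(find_ssrs("ccAGAGAGAGAGAGtt", motif_len=2, min_repeats=5))
```

A production workflow would normally rely on a dedicated SSR-mining tool and would also handle imperfect and compound repeats, which this sketch ignores.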

Journal ArticleDOI
TL;DR: The results indicate that modified TENT preservative and FTA® Elute Cards both preserved DNA from relatively fresh tissue for up to six months at room temperature, however, mostly partial profiles were produced from decomposed tissues when stored for up to six months compared to when tissues were processed immediately following collection.
Abstract: Short tandem repeats (STR) are currently the gold standard in human identification for forensic casework purposes, and successful STR typing is dependent on sufficient quantity and quality DNA. In the aftermath of a mass disaster and some forensic cases, human remains are recovered for identification in various stages of decomposition, and ideally these remains are transported to a refrigerated facility in order to halt the decomposition process and preserve the integrity of DNA within the tissue. However, in situations where refrigeration is not available (e.g., after a mass disaster or in rural forensic casework), remains continue to be exposed to environmental insults after collection, causing further DNA damage and degradation. Therefore, successful STR typing is dependent on the time of collection and preservation of the DNA sample. This study aims to test two simple in-field collection and preservation methods for decomposing human tissues that are subsequently stored at room temperature for up to six months either in a tissue preservative solution (modified TENT buffer) or on an FTA® Elute Card. In addition, these collection and preservation methods were tested for their ability to facilitate more direct and faster processing of DNA from preserved tissues or DNA leached into the surrounding TENT preservative solution for STR typing. Pre-PCR methods tested in this study include a quick lysis of FTA® Elute Cards, silica-based purification (QIAquick®), enzyme-based extractions (PDQeX), and simple dilution of liquid preservative. The traditional DNA analysis pipeline, which includes DNA extraction and quantification, will be compared to an alternate direct PCR method, thereby allowing the elimination of these two time-consuming and costly steps. The results indicate that modified TENT preservative and FTA® Elute Cards both preserved DNA from relatively fresh tissue for up to six months at room temperature. 
However, mostly partial profiles were produced from decomposed tissues (day 6 - day 14 in this study) when stored for up to six months compared to when tissues were processed immediately following collection. Overall, the modified TENT preservative produced higher DNA concentrations and more successful STR results than FTA® Elute Cards. In addition, a rapid DNA extraction platform (PDQeX) generated the most successful STR typing results from the decomposed tissues stored in TENT for up to six months at room temperature. The direct PCR method used in this study generated comparable STR results to the traditional DNA analysis approach, warranting further investigation of direct PCR methods for forensic casework type samples.

Journal ArticleDOI
TL;DR: 11 novel SNP-STR markers are developed that provide an alternative method for the analysis of extremely unbalanced two-person DNA mixtures and show that all the allele-specific primers could target minor DNA even when the amount of major DNA was 100-fold higher.
Abstract: Autosomal short tandem repeats (STR) markers analysed by PCR and capillary electrophoresis (CE) represent the gold-standard for forensic DNA analysis. With the improved sensitivity of detection equipment, a larger number of mixed DNA profiles can be obtained from trace amounts of DNA that conventionally used to appear as a single source. More specifically, two-source DNA mixtures, comprising the victim's and the perpetrator's DNAs, are often encountered in forensic casework, where the victim's DNA represents a major component of the mixture. Unfortunately, unbalanced two-person DNA mixtures with a ratio larger than 20:1 (here we have named this kind of mixture extremely unbalanced DNA mixture) provide limited information on the minor component. Although the development of probabilistic software has made interpretation of results from mixed DNA easier, high mixture ratios lead to an uninformative likelihood ratio (LR), considering the minor component. Therefore, a technique that can be performed on the conventional CE platform, while enhancing the ability to detect minor DNA in extremely unbalanced DNA mixtures, may be very useful in forensic casework. Our previous research has reported that SNP-STRs, in conjunction with a PCR technique based on amplification refractory mutation system (ARMS), can be used to resolve extremely unbalanced two-person DNA mixtures. To further explore the capacity of SNP-STR markers to help analyse such DNA mixtures, we developed 11 novel SNP-STR markers. The ARMS-based PCR was then used to design allele-specific primers, where each primer targeted one SNP allele located in the flanking region of the tandem repeats. This method allowed primers to specifically and selectively amplify minor DNA without interference from DNA of the major component because the selected SNP allele was not shared with the major contributor. A survey of the selected 11 SNP-STRs in a southwest Chinese Han population showed high levels of polymorphism. 
Assays on two-person DNA mixtures showed that all the allele-specific primers could target minor DNA even when the amount of major DNA was a 100-fold higher. Therefore, this novel set of SNP-STR markers provides an alternative method for the analysis of extremely unbalanced two-person DNA mixtures.

Journal ArticleDOI
TL;DR: Random Amplicon Sequencing (RAMseq), a novel approach for fast and cost‐effective detection of single nucleotide polymorphisms (SNPs) in nonmodel species by semideep sequencing of random amplicons, is presented.
Abstract: Biodiversity has suffered a dramatic global decline during the past decades, and monitoring tools are urgently needed providing data for the development and evaluation of conservation efforts both on a species and on a genetic level. However, in wild species, the assessment of genetic diversity is often hampered by the lack of suitable genetic markers. In this article, we present Random Amplicon Sequencing (RAMseq), a novel approach for fast and cost-effective detection of single nucleotide polymorphisms (SNPs) in nonmodel species by semideep sequencing of random amplicons. By applying RAMseq to the Eurasian otter (Lutra lutra), we identified 238 putative SNPs after quality filtering of all candidate loci and were able to validate 32 of 77 loci tested. In a second step, we evaluated the genotyping performance of these SNP loci in noninvasive samples, one of the most challenging genotyping applications, by comparing it with genotyping results of the same faecal samples at microsatellite markers. We compared (i) polymerase chain reaction (PCR) success rate, (ii) genotyping errors and (iii) Mendelian inheritance (population parameters). SNPs produced a significantly higher PCR success rate (75.5% vs. 65.1%) and lower mean allelic error rate (8.8% vs. 13.3%) than microsatellites, but showed a higher allelic dropout rate (29.7% vs. 19.8%). Genotyping results showed no deviations from Mendelian inheritance in any of the SNP loci. Hence, RAMseq appears to be a valuable tool for the detection of genetic markers in nonmodel species, which is a common challenge in conservation genetic studies.

Journal ArticleDOI
TL;DR: A novel method for microsatellite genotyping using Illumina combinatorial barcoding that dispenses with exhaustive PCR calibrations is presented, since non-specific amplicons can be eliminated by bioinformatics analyses.
Abstract: Genetic diversity and population studies are essential for conservation and wildlife management programs. However, monitoring requires the analysis of multiple loci from many samples. These processes can be laborious and expensive. The choice of microsatellites and PCR calibration for genotyping are particularly daunting. Here we optimized a low-cost genotyping method using multiple microsatellite loci for simultaneous genotyping of up to 384 samples using next-generation sequencing (NGS). We designed primers with adapters to the combinatorial barcoding amplicon library and sequenced samples by MiSeq. Next, we adapted a bioinformatics pipeline for genotyping microsatellites based on read-length and sequence content. Using primer pairs for eight microsatellite loci from the fish Prochilodus costatus, we amplified, sequenced, and analyzed the DNA of 96, 288, or 384 individuals for allele detection. The most cost-effective methodology was a pseudo-multiplex reaction using a low-throughput kit of 1 M reads (Nano) for 384 DNA samples. We observed an average of 325 reads per individual per locus when genotyping eight loci. Assuming a minimum requirement of 10 reads per locus, two to four times more loci could be tested in each run, depending on the quality of the PCR reaction of each locus. In conclusion, we present a novel method for microsatellite genotyping using Illumina combinatorial barcoding that dispenses with exhaustive PCR calibrations, since non-specific amplicons can be eliminated by bioinformatics analyses. This methodology rapidly provides genotyping data and is therefore a promising development for large-scale conservation-genetics studies.
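The read-depth figures in the genotyping abstract can be reproduced with simple arithmetic. The sketch below assumes an idealized run in which exactly 1,000,000 usable reads are spread evenly over all samples and loci. Real runs will deviate from this, which is presumably why the abstract ties its estimate to PCR quality. The function name is illustrative.

```python
def reads_per_sample_locus(total_reads: int, n_samples: int, n_loci: int) -> float:
    """Expected reads per individual per locus under perfectly even coverage.

    Even coverage is an idealizing assumption; amplification efficiency
    differs between loci and samples in practice.
    """
    return total_reads / (n_samples * n_loci)


TOTAL_READS = 1_000_000   # "1 M reads (Nano)" kit, taken at face value
N_SAMPLES = 384

# Eight loci, as in the study; then two and four times as many loci.
for n_loci in (8, 16, 32):
    depth = reads_per_sample_locus(TOTAL_READS, N_SAMPLES, n_loci)
    print(f"{n_loci:2d} loci: {depth:.1f} reads per individual per locus")
```

With eight loci the idealized depth is about 325.5, in line with the observed average of 325. Even at 32 loci it stays well above the 10-read minimum mentioned in the abstract, so the authors' two-to-four-fold estimate leaves a wide margin for uneven amplification.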