Journal ArticleDOI

The interpretation of single source and mixed DNA profiles.

01 Sep 2013 - Forensic Science International: Genetics (Elsevier) - Vol. 7, Iss. 5, pp. 516-528
TL;DR: A method for interpreting autosomal mixed DNA profiles based on continuous modelling of peak heights is described, and MCMC is applied with a model for allelic and stutter heights to produce a probability for the data given a specified genotype combination.
Abstract: A method for interpreting autosomal mixed DNA profiles based on continuous modelling of peak heights is described. MCMC is applied with a model for allelic and stutter heights to produce a probability for the data given a specified genotype combination. The theory extends to handle any number of contributors and replicates, although practical implementation limits analyses to four contributors. The probability of the peak data given a genotype combination has proven to be a highly intuitive probability that may be assessed subjectively by experienced caseworkers. Whilst caseworkers will not assess the probabilities per se, they can broadly judge genotypes that fit the observed data well, and those that fit relatively less well. These probabilities are used when calculating a subsequent likelihood ratio. The method has been trialled on a number of mixed DNA profiles constructed from known contributors. The results have been assessed against a binary approach and also compared with the subjective judgement of an analyst.
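The core quantity described above, the probability of the observed peak data given a specified genotype combination, can be illustrated with a toy continuous model. This is a minimal sketch under assumed simplifications (a log-normal peak-height error; no stutter, drop-out or drop-in; all names and parameter values hypothetical), not the authors' MCMC implementation:

```python
import math

def log_normal_density(observed, expected, sigma):
    """Density of a log-normal peak-height model: log(O/E) ~ Normal(0, sigma^2)."""
    if observed <= 0.0 or expected <= 0.0:
        return 0.0
    z = math.log(observed / expected)
    return math.exp(-z * z / (2.0 * sigma ** 2)) / (observed * sigma * math.sqrt(2.0 * math.pi))

def prob_data_given_genotypes(peaks, genotype_set, mixture_props, template, sigma=0.3):
    """P(peak data | one genotype combination): each allele's expected height is
    the template times the summed proportions of contributors carrying it;
    observed heights are then scored independently under the log-normal error.
    Stutter, drop-out and drop-in are deliberately ignored in this toy model."""
    expected = {}
    for genotype, prop in zip(genotype_set, mixture_props):
        for allele in genotype:
            expected[allele] = expected.get(allele, 0.0) + prop * template
    p = 1.0
    for allele, e in expected.items():
        p *= log_normal_density(peaks.get(allele, 0.0), e, sigma)
    return p
```

Such probabilities, one per candidate genotype combination, are what feed the numerator and denominator sums of the subsequent likelihood ratio: combinations that fit the observed peaks well score high, just as an experienced caseworker would judge subjectively.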
Citations
Journal ArticleDOI
TL;DR: The software implements a model to explain the allelic peak height on a continuous scale in order to carry out weight-of-evidence calculations for profiles which could be from a mixture of contributors, and is the first freely open source, continuous model, to be reported in the literature.
Abstract: We have released software named EuroForMix to analyze STR DNA profiles in a user-friendly graphical user interface. The software implements a model to explain the allelic peak height on a continuous scale in order to carry out weight-of-evidence calculations for profiles which could be from a mixture of contributors. Through a properly parameterized model we are able to do inference on mixture proportions, peak-height properties, stutter proportion and degradation. In addition, EuroForMix includes models for allele drop-out, allele drop-in and sub-population structure. EuroForMix supports two inference approaches for likelihood ratio calculations. The first approach uses maximum likelihood estimation of the unknown parameters. The second approach is Bayesian, which requires prior distributions to be specified for the parameters involved. The user may specify any number of known and unknown contributors in the model; however, we find that there is a practical computing-time limit which restricts the model to a maximum of four unknown contributors. EuroForMix is the first freely available, open-source continuous model (accommodating peak height, stutter, drop-in, drop-out, population substructure and degradation) to be reported in the literature. It therefore serves an important purpose as an unrestricted platform to compare the different solutions that are available. The implementation of the continuous model used in the software showed close to identical results to the R package DNAmixtures, which requires a HUGIN Expert license to be used. An additional feature of EuroForMix is the ability for the user to adapt the Bayesian inference framework by incorporating their own prior information.
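EuroForMix's first inference route, maximum likelihood estimation of unknown parameters such as the mixture proportion, can be illustrated in miniature. The sketch below grid-searches a two-contributor mixture proportion under an assumed log-normal peak-height model; it is not EuroForMix's actual optimiser, and all names and values are illustrative:

```python
import math

def log_lik(theta, peaks, geno_major, geno_minor, template, sigma):
    """Log-likelihood of two-contributor peak heights given mixture proportion
    theta for the major contributor, under a log-normal peak-height model."""
    expected = {}
    for allele in geno_major:
        expected[allele] = expected.get(allele, 0.0) + theta * template
    for allele in geno_minor:
        expected[allele] = expected.get(allele, 0.0) + (1.0 - theta) * template
    ll = 0.0
    for allele, e in expected.items():
        o = peaks.get(allele, 1e-6)  # tiny stand-in height for a dropped allele
        z = math.log(o / e)
        ll += -math.log(o * sigma * math.sqrt(2.0 * math.pi)) - z * z / (2.0 * sigma ** 2)
    return ll

def mle_theta(peaks, geno_major, geno_minor, template, sigma=0.3, steps=99):
    """Grid-search maximum likelihood estimate of the mixture proportion."""
    best_theta, best_ll = None, -math.inf
    for i in range(1, steps + 1):
        theta = i / (steps + 1)
        ll = log_lik(theta, peaks, geno_major, geno_minor, template, sigma)
        if ll > best_ll:
            best_theta, best_ll = theta, ll
    return best_theta
```

A real implementation would optimise all parameters jointly (mixture proportions, peak-height variance, stutter, degradation) with a numerical optimiser rather than a grid.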

171 citations


Cites methods from "The interpretation of single source..."

  • ...Our method differs from STRmix and TrueAllele in that we compute the marginalized likelihood expressions using exact methods without any need for MCMC sampling....


  • ...STRmix and TrueAllele are based on a Bayesian approach through specifying prior distributions on the unknown model parameters....


  • ...Commercial continuous software include: STRmix [14], TrueAllele[11] and DNAmixtures[6]....


  • ...[14], EuroForMix can also accommodate allele drop-in....


Journal ArticleDOI
TL;DR: Challenges and opportunities that will impact the future of forensic DNA are explored including the need for education and training to improve interpretation of complex DNA profiles.
Abstract: The author's thoughts and opinions on where the field of forensic DNA testing is headed for the next decade are provided in the context of where the field has come over the past 30 years. Similar t...

152 citations

Journal ArticleDOI
TL;DR: It is shown that the combination of evidence from several samples may give an evidential strength which is close to that of a single‐source trace and thus modelling of peak height information provides a potentially very efficient mixture analysis.
Abstract: DNA is now routinely used in criminal investigations and court cases, although DNA samples taken at crime scenes are of varying quality and therefore present challenging problems for their interpretation. We present a statistical model for the quantitative peak information obtained from an electropherogram of a forensic DNA sample and illustrate its potential use for the analysis of criminal cases. In contrast with most previously used methods, we directly model the peak height information and incorporate important artefacts that are associated with the production of the electropherogram. Our model has a number of unknown parameters, and we show that these can be estimated by the method of maximum likelihood in the presence of multiple unknown individuals contributing to the sample, and their approximate standard errors calculated; the computations exploit a Bayesian network representation of the model. A case example from a UK trial, as reported in the literature, is used to illustrate the efficacy and use of the model, both in finding likelihood ratios to quantify the strength of evidence, and in the deconvolution of mixtures for finding likely profiles of the individuals contributing to the sample. Our model is readily extended to simultaneous analysis of more than one mixture as illustrated in a case example. We show that the combination of evidence from several samples may give an evidential strength which is close to that of a single-source trace and thus modelling of peak height information provides a potentially very efficient mixture analysis.
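The claim that combining several samples can approach single-source evidential strength rests on replicates being conditionally independent given the contributors' genotypes, so per-replicate probabilities multiply inside the sum over genotypes. A hedged sketch with toy genotype scores (all numbers hypothetical):

```python
def combined_likelihood(replicate_scores, genotype_weights):
    """P(all replicate data | H) = sum over candidate genotypes of the prior
    weight times the product of per-replicate probabilities, treating the
    replicates as independent given the genotype.

    replicate_scores: list of dicts mapping genotype label -> P(replicate | genotype)
    genotype_weights: list of (genotype label, prior weight under the hypothesis)
    """
    total = 0.0
    for genotype, weight in genotype_weights:
        p = weight
        for scores in replicate_scores:
            p *= scores[genotype]
        total += p
    return total
```

With a second replicate, poorly fitting genotypes are penalised twice in the denominator, so the likelihood ratio moves toward the single-source value.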

119 citations


Cites methods from "The interpretation of single source..."

  • ...Tvedebrink et al. (2010) evaluate the weight of evidence for two person mixtures, using a multivariate normal distribution of peak heights....


  • ...Recently, Taylor et al. (2013) used a log-normal model for the ratio between observed and expected peak heights....


Journal ArticleDOI
TL;DR: There is now a detailed understanding of the stochastic effects that lead DNA profiles to exhibit drop-out and drop-in, along with artefacts such as stutter; the development of national DNA databases is also discussed.
Abstract: The introduction of Short Tandem Repeat (STR) DNA was a revolution within a revolution that transformed forensic DNA profiling into a tool that could be used, for the first time, to create national DNA databases. This transformation would not have been possible without the concurrent development of fluorescent automated sequencers, combined with the ability to multiplex several loci together. Use of the polymerase chain reaction (PCR) increased the sensitivity of the method to enable the analysis of a handful of cells. The first multiplexes were simple: 'the quad', introduced by the now-defunct UK Forensic Science Service (FSS) in 1994, was rapidly followed by a more discriminating six-plex (the Second Generation Multiplex) in 1995 that was used to create the world's first national DNA database. The success of the database rapidly outgrew the functionality of the original system: by the year 2000 a new ten-locus multiplex was introduced to reduce the chance of adventitious matches. The technology was adopted world-wide, albeit with different loci. The political requirement to introduce pan-European databases encouraged standardisation; the development of the European Standard Set (ESS) of markers, comprising twelve loci, is the latest iteration. Although development has been impressive, the methods used to interpret evidence have lagged behind. For example, the theory to interpret complex DNA profiles (low-level mixtures) was developed fifteen years ago, but only in the past year or so have the concepts started to be widely adopted. A plethora of different models (some commercial, others non-commercial) have appeared. This has led to a confusing 'debate' about which is 'best' to use. The different models available are described along with their advantages and disadvantages. A section discusses the development of national DNA databases, along with details of an associated controversy over estimating the strength of evidence of matches.
Current methodology is limited to searches of complete profiles - another example where the interpretation of matches has not kept pace with development of theory. STRs have also transformed the area of Disaster Victim Identification (DVI) which frequently requires kinship analysis. However, genotyping efficiency is complicated by complex, degraded DNA profiles. Finally, there is now a detailed understanding of the causes of stochastic effects that cause DNA profiles to exhibit the phenomena of drop-out and drop-in, along with artefacts such as stutters. The phenomena discussed include: heterozygote balance; stutter; degradation; the effect of decreasing quantities of DNA; the dilution effect.
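Two of the stochastic phenomena listed, heterozygote balance and stutter, are routinely summarised as simple peak-height ratios. A minimal sketch (the metric definitions are standard, but the values used below are illustrative):

```python
def heterozygote_balance(h1, h2):
    """Hb: smaller over larger peak height of a heterozygote pair, in (0, 1].
    Values well below 1 flag stochastic imbalance, common at low template."""
    lo, hi = sorted((float(h1), float(h2)))
    if hi <= 0.0:
        raise ValueError("peak heights must be positive")
    return lo / hi

def stutter_ratio(stutter_height, parent_height):
    """SR: height of the stutter artefact relative to its parent allelic peak."""
    if parent_height <= 0.0:
        raise ValueError("parent peak height must be positive")
    return stutter_height / parent_height
```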

117 citations

Journal ArticleDOI
TL;DR: The two approaches to probabilistic genotyping (semi-continuous and fully continuous) are described, issues such as validation and court acceptance are addressed, and areas of future need for probabilistic software are discussed.
Abstract: The interpretation of mixed profiles from DNA evidentiary material is one of the more challenging duties of the forensic scientist. Traditionally, analysts have used a "binary" approach to interpretation where inferred genotypes are either included or excluded from the mixture using a stochastic threshold and other biological parameters such as heterozygote balance, mixture ratio, and stutter ratios. As the sensitivity of STR multiplexes and capillary electrophoresis instrumentation improved over the past 25 years, coupled with the change in the type of evidence being submitted for analysis (from high quality and quantity (often single-source) stains to low quality and quantity (often mixed) "touch" samples), the complexity of DNA profile interpretation has equally increased. This review provides a historical perspective on the movement from binary methods of interpretation to probabilistic methods of interpretation. We describe the two approaches to probabilistic genotyping (semi-continuous and fully continuous) and address issues such as validation and court acceptance. Areas of future needs for probabilistic software are discussed.
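The contrast between binary and probabilistic interpretation can be made concrete: a binary scheme gates peaks on fixed thresholds, while a continuous treatment replaces the gate with a probability, for example a logistic drop-out model of the kind used in semi-continuous software. All threshold values and coefficients below are illustrative, not validated laboratory settings:

```python
import math

ANALYTICAL_THRESHOLD = 50.0    # illustrative RFU value, not a validated setting
STOCHASTIC_THRESHOLD = 150.0   # illustrative RFU value, not a validated setting

def binary_call(peak_height):
    """Binary-style interpretation: ignore peaks below the analytical threshold;
    flag peaks below the stochastic threshold, where drop-out of a sister
    allele cannot be ruled out."""
    if peak_height < ANALYTICAL_THRESHOLD:
        return "not called"
    if peak_height < STOCHASTIC_THRESHOLD:
        return "called, drop-out possible"
    return "called"

def dropout_probability(peak_height, beta0=3.0, beta1=-0.03):
    """Continuous-style alternative: a logistic drop-out probability that
    decreases smoothly with peak height (coefficients are hypothetical)."""
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * peak_height)))
```

The probabilistic version never discards a low peak outright; its reduced reliability is carried through to the likelihood ratio as a weight instead of a hard include/exclude decision.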

114 citations

References
Journal ArticleDOI
TL;DR: It is demonstrated that an apparent mismatch between a crime stain and a suspect's DNA profile does not necessarily result in an exclusion, and that the duplication guideline is robust, by applying a statistical theory that models three key parameters: the incidence of allele drop-out, laboratory contamination, and stutter.

609 citations


"The interpretation of single source..." refers background in this paper

  • ...A model that is partially continuous based on allowing a probability for dropout and drop-in (hereafter the ‘‘drop model’’) [7]....


Journal ArticleDOI
TL;DR: This work proposes an alternative approach that directly addresses the effect of kinship in DNA profile analysis; it is simple, logically coherent, and makes efficient use of the data.

450 citations

Journal ArticleDOI
TL;DR: The sequence analysis and results obtained using various DNA polymerases appear to support the slipped strand displacement model as a potential explanation for how these stutter products are generated.
Abstract: The PCR amplification of tetranucleotide short tandem repeat (STR) loci typically produces a minor product band 4 bp shorter than the corresponding main allele band; this is referred to as the stutter band. Sequence analysis of the main and stutter bands for two sample alleles of the STR locus vWA reveals that the stutter band lacks one repeat unit relative to the main allele. Sequencing results also indicate that the number and location of the different 4 bp repeat units vary between samples containing a typical versus low proportion of stutter product. The results also suggest that the proportion of stutter product relative to the main allele increases as the number of uninterrupted core repeat units increases. The sequence analysis and results obtained using various DNA polymerases appear to support the slipped strand displacement model as a potential explanation for how these stutter products are generated.
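The observation that stutter proportion rises with the number of uninterrupted core repeats is often summarised as a linear model in the longest uninterrupted sequence (LUS). A sketch with hypothetical coefficients (real coefficients would be fitted per locus from empirical data):

```python
def expected_stutter_ratio(lus, intercept=-0.05, slope=0.012):
    """Illustrative linear stutter model: expected stutter ratio grows with the
    longest uninterrupted sequence (LUS). Coefficients are hypothetical;
    the ratio is clamped at zero for very short uninterrupted runs."""
    return max(0.0, intercept + slope * lus)
```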

389 citations


"The interpretation of single source..." refers methods in this paper

  • ...A summary of nomenclature used within this paper:
    a: the allele; a-1 signifies the stutter product for allele a
    A: the mass variable for locus amplification efficiency; {A_l : l = 1, ..., L} is the locus offset at locus l
    c: a constant in modelling the variance in peak height
    D: the mass variable for degradation; {d_n : n = 1, ..., N} is the degradation in template vs. molecular weight for contributor n
    E: the vector of expected peak heights
    E_lanr = T_lanr / (1 + p_la): the contribution of contributor n to the expected height of the allelic peak formed from allele a at locus l in replicate r
    E_l(a-1)nr = p_la * T_lanr / (1 + p_la): the contribution of contributor n to the expected height of the stutter peak formed from allele a (a-1 signifying the stutter product) at locus l in replicate r
    G_C: the evidence of the crime stain across all R replicates
    H_m: hypotheses; H_1 and H_2 are chosen to align with the prosecution and the defence, respectively
    J: the number of contributors, with j representing a specific contributor
    L: the number of loci, with l representing a specific locus
    LR_C: the continuous LR; LR_B: the binary LR
    LUS: the longest uninterrupted sequence within an allele
    M: the mass variables D, A, R and T collectively
    m_la: the molecular weight of allele a at locus l
    N: the number of contributors, with n representing a specific contributor
    O: the vector of observed peak heights; O_lar is the observed peak height for allele a at locus l for replicate r
    P: the probability of the observed data given the mass parameters
    Q: a catch-all allele to cover all possibilities outside a specified set
    R: the number of replicates, with r representing a specific replicate
    R: the mass variable for replicate amplification; {R_r : r = 1, ..., R} is a multiplier applied to replicate r....


  • ...Since we have not yet implemented a LUS based model it is inappropriate to apply this variance and a larger variance is required and applied....


  • ...We currently implement a simplified stutter model that models stutter ratio as linear with respect to the allelic designation rather than longest uninterrupted sequence (LUS) [21,22]....


  • ...We are uncertain whether currently available LUS data derived from largely African American and Caucasian samples translates simply to Maori, Polynesian and Australian Aboriginal samples because specific sequence data is not available....


  • ...Empirical data suggests that the variance in a stutter peak in a model based on LUS follows a different pattern to an allelic peak [17]....


BookDOI
29 Nov 2004
TL;DR: Frequentist and Bayesian approaches, the statistical evaluation of mixtures, low copy number, and interpretation issues associated with DNA databases are discussed.
Abstract: Contents:
  Biological Basis for DNA Evidence (Peter Gill and John Buckleton): Historical and Background Biology; Understanding PCR Profiles
  A Framework for Interpreting Evidence (John Buckleton): The Frequentist Approach; The Logical Approach; The Full Bayesian Approach; A Possible Solution; A Comparison of the Different Approaches
  Population Genetic Models (John Buckleton): Product Rule; Simulation Testing; Discussion of the Product Rule and the Subpopulation Model; A Complex Case Example - DNA Evidence and Orenthal James Simpson
  Relatedness (John Buckleton and Christopher Triggs): Conditional Probabilities; Joint Probabilities; The Unifying Formula; The Effect of Linkage
  Validating Databases (John Buckleton): Which Is the Relevant Population?; Population Databases; Validating the Population Genetic Model; Estimating Q; Descriptive Statistics for Databases
  Sampling Effects (John Buckleton and James Curran): Bounds and a Level; Methods for Assessing Sampling Uncertainty; Minimum Allele Probabilities; Discussion of the Appropriateness of Sampling Uncertainty Estimates
  Mixtures (Tim Clayton and John Buckleton): Frequentist Approaches; Bayesian Approaches; Statistical Evaluation of Mixtures
  Low Copy Number (John Buckleton and Peter Gill): Changes in LCN Profile Morphology; The Interpretation of LCN Profiles
  Non-autosomal Forensic Markers (Simon Walsh, SallyAnn Harbison, and John Buckleton): Forensic Mitochondrial DNA Typing; Forensic Y Chromosome Analysis; Forensic X Chromosome Analysis; A Famous Case Example - The Romanovs
  Parentage Testing (John Buckleton, Tim Clayton, and Chris Triggs): Evaluation of Evidence; Paternity Trios: Mother, Child and Alleged Father; Non-autosomal DNA; Use of the Sub-Population Model of Balding and Nichols to Evaluate the Paternity Index; Relatedness in Paternity Cases; Multiple Children; Inconsistencies in the Mendelian Pattern; 'Exclusions'; Paternity Trios Considering the Possibility of Silent (Null) Alleles
  Disaster Victim Identification, Identification of Missing Persons, and Immigration Cases (John Buckleton, Chris Triggs, and Tim Clayton): Mitochondrial or Nuclear DNA?; Human Remains - Obtaining a Profile from Bodily Remains; Extraction of DNA from Bone, Tooth, Hair and Nail; Complicating Factors
  DNA Intelligence Databases (Simon Walsh and John Buckleton): A Brief History; Functional Aspects; Legislation; Aspects of Forensic Significance; Social and Ethical Considerations; Interpretation Issues Associated with DNA Databases

362 citations


Additional excerpts

  • ...stated ‘‘Once reliable continuous methods become available the binary method will have to be viewed as ‘‘second best’’ and will become obsolete’’ [16]....


Journal ArticleDOI
TL;DR: New procedures are introduced which cope efficiently with parameters of all sizes; some algorithms require sampling from the normal distribution as an intermediate step.
Abstract: Accurate computer methods are evaluated which transform uniformly distributed random numbers into quantities that follow gamma, beta, Poisson, binomial and negative-binomial distributions. All algorithms are designed for variable parameters. The known convenient methods are slow when the parameters are large. Therefore new procedures are introduced which can cope efficiently with parameters of all sizes. Some algorithms require sampling from the normal distribution as an intermediate step. In the reported computer experiments the normal deviates were obtained from a recent method which is also described.
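For a sense of what such transformation algorithms look like, here is the Marsaglia-Tsang squeeze method for gamma deviates, a later standard technique chosen for illustration; it is not necessarily one of the algorithms evaluated in the cited paper:

```python
import math
import random

def gamma_variate(shape):
    """Draw one Gamma(shape, scale=1) deviate via the Marsaglia-Tsang method.

    For shape >= 1: transform a normal deviate through v = (1 + c*x)^3 and
    accept/reject against the gamma density. For shape < 1: use the boosting
    identity Gamma(a) = Gamma(a + 1) * U^(1/a)."""
    if shape < 1.0:
        return gamma_variate(shape + 1.0) * random.random() ** (1.0 / shape)
    d = shape - 1.0 / 3.0
    c = 1.0 / math.sqrt(9.0 * d)
    while True:
        x = random.gauss(0.0, 1.0)
        v = (1.0 + c * x) ** 3
        if v <= 0.0:
            continue  # reject: transformed value outside the support
        u = random.random()
        if u < 1.0 - 0.0331 * x ** 4:
            return d * v  # fast acceptance via the squeeze bound
        if math.log(u) < 0.5 * x * x + d * (1.0 - v + math.log(v)):
            return d * v  # full acceptance test
```

The sample mean of many Gamma(shape, 1) draws should converge to `shape`, which gives a quick sanity check of the implementation.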

321 citations
