Author

Fritz A. Seiler

Bio: Fritz A. Seiler is an academic researcher. The author has contributed to research in the topics Risk management plan and Risk assessment, has an h-index of 6, and has co-authored 11 publications receiving 11,221 citations.

Papers
Journal ArticleDOI
TL;DR: In this paper, criteria for selecting appropriate probability distribution functions for stochastic variables are reviewed, the most important being any a priori knowledge about the nature of a variable and the Central Limit Theorem of probability theory applied to sums and products of stochastic variables.
Abstract: One of the main steps in an uncertainty analysis is the selection of appropriate probability distribution functions for all stochastic variables. In this paper, criteria for such selections are reviewed, the most important among them being any a priori knowledge about the nature of a stochastic variable, and the Central Limit Theorem of probability theory applied to sums and products of stochastic variables. In applications of these criteria, it is shown that many of the popular selections, such as the uniform distribution for a poorly known variable, require far more knowledge than is actually available. However, the knowledge available is usually sufficient to make use of other, more appropriate distributions. Next, functions of stochastic variables and the selection of probability distributions for their arguments as well as the use of different methods of error propagation through these functions are discussed. From these evaluations, priorities can be assigned to determine which of the stochastic variables in a function need the most care in selecting the type of distribution and its parameters. Finally, a method is proposed to assist in the assignment of an appropriate distribution which is commensurate with the total information on a particular stochastic variable, and is based on the scientific method. Two examples are given to elucidate the method for cases of little or almost no information.

55 citations
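
To make the Central Limit Theorem argument above concrete, here is a minimal Monte Carlo sketch; the uniform inputs and sample sizes are arbitrary illustrative choices, not taken from the paper. It shows why sums of stochastic variables tend toward a normal distribution while their products tend toward a lognormal one.

```python
# Monte Carlo sketch of the Central Limit Theorem argument: sums of many
# stochastic variables tend toward a normal distribution, while products
# tend toward a lognormal one (the log of a product is a sum of logs).
# Uniform inputs and sample sizes are arbitrary illustrative choices.
import numpy as np

def skewness(v):
    """Sample skewness; roughly zero for a normal distribution."""
    return float(((v - v.mean()) ** 3).mean() / v.std() ** 3)

rng = np.random.default_rng(0)
x = rng.uniform(0.5, 1.5, size=(100_000, 20))   # 20 variables per sample

sums = x.sum(axis=1)
products = x.prod(axis=1)

print("skewness of sums:          ", round(skewness(sums), 3))              # ~ 0
print("skewness of log(products): ", round(skewness(np.log(products)), 3))  # ~ 0
print("skewness of products:      ", round(skewness(products), 3))          # clearly > 0
```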

Journal ArticleDOI
TL;DR: The purpose of this paper is to demonstrate that despite large errors, an analytical treatment of error propagation is possible in many instances.
Abstract: An essential facet of a risk assessment is the correct evaluation of uncertainties inherent in the numerical results. If the calculation is based on an explicit algebraic expression, an analytical treatment of error propagation is possible, usually as an approximation valid for small errors. In many instances, however, the errors are large and uncertain. It is the purpose of this paper to demonstrate that despite large errors, an analytical treatment is possible in many instances. These cases can be identified by an analysis of the algebraic structure and a detailed examination of the errors in input parameters and mathematical models. From a general formula, explicit formulas for some simple algebraic structures that occur often in risk assessments are derived and applied to practical problems.

23 citations
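
As a rough illustration of the small-error analytical treatment contrasted with a large-error reference, the sketch below compares the first-order propagation formula for a simple product against a Monte Carlo estimate. The input values and error sizes are assumptions for illustration, not taken from the paper, which derives explicit formulas valid for large errors.

```python
# First-order (small-error) propagation for a simple algebraic structure,
# a product y = a * b, compared against a Monte Carlo estimate.
# Input values and error sizes are illustrative assumptions.
import numpy as np

a, sig_a = 10.0, 2.0   # hypothetical input and its standard deviation
b, sig_b = 5.0, 1.5

# First-order approximation: relative variances add for a product.
rel_var = (sig_a / a) ** 2 + (sig_b / b) ** 2
sigma_y_linear = abs(a * b) * np.sqrt(rel_var)

# Monte Carlo reference, valid for large errors as well.
rng = np.random.default_rng(1)
samples = rng.normal(a, sig_a, 1_000_000) * rng.normal(b, sig_b, 1_000_000)
print(f"first-order sigma: {sigma_y_linear:.2f}")
print(f"Monte Carlo sigma: {samples.std():.2f}")
```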

Journal ArticleDOI
TL;DR: This manuscript provides risk estimators for acute lethality from radiation-induced injury to the bone marrow of humans after uniform total-body exposure to low linear energy transfer (LET) radiation for nuclear disaster risk assessment.
Abstract: This manuscript provides risk estimators for acute lethality from radiation-induced injury to the bone marrow of humans after uniform total-body exposure to low linear energy transfer (LET) radiation. The risk estimators are needed for nuclear disaster risk assessment. The approach used is based on the dose X, in units of D50 (i.e., the dose required for 50% lethality). Using animal data, it is demonstrated that the use of dose in units of D50 eliminates most of the variability associated with mammalian species, type of low-LET radiation, and low-LET dose rate. Animal data are used to determine the shape of the dose-effect curve for marrow-syndrome lethality in man and to develop a functional relationship for the dependence of the D50 on dose rate. The functional relationship is used, along with the Weibull model, to develop acute lethality risk estimators for complex temporal patterns of continuous exposure to low-LET radiation. Animal data are used to test model predictions.

19 citations
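
The sketch below illustrates the kind of Weibull dose-effect curve described above, with dose expressed in units of D50 so that lethality is 50% at a normalized dose of 1. The shape parameter is an arbitrary illustrative value, and the dose-rate dependence of D50 developed in the paper is not modeled here.

```python
# Minimal sketch of a Weibull dose-effect curve with dose in units of D50.
# The shape parameter V is a made-up illustrative value, not a fitted one,
# and the dose-rate dependence of D50 is not modeled.
import numpy as np

def weibull_lethality(dose_in_d50_units, shape=6.0):
    """P(lethality) for a normalized dose X = dose / D50.

    Normalized so that P = 0.5 exactly at X = 1 (i.e., at the D50).
    """
    x = np.asarray(dose_in_d50_units, dtype=float)
    return 1.0 - np.exp(-np.log(2.0) * x ** shape)

for x in (0.5, 1.0, 1.5, 2.0):
    print(f"dose = {x:.1f} D50 -> P(lethality) ~ {weibull_lethality(x):.3f}")
```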

Journal ArticleDOI
TL;DR: The minimum data that a risk assessment needs to produce a valid risk estimate are discussed, demonstrating that the "true" exposure-effect relationship becomes less and less important as more population characteristic factors are identified and the larger those factors are.
Abstract: Ecological studies of health effects due to agent exposure are generally considered to be a blunt instrument of scientific investigation, unfit to determine the “true” exposure-effect relationship for an agent. Based on this widely accepted tenet, ecological studies of the correlation between the local air concentration of radon and the local lung cancer mortality as measured by Cohen have been criticized as being subject to the “Ecological Fallacy” and thus producing invalid risk data. Here we discuss the data that a risk assessment needs as a minimum requirement for making a valid risk estimate. The examination of these data and a “thought experiment” show that it is Cohen's raw ecological data, uncorrected for population characteristic factors, which are the proper data for a risk assessment. Consequently, the “true” exposure-effect relationship is less and less important the more population characteristic factors are identified and the larger they are. This reversal of the usual argument is due to our...

15 citations


Cited by
Journal ArticleDOI
TL;DR: Several arguments that support the observed high accuracy of SVMs are reviewed, and numerous examples and proofs of most of the key theorems are given.
Abstract: The tutorial starts with an overview of the concepts of VC dimension and structural risk minimization. We then describe linear Support Vector Machines (SVMs) for separable and non-separable data, working through a non-trivial example in detail. We describe a mechanical analogy, and discuss when SVM solutions are unique and when they are global. We describe how support vector training can be practically implemented, and discuss in detail the kernel mapping technique which is used to construct SVM solutions which are nonlinear in the data. We show how Support Vector machines can have very large (even infinite) VC dimension by computing the VC dimension for homogeneous polynomial and Gaussian radial basis function kernels. While very high VC dimension would normally bode ill for generalization performance, and while at present there exists no theory which shows that good generalization performance is guaranteed for SVMs, there are several arguments which support the observed high accuracy of SVMs, which we review. Results of some experiments which were inspired by these arguments are also presented. We give numerous examples and proofs of most of the key theorems. There is new material, and I hope that the reader will find that even old material is cast in a fresh light.

15,696 citations
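
For readers who want to try the ideas from the tutorial, the following sketch fits a soft-margin SVM with a Gaussian (RBF) kernel to a toy nonlinear data set. The use of scikit-learn and the toy data are assumptions of this example, not part of the tutorial itself.

```python
# Soft-margin SVM with a Gaussian (RBF) kernel on a toy nonlinear problem.
# scikit-learn and the synthetic data are illustrative choices.
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=500, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# C controls the soft-margin trade-off; gamma is the RBF kernel width.
clf = SVC(kernel="rbf", C=1.0, gamma=1.0).fit(X_train, y_train)

print("support vectors:", clf.n_support_.sum())
print("test accuracy:  ", clf.score(X_test, y_test))
```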

Journal ArticleDOI
TL;DR: A simplified scoring system is proposed that performs well for reducing CPU time and increasing the accuracy of alignments even for sequences having large insertions or extensions as well as distantly related sequences of similar length.
Abstract: A multiple sequence alignment program, MAFFT, has been developed. The CPU time is drastically reduced as compared with existing methods. MAFFT includes two novel techniques. (i) Homologous regions are rapidly identified by the fast Fourier transform (FFT), in which an amino acid sequence is converted to a sequence composed of volume and polarity values of each amino acid residue. (ii) We propose a simplified scoring system that performs well for reducing CPU time and increasing the accuracy of alignments even for sequences having large insertions or extensions as well as distantly related sequences of similar length. Two different heuristics, the progressive method (FFT-NS-2) and the iterative refinement method (FFT-NS-i), are implemented in MAFFT. The performances of FFT-NS-2 and FFT-NS-i were compared with other methods by computer simulations and benchmark tests; the CPU time of FFT-NS-2 is drastically reduced as compared with CLUSTALW with comparable accuracy. FFT-NS-i is over 100 times faster than T-COFFEE, when the number of input sequences exceeds 60, without sacrificing the accuracy.

12,003 citations
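
The core FFT trick can be sketched as follows: residues are encoded as numbers (MAFFT uses volume and polarity scales; the toy values here are arbitrary stand-ins) and a shared region between two sequences is located by FFT-based cross-correlation. This is illustrative only, not MAFFT's actual implementation.

```python
# Sketch of the FFT idea: encode residues numerically, then find the offset
# of a shared region via FFT-based cross-correlation (convolution theorem).
# The residue values are arbitrary stand-ins, not MAFFT's volume/polarity scales.
import numpy as np

toy_scale = {"A": 0.1, "C": 0.3, "D": -0.8, "E": -0.7, "G": 0.0,
             "K": -0.5, "L": 0.9, "V": 0.8, "W": 1.0, "Y": 0.6}

def encode(seq):
    return np.array([toy_scale[res] for res in seq])

seq1 = "GGGGLLVVWWYYGGGG"   # shared block LLVVWWYY starts at index 4
seq2 = "AALLVVWWYYDDEEKK"   # same block starts at index 2

a, b = encode(seq1), encode(seq2)
n = len(a) + len(b) - 1
corr = np.fft.irfft(np.fft.rfft(a, n) * np.conj(np.fft.rfft(b, n)), n)

k = int(np.argmax(corr))
lag = k if k < len(a) else k - n   # map circular index to a signed offset
print("best offset:", lag)         # expect 2 (= 4 - 2)
```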

Journal ArticleDOI
TL;DR: This work proposes a principled statistical framework for discerning and quantifying power-law behavior in empirical data by combining maximum-likelihood fitting methods with goodness-of-fit tests based on the Kolmogorov-Smirnov (KS) statistic and likelihood ratios.
Abstract: Power-law distributions occur in many situations of scientific interest and have significant consequences for our understanding of natural and man-made phenomena. Unfortunately, the detection and characterization of power laws is complicated by the large fluctuations that occur in the tail of the distribution—the part of the distribution representing large but rare events—and by the difficulty of identifying the range over which power-law behavior holds. Commonly used methods for analyzing power-law data, such as least-squares fitting, can produce substantially inaccurate estimates of parameters for power-law distributions, and even in cases where such methods return accurate answers they are still unsatisfactory because they give no indication of whether the data obey a power law at all. Here we present a principled statistical framework for discerning and quantifying power-law behavior in empirical data. Our approach combines maximum-likelihood fitting methods with goodness-of-fit tests based on the Kolmogorov-Smirnov (KS) statistic and likelihood ratios. We evaluate the effectiveness of the approach with tests on synthetic data and give critical comparisons to previous approaches. We also apply the proposed methods to twenty-four real-world data sets from a range of different disciplines, each of which has been conjectured to follow a power-law distribution. In some cases we find these conjectures to be consistent with the data, while in others the power law is ruled out.

8,753 citations
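
A minimal sketch of the core estimation step for continuous data with a known lower cutoff x_min is given below: the maximum-likelihood estimate of the exponent plus a Kolmogorov-Smirnov distance between the data and the fit. The full framework also scans over x_min and applies likelihood-ratio tests against alternative distributions, which are omitted here.

```python
# MLE exponent and KS statistic for a continuous power-law tail above a
# known x_min. The scan over x_min and the likelihood-ratio tests from the
# full method are omitted in this sketch.
import numpy as np

def fit_power_law(x, x_min):
    """Return (alpha_hat, KS distance) for the tail x >= x_min."""
    tail = np.sort(x[x >= x_min])
    n = tail.size
    alpha = 1.0 + n / np.sum(np.log(tail / x_min))
    # Theoretical and empirical CDFs above x_min.
    cdf_model = 1.0 - (tail / x_min) ** (1.0 - alpha)
    cdf_data = np.arange(1, n + 1) / n
    ks = np.max(np.abs(cdf_data - cdf_model))
    return alpha, ks

# Synthetic check: inverse-transform sample from a power law with alpha = 2.5.
rng = np.random.default_rng(2)
x = (1.0 - rng.random(50_000)) ** (-1.0 / 1.5)
print(fit_power_law(x, x_min=1.0))   # alpha should come out near 2.5
```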

Journal ArticleDOI
TL;DR: In this article, phred, a base-calling program for automated sequencer traces with improved accuracy, is described; it is shown to achieve a lower error rate than the ABI software, averaging 40%-50% fewer errors in the data sets examined, independent of position in read, machine running conditions, or sequencing chemistry.
Abstract: The availability of massive amounts of DNA sequence information has begun to revolutionize the practice of biology. As a result, current large-scale sequencing output, while impressive, is not adequate to keep pace with growing demand and, in particular, is far short of what will be required to obtain the 3-billion-base human genome sequence by the target date of 2005. To reach this goal, improved automation will be essential, and it is particularly important that human involvement in sequence data processing be significantly reduced or eliminated. Progress in this respect will require both improved accuracy of the data processing software and reliable accuracy measures to reduce the need for human involvement in error correction and make human review more efficient. Here, we describe one step toward that goal: a base-calling program for automated sequencer traces, phred, with improved accuracy. phred appears to be the first base-calling program to achieve a lower error rate than the ABI software, averaging 40%-50% fewer errors in the data sets examined independent of position in read, machine running conditions, or sequencing chemistry.

7,627 citations
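
As a side note, the widely used phred-style quality score that grew out of this line of work maps a per-base error probability to Q = -10 log10(P). The small sketch below shows only that conversion; it is not the base-calling algorithm itself.

```python
# Conversion between per-base error probability and a phred-style quality
# score, Q = -10 * log10(P_error). Illustrative utility only; not the
# base-calling algorithm described in the paper.
import math

def error_prob_to_phred(p_error: float) -> float:
    """Convert a per-base error probability into a phred quality score."""
    return -10.0 * math.log10(p_error)

def phred_to_error_prob(q: float) -> float:
    """Convert a phred quality score back into an error probability."""
    return 10.0 ** (-q / 10.0)

print(error_prob_to_phred(0.001))   # -> 30.0 (1 error in 1,000 bases)
print(phred_to_error_prob(20))      # -> 0.01 (1 error in 100 bases)
```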

Journal ArticleDOI
08 Nov 2004
TL;DR: The motivation, development, use, and implications of the UT are reviewed, showing it to be more accurate and easier to implement than linearization while using the same order of calculations.
Abstract: The extended Kalman filter (EKF) is probably the most widely used estimation algorithm for nonlinear systems. However, more than 35 years of experience in the estimation community has shown that it is difficult to implement, difficult to tune, and only reliable for systems that are almost linear on the time scale of the updates. Many of these difficulties arise from its use of linearization. To overcome this limitation, the unscented transformation (UT) was developed as a method to propagate mean and covariance information through nonlinear transformations. It is more accurate, easier to implement, and uses the same order of calculations as linearization. This paper reviews the motivation, development, use, and implications of the UT.

6,098 citations
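
A minimal sketch of the unscented transformation described above: a small set of sigma points is propagated through a nonlinear function and re-averaged to recover the transformed mean and covariance. The scaling parameter and the polar-to-Cartesian test function are illustrative choices, not values from the paper.

```python
# Unscented transformation sketch: propagate (mean, cov) through a nonlinear
# function via 2n+1 sigma points instead of linearization. The kappa value
# and the test function are illustrative choices.
import numpy as np

def unscented_transform(f, mean, cov, kappa=1.0):
    """Propagate (mean, cov) through f using 2n+1 symmetric sigma points."""
    n = mean.size
    # Sigma points: the mean plus/minus the columns of a scaled matrix square root.
    sqrt_cov = np.linalg.cholesky((n + kappa) * cov)
    sigma = np.vstack([mean, mean + sqrt_cov.T, mean - sqrt_cov.T])
    weights = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
    weights[0] = kappa / (n + kappa)

    y = np.array([f(s) for s in sigma])
    y_mean = weights @ y
    diff = y - y_mean
    y_cov = (weights[:, None] * diff).T @ diff
    return y_mean, y_cov

# Example: polar -> Cartesian conversion, a classic nonlinear test case.
def polar_to_cartesian(p):
    r, theta = p
    return np.array([r * np.cos(theta), r * np.sin(theta)])

mean = np.array([1.0, np.pi / 4])
cov = np.diag([0.01, 0.05])
print(unscented_transform(polar_to_cartesian, mean, cov, kappa=1.0))
```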