Showing papers on "False positive paradox" published in 1994


Journal ArticleDOI
TL;DR: It is shown that by pooling sera samples the authors not only achieve a cost saving but also, which is counterintuitive, an increase in the estimation accuracy.
Abstract: Estimating the prevalence of the human immunodeficiency virus (HIV) in a group is challenging; this is especially so when the prevalence is small. One reason is that the presence of measurement errors resulting from the limited precision of tests makes estimation, using traditional methods, impossible in some screening situations. Measurement error is real; ignoring it leads to severe bias, and inference about the prevalence becomes unsatisfactory. Indeed, in a low-prevalence situation the expected number of false positives is very high, often even higher than the number of true positives. The second reason is that in low-prevalence areas a large sample is needed in order to obtain a non-zero estimate. This is usually a very costly, and often unrealistic, solution. This paper considers the advantages and disadvantages of pooled testing as an alternative solution to this problem. We show that by pooling sera samples we not only achieve a cost saving but also, which is counterintuitive, an increase in the estimation accuracy. We also discuss the statistical issues associated with the resulting estimator.
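As a rough illustration of the pooled-testing idea in this abstract, the sketch below estimates prevalence from the fraction of positive pools. It is not necessarily the estimator studied in the paper: it assumes equal pool sizes, independent samples, and a test whose sensitivity and specificity do not change with pooling (no dilution effect). The function name `pooled_prevalence` and its parameters are illustrative.

```python
def pooled_prevalence(positive_pools, n_pools, pool_size,
                      sensitivity=1.0, specificity=1.0):
    """Estimate individual-level prevalence from pooled test results.

    Assumed model (not taken from the paper): a pool is truly positive with
    probability 1 - (1 - p)**pool_size, and the test reports a positive with
    probability Se for a truly positive pool and 1 - Sp for a negative one.
    """
    if n_pools <= 0 or pool_size <= 0:
        raise ValueError("n_pools and pool_size must be positive")
    if sensitivity + specificity <= 1.0:
        raise ValueError("test must be better than chance (Se + Sp > 1)")

    observed_positive_rate = positive_pools / n_pools
    # Solve Se*(1 - q) + (1 - Sp)*q = observed rate for q = (1 - p)**pool_size.
    frac_all_negative = (sensitivity - observed_positive_rate) / (
        sensitivity + specificity - 1.0
    )
    # Sampling noise can push the ratio outside [0, 1]; clip it.
    frac_all_negative = min(1.0, max(0.0, frac_all_negative))
    return 1.0 - frac_all_negative ** (1.0 / pool_size)


# Hypothetical survey: 100 pools of 10 sera each, 10 pools test positive.
print(pooled_prevalence(10, 100, 10))
print(pooled_prevalence(10, 100, 10, sensitivity=0.99, specificity=0.99))
```

Correcting for imperfect specificity lowers the estimate, because some positive pools are attributed to test error rather than to infection.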

74 citations


Journal ArticleDOI
W F Lamboy
TL;DR: The maximum percent bias, computed from the estimated proportions of false positives and false negatives in the RAPD data set, is proposed as a criterion for determining whether bias correction of the similarity coefficients is required or not.
Abstract: The production of informative random amplified polymorphic DNA (RAPD) markers using PCR and a single primer is often accompanied by the generation of artifactual (noninformative) bands as well. When RAPD data are used to compute genetic similarity coefficients, these artifacts (false positives, false negatives, or both) can cause large biases in the numerical values of the coefficients. As a result, some workers have been reluctant to use RAPD markers in the estimation of genetic similarities. Artifactual bands are of two types: those caused by variation in experimental conditions, and those caused by characteristics of the DNA to be amplified. A procedure is described that allows for correction of the bias caused by the first type of artifact, provided that replicate DNA samples have been extracted, amplified, and scored. The resulting data are used to obtain an estimate of the proportion of false-positive and false-negative bands. These values are then used to correct the bias in the computed similarity coefficients. Two examples are given, one in which bias correction is critical to the results, and one in which it is less important. The maximum percent bias, computed from the estimated proportions of false positives and false negatives in the RAPD data set, is proposed as a criterion for determining whether bias correction of the similarity coefficients is required or not. Although all reasonable efforts should be made to optimize PCR protocols to eliminate artifactual bands, when this is not possible, the methods described allow RAPD markers to be used to compute genetic similarities reliably and accurately, even when artifactual bands resulting from variation in experimental conditions are present.

35 citations


Journal ArticleDOI
TL;DR: The present study demonstrates that lowering the alpha level to some fixed, predetermined value is not a recommended strategy, and a procedure for adjusting the probability level for a test of association between genotypes and a disorder is given.
Abstract: Recent analyses of candidate gene association studies for psychiatric disorders have concluded that most statistically significant results are likely to be false positives because there are a large number of potential candidate loci and a low a priori probability that a given candidate locus will in fact be trait relevant. Hence, it was recommended that the α level (P level) be lowered for association studies. The present study demonstrates that lowering the α level to some fixed, predetermined value is not a recommended strategy. Rather, the probability of false positives (and false negatives) depends on such parameters as the prevalence of the disorder, the prevalence of the genotypes at the candidate locus, and the relative risk. In some areas of the parameter space, the adjustment to α may be modest. In other areas, however, even the requirement of one or more independent replications of the original results gives false positive rates exceeding 80% or 90%. Hence, the P levels required to minimize false positives may have to be changed from one statistical test to another even within the same study. A procedure for adjusting the probability level for a test of association between genotypes and a disorder is given. © 1994 Wiley-Liss, Inc.
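The dependence of false positives on the prior probability, α, and power can be sketched with Bayes' rule. This is a generic textbook calculation, not the adjustment procedure proposed in the paper; the function name and all input values below are hypothetical.

```python
def false_positive_probability(prior, alpha, power, n_studies=1):
    """P(locus is not trait relevant | all n_studies tests were significant).

    Assumes independent studies that share the same alpha and power
    (a simplification made for this sketch).
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    # Probability of n significant results when there is no true association.
    false_path = (1.0 - prior) * alpha ** n_studies
    # Probability of n significant results when the association is real.
    true_path = prior * power ** n_studies
    return false_path / (false_path + true_path)


# Hypothetical inputs: 1 in 1000 candidate loci is relevant, alpha = 0.05,
# power = 0.8; compare the original result with one independent replication.
for n in (1, 2):
    prob = false_positive_probability(prior=0.001, alpha=0.05, power=0.8, n_studies=n)
    print(f"{n} significant study(ies): P(false positive) = {prob:.3f}")
```

With a low prior, the share of significant results that are false stays high even after a replication, which is consistent with the abstract's point that a single fixed α does not suit every test.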

33 citations


Journal ArticleDOI
TL;DR: The author contends that clinical efficacy in no way assures that a false negative or a false positive has been avoided, and a plea is made for theorists and researchers to acknowledge that both categories of errors can occur and to conduct future clinical and laboratory research accordingly.
Abstract: Logically, two broad types of mnemonic errors are possible when adult psychotherapy or hypnosis patients reflect on whether they were sexually abused or not as a child. They may believe that they were not abused when in fact they were (false negative error), or they may believe they were abused when in fact they were not (false positive error). The author briefly reviews the empirical evidence for the occurrence of each of these types of errors, and illustrates each with a clinical case. Further, in considering the incidence, importance, and clinical implications of these errors, the author contends that clinical efficacy in no way assures that a false negative or a false positive has been avoided. A plea is made for theorists and researchers to acknowledge that both categories of errors can occur and to conduct future clinical and laboratory research accordingly.

29 citations


Proceedings ArticleDOI
13 Nov 1994
TL;DR: An adaptive iris filter is developed that enhances only rounded opacities and is insensitive to long and slender shadows; experimental results show the effectiveness of the proposed automated detection system.
Abstract: The automated detection system consists of two processing steps. The first step is to enhance cancerous tumors. For this purpose, the adaptive iris filter has been developed, which can enhance only rounded opacities and is insensitive to long and slender shadows. The second step is to discriminate between malignant tumors and the others by applying shape analysis to the tumor candidates. Nine feature parameters have been developed for reliable identification of malignant tumors. Experiments to test the performance of the proposed system have been made. The average number of false positives per image is only 0.18 where the true positive detection rate is 100%. These experimental results have shown the effectiveness of the proposed system.

25 citations


Journal ArticleDOI
TL;DR: The authors consider the role of misclassification costs in developing classification trees by varying the ratio of costs assigned to false negatives and false positives, and a set of sensitivity-specificity combinations define a curve that can be used like an ROC curve.
Abstract: A common problem in medical diagnosis is to combine information from several tests or patient characteristics into a decision rule to distinguish diseased from healthy patients. Among the statistical procedures proposed to solve this problem, recursive partitioning is appealing for the easily-used and intuitive nature of the rules it produces. The rules have the form of classification trees, in which each node of the tree represents a simple question about one of the predictor variables, and the branch taken depends on the answer. The authors consider the role of misclassification costs in developing classification trees. By varying the ratio of costs assigned to false negatives and false positives, a series of classification trees are generated, each optimal for some range of cost ratios, and each with a different sensitivity and specificity. The set of sensitivity-specificity combinations define a curve that can be used like an ROC curve.

24 citations


Proceedings ArticleDOI
11 May 1994
TL;DR: The effects of image quality, particularly image noise, on the performance of an on-going CAD scheme for the detection of clustered microcalcifications in digital mammograms are examined to determine the causes of false-negative and false-positive clusters.
Abstract: The accuracy of computer-aided detection (CAD) schemes involves a tradeoff between high sensitivity and low false-positive rate. In an on-going study, we are analyzing our CAD scheme for the detection of clustered microcalcifications in digital mammograms to determine the causes of false-negative and false-positive clusters. Two different limitations that lead to false-negatives and false-positives have been identified. The first limitation is imposed by the quality of the digital mammogram, whereas the second is a consequence of the similarities of radiographic features between true and false clusters. In this paper, we examine the effects of image quality, particularly image noise, on the performance of our CAD scheme. Preliminary results indicate that the performance of our scheme is limited by anatomic noise and x-ray quantum noise. Almost all the false positives detected in clinical images by our CAD scheme are caused by a combination of these two forms of noise.

8 citations


Patent
16 Sep 1994
TL;DR: An automated interactive cytology system provides expedited handling of samples, minimizing false negatives while not substantially increasing the number of false positives, by identifying and displaying the cells which are of greatest interest to the cytologist.
Abstract: An automated interactive cytology system provides expedited handling of samples, minimizing false negatives, while not substantially increasing the number of false positives. A computerized system identifies and displays the cells which are of greatest interest to the cytologist. The system then processes this information on all cells identified to classify the slide as normal, abnormal, or questionable based on a statistical analysis of cells meeting given criteria. Before displaying the results of the statistical analysis, a cytologist reviews the cells which the computer has determined to be most significant. It is only then, after the cytologist has determined whether the cells are positive, negative, or questionable, that the determination is inputted into the automated system. The automated system then compares the cytologist's analysis with its own statistical analysis. Based on the two opinions, the cytologist determines how to advise a doctor regarding the sample.

5 citations



Journal ArticleDOI
16 Nov 1994-JAMA
TL;DR: It is suggested that some confusion surrounds the use of the term "false-positive rate" in the article comparing capillary blood lead levels with simultaneously drawn venous samples, and the data show excellent correlation between the two sampling methods.
Abstract: In Reply. —Drs Rainey and Schonfeld have suggested that some confusion surrounds the use of the term "false-positive rate" in our article comparing capillary blood lead levels with simultaneously drawn venous samples. The quantity we use (false positives/total screenings) is, strictly speaking, a proportion and not a rate.1,2 This proportion, the proportion preferred by Rainey and Schonfeld (false positives/[false positives + true positives]) called a false-positive rate, and the positive predictive value are all useful in evaluating screening tests but need to be translated into practical terms. Our data show excellent correlation between the two sampling methods. Even when capillary results differed from venous results, they were usually not very far off the mark. How best to quantitatively represent this was the subject of much discussion among the authors. The false-positive rate, in our opinion, is limited by its necessarily strict construction. For example, using 0.72 μmol/L (15 μg/dL) of

2 citations


Journal ArticleDOI
TL;DR: Results indicated that in general, women with a false positive result at ovarian cancer screening did not have elevated scores on psychometric measures of psychiatric morbidity or anxiety, although significantly more of the women who had false positive than negative results described themselves as “more worried” about cancer since taking part in the screening.
Abstract: Over the past few years rising concern has been voiced regarding the emotional aftermath of false positive results in screening for cancer. The present study compared psychological status in women with negative and false positive results at ovarian cancer screening, one year after the event. Women with false positive results were subdivided into those who were positive at the first scan, but negative thereafter and those who were referred for surgery before they could be reassured that they did not have cancer. The aim was to assess longer term distress using standardised questionnaire measures of psychological disturbance. Results indicated that in general, women with a false positive result at ovarian cancer screening did not have elevated scores on psychometric measures of psychiatric morbidity (GHQ-28) or anxiety (STAI), although significantly more of the women who had false positive than negative results described themselves as “more worried” about cancer since taking part in the screening p...

Journal ArticleDOI
TL;DR: The validity of the use of the "three-color CIMDER" band by health agents in the State of Rondonia, Brazil, was analyzed as an instrument for detecting nutritional risk among children under 5 years of age, for the purpose of referral for follow-up at units of greater complexity than the health post.
Abstract: The value of the use, by the health agents in Rondonia, Brazil, of the nutritional classification proposed by the Multidisciplinary Research Center for Rural Development (CIMDER), Colombia, known as the three color CIMDER band, is analyzed. The band, used to measure arm circumference, would be used as an instrument for the detection of nutritional risk in children under five years of age and for referring them to larger, more complex health units. For this purpose, a sample of 1,268 children was studied. The results of the nutritional classification obtained by the band and the results of the Gomez classification were compared. The application of the validation tests resulted in the following values: sensitivity = 77.1%; specificity = 68.8%; positive predictive value = 59.0%; negative predictive value = 83.7%; rate of false positives = 31.2% and rate of false negatives = 22.9%. Except for the rate of false positives, the results were considered to be satisfactory, sufficiently so to recommend the use of the CIMDER band as an instrument of selection by the health agents in Rondonia. More specific indicators should be adopted at the larger, more complex health units, with a view to reducing the number of false positives in the programs for attendance to the undernourished.

Journal ArticleDOI
16 Nov 1994-JAMA
TL;DR: Capillary and venous blood lead levels in children were compared and it was shown that between 0% and 5% (depending on sampling protocol) of the capillary results were false positives and 1% to 8% false negatives.
Abstract: To the Editor. —To achieve the goal of blood lead screening for all children aged 6 months to 6 years,1 testing should be easy to perform, widely available, inexpensive, and reliable. Specimens are more easily obtained from children by fingerstick than by venipuncture. However, increased opportunity for contamination of capillary specimens has raised concerns about their reliability. Dr Schlenker and colleagues2 compared capillary and venous blood lead levels in children and showed that between 0% and 5% (depending on sampling protocol) of the capillary results were false positives and 1% to 8% false negatives.2 These percentages were referred to as false-positive and false-negative rates, which may lead to confusion. For example, the term "false-positive rate" traditionally has meant the percentage of false positives in the total positives (false positives/[false positives + true positives]).3 Only this statistic has predictive value for determining the significance of a positive test.
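The terminology dispute in this exchange comes down to which denominator is used. The sketch below computes the competing quantities from one 2 x 2 table; the counts are hypothetical and are not taken from the lead-screening study. The third quantity, false positives among the unaffected, is not named in the letter but equals 1 minus specificity and is a further quantity often called the false-positive rate.

```python
def false_positive_measures(tp, fp, tn, fn):
    """Return the quantities that have each been called a 'false-positive rate'."""
    total = tp + fp + tn + fn
    return {
        # False positives / total screenings (used in the original article).
        "fp_per_screening": fp / total,
        # False positives / (false positives + true positives) = 1 - PPV.
        "fp_among_positives": fp / (fp + tp),
        # False positives / (false positives + true negatives) = 1 - specificity.
        "fp_among_unaffected": fp / (fp + tn),
        # Positive predictive value.
        "ppv": tp / (tp + fp),
    }


# Hypothetical counts for 1000 screenings.
measures = false_positive_measures(tp=40, fp=10, tn=940, fn=10)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

The same 10 false positives yield 1% of screenings but 20% of positive results, which is why the choice of denominator matters when interpreting a positive test.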

Proceedings ArticleDOI
11 Nov 1994
TL;DR: It was found that AdaWise generated a small number of total warnings, and that false positives usually indicated areas of weakness in the products tested; the paper also analyzes the warnings that were issued.
Abstract: AdaWise, a set of tools currently under development at ORA, performs automatic checks to verify the absence of common run-time errors affecting the correctness or portability of Ada programs. The tools can be applied to programs of arbitrary size, and they are conservative; that is, the absence of a warning guarantees the absence of a problem. If AdaWise issues a warning, there is a potential error that should be investigated by the programmer. AdaWise checks at compile-time for such potential errors as incorrect order dependence and erroneous execution due to improper aliasing. These errors are not detected by typical compilers. We ran two of the tools on several publicly available Ada software products to determine if the tools issue useful warnings without bombarding the user with “false positives”. We found that AdaWise generated a small number of total warnings, and that false positives usually indicated areas of weakness in the products tested. This paper describes our preliminary tests using the AdaWise toolset, and analyzes the warnings that were issued.

Journal ArticleDOI
10 Aug 1994-JAMA
TL;DR: The analysis could not quantify all the potential costs and benefits of testing programs, but does include the costs of physician visits and of monitoring CD4+ counts.
Abstract: In Reply. —We thank Drs Roizen, Foss, and Mantha for reemphasizing the potential impact of false-positive test results that may occur when testing a low-prevalence population such as physicians and dentists. As they noted, with our baseline estimate as 0.4% prevalence, 20% of physicians testing positive would be false positives. As Roizen et al discuss, the costs of false-positive test results will depend on the prevalence in the population, the sensitivity and specificity of the testing sequence, and the consequences of false results. We stated in our article that the true values for these factors are uncertain; therefore, we used a wide range of values in our sensitivity analyses. Our analysis could not quantify all the potential costs and benefits of testing programs, some of which are nicely described by Roizen et al. Our analysis does include the costs of physician visits and of monitoring CD4+ counts in individuals with

Journal ArticleDOI
10 Aug 1994-JAMA
TL;DR: The authors' analysis indicates that this article may have underestimated the costs for false-positive testing in relation to human immunodeficiency virus (HIV) infection.
Abstract: To the Editor. —Although we found the article by Dr Phillips and colleagues1 interesting, we have several questions regarding their assumptions. First of all, were the correct costs for false-positive testing included? If one assumes the sensitivity and specificity of the tests cited in the article (99% and 99.9% for two of three enzyme-linked immunosorbent assay tests and a Western blot for confirmation), for every 100 000 surgeons tested (assuming a prevalence of 0.06%; data cited in the article), 59.4 true positives and 99 false positives would be detected. Thus, two thirds of the surgeons detected by even this careful scheme of testing would not have human immunodeficiency virus (HIV) infection. Alternatively, if a prevalence rate of 0.4% is assumed, 396 surgeons would be true positives and 99.6 surgeons (20% of the surgeons tested positive) would be false positives. Our analysis indicates that this article may have underestimated the
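The arithmetic in this letter can be reproduced with a short calculation. The sketch below uses only the figures quoted in the abstract (100 000 tested, sensitivity 99%, specificity 99.9%, prevalence 0.06% or 0.4%); the function name `screening_counts` is illustrative.

```python
def screening_counts(n_tested, prevalence, sensitivity, specificity):
    """Expected true positives, false positives, and false share of positives."""
    infected = n_tested * prevalence
    uninfected = n_tested - infected
    true_pos = infected * sensitivity
    false_pos = uninfected * (1.0 - specificity)
    share_false = false_pos / (true_pos + false_pos)
    return true_pos, false_pos, share_false


# The two prevalence scenarios discussed in the letter.
for prevalence in (0.0006, 0.004):
    tp, fp, share = screening_counts(100_000, prevalence, 0.99, 0.999)
    print(f"prevalence {prevalence:.2%}: TP = {tp:.1f}, FP = {fp:.1f}, "
          f"false share of positives = {share:.1%}")
```

At 0.06% prevalence the calculation gives about 59.4 true positives and about 99.9 false positives, close to the 99 quoted in the letter, so roughly 63% of positives are false. At 0.4% prevalence it gives 396 and 99.6, or about 20%.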

01 Jan 1994
TL;DR: In a recent Research Note, Hart, Webster, and Menzies discuss several problems with describing clinical judgments about violence as binary predictions, among which is the inconsistent use of statistical terminology in publications about violence prediction.
Abstract: In a recent Research Note, Hart, Webster, and Menzies (1993) recognize that most published research (e.g., Klassen & O'Connor, 1988; McNiel & Binder, 1987; Otto, 1992) on violence prediction has described accuracy using 2 x 2 contingency tables. These tables treat dangerousness assessments as binary (yes-or-no) predictions about the future, and portray the results of these predictions as true positive (TP), true negative (TN), false positive (FP), or false negative (FN) (see Table 1, top portion). Hart and colleagues discuss several problems with describing clinical judgments about violence in this way, among which is the inconsistent use of statistical terminology in publications about violence prediction. They note that Monahan's landmark monograph (1981) uses "percent false positives" to designate the fraction of persons who were predicted to be violent but were not, whereas Otto's review of "second-generation" (post-1980) studies described prediction accuracy