Showing papers on "False positive paradox" published in 2015


Journal ArticleDOI
TL;DR: The level of replication required for accurate detection of targeted taxa in different contexts was evaluated, as was whether statistical approaches developed to estimate occupancy in the presence of observational errors can successfully estimate true prevalence, detection probability and false-positive rates.
Abstract: Environmental DNA (eDNA) metabarcoding is increasingly used to study the present and past biodiversity. eDNA analyses often rely on amplification of very small quantities or degraded DNA. To avoid missing detection of taxa that are actually present (false negatives), multiple extractions and amplifications of the same samples are often performed. However, the level of replication needed for reliable estimates of the presence/absence patterns remains an unaddressed topic. Furthermore, degraded DNA and PCR/sequencing errors might produce false positives. We used simulations and empirical data to evaluate the level of replication required for accurate detection of targeted taxa in different contexts and to assess the performance of methods used to reduce the risk of false detections. Furthermore, we evaluated whether statistical approaches developed to estimate occupancy in the presence of observational errors can successfully estimate true prevalence, detection probability and false-positive rates. Replications reduced the rate of false negatives; the optimal level of replication was strongly dependent on the detection probability of taxa. Occupancy models successfully estimated true prevalence, detection probability and false-positive rates, but their performance increased with the number of replicates. At least eight PCR replicates should be performed if detection probability is not high, such as in ancient DNA studies. Multiple DNA extractions from the same sample yielded consistent results; in some cases, collecting multiple samples from the same locality allowed detecting more species. The optimal level of replication for accurate species detection strongly varies among studies and could be explicitly estimated to improve the reliability of results.
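As a rough illustration of the replication argument above, the sketch below (not the paper's simulation code) computes the cumulative detection probability 1 − (1 − p)^k under the simplifying assumption that PCR replicates detect a truly present taxon independently with probability p; the `replicates_needed` helper and the 95% target are illustrative choices.

```python
# Minimal sketch (not the paper's simulation): cumulative probability of
# detecting a taxon that is truly present, assuming each PCR replicate
# detects it independently with probability p.

def detection_curve(p, max_replicates=12):
    """Return [(k, P(detected at least once in k replicates)), ...]."""
    return [(k, 1 - (1 - p) ** k) for k in range(1, max_replicates + 1)]

def replicates_needed(p, target=0.95):
    """Smallest number of replicates whose cumulative detection
    probability reaches `target` (a hypothetical threshold)."""
    k, cumulative = 0, 0.0
    while cumulative < target:
        k += 1
        cumulative = 1 - (1 - p) ** k
    return k

if __name__ == "__main__":
    for p in (0.8, 0.5, 0.3):  # per-replicate detection probabilities
        print(f"p={p}: need {replicates_needed(p)} replicates for 95% detection")
```

With a per-replicate detection probability of 0.3, nine replicates are needed to reach 95%, which is broadly consistent with the paper's recommendation of at least eight PCR replicates when detection probability is low.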

490 citations


Journal ArticleDOI
TL;DR: Recent theoretical studies suggest that balancing selection may be ubiquitous but transient, leaving few signatures detectable by existing methods; novel solutions, recently developed model-based approaches and good practices to be implemented in future studies looking for signals of balancing selection are emphasized.
Abstract: In spite of the long-term interest in the process of balancing selection, its frequency in genomes and evolutionary significance remain unclear due to challenges related to its detection. Current statistical approaches based on patterns of variation observed in molecular data suffer from low power and a high incidence of false positives. This raises the question whether balancing selection is rare or is simply difficult to detect. We discuss genetic signatures produced by this mode of selection and review the current approaches used for their identification in genomes. Advantages and disadvantages of the available methods are presented, and areas where improvement is possible are identified. Increased specificity and reduced rate of false positives may be achieved by using a demographic model, applying combinations of tests, appropriate sampling scheme and taking into account intralocus variation in selection pressures. We emphasize novel solutions, recently developed model-based approaches and good practices that should be implemented in future studies looking for signals of balancing selection. We also draw attention of the readers to the results of recent theoretical studies, which suggest that balancing selection may be ubiquitous but transient, leaving few signatures detectable by existing methods. Testing this new theory may require the development of novel high-throughput methods extending beyond genomic scans.

188 citations


Posted Content
TL;DR: Experimental evaluations show significant performance gain using dataset bootstrapping and demonstrate state-of-the-art results achieved by the proposed deep metric learning methods.
Abstract: Existing fine-grained visual categorization methods often suffer from three challenges: lack of training data, large number of fine-grained categories, and high intra-class vs. low inter-class variance. In this work we propose a generic iterative framework for fine-grained categorization and dataset bootstrapping that handles these three challenges. Using deep metric learning with humans in the loop, we learn a low dimensional feature embedding with anchor points on manifolds for each category. These anchor points capture intra-class variances and remain discriminative between classes. In each round, images with high confidence scores from our model are sent to humans for labeling. By comparing with exemplar images, labelers mark each candidate image as either a "true positive" or a "false positive". True positives are added into our current dataset and false positives are regarded as "hard negatives" for our metric learning model. Then the model is retrained with an expanded dataset and hard negatives for the next round. To demonstrate the effectiveness of the proposed framework, we bootstrap a fine-grained flower dataset with 620 categories from Instagram images. The proposed deep metric learning scheme is evaluated on both our dataset and the CUB-200-2011 Birds dataset. Experimental evaluations show significant performance gain using dataset bootstrapping and demonstrate state-of-the-art results achieved by the proposed deep metric learning methods.
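The following is a generic sketch of the deep-metric-learning-with-hard-negatives idea described above, not the authors' anchor-point architecture or training pipeline: a small embedding network trained with a standard triplet margin loss, where the negatives stand in for the labeler-rejected false positives. All layer sizes and hyperparameters are assumptions.

```python
# Illustrative metric-learning sketch with hard negatives (not the paper's
# exact model): a small embedding network trained with a triplet margin
# loss, where "negative" plays the role of a rejected false positive.
import torch
import torch.nn as nn

class EmbeddingNet(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, dim),
        )

    def forward(self, x):
        # L2-normalise so distances live on the unit hypersphere.
        return nn.functional.normalize(self.backbone(x), dim=1)

model = EmbeddingNet()
criterion = nn.TripletMarginLoss(margin=0.2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Dummy batch: anchor/positive from the same category, negative = hard negative.
anchor, positive, negative = (torch.randn(8, 3, 64, 64) for _ in range(3))
loss = criterion(model(anchor), model(positive), model(negative))
loss.backward()
optimizer.step()
```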

149 citations


Journal ArticleDOI
TL;DR: This article serves as an introduction to hypothesis testing and multiple comparisons for practical research applications, with a particular focus on its use in the analysis of functional magnetic resonance imaging data.
Abstract: Objective The need for appropriate multiple comparisons correction when performing statistical inference is not a new problem. However, it has come to the forefront in many new modern data-intensive disciplines. For example, researchers in areas such as imaging and genetics are routinely required to simultaneously perform thousands of statistical tests. Ignoring this multiplicity can cause severe problems with false positives, thereby introducing nonreproducible results into the literature. Methods This article serves as an introduction to hypothesis testing and multiple comparisons for practical research applications, with a particular focus on its use in the analysis of functional magnetic resonance imaging data. Results We discuss hypothesis testing and a variety of principled techniques for correcting for multiple tests. We also illustrate potential pitfalls and problems that can occur if the multiple comparisons issue is not dealt with properly. We conclude by discussing effect size estimation, an issue often linked with the multiple comparisons problem. Conclusions Failure to properly account for multiple comparisons will ultimately lead to heightened risks for false positives and erroneous conclusions.
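A minimal sketch of the kind of correction the article introduces: apply Bonferroni (family-wise) and Benjamini-Hochberg (false discovery rate) adjustments to many simultaneous tests on simulated null data. The use of statsmodels' `multipletests` is a tooling choice made here, not the article's.

```python
# Sketch: family-wise (Bonferroni) vs false-discovery-rate (Benjamini-
# Hochberg) correction applied to many simultaneous tests. The data are
# simulated pure noise, so every "discovery" at the uncorrected
# threshold is a false positive.
import numpy as np
from scipy import stats
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(0)
n_tests = 10_000
# One-sample t-tests on pure-noise data: the null is true everywhere.
data = rng.normal(size=(n_tests, 20))
pvals = stats.ttest_1samp(data, popmean=0.0, axis=1).pvalue

print("uncorrected p<0.05 :", np.sum(pvals < 0.05))   # roughly 500 false positives
print("Bonferroni         :", multipletests(pvals, alpha=0.05, method="bonferroni")[0].sum())
print("Benjamini-Hochberg :", multipletests(pvals, alpha=0.05, method="fdr_bh")[0].sum())
```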

134 citations


Journal ArticleDOI
TL;DR: A novel algorithm to identify fetal microdeletion events in maternal plasma has been developed and used in clinical laboratory-based noninvasive prenatal testing to identify the subchromosomal events 5pdel, 22q11del, 15qdel, 1p36del, 4pdel, 11qdel, and 8qdel.
Abstract: Objective A novel algorithm to identify fetal microdeletion events in maternal plasma has been developed and used in clinical laboratory-based noninvasive prenatal testing. We used this approach to identify the subchromosomal events 5pdel, 22q11del, 15qdel, 1p36del, 4pdel, 11qdel, and 8qdel in routine testing. We describe the clinical outcomes of those samples identified with these subchromosomal events. Methods Blood samples from high-risk pregnant women submitted for noninvasive prenatal testing were analyzed using low coverage whole genome massively parallel sequencing. Sequencing data were analyzed using a novel algorithm to detect trisomies and microdeletions. Results In testing 175 393 samples, 55 subchromosomal deletions were reported. The overall positive predictive value for each subchromosomal aberration ranged from 60% to 100% for cases with diagnostic and clinical follow-up information. The total false positive rate was 0.0017% for confirmed false-positive results; false negative rate and sensitivity were not conclusively determined. Conclusion Noninvasive testing can be expanded into the detection of subchromosomal copy number variations, while maintaining overall high test specificity. In the current setting, our results demonstrate high positive predictive values for testing of rare subchromosomal deletions. © 2015 The Authors. Prenatal Diagnosis published by John Wiley & Sons Ltd.
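The dependence of positive predictive value on the rarity of the targeted microdeletion can be illustrated with a short Bayes calculation; the prevalence and test characteristics below are hypothetical round numbers, not the study's data.

```python
# Worked Bayes sketch (hypothetical numbers, not the study's data):
# even a very specific screen can have a modest positive predictive
# value when the targeted microdeletion is rare.

def ppv(prevalence, sensitivity, specificity):
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    return true_pos / (true_pos + false_pos)

# Assumed illustrative values: 1 in 5,000 prevalence, 90% sensitivity.
for spec in (0.999, 0.9999, 0.99999):
    print(f"specificity {spec:.5f} -> PPV {ppv(1 / 5000, 0.90, spec):.2%}")
```

At this assumed prevalence, the PPV climbs from roughly 15% to above 90% only as the specificity approaches 99.999%, which is why maintaining a very low false positive rate matters for rare subchromosomal events.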

131 citations


Journal ArticleDOI
01 Feb 2015-Ecology
TL;DR: A general framework for modeling false positives in occupancy studies is established, existing modeling approaches are extended to encompass a broader range of sampling designs, and three common sampling designs that are likely to cover most scenarios encountered by researchers are identified.
Abstract: The occurrence of false positive detections in presence-absence data, even when they occur infrequently, can lead to severe bias when estimating species occupancy patterns. Building upon previous efforts to account for this source of observational error, we established a general framework to model false positives in occupancy studies and extend existing modeling approaches to encompass a broader range of sampling designs. Specifically, we identified three common sampling designs that are likely to cover most scenarios encountered by researchers. The different designs all included ambiguous detections, as well as some known-truth data, but their modeling differed in the level of the model hierarchy at which the known-truth information was incorporated (site level or observation level). For each model, we provide the likelihood, as well as R and BUGS code needed for implementation. We also establish a clear terminology and provide guidance to help choosing the most appropriate design and modeling approach.
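A minimal sketch of the shared core of such models (single season, no covariates; not the paper's R or BUGS code): a site's detection history can arise from an occupied site with detection probability p11 or from an unoccupied site with false-positive probability p10, and the two possibilities are mixed by the occupancy probability psi.

```python
# Minimal single-season false-positive occupancy likelihood (illustrative,
# not the paper's implementations): psi = occupancy probability,
# p11 = detection probability at occupied sites, p10 = false-positive
# detection probability at unoccupied sites.
import numpy as np

def site_likelihood(history, psi, p11, p10):
    """Likelihood of one site's detection history (list of 0/1 visits)."""
    y = np.asarray(history)
    occupied = psi * np.prod(p11 ** y * (1 - p11) ** (1 - y))
    unoccupied = (1 - psi) * np.prod(p10 ** y * (1 - p10) ** (1 - y))
    return occupied + unoccupied

def neg_log_likelihood(params, histories):
    psi, p11, p10 = params
    return -sum(np.log(site_likelihood(h, psi, p11, p10)) for h in histories)

# Example: three sites surveyed four times each.
histories = [[1, 0, 1, 1], [0, 0, 0, 0], [0, 1, 0, 0]]
print(neg_log_likelihood((0.6, 0.7, 0.05), histories))
```

Without known-truth data, the parameters of this simple version are only weakly identifiable, which is exactly why the designs described above incorporate known-truth information at the site or observation level.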

119 citations


Journal ArticleDOI
TL;DR: It is discovered that it is not possible to achieve high sensitivity and high specificity at the same time, and the crucial challenge in making tractography a truly useful and reliable tool in brain research and neurology lies in the acquisition of better data.
Abstract: In this study, we used invasive tracing to evaluate white matter tractography methods based on ex vivo diffusion-weighted magnetic resonance imaging (dwMRI) data. A representative selection of tractography methods were compared to manganese tracing on a voxel-wise basis, and a more qualitative assessment examined whether, and to what extent, certain fiber tracts and gray matter targets were reached. While the voxel-wise agreement was very limited, qualitative assessment revealed that tractography is capable of finding the major fiber tracts, although there were some differences between the methods. However, false positive connections were very common and, in particular, we discovered that it is not possible to achieve high sensitivity (i.e., few false negatives) and high specificity (i.e., few false positives) at the same time. Closer inspection of the results led to the conclusion that these problems mainly originate from regions with complex fiber arrangements or high curvature and are not easily resolved by sophisticated local models alone. Instead, the crucial challenge in making tractography a truly useful and reliable tool in brain research and neurology lies in the acquisition of better data. In particular, the increase of spatial resolution, under preservation of the signal-to-noise-ratio, is key.

116 citations


Book ChapterDOI
05 Oct 2015
TL;DR: The feasibility of convolutional neural networks (CNNs) as an effective mechanism for eliminating false positives is investigated and a vessel-aligned multi-planar image representation of emboli is developed.
Abstract: Computer-aided detection (CAD) can play a major role in diagnosing pulmonary embolism (PE) at CT pulmonary angiography (CTPA). However, despite their demonstrated utility, to achieve a clinically acceptable sensitivity, existing PE CAD systems generate a high number of false positives, imposing extra burdens on radiologists to adjudicate these superfluous CAD findings. In this study, we investigate the feasibility of convolutional neural networks (CNNs) as an effective mechanism for eliminating false positives. A critical issue in successfully utilizing CNNs for detecting an object in 3D images is to develop a “right” image representation for the object. Toward this end, we have developed a vessel-aligned multi-planar image representation of emboli. Our image representation offers three advantages: (1) efficiency and compactness—concisely summarizing the 3D contextual information around an embolus in only 2 image channels, (2) consistency—automatically aligning the embolus in the 2-channel images according to the orientation of the affected vessel, and (3) expandability—naturally supporting data augmentation for training CNNs. We have evaluated our CAD approach using 121 CTPA datasets with a total of 326 emboli, achieving a sensitivity of 83% at 2 false positives per volume. This performance is superior to the best performing CAD system in the literature, which achieves a sensitivity of 71% at the same level of false positives. We have further evaluated our system using the entire 20 CTPA test datasets from the PE challenge. Our system outperforms the winning system from the challenge at 0mm localization error but is outperformed by it at 2mm and 5mm localization errors. In our view, the performance at 0mm localization error is more important than those at 2mm and 5mm localization errors.
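Below is a minimal, illustrative 2-channel CNN classifier in the spirit of the vessel-aligned representation; the architecture, sizes, and names are assumptions, not the paper's network.

```python
# Minimal sketch (illustrative architecture, not the paper's network):
# a CNN that scores 2-channel, vessel-aligned candidate patches as
# embolus vs. false positive.
import torch
import torch.nn as nn

class FalsePositiveReducer(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.classifier = nn.Linear(32, 2)  # logits: [false positive, embolus]

    def forward(self, x):          # x: (batch, 2, H, W) candidate patches
        return self.classifier(self.features(x))

patches = torch.randn(4, 2, 64, 64)             # 4 hypothetical CAD candidates
probs = FalsePositiveReducer()(patches).softmax(dim=1)
print(probs[:, 1])                               # per-candidate embolus probability
```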

113 citations


Journal ArticleDOI
TL;DR: The authors recommend different de-duplication options based on the skill level of the searcher and the purpose of de-duplication efforts.
Abstract: Objective The purpose of this study was to compare effectiveness of different options for de-duplicating records retrieved from systematic review searches. Methods Using the records from a published systematic review, five de-duplication options were compared. The time taken to de-duplicate in each option and the number of false positives (were deleted but should not have been) and false negatives (should have been deleted but were not) were recorded. Results The time for each option varied. The number of false positives and false negatives returned from each option also varied greatly. Conclusion The authors recommend different de-duplication options based on the skill level of the searcher and the purpose of de-duplication efforts.
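For orientation, a toy sketch of key-based de-duplication (not one of the five options compared in the study); it illustrates the trade-off the authors measure, since looser keys delete non-duplicates (false positives) and stricter keys miss duplicates (false negatives). The record fields are a hypothetical schema.

```python
# Toy key-based de-duplication: records sharing a normalised DOI, or
# failing that a normalised title+year key, are treated as duplicates.
import re

def dedupe_key(record):
    """record is a dict with optional 'doi', 'title', 'year' fields (hypothetical schema)."""
    if record.get("doi"):
        return ("doi", record["doi"].strip().lower())
    title = re.sub(r"[^a-z0-9]", "", record.get("title", "").lower())
    return ("title-year", title, record.get("year"))

def dedupe(records):
    seen, unique = set(), []
    for rec in records:
        key = dedupe_key(rec)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

records = [
    {"doi": "10.1000/XYZ123", "title": "A Study", "year": 2015},
    {"doi": "10.1000/xyz123", "title": "A study.", "year": 2015},   # duplicate
    {"title": "A Study", "year": 2014},                             # not a duplicate
]
print(len(dedupe(records)))  # -> 2
```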

95 citations


Journal ArticleDOI
26 Aug 2015-PLOS ONE
TL;DR: A mathematical model of scientific discovery that combines hypothesis formation, replication, publication bias, and variation in research quality is developed and it is found that communication of negative replications may aid true discovery even when attempts to replicate have diminished power.
Abstract: Many published research results are false (Ioannidis, 2005), and controversy continues over the roles of replication and publication policy in improving the reliability of research. Addressing these problems is frustrated by the lack of a formal framework that jointly represents hypothesis formation, replication, publication bias, and variation in research quality. We develop a mathematical model of scientific discovery that combines all of these elements. This model provides both a dynamic model of research as well as a formal framework for reasoning about the normative structure of science. We show that replication may serve as a ratchet that gradually separates true hypotheses from false, but the same factors that make initial findings unreliable also make replications unreliable. The most important factors in improving the reliability of research are the rate of false positives and the base rate of true hypotheses, and we offer suggestions for addressing each. Our results also bring clarity to verbal debates about the communication of research. Surprisingly, publication bias is not always an obstacle, but instead may have positive impacts—suppression of negative novel findings is often beneficial. We also find that communication of negative replications may aid true discovery even when attempts to replicate have diminished power. The model speaks constructively to ongoing debates about the design and conduct of science, focusing analysis and discussion on precise, internally consistent models, as well as highlighting the importance of population dynamics.
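The abstract's point about the false-positive rate and the base rate of true hypotheses can be illustrated with the static calculation in the spirit of Ioannidis (2005), which the paper's dynamic model generalizes; the numbers below are illustrative.

```python
# Illustrative calculation (not the paper's dynamic model): the
# probability that a positive finding is true, as a function of the
# false-positive rate alpha, power, and the base rate of true
# hypotheses -- the quantities the abstract identifies as decisive.

def prob_finding_true(base_rate, power, alpha):
    true_pos = base_rate * power
    false_pos = (1 - base_rate) * alpha
    return true_pos / (true_pos + false_pos)

for base_rate in (0.5, 0.1, 0.01):
    for alpha in (0.05, 0.01):
        p = prob_finding_true(base_rate, power=0.8, alpha=alpha)
        print(f"base rate {base_rate:>4}, alpha {alpha}: P(true | positive) = {p:.2f}")
```

When only 1% of tested hypotheses are true, even well-powered studies at alpha = 0.05 yield positives that are mostly false, which is the regime in which replication has the most work to do.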

84 citations


Journal ArticleDOI
TL;DR: It is recommended that other data mining techniques be explored; a study using the k-means data mining algorithm followed by a signature-based approach is proposed in order to lessen the false negative rate; and a system for automatically identifying the number of clusters may be developed.

Journal ArticleDOI
TL;DR: A simple machine learning approach to real–bogus classification is explored by constructing a training set from the image data of ∼32 000 real astrophysical transients and bogus detections from the Pan-STARRS1 Medium Deep Survey and derives the feature representation from the pixel intensity values of a 20 × 20 pixel stamp around the centre of the candidates.
Abstract: Efficient identification and follow-up of astronomical transients is hindered by the need for humans to manually select promising candidates from data streams that contain many false positives. These artefacts arise in the difference images that are produced by most major ground-based time-domain surveys with large format CCD cameras. This dependence on humans to reject bogus detections is unsustainable for next generation all-sky surveys and significant effort is now being invested to solve the problem computationally. In this paper, we explore a simple machine learning approach to real–bogus classification by constructing a training set from the image data of ∼32 000 real astrophysical transients and bogus detections from the Pan-STARRS1 Medium Deep Survey. We derive our feature representation from the pixel intensity values of a 20 × 20 pixel stamp around the centre of the candidates. This differs from previous work in that it works directly on the pixels rather than catalogued domain knowledge for feature design or selection. Three machine learning algorithms are trained (artificial neural networks, support vector machines and random forests) and their performances are tested on a held-out subset of 25 per cent of the training data. We find the best results from the random forest classifier and demonstrate that by accepting a false positive rate of 1 per cent, the classifier initially suggests a missed detection rate of around 10 per cent. However, we also find that a combination of bright star variability, nuclear transients and uncertainty in human labelling means that our best estimate of the missed detection rate is approximately 6 per cent.
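A condensed sketch of the approach's main ingredients follows (flattened 20 × 20 stamps as features, a random forest, and a score threshold fixed at a 1 per cent false positive rate); random arrays stand in for the Pan-STARRS1 cutouts.

```python
# Sketch of real-bogus classification on raw pixel stamps (placeholder
# random data stand in for survey cutouts): flatten each 20x20 stamp into
# a 400-dimensional feature vector, train a random forest, then pick the
# score threshold giving a 1% false positive rate on bogus examples.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
stamps = rng.normal(size=(2000, 20, 20))        # placeholder image stamps
labels = rng.integers(0, 2, size=2000)          # 1 = real transient, 0 = bogus
X = stamps.reshape(len(stamps), -1)             # 400 pixel-intensity features

X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.25, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

scores = clf.predict_proba(X_te)[:, 1]
bogus_scores = scores[y_te == 0]
threshold = np.quantile(bogus_scores, 0.99)      # accept a 1% false positive rate
missed = np.mean(scores[y_te == 1] < threshold)  # missed detection rate at that threshold
print(f"threshold={threshold:.3f}, missed detection rate={missed:.2%}")
```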

Book ChapterDOI
17 Jun 2015
TL;DR: The overall objective is to have as few false positive face detections as possible without losing mask detections in order to trigger alarms only for healthcare personnel who do not wear the surgical mask.
Abstract: This paper introduces a system that detects the presence or absence of the mandatory medical mask in the operating room. The overall objective is to have as few false positive face detections as possible without losing mask detections in order to trigger alarms only for healthcare personnel who do not wear the surgical mask. The medical mask detection is performed with two face detectors; one of them for the face itself, and the other one for the medical mask. Both detectors run color processing in order to enhance the true positives to false positives ratio. The proposed system renders a recall above 95 % with a false positive rate below 5 % for the detection of faces and surgical masks. The system provides real-time image processing, reaching 10 fps on VGA resolution when processing the whole image. The Mixture of Gaussians technique for background subtraction increases the performance up to 20 fps on VGA images. VGA resolution allows for face or mask detection up to 5 m from the camera.
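A rough sketch of how background subtraction can gate a detector so that only frames with motion are scanned, which is the mechanism behind the reported speed-up; OpenCV's stock MOG2 subtractor and Haar face cascade are used here as stand-ins for the paper's trained detectors, and the motion threshold is arbitrary.

```python
# Sketch (OpenCV stand-ins, not the paper's detectors): run a Mixture-of-
# Gaussians background subtractor and only pass frames with enough
# foreground to a face detector.
import cv2

subtractor = cv2.createBackgroundSubtractorMOG2()
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

cap = cv2.VideoCapture(0)                       # any VGA camera or video file
while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)
    if cv2.countNonZero(mask) < 500:            # hypothetical motion threshold
        continue                                # nothing moving: skip detection
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    print(f"{len(faces)} face candidate(s) in this frame")
cap.release()
```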

Proceedings ArticleDOI
16 Apr 2015
TL;DR: An automatic method via deep learning based 3D feature representation is presented, which solves this detection problem with three steps: candidates localization with high sensitivity, feature representation, and precise classification for reducing false positives.
Abstract: Clinical identification and rating of the cerebral microbleeds (CMBs) are important in vascular diseases and dementia diagnosis. However, manual labeling is time-consuming with low reproducibility. In this paper, we present an automatic method via deep learning based 3D feature representation, which solves this detection problem with three steps: candidates localization with high sensitivity, feature representation, and precise classification for reducing false positives. Different from previous methods by exploiting low-level features, e.g., shape features and intensity values, we utilize the deep learning based high-level feature representation. Experimental results validate the efficacy of our approach, which outperforms other methods by a large margin with a high sensitivity while significantly reducing false positives per subject.

Patent
20 Feb 2015
TL;DR: In this paper, the authors present a system and methods for detecting anomalies in computer network traffic with fewer false positives and without the need for time-consuming and unreliable historical baselines.
Abstract: The present invention relates to systems and methods for detecting anomalies in computer network traffic with fewer false positives and without the need for time-consuming and unreliable historical baselines. Upon detection, traffic anomalies can be processed to determine valuable network insights, including health of interfaces, devices and network services, as well as to provide timely alerts in the event of attack.

Book ChapterDOI
28 Jun 2015
TL;DR: This work employs a unique edge classifier and an original voting scheme to capture geometric features of polyps in context and then harness the power of convolutional neural networks in a novel score fusion approach to extract and combine shape, color, texture, and temporal information of the candidates.
Abstract: Computer-aided detection (CAD) can help colonoscopists reduce their polyp miss-rate, but existing CAD systems are handicapped by using either shape, texture, or temporal information for detecting polyps, achieving limited sensitivity and specificity. To overcome this limitation, our key contribution of this paper is to fuse all possible polyp features by exploiting the strengths of each feature while minimizing its weaknesses. Our new CAD system has two stages, where the first stage builds on the robustness of shape features to reliably generate a set of candidates with a high sensitivity, while the second stage utilizes the high discriminative power of the computationally expensive features to effectively reduce false positives. Specifically, we employ a unique edge classifier and an original voting scheme to capture geometric features of polyps in context and then harness the power of convolutional neural networks in a novel score fusion approach to extract and combine shape, color, texture, and temporal information of the candidates. Our experimental results based on FROC curves and a new analysis of polyp detection latency demonstrate a superiority over the state-of-the-art where our system yields a lower polyp detection latency and achieves a significantly higher sensitivity while generating dramatically fewer false positives. This performance improvement is attributed to our reliable candidate generation and effective false positive reduction methods.

Journal ArticleDOI
TL;DR: A new, less restrictive definition increases detection of Klebsiella pneumoniae carbapenemase producers and reduces the likelihood of underestimating the number of infections.
Abstract: Preventing transmission of carbapenemase-producing, carbapenem-resistant Enterobacteriaceae (CP-CRE) is a public health priority. A phenotype-based definition that reliably identifies CP-CRE while minimizing misclassification of non–CP-CRE could help prevention efforts. To assess possible definitions, we evaluated enterobacterial isolates that had been tested and deemed nonsusceptible to ≥1 carbapenem at US Emerging Infections Program sites. We determined the number of non-CP isolates that met (false positives) and CP isolates that did not meet (false negatives) the Centers for Disease Control and Prevention CRE definition in use during our study: 30% (94/312) of CRE had carbapenemase genes, and 21% (14/67) of Klebsiella pneumoniae carbapenemase–producing Klebsiella isolates had been misclassified as non-CP. A new definition requiring resistance to ≥1 carbapenem rarely missed CP strains, but 55% of results were false positive; adding the modified Hodge test to the definition decreased false positives to 12%. This definition should be considered for use in carbapenemase-producing CRE surveillance and prevention.

Journal ArticleDOI
TL;DR: A novel test to detect p-hacking in research is suggested, that is, when researchers report excessive rates of "significant effects" that are truly false positives.
Abstract: Simonsohn, Nelson, and Simmons (2014) have suggested a novel test to detect p-hacking in research, that is, when researchers report excessive rates of "significant effects" that are truly false positives. Although this test is very useful for identifying true effects in some cases, it fails to identify false positives in several situations when researchers conduct multiple statistical tests (e.g., reporting the most significant result). In these cases, p-curves are right-skewed, thereby mimicking the existence of real effects even if no effect is actually present.
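The situation described above is easy to reproduce: if, with no true effect, only the smallest of several p-values per study is reported, the reported significant p-values pile up toward the lower half of the significance range, i.e. the p-curve is right-skewed. The simulation below is illustrative, not the authors' analysis.

```python
# Simulation of the scenario in the comment: under a true null every
# test's p-value is uniform on (0, 1); report only the smallest of
# several p-values per "study". Among reported significant results,
# somewhat more than half then fall below .025 -- a right-skewed
# p-curve despite the absence of any real effect.
import numpy as np

rng = np.random.default_rng(1)
n_studies, n_tests = 100_000, 5
pvals = rng.uniform(size=(n_studies, n_tests))   # null-true p-values
best = pvals.min(axis=1)                         # report only the most significant test
reported = best[best < 0.05]
print("share of reported p-values below .025:", np.mean(reported < 0.025).round(3))
```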

Journal ArticleDOI
TL;DR: This work presents the most comprehensive test of occupancy estimation methods to date, using more than 33 000 auditory call observations collected under standard field conditions and where the true occupancy status of sites was known.
Abstract: Summary Populations are rarely censused. Instead, observations are subject to incomplete detection, misclassification and detection heterogeneity that result from human and environmental constraints. Though numerous methods have been developed to deal with observational uncertainty, validation under field conditions is rare because truth is rarely known in these cases. We present the most comprehensive test of occupancy estimation methods to date, using more than 33 000 auditory call observations collected under standard field conditions and where the true occupancy status of sites was known. Basic occupancy estimation approaches were biased when two key assumptions were not met: that no false positives occur and that no unexplained heterogeneity in detection parameters occurs. The greatest bias occurred for dynamic parameters (i.e. local colonization and extinction), and in many cases, the degree of inaccuracy would render results largely useless. We examined three approaches to increase adherence or relax these assumptions: modifying the sampling design, employing estimators that account for false-positive detections and using covariates to account for site-level heterogeneity in both false-negative and false-positive detection probabilities. We demonstrate that bias can be substantially reduced by modifications to sampling methods and by using estimators that simultaneously account for false-positive detections and site-level covariates to explain heterogeneity. Our results demonstrate that even small probabilities of misidentification and among-site detection heterogeneity can have severe effects on estimator reliability if ignored. We challenge researchers to place greater attention on both heterogeneity and false positives when designing and analysing occupancy studies. We provide 9 specific recommendations for the design, implementation and analysis of occupancy studies to better meet this challenge.

Proceedings ArticleDOI
14 Jun 2015
TL;DR: In this article, the potential root causes of false indications in motor current signature analysis (MCSA) are investigated, and guidelines are provided on how commercially available off-line and on-line tests can be applied to identify false indications from a field engineer's perspective.
Abstract: Motor current signature analysis (MCSA) has become an essential part of the preventive maintenance program for monitoring the condition of the rotor cage in medium voltage induction motors in the pulp and paper industry. However, many cases of false indications due to interference from the motor or load have been reported. False indications can result in unnecessary inspection and outage costs (false positives) or major repair/replacement costs and loss of production (false negatives). The objective of this paper is to present the potential root causes of false indications, and provide guidelines on how commercially available off-line and on-line tests can be applied for identifying false indications from a field engineers' perspective. Case studies of false MCSA indications and results of alternative commercial tests for improving the reliability of the diagnosis are provided through measurements on 6.6 kV and laboratory motor samples. Finally, new test methods under research and development for reliable rotor fault detection are summarized and unresolved problems are listed. This paper is expected to help field maintenance engineers prevent unnecessary motor inspection and forced outages, and guide researchers target future research towards industrial needs.
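For context (not drawn from the paper's case studies), the classic rotor-bar signature that MCSA inspects is a pair of sidebands at (1 ± 2s)·f around the supply frequency f, where s is the slip; the sketch below computes them for illustrative motor parameters. Load oscillations near the same frequencies are one source of the false indications the paper discusses.

```python
# Textbook broken-rotor-bar sidebands inspected by MCSA (illustrative
# motor parameters): twin components at (1 +/- 2s)*f, where s is slip.

def rotor_bar_sidebands(supply_hz, poles, rotor_rpm):
    sync_rpm = 120 * supply_hz / poles            # synchronous speed
    slip = (sync_rpm - rotor_rpm) / sync_rpm
    lower = (1 - 2 * slip) * supply_hz
    upper = (1 + 2 * slip) * supply_hz
    return slip, lower, upper

slip, lower, upper = rotor_bar_sidebands(supply_hz=60.0, poles=4, rotor_rpm=1764)
print(f"slip = {slip:.3f}; sidebands at {lower:.2f} Hz and {upper:.2f} Hz")
```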

Posted Content
TL;DR: In this paper, a simple permutation test is adapted to the needs of QCA users and adjusted the Type I error rate of the test to take into account the multiple hypothesis tests inherent in QCA.
Abstract: The various methodological techniques that fall under the umbrella description of qualitative comparative analysis (QCA) are increasingly popular for modeling causal complexity and necessary or sufficient conditions in medium-N settings. Because QCA methods are not designed as statistical techniques, however, there is no way to assess the probability that the patterns they uncover are the result of chance. Moreover, the implications of the multiple hypothesis tests inherent in these techniques for the false positive rate of the results are not widely understood. This article fills both gaps by tailoring a simple permutation test to the needs of QCA users and adjusting the Type I error rate of the test to take into account the multiple hypothesis tests inherent in QCA. An empirical application — a reexamination of a study of protest-movement success in the Arab Spring — highlights the need for such a test by showing that even very strong QCA results may plausibly be the result of chance.
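A generic sketch of the permutation-test idea follows, using a simple crisp-set sufficiency consistency score rather than the article's exact statistic, and a Bonferroni-style adjustment for the number of configurations examined; all data and counts are toy values.

```python
# Generic permutation test in the spirit of the article: shuffle the
# outcome, recompute a crisp-set sufficiency consistency score, and
# estimate how often chance alone matches the observed score. alpha is
# Bonferroni-adjusted for the number of candidate configurations tested.
import numpy as np

def consistency(config, outcome):
    """Share of cases exhibiting the configuration that also show the outcome."""
    return outcome[config == 1].mean()

def permutation_p(config, outcome, n_perm=10_000, seed=0):
    rng = np.random.default_rng(seed)
    observed = consistency(config, outcome)
    null = [consistency(config, rng.permutation(outcome)) for _ in range(n_perm)]
    return float(np.mean(np.asarray(null) >= observed))

# Toy crisp-set data: 1 = condition/outcome present for each case.
config = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1])
outcome = np.array([1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1])

n_configurations_tested = 8                      # hypothetical number of configurations
alpha = 0.05 / n_configurations_tested           # Type I adjustment for multiple tests
p = permutation_p(config, outcome)
print(f"observed consistency = {consistency(config, outcome):.2f}, p = {p:.3f}, alpha = {alpha:.4f}")
```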

Journal ArticleDOI
27 Jul 2015-PLOS ONE
TL;DR: The newly proposed procedures improve the identification of significant variables and enable us to derive a new insight into epidemiological association analysis.
Abstract: Objectives In epidemiological studies, it is important to identify independent associations between collective exposures and a health outcome. The current stepwise selection technique ignores stochastic errors and suffers from a lack of stability. The alternative LASSO-penalized regression model can be applied to detect significant predictors from a pool of candidate variables. However, this technique is prone to false positives and tends to create excessive biases. It remains challenging to develop robust variable selection methods and enhance predictability. Material and methods Two improved algorithms denoted the two-stage hybrid and bootstrap ranking procedures, both using a LASSO-type penalty, were developed for epidemiological association analysis. The performance of the proposed procedures and other methods including conventional LASSO, Bolasso, stepwise and stability selection models were evaluated using intensive simulation. In addition, methods were compared by using an empirical analysis based on large-scale survey data of hepatitis B infection-relevant factors among Guangdong residents. Results The proposed procedures produced comparable or less biased selection results when compared to conventional variable selection models. In total, the two newly proposed procedures were stable with respect to various scenarios of simulation, demonstrating a higher power and a lower false positive rate during variable selection than the compared methods. In empirical analysis, the proposed procedures yielding a sparse set of hepatitis B infection-relevant factors gave the best predictive performance and showed that the procedures were able to select a more stringent set of factors. The individual history of hepatitis B vaccination, family and individual history of hepatitis B infection were associated with hepatitis B infection in the studied residents according to the proposed procedures. Conclusions The newly proposed procedures improve the identification of significant variables and enable us to derive a new insight into epidemiological association analysis.
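A compact sketch of the bootstrap-ranking idea on synthetic data (not the authors' exact two-stage hybrid or bootstrap procedures): refit a LASSO on bootstrap resamples and rank variables by how often their coefficients stay non-zero, which damps the instability and false positives of a single fit.

```python
# Bootstrap ranking with a LASSO-type penalty on synthetic data.
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
n, p = 200, 20
X = rng.normal(size=(n, p))
y = 1.5 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(size=n)   # only 2 true predictors

n_boot = 100
selected = np.zeros(p)
for _ in range(n_boot):
    idx = rng.integers(0, n, size=n)                     # bootstrap resample
    model = LassoCV(cv=5).fit(X[idx], y[idx])
    selected += (model.coef_ != 0)

selection_freq = selected / n_boot
for j in np.argsort(selection_freq)[::-1][:5]:
    print(f"variable {j}: selected in {selection_freq[j]:.0%} of resamples")
```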

Journal ArticleDOI
TL;DR: In this paper, a simple permutation test is adapted to the needs of QCA users and adjusted the Type I error rate of the test to take into account the multiple hypothesis tests inherent in QCA.
Abstract: The various methodological techniques that fall under the umbrella description of qualitative comparative analysis (QCA) are increasingly popular for modeling causal complexity and necessary or sufficient conditions in medium-N settings. Because QCA methods are not designed as statistical techniques, however, there is no way to assess the probability that the patterns they uncover are the result of chance. Moreover, the implications of the multiple hypothesis tests inherent in these techniques for the false positive rate of the results are not widely understood. This article fills both gaps by tailoring a simple permutation test to the needs of QCA users and adjusting the Type I error rate of the test to take into account the multiple hypothesis tests inherent in QCA. An empirical application — a reexamination of a study of protest-movement success in the Arab Spring — highlights the need for such a test by showing that even very strong QCA results may plausibly be the result of chance.

Journal ArticleDOI
TL;DR: PCR shows moderate diagnostic accuracy when used as screening tests for IA in high-risk patient groups and could be used to trigger radiological and other investigations or for pre-emptive therapy in the absence of specific radiological signs when the clinical suspicion of infection is high.
Abstract: Background This is an update of the original review published in the Cochrane Database of Systematic Reviews Issue 10, 2015. Invasive aspergillosis (IA) is the most common life-threatening opportunistic invasive mould infection in immunocompromised people. Early diagnosis of IA and prompt administration of appropriate antifungal treatment are critical to the survival of people with IA. Antifungal drugs can be given as prophylaxis or empirical therapy, instigated on the basis of a diagnostic strategy (the pre-emptive approach) or for treating established disease. Consequently, there is an urgent need for research into both new diagnostic tools and drug treatment strategies. Increasingly, newer methods such as polymerase chain reaction (PCR) to detect fungal nucleic acids are being investigated. Objectives To provide an overall summary of the diagnostic accuracy of PCR-based tests on blood specimens for the diagnosis of IA in immunocompromised people. Search methods We searched MEDLINE (1946 to June 2015) and Embase (1980 to June 2015). We also searched LILACS, DARE, Health Technology Assessment, Web of Science and Scopus to June 2015. We checked the reference lists of all the studies identified by the above methods and contacted relevant authors and researchers in the field. For this review update we updated electronic searches of the Cochrane Central Register of Controlled Trials (CENTRAL; 2018, Issue 3) in the Cochrane Library; MEDLINE via Ovid (June 2015 to March week 2 2018); and Embase via Ovid (June 2015 to 2018 week 12). Selection criteria We included studies that: i) compared the results of blood PCR tests with the reference standard published by the European Organisation for Research and Treatment of Cancer/Mycoses Study Group (EORTC/MSG); ii) reported data on false-positive, true-positive, false-negative and true-negative results of the diagnostic tests under investigation separately; and iii) evaluated the test(s) prospectively in cohorts of people from a relevant clinical population, defined as a group of individuals at high risk for invasive aspergillosis. Case-control and retrospective studies were excluded from the analysis. Data collection and analysis Authors independently assessed quality and extracted data. For PCR assays, we evaluated the requirement for either one or two consecutive samples to be positive for diagnostic accuracy. We investigated heterogeneity by subgroup analyses. We plotted estimates of sensitivity and specificity from each study in receiver operating characteristics (ROC) space and constructed forest plots for visual examination of variation in test accuracy. We performed meta-analyses using the bivariate model to produce summary estimates of sensitivity and specificity. Main results We included 29 primary studies (18 from the original review and 11 from this update), corresponding to 34 data sets, published between 2000 and 2018 in the meta-analyses, with a mean prevalence of proven or probable IA of 16.3% (median prevalence 11.1%, range 2.5% to 57.1%). Most patients had received chemotherapy for haematological malignancy or had undergone hematopoietic stem cell transplantation. Several PCR techniques were used among the included studies. The sensitivity and specificity of PCR for the diagnosis of IA varied according to the interpretative criteria used to define a test as positive.
The summary estimates of sensitivity and specificity were 79.2% (95% confidence interval (CI) 71.0 to 85.5) and 79.6% (95% CI 69.9 to 86.6) for a single positive test result, and 59.6% (95% CI 40.7 to 76.0) and 95.1% (95% CI 87.0 to 98.2) for two consecutive positive test results. Authors' conclusions PCR shows moderate diagnostic accuracy when used as screening tests for IA in high-risk patient groups. Importantly, the sensitivity of the test confers a high negative predictive value (NPV) such that a negative test allows the diagnosis to be excluded. Consecutive positives show good specificity in diagnosis of IA and could be used to trigger radiological and other investigations or for pre-emptive therapy in the absence of specific radiological signs when the clinical suspicion of infection is high. When a single PCR positive test is used as the diagnostic criterion for IA in a population of 100 people with a disease prevalence of 16.3% (overall mean prevalence), three people with IA would be missed (sensitivity 79.2%, 20.8% false negatives), and 17 people would be unnecessarily treated or referred for further tests (specificity of 79.6%, 20.4% false positives). If we use the two positive test requirement in a population with the same disease prevalence, it would mean that nine IA people would be missed (sensitivity 59.6%, 40.4% false negatives) and four people would be unnecessarily treated or referred for further tests (specificity of 95.1%, 4.9% false positives). Like galactomannan, PCR has good NPV for excluding disease, but the low prevalence of disease limits the ability to rule in a diagnosis. As these biomarkers detect different markers of disease, combining them is likely to prove more useful.
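The single-positive-test numbers in the conclusions follow directly from applying the summary sensitivity and specificity to 100 screened people at the 16.3% mean prevalence, as this short check shows (the false-positive rate is 100% minus the specificity):

```python
# Worked check of the single-positive-test scenario in the conclusions.
population, prevalence = 100, 0.163
sensitivity, specificity = 0.792, 0.796

with_ia = population * prevalence                 # ~16 people with IA
without_ia = population - with_ia                 # ~84 people without IA

missed = with_ia * (1 - sensitivity)              # false negatives
overcalled = without_ia * (1 - specificity)       # false positives
print(f"missed: {missed:.1f} (~3 people); unnecessary follow-up: {overcalled:.1f} (~17 people)")
```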

Journal ArticleDOI
TL;DR: A model selection procedure based on Particle Swarm Optimization (PSO) is adopted for selecting the most discriminative textural features and for strengthening the generalization capacity of the supervised learning stage, which is based on a Support Vector Machine (SVM) classifier.

Journal ArticleDOI
TL;DR: An adaptive algorithm capable of analyzing an image and telling if it is dense or non-dense is proposed, along with a novel use of the micro-genetic algorithm to create a texture proximity mask and select the regions suspected of containing lesions.
Abstract: Segmentation of the breast separates the skin and the background so that only the breast region of the image is kept, with a good performance. High performance at the detection of the density of the breast. Efficient texture description method, based on the combination of Phylogenetic Trees, LBP and analysis in sub-regions. Adjustment of parameters according to the density classification of the breast. Breast cancer is the second commonest type of cancer in the world, and the commonest among women, corresponding to 22% of the new cases every year. This work presents a new computational methodology, which helps the specialists in the detection of breast masses based on the breast density. The proposed methodology is divided into stages with the objective of overcoming several difficulties associated with the detection of masses. In many of these stages, we brought contributions to the areas. The first stage is intended to detect the type of density of the breast, which can be either dense or non-dense. We proposed an adaptive algorithm capable of analyzing an image and telling if it is dense or non-dense. The second stage consists in the segmentation of the regions that look like masses. We propose a novel use of the micro-genetic algorithm to create a texture proximity mask and select the regions suspected of containing lesions. The next stage is the reduction of false positives, which were generated in the previous stage. To this end, we proposed two new approaches. The first reduction of false positives used DBSCAN and a proximity ranking of the textures extracted from the ROIs. In the second reduction of false positives, the resulting regions have their textures analyzed by the combination of Phylogenetic Trees, Local Binary Patterns and Support Vector Machines (SVM). A micro-genetic algorithm was used to choose the suspect regions that generate the best training models and maximize the classification of masses and non-masses used in the SVM. The best result produced a sensitivity of 92.99%, a rate of 0.15 false positives per image and an area under the FROC curve of 0.96 in the analysis of the non-dense breasts; and a sensitivity of 83.70%, a rate of 0.19 false positives per image and an area under the FROC curve of 0.85, in the analysis of the dense breasts.
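A stand-alone sketch of the LBP-plus-SVM texture step used for false-positive reduction follows (random patches stand in for mammography ROIs; the Phylogenetic Tree descriptors and micro-genetic parameter search are not reproduced here).

```python
# Minimal LBP-histogram + SVM texture classifier (placeholder data).
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.svm import SVC

def lbp_histogram(patch, points=8, radius=1):
    codes = local_binary_pattern(patch, points, radius, method="uniform")
    hist, _ = np.histogram(codes, bins=points + 2, range=(0, points + 2), density=True)
    return hist

rng = np.random.default_rng(0)
patches = (rng.random(size=(200, 64, 64)) * 255).astype(np.uint8)  # placeholder ROI patches
labels = rng.integers(0, 2, size=200)                              # 1 = mass, 0 = non-mass
features = np.array([lbp_histogram(p) for p in patches])

clf = SVC(kernel="rbf", probability=True).fit(features, labels)
print("P(mass) for first ROI:", clf.predict_proba(features[:1])[0, 1].round(3))
```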

Journal ArticleDOI
TL;DR: The risk of false positive HIV diagnosis in a tiebreaker algorithm is significant and this study recommends abandoning the tie-breaker algorithm in favour of WHO recommended serial or parallel algorithms, interpreting weakly reactive test lines as indeterminate results requiring further testing except in the setting of blood transfusion, and most importantly, adding a confirmation test to the RDT algorithm.
Abstract: In Ethiopia a tiebreaker algorithm using 3 rapid diagnostic tests (RDTs) in series is used to diagnose HIV. Discordant results between the first 2 RDTs are resolved by a third ‘tiebreaker’ RDT. Medecins Sans Frontieres uses an alternate serial algorithm of 2 RDTs followed by a confirmation test for all double positive RDT results. The primary objective was to compare the performance of the tiebreaker algorithm with a serial algorithm, and to evaluate the addition of a confirmation test to both algorithms. A secondary objective looked at the positive predictive value (PPV) of weakly reactive test lines. The study was conducted in two HIV testing sites in Ethiopia. Study participants were recruited sequentially until 200 positive samples were reached. Each sample was re-tested in the laboratory on the 3 RDTs and on a simple to use confirmation test, the Orgenics Immunocomb Combfirm® (OIC). The gold standard test was the Western Blot, with indeterminate results resolved by PCR testing. 2620 subjects were included with a HIV prevalence of 7.7%. Each of the 3 RDTs had an individual specificity of at least 99%. The serial algorithm with 2 RDTs had a single false positive result (1 out of 204) to give a PPV of 99.5% (95% CI 97.3%-100%). The tiebreaker algorithm resulted in 16 false positive results (PPV 92.7%, 95% CI: 88.4%-95.8%). Adding the OIC confirmation test to either algorithm eliminated the false positives. All the false positives had at least one weakly reactive test line in the algorithm. The PPV of weakly reacting RDTs was significantly lower than those with strongly positive test lines. The risk of false positive HIV diagnosis in a tiebreaker algorithm is significant. We recommend abandoning the tie-breaker algorithm in favour of WHO recommended serial or parallel algorithms, interpreting weakly reactive test lines as indeterminate results requiring further testing except in the setting of blood transfusion, and most importantly, adding a confirmation test to the RDT algorithm. It is now time to focus research efforts on how best to translate this knowledge into practice at the field level. Clinical Trial registration #: NCT01716299
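Why adding a confirmation test raises the PPV can be seen from a likelihood-ratio calculation; the per-test accuracies below are hypothetical, and the tests are assumed conditionally independent, which real RDTs (with their correlated weakly reactive false positives) are not, so the study's measured PPVs are lower.

```python
# Illustrative Bayes sketch at the study's 7.7% prevalence: each
# additional, conditionally independent positive test multiplies the
# pre-test odds by its positive likelihood ratio, pushing the post-test
# probability (the PPV) toward 1.

def posterior_after_positives(prevalence, tests):
    """tests = list of (sensitivity, specificity) for each positive result."""
    odds = prevalence / (1 - prevalence)
    for sens, spec in tests:
        odds *= sens / (1 - spec)                 # positive likelihood ratio
        print(f"after a positive (sens={sens}, spec={spec}): PPV = {odds / (1 + odds):.4f}")
    return odds / (1 + odds)

# Two RDTs followed by a confirmation assay (all accuracies assumed).
posterior_after_positives(0.077, [(0.99, 0.99), (0.99, 0.99), (0.99, 0.995)])
```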

Journal ArticleDOI
TL;DR: This analysis of 20 tumour and matched germline genomes from childhood acute lymphoblastic leukemia finds no significant evidence for integrations by known viruses and proposes a novel filtering approach that increases specificity without compromising sensitivity for virus/host chimera detection.
Abstract: Several pathogenic viruses such as hepatitis B and human immunodeficiency viruses may integrate into the host genome. These virus/host integrations are detectable using paired-end next generation sequencing. However, the low number of expected true virus integrations may be difficult to distinguish from the noise of many false positive candidates. Here, we propose a novel filtering approach that increases specificity without compromising sensitivity for virus/host chimera detection. Our detection pipeline termed Vy-PER (Virus integration detection bY Paired End Reads) outperforms existing similar tools in speed and accuracy. We analysed whole genome data from childhood acute lymphoblastic leukemia (ALL), which is characterised by genomic rearrangements and usually associated with radiation exposure. This analysis was motivated by the recently reported virus integrations at genomic rearrangement sites and association with chromosomal instability in liver cancer. However, as expected, our analysis of 20 tumour and matched germline genomes from ALL patients finds no significant evidence for integrations by known viruses. Nevertheless, our method eliminates 12,800 false positives per genome (80× coverage) and only our method detects singleton human-phiX174-chimeras caused by optical errors of the Illumina HiSeq platform. This high accuracy is useful for detecting low virus integration levels as well as non-integrated viruses.
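A minimal sketch of the first step of paired-end chimera detection (not Vy-PER's pipeline or its filters): flag read pairs whose mates map to a host chromosome and a viral contig in a combined reference. The "virus_" contig-name convention and the file name are assumptions chosen for illustration.

```python
# Flag candidate virus/host chimeric read pairs in a BAM aligned to a
# combined host+virus reference (illustrative; downstream filtering of
# false positives, as in Vy-PER, is not shown).
import pysam

def is_viral(contig):
    return contig is not None and contig.startswith("virus_")

def chimeric_pairs(bam_path, min_mapq=30):
    candidates = []
    with pysam.AlignmentFile(bam_path, "rb") as bam:
        for read in bam:
            if read.is_unmapped or read.mate_is_unmapped or read.mapping_quality < min_mapq:
                continue
            if is_viral(read.reference_name) != is_viral(read.next_reference_name):
                candidates.append((read.query_name, read.reference_name,
                                   read.next_reference_name))
    return candidates

# Example (hypothetical file name):
# for name, contig, mate_contig in chimeric_pairs("sample.bam"):
#     print(name, contig, mate_contig)
```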

Journal ArticleDOI
TL;DR: The choice of tools and parameters involved in variant calling can have a dramatic effect on the number of FP SNPs produced, with particularly poor combinations of software and/or parameter settings yielding tens of thousands in this experiment.
Abstract: Single Nucleotide Polymorphisms (SNPs) are widely used molecular markers, and their use has increased massively since the inception of Next Generation Sequencing (NGS) technologies, which allow detection of large numbers of SNPs at low cost. However, both NGS data and their analysis are error-prone, which can lead to the generation of false positive (FP) SNPs. We explored the relationship between FP SNPs and seven factors involved in mapping-based variant calling — quality of the reference sequence, read length, choice of mapper and variant caller, mapping stringency and filtering of SNPs by read mapping quality and read depth. This resulted in 576 possible factor level combinations. We used error- and variant-free simulated reads to ensure that every SNP found was indeed a false positive. The variation in the number of FP SNPs generated ranged from 0 to 36,621 for the 120 million base pairs (Mbp) genome. All of the experimental factors tested had statistically significant effects on the number of FP SNPs generated and there was a considerable amount of interaction between the different factors. Using a fragmented reference sequence led to a dramatic increase in the number of FP SNPs generated, as did relaxed read mapping and a lack of SNP filtering. The choice of reference assembler, mapper and variant caller also significantly affected the outcome. The effect of read length was more complex and suggests a possible interaction between mapping specificity and the potential for contributing more false positives as read length increases. The choice of tools and parameters involved in variant calling can have a dramatic effect on the number of FP SNPs produced, with particularly poor combinations of software and/or parameter settings yielding tens of thousands in this experiment. Between-factor interactions make simple recommendations difficult for a SNP discovery pipeline but the quality of the reference sequence is clearly of paramount importance. Our findings are also a stark reminder that it can be unwise to use the relaxed mismatch settings provided as defaults by some read mappers when reads are being mapped to a relatively unfinished reference sequence from e.g. a non-model organism in its early stages of genomic exploration.
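One of the filtering factors studied above, restricting SNPs by call quality and read depth, can be sketched with a plain pass over a VCF file; the thresholds are illustrative and the parsing follows only the standard VCF column layout, not any specific caller's extensions.

```python
# Minimal post-hoc SNP filter by QUAL and the INFO DP (depth) field.

def filter_vcf(path, min_qual=30.0, min_depth=10):
    kept = []
    with open(path) as vcf:
        for line in vcf:
            if line.startswith("#"):
                continue                              # skip header lines
            fields = line.rstrip("\n").split("\t")
            qual = float(fields[5]) if fields[5] != "." else 0.0
            info = dict(kv.split("=", 1) for kv in fields[7].split(";") if "=" in kv)
            depth = int(info.get("DP", 0))
            if qual >= min_qual and depth >= min_depth:
                kept.append(line)
    return kept

# Example (hypothetical file name):
# print(len(filter_vcf("candidate_snps.vcf")))
```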

Journal ArticleDOI
TL;DR: In this paper, a copula mixed model is proposed for bivariate meta-analysis of diagnostic test accuracy studies, which includes the generalized linear mixed model as a special case and can also operate on the original scale of sensitivity and specificity.
Abstract: Diagnostic test accuracy studies typically report the number of true positives, false positives, true negatives and false negatives. There usually exists a negative association between the number of true positives and true negatives, because studies that adopt less stringent criterion for declaring a test positive invoke higher sensitivities and lower specificities. A generalized linear mixed model (GLMM) is currently recommended to synthesize diagnostic test accuracy studies. We propose a copula mixed model for bivariate meta-analysis of diagnostic test accuracy studies. Our general model includes the GLMM as a special case and can also operate on the original scale of sensitivity and specificity. Summary receiver operating characteristic curves are deduced for the proposed model through quantile regression techniques and different characterizations of the bivariate random effects distribution. Our general methodology is demonstrated with an extensive simulation study and illustrated by re-analysing the data of two published meta-analyses. Our study suggests that there can be an improvement on GLMM in fit to data and makes the argument for moving to copula random effects models. Our modelling framework is implemented in the package CopulaREMADA within the open source statistical environment R.