
Showing papers on "False positive paradox published in 2021"


Journal ArticleDOI
13 Feb 2021-Forests
TL;DR: A novel ensemble learning method is proposed to detect forest fires in different scenarios: two individual learners, Yolov5 and EfficientDet, are integrated to perform fire detection, while a third learner, EfficientNet, learns global information to avoid false positives.
Abstract: Due to the various shapes, textures, and colors of fires, forest fire detection is a challenging task. The traditional image processing method relies heavily on man-made features and is not universally applicable to all forest scenarios. To solve this problem, deep learning technology is applied to learn and extract features of forest fires adaptively. However, the limited learning and perception ability of individual learners is not sufficient for them to perform well in complex tasks. Furthermore, learners tend to focus too much on local information, namely the ground truth, but ignore global information, which may lead to false positives. In this paper, a novel ensemble learning method is proposed to detect forest fires in different scenarios. Firstly, two individual learners, Yolov5 and EfficientDet, are integrated to accomplish the fire detection process. Secondly, another individual learner, EfficientNet, is responsible for learning global information to avoid false positives. Finally, detection results are made based on the decisions of the three learners. Experiments on our dataset show that the proposed method improves detection performance by 2.5% to 10.9% and decreases false positives by 51.3%, without any extra latency.
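As a rough illustration of the decision-level fusion described above, the sketch below shows how an image-level classifier can veto box-level detections; the function names, thresholds, and fusion rule are hypothetical and not the authors' exact logic.

```python
# Hypothetical decision fusion for ensemble fire detection: two detectors
# propose boxes, and an image-level classifier encoding global context can
# veto them (illustrative logic, not the paper's exact rule).

def fuse_detections(yolo_boxes, effdet_boxes, global_fire_prob,
                    det_thresh=0.4, global_thresh=0.5):
    """Each box is (x1, y1, x2, y2, score); returns the fused detections."""
    candidates = [b for b in yolo_boxes + effdet_boxes if b[4] >= det_thresh]
    # If the global learner sees no fire anywhere in the frame, local
    # detections are treated as likely false positives and suppressed.
    if global_fire_prob < global_thresh:
        return []
    return candidates

# A confident local detection is dropped when the global learner disagrees.
print(fuse_detections([(10, 10, 50, 50, 0.9)], [], global_fire_prob=0.2))  # []
```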

263 citations


Journal ArticleDOI
TL;DR: Building on earlier work showing the advantages of the Matthews correlation coefficient (MCC) over accuracy and the F1 score, the authors reaffirm that MCC is a robust metric that summarizes classifier performance in a single value when positive and negative cases are of equal importance.
Abstract: Evaluating binary classifications is a pivotal task in statistics and machine learning, because it can influence decisions in multiple areas, including for example prognosis or therapies of patients in critical conditions. The scientific community has not agreed on a general-purpose statistical indicator for evaluating two-class confusion matrices (having true positives, true negatives, false positives, and false negatives) yet, even if advantages of the Matthews correlation coefficient (MCC) over accuracy and F1 score have already been shown. In this manuscript, we reaffirm that MCC is a robust metric that summarizes the classifier performance in a single value, if positive and negative cases are of equal importance. We compare MCC to other metrics which value positive and negative cases equally: balanced accuracy (BA), bookmaker informedness (BM), and markedness (MK). We explain the mathematical relationships between MCC and these indicators, then show some use cases and a bioinformatics scenario where these metrics disagree and where MCC generates a more informative response. Additionally, we describe three exceptions where BM can be more appropriate: analyzing classifications where dataset prevalence is unrepresentative, comparing classifiers on different datasets, and assessing the random guessing level of a classifier. Except in these cases, we believe that MCC is the most informative among the single metrics discussed, and suggest it as a standard measure for scientists of all fields. A Matthews correlation coefficient close to +1, in fact, means having high values for all the other confusion matrix metrics. The same cannot be said for balanced accuracy, markedness, bookmaker informedness, accuracy, and F1 score.
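For reference, the metrics compared in this abstract follow standard textbook definitions; the minimal sketch below computes them from a single confusion matrix with made-up counts to show how balanced accuracy and MCC can diverge.

```python
# Standard definitions of the metrics compared in the paper, computed from a
# single confusion matrix (the example counts are invented for illustration).
import math

def confusion_metrics(tp, tn, fp, fn):
    tpr = tp / (tp + fn)                # sensitivity / recall
    tnr = tn / (tn + fp)                # specificity
    ppv = tp / (tp + fp)                # positive predictive value
    npv = tn / (tn + fn)                # negative predictive value
    ba = (tpr + tnr) / 2                # balanced accuracy
    bm = tpr + tnr - 1                  # bookmaker informedness
    mk = ppv + npv - 1                  # markedness
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {"BA": ba, "BM": bm, "MK": mk, "MCC": mcc}

# An imbalanced example: balanced accuracy is 0.90, but MCC is only about 0.61
# because MCC also penalizes the poor positive predictive value.
print(confusion_metrics(tp=90, tn=900, fp=100, fn=10))
```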

241 citations


Journal ArticleDOI
TL;DR: In this article, the authors briefly describe the PCR and antigen tests and then focus mainly on existing antibody tests and their limitations, including inaccuracies and possible causes of unreliability; false negatives in antibody immunoassays can arise from assay formats, selection of viral antigens and antibody types, diagnostic testing windows, individual variance, and fluctuation in antibody levels.
Abstract: COVID-19, caused by the SARS-CoV-2 virus, has developed into a global health crisis, causing over 2 million deaths and changing people's daily life the world over. Current main-stream diagnostic methods in the laboratory include nucleic acid PCR tests and direct viral antigen tests for detecting active infections, and indirect human antibody tests specific to SARS-CoV-2 to detect prior exposure. In this Perspective, we briefly describe the PCR and antigen tests and then focus mainly on existing antibody tests and their limitations, including inaccuracies and possible causes of unreliability. False negatives in antibody immunoassays can arise from assay formats, selection of viral antigens and antibody types, diagnostic testing windows, individual variance, and fluctuation in antibody levels. Reasons for false positives in antibody immunoassays mainly involve antibody cross-reactivity from other viruses, as well as autoimmune disease. The spectrum bias has an effect on both the false negatives and false positives. For assay developers, not only improvement of assay formats but also selection of viral antigens and isotypes of human antibodies need to be carefully considered to improve sensitivity and specificity. For clinicians, the factors influencing the accuracy of assays must be kept in mind to test patients using currently imperfect but available tests with smart tactics and realistic interpretation of the test results.

117 citations


ReportDOI
TL;DR: This work evaluates different automated methods for record linkage, performing a series of comparisons across methods and against hand linking, and concludes that automated methods perform well.
Abstract: The recent digitization of complete count census data is an extraordinary opportunity for social scientists to create large longitudinal datasets by linking individuals from one census to another or from other sources to the census. We evaluate different automated methods for record linkage, performing a series of comparisons across methods and against hand linking. We have three main findings that lead us to conclude that automated methods perform well. First, a number of automated methods generate very low (less than 5 percent) false positive rates. The automated methods trace out a frontier illustrating the trade-off between the false positive rate and the (true) match rate. Relative to more conservative automated algorithms, humans tend to link more observations but at a cost of higher rates of false positives. Second, when human linkers and algorithms use the same linking variables, there is relatively little disagreement between them. Third, across a number of plausible analyses, coefficient estimates and parameters of interest are very similar when using linked samples based on each of the different automated methods. We provide code and Stata commands to implement the various automated methods.
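A toy sketch of the frontier described above, assuming invented similarity scores and ground-truth links: raising the acceptance threshold lowers the share of false links at the cost of the match rate.

```python
# Illustrative trade-off between match rate and false positives when linking
# records by a similarity score (scores and true links are invented; this is
# not any of the authors' algorithms).
candidate_pairs = [  # (similarity score, is this pair truly the same person?)
    (0.99, True), (0.97, True), (0.93, True), (0.90, False),
    (0.85, True), (0.82, False), (0.75, True), (0.60, False),
]
total_true_links = sum(1 for _, ok in candidate_pairs if ok)

for threshold in (0.95, 0.88, 0.70):
    accepted = [p for p in candidate_pairs if p[0] >= threshold]
    true_links = sum(1 for _, ok in accepted if ok)
    false_links = len(accepted) - true_links
    match_rate = true_links / total_true_links
    fp_share = false_links / len(accepted) if accepted else 0.0
    print(f"threshold={threshold:.2f}  match rate={match_rate:.2f}  "
          f"false positive share={fp_share:.2f}")
```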

113 citations


Journal ArticleDOI
TL;DR: The evaluated AI system can correctly identify a proportion of a screening population as cancer-free and also reduce false positives, indicating that AI has the potential to improve mammography screening efficiency.
Abstract: To evaluate the potential of artificial intelligence (AI) to identify normal mammograms in a screening population. In this retrospective study, 9581 double-read mammography screening exams including 68 screen-detected cancers and 187 false positives, a subcohort of the prospective population-based Malmo Breast Tomosynthesis Screening Trial, were analysed with a deep learning–based AI system. The AI system categorises mammograms with a cancer risk score increasing from 1 to 10. The effect on cancer detection and false positives of excluding mammograms below different AI risk thresholds from reading by radiologists was investigated. A panel of three breast radiologists assessed the radiographic appearance, type, and visibility of screen-detected cancers assigned low-risk scores (≤ 5). The reduction of normal exams, cancers, and false positives for the different thresholds was presented with 95% confidence intervals (CI). If mammograms scored 1 and 2 were excluded from screen-reading, 1829 (19.1%; 95% CI 18.3–19.9) exams could be removed, including 10 (5.3%; 95% CI 2.1–8.6) false positives but no cancers. In total, 5082 (53.0%; 95% CI 52.0–54.0) exams, including 7 (10.3%; 95% CI 3.1–17.5) cancers and 52 (27.8%; 95% CI 21.4–34.2) false positives, had low-risk scores. All, except one, of the seven screen-detected cancers with low-risk scores were judged to be clearly visible. The evaluated AI system can correctly identify a proportion of a screening population as cancer-free and also reduce false positives. Thus, AI has the potential to improve mammography screening efficiency. • Retrospective study showed that AI can identify a proportion of mammograms as normal in a screening population. • Excluding normal exams from screening using AI can reduce false positives.

61 citations


Journal ArticleDOI
TL;DR: In this paper, the authors show that the RT-LAMP reaction has a sensitivity of only 200 RNA virus copies, with a color change from pink to yellow occurring in 100% of the 62 clinical samples tested positive by RT-qPCR.
Abstract: The use of RT-LAMP (reverse transcriptase—loop mediated isothermal amplification) has been considered as a promising point-of-care method to diagnose COVID-19. In this manuscript we show that the RT-LAMP reaction has a sensitivity of only 200 RNA virus copies, with a color change from pink to yellow occurring in 100% of the 62 clinical samples tested positive by RT-qPCR. We also demonstrate that this reaction is 100% specific for SARS-CoV-2 after testing 57 clinical samples infected with dozens of different respiratory viruses and 74 individuals without any viral infection. Although the majority of manuscripts recently published using this technique describe only the presence of two color states (pink = negative and yellow = positive), we verified by naked eye and absorbance measurements that there is an evident third color cluster (orange), in general related to positive samples with low viral loads, which cannot be defined as positive or negative by the naked eye. Orange samples should be repeated or tested by RT-qPCR to avoid a false diagnosis. RT-LAMP is therefore very reliable for samples with an RT-qPCR Ct < 30, being as sensitive and specific as an RT-qPCR test. All reactions were performed in 30 min at 65 °C. The use of reaction times longer than 30 min is not recommended since nonspecific amplifications may cause false positives.

47 citations


Journal ArticleDOI
TL;DR: Nucleic acid amplification testing for SARS-CoV-2 is highly specific; nevertheless, when prevalence is low, a significant proportion of initially positive results fail to confirm, and confirmatory testing substantially reduces false positive detections.

45 citations


Journal ArticleDOI
TL;DR: It is demonstrated how a more principled approach to data collection and model design, based on realistic settings of vulnerability prediction, can lead to better solutions.
Abstract: Automated detection of software vulnerabilities is a fundamental problem in software security. Existing program analysis techniques either suffer from high false positives or false negatives. Recent progress in Deep Learning (DL) has resulted in a surge of interest in applying DL for automated vulnerability detection. Several recent studies have demonstrated promising results achieving an accuracy of up to 95% at detecting vulnerabilities. In this paper, we ask, "how well do the state-of-the-art DL-based techniques perform in a real-world vulnerability prediction scenario". To our surprise, we find that their performance drops by more than 50%. A systematic investigation of what causes such precipitous performance drop reveals that existing DL-based vulnerability prediction approaches suffer from challenges with the training data (e.g., data duplication, unrealistic distribution of vulnerable classes, etc.) and with the model choices (e.g., simple token-based models). As a result, these approaches often do not learn features related to the actual cause of the vulnerabilities. Instead, they learn unrelated artifacts from the dataset (e.g., specific variable/function names, etc.). Leveraging these empirical findings, we demonstrate how a more principled approach to data collection and model design, based on realistic settings of vulnerability prediction, can lead to better solutions. The resulting tools perform significantly better than the studied baseline up to 33.57% boost in precision and 128.38% boost in recall compared to the best performing model in the literature. Overall, this paper elucidates existing DL-based vulnerability prediction systems' potential issues and draws a roadmap for future DL-based vulnerability prediction research. In that spirit, we make available all the artifacts supporting our results: https://git.io/Jf6IA.

45 citations


Journal ArticleDOI
09 Jun 2021-PLOS ONE
TL;DR: This paper provides a much-needed practical synthesis of basic statistical concepts regarding multiple hypothesis testing, in comprehensible language with well-illustrated examples, together with an easy-to-follow guide for selecting the most suitable correction technique.
Abstract: Scientists from nearly all disciplines face the problem of simultaneously evaluating many hypotheses. Conducting multiple comparisons increases the likelihood that a non-negligible proportion of associations will be false positives, clouding real discoveries. Drawing valid conclusions requires taking into account the number of performed statistical tests and adjusting the statistical confidence measures. Several strategies exist to overcome the problem of multiple hypothesis testing. We aim to summarize critical statistical concepts and widely used correction approaches while also drawing attention to frequently misinterpreted notions of statistical inference. We provide a step-by-step description of each multiple-testing correction method with clear examples and present an easy-to-follow guide for selecting the most suitable correction technique. To facilitate multiple-testing corrections, we developed a fully automated solution not requiring programming skills or the use of a command line. Our registration-free online tool is available at www.multipletesting.com and compiles the five most frequently used adjustment tools, including the Bonferroni, Holm (step-down), and Hochberg (step-up) corrections, and allows calculation of False Discovery Rates (FDR) and q-values. The current summary provides a much-needed practical synthesis of basic statistical concepts regarding multiple hypothesis testing in comprehensible language with well-illustrated examples. The web tool will fill the gap for life science researchers by providing a user-friendly substitute for command-line alternatives.
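For context, here are minimal reference implementations of three of the adjustments named above (Bonferroni, Holm step-down, and Benjamini-Hochberg, the latter standing in for the FDR/q-value calculation); the p-values are illustrative only, and this is not the code behind www.multipletesting.com.

```python
# Minimal reference implementations of three common multiple-testing
# adjustments; input p-values are illustrative.
def bonferroni(pvals):
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]

def holm(pvals):
    # Step-down: multiply the i-th smallest p-value by (m - i + 1) and
    # enforce monotonicity from the smallest p-value upward.
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted, running_max = [0.0] * m, 0.0
    for rank, i in enumerate(order):
        running_max = max(running_max, (m - rank) * pvals[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted

def benjamini_hochberg(pvals):
    # Step-up FDR: multiply the i-th smallest p-value by m/i and enforce
    # monotonicity from the largest p-value downward (yields q-values).
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)
    adjusted, running_min = [0.0] * m, 1.0
    for rank, i in enumerate(order):
        running_min = min(running_min, pvals[i] * m / (m - rank))
        adjusted[i] = running_min
    return adjusted

pvals = [0.001, 0.008, 0.039, 0.041, 0.20]
print("Bonferroni:", bonferroni(pvals))
print("Holm      :", holm(pvals))
print("BH (FDR)  :", benjamini_hochberg(pvals))
```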

45 citations


Journal ArticleDOI
TL;DR: In this article, the authors compare the MCC with the diagnostic odds ratio (DOR), a statistical rate employed sometimes in biomedical sciences, and describe the relationships between them, by also taking advantage of an innovative geometrical plot called confusion tetrahedron.
Abstract: To assess the quality of a binary classification, researchers often take advantage of a four-entry contingency table called a confusion matrix, containing true positives, true negatives, false positives, and false negatives. To recap the four values of a confusion matrix in a unique score, researchers and statisticians have developed several rates and metrics. In the past, several scientific studies have already shown why the Matthews correlation coefficient (MCC) is more informative and trustworthy than confusion-entropy error, accuracy, F1 score, bookmaker informedness, markedness, and balanced accuracy. In this study, we compare the MCC with the diagnostic odds ratio (DOR), a statistical rate sometimes employed in the biomedical sciences. After examining the properties of the MCC and of the DOR, we describe the relationships between them, also taking advantage of an innovative geometrical plot called the confusion tetrahedron, presented here for the first time. We then report some use cases where the MCC and the DOR produce discordant outcomes, and explain why the Matthews correlation coefficient is the more informative and reliable of the two. Our results can have a strong impact in computer science and statistics, because they clearly explain why the trustworthiness of the information provided by the Matthews correlation coefficient is higher than that generated by the diagnostic odds ratio.
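Both scores have standard closed-form definitions; the sketch below computes them for two invented confusion matrices to illustrate the kind of discordant outcome the abstract refers to (the matrices are not taken from the paper).

```python
# Standard formulas for the two scores compared in the paper; the confusion
# matrices below are invented to illustrate a discordant case.
import math

def dor(tp, tn, fp, fn):
    return (tp * tn) / (fp * fn)

def mcc(tp, tn, fp, fn):
    return (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))

# Matrix A: balanced, solid performance in every cell.
print("A:", dor(tp=80, tn=80, fp=20, fn=20), mcc(80, 80, 20, 20))   # DOR 16,  MCC 0.60
# Matrix B: a much higher DOR, although half of the four positives are missed.
print("B:", dor(tp=2, tn=195, fp=1, fn=2), mcc(2, 195, 1, 2))       # DOR 195, MCC ~0.57
```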

44 citations


Journal ArticleDOI
TL;DR: It is demonstrated how improved situational awareness can help reduce false positives in intrusion detection, and how fusing features from the cyber, security, and physical domains improves detection accuracy.
Abstract: Modern power systems equipped with advanced communication infrastructure are cyber-physical in nature. The traditional approach of leveraging physical measurements for detecting cyber-induced physical contingencies is insufficient to reflect the accurate cyber-physical states. Moreover, deploying conventional rule-based and anomaly-based intrusion detection systems for cyberattack detection results in higher false positives. Hence, using cyberattack detection tools independently on the cyber and physical sides has limited capability. In this work, a mechanism is developed to fuse real-time data from the cyber and physical domains in order to improve situational awareness of the whole system. It is demonstrated how improved situational awareness can help reduce false positives in intrusion detection. This cyber and physical data fusion results in a cyber-physical state space explosion, which is addressed using different feature transformation and selection techniques. Our fusion engine is further integrated into a cyber-physical power system testbed as an application that collects cyber and power system telemetry from multiple sensors emulating real-world data sources found in a utility. These are synthesized into features for algorithms to detect cyber intrusions. Results are presented using the proposed data fusion application to infer False Data Injection and False Command Injection (FDI and FCI)-based Man-in-The-Middle attacks. Post collection, the data fusion application uses a time-synchronized merge and extracts features. This is followed by pre-processing such as imputation, categorical encoding, and feature reduction, before training supervised, semi-supervised, and unsupervised learning models to evaluate the performance of the intrusion detection system. A major finding is the improvement of detection accuracy by fusion of features from the cyber, security, and physical domains. Additionally, it is observed that the semi-supervised co-training technique performs on par with supervised learning methods when using the proposed feature vector. The approach and toolset, as well as the dataset that is generated, can be utilized to prevent threats such as false data or command injection attacks from being carried out, by identifying cyber intrusions accurately.
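A minimal sketch of the time-synchronized merge step mentioned above, assuming two hypothetical telemetry streams; the column names, tolerance, and sensor values are illustrative and not the paper's actual schema.

```python
# Time-synchronized merge of (hypothetical) physical and cyber telemetry into
# one fused feature vector per timestamp, using pandas merge_asof.
import pandas as pd

physical = pd.DataFrame({
    "timestamp": pd.to_datetime(["2021-01-01 00:00:00", "2021-01-01 00:00:02"]),
    "bus_voltage": [1.02, 0.97],
    "breaker_status": [1, 0],
})
cyber = pd.DataFrame({
    "timestamp": pd.to_datetime(["2021-01-01 00:00:01", "2021-01-01 00:00:02"]),
    "packet_rate": [130, 950],
    "failed_logins": [0, 4],
})

# Align each physical sample with the nearest cyber sample within a 2-second
# window; downstream models then train on the fused rows.
fused = pd.merge_asof(physical.sort_values("timestamp"),
                      cyber.sort_values("timestamp"),
                      on="timestamp", direction="nearest",
                      tolerance=pd.Timedelta("2s"))
print(fused)
```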

Journal ArticleDOI
25 Jun 2021-Sensors
TL;DR: A method to find the optimum performance point of a model as a basis for fairer comparison and deeper insights into the trade-offs caused by selecting a confidence score threshold is proposed.
Abstract: When deploying a model for object detection, a confidence score threshold is chosen to filter out false positives and ensure that a predicted bounding box has a certain minimum score. To achieve state-of-the-art performance on benchmark datasets, most neural networks use a rather low threshold, as a high number of false positives is not penalized by standard evaluation metrics. However, in Artificial Intelligence (AI) application scenarios that require high confidence scores (e.g., because of legal requirements or because the consequences of incorrect detections are severe) or a certain level of model robustness, it is unclear which base model to use, since they were mainly optimized for benchmark scores. In this paper, we propose a method to find the optimum performance point of a model as a basis for fairer comparison and deeper insights into the trade-offs caused by selecting a confidence score threshold.
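One simple way to realize such an operating-point search is to sweep the confidence threshold and keep the value that maximizes a chosen criterion (F1 here); the detections are invented and the paper's exact criterion may differ.

```python
# Sweep the confidence score threshold and report the F1-maximizing operating
# point (illustrative detections; one possible notion of an "optimum point").
detections = [  # (confidence score, matches a ground-truth object?)
    (0.95, True), (0.90, True), (0.80, False), (0.75, True),
    (0.60, False), (0.55, True), (0.40, False), (0.30, False),
]
num_ground_truth = 5  # total annotated objects in this toy dataset

best = None
for threshold in sorted({score for score, _ in detections}, reverse=True):
    kept = [d for d in detections if d[0] >= threshold]
    tp = sum(1 for _, ok in kept if ok)
    precision = tp / len(kept)
    recall = tp / num_ground_truth
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    if best is None or f1 > best[0]:
        best = (f1, threshold, precision, recall)

f1, threshold, precision, recall = best
print(f"best threshold={threshold:.2f}  F1={f1:.2f}  "
      f"precision={precision:.2f}  recall={recall:.2f}")
```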

Journal ArticleDOI
TL;DR: In this paper, a hyper-model of Convolutional Neural Networks and Long Short-Term Memory, called CONV-LSTM, was proposed to automatically extract features from raw MOOC data and predict whether each student will drop out of or complete courses.

Journal ArticleDOI
TL;DR: The proposed strategy of minimizing false negatives in conservative estimation achieves competitive performance both in terms of model-based and model-free indicators.
Abstract: We consider the problem of estimating the set of all inputs that leads a system to some particular behavior. The system is modeled by an expensive-to-evaluate function, such as a computer experiment, and we are interested in its excursion set, i.e. the set of points where the function takes values above or below some prescribed threshold. The objective function is emulated with a Gaussian Process (GP) model based on an initial design of experiments enriched with evaluation results at (batch-)sequentially determined input points. The GP model provides conservative estimates for the excursion set, which control false positives while minimizing false negatives. We introduce adaptive strategies that sequentially select new evaluations of the function by reducing the uncertainty on conservative estimates. Following the Stepwise Uncertainty Reduction approach we obtain new evaluations by minimizing adapted criteria. Tractable formulae for the conservative criteria are derived, which allow more convenient optimization. The method is benchmarked on random functions generated under the model assumptions in different scenarios of noise and batch size. We then apply it to a reliability engineering test case. Overall, the proposed strategy of minimizing false negatives in conservative estimation achieves competitive performance both in terms of model-based and model-free indicators.

Journal ArticleDOI
TL;DR: In this paper, the authors present a deep learning-based vulnerability detector that can simultaneously achieve a high detection capability and a high locating precision, dubbed Vulnerability Deep Learning-based Locator (VulDeeLocator).
Abstract: Automatically detecting software vulnerabilities is an important problem that has attracted much attention from the academic research community. However, existing vulnerability detectors still cannot achieve the vulnerability detection capability and the locating precision that would warrant their adoption for real-world use. In this paper, we present a vulnerability detector that can simultaneously achieve a high detection capability and a high locating precision, dubbed Vulnerability Deep Learning-based Locator (VulDeeLocator). In the course of designing VulDeeLocator, we encounter difficulties including how to accommodate semantic relations between the definitions of types as well as macros and their uses across files, how to accommodate accurate control flows and variable define-use relations, and how to achieve high locating precision. We solve these difficulties by using two innovative ideas: (i) leveraging intermediate code to accommodate extra semantic information, and (ii) using the notion of granularity refinement to pin down locations of vulnerabilities. When applied to 200 files randomly selected from three real-world software products, VulDeeLocator detects 18 confirmed vulnerabilities (i.e., true positives). Among them, 16 vulnerabilities correspond to known vulnerabilities; the other two are not reported in the National Vulnerability Database (NVD) but have been silently patched by the vendor of Libav when releasing newer versions.

Journal ArticleDOI
TL;DR: Li et al. as mentioned in this paper proposed a binarized detection learning method (BiDet) for efficient object detection, where the amount of information in the high-level feature maps is constrained and the mutual information between the feature maps and object detection is maximized.
Abstract: In this paper, we propose a binarized detection learning method (BiDet) for efficient object detection. Conventional network binarization methods directly quantize the weights and activations in one-stage or two-stage detectors with constrained representational capacity, so that the information redundancy in the networks causes numerous false positives and degrades the performance significantly. Specifically, we generalize the information bottleneck (IB) principle to object detection, where the amount of information in the high-level feature maps is constrained and the mutual information between the feature maps and object detection is maximized. Meanwhile, we learn sparse object priors so that the posteriors are concentrated on informative detection prediction with false positive elimination. We further present binary neural networks with automatic information compression (AutoBiDet) to automatically adjust the IB trade-off for each input according to the amount of contained information. Moreover, we further propose the class-aware sparse object priors by assigning different sparsity to objects in various classes, so that the false positives are alleviated more effectively without recall decrease. Extensive experiments on the PASCAL VOC and COCO datasets show that our BiDet and AutoBiDet outperform the state-of-the-art binarized object detectors by a sizable margin.

Journal ArticleDOI
TL;DR: The proposed model manages to decrease the proportion of false positives and increase accuracy when compared to the rule-based system; by reducing the false positive rate, it also significantly decreases the company's costs for investigating suspicious customers.
Abstract: This study proposes a comprehensive model that helps improve self-comparisons and group-comparisons for customers to detect suspicious transactions related to money laundering (ML) and terrorism financing (FT) in financial systems. The self-comparison is improved by establishing a more comprehensive know your customer (KYC) policy, adding non-transactional characteristics to obtain a set of variables that can be classified into four categories: inherent, product, transactional, and geographic. The group-comparison involving the clustering process is improved by using an innovative transaction abnormality indicator, based on the variance of the variables. To illustrate the way this methodology works, random samples were extracted from the data warehouse of an important financial institution in Mexico. To train the algorithms, 26,751 and 3527 transactions and their features, involving natural and legal persons, respectively, were selected randomly from January 2020. To measure the prediction accuracy, test sets of 1000 and 600 transactions were selected randomly for natural and legal persons, respectively, from February 2020. The proposed model manages to decrease the proportion of false positives and increase accuracy when compared to the rule-based system. On reducing the false positive rate, the company’s costs for investigating suspicious customers also decrease significantly.
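The abstract does not spell out the abnormality indicator, so the sketch below shows one plausible variance-based formulation (a Mahalanobis-like distance with diagonal covariance) purely for illustration; it is not the authors' exact indicator.

```python
# One illustrative way to score how abnormal a transaction is relative to its
# peer cluster using per-variable variance (not the authors' exact indicator).
import numpy as np

def abnormality(transaction, cluster):
    """Sum of squared deviations from the cluster mean, each scaled by the
    cluster variance of that variable."""
    cluster = np.asarray(cluster, dtype=float)
    mean = cluster.mean(axis=0)
    var = cluster.var(axis=0) + 1e-9  # avoid division by zero
    return float((((np.asarray(transaction, dtype=float) - mean) ** 2) / var).sum())

# Peer cluster: (amount in USD, transactions this month, distinct countries)
peers = [[200, 4, 1], [250, 5, 1], [180, 3, 1], [220, 4, 1]]
print(abnormality([210, 4, 1], peers))    # small score: looks typical
print(abnormality([9000, 25, 4], peers))  # large score: flag for review
```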

Journal ArticleDOI
TL;DR: It is demonstrated that circulating miRNA pairs could bring more benefit to early PCa diagnosis in clinical practice, with the proposed model proving superior to a recent 2-miRNA model that produced 18 false positives and 80 false negatives.
Abstract: The accuracy of prostate-specific antigen or clinical examination in prostate cancer (PCa) screening is in question, and circulating microRNAs (miRNAs) can be alternatives to PCa diagnosis. However, recent circulating miRNA biomarkers either are identified upon small sample sizes or cannot have robust diagnostic performance in every aspect of performance indicators. These may decrease applicability of potential biomarkers for the early detection of PCa. We reviewed recent studies on blood-derived miRNAs for prostate cancer diagnosis and carried out a large case study to understand whether circulating miRNA pairs, rather than single circulating miRNAs, could contribute to a more robust diagnostic model to significantly improve PCa diagnosis. We used 1231 high-throughput miRNA-profiled serum samples from two cohorts to design and verify a model based on class separability miRNA pairs (cs-miRPs). The pairwise model was composed of five circulating miRNAs coupled to miR-5100 and miR-1290 (i.e. five miRNA pairs, 5-cs-miRPs), reaching approximately 99% diagnostic performance in almost all indicators (sensitivity = 98.96%, specificity = 100%, accuracy = 99.17%, PPV = 100%, NPV = 96.15%) shown by a test set (n = 484: PCa = 384, negative prostate biopsies = 100). The nearly 99% diagnostic performance was also verified by an additional validation set (n = 140: PCa = 40, healthy controls = 100). Overall, the 5-cs-miRP model had 1 false positive and 7 false negatives among the 1231 serum samples and was superior to a recent 2-miRNA model (so far the best for PCa diagnosis) with 18 false positives and 80 false negatives. The present large case study demonstrated that circulating miRNA pairs could potentially bring more benefits to PCa early diagnosis for clinical practice.

Journal ArticleDOI
25 Mar 2021-PLOS ONE
TL;DR: In this article, a simulated data set incorporating actual community prevalence and test performance characteristics is coupled to a susceptible, infectious, removed (SIR) compartmental model to model the impact of base and tunable variables, including test sensitivity, testing frequency, results lag, sample pooling, disease prevalence, externally-acquired infections, symptom checking, and test cost, on outcomes including case reduction and false positives.
Abstract: BACKGROUND: COVID-19 test sensitivity and specificity have been widely examined and discussed, yet optimal use of these tests will depend on the goals of testing, the population or setting, and the anticipated underlying disease prevalence. We model various combinations of key variables to identify and compare a range of effective and practical surveillance strategies for schools and businesses. METHODS: We coupled a simulated data set incorporating actual community prevalence and test performance characteristics to a susceptible, infectious, removed (SIR) compartmental model, modeling the impact of base and tunable variables including test sensitivity, testing frequency, results lag, sample pooling, disease prevalence, externally-acquired infections, symptom checking, and test cost on outcomes including case reduction and false positives. FINDINGS: Increasing testing frequency was associated with a non-linear positive effect on cases averted over 100 days. While precise reductions in cumulative number of infections depended on community disease prevalence, testing every 3 days versus every 14 days (even with a lower sensitivity test) reduces the disease burden substantially. Pooling provided cost savings and made a high-frequency approach practical; one high-performing strategy, testing every 3 days, yielded per person per day costs as low as $1.32. INTERPRETATION: A range of practically viable testing strategies emerged for schools and businesses. Key characteristics of these strategies include high frequency testing with a moderate or high sensitivity test and minimal results delay. Sample pooling allowed for operational efficiency and cost savings with minimal loss of model performance.
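To make the modelling idea concrete, the sketch below couples a toy discrete-time SIR model with periodic screening that removes a fraction of the infectious pool; every parameter value is illustrative and none is taken from the study.

```python
# Toy discrete-time SIR model with routine screening: on testing days (offset
# by the results lag), a fraction of infectious people is detected and
# isolated. All parameter values are illustrative, not the study's.
def sir_with_testing(days=100, n=1000, beta=0.25, gamma=0.1,
                     test_every=3, sensitivity=0.85, results_lag=1):
    s, i, r = n - 5.0, 5.0, 0.0
    cumulative = 0.0
    for day in range(days):
        new_infections = beta * s * i / n
        recoveries = gamma * i
        detected = sensitivity * i if day % test_every == results_lag else 0.0
        s -= new_infections
        i = max(i + new_infections - recoveries - detected, 0.0)
        r += recoveries + detected
        cumulative += new_infections
    return cumulative

# More frequent screening lowers the cumulative number of infections.
for freq in (3, 7, 14):
    print(f"testing every {freq:2d} days -> cumulative infections over "
          f"100 days: {sir_with_testing(test_every=freq):.0f}")
```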

Journal ArticleDOI
16 Feb 2021-BMJ
TL;DR: In this article, the authors evaluated the suitability of SNP chips for detecting rare pathogenic variants in a clinically unselected population and concluded that they are extremely unreliable for genotyping very rare variants and should not be used to guide health decisions without validation.
Abstract: Objective To determine whether the sensitivity and specificity of SNP chips are adequate for detecting rare pathogenic variants in a clinically unselected population. Design Retrospective, population based diagnostic evaluation. Participants 49 908 people recruited to the UK Biobank with SNP chip and next generation sequencing data, and an additional 21 people who purchased consumer genetic tests and shared their data online via the Personal Genome Project. Main outcome measures Genotyping (that is, identification of the correct DNA base at a specific genomic location) using SNP chips versus sequencing, with results split by frequency of that genotype in the population. Rare pathogenic variants in the BRCA1 and BRCA2 genes were selected as an exemplar for detailed analysis of clinically actionable variants in the UK Biobank, and BRCA related cancers (breast, ovarian, prostate, and pancreatic) were assessed in participants through use of cancer registry data. Results Overall, genotyping using SNP chips performed well compared with sequencing; sensitivity, specificity, positive predictive value, and negative predictive value were all above 99% for 108 574 common variants directly genotyped on the SNP chips and sequenced in the UK Biobank. However, the likelihood of a true positive result decreased dramatically with decreasing variant frequency; for variants that are very rare in the population, with a frequency below 0.001% in UK Biobank, the positive predictive value was very low and only 16% of 4757 heterozygous genotypes from the SNP chips were confirmed with sequencing data. Results were similar for SNP chip data from the Personal Genome Project, and 20/21 individuals analysed had at least one false positive rare pathogenic variant that had been incorrectly genotyped. For pathogenic variants in the BRCA1 and BRCA2 genes, which are individually very rare, the overall performance metrics for the SNP chips versus sequencing in the UK Biobank were: sensitivity 34.6%, specificity 98.3%, positive predictive value 4.2%, and negative predictive value 99.9%. Rates of BRCA related cancers in UK Biobank participants with a positive SNP chip result were similar to those for age matched controls (odds ratio 1.31, 95% confidence interval 0.99 to 1.71) because the vast majority of variants were false positives, whereas sequence positive participants had a significantly increased risk (odds ratio 4.05, 2.72 to 6.03). Conclusions SNP chips are extremely unreliable for genotyping very rare pathogenic variants and should not be used to guide health decisions without validation.
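The collapse of the positive predictive value for rare variants is a textbook instance of the false positive paradox; the short calculation below shows the effect with illustrative sensitivity and specificity values (not the study's exact figures).

```python
# Even a highly specific assay yields mostly false positives when the variant
# is rare enough. Sensitivity/specificity values are illustrative only.
def positive_predictive_value(sensitivity, specificity, prevalence):
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

for prevalence in (1e-2, 1e-4, 1e-6):
    ppv = positive_predictive_value(sensitivity=0.99, specificity=0.999,
                                    prevalence=prevalence)
    print(f"carrier frequency {prevalence:.0e} -> PPV = {ppv:.1%}")
# PPV falls from roughly 91% at 1% prevalence to well under 1% at one in a
# million, even though sensitivity and specificity never change.
```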

Journal ArticleDOI
TL;DR: Wang et al. as mentioned in this paper developed a deep learning-based system named ENDOANGEL-LD (lesion detection) to assist in detecting all focal gastric lesions and predicting neoplasms by WLE.

Journal ArticleDOI
TL;DR: In this article, the authors explored the impact of using the Panbio SARS-CoV-2 assay with conditions falling outside manufacturer recommendations, and demonstrated that the kit buffer's pH, ionic strength, and buffering capacity were critical components to ensure proper kit function and avoid generation of false-positive results.
Abstract: Antigen-based rapid diagnostics tests (Ag-RDTs) are useful tools for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) detection. However, misleading demonstrations of the Abbott Panbio coronavirus disease 2019 (COVID-19) Ag-RDT on social media claimed that SARS-CoV-2 antigen could be detected in municipal water and food products. To offer a scientific rebuttal to pandemic misinformation and disinformation, this study explored the impact of using the Panbio SARS-CoV-2 assay with conditions falling outside manufacturer recommendations. Using Panbio, various water and food products, laboratory buffers, and SARS-CoV-2-negative clinical specimens were tested with and without manufacturer buffer. Additional experiments were conducted to assess the role of each Panbio buffer component (tricine, NaCl, pH, and Tween 20) as well as the impact of temperature (4°C, 20°C, and 45°C) and humidity (90%) on assay performance. Direct sample testing (without the kit buffer) resulted in false-positive signals resembling those obtained with SARS-CoV-2 positive controls tested under proper conditions. The likely explanation of these artifacts is nonspecific interactions between the SARS-CoV-2-specific conjugated and capture antibodies, as proteinase K treatment abrogated this phenomenon, and thermal shift assays showed pH-induced conformational changes under conditions promoting artifact formation. Omitting, altering, and reverse engineering the kit buffer all supported the importance of maintaining buffering capacity, ionic strength, and pH for accurate kit function. Interestingly, the Panbio assay could tolerate some extremes of temperature and humidity outside manufacturer claims. Our data support strict adherence to manufacturer instructions to avoid false-positive SARS-CoV-2 Ag-RDT reactions, otherwise resulting in anxiety, overuse of public health resources, and dissemination of misinformation. IMPORTANCE With the Panbio severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) antigen test being deployed in over 120 countries worldwide, understanding conditions required for its ideal performance is critical. Recently on social media, this kit was shown to generate false positives when manufacturer recommendations were not followed. While erroneous results from improper use of a test may not be surprising to some health care professionals, understanding why false positives occur can help reduce the propagation of misinformation and provide a scientific rebuttal for these aberrant findings. This study demonstrated that the kit buffer's pH, ionic strength, and buffering capacity were critical components to ensure proper kit function and avoid generation of false-positive results. Typically, false positives arise from cross-reacting or interfering substances; however, this study demonstrated a mechanism where false positives were generated under conditions favoring nonspecific interactions between the two antibodies designed for SARS-CoV-2 antigen detection. Following the manufacturer instructions is critical for accurate test results.

Journal ArticleDOI
TL;DR: In this article, a plankton image data set acquired by an Imaging FlowCytobot over a decade of operation was used to train and evaluate two new automated image classifiers.
Abstract: Continuous monitoring and early warning together represent an important mitigation strategy for harmful algal blooms (HAB). The coast of Texas experiences periodic blooms of three HAB dinoflagellates: Karenia brevis, Dinophysis ovum, and Prorocentrum texanum. A plankton image data set acquired by an Imaging FlowCytobot over a decade of operation was used to train and evaluate two new automated image classifiers. A 112 class, random forest classifier (RF_112) and a 112 class, convolutional neural network classifier (CNN_112) were developed and compared with an existing, 54 class, random forest classifier (RF_54) already in use as an early warning notification system. Both 112 class classifiers exhibited improved performance over the RF_54 classifier when tested on three different HAB species, with the CNN_112 classifier producing fewer false positives and false negatives in most of the cases tested. For K. brevis and P. texanum, the current threshold of 2 cells/mL was identified as the best threshold to minimize the number of false positives and false negatives. For D. ovum, a threshold of 1 cell/mL was found to produce the best results with regard to the number of false positives/negatives. A lower threshold will result in earlier notification of an increase in cell concentration and will provide state health managers with increased lead time to prepare for an impending HAB.

Journal ArticleDOI
TL;DR: In this paper, a suspect screening workflow is proposed to combine several predictors based on m/z, retention time (Rt) prediction models, and isotope ratio to generate intermediate and global scorings.
Abstract: The technological advances of cutting-edge high-resolution mass spectrometry (HRMS) have set the stage for a new paradigm for exposure assessment. However, some adjustments of the metabolomics workflow are needed before HRMS-based methods can detect the low-abundant exogenous chemicals in human matrixes. It is also essential to provide tools to speed up marker identification. Here, we first show that metabolomics software packages developed for automated optimization of XCMS parameters can lead to a false negative rate of up to 80% for chemicals spiked at low levels in blood. We then demonstrate that manual selection criteria in open-source (XCMS, MZmine2) and vendor software (MarkerView, Progenesis QI) allow the false negative rate to be decreased to as low as 4% (MZmine2). We next report an MS1 automated suspect screening workflow that allows for rapid preannotation of HRMS data sets. The novelty of this suspect screening workflow is to combine several predictors based on m/z, retention time (Rt) prediction models, and isotope ratio to generate intermediate and global scorings. Several Rt prediction models were tested and hierarchized (PredRet, Retip, retention time indices, and a log P model), and a nonlinear scoring was developed to account for Rt variations observed within individual runs. We then tested the efficiency of this suspect screening tool to detect spiked and nonspiked chemicals in human blood. Compared to other existing annotation tools, its main advantages include the use of Rt predictors based on different models, its speed, and the use of efficient scoring algorithms to prioritize preannotated markers and reduce false positives.
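As a sketch of how intermediate scores might be combined into a global scoring, the snippet below blends an m/z match, a nonlinear Rt penalty, and an isotope-ratio agreement term; the scoring functions, weights, and example values are hypothetical, not the published workflow's algorithm.

```python
# Illustrative combination of three kinds of evidence into a global suspect
# score (hypothetical scoring functions and weights).
import math

def mz_score(observed_mz, expected_mz, tol_ppm=5.0):
    error_ppm = abs(observed_mz - expected_mz) / expected_mz * 1e6
    return max(0.0, 1.0 - error_ppm / tol_ppm)

def rt_score(observed_rt, predicted_rt, sigma=0.5):
    # Nonlinear penalty: decays smoothly as the deviation (minutes) grows
    # relative to the prediction model's expected error.
    return math.exp(-((observed_rt - predicted_rt) / sigma) ** 2)

def isotope_score(observed_ratio, expected_ratio):
    return max(0.0, 1.0 - abs(observed_ratio - expected_ratio) / expected_ratio)

def global_score(mz, rt, iso, weights=(0.5, 0.3, 0.2)):
    return weights[0] * mz + weights[1] * rt + weights[2] * iso

score = global_score(mz_score(285.0789, 285.0794),
                     rt_score(6.8, 7.1),
                     isotope_score(0.112, 0.108))
print(f"global score: {score:.2f}")  # used to rank and prioritize candidates
```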

Journal ArticleDOI
TL;DR: In this paper, an embedded multi-sensor architecture was proposed to detect incipient short-circuit in wind turbine electrical generators, that is robust to both false positives and negatives.

Journal ArticleDOI
TL;DR: In this article, an attention-embedded complementary-stream convolutional neural network (AECS-CNN) is proposed to obtain more representative features of nodules for false positive reduction.

Journal ArticleDOI
TL;DR: In this paper, the authors examined whether demographic, psychological, cognitive, and/or adaptive factors predict ADOS-2 false positives and which psychiatric diagnoses most often result in false positives.
Abstract: OBJECTIVE While the Autism Diagnostic Observation Schedule, Second Edition (ADOS-2) shows high sensitivity for detecting autism spectrum disorder (ASD) when present (i.e. true positives), scores on the ADOS-2 may be falsely elevated for individuals with cognitive impairments or psychological concerns other than ASD (i.e. false positives). This study examined whether demographic, psychological, cognitive, and/or adaptive factors predict ADOS-2 false positives and which psychiatric diagnoses most often result in false positives. METHOD Sensitivity, specificity, false positive, and false negative rates were calculated among 214 5- to 16-year-old patients who completed an ADOS-2 (module 3) as part of an ASD diagnostic evaluation. Additional analyses were conducted with the 101 patients who received clinically elevated ADOS-2 scores (i.e. 56 true positives and 45 false positives). RESULTS Results revealed a 34% false positive rate and a 1% false negative rate. False positives were slightly more likely to be male, have lower restricted and repetitive behavior (RRB) severity scores on the ADOS-2, and demonstrate elevated anxiety during the ADOS-2. Neither IQ, adaptive functioning, nor caregiver-reported emotional functioning was predictive of false positive status. Trauma-related psychiatric diagnoses were more common among false positives. CONCLUSIONS The ADOS-2 should not be used in isolation to assess for ASD, and, in psychiatrically-complex cases, RRB symptom severity may be particularly helpful in differentiating ASD from other psychiatric conditions. Additionally, heightened levels of anxiety, more so than overactivity or disruptive behavior, may lead to non-ASD specific elevations in ADOS-2 scores.

Journal ArticleDOI
TL;DR: The Elastic Bloom filter (EBF) is proposed, which supports deletion by first deleting the corresponding fingerprint and then updating the corresponding bit in the Bloom filter, and which significantly outperforms existing works.
Abstract: The Bloom filter, answering whether an item is in a set, has achieved great success in various fields, including networking, databases, and bioinformatics. However, the Bloom filter has two main shortcomings: no support of item deletion and no support of expansion. Existing solutions either support deletion at the cost of using additional memory, or support expansion at the cost of increasing the false positive rate and decreasing the query speed. Unlike existing solutions, we propose the Elastic Bloom filter (EBF) to address the two shortcomings simultaneously. Importantly, when EBF expands, the false positives decrease. Our key technique is Elastic Fingerprints, which dynamically absorb and release bits during compression and expansion. To support deletion, EBF can first delete the corresponding fingerprint and then update the corresponding bit in the Bloom filter. To support expansion, Elastic Fingerprints release bits and insert them to the Bloom filter. Our experimental results show that the Elastic Bloom filter significantly outperforms existing works.
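To illustrate the fingerprint-based deletion idea in miniature, the sketch below stores fingerprints per bit position so that a bit is cleared only when no remaining fingerprint maps to it; this is a simplified illustration, not the authors' EBF (which additionally compresses and expands fingerprints to control memory and false positives).

```python
# Simplified illustration of fingerprint-assisted deletion in a Bloom-style
# filter (not the EBF data structure itself).
import hashlib

class FingerprintBloom:
    def __init__(self, m=64, k=3):
        self.m, self.k = m, k
        self.buckets = [[] for _ in range(m)]  # fingerprints kept per bit

    def _positions_and_fp(self, item):
        digest = hashlib.sha256(item.encode()).hexdigest()
        fingerprint = digest[:8]
        positions = [int(digest[8 * (i + 1):8 * (i + 2)], 16) % self.m
                     for i in range(self.k)]
        return positions, fingerprint

    def insert(self, item):
        positions, fp = self._positions_and_fp(item)
        for pos in positions:
            self.buckets[pos].append(fp)

    def query(self, item):
        positions, _ = self._positions_and_fp(item)
        # A bit is "set" whenever its bucket still holds any fingerprint.
        return all(self.buckets[pos] for pos in positions)

    def delete(self, item):
        positions, fp = self._positions_and_fp(item)
        for pos in positions:
            if fp in self.buckets[pos]:
                self.buckets[pos].remove(fp)  # bit clears when bucket empties

bf = FingerprintBloom()
bf.insert("10.0.0.1")
bf.delete("10.0.0.1")
print(bf.query("10.0.0.1"))  # False: the deletion really cleared the bits
```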


Journal ArticleDOI
TL;DR: In this paper, the authors describe a pragmatic implementation study of universal Ag-RDT-based screening at a tertiary care hospital in Germany, where patients presenting for elective procedures and selected personnel without symptoms suggestive of SARS-CoV-2 have been screened with an Ag-RDT since October 2020.