
Showing papers on "False positive paradox published in 2019"


Journal ArticleDOI
TL;DR: The harmonic mean p-value (HMP) is introduced, a simple-to-use and widely applicable alternative to Bonferroni correction motivated by Bayesian model averaging that greatly improves statistical power while maintaining control of the gold-standard false positive rate.
Abstract: Analysis of “big data” frequently involves statistical comparison of millions of competing hypotheses to discover hidden processes underlying observed patterns of data, for example, in the search for genetic determinants of disease in genome-wide association studies (GWAS). Controlling the familywise error rate (FWER) is considered the strongest protection against false positives but makes it difficult to reach the multiple testing-corrected significance threshold. Here, I introduce the harmonic mean p-value (HMP), which controls the FWER while greatly improving statistical power by combining dependent tests using the generalized central limit theorem. I show that the HMP effortlessly combines information to detect statistically significant signals among groups of individually nonsignificant hypotheses in examples of a human GWAS for neuroticism and a joint human–pathogen GWAS for hepatitis C viral load. The HMP simultaneously tests all ways to group hypotheses, allowing the smallest groups of hypotheses that retain significance to be sought. The power of the HMP to detect significant hypothesis groups is greater than the power of the Benjamini–Hochberg procedure to detect significant hypotheses, although the latter only controls the weaker false discovery rate (FDR). The HMP has broad implications for the analysis of large datasets, because it enhances the potential for scientific discovery.

210 citations
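For readers who want to see the combination step concretely, the following is a minimal sketch of the HMP statistic itself (an illustration, not the author's implementation); the FWER-controlling significance thresholds derived in the paper via the generalized central limit theorem are not reproduced here.

```python
# Weighted harmonic mean of p-values: HMP = sum(w_i) / sum(w_i / p_i).
import numpy as np

def harmonic_mean_p(pvals, weights=None):
    p = np.asarray(pvals, dtype=float)
    w = np.full(p.size, 1.0 / p.size) if weights is None else np.asarray(weights, dtype=float)
    return w.sum() / np.sum(w / p)

# Ten individually unremarkable p-values can still combine to a small HMP.
print(harmonic_mean_p([0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.10, 0.20]))
```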


Journal ArticleDOI
TL;DR: The results obtained demonstrate that the proposed cloud-based anomaly detection model is superior in comparison to the other state-of-the-art models (used for network anomaly detection), in terms of accuracy, detection rate, false positive rate, and F-score.
Abstract: With the emergence of the Internet-of-Things (IoT) and seamless Internet connectivity, the need to process streaming data on a real-time basis has become essential. However, the existing data stream management systems are not efficient in analyzing the network log big data for real-time anomaly detection. Further, the existing anomaly detection approaches are not proficient because they cannot be applied to networks, are computationally complex, and suffer from high false positives. Thus, in this paper a hybrid data processing model for network anomaly detection is proposed that leverages grey wolf optimization (GWO) and convolutional neural network (CNN). To enhance the capabilities of the proposed model, GWO and CNN learning approaches were enhanced with: 1) improved exploration, exploitation, and initial population generation abilities and 2) revamped dropout functionality, respectively. These extended variants are referred to as Improved-GWO (ImGWO) and Improved-CNN (ImCNN). The proposed model works in two phases for efficient network anomaly detection. In the first phase, ImGWO is used for feature selection in order to obtain an optimal trade-off between two objectives, i.e., reduced error rate and feature-set minimization. In the second phase, ImCNN is used for network anomaly classification. The efficacy of the proposed model is validated on benchmark (DARPA’98 and KDD’99) and synthetic datasets. The results obtained demonstrate that the proposed cloud-based anomaly detection model is superior in comparison to the other state-of-the-art models (used for network anomaly detection), in terms of accuracy, detection rate, false positive rate, and F-score. On average, the proposed model exhibits an overall improvement of 8.25%, 4.08%, and 3.62% in terms of detection rate, false positives, and accuracy, respectively, relative to standard GWO with CNN.

185 citations


Journal ArticleDOI
TL;DR: The contemporary PTP of significant CAD across symptomatic patient categories is substantially lower than currently assumed, and non-invasive testing can rarely rule-in the disease and focus should shift to ruling-out obstructive CAD.
Abstract: AIMS To provide a pooled estimation of contemporary pre-test probabilities (PTPs) of significant coronary artery disease (CAD) across clinical patient categories, re-evaluate the utility of the application of diagnostic techniques according to such estimates, and propose a comprehensive diagnostic technique selection tool for suspected CAD. METHODS AND RESULTS Estimates of significant CAD prevalence across sex, age, and type of chest pain categories from three large-scale studies were pooled (n = 15 815). The updated PTPs and diagnostic performance profiles of exercise electrocardiogram, invasive coronary angiography, coronary computed tomography angiography (CCTA), positron emission tomography (PET), stress cardiac magnetic resonance (CMR), and SPECT were integrated to define the PTP ranges in which ruling-out CAD is possible with a post-test probability of <10% and <5%. These ranges were then integrated in a new colour-coded tabular diagnostic technique selection tool. The Bayesian relationship between PTP and the rate of diagnostic false positives was explored to complement the characterization of their utility. Pooled CAD prevalence was 14.9% (range = 1-52), clearly lower than that used in current clinical guidelines. Ruling-out capabilities of non-invasive imaging were good overall. The greatest ruling-out capacity (i.e. post-test probability <5%) was documented by CCTA, PET, and stress CMR. With decreasing PTP, the fraction of false positive findings rapidly increased, although a lower CAD prevalence partially cancels out such effect. CONCLUSION The contemporary PTP of significant CAD across symptomatic patient categories is substantially lower than currently assumed. With a low prevalence of the disease, non-invasive testing can rarely rule-in the disease and focus should shift to ruling-out obstructive CAD. The large proportion of false positive findings must be taken into account when patients with low PTP are investigated.

122 citations
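To make the Bayesian relationship between pre-test probability and false positives concrete, here is a hedged sketch of the post-test probability calculation; the sensitivity and specificity values are hypothetical and are not taken from the study.

```python
# Post-test probability of disease from pre-test probability (PTP) via Bayes' theorem.
def post_test_probability(ptp, sensitivity, specificity, positive_result=True):
    if positive_result:
        true_pos = sensitivity * ptp
        return true_pos / (true_pos + (1 - specificity) * (1 - ptp))
    missed = (1 - sensitivity) * ptp
    return missed / (missed + specificity * (1 - ptp))

# With a PTP of 15% and a hypothetical test (90% sensitive, 80% specific), a positive
# result raises the probability only to about 44% (so more than half of positives are
# false), whereas a negative result lowers it to about 2%, below the 5% rule-out range.
print(post_test_probability(0.15, 0.90, 0.80, positive_result=True))
print(post_test_probability(0.15, 0.90, 0.80, positive_result=False))
```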


Journal ArticleDOI
TL;DR: A method is proposed to provide decision support for the doctor, helping to assess each case faster and more precisely; the results show the high potential of the newly proposed method.
Abstract: Background and objective X-ray screening is one of the most popular methodologies for detecting respiratory system diseases. Chest organs are captured on film or in a digital file, which is then sent to the doctor for evaluation. However, the analysis of X-ray images requires much experience and time. Clinical decision support is very important for medical examinations. The use of Computational Intelligence can simulate the evaluation and decision processes of a medical expert. We propose a method to provide decision support for the doctor, helping to assess each case faster and more precisely. Methods We use image descriptors based on the spatial distribution of Hue, Saturation and Brightness values in X-ray images, and a neural network co-working with heuristic algorithms (Moth-Flame, Ant Lion) to detect degenerated lung tissues in the X-ray image. The neural network evaluates the image and, if the possibility of a respiratory disease is detected, the heuristic method identifies the degenerated tissues in the X-ray image in detail using the proposed fitness function. Results The average accuracy is 79.06% in the pre-detection stage, while the sensitivity and specificity averaged over three pre-classified diseases are 84.22% and 66.7%, respectively. The misclassification errors are 3.23% for false positives and 3.76% for false negatives. Conclusions The proposed neuro-heuristic approach addresses small changes in the structure of lung tissues, which appear in pneumonia, sarcoidosis or cancer, as well as consequences that may appear after treatment. The results show the high potential of the newly proposed method. Additionally, the method is flexible and has a low computational burden.

107 citations


Journal ArticleDOI
10 Oct 2019
TL;DR: This paper proposes to use as the global context the Program Dependence Graph and Data Flow Graph to connect the method under investigation with the other relevant methods that might contribute to the buggy code to reduce the false positive rate and improve the recall of the model.
Abstract: Bug detection has been shown to be an effective way to help developers detect bugs early, thus saving much effort and time in the software development process. Recently, deep learning-based bug detection approaches have gained successes over the traditional machine learning-based approaches, the rule-based program analysis approaches, and mining-based approaches. However, they are still limited in detecting bugs that involve multiple methods and suffer from a high rate of false positives. In this paper, we propose a combination approach with the use of contexts and an attention neural network to overcome those limitations. We propose to use as the global context the Program Dependence Graph (PDG) and Data Flow Graph (DFG) to connect the method under investigation with the other relevant methods that might contribute to the buggy code. The global context is complemented by the local context extracted from the path on the AST built from the method’s body. The use of PDG and DFG enables our model to reduce the false positive rate, while, to compensate for the potential reduction in recall, we make use of the attention neural network mechanism to put more weight on the buggy paths in the source code. That is, the paths that are similar to the buggy paths will be ranked higher, thus improving the recall of our model. We have conducted several experiments to evaluate our approach on a very large dataset with +4.973M methods in 92 different project versions. The results show that our tool can achieve a relative improvement of up to 160% on F-score when compared with the state-of-the-art bug detection approaches. Our tool can detect 48 true bugs in the list of top 100 reported bugs, which is 24 more true bugs than the baseline approaches. We also report that our representation is better suited for bug detection and improves over the other representations by up to 206% in accuracy.

103 citations


Journal ArticleDOI
12 Apr 2019
TL;DR: The debate about false positives in psychological research has led to a demand for higher statistical power as mentioned in this paper, and to meet this demand, researchers need to collect data from larger samples, which is impor...
Abstract: The debate about false positives in psychological research has led to a demand for higher statistical power. To meet this demand, researchers need to collect data from larger samples—which is impor...

86 citations


Journal ArticleDOI
TL;DR: This paper addresses processing time as well as the required number of training samples for a 3-D CNN implementation through the development of a two-stage computer-aided detection system for automatic detection of pulmonary nodules.
Abstract: Deep two-dimensional (2-D) convolutional neural networks (CNNs) have been remarkably successful in producing record-breaking results in a variety of computer vision tasks. It is possible to extend CNNs to three dimensions using 3-D kernels to make them suitable for volumetric medical imaging data such as CT or MRI, but this increases the processing time as well as the required number of training samples (due to the higher number of parameters that need to be learned). In this paper, we address both of these issues for a 3-D CNN implementation through the development of a two-stage computer-aided detection system for automatic detection of pulmonary nodules. The first stage consists of a 3-D fully convolutional network for fast screening and generation of candidate suspicious regions. The second stage consists of an ensemble of 3-D CNNs trained using extensive transformations applied to both the positive and negative patches to augment the training set. To enable the second stage classifiers to learn differently, they are trained on false positive patches obtained from the screening model using different thresholds on their associated scores as well as different augmentation types. The networks in the second stage are averaged together to produce the final classification score for each candidate patch. Using this procedure, our overall nodule detection system called DeepMed is fast and can achieve 91% sensitivity at 2 false positives per scan on cases from the LIDC dataset.

82 citations


Journal ArticleDOI
TL;DR: The resulting method, which is named PhISCS, is the first to integrate SCS and bulk sequencing data while accounting for ISA violating mutations and provides a guarantee of optimality in reported solutions.
Abstract: Available computational methods for tumor phylogeny inference via single-cell sequencing (SCS) data typically aim to identify the most likely perfect phylogeny tree satisfying the infinite sites assumption (ISA). However, the limitations of SCS technologies including frequent allele dropout and variable sequence coverage may prohibit a perfect phylogeny. In addition, ISA violations are commonly observed in tumor phylogenies due to the loss of heterozygosity, deletions, and convergent evolution. In order to address such limitations, we introduce the optimal subperfect phylogeny problem which asks to integrate SCS data with matching bulk sequencing data by minimizing a linear combination of potential false negatives (due to allele dropout or variance in sequence coverage), false positives (due to read errors) among mutation calls, and the number of mutations that violate ISA (real or because of incorrect copy number estimation). We then describe a combinatorial formulation to solve this problem which ensures that several lineage constraints imposed by the use of variant allele frequencies (VAFs, derived from bulk sequence data) are satisfied. We express our formulation both in the form of an integer linear program (ILP) and, as a first in tumor phylogeny reconstruction, a Boolean constraint satisfaction problem (CSP), and solve them by leveraging state-of-the-art ILP/CSP solvers. The resulting method, which we name PhISCS, is the first to integrate SCS and bulk sequencing data while accounting for ISA violating mutations. In contrast to the alternative methods, typically based on probabilistic approaches, PhISCS provides a guarantee of optimality in reported solutions. Using simulated and real data sets, we demonstrate that PhISCS is more general and accurate than all available approaches.

77 citations


Journal ArticleDOI
TL;DR: A hybrid statistical analysis is demonstrated that combines individual significance testing with an estimated global significance limit, simultaneously decreasing the risk of false positives while retaining superior power, and the utility of null HX-MS measurements for explicitly evaluating the criteria used to classify a difference in HX as significant is highlighted.
Abstract: Differential hydrogen exchange-mass spectrometry (HX-MS) measurements are valuable for identification of differences in the higher order structures of proteins. Typically, the data sets are large with many differential HX values corresponding to many peptides monitored at several labeling times. To eliminate subjectivity and reliably identify significant differences in HX-MS measurements, a statistical analysis approach is needed. In this work, we performed null HX-MS measurements (i.e., no meaningful differences) on maltose binding protein and infliximab, a monoclonal antibody, to evaluate the reliability of different statistical analysis approaches. Null measurements are useful for directly evaluating the risk (i.e., the chance of falsely classifying a difference as significant) and the power (i.e., the ability to classify a true difference as significant) associated with different statistical analysis approaches. With null measurements, we identified weaknesses in the approaches commonly used. Individual tests of significance were prone to false positives due to the problem of multiple comparisons. Incorporation of Bonferroni correction led to unacceptably large limits of detection, severely decreasing the power. Analysis methods using a globally estimated significance limit also led to an overestimation of the limit of detection, leading to a loss of power. Here, we demonstrate a hybrid statistical analysis, based on volcano plots, that combines individual significance testing with an estimated global significance limit, which simultaneously decreases the risk of false positives and retains superior power. Furthermore, we highlight the utility of null HX-MS measurements to explicitly evaluate the criteria used to classify a difference in HX as significant.

60 citations
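A minimal sketch of the hybrid criterion, assuming the global significance limit (in Da) has already been estimated from pooled replicate variability across the data set; this is an illustration of the idea, not the authors' code.

```python
# Volcano-plot style decision: flag a peptide/time point only if it passes BOTH an
# individual Welch t-test and a global difference limit.
import numpy as np
from scipy import stats

def hybrid_significant(ref_reps, comp_reps, global_limit_da, alpha=0.01):
    _, p = stats.ttest_ind(ref_reps, comp_reps, equal_var=False)  # individual test (y-axis)
    delta = np.mean(comp_reps) - np.mean(ref_reps)                # difference in HX (x-axis)
    return (p < alpha) and (abs(delta) > global_limit_da)
```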


Journal ArticleDOI
TL;DR: It is found that the combination of a two‐sided test and cleaning the data using ICA FIX resulted in nominal false positive rates for all data sets, meaning that data cleaning is not only important for resting state fMRI, but also for task fMRI.
Abstract: Methodological research rarely generates a broad interest, yet our work on the validity of cluster inference methods for functional magnetic resonance imaging (fMRI) created intense discussion on both the minutiae of our approach and its implications for the discipline. In the present work, we take on various critiques of our original work and further explore its limitations. We address issues about the particular event-related designs we used, considering multiple event types and randomization of events between subjects. We consider the lack of validity found with one-sample permutation (sign flipping) tests, investigating a number of approaches to improve the false positive control of this widely used procedure. We found that the combination of a two-sided test and cleaning the data using ICA FIX resulted in nominal false positive rates for all data sets, meaning that data cleaning is not only important for resting state fMRI, but also for task fMRI. Finally, we discuss the implications of our work on the fMRI literature as a whole, estimating that at least 10% of the fMRI studies have used the most problematic cluster inference method (p = .01 cluster defining threshold), and how individual studies can be interpreted in light of our findings. These additional results underscore our original conclusions, on the importance of data sharing and thorough evaluation of statistical methods on realistic null data.

55 citations


Journal ArticleDOI
TL;DR: An error due to the incorrect implementation of the BY procedure in Narum is identified, such that the approach does not adequately control FDR; the impact on conservation genetics and other fields will be study-dependent, as it is related to the ratio of true to false positives in each study.
Abstract: In 2006, Narum published a paper in Conservation Genetics emphasizing that Bonferroni correction for multiple testing can be highly conservative with poor statistical power (high Type II error). He pointed out that other approaches for multiple testing correction can control the false discovery rate (FDR) with a better balance of Type I and Type II errors and suggested that the approach of Benjamini and Yekutieli (BY) 2001 provides the most biologically relevant correction for evaluating the significance of population differentiation in conservation genetics. However, there are crucial differences between the original Benjamini and Yekutieli procedure and that described by Narum. After carefully reviewing both papers, we found an error due to the incorrect implementation of the BY procedure in Narum (Conserv Genet 7:783–787, 2006) such that the approach does not adequately control FDR. Since the incorrect BY approach has been increasingly used, not only in conservation genetics, but also in medicine and biology, it is important that the error is made known to the scientific community. In addition, we provide an overview of FDR approaches for multiple testing correction and encourage authors, first and foremost, to provide effect sizes for their results and, second, to be transparent in their descriptions of multiple testing correction. Finally, the impact of this error on conservation genetics and other fields will be study-dependent, as it is related to the ratio of true to false positives in each study.
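For reference, here is a minimal sketch of the original Benjamini-Yekutieli (2001) step-up procedure, which controls the FDR under arbitrary dependence; it follows the published definition rather than the incorrect variant discussed above, and it is not code from either paper.

```python
import numpy as np

def benjamini_yekutieli(pvals, q=0.05):
    """Boolean mask of hypotheses rejected at FDR level q (BY step-up procedure)."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    c_m = np.sum(1.0 / np.arange(1, m + 1))            # correction for arbitrary dependence
    thresholds = q * np.arange(1, m + 1) / (m * c_m)
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()                 # largest i with p_(i) under its threshold
        reject[order[:k + 1]] = True
    return reject
```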

Journal ArticleDOI
TL;DR: Li et al. as mentioned in this paper proposed a novel Multi-scale Gradual Integration Convolutional Neural Network (MGI-CNN), which used multi-scale inputs with different levels of contextual information, and learned multi-stream feature integration in an end-to-end manner.

Journal ArticleDOI
TL;DR: Based on its classification performance in both case studies, Random Undersampling is the best choice as it results in models with a significantly smaller number of samples, thus reducing computational burden and training time.
Abstract: Severe class imbalance between majority and minority classes in Big Data can bias the predictive performance of Machine Learning algorithms toward the majority (negative) class. Where the minority (positive) class holds greater value than the majority (negative) class and the occurrence of false negatives incurs a greater penalty than false positives, the bias may lead to adverse consequences. Our paper incorporates two case studies, each utilizing three learners, six sampling approaches, two performance metrics, and five sampled distribution ratios, to uniquely investigate the effect of severe class imbalance on Big Data analytics. The learners (Gradient-Boosted Trees, Logistic Regression, Random Forest) were implemented within the Apache Spark framework. The first case study is based on a Medicare fraud detection dataset. The second case study, unlike the first, includes training data from one source (SlowlorisBig Dataset) and test data from a separate source (POST dataset). Results from the Medicare case study are not conclusive regarding the best sampling approach using Area Under the Receiver Operating Characteristic Curve and Geometric Mean performance metrics. However, it should be noted that the Random Undersampling approach performs adequately in the first case study. For the SlowlorisBig case study, Random Undersampling convincingly outperforms the other five sampling approaches (Random Oversampling, Synthetic Minority Over-sampling TEchnique, SMOTE-borderline1, SMOTE-borderline2, ADAptive SYNthetic) when measuring performance with Area Under the Receiver Operating Characteristic Curve and Geometric Mean metrics. Based on its classification performance in both case studies, Random Undersampling is the best choice as it results in models with a significantly smaller number of samples, thus reducing computational burden and training time.
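As an illustration of the sampling approach recommended above, here is a hedged NumPy sketch of random undersampling to a chosen negative-to-positive ratio; the study itself worked in Apache Spark, and the 0/1 label encoding here is an assumption.

```python
import numpy as np

def random_undersample(X, y, neg_to_pos_ratio=1.0, seed=0):
    """Keep all minority (y == 1) samples and a random subset of the majority (y == 0)."""
    rng = np.random.default_rng(seed)
    pos_idx = np.flatnonzero(y == 1)
    neg_idx = np.flatnonzero(y == 0)
    n_neg = min(len(neg_idx), int(len(pos_idx) * neg_to_pos_ratio))
    keep = np.concatenate([pos_idx, rng.choice(neg_idx, size=n_neg, replace=False)])
    rng.shuffle(keep)
    return X[keep], y[keep]
```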

Journal ArticleDOI
TL;DR: The methodology for determining test- and laboratory-specific criteria can be generalized into a practical approach that can be used by laboratories to reduce the cost and time burdens of confirmation without affecting clinical accuracy.

Journal ArticleDOI
TL;DR: In this paper, the authors evaluated the degree to which errors in species data (false presences–absences) affect model predictions and how this is reflected in commonly used evaluation metrics.
Abstract: Aim: Species distribution information is essential under increasing global changes, and models can be used to acquire such information, but they can be affected by different errors/bias. Here, we evaluated the degree to which errors in species data (false presences–absences) affect model predictions and how this is reflected in commonly used evaluation metrics. Location: Western Swiss Alps. Methods: Using 100 virtual species and different sampling methods, we created observation datasets of different sizes (100–400–1,600) and added increasing levels of errors (creating false positives or negatives; from 0% to 50%). These degraded datasets were used to fit models using generalized linear model, random forest and boosted regression trees. Model fit (ability to reproduce calibration data) and predictive success (ability to predict the true distribution) were measured on probabilistic/binary outcomes using Kappa, TSS, MaxKappa, MaxTSS and Somers’ D (rescaled AUC). Results: The interpretation of models’ performance depended on the data and metrics used to evaluate them, with conclusions differing depending on whether model fit or predictive success was measured. Added errors reduced model performance, with effects expectedly decreasing as sample size increased. Model performance was more affected by false positives than by false negatives. Models with different techniques were differently affected by errors: models with high fit presented lower predictive success (RFs), and vice versa (GLMs). High evaluation metrics could still be obtained with 30% error added, indicating that some metrics (Somers’ D) might not be sensitive enough to detect data degradation. Main conclusions: Our findings highlight the need to reconsider the interpretation scale of some commonly used evaluation metrics: Kappa seems more realistic than Somers’ D/AUC or TSS. High fits were obtained with high levels of error added, showing that RF overfits the data. When collecting occurrence databases, it is advisable to reduce the rate of false positives (or increase sample sizes) rather than false negatives.
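For readers unfamiliar with the binary evaluation metrics compared above, a minimal sketch of two of them computed from the confusion matrix (TSS = sensitivity + specificity - 1; Cohen's Kappa is chance-corrected agreement); the threshold-independent Somers' D (rescaled AUC) is not shown.

```python
import numpy as np

def tss_and_kappa(y_true, y_pred):
    """True Skill Statistic and Cohen's Kappa for binary (0/1) observations/predictions."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    sens, spec = tp / (tp + fn), tn / (tn + fp)
    tss = sens + spec - 1
    n = tp + tn + fp + fn
    po = (tp + tn) / n                                           # observed agreement
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2  # chance agreement
    kappa = (po - pe) / (1 - pe)
    return tss, kappa
```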

Journal ArticleDOI
TL;DR: This work proposes a novel strategy for fast candidate detection from volumetric chest CT scans, which can minimize false negatives (FNs) and false positives (FPs), and develops a simple yet effective CNNs-based classifier for FP reduction, which benefits from the candidate detection.
Abstract: In computed tomography, automated detection of pulmonary nodules with a broad spectrum of appearance is still a challenge, especially, in the detection of small nodules. An automated detection system usually contains two major steps: candidate detection and false positive (FP) reduction. We propose a novel strategy for fast candidate detection from volumetric chest CT scans, which can minimize false negatives (FNs) and false positives (FPs). The core of the strategy is a nodule-size-adaptive deep model that can detect nodules of various types, locations, and sizes from 3D images. After candidate detection, each result is located with a bounding cube, which can provide rough size information of the detected objects. Furthermore, we propose a simple yet effective CNNs-based classifier for FP reduction, which benefits from the candidate detection. The performance of the proposed nodule detection was evaluated on both independent and publicly available datasets. Our detection could reach high sensitivity with few FPs and it was comparable with the state-of-the-art systems and manual screenings. The study demonstrated that excellent candidate detection plays an important role in the nodule detection and can simplify the design of the FP reduction. The proposed candidate detection is an independent module, so it can be incorporated with any other FP reduction methods. Besides, it can be used as a potential solution for other similar clinical applications.

Journal ArticleDOI
TL;DR: In this paper, a 3D deep residual network was developed to distinguish true microbleeds from false positive mimics of a previously developed technique based on traditional algorithms, which achieved a detection precision of 71.9%.
Abstract: Cerebral microbleeds, which are small focal hemorrhages in the brain that are prevalent in many diseases, are gaining increasing attention due to their potential as surrogate markers of disease burden, clinical outcomes, and delayed effects of therapy. Manual detection is laborious and automatic detection and labeling of these lesions is challenging using traditional algorithms. Inspired by recent successes of deep convolutional neural networks in computer vision, we developed a 3D deep residual network that can distinguish true microbleeds from false positive mimics of a previously developed technique based on traditional algorithms. A dataset of 73 patients with radiation-induced cerebral microbleeds scanned at 7 T with susceptibility-weighted imaging was used to train and evaluate our model. With the resulting network, we maintained 95% of the true microbleeds in 12 test patients and the average number of false positives was reduced by 89%, achieving a detection precision of 71.9%, higher than existing published methods. The likelihood score predicted by the network was also evaluated by comparing to a neuroradiologist’s rating, and good correlation was observed.

Journal ArticleDOI
TL;DR: The proposed algorithm is more efficient, fast, and less complex, produces improved results, and outperforms some of the best techniques used for mammogram classification based on Sensitivity, Specificity, Accuracy, and Area under the ROC curve.
Abstract: Widespread use of electronic health records is a major cause of a massive dataset that ultimately results in Big Data. Computer-aided systems for healthcare can be an effective tool to automatically process such big data. Breast cancer is one of the major causes of high mortality among women in the world, since it is difficult to detect due to a lack of early symptoms. There are a number of techniques and advanced technologies available to detect breast tumors nowadays. One of the common approaches for breast tumor detection is mammography. The similarity between the normal (unaffected) tissues and the masses (affected) tissues is often very high, which leads to false positives (FP). In the field of medicine, sensitivity to false positives is very high because they result in false diagnoses and can lead to serious consequences. Therefore, it is a challenge for researchers to correctly distinguish between the normal and affected tissues to increase the detection accuracy. Radiologists use a Gabor filter bank for feature extraction and apply it to the entire input image, which yields poor results. The proposed system optimizes the Gabor filter bank to select the most appropriate Gabor filter using a metaheuristic algorithm known as “Cuckoo Search”. The proposed algorithm is run over sub-images in order to extract more descriptive features. Moreover, feature subset selection is used to reduce feature size because the features extracted from the segmented region of interest are high dimensional and cannot be handled easily. The algorithm is more efficient, fast, and less complex, and produces improved results. The proposed method is tested on 2000 mammograms taken from the DDSM database and outperforms some of the best techniques used for mammogram classification based on Sensitivity, Specificity, Accuracy, and Area under the ROC curve.

Journal ArticleDOI
TL;DR: The proposed tracker efficiently handles occlusion situations and achieves competitive performance compared to the state-of-the-art methods, and shows the best multi-object tracking accuracy among the online and real-time executable methods.
Abstract: In this paper, we propose an efficient online multi-object tracking method based on the Gaussian mixture probability hypothesis density (GMPHD) filter and an occlusion group management scheme, where a hierarchical data association is utilized for the GMPHD filter to reduce the false negatives caused by missed detection. The hierarchical data association, consisting of two modules, detection-to-track and track-to-track associations, can recover the lost tracks and their switched IDs. In addition, the proposed grouping management scheme handles occlusion problems with two main parts. The first part, “track merging”, can merge the false positive tracks caused by false positive detections from occlusions. The occlusion of the false positive tracks is usually measured with some metric. In this research, we define the occlusion measure between visual objects as sum-of-intersection-over-each-area (SIOA) instead of the commonly used intersection-over-union (IOU). The second part, “occlusion group energy minimization (OGEM)”, prevents the occluded true positive tracks from false “track merging”. Each group of the occluded objects is expressed with an energy function and an optimal hypothesis will be obtained by minimizing the energy. We evaluate the proposed tracker on benchmarks such as MOT15 and MOT17, which are public datasets for multi-person tracking. An ablation study on the training dataset reveals not only that “track merging” and “OGEM” complement each other, but also that the proposed tracking method shows more robust performance and less sensitivity than baseline methods. Also, the tracking performance with SIOA is better than that with IOU for various sizes of false positives. Experimental results show that the proposed tracker efficiently handles occlusion situations and achieves competitive performance compared to the state-of-the-art methods. In fact, our method shows the best multi-object tracking accuracy among the online and real-time executable methods.
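A hedged sketch contrasting the SIOA occlusion measure named above with the usual IOU, for two axis-aligned boxes given as (x1, y1, x2, y2); the exact formulation in the paper may differ in detail.

```python
def _intersection(a, b):
    w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    return w * h

def _area(a):
    return (a[2] - a[0]) * (a[3] - a[1])

def iou(a, b):
    """Intersection over union (ranges 0..1)."""
    i = _intersection(a, b)
    return i / (_area(a) + _area(b) - i)

def sioa(a, b):
    """Sum of the intersection over each box's own area (ranges 0..2)."""
    i = _intersection(a, b)
    return i / _area(a) + i / _area(b)
```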

Journal ArticleDOI
TL;DR: This study proposes a cautious method to account for errors in acoustic identifications of any taxa without excessive manual checking of recordings, which will facilitate the improvement of large-scale monitoring, and ultimately the understanding of ecological responses.
Abstract: Assessing the state and trend of biodiversity in the face of anthropogenic threats requires large-scale and long-term monitoring, for which new recording methods offer interesting possibilities. Reduced costs and a huge increase in the storage capacity of acoustic recorders have resulted in an exponential use of Passive Acoustic Monitoring (PAM) on a wide range of animal groups in recent years. PAM has led to a rapid growth in the quantity of acoustic data, making manual identification increasingly time-consuming. Therefore, software detecting sound events, extracting numerous features, and automatically identifying species has been developed. However, automated identification generates identification errors, which could influence analyses that look at the ecological responses of species. Taking the case of bats, for which PAM constitutes an efficient tool, we propose a cautious method to account for errors in acoustic identifications of any taxa without excessive manual checking of recordings. We propose to check a representative sample of the outputs of software commonly used in acoustic surveys (Tadarida), to model the identification success probability of 10 species and 2 species groups as a function of the confidence score provided for each automated identification. Using this relationship, we then investigated the effect of setting different False Positive Tolerances (FPTs), from a 50% to 10% false positive rate, above which data are discarded, by repeating a large-scale analysis of bat response to environmental variables and checking for consistency in the results. Considering estimates, standard errors and significance of species responses to environmental variables, the main changes occurred between the naive (i.e. raw data) and robust analyses (i.e. using FPTs). Responses were highly stable between FPTs. We conclude it was essential to, at least, remove data above the 50% FPT to minimize false positives. We recommend systematically checking the consistency of responses for at least two contrasting FPTs (e.g. 50% and 10%), in order to ensure robustness, and only going on to conclusive interpretation when these are consistent. This study provides a huge saving of time for manual checking, which will facilitate the improvement of large-scale monitoring, and ultimately our understanding of ecological responses.
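A minimal sketch of the thresholding idea for a single species, assuming a manually verified sample of automated identifications (confidence scores plus 0/1 correctness) is available; the logistic model and all names are illustrative assumptions, not the authors' pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def score_threshold_for_fpt(scores, correct, fpt=0.5):
    """Minimum confidence score keeping the predicted false positive rate <= fpt."""
    scores = np.asarray(scores, dtype=float).reshape(-1, 1)
    model = LogisticRegression().fit(scores, np.asarray(correct, dtype=int))
    grid = np.linspace(scores.min(), scores.max(), 1000).reshape(-1, 1)
    p_false = 1 - model.predict_proba(grid)[:, 1]   # predicted identification error rate
    ok = grid[p_false <= fpt, 0]
    return ok.min() if ok.size else None            # discard records scored below this value
```

Running the downstream ecological analysis at two contrasting tolerances (e.g. fpt=0.5 and fpt=0.1) and checking that the estimated responses agree mirrors the consistency check recommended above.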

Journal ArticleDOI
TL;DR: A novel two-stage intelligent intrusion detection system (IDS) based on machine learning algorithms is proposed to detect and protect against malicious attacks and eliminate false positives.
Abstract: With the introduction of emerging technologies, cybersecurity has become an inherited and amplified problem. New technologies bring significant developments but also come with new challenges in the cybersecurity area. The fight against malicious attacks is an everyday battle for every company. Challenges brought by security breaches can be devastating for a company and sometimes create circumstances it cannot survive. In this paper, we propose a novel two-stage intelligent intrusion detection system (IDS) to detect and protect against such malicious attacks. Intrusion Detection Systems are feasible solutions for cybersecurity problems, but they come with implementation challenges. Anomaly-based IDSs usually have a high rate of false positives (FP) and considerable computational requirements. The approach proposed in this paper consists of a two-stage architecture based on machine learning algorithms. In the first stage, the IDS uses K-Means to detect attacks, and the second stage uses supervised learning to classify such attacks and eliminate false positives. The implementation of this approach results in a computationally efficient IDS able to detect and classify attacks at 99.97% accuracy while lowering the number of false positives to zero. The paper also evaluates the performance results and compares them with other relevant research papers. The performance of this proposed IDS is superior to the current state of the art.
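A hedged sketch of the two-stage idea (K-Means flags suspicious traffic, then a supervised classifier confirms attacks and filters out first-stage false positives); the classifier choice, the cluster-labelling rule, and the hyperparameters are assumptions for illustration, not the authors' configuration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier

def two_stage_ids(X_train, y_train, X_new, n_clusters=8):
    # Stage 1: clusters dominated by labelled attacks mark "suspicious" feature-space regions.
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(X_train)
    attack_rate = np.array([y_train[km.labels_ == c].mean() for c in range(n_clusters)])
    suspicious = np.isin(km.predict(X_new), np.flatnonzero(attack_rate > 0.5))

    # Stage 2: a supervised classifier re-examines only the suspicious traffic.
    clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
    verdict = np.zeros(len(X_new), dtype=int)     # 0 = benign, 1 = attack
    if suspicious.any():
        verdict[suspicious] = clf.predict(X_new[suspicious])
    return verdict
```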

Journal ArticleDOI
TL;DR: It was demonstrated that ATR-FTIR spectroscopy could be used as an efficient and reliable malaria diagnostic tool and has the potential to be developed for use at point of care under tropical field conditions.
Abstract: Widespread elimination of malaria requires an ultra-sensitive detection method that can detect the low parasitaemia levels seen in asymptomatic carriers who act as reservoirs for further transmission of the disease, but is inexpensive and easy to deploy in the field in low income settings. It was hypothesized that a new method of malaria detection based on infrared spectroscopy, shown in the laboratory to have similar sensitivity to PCR-based detection, could prove effective in detecting malaria in a field setting using cheap portable units with data management systems allowing them to be used by users inexpert in spectroscopy. This study was designed to determine whether the methodology developed in the laboratory could be translated to the field to diagnose the presence of Plasmodium in the blood of patients presenting at hospital with symptoms of malaria, as a precursor to trials testing its sensitivity to detect asymptomatic carriers. The field study tested 318 patients presenting with suspected malaria at four regional clinics in Thailand. Two portable infrared spectrometers were employed, operated from a laptop computer or a mobile telephone with in-built software that guided the user through the simple measurement steps. Diagnostic modelling and validation testing using linear and machine learning approaches were performed against the gold standard qPCR. Sample spectra from 318 patients were used for building calibration models (112 positive and 110 negative samples according to PCR testing) and independent validation testing (39 positive and 57 negative samples by PCR). The machine learning classification (support vector machines; SVM) performed with 92% sensitivity (3 false negatives) and 97% specificity (2 false positives). The Area Under the Receiver Operating Characteristic Curve (AUROC) for the SVM classification was 0.98. These results may be better than stated, as one of the spectroscopy false positives was infected by a Plasmodium species other than Plasmodium falciparum or Plasmodium vivax, not detected by the PCR primers employed. In conclusion, it was demonstrated that ATR-FTIR spectroscopy could be used as an efficient and reliable malaria diagnostic tool and has the potential to be developed for use at point of care under tropical field conditions, with spectra able to be analysed via a Cloud-based system and the diagnostic results returned to the user’s mobile telephone or computer. The combination of accessibility to mass screening, high sensitivity and selectivity, low logistics requirements and portability makes this new approach a potentially outstanding tool in the context of malaria elimination programmes. The next step in the experimental programme now underway is to reduce the sample requirements to fingerprick volumes.

Journal ArticleDOI
TL;DR: A careful and highly selective approach to identifying delta check analytes, calculation modes, and thresholds before putting them into practice is warranted; then follow-up with careful monitoring of performance and balancing true positives, false negatives, and false positives among delta check alerts is needed.
Abstract: International standards and practice guidelines recommend the use of delta check alerts for laboratory test result interpretation and quality control. The value of contemporary applications of simple univariate delta checks determined as an absolute change, percentage change, or rate of change to recognize specimen misidentification or other laboratory errors has not received much study. This review addresses these three modes of calculation, but in line with the majority of published work, most attention is focused on the identification of specimen misidentification errors. Investigation of delta check alerts is time-consuming and the yield of identified errors is usually small compared to the number of delta check alerts; however, measured analytes with low indices of individuality frequently perform better. While multivariate approaches to delta checks suggest improved usefulness over simple univariate delta check strategies, some of these are complex and not easily applied in contemporary laboratory information systems and middleware. Nevertheless, a simple application of delta checks may hold value in identifying clinically significant changes in several clinical situations: for acute kidney injury using changes in serum creatinine, for risk of osmotic demyelination syndrome using rapid acute changes in serum sodium levels, or for early triage of chest pain patients using high sensitivity troponin assays. A careful and highly selective approach to identifying delta check analytes, calculation modes, and thresholds before putting them into practice is warranted; then follow-up with careful monitoring of performance and balancing true positives, false negatives, and false positives among delta check alerts is needed.
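The three univariate calculation modes mentioned above are easy to state concretely; the thresholds in the sketch below are hypothetical, and in practice they are analyte- and laboratory-specific.

```python
def delta_check(prev, curr, hours_between, abs_limit=None, pct_limit=None, rate_limit=None):
    """Return the list of delta check alerts for a current result vs. the previous one."""
    alerts = []
    delta = curr - prev
    if abs_limit is not None and abs(delta) > abs_limit:
        alerts.append(f"absolute change {delta:+.2f} exceeds {abs_limit}")
    if pct_limit is not None and prev != 0 and abs(delta / prev) * 100 > pct_limit:
        alerts.append(f"percentage change {delta / prev * 100:+.1f}% exceeds {pct_limit}%")
    if rate_limit is not None and hours_between > 0 and abs(delta) / hours_between > rate_limit:
        alerts.append(f"rate of change {delta / hours_between:+.2f}/h exceeds {rate_limit}/h")
    return alerts

# Hypothetical example: serum creatinine rising from 80 to 140 umol/L over 24 h.
print(delta_check(80, 140, 24, abs_limit=26, pct_limit=50, rate_limit=2.0))
```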

Journal ArticleDOI
03 May 2019-Symmetry
TL;DR: In this paper, a method for calculating the dynamic background region in a video and removing false positives in order to overcome the problems of false positives that occur due to dynamic background and frame drop at slow speeds is proposed.
Abstract: In this paper, we propose a method for calculating the dynamic background region in a video and removing false positives in order to overcome the problems of false positives that occur due to the dynamic background and frame drop at slow speeds. This requires an efficient algorithm with robust performance, including processing speed. The foreground is separated from the background by comparing the similarities between false positives and the foreground. In order to improve the processing speed, the median filter was optimized for the binary image. The proposed method was evaluated on the CDnet 2012/2014 dataset, where we achieved a precision of 76.68%, FPR of 0.90%, FNR of 18.02%, and an F-measure of 75.35%. The average ranking across categories is 14.36, which is superior to the background subtraction method. The proposed method operated at 45 fps (CPU) and 150 fps (GPU) at 320 × 240 resolution. Therefore, we expect that the proposed method can be applied to current commercialized CCTV without any hardware upgrades.


Journal ArticleDOI
TL;DR: This paper investigates using machine learning algorithms to filter false positives, uses outlier detection models to remove outlier data, and demonstrates that the proposed classification model can be applied to real-time monitoring, ensuring false positives are filtered and hence not stored in the database.
Abstract: Radio frequency identification (RFID) is an automated identification technology that can be utilized to monitor product movements within a supply chain in real-time. However, one problem that occurs during RFID data capturing is false positives (i.e., tags that are accidentally detected by the reader but not of interest to the business process). This paper investigates using machine learning algorithms to filter false positives. Raw RFID data were collected based on various tagged product movements, and statistical features were extracted from the received signal strength derived from the raw RFID data. Abnormal RFID data or outliers may arise in real cases. Therefore, we utilized outlier detection models to remove outlier data. The experiment results showed that machine learning-based models successfully classified RFID readings with high accuracy, and integrating outlier detection with machine learning models improved classification accuracy. We demonstrated the proposed classification model could be applied to real-time monitoring, ensuring false positives were filtered and hence not stored in the database. The proposed model is expected to improve warehouse management systems by monitoring delivered products to other supply chain partners.

Journal ArticleDOI
02 Jan 2019-PLOS ONE
TL;DR: It is shown how a simple statistical model can be used to explore the quantitative tradeoff between reducing false positives and increasing false negatives, and it reveals that although α = 0.05 would indeed be approximately the optimal value in some realistic situations, the optimal α could actually be substantially larger or smaller in other situations.
Abstract: Researchers who analyze data within the framework of null hypothesis significance testing must choose a critical “alpha” level, α, to use as a cutoff for deciding whether a given set of data demonstrates the presence of a particular effect. In most fields, α = 0.05 has traditionally been used as the standard cutoff. Many researchers have recently argued for a change to a more stringent evidence cutoff such as α = 0.01, 0.005, or 0.001, noting that this change would tend to reduce the rate of false positives, which are of growing concern in many research areas. Other researchers oppose this proposed change, however, because it would correspondingly tend to increase the rate of false negatives. We show how a simple statistical model can be used to explore the quantitative tradeoff between reducing false positives and increasing false negatives. In particular, the model shows how the optimal α level depends on numerous characteristics of the research area, and it reveals that although α = 0.05 would indeed be approximately the optimal value in some realistic situations, the optimal α could actually be substantially larger or smaller in other situations. The importance of the model lies in making it clear what characteristics of the research area have to be specified to make a principled argument for using one α level rather than another, and the model thereby provides a blueprint for researchers seeking to justify a particular α level.
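A hedged sketch of the kind of tradeoff model described (not the authors' exact model): expected error cost as a function of α, given the base rate of true effects, the power attainable at each α, and the relative costs of the two error types; the optimal α is the minimiser, and all numbers below are hypothetical.

```python
import numpy as np
from scipy import stats, optimize

def expected_cost(alpha, base_rate, effect_size, n_per_group, cost_fp, cost_fn):
    # Power of a one-sided two-sample z-test for a standardized effect size.
    z_crit = stats.norm.ppf(1 - alpha)
    power = 1 - stats.norm.cdf(z_crit - effect_size * np.sqrt(n_per_group / 2))
    p_fp = (1 - base_rate) * alpha        # null true but rejected
    p_fn = base_rate * (1 - power)        # effect real but missed
    return cost_fp * p_fp + cost_fn * p_fn

# 30% of tested effects are real, d = 0.5, n = 50 per group, false positives twice as costly.
res = optimize.minimize_scalar(expected_cost, bounds=(1e-4, 0.25), method="bounded",
                               args=(0.30, 0.5, 50, 2.0, 1.0))
print(round(res.x, 3))   # the cost-minimising alpha for this hypothetical research area
```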

Journal ArticleDOI
TL;DR: A novel and robust statistical method to discover cell-level mutation information from scRNA-seq that can facilitate investigation of cell-to-cell heterogeneity is contributed.
Abstract: Motivation Both single-cell RNA sequencing (scRNA-seq) and DNA sequencing (scDNA-seq) have been applied for cell-level genomic profiling. For mutation profiling, the latter seems more natural. However, the task is highly challenging due to the limited input material from only two copies of DNA molecules, while whole-genome amplification generates biases and other technical noise. ScRNA-seq starts with a higher input amount, so it generally has better data quality. Various methods exist for mutation detection from DNA sequencing, but it is not clear whether these methods work for scRNA-seq data. Results Mutation detection methods developed for either bulk-cell sequencing data or scDNA-seq data do not work well for scRNA-seq data, as they produce substantial numbers of false positives. We develop a novel and robust statistical method, called SCmut, to identify specific cells that harbor mutations discovered in bulk-cell data. Statistically, SCmut controls the false positives using the 2D local false discovery rate method. We apply SCmut to several scRNA-seq datasets. In scRNA-seq breast cancer datasets, SCmut identifies a number of highly confident cell-level mutations that are recurrent in many cells and consistent in different samples. In a scRNA-seq glioblastoma dataset, we discover a recurrent cell-level mutation in the PDGFRA gene that is highly correlated with a well-known in-frame deletion in the gene. To conclude, this study contributes a novel method to discover cell-level mutation information from scRNA-seq that can facilitate investigation of cell-to-cell heterogeneity. Availability and implementation The source codes and bioinformatics pipeline of SCmut are available at https://github.com/nghiavtr/SCmut. Supplementary information Supplementary data are available at Bioinformatics online.

Proceedings ArticleDOI
06 Mar 2019
TL;DR: A new method based on 3D Convolutional Neural Networks (CNN) that can reduce the false positive rate while providing high sensitivity in detecting lung cancer lesions is presented.
Abstract: Early diagnosis of lung cancer is very important in improving patients’ life expectancies. Due to the high number of Computed Tomography (CT) images, fast and accurate diagnosis is difficult for radiologists. Therefore, there is an increasing demand for Computer-Aided Diagnosis (CAD) of lung cancer. The core of all lung cancer detection systems is the distinction between cancerous and non-cancerous tissues. This operation is performed in the false positive reduction phase, which is one of the most critical parts of a lung cancer detection system. The primary objective of this paper is to present a new method based on 3D Convolutional Neural Networks (CNN) that can reduce the false positive rate while providing high sensitivity in detecting lung cancer lesions. We obtained 91.23% accuracy at 3.99 false positives per scan using a new fusion method. The accuracy improves while the false positive rate is reduced because the new fusion method takes advantage of knowledge obtained from the classifiers.

Proceedings ArticleDOI
21 Oct 2019
TL;DR: This work collected and analyzed a longitudinal dataset covering the dynamic changes of the popular filter list EasyList over nine years, together with the error reports submitted by the crowd in the same period, and yielded a number of significant findings regarding the characteristics of FP and FN errors and their causes.
Abstract: Ad-blocking systems such as Adblock Plus rely on crowdsourcing to build and maintain filter lists, which are the basis for determining which ads to block on web pages. In this work, we seek to advance our understanding of the ad-blocking community as well as the errors and pitfalls of the crowdsourcing process. To do so, we collected and analyzed a longitudinal dataset that covered the dynamic changes of popular filter-list EasyList for nine years and the error reports submitted by the crowd in the same period.Our study yielded a number of significant findings regarding the characteristics of FP and FN errors and their causes. For instances, we found that false positive errors (i.e., incorrectly blocking legitimate content) still took a long time before they could be discovered (50% of them took more than a month) despite the community effort. Both EasyList editors and website owners were to blame for the false positives. In addition, we found that a great number of false negative errors (i.e., failing to block real advertisements) were either incorrectly reported or simply ignored by the editors. Furthermore, we analyzed evasion attacks from ad publishers against ad-blockers. In total, our analysis covers 15 types of attack methods including 8 methods that have not been studied by the research community. We show how ad publishers have utilized them to circumvent ad-blockers and empirically measure the reactions of ad blockers. Through in-depth analysis, our findings are expected to help shed light on any future work to evolve ad blocking and optimize crowdsourcing mechanisms.