
Showing papers on "False positive paradox published in 2009"


Journal ArticleDOI
TL;DR: This commentary argues in favor of a principled approach to the multiple testing problem--one that places appropriate limits on the rate of false positives across the whole brain and gives readers the information they need to properly evaluate the results.
Abstract: An incredible amount of data is generated in the course of a functional neuroimaging experiment. The quantity of data gives us improved temporal and spatial resolution with which to evaluate our results. It also creates a staggering multiple testing problem. A number of methods have been created that address the multiple testing problem in neuroimaging in a principled fashion. These methods place limits on either the familywise error rate (FWER) or the false discovery rate (FDR) of the results. These principled approaches are well established in the literature and are known to properly limit the number of false positives across the whole brain. However, a minority of papers are still published every month using methods that are improperly corrected for the number of tests conducted. These latter methods place limits on the voxelwise probability of a false positive and yield no information on the global rate of false positives in the results. In this commentary, we argue in favor of a principled approach to the multiple testing problem--one that places appropriate limits on the rate of false positives across the whole brain and gives readers the information they need to properly evaluate the results.
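To make the distinction concrete, here is a minimal Python sketch (not from the paper; the array size, effect counts and thresholds are arbitrary assumptions) contrasting an uncorrected voxelwise threshold with Bonferroni FWER control and Benjamini-Hochberg FDR control:

```python
import numpy as np

def bonferroni(pvals, alpha=0.05):
    """FWER control: reject only p-values below alpha divided by the number of tests."""
    return pvals < alpha / pvals.size

def benjamini_hochberg(pvals, q=0.05):
    """FDR control: Benjamini-Hochberg step-up procedure."""
    m = pvals.size
    order = np.argsort(pvals)
    ranked = pvals[order]
    passes = ranked <= (np.arange(1, m + 1) / m) * q
    reject = np.zeros(m, dtype=bool)
    if passes.any():
        k = np.nonzero(passes)[0].max()   # largest rank satisfying the step-up criterion
        reject[order[:k + 1]] = True      # reject the k+1 smallest p-values
    return reject

# Toy example: 100,000 "voxels", mostly null, plus a few strong effects (sizes assumed).
rng = np.random.default_rng(0)
p = rng.uniform(size=100_000)
p[:200] = rng.uniform(0, 1e-5, size=200)
print("uncorrected p < 0.05:", int((p < 0.05).sum()))   # thousands of voxelwise false positives
print("Bonferroni (FWER):   ", int(bonferroni(p).sum()))
print("Benjamini-Hochberg:  ", int(benjamini_hochberg(p).sum()))
```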

436 citations


Journal ArticleDOI
TL;DR: A real experiment is carried out that demonstrates the danger of not correcting for chance properly with functional neuroimaging data: across the 130,000 voxels in a typical fMRI volume, at least one false positive is almost certain.
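A back-of-the-envelope check of that claim, assuming an uncorrected voxelwise threshold of p < 0.05 and independent tests:

```python
# Chance of at least one false positive among N independent voxelwise tests at an
# uncorrected threshold alpha (alpha = 0.05 and independence are assumptions).
alpha, N = 0.05, 130_000
print(1 - (1 - alpha) ** N)   # prints 1.0: at least one false positive is essentially guaranteed
print(alpha * N)              # roughly 6,500 false-positive voxels expected by chance alone
```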

331 citations


Journal ArticleDOI
TL;DR: McPAD (multiple classifier payload-based anomaly detector) is a new payload-based anomaly detection system consisting of an ensemble of one-class classifiers; it is very accurate in detecting network attacks that carry some form of shellcode in the malicious payload.
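McPAD's actual feature extraction and combination rules are more elaborate than can be shown here; the following toy sketch (invented payloads, scikit-learn instead of the authors' implementation) only illustrates the general idea of an ensemble of one-class classifiers over payload n-gram statistics:

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import OneClassSVM

# Toy "normal" HTTP-like payloads and one suspicious payload (all invented).
normal = ["GET /index.html HTTP/1.1", "GET /images/logo.png HTTP/1.1",
          "POST /login HTTP/1.1 user=alice", "GET /style.css HTTP/1.1"] * 25
suspect = ["GET /\x90\x90\x90\x90\x31\xc0\x50\x68 HTTP/1.1"]  # shellcode-like bytes

# Character 2-gram frequencies as a crude stand-in for payload byte statistics.
vec = CountVectorizer(analyzer="char", ngram_range=(2, 2))
X_train = vec.fit_transform(normal).toarray()
X_test = vec.transform(normal[:2] + suspect).toarray()

# Ensemble of one-class classifiers trained with different hyper-parameters; a payload is
# flagged only if a majority of the models call it anomalous, one simple way to trade
# detection rate against false positives.
models = [OneClassSVM(kernel="rbf", gamma=g, nu=0.05).fit(X_train) for g in (0.001, 0.01, 0.1)]
votes = np.mean([m.predict(X_test) == -1 for m in models], axis=0)
print(votes > 0.5)   # majority-vote anomaly decisions for the three test payloads
```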

296 citations


Journal ArticleDOI
TL;DR: The results suggest that removal of low MAF SNPs from analysis due to concerns about inflated false-positive results may not be appropriate.
Abstract: Determining the most promising single-nucleotide polymorphisms (SNPs) presents a challenge in genome-wide association studies, when hundreds of thousands of association tests are conducted. The power to detect genetic effects depends on minor allele frequency (MAF), and the SNP arrays used in genome-wide association studies include SNPs with a wide distribution of MAFs. Therefore, it is critical to understand the effect of MAF on the false positive rate.
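A small null simulation of the kind this question invites, with all parameters assumed rather than taken from the paper, checks whether the empirical type I error of a standard allele-count test stays near the nominal level as MAF shrinks:

```python
import numpy as np
from scipy.stats import chi2_contingency

def empirical_type1_error(maf, n_cases=1000, n_controls=1000, n_sims=2000, alpha=0.05, seed=1):
    """Fraction of null simulations with p < alpha for an allele-count chi-square test."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_sims):
        # Under the null, cases and controls share the same allele frequency;
        # minor-allele counts are drawn binomially from 2N alleles per group.
        case_alleles = rng.binomial(2 * n_cases, maf)
        ctrl_alleles = rng.binomial(2 * n_controls, maf)
        table = np.array([[case_alleles, 2 * n_cases - case_alleles],
                          [ctrl_alleles, 2 * n_controls - ctrl_alleles]])
        if table.min() == 0:
            continue  # skip degenerate tables (monomorphic in the sample)
        _, p, _, _ = chi2_contingency(table, correction=False)
        hits += p < alpha
    return hits / n_sims

for maf in (0.30, 0.05, 0.01):
    print(maf, empirical_type1_error(maf))
```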

129 citations


Proceedings ArticleDOI
16 Nov 2009
TL;DR: This work develops a novel approach, called Alattin, that includes a new mining algorithm and a technique for detecting neglected conditions based on the authors' mining algorithm, which together help reduce false positives among detected violations by nearly 28%.
Abstract: To improve software quality, static or dynamic verification tools accept programming rules as input and detect their violations in software as defects. As these programming rules are often not well documented in practice, previous work developed various approaches that mine programming rules as frequent patterns from program source code. Then these approaches use static defect-detection techniques to detect pattern violations in source code under analysis. These existing approaches often produce many false positives due to various factors. To reduce false positives produced by these mining approaches, we develop a novel approach, called Alattin, that includes a new mining algorithm and a technique for detecting neglected conditions based on our mining algorithm. Our new mining algorithm mines alternative patterns of the form "P1 or P2", where P1 and P2 are alternative rules such as condition checks on method arguments or return values related to the same API method. We conduct two evaluations to show the effectiveness of our Alattin approach. Our evaluation results show that (1) alternative patterns account for more than 40% of all mined patterns for APIs provided by six open source libraries; (2) mining alternative patterns helps reduce false positives among detected violations by nearly 28%.
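The toy sketch below is not the Alattin mining algorithm itself; it merely illustrates, with invented check labels and thresholds, why mining an alternative pattern "P1 or P2" flags fewer call sites than requiring P1 alone:

```python
from itertools import combinations

# Each call site of some API method is abstracted as the set of condition checks around it
# (the labels are invented for illustration).
call_sites = [
    {"check_arg_not_null"}, {"check_return_not_null"}, {"check_arg_not_null"},
    {"check_return_not_null"}, {"check_arg_not_null", "log_call"}, set(),
]

def support(pred):
    return sum(pred(site) for site in call_sites) / len(call_sites)

min_support = 0.8
checks = sorted(set().union(*call_sites))

# Single patterns "P" and alternative patterns "P1 or P2".
frequent = [(c,) for c in checks if support(lambda s, c=c: c in s) >= min_support]
frequent += [(a, b) for a, b in combinations(checks, 2)
             if support(lambda s, a=a, b=b: a in s or b in s) >= min_support]

# A site violates an alternative pattern only if it satisfies *none* of the alternatives,
# which is how mining "P1 or P2" instead of "P1" alone cuts down false positives.
for pattern in frequent:
    violators = [i for i, s in enumerate(call_sites) if not any(p in s for p in pattern)]
    print(" or ".join(pattern), "-> violating sites:", violators)
```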

118 citations


Journal ArticleDOI
TL;DR: This study illustrates the relative merits of different implementations of the target-decoy strategy, which should be worth contemplating when large-scale proteomic biomarker discovery is to be attempted.
Abstract: The potential of getting a significant number of false positives (FPs) in peptide-spectrum matches (PSMs) obtained by proteomic database search has been well-recognized. Among the attempts to assess FPs, the concomitant use of target and decoy databases is widely practiced. By adjusting filtering criteria, FPs and false discovery rate (FDR) can be controlled at a desired level. Although the target-decoy approach is gaining in popularity, subtle differences in decoy construction (e.g., reversing vs. stochastic methods), rate calculation (e.g., total vs. unique PSMs), or searching (separate vs. composite) do exist among various implementations. In the present study, we evaluated the effects of these differences on FP and FDR estimations using a rat kidney protein sample and the SEQUEST search engine as an example. On the effects of decoy construction, we found that, when a single scoring filter (XCorr) was used, stochastic methods generated a higher estimation of FPs and FDR than sequence reversing methods, likely due to an increase in unique peptides. This higher estimation could largely be attenuated by creating decoy databases similar in effective size, but not by a simple normalization with a unique-peptide coefficient. When multiple filters were applied, the differences seen between reversing and stochastic methods significantly diminished, suggesting multiple filterings reduce the dependency on how a decoy is constructed. For a fixed set of filtering criteria, FDR and FPs estimated by using unique PSMs were almost twice those using total PSMs. The higher estimation seemed to be dependent on data acquisition setup. As to the differences between performing separate or composite searches, in general, FDR estimated from separate search was about three times that from composite search. The degree of difference gradually decreased as the filtering criteria became more stringent. Paradoxically, the estimated true positives in separate search were higher when multiple filters were used. By analyzing a standard protein mixture, we demonstrated that the higher estimation of FDR and FPs in separate search likely reflected an overestimation, which could be corrected with a simple merging procedure. Our study illustrates the relative merits of different implementations of the target-decoy strategy, which should be worth contemplating when large-scale proteomic biomarker discovery is to be attempted.
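As a minimal sketch of the bookkeeping being compared (fabricated scores; real pipelines differ in decoy construction, separate vs. composite searching and correction factors, exactly as the abstract notes), target-decoy FDR at a score cutoff can be estimated from total or unique PSMs like this:

```python
from collections import namedtuple

# A peptide-spectrum match: which database it hit, the matched peptide, and a score.
PSM = namedtuple("PSM", "database peptide xcorr")   # database is "target" or "decoy"

psms = [  # fabricated example PSMs
    PSM("target", "ELVISLIVESK", 3.1), PSM("target", "ELVISLIVESK", 2.9),
    PSM("target", "PEPTIDER", 2.4),    PSM("target", "AQKLFNR", 1.9),
    PSM("decoy",  "KSEVILSIVLE", 2.5), PSM("decoy",  "REDITPEP", 1.8),
]

def estimate_fdr(psms, xcorr_cutoff, unique=False):
    """FDR estimate = decoy hits / target hits above the score cutoff."""
    kept = [p for p in psms if p.xcorr >= xcorr_cutoff]
    if unique:  # count each peptide sequence once per database
        targets = len({p.peptide for p in kept if p.database == "target"})
        decoys  = len({p.peptide for p in kept if p.database == "decoy"})
    else:       # count total PSMs
        targets = sum(p.database == "target" for p in kept)
        decoys  = sum(p.database == "decoy" for p in kept)
    return decoys / targets if targets else float("nan")

print("total-PSM FDR :", estimate_fdr(psms, xcorr_cutoff=2.0))
print("unique-PSM FDR:", estimate_fdr(psms, xcorr_cutoff=2.0, unique=True))
```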

99 citations


Journal ArticleDOI
TL;DR: This paper extends the basic LBP histogram descriptor into a spatially enhanced histogram which encodes both the local region appearance and the spatial structure of the masses, and shows that LBPs are effective and efficient descriptors for mammographic masses.
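A rough sketch of a spatially enhanced LBP descriptor (block-wise histograms concatenated), using scikit-image; the grid size, radius and neighbour count are assumptions, not the paper's settings:

```python
import numpy as np
from skimage.feature import local_binary_pattern

def spatially_enhanced_lbp(image, P=8, R=1.0, grid=(4, 4)):
    """Concatenate per-block histograms of uniform LBP codes."""
    lbp = local_binary_pattern(image, P, R, method="uniform")
    n_bins = P + 2                      # P+1 uniform codes plus one "non-uniform" bin
    h, w = lbp.shape
    feats = []
    for i in range(grid[0]):
        for j in range(grid[1]):
            block = lbp[i * h // grid[0]:(i + 1) * h // grid[0],
                        j * w // grid[1]:(j + 1) * w // grid[1]]
            hist, _ = np.histogram(block, bins=n_bins, range=(0, n_bins), density=True)
            feats.append(hist)
    return np.concatenate(feats)        # length = grid[0] * grid[1] * (P + 2)

roi = np.random.rand(64, 64)            # stand-in for a mass region of interest
print(spatially_enhanced_lbp(roi).shape)  # (160,) for a 4x4 grid with P=8
```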

94 citations


Journal ArticleDOI
TL;DR: A selection of Weka cost-sensitive classifiers (Naive Bayes, SVM, C4.5 and Random Forest) is applied to a variety of bioassay datasets, and the number of false positives arising from primary screening raises the issue of whether this type of data should be used for virtual screening.
Abstract: There are three main problems associated with the virtual screening of bioassay data. The first is access to freely-available curated data, the second is the number of false positives that occur in the physical primary screening process, and the third is that the data are highly imbalanced, with a low ratio of Active compounds to Inactive compounds. This paper first discusses these three problems, and then a selection of Weka cost-sensitive classifiers (Naive Bayes, SVM, C4.5 and Random Forest) is applied to a variety of bioassay datasets. Pharmaceutical bioassay data is not readily available to the academic community. The data held at PubChem is not curated and there is a lack of detailed cross-referencing between Primary and Confirmatory screening assays. With regard to the number of false positives that occur in the primary screening process, the analysis carried out has been shallow due to the lack of cross-referencing mentioned above. In the six cases found, the average percentage of false positives from the High-Throughput Primary screen is quite high at 64%. For the cost-sensitive classification, Weka's implementations of the Support Vector Machine and C4.5 decision tree learner have performed relatively well. It was also found that the setting of the Weka cost matrix is dependent on the base classifier used and not solely on the ratio of class imbalance. Understandably, pharmaceutical data is hard to obtain. However, it would be beneficial to both the pharmaceutical industry and to academics for curated primary screening and corresponding confirmatory data to be provided. Two benefits could be gained by applying virtual screening techniques to bioassay data: first, the search space of compounds to be screened is reduced; and second, by analysing the false positives that occur in the primary screening process, the technology may be improved. The number of false positives arising from primary screening raises the issue of whether this type of data should be used for virtual screening. Care is needed when using Weka's cost-sensitive classifiers: across-the-board misclassification costs based on class ratios should not be used when comparing differing classifiers for the same dataset.
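The paper works in Weka; the scikit-learn sketch below (synthetic data, arbitrary class weights) shows the analogous cost-sensitive idea of weighting the rare Active class more heavily than the Inactive class:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Toy stand-in for a bioassay dataset: roughly 2% Actives, 98% Inactives.
X, y = make_classification(n_samples=5000, n_features=30, n_informative=10,
                           weights=[0.98, 0.02], random_state=0)

# Misclassification costs enter through class weights; as the paper notes for Weka,
# the best weighting depends on the base classifier, not just the class ratio,
# so in practice the weight would be tuned per classifier.
plain    = RandomForestClassifier(random_state=0)
weighted = RandomForestClassifier(class_weight={0: 1, 1: 20}, random_state=0)

for name, clf in [("unweighted", plain), ("cost-sensitive", weighted)]:
    recall = cross_val_score(clf, X, y, cv=5, scoring="recall").mean()
    print(f"{name:15s} recall on the rare Active class: {recall:.2f}")
```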

94 citations


Journal ArticleDOI
TL;DR: A model for the formation of auditory hallucinations that are located externally, and experienced in noisy environments is proposed and the term hypervigilance hallucination is proposed for this type of experience.
Abstract: Background: This paper draws on cognitive psychology research and clinical observation to propose a model for the formation of auditory hallucinations that are located externally, and experienced in noisy environments. Method: This model highlights a series of cognitive processes that may make an individual prone to detecting false positives, i.e. believing they have heard something that is absent. A case study is used to illustrate the model. Results: It is suggested that the false positives may be a by-product of a perceptual system that has evolved to reduce false negatives in conditions of threat. The term hypervigilance hallucination is proposed for this type of experience. Conclusion: The clinical implications of the model are discussed.

94 citations


Journal ArticleDOI
TL;DR: High sensitivity is a more useful attribute in early detection of pre‐eclampsia than specificity because consideration of benefits, harms and costs indicates a much greater preference for minimizing false negatives than false positives, although the ideal would be to avoid both.
Abstract: The aim of this article is to review the accuracy of tests purported to be predictive of pre-eclampsia, a major cause of maternal and perinatal mortality and morbidity worldwide. A review of systematic reviews was done. A total of 219 studies were evaluated for the accuracy of 27 tests for predicting pre-eclampsia. Study quality assessment and data abstraction were performed using piloted proformas. Bivariate meta-analyses were used to synthesize data. Levels of sensitivity and specificity were measured. There were deficiencies in many areas of methodology including blinding, test description, and reference standard adequacy. No test had a high level of both sensitivity and specificity of greater than 90%. Where multiple studies were available, only BMI > 34, alpha-fetoprotein, fibronectin (cellular and total), and uterine artery Doppler (bilateral notching) measurements reached specificity above 90%. Only Doppler (any/unilateral notching, resistance index, and combinations) measurements were over 60% sensitive. Studies were of variable quality and most tests performed poorly. Further research should focus on tests which offer much higher levels of sensitivity than tests currently available. High sensitivity is a more useful attribute in early detection of pre-eclampsia than specificity because consideration of benefits, harms and costs indicates a much greater preference for minimizing false negatives than false positives, although the ideal would be to avoid both.

88 citations


Patent
27 Oct 2009
TL;DR: This patent describes a system that uses a map database as a predictive sensor, and more specifically a system and method of using a map database as a path-predictive vehicle sensor or input, with the additional ability to identify system-related points of interest, to detect and internally correct errors in the map database that are found during operation of the vehicle, and to preemptively identify problematic errors in the database that may create false negatives, or sometimes false positives, when combined with a warning system such as a stability system, crash avoidance system, or crash warning system.
Abstract: A system that uses a map database as a predictive sensor and more specifically to a system and method of using a map database as a path predictive vehicle sensor or input with the additional ability to identify system related point of interests, or detect and internally correct for errors in the map database that are found during operation of the vehicle as well as preemptively identifying problematic errors in the database that may create false negatives, or sometimes false positives when combined with a warning system such as a form of a stability system, crash avoidance system, or crash warning system.

Proceedings ArticleDOI
01 Apr 2009
TL;DR: This work identifies sets of alert characteristics predictive of actionable and unactionable alerts out of 51 candidate characteristics, and evaluates 15 machine learning algorithms, which build models to classify alerts.
Abstract: Automated static analysis can identify potential source code anomalies early in the software process that could lead to field failures. However, only a small portion of static analysis alerts may be important to the developer (actionable). The remainder are false positives (unactionable). We propose a process for building false positive mitigation models to classify static analysis alerts as actionable or unactionable using machine learning techniques. For two open source projects, we identify sets of alert characteristics predictive of actionable and unactionable alerts out of 51 candidate characteristics. From these selected characteristics, we evaluate 15 machine learning algorithms, which build models to classify alerts. We were able to obtain 88-97% average accuracy for both projects in classifying alerts using three to 14 alert characteristics. Additionally, the set of selected alert characteristics and best models differed between the two projects, suggesting that false positive mitigation models should be project-specific.
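A condensed scikit-learn analogue of that pipeline (synthetic data and two example learners standing in for the study's 51 characteristics and 15 algorithms) might look like this:

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in: 51 candidate alert characteristics, label = actionable (1) or not (0).
X, y = make_classification(n_samples=1500, n_features=51, n_informative=8, random_state=0)

learners = {"logistic regression": LogisticRegression(max_iter=1000),
            "decision tree": DecisionTreeClassifier(random_state=0)}

for name, learner in learners.items():
    # Keep only the k characteristics most predictive of actionability, then classify.
    model = make_pipeline(SelectKBest(mutual_info_classif, k=10), learner)
    acc = cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    print(f"{name}: {acc:.2f} cross-validated accuracy")
```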

Proceedings ArticleDOI
15 Jun 2009
TL;DR: In this paper, the authors introduce a family of complementary techniques for measuring channel capacity automatically using a decision procedure (SAT or #SAT solver), which give either exact or narrow probabilistic bounds.
Abstract: The channel capacity of a program is a quantitative measure of the amount of control that the inputs to a program have over its outputs. Because it corresponds to worst-case assumptions about the probability distribution over those inputs, it is particularly appropriate for security applications where the inputs are under the control of an adversary. We introduce a family of complementary techniques for measuring channel capacity automatically using a decision procedure (SAT or #SAT solver), which give either exact or narrow probabilistic bounds. We then apply these techniques to the problem of analyzing false positives produced by dynamic taint analysis used to detect control-flow hijacking in commodity software. Dynamic taint analysis is based on the principle that an attacker should not be able to control values such as function pointers and return addresses, but it uses a simple binary approximation of control that commonly leads to both false positive and false negative errors. Based on channel capacity, we propose a more refined quantitative measure of influence, which can effectively distinguish between true attacks and false positives. We use a practical implementation of our influence measuring techniques, integrated with a dynamic taint analysis operating on x86 binaries, to classify tainting warnings produced by vulnerable network servers, such as those attacked by the Blaster and SQL Slammer worms. Influence measurement correctly distinguishes real attacks from tainting false positives, a task that would otherwise need to be done manually.
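The paper computes this with SAT/#SAT-based procedures on x86 binaries; the toy sketch below only illustrates the underlying quantity, the log2 of the number of distinct values an attacker-controlled input can induce, by brute-force enumeration over a hypothetical one-byte input:

```python
import math

def influence_bits(expr, input_space):
    """log2 of the number of distinct output values the input can induce
    (channel capacity under worst-case input assumptions); brute force over a small space."""
    return math.log2(len({expr(x) for x in input_space}))

attacker_bytes = range(256)  # one attacker-controlled byte

# A tainted but harmless use: only bit 0 of the input reaches the value.
print(influence_bits(lambda x: 0x8000 + (x & 1), attacker_bytes))   # 1.0 bit

# A dangerous use: the whole input byte is copied into (part of) a jump target.
print(influence_bits(lambda x: 0x8000 + x, attacker_bytes))         # 8.0 bits
```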

Book ChapterDOI
01 Oct 2009
TL;DR: This paper proposes a novel technique for the automatic detection of changes in web applications, which allows for the selective retraining of the affected anomaly detection models, and demonstrates that it can reduce false positives and allow for the automated retraining of the anomaly models.
Abstract: Because of the ad hoc nature of web applications, intrusion detection systems that leverage machine learning techniques are particularly well-suited for protecting websites. The reason is that these systems are able to characterize the applications' normal behavior in an automated fashion. However, anomaly-based detectors for web applications suffer from false positives that are generated whenever the applications being protected change. These false positives need to be analyzed by the security officer who then has to interact with the web application developers to confirm that the reported alerts were indeed erroneous detections. In this paper, we propose a novel technique for the automatic detection of changes in web applications, which allows for the selective retraining of the affected anomaly detection models. We demonstrate that, by correctly identifying legitimate changes in web applications, we can reduce false positives and allow for the automated retraining of the anomaly models. We have evaluated our approach by analyzing a number of real-world applications. Our analysis shows that web applications indeed change substantially over time, and that our technique is able to effectively detect changes and automatically adapt the anomaly detection models to the new structure of the changed web applications.

Journal ArticleDOI
TL;DR: A statistical method to flag potential problem officers by blending three methodologies that are the focus of active research efforts: propensity score weighting, doubly robust estimation, and false discovery rates is presented.
Abstract: Allegations of racially biased policing are a contentious issue in many communities. Processes that flag potential problem officers have become a key component of risk management systems at major police departments. We present a statistical method to flag potential problem officers by blending three methodologies that are the focus of active research efforts: propensity score weighting, doubly robust estimation, and false discovery rates. Compared with other systems currently in use, the proposed method reduces the risk of flagging a substantial number of false positives by more rigorously adjusting for potential confounders and by using the false discovery rate as a measure to flag officers. We apply the methodology to data on 500,000 pedestrian stops in New York City in 2006. Of the nearly 3,000 New York City Police Department officers regularly involved in pedestrian stops, we flag 15 officers who stopped a substantially greater fraction of black and Hispanic suspects than our statistical benchmark pre...

Proceedings ArticleDOI
27 Apr 2009
TL;DR: An approach for SQL injection vulnerability detection, automated by a prototype tool SQLInjectionGen, which had no false positives, but had a small number of false negatives while the static analysis tool had a false positive for every vulnerability that was actually protected by a white or black list.
Abstract: Our research objective is to facilitate the identification of true input manipulation vulnerabilities via the combination of static analysis, runtime detection, and automatic testing. We propose an approach for SQL injection vulnerability detection, automated by a prototype tool SQLInjectionGen. We performed case studies on two small web applications for the evaluation of our approach compared to static analysis for identifying true SQL injection vulnerabilities. In our case study, SQLInjectionGen had no false positives, but had a small number of false negatives while the static analysis tool had a false positive for every vulnerability that was actually protected by a white or black list.

Journal ArticleDOI
TL;DR: Challenges associated with post-deployment screening for mild traumatic brain injury are discussed and additional research is necessary to refine the sequential screening methodology, with the goal of minimizing false negatives during initial post- de deployment screening and minimizing false positives during follow-up evaluations.
Abstract: There is ongoing debate regarding the epidemiology of mild traumatic brain injury (MTBI) in military personnel. Accurate and timely estimates of the incidence of brain injury and the prevalence of long-term problems associated with brain injuries among active duty service members and veterans are essential for (a) operational planning, and (b) to allocate sufficient resources for rehabilitation and ongoing services and supports. The purpose of this article is to discuss challenges associated with post-deployment screening for MTBI. Multiple screening methods have been used in military, Veterans Affairs, and independent studies, which complicate cross-study comparisons of the resulting epidemiological data. We believe that post-deployment screening is important and necessary--but no screening methodology will be flawless, and false positives and false negatives are inevitable. Additional research is necessary to refine the sequential screening methodology, with the goal of minimizing false negatives during initial post-deployment screening and minimizing false positives during follow-up evaluations.

Journal ArticleDOI
05 Aug 2009-Heredity
TL;DR: The findings demonstrate the importance of an accurate characterization of population structure for methods based on FST, and show that certain events in the demographic history of a population can mimic the polymorphism patterns produced by selection.
Abstract: T he days of the neutral theory are seemingly over. In time for the Darwin year celebrations, recent research has allowed a remarkable comeback for selection as the dominant force in shaping the diversity of genotypes and phenotypes. This change in perception results mainly from the emerging field of evolutionary genomics. On the basis of newly available genome-wide polymorphism and divergence data, and driven by Big Genomics endeavours like the human HapMap Project, selection is detected without a phenotype, directly from DNA sequence data. Two main results emerge: (1) selection affects non-coding regions throughout the genome as well as coding regions, leading almost to a shortage of sequence material that can be considered reliably neutral in some species (Wright and Andolfatto, 2008). (2) There is evidence for frequent positive selection in the recent history of several species, including fruitflies, mice and particularly humans. Thousands of candidate regions for recent positive selection have been identified in 420 genome-wide scans in humans (Akey, 2009). Using a haplotype test, Hawks et al. (2007) found traces of adaptations in 7% of all human genes in as few as 40 000 years (3000 generations). Even higher estimates have been reported by Foll and Gaggiotti (2008). They used an FST based test and data from 53 human populations to find evidence for positive or balancing selection in 131 out of 560 (423%) randomly distributed STR marker loci. This is a staggering number, but how reliable are these estimates? It has long been known that demographic effects can confound the results, but how severe are these problems in real world applications? A new study by Excoffier et al. (2009) suggests that they can be very severe indeed. In particular, the findings demonstrate the importance of an accurate characterization of population structure for methods based on FST. Genomic tests for selection can be distinguished with respect to the summary statistics they use. Several of these statistics are used to detect hitchhiking events, also known as selective sweeps (Schlotterer, 2003; Pavlidis et al., 2008). These methods build on the characteristic footprint of recent positive selection on linked neutral DNA. The main effect is a local reduction in polymorphism, but the signal can also be picked up in the frequency spectrum and the local linkage disequilibrium or haplotype pattern. The strengths and weaknesses of the hitchhiking approach are quite well understood (Teshima et al., 2006; Thornton et al., 2007). The problem that is considered most severe is that certain events in the demographic history of a population can mimic the polymorphism patterns produced by selection. Population bottlenecks of a critical strength are the most dreaded alternative scenario. The reason is easy to understand: bottlenecks readily lead to large variances in the genealogical (coalescent) history of samples from different loci along a chromosome. These histories can either be short if the entire sample coalesces to a common ancestor during the bottleneck, or very much longer if several lines of descent extend through the bottleneck into a large ancestral population. As a result, almost all summary statistics show large variances, turning a population bottleneck into a neutral null-model that is hard to reject. Similarly, if a simpler demography is (wrongly) assumed, tests will produce an excess of false positives. 
An alternative method to detect selection from genomic data goes back to Lewontin and Krakauer (1973). It is based on genetic diversity between subpopulations (demes) as measured by FST and follows a simple intuition: regions under diversifying selection should exhibit larger divergence among demes than neutral loci (high FST). Similarly, regions under uniform balancing selection in all demes should be less differentiated (low FST). More recently, these ideas have been developed into sophisticated statistical frameworks to detect selection from genome scans (for example, Beaumont and Nichols, 1996; Beaumont and Balding, 2004; Foll and Gaggiotti, 2008). Compared with the hitchhiking approach, the FST method focuses on a different selection scenario: diversifying local selection instead of populationwide positive selection. Consequently, one expects to detect partly complementary sets of candidate loci. The method was criticised early on by Robertson (1975) concerning robustness with respect to demography; however, recent theoretical considerations and a limited number of simulations have led to speculations that the method might be less vulnerable (Beaumont, 2005). With the new work by Excoffier et al. (2009), this issue can be considered as settled. The authors convincingly establish a neutral model with hierarchical population structure as the ‘bottleneck scenario’ of the FST based approach. The reason is analogous to the case of sweeps and bottlenecks: due to hierarchical structure (and similarly due to range expansion or sequential population splits and mergers) different demes draw from different migrant pools, leading to higher levels of variance in FST than expected under an island model. To avoid excessive false positives, knowledge about the population structure needs to be built into the null distribution of FST that is used. For a hierarchical model, Excoffier et al. (2009) show how this can be done. The results are drastic—and sobering. In their reanalysis of human STR data, introduction of hierarchical structure based on five previously established geographic regions reduces the frequency of selection candidates from 23% (Foll and Gaggiotti, 2008) to no more than expected by chance (that is, comparable with the 1% significance level applied). What do these results imply? Certainly that many numbers in published studies are up for revision. But not necessarily that selection is rare. The problem is that our knowledge about false negatives is even more rudimentary than about false positives. For panmictic populations, the power of many tests to detect selection is known to be rather low. For a structured population, this information is basically missing. Realistic models of selection should further account for local adaptation, adaptation from standing genetic variation or interference among selected loci. It will be important to characterize the expected genomic footprints under realistic scenarios in much more detail and to construct adequate (combinations of) summary statistics to detect the...
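For readers unfamiliar with the approach, the sketch below shows only the bare FST-outlier idea (a naive per-locus FST with an empirical cutoff on simulated allele frequencies); it deliberately omits the likelihood machinery of Beaumont and Nichols (1996) or Foll and Gaggiotti (2008) and any correction for hierarchical structure, which is precisely the pitfall discussed above:

```python
import numpy as np

def fst_per_locus(freqs):
    """Simple FST = 1 - mean(Hs)/Ht per locus, from a (demes x loci) array of allele
    frequencies at biallelic loci; no sample-size or hierarchy correction."""
    hs = 2 * freqs * (1 - freqs)            # within-deme expected heterozygosity
    pbar = freqs.mean(axis=0)               # allele frequency in the pooled population
    ht = 2 * pbar * (1 - pbar)              # total expected heterozygosity
    return 1 - hs.mean(axis=0) / ht

rng = np.random.default_rng(2)
neutral = rng.beta(5, 5, size=(4, 500))                  # 4 demes, 500 mildly drifting loci
outlier = np.array([[0.05], [0.10], [0.90], [0.95]])     # one strongly differentiated locus
freqs = np.hstack([neutral, outlier])

fst = fst_per_locus(freqs)
cutoff = np.quantile(fst[:-1], 0.99)        # naive empirical threshold (an assumption)
print("outlier locus FST:", round(float(fst[-1]), 2), "flagged:", bool(fst[-1] > cutoff))
```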

Journal ArticleDOI
TL;DR: This work considers the problem of designing a contaminant warning system for a municipal water distribution network that uses imperfect sensors, which can generate false-positive and false-negative detections, and describes a general exact nonlinear formulation for imperfect sensors and a linear approximation.
Abstract: We consider the problem of designing a contaminant warning system for a municipal water distribution network that uses imperfect sensors, which can generate false-positive and false-negative detections. Although sensor placement optimization methods have been developed for contaminant warning systems, most sensor placement formulations assume perfect sensors, which does not accurately reflect the behavior of real sensor technology. We describe a general exact nonlinear formulation for imperfect sensors and a linear approximation. We consider six general solution strategies, some of which have multiple solution methods. We applied these methods to three test networks, including one with over 10,000 nodes. Our experiments indicate that it is worth deploying a sensor network even when sensors have low detection probability. They also indicate it is worth paying attention to sensor imperfections when placing sensors even when there is a response delay of up to 8 h. The best choice of solution strategy depends upon the user's goals and the problem size. However, for large-scale problems with a moderate number of sensors, using a local search for the linear approximation formulation provides a reasonable-quality solution in a few minutes of computation. Our models assume that sensors can fail via false negatives. Additionally, we discuss ways to model false positives, ways to limit them, and how to trade them off against false negatives. All of our solution methods can handle false positives but our experiments do not explicitly consider them.

Proceedings ArticleDOI
24 Mar 2009
TL;DR: The results show that, compared with Det, PROUD offers a flexible trade-off between false positives and false negatives by controlling a threshold, while maintaining a similar computation cost.
Abstract: We present PROUD -- A PRObabilistic approach to processing similarity queries over Uncertain Data streams, where the data streams here are mainly time series streams. In contrast to data with certainty, an uncertain series is an ordered sequence of random variables. The distance between two uncertain series is also a random variable. We use a general uncertain data model, where only the mean and the deviation of each random variable at each timestamp are available. We derive mathematical conditions for progressively pruning candidates to reduce the computation cost. We then apply PROUD to a streaming environment where only sketches of streams, like wavelet synopses, are available. Extensive experiments are conducted to evaluate the effectiveness of PROUD and compare it with Det, a deterministic approach that directly processes data without considering uncertainty. The results show that, compared with Det, PROUD offers a flexible trade-off between false positives and false negatives by controlling a threshold, while maintaining a similar computation cost. In contrast, Det does not provide such flexibility. This trade-off is important as in some applications false negatives are more costly, while in others, it is more critical to keep the false positives low.

Book ChapterDOI
02 Sep 2009
TL;DR: An adversarial noise model that only limits the number of false observations is considered, and it is shown that any noise-resilient scheme in this model can only approximately reconstruct the sparse vector.
Abstract: We study combinatorial group testing schemes for learning d-sparse boolean vectors using highly unreliable disjunctive measurements. We consider an adversarial noise model that only limits the number of false observations, and show that any noise-resilient scheme in this model can only approximately reconstruct the sparse vector. On the positive side, we give a general framework for construction of highly noise-resilient group testing schemes using randomness condensers. Simple randomized instantiations of this construction give non-adaptive measurement schemes, with m = O(d log n) measurements, that allow efficient reconstruction of d-sparse vectors up to O(d) false positives even in the presence of δm false positives and Ω(m/d) false negatives within the measurement outcomes, for any constant δ < 1. None of these parameters can be substantially improved without dramatically affecting the others. Furthermore, we obtain several explicit (and incomparable) constructions, in particular one matching the randomized trade-off but using m = O(d^{1+o(1)} log n) measurements. We also obtain explicit constructions that allow fast reconstruction in time poly(m), which would be sublinear in n for sufficiently sparse vectors.
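For intuition about the noiseless baseline such schemes strengthen, here is the standard random non-adaptive construction with the simple eliminate-if-seen-in-a-negative-pool decoder (often called COMP); the parameters are arbitrary, and with noiseless outcomes it produces no false negatives but may report false positives:

```python
import numpy as np

rng = np.random.default_rng(7)
n, d, m = 200, 5, 120                        # items, defectives, tests (arbitrary choices)
defectives = set(rng.choice(n, size=d, replace=False))

# Random non-adaptive design: each item joins each pool independently with probability ~1/d.
A = rng.random((m, n)) < 1.0 / d
outcomes = np.array([any(j in defectives for j in np.flatnonzero(row)) for row in A])

# COMP decoding: any item that appears in a negative pool cannot be defective.
candidates = set(range(n))
for row, positive in zip(A, outcomes):
    if not positive:
        candidates -= set(np.flatnonzero(row))

print("true defectives:", sorted(defectives))
print("decoded        :", sorted(candidates))   # always a superset of the true defectives
print("false positives:", len(candidates - defectives),
      "false negatives:", len(defectives - candidates))
```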

Book ChapterDOI
27 Mar 2009
TL;DR: This is the first specification miner with such a low false positive rate, and thus a low associated burden of manual inspection, and the technique identifies which input is most indicative of program behavior, which allows off-the-shelf techniques to learn the same number of specifications using only 60% of their original input.
Abstract: Formal specifications can help with program testing, optimization, refactoring, documentation, and, most importantly, debugging and repair. Unfortunately, formal specifications are difficult to write manually, while techniques that infer specifications automatically suffer from 90-99% false positive rates. Consequently, neither option is currently practical for most software development projects. We present a novel technique that automatically infers partial correctness specifications with a very low false positive rate. We claim that existing specification miners yield false positives because they assign equal weight to all aspects of program behavior. By using additional information from the software engineering process, we are able to dramatically reduce this rate. For example, we grant less credence to duplicate code, infrequently-tested code, and code that exhibits high turnover in the version control system. We evaluate our technique in two ways: as a preprocessing step for an existing specification miner and as part of novel specification inference algorithms. Our technique identifies which input is most indicative of program behavior, which allows off-the-shelf techniques to learn the same number of specifications using only 60% of their original input. Our inference approach has few false positives in practice, while still finding useful specifications on over 800,000 lines of code. When minimizing false alarms, we obtain a 5% false positive rate, an order-of-magnitude improvement over previous work. When used to find bugs, our mined specifications locate over 250 policy violations. To the best of our knowledge, this is the first specification miner with such a low false positive rate, and thus a low associated burden of manual inspection.

Journal ArticleDOI
TL;DR: A new class of regularization is proposed, called recursive elastic net, to increase the capability of the elastic net and estimate gene networks based on the VAR model, which succeeds in reducing the number of false positives drastically while keeping the high number of true positives in the network inference.
Abstract: Inferring gene networks from time-course microarray experiments with a vector autoregressive (VAR) model is the process of identifying functional associations between genes through multivariate time series. This problem can be cast as a variable selection problem in statistics. One of the promising methods for variable selection is the elastic net proposed by Zou and Hastie (2005). However, VAR modeling with the elastic net succeeds in increasing the number of true positives while also increasing the number of false positives. By incorporating the relative importance of the VAR coefficients into the elastic net, we propose a new class of regularization, called recursive elastic net, to increase the capability of the elastic net and estimate gene networks based on the VAR model. The recursive elastic net can reduce the number of false positives gradually by updating the importance. Numerical simulations and comparisons demonstrate that the proposed method succeeds in reducing the number of false positives drastically while keeping a high number of true positives in the network inference, and achieves a true discovery rate (the proportion of true positives among the selected edges) two or more times higher than the competing methods even when the number of time points is small. We also compared our method with various reverse-engineering algorithms on experimental data of MCF-7 breast cancer cells stimulated with two ErbB ligands, EGF and HRG. The recursive elastic net is a powerful tool for inferring gene networks from time-course gene expression profiles.
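The sketch below is only a rough stand-in for the recursive elastic net (it uses scikit-learn's ElasticNet on a tiny simulated VAR process and a single adaptive reweighting pass with made-up penalty settings), but it shows the general mechanism of reusing first-pass coefficient magnitudes to shrink weak edges harder:

```python
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(0)
T, G = 40, 8                                      # time points and genes (kept tiny)
A_true = np.zeros((G, G))
A_true[0, 1], A_true[2, 3] = 0.8, -0.7            # sparse "true" regulatory network

X = np.zeros((T, G))
X[0] = rng.normal(size=G)
for t in range(1, T):                             # simulate a first-order VAR process
    X[t] = X[t - 1] @ A_true.T + rng.normal(size=G)
Y, Z = X[1:], X[:-1]                              # regress x_t on x_{t-1}, one gene at a time

def var_elastic_net(Y, Z, weights=None, alpha=0.05, l1_ratio=0.7):
    """Per-gene elastic net fit of VAR coefficients, optionally with adaptive column weights."""
    W = np.ones((Y.shape[1], Z.shape[1])) if weights is None else weights
    coefs = []
    for g in range(Y.shape[1]):
        fit = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, max_iter=50_000).fit(Z * W[g], Y[:, g])
        coefs.append(fit.coef_ * W[g])            # map back to the original scale
    return np.array(coefs)

A1 = var_elastic_net(Y, Z)                               # first pass: plain elastic net
A2 = var_elastic_net(Y, Z, weights=np.abs(A1) + 1e-3)    # second pass: reweight by importance
for name, A in [("pass 1", A1), ("pass 2", A2)]:
    print(name, "edges with |coef| > 0.05:", int((np.abs(A) > 0.05).sum()))
```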

Patent
19 Jun 2009
TL;DR: In this article, a system, method and computer program product for detection of false positives occurring during execution of anti-malware applications is presented, where the system calculates a probability of detection of a certain potential malware object.
Abstract: A system, method and computer program product for detection of false positives occurring during execution of anti-malware applications. The detection and correction of the false positives is implemented in two phases: before creation of new anti-virus databases (i.e., malware black lists) or new white lists, and after the anti-virus databases or new white lists are created and new false positives are detected. The system calculates a probability of detection of a certain potential malware object. Based on this probability, the system decides to either correct a white list (i.e., a collection of known clean objects) or update a black list (i.e., a collection of known malware objects). The process is separated into several steps: creation and update (or correction) of white lists; creation and update of black lists; and detection of collisions between these lists and correction of black lists or white lists based on the detected collisions.

Journal ArticleDOI
TL;DR: In this article, the authors evaluated the rates and nature of false positives in the CoRoT exoplanets search and compared their results with semi-empirical predictions, and classified the results of the follow-up observations completed to verify their planetary nature.
Abstract: Context. The CoRoT satellite searches for planets by applying the transit method, monitoring up to 12 000 stars in the galactic plane for 150 days in each observing run. This search is contaminated by a large fraction of false positives, caused by different eclipsing binary configurations that might be confused with a transiting planet. Aims. We evaluate the rates and nature of false positives in the CoRoT exoplanets search and compare our results with semiempirical predictions. Methods. We consider the detected binary and planet candidates in the first three extended CoRoT runs, and classify the results of the follow-up observations completed to verify their planetary nature. We group the follow-up results into undiluted binaries, diluted binaries, and planets and compare their abundances with predictions from the literature. Results. 83% of the initial detections are classified as false positives using only the CoRoT light-curves, the remaining 17% require follow-up observations. Finally, 12% of the candidates in the follow-up program are planets. The shape of the overall distribution of the false positive rate follows previous predictions, except for candidates with transit depths below about 0.4%. For candidates with transit depths in the range from 0.1–0.4%, CoRoT detections are nearly complete, and this difference from predictions is probably real and dominated by a lower than expected abundance of diluted eclipsing binaries.

Journal ArticleDOI
TL;DR: A drug approval process that can use post hoc subgroup analysis to eliminate false negatives but does not risk opportunistic behavior and spurious correlation is sought.
Abstract: The FDA employs an average-patient standard when reviewing drugs: it approves a drug only if the average patient (in clinical trials) does better on the drug than on control. It is common, however, for different patients to respond differently to a drug. Therefore, the average-patient standard can result in approval of a drug with significant negative effects for certain patient subgroups (false positives) and disapproval of drugs with significant positive effects for other patient subgroups (false negatives). Drug companies have a financial incentive to avoid false negatives. After their clinical trials reveal that their drug does not benefit the average patient, they conduct what is called post hoc subgroup analysis to highlight patients that benefit from the drug. The FDA rejects such analysis due to the risk of spurious results. With enough data dredging, a drug company can always find some patients that benefit from their drug. This paper asks whether there is a workable compromise between the FDA and drug companies. Specifically, we seek a drug approval process that can use post hoc subgroup analysis to eliminate false negatives but does not risk opportunistic behavior and spurious correlation. We recommend that the FDA or some other independent agent conduct subgroup analysis to identify patient subgroups that may benefit from a drug. Moreover, we suggest a number of statistical algorithms that operate as veil-of-ignorance rules to ensure that the independent agent is not indirectly captured by drug companies. We illustrate our proposal by applying it to the results of a recent clinical trial of a cancer drug (motexafin gadolinium) that was recently rejected by the FDA.

Proceedings ArticleDOI
23 May 2009
TL;DR: Helgrind+ is described, a dynamic race detection tool that incorporates correct handling of condition variables and a combination of the lockset algorithm and happens-before relation that reduces the number of both false negatives and false positives.
Abstract: Finding synchronization defects is difficult due to non-deterministic orderings of parallel threads. Current tools for detecting synchronization defects tend to miss many data races or produce an overwhelming number of false alarms. In this paper, we describe Helgrind+, a dynamic race detection tool that incorporates correct handling of condition variables and a combination of the lockset algorithm and happens-before relation. We compare our techniques with Intel Thread Checker and the original Helgrind tool on two substantial benchmark suites. Helgrind+ reduces the number of both false negatives (missed races) and false positives. The additional accuracy incurs almost no performance overhead.
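Helgrind+'s hybrid of lockset and happens-before is beyond a short example, but the Eraser-style lockset check at its core is easy to sketch (the trace format below is invented):

```python
# Eraser-style lockset checking: the candidate lockset of each shared variable is the
# intersection of the locks held at every access; an empty set signals a possible race.
def lockset_check(trace):
    held = {}        # thread -> set of locks currently held
    candidates = {}  # variable -> candidate lockset
    races = set()
    for op, thread, obj in trace:
        locks = held.setdefault(thread, set())
        if op == "lock":
            locks.add(obj)
        elif op == "unlock":
            locks.discard(obj)
        elif op == "access":
            candidates[obj] = candidates.get(obj, set(locks)) & locks
            if not candidates[obj]:
                races.add(obj)
    return races

good = [("lock", 1, "m"), ("access", 1, "x"), ("unlock", 1, "m"),
        ("lock", 2, "m"), ("access", 2, "x"), ("unlock", 2, "m")]
bad  = good + [("access", 2, "y"), ("lock", 1, "n"), ("access", 1, "y"), ("unlock", 1, "n")]

print(lockset_check(good))   # set()  -- x is consistently protected by lock m
print(lockset_check(bad))    # {'y'}  -- no common lock protects y: potential race
```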

Journal ArticleDOI
TL;DR: An efficient procedure iterating in silico prediction and in vitro or in vivo experimental verification, with sufficient feedback, enabled the identification of novel ligand candidates that were distant from known ligands in chemical space.
Abstract: Predictions of interactions between target proteins and potential leads are of great benefit in the drug discovery process. We present a comprehensively applicable statistical prediction method for interactions between any proteins and chemical compounds, which requires only protein sequence data and chemical structure data and utilizes the statistical learning method of support vector machines. In order to realize reasonable comprehensive predictions which can involve many false positives, we propose two approaches for reduction of false positives: (i) efficient use of multiple statistical prediction models in the framework of two-layer SVM and (ii) reasonable design of the negative data to construct statistical prediction models. In two-layer SVM, outputs produced by the first-layer SVM models, which are constructed with different negative samples and reflect different aspects of classifications, are utilized as inputs to the second-layer SVM. In order to design negative data which produce fewer false positive predictions, we iteratively construct SVM models or classification boundaries from positive and tentative negative samples and select additional negative sample candidates according to pre-determined rules. Moreover, in order to fully utilize the advantages of statistical learning methods, we propose a strategy to effectively feed back experimental results to computational predictions with consideration of biological effects of interest. We show the usefulness of our approach in predicting potential ligands binding to human androgen receptors from more than 19 million chemical compounds and verifying these predictions by in vitro binding. Moreover, we utilize this experimental validation as feedback to enhance subsequent computational predictions, and experimentally validate these predictions again. This efficient procedure of iterating the in silico prediction and the in vitro or in vivo experimental verifications with sufficient feedback enabled us to identify novel ligand candidates which were distant from known ligands in the chemical space.
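A compressed sketch of the two-layer SVM idea (first-layer SVMs trained on different negative samples, their decision values feeding a second-layer SVM), with synthetic vectors standing in for protein/compound features; this shows only the stacking skeleton, not the authors' negative-data design procedure:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for protein+compound feature vectors (1 = interacting pair).
X, y = make_classification(n_samples=3000, n_features=40, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

rng = np.random.default_rng(0)
pos = np.flatnonzero(y_tr == 1)
neg = np.flatnonzero(y_tr == 0)

# First layer: several SVMs, each trained on all positives plus a *different* negative sample.
first_layer = []
for _ in range(5):
    idx = np.concatenate([pos, rng.choice(neg, size=len(pos), replace=False)])
    first_layer.append(SVC(kernel="rbf", gamma="scale").fit(X_tr[idx], y_tr[idx]))

def meta_features(models, X):
    # Each column is one first-layer model's decision value for each sample.
    return np.column_stack([m.decision_function(X) for m in models])

# Second layer: an SVM over the first-layer outputs. For simplicity the same training split
# is reused here; a real pipeline would generate meta-features from held-out folds.
second_layer = SVC(kernel="rbf", gamma="scale").fit(meta_features(first_layer, X_tr), y_tr)
pred = second_layer.predict(meta_features(first_layer, X_te))
print("test accuracy:", round(float((pred == y_te).mean()), 3))
```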

Journal ArticleDOI
TL;DR: Five types of false positives identified when aligning sequences to MS/MS spectra by Mascot database searching software are presented to highlight the importance of using unmatched peaks to remove false positives and offer direction to aid development of better sequence alignment algorithms for peptide and PTM identification.
Abstract: False positives that arise when MS/MS data are used to search protein sequence databases remain a concern in proteomics research. Here, we present five types of false positives identified when aligning sequences to MS/MS spectra by Mascot database searching software. False positives arise because of (1) enzymatic digestion at abnormal sites; (2) misinterpretation of charge states; (3) misinterpretation of protein modifications; (4) incorrect assignment of the protein modification site; and (5) incorrect use of isotopic peaks. We present examples, clearly identified as false positives by manual inspection, that nevertheless were assigned high scores by the Mascot sequence alignment algorithm. In some examples, the sequence assigned to the MS/MS spectrum explains more than 80% of the fragment ions present. Because of high sequence similarity between the false positives and their corresponding true hits, the false positive rate cannot be evaluated by the common method of using a reversed or scrambled sequence database.

Proceedings ArticleDOI
18 Oct 2009
TL;DR: A data mining based real-time method for distinguishing important network IDS alerts from frequently occurring false positives and events of low importance that is fully automated and able to adjust to environment changes without a human intervention.
Abstract: During the last decade, intrusion detection systems (IDSs) have become a widely used measure for security management. However, these systems often generate many false positives and irrelevant alerts. In this paper, we propose a data mining based real-time method for distinguishing important network IDS alerts from frequently occurring false positives and events of low importance. Unlike conventional data mining based approaches, our method is fully automated and able to adjust to environment changes without a human intervention.