Institution
Invincea
About: Invincea is known for research contributions in the topics Malware and Computer security model. The organization has 24 authors who have published 31 publications receiving 4525 citations.
Topics: Malware, Computer security model, Nanopore sequencing, Graph (abstract data type), Cloud computing security
Papers
TL;DR: Canu, a successor of Celera Assembler that is specifically designed for noisy single-molecule sequences, is presented, demonstrating that Canu can reliably assemble complete microbial genomes and near-complete eukaryotic chromosomes using either Pacific Biosciences or Oxford Nanopore technologies.
Abstract: Long-read single-molecule sequencing has revolutionized de novo genome assembly and enabled the automated reconstruction of reference-quality genomes. However, given the relatively high error rates of such technologies, efficient and accurate assembly of large repeats and closely related haplotypes remains challenging. We address these issues with Canu, a successor of Celera Assembler that is specifically designed for noisy single-molecule sequences. Canu introduces support for nanopore sequencing, halves depth-of-coverage requirements, and improves assembly continuity while simultaneously reducing runtime by an order of magnitude on large genomes versus Celera Assembler 8.2. These advances result from new overlapping and assembly algorithms, including an adaptive overlapping strategy based on tf-idf weighted MinHash and a sparse assembly graph construction that avoids collapsing diverged repeats and haplotypes. We demonstrate that Canu can reliably assemble complete microbial genomes and near-complete eukaryotic chromosomes using either Pacific Biosciences (PacBio) or Oxford Nanopore technologies and achieves a contig NG50 of >21 Mbp on both human and Drosophila melanogaster PacBio data sets. For assembly structures that cannot be linearly represented, Canu provides graph-based assembly outputs in graphical fragment assembly (GFA) format for analysis or integration with complementary phasing and scaffolding techniques. The combination of such highly resolved assembly graphs with long-range scaffolding information promises the complete and automated assembly of complex genomes.
4,806 citations
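The abstract above names MinHash as the basis of Canu's overlapping strategy. The following is a minimal sketch of plain (unweighted) MinHash over k-mers, not Canu's tf-idf weighted variant and not its actual implementation. The function names, the k-mer size, and the number of hash functions are all illustrative assumptions.

```python
import hashlib


def kmers(seq, k):
    """Return the set of all length-k substrings of seq (seq must have length >= k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def minhash_sketch(items, num_hashes=64):
    """Keep, for each of num_hashes salted hash functions, the minimum hash over items.

    Hypothetical helper: this is unweighted MinHash; the tf-idf weighting
    mentioned in the abstract is NOT modeled here.
    """
    sketch = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(8, "little")  # a different salt gives a different hash function
        sketch.append(min(
            int.from_bytes(
                hashlib.blake2b(x.encode(), digest_size=8, salt=salt).digest(),
                "little",
            )
            for x in items
        ))
    return sketch


def estimate_jaccard(sketch_a, sketch_b):
    """Fraction of matching sketch positions, an estimate of Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sketch_a, sketch_b))
    return matches / len(sketch_a)


# Usage: two reads that share most of their 4-mers.
read_a = minhash_sketch(kmers("ACGTACGTTAGC", 4))
read_b = minhash_sketch(kmers("ACGTACGTTAGG", 4))
similarity = estimate_jaccard(read_a, read_b)
```

Comparing fixed-size sketches instead of full k-mer sets is what makes candidate overlap detection cheap; the estimate becomes tighter as `num_hashes` grows.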
20 Oct 2015
TL;DR: A deep neural network based malware detection system developed by Invincea is introduced; it achieves a usable detection rate at an extremely low false positive rate and scales to real world training example volumes on commodity hardware.
Abstract: In this paper we introduce a deep neural network based malware detection system that Invincea has developed, which achieves a usable detection rate at an extremely low false positive rate and scales to real world training example volumes on commodity hardware. We show that our system achieves a 95% detection rate at 0.1% false positive rate (FPR), based on more than 400,000 software binaries sourced directly from our customers and internal malware databases. In addition, we describe a non-parametric method for adjusting the classifier’s scores to better represent expected precision in the deployment environment. Our results demonstrate that it is now feasible to quickly train and deploy a low resource, highly accurate machine learning classification model, with false positive rates that approach traditional labor intensive expert rule based malware detection, while also detecting previously unseen malware missed by these traditional approaches. Since machine learning models tend to improve with larger datasizes, we foresee deep neural network classification models gaining in importance as part of a layered network defense strategy in coming years.
438 citations
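The abstract above reports a detection rate at a fixed false positive rate (FPR). As a rough sketch of how such a figure can be computed from classifier scores, the snippet below picks a score threshold from benign samples and then measures the detection rate on malware samples. The function names and the toy scores are assumptions for illustration only; this is not the paper's evaluation code or data.

```python
import math


def threshold_at_fpr(benign_scores, target_fpr):
    """Return a threshold such that at most target_fpr of benign scores lie strictly above it."""
    ranked = sorted(benign_scores, reverse=True)
    # Number of benign samples we may flag; the small epsilon guards against float round-off.
    allowed = math.floor(target_fpr * len(ranked) + 1e-9)
    if allowed >= len(ranked):
        return float("-inf")  # every benign sample may be flagged
    return ranked[allowed]


def detection_rate(malware_scores, threshold):
    """Fraction of malware samples scoring strictly above the threshold."""
    return sum(s > threshold for s in malware_scores) / len(malware_scores)


# Toy data (hypothetical): 1000 benign scores spread evenly over [0, 1).
benign = [i / 1000 for i in range(1000)]
malware = [0.5, 0.95, 0.995, 0.999]

thr = threshold_at_fpr(benign, target_fpr=0.01)
rate = detection_rate(malware, thr)
```

Fixing the FPR first and reading off the detection rate second mirrors how the figure in the abstract is stated: the operating point is chosen by the tolerable false alarm rate, not by overall accuracy.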
TL;DR: In this paper, a deep neural network malware classifier is proposed that achieves a usable detection rate at an extremely low false positive rate and scales to real world training example volumes on commodity hardware.
Abstract: Malware remains a serious problem for corporations, government agencies, and individuals, as attackers continue to use it as a tool to effect frequent and costly network intrusions. Machine learning holds the promise of automating the work required to detect newly discovered malware families, and could potentially learn generalizations about malware and benign software that support the detection of entirely new, unknown malware families. Unfortunately, few proposed machine learning based malware detection methods have achieved the low false positive rates required to deliver deployable detectors.
In this paper we introduce a deep neural network malware classifier that achieves a usable detection rate at an extremely low false positive rate and scales to real world training example volumes on commodity hardware. Specifically, we show that our system achieves a 95% detection rate at 0.1% false positive rate (FPR), based on more than 400,000 software binaries sourced directly from our customers and internal malware databases. We achieve these results by directly learning on all binaries, without any filtering, unpacking, or manually separating binary files into categories. Further, we confirm our false positive rates directly on a live stream of files coming in from Invincea's deployed endpoint solution, provide an estimate of how many new binary files we expect to see per day on an enterprise network, and describe how that relates to the false positive rate and translates into an intuitive threat score.
Our results demonstrate that it is now feasible to quickly train and deploy a low resource, highly accurate machine learning classification model, with false positive rates that approach traditional labor intensive signature based methods, while also detecting previously unseen malware.
342 citations
TL;DR: Canu, a successor of Celera Assembler that is specifically designed for noisy single-molecule sequences, is presented, demonstrating that Canu can reliably assemble complete microbial genomes and near-complete eukaryotic chromosomes using either PacBio or Oxford Nanopore technologies.
Abstract: Long-read single-molecule sequencing has revolutionized de novo genome assembly and enabled the automated reconstruction of reference-quality genomes. However, given the relatively high error rates of such technologies, efficient and accurate assembly of large repeats and closely related haplotypes remains challenging. We address these issues with Canu, a complete reworking of Celera Assembler that is specifically designed for noisy single-molecule sequences. Canu introduces support for nanopore sequencing, halves depth-of-coverage requirements, and improves assembly continuity while simultaneously reducing runtime by an order of magnitude on large genomes. These advances result from new overlapping and assembly algorithms, including an adaptive overlapping strategy based on tf-idf weighted MinHash and a sparse assembly graph construction that avoids collapsing diverged repeats and haplotypes. We demonstrate that Canu can reliably assemble complete microbial genomes and near-complete eukaryotic chromosomes using either PacBio or Oxford Nanopore technologies, and achieves a contig NG50 of greater than 21 Mbp on both human and Drosophila melanogaster PacBio datasets. For assembly structures that cannot be linearly represented, Canu provides graph-based assembly outputs for analysis or integration with complementary phasing and scaffolding techniques. Canu source code and pre-compiled binaries are freely available under a GPLv2 license from https://github.com/marbl/canu.
122 citations
TL;DR: This work focuses on mosquitoes, malaria parasites and vertebrate hosts, because this system offers the opportunity to integrate from genetic and molecular mechanisms to population dynamics and because disrupting rhythms offers a novel avenue for disease control.
Abstract: The 24-h day involves cycles in environmental factors that impact organismal fitness. This is thought to select for organisms to regulate their temporal biology accordingly, through circadian and diel rhythms. In addition to rhythms in abiotic factors (such as light and temperature), biotic factors, including ecological interactions, also follow daily cycles. How daily rhythms shape, and are shaped by, interactions between organisms is poorly understood. Here, we review an emerging area, namely the causes and consequences of daily rhythms in the interactions between vectors, their hosts and the parasites they transmit. We focus on mosquitoes, malaria parasites and vertebrate hosts, because this system offers the opportunity to integrate from genetic and molecular mechanisms to population dynamics and because disrupting rhythms offers a novel avenue for disease control.
78 citations
Authors
Name | H-index | Papers | Citations |
---|---|---|---|
Konstantin Berlin | 18 | 34 | 5515 |
Joshua Saxe | 13 | 25 | 1163 |
Robert Gove | 10 | 18 | 340 |
James E. Gentile | 9 | 23 | 396 |
Scott F. Cosby | 8 | 8 | 302 |
David Slater | 5 | 10 | 133 |
Anup Ghosh | 4 | 4 | 132 |
David Mentis | 3 | 3 | 69 |
Alexander Mason Long | 3 | 3 | 56 |
Kristina Blokhin | 2 | 3 | 34 |
Saxe Joshua Daniel | 2 | 3 | 12 |
Josh Saxe | 2 | 2 | 60 |
Michael Nathan Lack | 2 | 2 | 12 |
Alan Keister | 1 | 1 | 65 |
Christopher Greamo | 1 | 1 | 9 |