scispace - formally typeset
Search or ask a question
Institution

Invincea

About: Invincea is a based out in . It is known for research contribution in the topics: Malware & Computer security model. The organization has 24 authors who have published 31 publications receiving 4525 citations.

Papers
More filters
Journal ArticleDOI
TL;DR: Canu, a successor of Celera Assembler that is specifically designed for noisy single-molecule sequences, is presented, demonstrating that Canu can reliably assemble complete microbial genomes and near-complete eukaryotic chromosomes using either Pacific Biosciences or Oxford Nanopore technologies.
Abstract: Long-read single-molecule sequencing has revolutionized de novo genome assembly and enabled the automated reconstruction of reference-quality genomes. However, given the relatively high error rates of such technologies, efficient and accurate assembly of large repeats and closely related haplotypes remains challenging. We address these issues with Canu, a successor of Celera Assembler that is specifically designed for noisy single-molecule sequences. Canu introduces support for nanopore sequencing, halves depth-of-coverage requirements, and improves assembly continuity while simultaneously reducing runtime by an order of magnitude on large genomes versus Celera Assembler 8.2. These advances result from new overlapping and assembly algorithms, including an adaptive overlapping strategy based on tf-idf weighted MinHash and a sparse assembly graph construction that avoids collapsing diverged repeats and haplotypes. We demonstrate that Canu can reliably assemble complete microbial genomes and near-complete eukaryotic chromosomes using either Pacific Biosciences (PacBio) or Oxford Nanopore technologies and achieves a contig NG50 of >21 Mbp on both human and Drosophila melanogaster PacBio data sets. For assembly structures that cannot be linearly represented, Canu provides graph-based assembly outputs in graphical fragment assembly (GFA) format for analysis or integration with complementary phasing and scaffolding techniques. The combination of such highly resolved assembly graphs with long-range scaffolding information promises the complete and automated assembly of complex genomes.

4,806 citations

Proceedings ArticleDOI
Joshua Saxe1, Konstantin Berlin1
20 Oct 2015
TL;DR: A deep neural network based malware detection system that Invincea has developed is introduced, which achieves a usable detection rate at an extremely low false positive rate and scales to real world training example volumes on commodity hardware.
Abstract: In this paper we introduce a deep neural network based malware detection system that Invincea has developed, which achieves a usable detection rate at an extremely low false positive rate and scales to real world training example volumes on commodity hardware. We show that our system achieves a 95% detection rate at 0.1% false positive rate (FPR), based on more than 400,000 software binaries sourced directly from our customers and internal malware databases. In addition, we describe a non-parametric method for adjusting the classifier’s scores to better represent expected precision in the deployment environment. Our results demonstrate that it is now feasible to quickly train and deploy a low resource, highly accurate machine learning classification model, with false positive rates that approach traditional labor intensive expert rule based malware detection, while also detecting previously unseen malware missed by these traditional approaches. Since machine learning models tend to improve with larger datasizes, we foresee deep neural network classification models gaining in importance as part of a layered network defense strategy in coming years.

438 citations

Posted Content
Joshua Saxe1, Konstantin Berlin1
TL;DR: In this paper, a deep neural network malware classifier is proposed that achieves a usable detection rate at an extremely low false positive rate and scales to real world training example volumes on commodity hardware.
Abstract: Malware remains a serious problem for corporations, government agencies, and individuals, as attackers continue to use it as a tool to effect frequent and costly network intrusions. Machine learning holds the promise of automating the work required to detect newly discovered malware families, and could potentially learn generalizations about malware and benign software that support the detection of entirely new, unknown malware families. Unfortunately, few proposed machine learning based malware detection methods have achieved the low false positive rates required to deliver deployable detectors. In this paper we a deep neural network malware classifier that achieves a usable detection rate at an extremely low false positive rate and scales to real world training example volumes on commodity hardware. Specifically, we show that our system achieves a 95% detection rate at 0.1% false positive rate (FPR), based on more than 400,000 software binaries sourced directly from our customers and internal malware databases. We achieve these results by directly learning on all binaries, without any filtering, unpacking, or manually separating binary files into categories. Further, we confirm our false positive rates directly on a live stream of files coming in from Invincea's deployed endpoint solution, provide an estimate of how many new binary files we expected to see a day on an enterprise network, and describe how that relates to the false positive rate and translates into an intuitive threat score. Our results demonstrate that it is now feasible to quickly train and deploy a low resource, highly accurate machine learning classification model, with false positive rates that approach traditional labor intensive signature based methods, while also detecting previously unseen malware.

342 citations

Posted ContentDOI
24 Aug 2016-bioRxiv
TL;DR: Canu, a successor of Celera Assembler that is specifically designed for noisy single-molecule sequences, is presented, demonstrating that Canu can reliably assemble complete microbial genomes and near-complete eukaryotic chromosomes using either PacBio or Oxford Nanopore technologies.
Abstract: Long-read single-molecule sequencing has revolutionized de novo genome assembly and enabled the automated reconstruction of reference-quality genomes. However, given the relatively high error rates of such technologies, efficient and accurate assembly of large repeats and closely related haplotypes remains challenging. We address these issues with Canu, a complete reworking of Celera Assembler that is specifically designed for noisy single-molecule sequences. Canu introduces support for nanopore sequencing, halves depth-of-coverage requirements, and improves assembly continuity while simultaneously reducing runtime by an order of magnitude on large genomes. These advances result from new overlapping and assembly algorithms, including an adaptive overlapping strategy based on tf-idf weighted MinHash and a sparse assembly graph construction that avoids collapsing diverged repeats and haplotypes. We demonstrate that Canu can reliably assemble complete microbial genomes and near-complete eukaryotic chromosomes using either PacBio or Oxford Nanopore technologies, and achieves a contig NG50 of greater than 21 Mbp on both human and Drosophila melanogaster PacBio datasets. For assembly structures that cannot be linearly represented, Canu provides graph-based assembly outputs for analysis or integration with complementary phasing and scaffolding techniques. Canu source code and pre-compiled binaries are freely available under a GPLv2 license from https://github.com/marbl/canu.

122 citations

Journal ArticleDOI
14 Apr 2016-Insects
TL;DR: This work focuses on mosquitoes, malaria parasites and vertebrate hosts, because this system offers the opportunity to integrate from genetic and molecular mechanisms to population dynamics and because disrupting rhythms offers a novel avenue for disease control.
Abstract: The 24-h day involves cycles in environmental factors that impact organismal fitness. This is thought to select for organisms to regulate their temporal biology accordingly, through circadian and diel rhythms. In addition to rhythms in abiotic factors (such as light and temperature), biotic factors, including ecological interactions, also follow daily cycles. How daily rhythms shape, and are shaped by, interactions between organisms is poorly understood. Here, we review an emerging area, namely the causes and consequences of daily rhythms in the interactions between vectors, their hosts and the parasites they transmit. We focus on mosquitoes, malaria parasites and vertebrate hosts, because this system offers the opportunity to integrate from genetic and molecular mechanisms to population dynamics and because disrupting rhythms offers a novel avenue for disease control.

78 citations


Network Information
Related Institutions (5)
Information Sciences Institute
2K papers, 114.6K citations

79% related

National Institute of Informatics
7.4K papers, 129.3K citations

77% related

Helsinki Institute for Information Technology
1.9K papers, 63.4K citations

75% related

BBN Technologies
3.1K papers, 158.5K citations

75% related

Performance
Metrics
No. of papers from the Institution in previous years
YearPapers
20202
20192
20181
20172
20168
20156