ALLPATHS 2: small genomes assembled accurately and with high continuity from short paired reads
Citations
5,850 citations
Cites methods from "ALLPATHS 2: small genomes assembled..."
...We used whole genome sequencing data from Staphylococcus aureus subspecies aureus strain USA 300 TCH 1516 sequenced by MacCallum et al. (2009) and retrieved from the GAGE-B repository (http://ccb.jhu.edu/gage_b/)....
[...]
3,270 citations
Cites methods from "ALLPATHS 2: small genomes assembled..."
...overlap and DNA fragment sizes as well as the following two empirical datasets: (1) deep sequencing data of the Staphylococcus aureus genome by MacCallum et al. (2009), (2) reads generated from paired-end sequencing of a known single sequence (template) used by Masella et al. (2012) to test...
[...]
1,616 citations
Cites background or methods from "ALLPATHS 2: small genomes assembled..."
...We developed several laboratory techniques for making the libraries (see SI Materials and Methods for details): (i) For fragments, we adapted existing protocols with the goal of improving the representation of high GC-content DNA; (ii) for short jumps (∼3 kb), we used the Illumina protocol (6); (iii) for long jumps (∼6 kb), we used a protocol that we had previously developed, on the basis of a protocol for the SOLiD sequencing platform that involves circularization and EcoP15I digestion (7, 9); and (iv) for Fosmid jumps (∼40 kb), we developed two methodologies, “ShARC” and “Fosill” (described in SI Materials and Methods)....
[...]
...For this purpose, we made extensive improvements to our previous program ALLPATHS (9, 16), which can routinely assemble small genomes....
[...]
...Scaffold accuracy: Validity at 100 kb (9): We report the probability that two 100-base sequences in the assembly, separated by 100 kb, and also present in the reference, have the same orientation and are separated by 100 kb ± 10%....
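The validity metric quoted above can be illustrated with a minimal sketch. It assumes the probe sequences have already been sampled from the assembly at the stated separation and aligned to a single reference sequence; the `Placement` type and the `validity_at_separation` function are hypothetical names introduced here, and "same orientation" is simplified to "both probes map to the same reference strand". This is not the cited authors' implementation.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Placement:
    """Hypothetical record of where one 100-base probe maps on the reference."""
    position: int  # 0-based start coordinate on the reference
    strand: str    # "+" or "-"


ProbePair = Tuple[Optional[Placement], Optional[Placement]]


def validity_at_separation(
    pairs: Iterable[ProbePair],
    separation: int = 100_000,
    tolerance: float = 0.10,
) -> Optional[float]:
    """Fraction of probe pairs, both present in the reference, that keep
    the same orientation and lie within separation +/- tolerance.

    A probe that is absent from the reference is passed as None, and the
    pair is then excluded from the denominator.
    """
    eligible = 0
    valid = 0
    for first, second in pairs:
        if first is None or second is None:
            continue  # only pairs present in the reference are scored
        eligible += 1
        same_orientation = first.strand == second.strand
        distance = abs(second.position - first.position)
        within_tolerance = abs(distance - separation) <= tolerance * separation
        if same_orientation and within_tolerance:
            valid += 1
    return valid / eligible if eligible else None
```

Returning `None` when no pair is eligible avoids reporting a misleading 0% or 100% for an assembly in which no probe pair could be placed on the reference.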
[...]
...In practice, however, recalcitrant sequence contexts (including those with low and high GC content) do cause low coverage (9, 18), sometimes even to zero....
[...]
1,176 citations
Cites background from "ALLPATHS 2: small genomes assembled..."
...It was published with results on simulated data [57] and revised for real data [58]....
[...]
References
9,389 citations
"ALLPATHS 2: small genomes assembled..." refers background or methods in this paper
...To understand how the ALLPATHS assemblies would compare to assemblies produced by existing software, we also assembled the identical datasets with Velvet [12] and EULER-SR [9,14], using standardized arguments for each assembler applied to all five genomes....
[...]
...Recent work has begun to explore the possibilities of short read assembly [6-14], but high-quality assembly from experimentally generated paired reads has not been demonstrated, even for small genomes....
[...]
...We also ran the assembly programs Velvet [12] and EULER-SR [9,14] on the same data sets and provide a side-by-side comparison....
[...]
3,802 citations
"ALLPATHS 2: small genomes assembled..." refers background or methods in this paper
...For example, a recent method [5] based on blunt end ligation rather than restriction generates jumping construct libraries of sufficient complexity for large genomes and has no hard size limit on end reads....
[...]
...Background: Recent advances in sequencing technology [1-5] have rapidly driven down the cost of DNA sequence data....
[...]
...The data for the assemblies were of three types: paired 36-base reads [5] derived from approximately 200-bp fragments, paired 26-base reads derived via a 'jumping' construction from approximately 4,000-bp fragments, and for one genome, additional unpaired 36-base reads....
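To make the read and fragment sizes quoted above concrete, the following sketch computes how many bases of each fragment lie between the two reads of a pair and are therefore never sequenced directly. The function name `unsequenced_gap` is hypothetical, and the results are only approximate because the fragment sizes themselves are stated as approximate.

```python
def unsequenced_gap(fragment_length: int, read_length: int) -> int:
    """Bases of a fragment not covered by either read of a pair.

    Assumes one read of `read_length` bases from each end of the fragment.
    A negative result would mean the two reads overlap.
    """
    return fragment_length - 2 * read_length


if __name__ == "__main__":
    # Library parameters as quoted in the snippet above.
    libraries = {
        "fragment pairs (36-base reads, ~200-bp fragments)": (200, 36),
        "jumping pairs (26-base reads, ~4,000-bp fragments)": (4000, 26),
    }
    for name, (fragment_length, read_length) in libraries.items():
        gap = unsequenced_gap(fragment_length, read_length)
        print(f"{name}: about {gap} bases between the reads")
```

Under these assumptions the fragment pairs leave roughly 128 bases unread between the two reads, while the jumping pairs span roughly 3,948 bases, which is what lets them link sequence across longer distances.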
[...]