Highly improved homopolymer aware nucleotide-protein alignments with 454 data
Citations
41 citations
Cites methods from "Highly improved homopolymer aware n..."
...Next, the HAXAT program (Lysholm, 2012) was applied to the sequences (against a custom-built database of viral phoH sequences) in order to correct homopolymer sequence errors (using default parameters, except that both strands were queried and a minimum score of 200 was used)....
[...]
32 citations
Cites background from "Highly improved homopolymer aware n..."
...We should also mention HAXAT, which accurately aligns Roche 454 DNA sequences to proteins allowing for frameshifts, but is not designed for large-scale searches (Lysholm, 2012)....
[...]
23 citations
5 citations
Cites methods from "Highly improved homopolymer aware n..."
...[26] Later algorithms by Huang and others improved the space complexity and also covered more possible mutations like intra-codon frameshifts. [27,28] The base for all these dynamic programming algorithms is the recursive formula introduced by Smith and Waterman [24]:...
[...]
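The excerpt above breaks off before the formula itself. As a rough illustration only, the widely taught textbook form of the Smith-Waterman local alignment recursion with a linear gap penalty can be sketched as follows. This is not necessarily the formulation used in the citing paper, and it does not model frameshifts or homopolymer errors. The function name and the scoring values (`match`, `mismatch`, `gap`) are assumptions chosen for the example.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score of a and b (textbook form, linear gaps).

    The scoring values are illustrative assumptions, not taken from any
    of the cited papers.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # The 0 option lets an alignment restart anywhere (local alignment).
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            best = max(best, H[i][j])
    return best
```

The zero floor in the `max` is what makes the alignment local rather than global: a poorly scoring prefix is discarded instead of being carried forward.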
2 citations
References
8,434 citations
"Highly improved homopolymer aware n..." refers background or methods in this paper
...The intensity of each flow of nucleotide reagents is recorded as a so-called flowpeak and collected in a flowgram [1]....
[...]
...One of the early next-generation techniques is Roche 454 sequencing [1], currently extensively used as it produces longer sequence reads compared to other platforms....
[...]
...The GS20 brought a huge leap in performance over present Sanger based techniques and produced around 20 million bases (Mb) per run [1]....
[...]
...As the homopolymer length is estimated from peak intensity, occasional uncertainties occur [1]....
[...]
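The excerpts above describe how homopolymer lengths are estimated from flowpeak intensities. A minimal sketch of that idea, assuming a cyclic flow order of T, A, C, G and simple rounding of each intensity to the nearest integer, is shown here. The function name, the flow order default and the rounding rule are assumptions for illustration and are not taken from the paper.

```python
def flowgram_to_sequence(flow_values, flow_order="TACG"):
    """Naive base calling: round each flowpeak to a homopolymer length.

    flow_order and the round-half-up rule are assumptions made for this
    sketch, not the base caller of any real instrument.
    """
    bases = []
    for i, intensity in enumerate(flow_values):
        base = flow_order[i % len(flow_order)]
        # An intensity near n suggests a run of n identical bases.
        run_length = max(0, int(intensity + 0.5))
        bases.append(base * run_length)
    return "".join(bases)
```

With this rule, an intensity of 2.49 is called as a run of two bases and 2.5 as a run of three, which shows how a small shift in intensity near a half-integer changes the homopolymer length and produces the kind of uncertainty the excerpt mentions.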
8,326 citations
"Highly improved homopolymer aware n..." refers methods in this paper
...Kent WJ: BLAT–the BLAST-like alignment tool....
[...]
...Later, methods using improved heuristics, and/or innovative implementations to more efficiently utilise computer memory and processors have been proposed, e.g. MegaBLAST [10], SSAHA2 [11], BLAT [12] and Smith-Waterman-Gotoh alignments utilising SIMD instructions [13]....
[...]
6,553 citations
"Highly improved homopolymer aware n..." refers methods in this paper
...Each mutation position was randomly chosen, while the substitution frequencies were proportional to the corresponding BLOSUM(X) substitution probabilities [23], where X denotes the clustering identity threshold used....
[...]
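The mutation procedure in the excerpt above (random positions, substitutions drawn in proportion to substitution probabilities) could be sketched as follows. The weight table is a small made-up example, not real BLOSUM values, and the function name and signature are hypothetical.

```python
import random

# Toy substitution weights for a five-letter alphabet. These numbers are
# invented for illustration and are NOT real BLOSUM probabilities.
TOY_SUBSTITUTION_WEIGHTS = {
    "A": {"S": 0.5, "G": 0.3, "T": 0.2},
    "S": {"A": 0.4, "T": 0.4, "N": 0.2},
    "G": {"A": 0.6, "S": 0.4},
    "T": {"S": 0.6, "A": 0.4},
    "N": {"S": 0.5, "T": 0.5},
}

def mutate(protein, n_mutations, weights, rng):
    """Substitute n_mutations distinct, randomly chosen positions."""
    residues = list(protein)
    # Each mutation position is chosen at random, without repeats.
    positions = rng.sample(range(len(residues)), n_mutations)
    for pos in positions:
        targets = weights[residues[pos]]
        # Draw the replacement in proportion to its substitution weight.
        residues[pos] = rng.choices(
            list(targets), weights=list(targets.values()), k=1
        )[0]
    return "".join(residues)
```

Because each residue's target table excludes the residue itself, every selected position really changes, so the number of differences from the input equals `n_mutations`.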
4,628 citations
"Highly improved homopolymer aware n..." refers methods in this paper
...MegaBLAST [10], SSAHA2 [11], BLAT [12] and Smith-Waterman-Gotoh alignments utilising SIMD instructions [13]....
[...]
4,522 citations
"Highly improved homopolymer aware n..." refers methods in this paper
...The parameter set which results in the best mean MCC is highlighted in bold....
...low identity targets....
[...]
...For example, at 40% identity with flowpeak information, a MCC-value of 0.68 was obtained, compared to 0.41 without the aid of flowpeak information....
[...]
...True/false positives/negatives were defined as before and the algorithm efficiency was evaluated by MCC, see Figure 5....
[...]
...This is clear from Table 3 where, even at 100% protein identity and using flowpeak information, the best MCC-value was 0.862....
[...]
...As the parameter combinations evaluated did not span all possible values of all parameters, some combinations received lower MCC-values than expected....
[...]
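The MCC values quoted in these excerpts are Matthews correlation coefficients computed from counts of true and false positives and negatives. A minimal sketch of the standard formula follows. How the paper counts a positive or negative is not given in the excerpts, so the counts passed in here are left to the caller, and returning 0.0 for a zero denominator is a common convention assumed for this example.

```python
import math

def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumed convention: undefined cases are reported as 0.0.
        return 0.0
    return (tp * tn - fp * fn) / denominator
```

The result ranges from -1 (every prediction wrong) through 0 (no better than chance) to 1 (every prediction right), which is the scale on which the quoted values of 0.41, 0.68 and 0.862 should be read.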