An Improved Search Algorithm to Find G-Quadruplexes in Genome Sequences
TL;DR: An improved (broadened) GQ-search algorithm is developed that accounts for the recently reported new types of GQs, and structural studies confirm the GQ-forming potential of naturally occurring and model single-stranded DNA fragments defying the G3+NL1G3+NL2G3+NL3G3+ formula.
Abstract: A growing body of data suggests that the secondary structures adopted by G-rich polynucleotides may be more diverse than previously thought and that the definition of G-quadruplex-forming sequences should be broadened. We studied solution structures of a series of naturally occurring and model single-stranded DNA fragments defying the G3+NL1G3+NL2G3+NL3G3+ formula, which is used in most of the current GQ-search algorithms. The results confirm the GQ-forming potential of such sequences and suggest the existence of new types of GQs. We developed an improved (broadened) GQ-search algorithm (http://niifhm.ru/nauchnye-issledovanija/otdel-molekuljarnoj-biologii-i-genetiki/laboratorija-iskusstvennogo-antitelogeneza/497-2/) that accounts for the recently reported new types of GQs.
Summary
- Non-canonical polynucleotide structures play an important role in biogenesis processes, such as transcription, DNA repair, replication, translocation and RNA splicing (Saini et al. 2013).
- A clear view of DNA/RNA secondary structures and dynamics is necessary to understand the mechanisms of genomic regulation and to identify new biomarkers of pathology and drug targets.
- GQs with mismatches (mGQs) contain one or more substitutions of G for other nucleotides in the tetrads.
- (The mismatching nucleosides may participate in stacking).
- The authors present here the first GQ-search tool, ImGQfinder, that accounts for noncanonical (‘imperfect’) quadruplex structures (imGQs, i.e., GQs with bulges (bGQs) and mGQs) in addition to canonical GQs.
ImGQ-motif definition and ImGQfinder interface (algorithm implementation)
- The imGQ motif definition for imGQs with single defects is presented in Table 1.
- Some imGQs may also turn out to be ‘perfect’.
- The graphical user interface was developed using the Tk library.
- The inputs include the queried nucleotide sequence in fasta format, the number of tetrads and defects and the maximum loop length.
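The search described above can be illustrated with a minimal Python sketch. It is not the ImGQfinder implementation: the function names (`read_fasta`, `find_sites`) are hypothetical, and the defect model is a simplifying assumption in which a single G-run carries either one internal G substitution (mismatch) or one inserted non-G nucleotide (bulge). The parameters mirror the inputs listed above: the number of tetrads, the number of defects (0 or 1 here) and the maximum loop length.

```python
import re

def read_fasta(text):
    """Return the concatenated sequence of the first FASTA record, uppercased."""
    lines = [ln.strip() for ln in text.strip().splitlines()]
    seq = []
    for ln in lines[1:] if lines and lines[0].startswith(">") else lines:
        if ln.startswith(">"):
            break  # stop at the next record
        seq.append(ln)
    return "".join(seq).upper()

def _patterns(tetrads, max_loop, defects):
    """Build regex strings for canonical and (optionally) single-defect motifs."""
    run = "G" * tetrads
    loop = "[ACGT]{1,%d}?" % max_loop
    patterns = [loop.join([run] * 4)]  # canonical: four perfect G-runs
    if defects >= 1:
        # Assumed defect model: mismatch at an internal run position,
        # or a one-nucleotide bulge between two Gs of a run.
        mismatch = ["G" * i + "[ACT]" + "G" * (tetrads - i - 1)
                    for i in range(1, tetrads - 1)]
        bulge = ["G" * i + "[ACT]" + "G" * (tetrads - i)
                 for i in range(1, tetrads)]
        for bad in mismatch + bulge:
            for k in range(4):  # which of the four runs carries the defect
                runs = [run] * 4
                runs[k] = bad
                patterns.append(loop.join(runs))
    return patterns

def find_sites(seq, tetrads=3, max_loop=7, defects=0):
    """Return sorted (start, matched_sequence) pairs; overlapping starts allowed."""
    seq = seq.upper()
    hits = set()
    for p in _patterns(tetrads, max_loop, defects):
        for m in re.finditer("(?=(%s))" % p, seq):  # lookahead keeps overlaps
            hits.add((m.start(), m.group(1)))
    return sorted(hits)

if __name__ == "__main__":
    record = ">example\nTTGGGAGAGTGGGAGGGTT\n"
    print(find_sites(read_fasta(record), defects=0))
    print(find_sites(read_fasta(record), defects=1))
```

In this sketch the example sequence has no canonical match, because its second G-run is GAG, but it is reported once a single defect is allowed. Restricting mismatches to internal run positions keeps the patterns unambiguous; the motif definition in Table 1 of the paper may differ.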
Structural studies (algorithm verification)
- Such structures are still relatively new, and there are few examples of well-characterized imGQs.
- The sequences Bcl, Ct1 and PSTP were taken from the human genome.
- For the UV-melting profiles, molecularity analysis, CD spectra of the model ONs G3, G4 and their mutants and all the corresponding experimental procedures, see the supporting information.
- The authors only considered 4-tetrad GQs and imGQs, which are generally more stable than 2- and 3-tetrad GQs according to the literature and their own physicochemical data.
- As expected, imGQs are substantially more abundant than GQs (Table 3).
- A new GQ-search algorithm, which is based on a broadened definition of quadruplex-forming sequences, and the user-friendly online tool ImGQfinder were developed.
- The algorithm was verified by structural studies of a series of ONs whose imGQ-forming potential was predicted by the algorithm.
- G3 demonstrated extreme stability in potassium, and its stability was even superior to that of G4.
- Importantly, the physicochemical properties of imGQs and GQs, such as the thermal stability under physiological conditions, appear to be rather similar.
- Large clusters of putative GQ/imGQ sites were found in the introns near the intron/exon boundaries and in the promoters that are approximately 100 bp downstream of the transcription start site.
- The results of several recent studies suggest 5’-UTR GQ participation in transcription and translational regulation (Huppert et al. 2008).
- The ON synthesis and purification, the MS analysis and the UV-melting, CD and rotational relaxation time measurements were performed as previously described (Varizhuk et al. 2013).
- For the analysis of GQ/imGQ abundance and distribution in the human genome, RefSeq genomic sequences (http://www.ncbi.nlm.nih.gov/refseq/rsg/about/) were used.
- The 1H chemical shifts were referenced relative to an external standard, sodium 2,2-dimethyl-2-silapentane-5-sulfonate (DSS).
- The spectra were recorded using presaturation or pulsed-field gradient WATERGATE W5 pulse sequences (zgprsp and zggpw5 from the Bruker library, respectively) for H2O suppression.
- The starting positions of the GQ core atoms were obtained from the PDB (139D and 2KQH).
- The core of every GQ was created using SwissPDB Viewer.
- The electrostatic contribution to the hydration energy Gpolar was computed using the Generalized Born (GB) method (Onufriev et al. 2000) with the algorithm developed by Onufriev et al. (Weiser et al. 1999; Onufriev et al. 2002) for calculating the effective Born radii.
- SASA was computed using the LCPO method (Srinivasan et al. 1998) with α = 0.00542 kcal mol⁻¹ Å⁻². The entropic term was not calculated explicitly, but it was accounted for implicitly via the GQ conformational mobility.
- Snapshots taken from a single trajectory of the MD simulation of the complex were used for the calculations of the binding free energy.
- For each of the two imGQ types, a single example is shown.
- Both mGQ and bGQ structures are diverse and can theoretically contain more than one mismatch or bulge.
- Figure 2. Left: CD spectra of the ONs Bcl, Ct1 and their mutants. The ellipticity is given per mole of nucleotide. Right: 1H NMR spectra fragments of several GQ- and imGQ-ONs.
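The free energy terms mentioned in the methods bullets above can be combined as in the following minimal Python sketch. It assumes the common MM-GBSA form, in which each snapshot contributes a molecular-mechanics energy, the GB polar hydration energy and a nonpolar term equal to α × SASA, and the result is averaged over snapshots. The `Snapshot` fields, the function names, the absence of a constant offset in the nonpolar term and the numerical values are illustrative assumptions, not data or code from the paper; only α = 0.00542 kcal mol⁻¹ Å⁻² is taken from the text.

```python
from dataclasses import dataclass
from statistics import mean

ALPHA = 0.00542  # kcal mol^-1 Å^-2, surface tension coefficient quoted in the text

@dataclass
class Snapshot:
    e_mm: float     # molecular-mechanics energy, kcal/mol (assumed term)
    g_polar: float  # GB polar hydration energy, kcal/mol
    sasa: float     # solvent-accessible surface area, Å^2

def nonpolar_energy(sasa, alpha=ALPHA):
    """Nonpolar hydration term, assumed here to be alpha * SASA with no offset."""
    return alpha * sasa

def snapshot_energy(s):
    """Per-snapshot energy; the entropic term is omitted, as in the text."""
    return s.e_mm + s.g_polar + nonpolar_energy(s.sasa)

def mean_free_energy(snapshots):
    """Average the per-snapshot energies over a single MD trajectory."""
    return mean(snapshot_energy(s) for s in snapshots)

if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    traj = [Snapshot(-100.0, -50.0, 1000.0), Snapshot(-110.0, -40.0, 2000.0)]
    print(round(mean_free_energy(traj), 2))
```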
Cites background from "An Improved Search Algorithm to Fin..."
...Several tools have been released which attempt to include these sequences amongst matches (Varizhuk et al., 2014; Dhapola and Chowdhury, 2016; Hon et al., 2017)....
"An Improved Search Algorithm to Fin..." refers result in this paper
...The obtained value (359 k) is close to the previous estimations (376 k) (Huppert and Balasubramanian 2005)....
"An Improved Search Algorithm to Fin..." refers methods in this paper
...The electrostatic contribution to the hydration energy Gpolar was computed using the Generalized Born (GB) method (Onufriev et al. 2000) using the algorithm developed by Onufriev et al. (Weiser et al. 1999; Onufriev et al. 2002) for calculating the effective Born radii....
...All currently available online search tools for GQs (Quad finder (Scaria et al. 2006), QGRS Mapper (Kikin et al. 2006) and QGRS predictor (Menendez et al. 2012)) employ the G3+NL1G3+NL2G3+NL3G3+ formula, which only defines canonical (‘perfect’) GQs. Figure 1....
Frequently Asked Questions (1)
Q1. What are the contributions mentioned in the paper "An Improved Search Algorithm to Find G-Quadruplexes in Genome Sequences"?
The paper presents ImGQfinder, a search tool that accounts for noncanonical quadruplex structures (imGQs) in addition to canonical GQs.