Open Access Journal Article

Redefining the structural motifs that determine RNA binding and RNA editing by pentatricopeptide repeat proteins in land plants.

TLDR
The super-helical RNA-binding surface of RNA-editing factors is shown to be potentially longer than previously recognised, and the redefined motifs are used to develop accurate and consistent annotations of PPR sequences from 109 genomes.
Abstract
The pentatricopeptide repeat (PPR) proteins form one of the largest protein families in land plants. They are characterised by tandem 30-40 amino acid motifs that form an extended binding surface capable of sequence-specific recognition of RNA strands. Almost all of them are post-translationally targeted to plastids and mitochondria, where they play important roles in post-transcriptional processes including splicing, RNA editing and the initiation of translation. A code describing how PPR proteins recognise their RNA targets promises to accelerate research on these proteins, but making use of this code requires accurate definition and annotation of all of the various nucleotide-binding motifs in each protein. We have used a structural modelling approach to define 10 different variants of the PPR motif found in plant proteins, in addition to the putative deaminase motif that is found at the C-terminus of many RNA-editing factors. We show that the super-helical RNA-binding surface of RNA-editing factors is potentially longer than previously recognised. We used the redefined motifs to develop accurate and consistent annotations of PPR sequences from 109 genomes. We report a high error rate in PPR gene models in many public plant proteomes, due to gene fusions and insertions of spurious introns. These consistently annotated datasets across a wide range of species are valuable resources for future comparative genomics studies, and an essential pre-requisite for accurate large-scale computational predictions of PPR targets. We have created a web portal (http://www.plantppr.com) that provides open access to these resources for the community.
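
To illustrate why accurate motif annotation matters for applying a recognition code, the following is a minimal sketch of a per-motif lookup, not the authors' method. It assumes that each annotated motif contributes one nucleotide, chosen from the residues at the fifth and last positions of the motif. The position convention, the four pairings in `PPR_CODE`, and the helpers `make_motif` and `predict_target` are all assumptions made for illustration. The pairings are combinations commonly cited in the PPR literature and are not stated in the abstract above.

```python
# Illustrative lookup only. These (fifth residue, last residue) pairings are
# commonly cited in the PPR literature; they are NOT taken from the abstract.
PPR_CODE = {
    ("T", "D"): "G",
    ("T", "N"): "A",
    ("N", "D"): "U",
    ("N", "S"): "C",
}


def make_motif(fifth, last, length=35):
    """Build a synthetic placeholder motif (hypothetical, not a real sequence)."""
    return "A" * 4 + fifth + "A" * (length - 6) + last


def predict_target(motifs):
    """Return one predicted base per motif, or '?' when the pair is not in the table."""
    bases = []
    for motif in motifs:
        if len(motif) < 5:
            # Too short to have a fifth residue, e.g. a truncated annotation.
            bases.append("?")
            continue
        key = (motif[4], motif[-1])
        bases.append(PPR_CODE.get(key, "?"))
    return "".join(bases)


motifs = [make_motif("T", "D"), make_motif("N", "S"), make_motif("K", "K")]
print(predict_target(motifs))
```

Because the lookup reads fixed positions within each motif, a motif boundary that is shifted by even one residue changes the key and therefore the predicted base, which is why consistent motif definitions are a prerequisite for large-scale target prediction.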
