Author

Michael Tristem

Bio: Michael Tristem is an academic researcher at Imperial College London. The author has contributed to research on topics including Endogenous retrovirus and Paleovirology. He has an h-index of 28 and has co-authored 45 publications receiving 3,918 citations.

Papers
Journal Article · DOI
01 Jan 2000 · Heredity
TL;DR: The authors set themselves the tasks of showing how evolutionary information is written into gene sequences and of describing the methods by which such information can be recovered; they succeed, providing a wealth of examples and setting many of them in historical perspective.
Abstract: The authors set themselves the tasks of showing how evolutionary information is written into gene sequences and describing the methods by which such information can be recovered. In this quite excellent book they succeed, providing a wealth of examples and setting many of them in a historical perspective.

517 citations

Journal Article · DOI
TL;DR: This finding strongly suggests that the proliferation of the human ERV family HERV-K(HML2) has been almost entirely due to germ-line reinfection, rather than retrotransposition in cis or complementation in trans, and that an infectious pool of endogenous retroviruses has persisted within the primate lineage throughout the past 30 million years.
Abstract: Endogenous retrovirus (ERV) families are derived from their exogenous counterparts by means of a process of germ-line infection and proliferation within the host genome. Several families in the human and mouse genomes now consist of many hundreds of elements and, although several candidates have been proposed, the mechanism behind this proliferation has remained uncertain. To investigate this mechanism, we reconstructed the ratio of nonsynonymous to synonymous changes and the acquisition of stop codons during the evolution of the human ERV family HERV-K(HML2). We show that all genes, including the env gene, which is necessary only for movement between cells, have been under continuous purifying selection. This finding strongly suggests that the proliferation of this family has been almost entirely due to germ-line reinfection, rather than retrotransposition in cis or complementation in trans, and that an infectious pool of endogenous retroviruses has persisted within the primate lineage throughout the past 30 million years. Because many elements within this pool would have been unfixed, it is possible that the HERV-K(HML2) family still contains infectious elements at present, despite their apparent absence in the human genome sequence. Analysis of the env gene of eight other HERV families indicated that reinfection is likely to be the most common mechanism by which endogenous retroviruses proliferate in their hosts.
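
The selection test described above can be illustrated with a small worked sketch. The snippet below is an assumed, simplified stand-in for the paper's actual analysis: it counts synonymous and nonsynonymous sites and differences between two aligned coding sequences and reports a crude omega = dN/dS, where values well below 1 indicate purifying selection. The toy sequences and the proportion-based counting are illustrative only.

```python
# Minimal dN/dS sketch (assumed, not the paper's pipeline): compare two aligned,
# in-frame coding sequences codon by codon, count nonsynonymous vs. synonymous
# sites and differences, and take the ratio. omega well below 1 suggests
# purifying selection, the pattern reported for HERV-K(HML2) genes above.
from itertools import product

# Standard genetic code (DNA codons -> one-letter amino acids, '*' = stop).
CODON_TABLE = {
    "".join(c): aa for c, aa in zip(
        product("TCAG", repeat=3),
        "FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
        "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
}

def site_counts(codon):
    """Expected numbers of (nonsynonymous, synonymous) sites in one codon."""
    n = s = 0.0
    for pos in range(3):
        for base in "TCAG":
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                s += 1 / 3
            else:
                n += 1 / 3
    return n, s

def dn_ds(seq1, seq2):
    """Crude proportion-based dN/dS for two aligned, in-frame sequences."""
    N = S = nd = sd = 0.0
    for i in range(0, len(seq1) - 2, 3):
        c1, c2 = seq1[i:i + 3], seq2[i:i + 3]
        n1, s1 = site_counts(c1)
        n2, s2 = site_counts(c2)
        N += (n1 + n2) / 2
        S += (s1 + s2) / 2
        for p in range(3):                      # classify each mismatched position
            if c1[p] != c2[p]:
                one_step = c1[:p] + c2[p] + c1[p + 1:]
                if CODON_TABLE[one_step] == CODON_TABLE[c1]:
                    sd += 1
                else:
                    nd += 1
    return (nd / N) / (sd / S)                  # pN / pS as a stand-in for dN/dS

# Toy pair with one synonymous and one nonsynonymous difference: prints a value
# well below 1, the purifying-selection signature.
print(dn_ds("ATGGCTAAAGGT", "ATGGCAAGAGGT"))
```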

374 citations

Journal Article · DOI
TL;DR: The evolution of ERV lineages is discussed, considering the processes by which ERV distribution and diversity is generated, and the relevance of ERVs to studies of genome evolution, host disease and viral ecology is considered.
Abstract: The retroviral capacity for integration into the host genome can give rise to endogenous retroviruses (ERVs): retroviral sequences that are transmitted vertically as part of the host germ line, within which they may continue to replicate and evolve. ERVs represent both a unique archive of ancient viral sequence information and a dynamic component of host genomes. As such they hold great potential as informative markers for studies of both virus evolution and host genome evolution. Numerous novel ERVs have been described in recent years, particularly as genome sequencing projects have advanced. This review discusses the evolution of ERV lineages, considering the processes by which ERV distribution and diversity is generated. The diversity of ERVs isolated so far is summarised in terms of both their distribution across host taxa, and their relationships to recognised retroviral genera. Finally, the relevance of ERVs to studies of genome evolution, host disease and viral ecology is considered, and recent findings discussed.

346 citations

Journal Article · DOI
TL;DR: Members of each of the 10 families are defective, and calculation of their integration dates suggested that most of them are likely to have been present within the human lineage since it diverged from the Old World monkeys more than 25 million years ago.
Abstract: Human endogenous retroviruses (HERVs) were first identified almost 20 years ago, and since then numerous families have been described. It has, however, been difficult to obtain a good estimate of both the total number of independently derived families and their relationship to each other as well as to other members of the family Retroviridae. In this study, I used sequence data derived from over 150 novel HERVs, obtained from the Human Genome Mapping Project database, and a variety of recently identified nonhuman retroviruses to classify the HERVs into 22 independently acquired families. Of these, 17 families were loosely assigned to the class I HERVs, 3 to the class II HERVs and 2 to the class III HERVs. Many of these families have been identified previously, but six are described here for the first time and another four, for which only partial sequence information was previously available, were further characterized. Members of each of the 10 families are defective, and calculation of their integration dates suggested that most of them are likely to have been present within the human lineage since it diverged from the Old World monkeys more than 25 million years ago.
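
The integration-date calculation mentioned above is commonly done by comparing a provirus's two long terminal repeats (LTRs), which are identical at the moment of integration and then diverge neutrally. The sketch below assumes that standard LTR-divergence approach and an illustrative substitution rate; it is not necessarily the exact procedure used in the paper.

```python
# Back-of-the-envelope sketch of one standard way ERV integration dates are
# estimated (the abstract does not spell out its exact procedure, so treat this
# as an illustrative assumption): the 5' and 3' LTRs of a provirus are identical
# at integration, so their present-day divergence d, divided by twice the
# neutral substitution rate, approximates the age of the insertion.
def ltr_integration_age(ltr5, ltr3, subs_per_site_per_myr=2.3e-3):
    """Age in million years from pairwise LTR divergence (pre-aligned, gap-free)."""
    if len(ltr5) != len(ltr3):
        raise ValueError("LTRs must be pre-aligned to equal length")
    diffs = sum(a != b for a, b in zip(ltr5, ltr3))
    d = diffs / len(ltr5)                      # per-site divergence between the two LTRs
    return d / (2 * subs_per_site_per_myr)     # each LTR accumulates changes independently

# Toy 20-bp "LTRs" differing at 2 of 20 sites -> d = 0.1 -> ~22 Myr at the assumed rate.
print(ltr_integration_age("ACGTACGTACGTACGTACGT",
                          "ACGTACGTAAGTACGTACGA"))
```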

314 citations

Journal Article · DOI
TL;DR: This work optimized the design of PCR primers for human V genes and used them to amplify cDNA from human peripheral blood lymphocytes; a region conserved within V gene families but differing between them was then identified and used to design family-specific oligonucleotide probes.
Abstract: In recent work, the polymerase chain reaction (PCR) has been used to amplify rearranged mouse and human immunoglobulin heavy and kappa light chain variable (V) genes. Here we have optimized the design of the PCR primers for human V genes and used them to amplify cDNA from human peripheral blood lymphocytes. Cloning and sequencing revealed a diverse repertoire of V genes, and the presence of members of each human V gene family. After alignment of the sequences, we identified a region conserved within V gene families, but differing between families, and used this to design family-specific oligonucleotide probes.
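
The probe-design step described above boils down to scanning an alignment for windows that are conserved within a family but differ between families. The sketch below shows that idea on invented toy "families"; the window length, the all-or-nothing conservation rule and the sequences are assumptions for illustration, not the published procedure.

```python
# Minimal sketch of family-specific probe selection: slide a window along
# pre-aligned V-gene sequences and keep stretches where every member of one
# family shares a consensus that differs from the other families' consensus.
from collections import Counter  # imported for clarity if a majority rule is preferred

def consensus(seqs, start, width):
    """Window consensus, or None unless every family member agrees at every column."""
    cols = []
    for i in range(start, start + width):
        bases = {s[i] for s in seqs}
        if len(bases) != 1:          # require perfect within-family conservation
            return None
        cols.append(bases.pop())
    return "".join(cols)

def family_specific_windows(families, width=8):
    """Yield (family, start, probe) where the window is conserved in one family only."""
    length = len(next(iter(families.values()))[0])
    for start in range(length - width + 1):
        probes = {name: consensus(seqs, start, width) for name, seqs in families.items()}
        for name, probe in probes.items():
            others = [p for other, p in probes.items() if other != name]
            if probe is not None and probe not in others:
                yield name, start, probe

# Toy data: two hypothetical "families" of pre-aligned sequences, for illustration only.
families = {
    "VH1": ["ATGCAGGTCCAGCTTGTA", "ATGCAGGTCCAGCTTGTC"],
    "VH3": ["ATGGAGGTGCAGCTGGTA", "ATGGAGGTGCAGCTGGTC"],
}
for fam, pos, probe in family_specific_windows(families):
    print(f"{fam}: position {pos}, probe {probe}")
```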

253 citations


Cited by
Journal Article · DOI
TL;DR: This work has used extensive and realistic computer simulations to show that the topological accuracy of this new method is at least as high as that of the existing maximum-likelihood programs and much higher than the performance of distance-based and parsimony approaches.
Abstract: The increase in the number of large data sets and the complexity of current probabilistic sequence evolution models necessitates fast and reliable phylogeny reconstruction methods. We describe a new approach, based on the maximum-likelihood principle, which clearly satisfies these requirements. The core of this method is a simple hill-climbing algorithm that adjusts tree topology and branch lengths simultaneously. This algorithm starts from an initial tree built by a fast distance-based method and modifies this tree to improve its likelihood at each iteration. Due to this simultaneous adjustment of the topology and branch lengths, only a few iterations are sufficient to reach an optimum. We used extensive and realistic computer simulations to show that the topological accuracy of this new method is at least as high as that of the existing maximum-likelihood programs and much higher than the performance of distance-based and parsimony approaches. The reduction of computing time is dramatic in comparison with other maximum-likelihood packages, while the likelihood maximization ability tends to be higher. For example, only 12 min were required on a standard personal computer to analyze a data set consisting of 500 rbcL sequences with 1,428 base pairs from plant plastids, thus reaching a speed of the same order as some popular distance-based and parsimony algorithms. This new method is implemented in the PHYML program, which is freely available on our web page: http://www.lirmm.fr/w3ifa/MAAS/. (Algorithm; computer simulations; maximum likelihood; phylogeny; rbcL; RDPII project.) The size of homologous sequence data sets has increased dramatically in recent years, and many of these data sets now involve several hundreds of taxa. Moreover, current probabilistic sequence evolution models (Swofford et al., 1996; Page and Holmes, 1998), notably those including rate variation among sites (Uzzell and Corbin, 1971; Jin and Nei, 1990; Yang, 1996), require an increasing number of calculations. Therefore, the speed of phylogeny reconstruction methods is becoming a significant requirement and good compromises between speed and accuracy must be found. The maximum likelihood (ML) approach is especially accurate for building molecular phylogenies. Felsenstein (1981) brought this framework to nucleotide-based phylogenetic inference, and it was later also applied to amino acid sequences (Kishino et al., 1990). Several variants were proposed, most notably the Bayesian methods (Rannala and Yang 1996; and see below), and the discrete Fourier analysis of Hendy et al. (1994), for example. Numerous computer studies (Huelsenbeck and Hillis, 1993; Kuhner and Felsenstein, 1994; Huelsenbeck, 1995; Rosenberg and Kumar, 2001; Ranwez and Gascuel, 2002) have shown that ML programs can recover the correct tree from simulated data sets more frequently than other methods can. Another important advantage of the ML approach is the ability to compare different trees and evolutionary models within a statistical framework (see Whelan et al., 2001, for a review). However, like all optimality criterion-based phylogenetic reconstruction approaches, ML is hampered by computational difficulties, making it impossible to obtain the optimal tree with certainty from even moderate data sets (Swofford et al., 1996). Therefore, all practical methods rely on heuristics that obtain near-optimal trees in reasonable computing time.
Moreover, the computation problem is especially difficult with ML, because the tree likelihood not only depends on the tree topology but also on numerical parameters, including branch lengths. Even computing the optimal values of these parameters on a single tree is not an easy task, particularly because of possible local optima (Chor et al., 2000). The usual heuristic method, implemented in the popular PHYLIP (Felsenstein, 1993) and PAUP* (Swofford, 1999) packages, is based on hill climbing. It combines stepwise insertion of taxa in a growing tree and topological rearrangement. For each possible insertion position and rearrangement, the branch lengths of the resulting tree are optimized and the tree likelihood is computed. When the rearrangement improves the current tree or when the position insertion is the best among all possible positions, the corresponding tree becomes the new current tree. Simple rearrangements are used during tree growing, namely "nearest neighbor interchanges" (see below), while more intense rearrangements can be used once all taxa have been inserted. The procedure stops when no rearrangement improves the current best tree. Despite significant decreases in computing times, notably in fastDNAml (Olsen et al., 1994), this heuristic becomes impracticable with several hundreds of taxa. This is mainly due to the two-level strategy, which separates branch lengths and tree topology optimization. Indeed, most calculations are done to optimize the branch lengths and evaluate the likelihood of trees that are finally rejected. New methods have thus been proposed. Strimmer and von Haeseler (1996) and others have assembled four-taxon (quartet) trees inferred by ML, in order to reconstruct a complete tree. However, the results of this approach have not been very satisfactory to date (Ranwez and Gascuel, 2001). Ota and Li (2000, 2001) described
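
The heuristics discussed above all revolve around repeatedly evaluating the likelihood of candidate trees. As a point of reference, the sketch below (not PhyML's implementation) computes that core quantity for a single alignment column under the Jukes-Cantor model using Felsenstein's pruning algorithm; the tree shape, branch lengths and data are illustrative assumptions.

```python
# Minimal sketch of the calculation an ML phylogeny program repeats at every
# step: the likelihood of one alignment column on a fixed tree under the
# Jukes-Cantor (JC69) model, via Felsenstein's pruning algorithm.
import math

BASES = "ACGT"

def jc69(t):
    """4x4 transition-probability matrix for a branch of length t (JC69)."""
    same = 0.25 + 0.75 * math.exp(-4.0 * t / 3.0)
    diff = 0.25 - 0.25 * math.exp(-4.0 * t / 3.0)
    return [[same if i == j else diff for j in range(4)] for i in range(4)]

def conditional(node, site):
    """P(data below node | node state = x) for x in A, C, G, T."""
    if isinstance(node, str):                       # leaf: taxon name
        obs = site[node]
        return [1.0 if b == obs else 0.0 for b in BASES]
    partials = []
    for child, brlen in node:                       # internal node: list of (child, branch length)
        p = jc69(brlen)
        down = conditional(child, site)
        partials.append([sum(p[x][y] * down[y] for y in range(4)) for x in range(4)])
    return [math.prod(col) for col in zip(*partials)]

def site_log_likelihood(tree, site):
    root = conditional(tree, site)
    return math.log(sum(0.25 * root[x] for x in range(4)))   # uniform root frequencies

# Illustrative tree ((A:0.1, B:0.1):0.05, C:0.2) and one alignment column.
tree = [([("A", 0.1), ("B", 0.1)], 0.05), ("C", 0.2)]
print(site_log_likelihood(tree, {"A": "A", "B": "A", "C": "G"}))
```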

16,261 citations

Journal Article · DOI
TL;DR: MEGA2 vastly extends the capabilities of MEGA version 1 by facilitating analyses of large datasets, enabling creation and analyses of groups of sequences, and expanding the repertoire of statistical methods for molecular evolutionary studies.
Abstract: Summary: We have developed a new software package, Molecular Evolutionary Genetics Analysis version 2 (MEGA2), for exploring and analyzing aligned DNA or protein sequences from an evolutionary perspective. MEGA2 vastly extends the capabilities of MEGA version 1 by: (1) facilitating analyses of large datasets; (2) enabling creation and analyses of groups of sequences; (3) enabling specification of domains and genes; (4) expanding the repertoire of statistical methods for molecular evolutionary studies; and (5) adding new modules for visual representation of input data and output results on the Microsoft Windows platform. Availability: http://www.megasoftware.net.

6,184 citations

Journal Article · DOI
TL;DR: The results suggest that a single large phage display library can be used to isolate human antibodies against any antigen, by-passing both hybridoma technology and immunization.

2,678 citations

Journal Article · DOI
TL;DR: This work proposes the first unified hierarchical classification system, designed on the basis of the transposition mechanism, sequence similarities and structural relationships, that can be easily applied by non-experts.
Abstract: Our knowledge of the structure and composition of genomes is rapidly progressing in pace with their sequencing. The emerging data show that a significant portion of eukaryotic genomes is composed of transposable elements (TEs). Given the abundance and diversity of TEs and the speed at which large quantities of sequence data are emerging, identification and annotation of TEs presents a significant challenge. Here we propose the first unified hierarchical classification system, designed on the basis of the transposition mechanism, sequence similarities and structural relationships, that can be easily applied by non-experts. The system and nomenclature is kept up to date at the WikiPoson web site.
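
The classification the abstract describes is hierarchical (class, subclass, order, superfamily, family). The snippet below encodes a small, partial rendering of that hierarchy purely to show how it can be represented and queried; the subset of orders and superfamilies listed is reconstructed from memory of the published scheme and should be checked against the paper itself.

```python
# Partial, illustrative encoding of a hierarchical TE classification
# (class -> order -> superfamilies; the subclass level is omitted here).
TE_HIERARCHY = {
    "Class I (retrotransposons)": {
        "LTR":  ["Copia", "Gypsy", "Bel-Pao", "Retrovirus", "ERV"],
        "LINE": ["L1", "RTE", "Jockey", "I"],
        "SINE": ["tRNA", "7SL", "5S"],
    },
    "Class II (DNA transposons)": {
        "TIR":      ["Tc1-Mariner", "hAT", "Mutator", "PiggyBac", "CACTA"],
        "Helitron": ["Helitron"],
    },
}

def classify(superfamily):
    """Return (class, order) for a superfamily name, or None if absent from this subset."""
    for te_class, orders in TE_HIERARCHY.items():
        for order, superfamilies in orders.items():
            if superfamily in superfamilies:
                return te_class, order
    return None

print(classify("ERV"))   # ('Class I (retrotransposons)', 'LTR')
```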

2,425 citations

Patent
02 Dec 1992
TL;DR: In this paper, the authors described methods for the production of anti-self antibodies and antibody fragments, being antibodies or fragments of a particular species of mammal which bind self-antigens of that species.
Abstract: Methods are disclosed for the production of anti-self antibodies and antibody fragments, being antibodies or fragments of a particular species of mammal which bind self-antigens of that species. Methods comprise providing a library of replicable genetic display packages (rgdps), such as filamentous phage, each rgdp displaying at its surface a member of a specific binding pair which is an antibody or antibody fragment, and each rgdp containing nucleic acid sequence derived from a species of mammal. The nucleic acid sequence in each rgdp encodes a polypeptide chain which is a component part of the sbp member displayed at the surface of that rgdp. Anti-self antibody fragments are selected by binding with a self antigen from the said species of mammal. The displayed antibody fragments may be scFv, Fd, Fab or any other fragment which has the capability of binding antigen. Nucleic acid libraries used may be derived from rearranged V-gene sequences of an unimmunised mammal. Synthetic or artificial libraries are described and shown to be useful.

1,373 citations