Showing papers on "Genome published in 1995"


Journal ArticleDOI
28 Jul 1995-Science
TL;DR: An approach for genome analysis based on sequencing and assembly of unselected pieces of DNA from the whole chromosome has been applied to obtain the complete nucleotide sequence of the genome from the bacterium Haemophilus influenzae Rd.
Abstract: An approach for genome analysis based on sequencing and assembly of unselected pieces of DNA from the whole chromosome has been applied to obtain the complete nucleotide sequence (1,830,137 base pairs) of the genome from the bacterium Haemophilus influenzae Rd. This approach eliminates the need for initial mapping efforts and is therefore applicable to the vast array of microbial species for which genome maps are unavailable. The H. influenzae Rd genome sequence (Genome Sequence DataBase accession number L42023) represents the only complete genome sequence from a free-living organism.

5,944 citations


Journal ArticleDOI
20 Oct 1995-Science
TL;DR: Comparison of the Mycoplasma genitalium genome to that of Haemophilus influenzae suggests that differences in genome content are reflected as profound differences in physiology and metabolic capacity between these two organisms.
Abstract: The complete nucleotide sequence (580,070 base pairs) of the Mycoplasma genitalium genome, the smallest known genome of any free-living organism, has been determined by whole-genome random sequencing and assembly. A total of only 470 predicted coding regions were identified that include genes required for DNA replication, transcription and translation, DNA repair, cellular transport, and energy metabolism. Comparison of this genome to that of Haemophilus influenzae suggests that differences in genome content are reflected as profound differences in physiology and metabolic capacity between these two organisms.

2,565 citations


Journal ArticleDOI
TL;DR: Three statistics (%GC, GC-skew, and AT-skew) can be used to describe the overall patterns of nucleotide composition in DNA sequences, which reflect the substitution process.
Abstract: Three statistics (%GC, GC-skew, and AT-skew) can be used to describe the overall patterns of nucleotide composition in DNA sequences. Fourfold degenerate third codon positions from 16 animal mitochondrial genomes were analyzed. The overall composition, as measured by %GC, varies from 3.6 %GC in the honeybee to 47.2 %GC in human mtDNA. Compositional differences between strands of the mitochondrial genome were quantified using the two skew statistics presented in this paper. Strand-specific distribution of bases varies among animal taxa independently of overall %GC. Compositional patterns reflect the substitution process. Description of these patterns may aid in the formation of hypotheses about substitutional mechanisms.
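
The three statistics named in this abstract can be computed directly from base counts. The abstract does not spell out the formulas, so the sketch below assumes the commonly used definitions GC-skew = (G - C)/(G + C) and AT-skew = (A - T)/(A + T); the function name `composition_stats` and the toy sequence are illustrative only.

```python
from collections import Counter


def composition_stats(seq: str) -> dict:
    """Return %GC, GC-skew and AT-skew for one strand of a DNA sequence.

    Assumed definitions (not stated in the abstract):
      GC-skew = (G - C) / (G + C)
      AT-skew = (A - T) / (A + T)
    Characters other than A, C, G, T are ignored.
    """
    counts = Counter(seq.upper())
    a, c, g, t = (counts[base] for base in "ACGT")
    total = a + c + g + t
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    gc, at = g + c, a + t
    return {
        "percent_gc": 100.0 * gc / total,
        # A skew is undefined when its denominator is zero; report 0.0 then.
        "gc_skew": (g - c) / gc if gc else 0.0,
        "at_skew": (a - t) / at if at else 0.0,
    }


# Toy input: A=4, T=2, G=3, C=1
stats = composition_stats("AAAATTGGGC")
print(stats)
```

Because both skews are signed, running the function on the complementary strand flips their signs while leaving %GC unchanged, which is what makes them usable as strand-asymmetry measures.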

1,090 citations


Journal ArticleDOI
TL;DR: PCR mapping of integrons can be a useful epidemiological tool to study the evolution of multiresistance plasmids and transposons and dissemination of antibiotic resistance genes.
Abstract: The integron is a new type of mobile element which has evolved by a site-specific recombinational mechanism. Integrons consist of two conserved segments of DNA separated by a variable region containing one or more genes integrated as cassettes. Oligonucleotide probes specific for the conserved segments have revealed that integrons are widespread in recently isolated clinical bacteria. Also, by using oligonucleotide probes for several antibiotic resistance genes, we have found novel combinations of resistance genes in these strains. By using PCR, we have determined the content and order of the resistance genes inserted between the conserved segments in the integrons of these clinical isolates. PCR mapping of integrons can be a useful epidemiological tool to study the evolution of multiresistance plasmids and transposons and dissemination of antibiotic resistance genes.

1,002 citations


Journal ArticleDOI
TL;DR: It is demonstrated that polyploid species can generate extensive genetic diversity in a short period of time and genetic divergence among the derivatives of synthetic polyploids was evident from variation in genome composition and phenotypes.
Abstract: Although the evolutionary success of polyploidy in higher plants has been widely recognized, there is virtually no information on how polyploid genomes have evolved after their formation. In this report, we used synthetic polyploids of Brassica as a model system to study genome evolution in the early generations after polyploidization. The initial polyploids we developed were completely homozygous, and thus, no nuclear genome changes were expected in self-fertilized progenies. However, extensive genome change was detected by 89 nuclear DNA clones used as probes. Most genome changes involved loss and/or gain of parental restriction fragments and appearance of novel fragments. Genome changes occurred in each generation from F2 to F5, and the frequency of change was associated with divergence of the diploid parental genomes. Genetic divergence among the derivatives of synthetic polyploids was evident from variation in genome composition and phenotypes. Directional genome changes, possibly influenced by cytoplasmic-nuclear interactions, were observed in one pair of reciprocal synthetics. Our results demonstrate that polyploid species can generate extensive genetic diversity in a short period of time. The occurrence and impact of this process in the evolution of natural polyploids is unknown, but it may have contributed to the success and diversification of many polyploid lineages in both plants and animals.

959 citations


Journal ArticleDOI
22 Dec 1995-Science
TL;DR: A physical map has been constructed of the human genome containing 15,086 sequence-tagged sites (STSs), with an average spacing of 199 kilobases, anchored by the radiation hybrid and genetic maps.
Abstract: A physical map has been constructed of the human genome containing 15,086 sequence-tagged sites (STSs), with an average spacing of 199 kilobases. The project involved assembly of a radiation hybrid map of the human genome containing 6193 loci and incorporated a genetic linkage map of the human genome containing 5264 loci. This information was combined with the results of STS-content screening of 10,850 loci against a yeast artificial chromosome library to produce an integrated map, anchored by the radiation hybrid and genetic maps. The map provides radiation hybrid coverage of 99 percent and physical coverage of 94 percent of the human genome. The map also represents an early step in an international project to generate a transcript map of the human genome, with more than 3235 expressed sequences localized. The STSs in the map provide a scaffold for initiating large-scale sequencing of the human genome.
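
As a rough consistency check on the figures quoted above, the number of STSs multiplied by the average spacing should approximate the length of genome covered. This is a back-of-the-envelope sketch that ignores chromosome ends and uneven marker density; the variable names are illustrative.

```python
# Figures taken from the abstract.
STS_COUNT = 15_086
AVERAGE_SPACING_KB = 199

# Approximation: span covered ~ number of markers x mean spacing.
implied_span_kb = STS_COUNT * AVERAGE_SPACING_KB
implied_span_gb = implied_span_kb / 1_000_000  # 1 Gb = 1,000,000 kb

print(f"Implied span: {implied_span_kb:,} kb (~{implied_span_gb:.1f} Gb)")
```

The product comes to about 3.0 Gb, in line with the usual estimate of the human genome size.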

814 citations


Journal ArticleDOI
28 Jul 1995-Cell
TL;DR: Cells and mice that are deficient for the presumed DNA mismatch repair (MMR) gene Msh2 have lost mismatch binding and have acquired microsatellite instability, a mutator phenotype, and tolerance to methylating agents, suggesting that Msh2 is involved in safeguarding the genome from promiscuous recombination.

813 citations


Journal ArticleDOI
TL;DR: This study demonstrates that interlocus concerted evolution occurs for tandemly repeated sequences in diploid and polyploid plants and that it can occur bidirectionally subsequent to hybridization and polyploidization, which has significant implications for phylogeny reconstruction.
Abstract: Polyploidy is a prominent process in plant evolution; yet few data address the question of whether homeologous sequences evolve independently subsequent to polyploidization. We report on ribosomal DNA (rDNA) evolution in five allopolyploid (AD genome) species of cotton (Gossypium) and species representing their diploid progenitors (A genome, D genome). Sequence data from the internal transcribed spacer regions (ITS1 and ITS2) and the 5.8S gene indicate that rDNA arrays are homogeneous, or nearly so, in all diploids and allopolyploids examined. Because these arrays occur at four chromosomal loci in allopolyploid cotton, two in each subgenome, repeats from different arrays must have become homogenized by interlocus concerted evolution. Southern hybridization analysis combined with copy-number estimation demonstrates that this process has gone to completion in the diploids and to completion or near-completion in all allopolyploid species and that it most likely involves the entire rDNA repeat. Phylogenetic analysis demonstrates that interlocus concerted evolution has been bidirectional in allopolyploid species; that is, rDNA from four polyploid lineages has been homogenized to a D genome repeat type, whereas sequences from Gossypium mustelinum have concerted to an A genome repeat type. Although little is known regarding the functional significance of interlocus concerted evolution of homeologous sequences, this study demonstrates that the process occurs for tandemly repeated sequences in diploid and polyploid plants. That interlocus concerted evolution can occur bidirectionally subsequent to hybridization and polyploidization has significant implications for phylogeny reconstruction, especially when based on rDNA sequences.

802 citations


Journal ArticleDOI
TL;DR: Analysis of currently available genomic sequence data has extended earlier results, showing that the general designs of disjoint samples of a genome are substantially more similar to each other than to those of sequences from other organisms and that closely related organisms have similar general designs.

651 citations


Journal ArticleDOI
TL;DR: A survey of the 25 editing positions identified in 13 different transcripts of the maize plastome shows that representatives of all protein coding gene classes are subject to editing, particularly for the second codon position and for certain codon transitions.

588 citations


Journal ArticleDOI
10 May 1995-Virology
TL;DR: Comparisons of predicted amino acid sequences allowed the functions of many human herpesvirus-6 encoded proteins to be assigned and showed the closest relationship in overall number and similarity to human cytomegalovirus products, with approximately 67% homologous proteins as compared to the 21% identified in all herpesviruses.

Journal ArticleDOI
TL;DR: In this series of projects regarding the accumulation of sequence information of unidentified human genes, the sequences of 40 full-length cDNA clones of human cell line KG-1 are newly deduced, and the coding sequences of the corresponding genes are predicted.
Abstract: In this series of projects regarding the accumulation of sequence information of unidentified human genes, we newly deduced the sequences of 40 full-length cDNA clones of human cell line KG-1, and predicted the coding sequences of the corresponding genes, named KIAA0121 to 0160. The results of a computer search of public databases indicated that the sequences of 13 genes were unrelated to any reported genes, while the remaining 27 genes carried sequences which showed some similarities to known genes. Obvious unique sequences noted were as follows. A stretch of triplet repeats was contained in each of three genes: These were GAG(Glu) in KIAA0122 and KIAA0147, and TCC(Ser) in KIAA0150. A stretch of 10 amino acid-residues was repeated 21 times in KIAA0139, and a homologous sequence of 76-78 nucleotides was found repeated 6 times in the untranslated region of KIAA0125. Northern hybridization analysis demonstrated that 13 genes were expressed in a cell- or tissue-specific manner. Although a vast number of expressed sequence tags (ESTs) have been registered for comprehensive analysis of cDNA clones, our sequence data indicated that their distribution is very unbalanced: e.g. while no EST hit 7 genes, 85 ESTs fell in a single gene.

Journal ArticleDOI
TL;DR: It is demonstrated that integration of HPV-16 DNA leads to increased steady-state levels of mRNAs encoding the viral oncogenes E6 and E7, and that the A+U-rich element within this viral early 3' untranslated region confers instability on a heterologous mRNA.
Abstract: In many cervical cancers, human papillomavirus type 16 (HPV-16) DNA genomes are found to be integrated into the host chromosome. In this study, we demonstrate that integration of HPV-16 DNA leads to increased steady-state levels of mRNAs encoding the viral oncogenes E6 and E7. This increase is shown to result, at least in part, from an increased stability of E6 and E7 mRNAs that arise specifically from those integrated viral genomes disrupted in the 3' untranslated region of the viral early region. Further, we demonstrate that the A+U-rich element within this viral early 3' untranslated region confers instability on a heterologous mRNA. We conclude that integration of HPV-16 DNA, as occurs in cervical cancers, can result in the increased expression of the viral E6 and E7 oncogenes through altered mRNA stability.

Journal ArticleDOI
TL;DR: It is suggested that HBV genomes with C gene deletions can have a selective advantage in immunosuppressed patients and the potential for the structural and functional characterization of heterogeneous populations of complete virion-encapsidated HBV DNAs is demonstrated.
Abstract: Current knowledge of hepatitis B virus (HBV) sequence heterogeneity is based mainly on sequencing of amplified subgenomic HBV fragments. Here, we describe a method which allows sensitive amplification and simplified functional analysis of full-length HBV genomes with or without prior cloning. By this method, a large number of HBV genomes were cloned from sera of six immunosuppressed kidney transplant patients. Two size classes of HBV genomes, one 3.2 kb and another about 2.0 kb in size, were found in all patients. The genome population from one serum sample was studied in detail by size analysis of subgenomic PCR fragments and sequencing. Regions with deletions and insertions were mapped in the C gene and pre-S region. Up to 100% of HBV genomes in all other immunosuppressed patients also had deletions in the C gene. Our results demonstrate the potential of the established method for the structural and functional characterization of heterogeneous populations of complete virion-encapsidated HBV DNAs and suggest that HBV genomes with C gene deletions can have a selective advantage in immunosuppressed patients.

Journal ArticleDOI
TL;DR: It is anticipated that SSR loci within the chloroplast genome should provide a highly informative assay for the analysis of the genetic structure of plant populations, by using a PCR-based assay.
Abstract: Simple sequence repeats (SSRs), consisting of tandemly repeated multiple copies of mono-, di-, tri-, or tetranucleotide motifs, are ubiquitous in eukaryotic genomes and are frequently used as genetic markers, taking advantage of their length polymorphism. We have examined the polymorphism of such sequences in the chloroplast genomes of plants, by using a PCR-based assay. GenBank searches identified the presence of several (dA)n.(dT)n mononucleotide stretches in chloroplast genomes. A chloroplast (cp) SSR was identified in three pine species (Pinus contorta, Pinus sylvestris, and Pinus thunbergii) 312 bp upstream of the psbA gene. DNA amplification of this repeated region from 11 pine species identified nine length variants. The polymorphic amplified fragments were isolated and the DNA sequence was determined, confirming that the length polymorphism was caused by variation in the length of the repeated region. In the pines, the chloroplast genome is transmitted through pollen and this PCR assay may be used to monitor gene flow in this genus. Analysis of 305 individuals from seven populations of Pinus leucodermis Ant. revealed the presence of four variants with intrapopulational diversities ranging from 0.000 to 0.629 and an average of 0.320. Restriction fragment length polymorphism analysis of cpDNA on the same populations previously failed to detect any variation. Population subdivision based on cpSSR was higher (Gst = 0.22, where Gst is coefficient of gene differentiation) than that revealed in a previous isozyme study (Gst = 0.05). We anticipate that SSR loci within the chloroplast genome should provide a highly informative assay for the analysis of the genetic structure of plant populations.
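
The diversity and Gst values reported in this abstract are the kind of quantities that can be computed from per-population haplotype counts. The abstract does not give the estimators used, so the sketch below assumes the simple (uncorrected) gene diversity H = 1 - sum(p_i^2) and Gst = (Ht - Hs)/Ht, with equal sample sizes per population; the function names and the toy haplotype labels are hypothetical.

```python
from collections import Counter


def gene_diversity(haplotypes):
    """Uncorrected gene diversity: 1 - sum of squared haplotype frequencies."""
    n = len(haplotypes)
    freqs = [count / n for count in Counter(haplotypes).values()]
    return 1.0 - sum(p * p for p in freqs)


def gst(populations):
    """Coefficient of gene differentiation, Gst = (Ht - Hs) / Ht.

    Hs is the mean within-population diversity and Ht is the diversity of
    the pooled sample. Pooling matches the unweighted mean only when all
    populations have the same sample size, which is assumed here.
    """
    h_s = sum(gene_diversity(pop) for pop in populations) / len(populations)
    pooled = [hap for pop in populations for hap in pop]
    h_t = gene_diversity(pooled)
    return (h_t - h_s) / h_t if h_t else 0.0


# Toy data: two populations of four individuals scored for cpSSR length variants.
toy_populations = [
    ["a", "a", "a", "a"],  # fixed for one variant: diversity 0.0
    ["a", "a", "b", "b"],  # two variants at equal frequency: diversity 0.5
]
within = [gene_diversity(pop) for pop in toy_populations]
differentiation = gst(toy_populations)
print(within, differentiation)
```

A population fixed for a single length variant scores 0.000, which corresponds to the lower bound of the range of intrapopulational diversities quoted above.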

Patent
05 Jun 1995
TL;DR: Positive-negative selector (PNS) vectors are provided for modifying a target DNA sequence contained in the genome of a target cell capable of homologous recombination; the invention also includes organisms such as non-human transgenic animals and plants whose cells carry such modifications.
Abstract: Positive-negative selector (PNS) vectors are provided for modifying a target DNA sequence contained in the genome of a target cell capable of homologous recombination. The vector comprises a first DNA sequence which contains at least one sequence portion which is substantially homologous to a portion of a first region of a target DNA sequence. The vector also includes a second DNA sequence containing at least one sequence portion which is substantially homologous to another portion of a second region of a target DNA sequence. A third DNA sequence is positioned between the first and second DNA sequences and encodes a positive selection marker which when expressed is functional in the target cell in which the vector is used. A fourth DNA sequence encoding a negative selection marker, also functional in the target cell, is positioned 5' to the first or 3' to the second DNA sequence and is substantially incapable of homologous recombination with the target DNA sequence. The invention also includes transformed cells containing at least one predetermined modification of a target DNA sequence contained in the genome of the cell. In addition, the invention includes organisms such as non-human transgenic animals and plants which contain cells having predetermined modifications of a target DNA sequence in the genome of the organism.

Journal ArticleDOI
TL;DR: The results indicate that both genomes are maternally inherited, an observation which agrees with the commonly observed pattern of inheritance in angiosperms and confirms that both chloroplast DNA and mitochondrial DNA can be used as a source of seed-specific markers for the study of the geographic structure of oaks.
Abstract: The restriction patterns of two chloroplast fragments and one mitochondrial DNA fragment, amplified by PCR with universal primers, were studied to determine the mode of inheritance of these organelles in 143 progeny of five intraspecific crosses in pedunculate oak (Quercus robur L.). The results indicate that both genomes are maternally inherited, an observation which agrees with the commonly observed pattern of inheritance in angiosperms. They confirm that both chloroplast DNA and mitochondrial DNA can be used as a source of seed-specific markers for the study of the geographic structure of oaks. This is the first report of organelle inheritance within the Fagaceae, an important and widespread tree family.

Journal ArticleDOI
TL;DR: Chromosome landing is likely to become the main strategy by which map-based cloning is applied to isolate both major genes and genes underlying quantitative traits in plant species.

Journal ArticleDOI
01 Apr 1995-Virology
TL;DR: Analysis of the complete genome of African swine fever virus (ASFV) strain BA71V confirms the intermediate characteristics of ASFV between poxviruses and iridoviruses, supporting the notion that ASFV belongs to an independent virus family.

Journal ArticleDOI
Michael Reith, Janet Munholland
TL;DR: The complete nucleotide sequence of the chloroplast genome of the red alga Porphyra purpurea has been determined and encodes approximately 250 genes.
Abstract: The complete nucleotide sequence of the chloroplast genome of the red alga Porphyra purpurea has been determined (accession number U38804). The circular genome is 191,028 bp in length and encodes approximately 250 genes.

Journal ArticleDOI
TL;DR: Hydropathy profiles indicate that the ORFs of VR-2332 and Lelystad virus correspond structurally despite significant sequence differences, consistent with the biological similarities but distinct serological properties of North American and European isolates of the virus.
Abstract: The 3′-portion of the genome of a U.S. isolate of the porcine reproductive and respiratory syndrome (PRRS) virus, ATCC VR-2332, was cloned and sequenced. The resultant 3358 nucleotides contain 6 open reading frames (ORFs) with homologies to ORFs 2 through 7 of the European strain of the PRRS virus and other members of the free-standing genus of arteriviruses. Both VR-2332 and the European isolate (called the Lelystad virus) have been identified as infectious agents responsible for the swine disease called PRRS. Comparative sequence analysis indicates that there are degrees of amino acid identity to the Lelystad virus open reading frames ranging from 55% in ORF 5 to 79% in ORF 6. Hydropathy profiles indicate that the ORFs of VR-2332 and Lelystad virus correspond structurally despite significant sequence differences. These results are consistent with the biological similarities but distinct serological properties of North American and European isolates of the virus.

Journal ArticleDOI
TL;DR: Analysis of the 3'-ends of approximately 900 separate human LINE-1 (L1) elements from primates revealed 47 contiguous but distinct subfamilies within the L1 family; with the set of consensus sequences for the different subfamilies and their diagnostic features, it is possible to estimate the age of individual LINE-1 elements.

Journal ArticleDOI
TL;DR: It was found that blocks of contiguous sites were less likely to lead to the whole-genome tree than samples composed of sites drawn individually from throughout the genome, a condition that violates a basic assumption of the bootstrap method as it is applied in phylogenetic studies.
Abstract: We inferred phylogenetic trees from individual genes and random samples of nucleotides from the mitochondrial genomes of 10 vertebrates and compared the results to those obtained by analyzing the whole genomes. Individual genes are poor samples in that they infrequently lead to the whole-genome tree. A large number of nucleotide sites is needed to exactly determine the whole-genome tree. A relatively small number of sites, however, often results in a tree close to the whole-genome tree. We found that blocks of contiguous sites were less likely to lead to the whole-genome tree than samples composed of sites drawn individually from throughout the genome. Samples of contiguous sites are not representative of the entire genome, a condition that violates a basic assumption of the bootstrap method as it is applied in phylogenetic studies.
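
The two sampling schemes contrasted in this abstract, sites drawn individually from throughout the genome versus a block of contiguous sites, can be sketched as column selections on an alignment. This is only an illustration of the sampling step: tree inference is omitted, and the function names and the toy alignment are hypothetical.

```python
import random


def sample_scattered_sites(alignment_length, n_sites, rng):
    """Draw n_sites distinct column indices from anywhere in the alignment."""
    return sorted(rng.sample(range(alignment_length), n_sites))


def sample_contiguous_block(alignment_length, n_sites, rng):
    """Draw one run of n_sites adjacent column indices."""
    start = rng.randrange(alignment_length - n_sites + 1)
    return list(range(start, start + n_sites))


def extract_columns(alignment, columns):
    """Build a sub-alignment containing only the chosen columns."""
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}


# Toy alignment of three taxa, 20 columns each (made-up sequences).
toy_alignment = {
    "taxon_1": "ACGTACGTACGTACGTACGT",
    "taxon_2": "ACGTACGAACGTACGTACGA",
    "taxon_3": "ACGAACGTACGTTCGTACGT",
}
rng = random.Random(0)  # fixed seed so the sketch is repeatable
scattered = sample_scattered_sites(20, 5, rng)
block = sample_contiguous_block(20, 5, rng)
scattered_sub = extract_columns(toy_alignment, scattered)
block_sub = extract_columns(toy_alignment, block)
```

Each sub-alignment would then be passed to whatever tree-building method is in use, and the resulting tree compared with the whole-genome tree.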


Journal ArticleDOI
TL;DR: This review outlines briefly the compositional properties of the vertebrate genome, namely its isochore organization, the compositional patterns of DNA molecules and of coding sequences, and the relationships between isochores and chromosomal bands.
Abstract: This review outlines briefly the compositional properties of the vertebrate genome, namely its isochore organization, the compositional patterns of DNA molecules and of coding sequences, the compositional correlations between coding and noncoding sequences, and the relationships between isochores and chromosomal bands. It then deals with the fundamental properties of the vertebrate genome, namely the distribution of genes and its associated functional features. Finally, it considers how the structural and functional organization of the human genome (and of the genomes of warm-blooded vertebrates in general) arose in evolution.

01 Jan 1995
TL;DR: A two-stage genome-wide search for genes conferring susceptibility to schizophrenia revealed significant evidence for linkage to an area distal of the HLA region on chromosome 6p and evidence suggestive of locus heterogeneity and oligogenic transmission in schizophrenia was obtained.
Abstract: Schizophrenia is thought to be a multifactorial disease with complex mode of inheritance [1,2]. Using a two-stage strategy for another complex disorder, a number of putative IDDM-susceptibility genes have recently been mapped [3]. We now report the results of a two-stage genome-wide search for genes conferring susceptibility to schizophrenia. In stage I, model-free linkage analyses of large pedigrees from Iceland, a geographical isolate, revealed 26 loci suggestive of linkage. In stage II, ten of these were followed up in a second international collaborative study comprising families from Austria, Canada, Germany, Italy, Scotland, Sweden, Taiwan and the United States. Potential linkage findings of stage I on chromosomes 6p, 9 and 20 were observed again in the second sample. Furthermore, in a third sample from China, fine mapping of the 6p region by association studies also showed evidence for linkage or linkage disequilibrium. Combining our results with other recent findings [4,5] revealed significant evidence for linkage to an area distal of the HLA region on chromosome 6p. However, in a fourth sample from Europe, the 6p fine mapping finding observed in the Chinese sample could not be replicated. Finally, evidence suggestive of locus heterogeneity and oligogenic transmission in schizophrenia was obtained.

Journal ArticleDOI
TL;DR: It is demonstrated that bacterial artificial chromosome (BAC) clones can be mapped readily on rice (Oryza sativa L.) chromosomes by FISH, demonstrating the utility of FISH in plant genome analysis.
Abstract: Fluorescence in situ hybridization (FISH) is a powerful tool for physical mapping in human and other mammalian species. However, application of the FISH technique has been limited in plant species, especially for mapping single- or low-copy DNA sequences, due to inconsistent signal production in plant chromosome preparations. Here we demonstrate that bacterial artificial chromosome (BAC) clones can be mapped readily on rice (Oryza sativa L.) chromosomes by FISH. Repetitive DNA sequences in BAC clones can be suppressed efficiently by using rice genomic DNA as a competitor in the hybridization mixture. BAC clones as small as 40 kb were successfully mapped. To demonstrate the application of the FISH technique in physical mapping of plant genomes, both anonymous BAC clones and clones closely linked to a rice bacterial blight-resistance locus, Xa21, were chosen for analysis. The physical location of Xa21 and the relationships among the linked clones were established, thus demonstrating the utility of FISH in plant genome analysis.

Journal ArticleDOI
28 Sep 1995-Nature
TL;DR: A yeast artificial chromosome library containing 33,000 clones with an average insert size of one megabase of human genomic DNA was extensively analysed by several different procedures for detecting overlaps and positional information and an analysis strategy was developed that resulted in a YAC contig map reliably covering about 75% of the human genome.
Abstract: A yeast artificial chromosome library containing 33,000 clones with an average insert size of one megabase of human genomic DNA was extensively analysed by several different procedures for detecting overlaps and positional information. We developed an analysis strategy that resulted, after confirmatory tests, in a YAC contig map reliably covering about 75% of the human genome in 225 contigs having an average size of about ten megabases.

Journal ArticleDOI
TL;DR: The contiguous sequence of 1,003,450 bp spanning map positions 64% to 92% of the genome of Synechocystis sp. strain PCC6803 has been deduced.
Abstract: The contiguous sequence of 1,003,450 bp spanning map positions 64% to 92% of the genome of Synechocystis sp. strain PCC6803 has been deduced. Computer analysis of the sequence predicts that this region contains at least 818 potential ORFs, in which 255 (31%) were either genes that had already been identified or their homologues, 84 (10%) were homologues to registered hypothetical genes, and 149 (18%) showed weak similarities to reported genes. The remaining 330 ORFs showed no apparent similarity to any reported genes or carried no significant protein motifs. The potential ORFs as a whole occupied 86% of the sequenced region, implying compact arrangement of genes in the genome. As to the structural RNA genes, one rRNA operon consisting of 5,028 bp and at least 11 species of tRNA genes were identified. It is noteworthy that 10 out of the 11 tRNA species showed significant sequence similarities to tRNAs reported in plant chloroplasts. As other notable unique sequences, three classes of IS-like elements each with characteristics typical of IS elements were identified, and a typical unit of WD(Trp-Asp)-repeats which have only been detected in the regulatory proteins of eukaryotes was identified within the large 5,079-bp ORF located at map position 69%.
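
The ORF breakdown quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only numbers given above; the category labels are paraphrases, and the mean ORF length is a rough estimate that assumes the 86% figure applies to the full 1,003,450 bp and that ORFs do not overlap.

```python
# Figures taken from the abstract.
REGION_BP = 1_003_450
TOTAL_ORFS = 818
ORF_FRACTION = 0.86  # share of the sequenced region occupied by potential ORFs

orf_categories = {
    "known genes or their homologues": 255,
    "homologues of registered hypothetical genes": 84,
    "weak similarity to reported genes": 149,
    "no apparent similarity": 330,
}

# The four categories should account for every predicted ORF.
category_total = sum(orf_categories.values())

# Percentages rounded to whole numbers, as in the abstract.
percentages = {
    name: round(100 * count / TOTAL_ORFS) for name, count in orf_categories.items()
}

# Rough mean ORF length (assumption: non-overlapping ORFs).
mean_orf_bp = ORF_FRACTION * REGION_BP / TOTAL_ORFS

for name, pct in percentages.items():
    print(f"{name}: {orf_categories[name]} ORFs ({pct}%)")
print(f"Approximate mean ORF length: {mean_orf_bp:.0f} bp")
```

The counts sum to 818, and the first three percentages reproduce the 31%, 10% and 18% quoted above; the remaining 330 ORFs make up roughly 40%.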

Journal Article
TL;DR: The chloroplast genome consists of homogeneous circular DNA molecules; in two split chloroplast genes, each portion is transcribed separately and two to three separate transcripts are joined together by trans-splicing to yield a functional mRNA.
Abstract: The chloroplast genome consists of homogeneous circular DNA molecules. To date, the entire nucleotide sequences (120-190 kbp) of chloroplast genomes have been determined from eight plant species. The chloroplast genomes of land plants and green algae contain about 110 different genes, which can be classified into two main groups: genes involved in gene expression and those related to photosynthesis. The red alga Porphyra chloroplast genome has 70 additional genes, one-third of which are related to biosynthesis of amino acids and other low molecular mass compounds. Chloroplast genes contain at least three structurally distinct promoters and are transcribed by two or more classes of RNA polymerase. Two chloroplast genes, rps12 of land plants and psaA of Chlamydomonas, are divided into two to three pieces and scattered over the genome. Each portion is transcribed separately, and two to three separate transcripts are joined together to yield a functional mRNA by trans-splicing. RNA editing (C to U base changes) occurs in some of the chloroplast transcripts. Most edited codons are functionally significant, creating start and stop codons and changing codons to retain conserved amino acids.