Susan Thomson

Other affiliations: Western General Hospital, University of Otago

Bio: Susan Thomson is an academic researcher at Plant & Food Research whose work covers topics including Genome and Population. The author has an h-index of 15 and has co-authored 31 publications that have received 2,418 citations. Previous affiliations include Western General Hospital and the University of Otago.

Topics: Genome, Population, Quantitative trait locus, Gene, Jurkat cells
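The h-index quoted in the bio can be illustrated with a minimal sketch. It assumes the standard definition (the largest h such that h papers each have at least h citations). The function name `h_index` is hypothetical, and the input is only the five citation counts visible on this page, not the full list of 31 publications, so the result is not expected to equal 15.

```python
def h_index(citations):
    """Return the largest h such that at least h papers have >= h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Citation counts of the five papers listed on this page (a partial sample).
sample = [1813, 236, 129, 113, 63]
print(h_index(sample))  # prints 5
```

With only five papers in the sample, the result is capped at 5; reproducing the reported value would require the citation counts of all 31 publications.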

Papers


Journal Article

Genome sequence and analysis of the tuber crop potato.

Xun Xu¹, Shengkai Pan¹, Shifeng Cheng¹, Bo Zhang¹, Mu D¹, Peixiang Ni¹, Gengyun Zhang¹, Shuang Yang¹, Ruiqiang Li¹, Jun Wang¹, Gisella Orjeda², Frank Guzman², Torres M², Roberto Lozano², Olga Ponce², Diana Martinez², De la Cruz G³, Chakrabarti Sk³, Patil Vu³, Konstantin G. Skryabin⁴, Boris B. Kuznetsov⁴, Nikolai V. Ravin⁴, Tatjana V. Kolganova⁴, Alexey V. Beletsky⁴, Andrey V. Mardanov⁴, Di Genova A⁵, Dan Bolser⁵, David M. A. Martin⁵, Li G, Yang Y, Hanhui Kuang⁶, Hu Q⁶, Xiong X⁷, Gerard J. Bishop⁸, Boris Sagredo, Nilo Mejía, Zagorski W⁹, Robert Gromadka⁹, Jan Gawor⁹, Pawel Szczesny⁹, Sanwen Huang, Zhang Z, Liang C, He J, Li Y, He Y, Xu J, Youjun Zhang, Xie B, Du Y, Qu D, Merideth Bonierbale¹⁰, Marc Ghislain¹⁰, Herrera Mdel R, Giovanni Giuliano, Marco Pietrella, Gaetano Perrotta, Paolo Facella, O'Brien K¹¹, Sergio Enrique Feingold, Barreiro Le, Massa Ga, Luis Aníbal Diambra¹², Brett R Whitty¹³, Brieanne Vaillancourt¹³, Lin H¹³, Alicia N. Massa¹³, Geoffroy M¹³, Lundback S¹³, Dean DellaPenna¹³, Buell Cr¹⁴, Sanjeev Kumar Sharma¹⁴, David Marshall¹⁴, Robbie Waugh¹⁴, Glenn J. Bryan¹⁴, Destefanis M¹⁵, Istvan Nagy¹⁵, Dan Milbourne¹⁵, Susan Thomson¹⁶, Mark Fiers¹⁶, Jeanne M. E. Jacobs¹⁶, Kåre Lehmann Nielsen¹⁷, Mads Sønderkær¹⁷, Marina Iovene¹⁸, Giovana Augusta Torres¹⁸, Jiming Jiang¹⁸, Richard E. Veilleux¹⁹, Christian W. B. Bachem²⁰, de Boer J²⁰, Theo Borm²⁰, Bjorn Kloosterman²⁰, van Eck H²⁰, Erwin Datema²⁰, Hekkert Bt²⁰, Aska Goverse²⁰, van Ham Rc²⁰, Richard G. F. Visser²⁰

Beijing Institute of Genomics¹, Cayetano Heredia University², Indian Council of Agricultural Research³, Russian Academy of Sciences⁴, University of Dundee⁵, Huazhong Agricultural University⁶, Hunan Agricultural University⁷, Imperial College London⁸, Polish Academy of Sciences⁹, International Potato Center¹⁰, J. Craig Venter Institute¹¹, National University of La Plata¹², Michigan State University¹³, James Hutton Institute¹⁴, Teagasc¹⁵, Plant & Food Research¹⁶, Aalborg University¹⁷, University of Wisconsin-Madison¹⁸, Virginia Tech¹⁹, Wageningen University and Research Centre²⁰

10 Jul 2011-Nature

TL;DR: The potato genome sequence predicts 39,031 protein-coding genes, presents evidence for at least two genome duplication events indicative of a palaeopolyploid origin, and provides a platform for genetic improvement of this vital crop.

Abstract: Potato (Solanum tuberosum L.) is the world's most important non-grain food crop and is central to global food security. It is clonally propagated, highly heterozygous, autotetraploid, and suffers acute inbreeding depression. Here we use a homozygous doubled-monoploid potato clone to sequence and assemble 86% of the 844-megabase genome. We predict 39,031 protein-coding genes and present evidence for at least two genome duplication events indicative of a palaeopolyploid origin. As the first genome sequence of an asterid, the potato genome reveals 2,642 genes specific to this large angiosperm clade. We also sequenced a heterozygous diploid clone and show that gene presence/absence variants and other potentially deleterious mutations occur frequently and are a likely cause of inbreeding depression. Gene family expansion, tissue-specific expression and recruitment of genes to new pathways contributed to the evolution of tuber development. The potato genome sequence provides a platform for genetic improvement of this vital crop.

1,813 citations

Journal Article

Construction of Reference Chromosome-Scale Pseudomolecules for Potato: Integrating the Potato Genome with Genetic and Physical Maps

Sanjeev Kumar Sharma¹, Dan Bolser², Jan M. de Boer³, Mads Sønderkær⁴, Walter Amoros⁵, Martín Federico Carboni⁶, Juan Martín D'Ambrosio, Germán De la Cruz, Alex Di Genova⁷, David S. Douches⁸, Maria Eguiluz⁹, Xiao-Qiang Guo, Frank Guzman⁹, Christine A. Hackett², John P. Hamilton⁸, Guangcun Li, Ying Li, Roberto Lozano⁹, Alejandro Maass⁷, David Marshall¹, Diana Martinez⁹, Karen McLean¹, Nilo Mejía, Linda Milne¹, Susan Munive⁵, Istvan Nagy¹⁰, Olga Ponce⁹, Manuel Ramirez⁹, Reinhard Simon⁵, Susan Thomson, Yerisf Torres⁹, Robbie Waugh¹, Zhonghua Zhang, Sanwen Huang, Richard G. F. Visser³, Christian W. B. Bachem³, Boris Sagredo, Sergio Enrique Feingold⁶, Gisella Orjeda⁹, Richard E. Veilleux¹¹, Merideth Bonierbale⁵, Jeanne M. E. Jacobs¹², Dan Milbourne¹⁰, David M. A. Martin², Glenn J. Bryan¹

James Hutton Institute¹, University of Dundee², Wageningen University and Research Centre³, Aalborg University⁴, International Potato Center⁵, National Agricultural Technology Institute (INTA)⁶, University of Chile⁷, Michigan State University⁸, Cayetano Heredia University⁹, Teagasc¹⁰, Virginia Tech¹¹, Plant & Food Research¹²

01 Nov 2013-G3: Genes, Genomes, Genetics

TL;DR: The work presented here has led to a greatly improved ordering of the potato reference genome superscaffolds into chromosomal “pseudomolecules”.

Abstract: The genome of potato, a major global food crop, was recently sequenced. The work presented here details the integration of the potato reference genome (DM) with a new sequence-tagged site marker-based linkage map and other physical and genetic maps of potato and the closely related species tomato. Primary anchoring of the DM genome assembly was accomplished by the use of a diploid segregating population, which was genotyped with several types of molecular genetic markers to construct a new ~936 cM linkage map comprising 2469 marker loci. In silico anchoring approaches used genetic and physical maps from the diploid potato genotype RH89-039-16 (RH) and tomato. This combined approach has allowed 951 superscaffolds to be ordered into pseudomolecules corresponding to the 12 potato chromosomes. These pseudomolecules represent 674 Mb (~93%) of the 723 Mb genome assembly and 37,482 (~96%) of the 39,031 predicted genes. The superscaffold order and orientation within the pseudomolecules are closely collinear with independently constructed high density linkage maps. Comparisons between marker distribution and physical location reveal regions of greater and lesser recombination, as well as regions exhibiting significant segregation distortion. The work presented here has led to a greatly improved ordering of the potato reference genome superscaffolds into chromosomal “pseudomolecules”.

236 citations
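The coverage figures quoted in the pseudomolecule abstract above (674 Mb of 723 Mb, and 37,482 of 39,031 genes) can be checked with a few lines of arithmetic. This is only an illustrative sketch; the helper name `percent` and the variable names are hypothetical and do not come from the paper.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Figures as stated in the abstract.
anchored_mb, assembly_mb = 674, 723
anchored_genes, predicted_genes = 37_482, 39_031

print(percent(anchored_mb, assembly_mb))         # share of the assembly placed in pseudomolecules
print(percent(anchored_genes, predicted_genes))  # share of predicted genes placed in pseudomolecules
```

The two values come out at roughly 93% and 96%, consistent with the approximate percentages given in the abstract.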

Journal Article

A manually annotated Actinidia chinensis var. chinensis (kiwifruit) genome highlights the challenges associated with draft genomes and gene prediction in plants

Sarah M. Pilkington¹, Ross N. Crowhurst¹, Elena Hilario¹, Simona Nardozza¹, Lena G. Fraser¹, Yongyan Peng¹,², Kularajathevan Gunaseelan¹, Robert M. Simpson, Jibran Tahir, Simon C. Deroles, Kerry Robert Templeton¹, Zhiwei Luo¹, Marcus Davy, Canhong Cheng¹, Mark A McNeilage¹, Davide Scaglione, Yifei Liu³, Qiong Zhang, P. M. Datson¹, Nihal De Silva¹, Susan E. Gardiner, H. Bassett, David Chagné, John McCallum, Helge Dzierzon, Cecilia H. Deng¹, Yen-Yi Wang¹, Lorna Barron¹, Kelvina I. Manako¹, Judith H. Bowen¹, Toshi Foster, Zoe A. Erridge, Heather R. Tiffin, Chethi N. Waite, Kevin M. Davies, Ella R. P. Grierson, William A. Laing, Rebecca Kirk¹, Xiuyin Chen¹, Marion Wood¹, Mirco Montefiori¹, David A. Brummell, Kathy E. Schwinn, Andrew Catanach, Christina G. Fullerton¹, Dawei Li, Sathiyamoorthy Meiyalaghan, Niels J. Nieuwenhuizen¹, Nicola C. Read², Roneel Prakash¹, Donald A. Hunter, Huaibi Zhang, Marian J. McKenzie, Mareike Knäbel, Alastair Harris², Andrew C. Allan¹,², Andrew P. Gleave¹, Angela Chen², Bart J. Janssen¹, Blue Plunkett¹, Charles Ampomah-Dwamena¹, Charlotte Voogd¹, Davin Leif¹,², Declan J. Lafferty², Edwige J. F. Souleyre¹, Erika Varkonyi-Gasic¹, Francesco Gambi¹, Jenny Hanley², Jia-Long Yao¹, Joey Cheung², Karine M. David², Ben Warren¹, K.B. Marsh¹, Kimberley C. Snowden¹, Kui Lin-Wang¹, Lara Brian¹, Marcela Martínez-Sánchez¹, Mindy Y. Wang¹, Nadeesha R. Ileperuma¹, Nikolai Macnee¹, Robert Campin¹, Peter A. McAtee¹, Revel S.M. Drummond¹, Richard V. Espley¹, Hilary S. Ireland¹, Rongmei Wu¹, Ross G. Atkinson¹, Sakuntala Karunairetnam¹, Sean Bulley, Shayhan Chunkath², Zac Hanley¹, Roy Storey, Amali H. Thrimawithana¹, Susan Thomson, Charles David, Raffaele Testolin⁴, Hongwen Huang³, Roger P. Hellens⁵, Robert J. Schaffer¹,²

Plant & Food Research¹, University of Auckland², Chinese Academy of Sciences³, University of Udine⁴, Queensland University of Technology⁵

16 Apr 2018-BMC Genomics

TL;DR: Use of the manual annotation tool WebApollo facilitated checking and correction of gene models, enabling improvement of computational prediction; this was especially relevant for certain gene families such as the EXPANSIN-like genes.

Abstract: Most published genome sequences are drafts, and most are dominated by computational gene prediction. Draft genomes typically incorporate considerable sequence data that are not assigned to chromosomes, and predicted genes without quality confidence measures. The current Actinidia chinensis (kiwifruit) ‘Hongyang’ draft genome has 164 Mb of sequences unassigned to pseudo-chromosomes, and omissions have been identified in the gene models. A second genome of an A. chinensis (genotype Red5) was fully sequenced. This new sequence resulted in a 554.0 Mb assembly with all but 6 Mb assigned to pseudo-chromosomes. Pseudo-chromosomal comparisons showed that a considerable number of translocation events have occurred following a whole genome duplication (WGD) event, some consistent with centromeric Robertsonian-like translocations. RNA sequencing data from 12 tissues and ab initio analysis informed a genome-wide manual annotation, using the WebApollo tool. In total, 33,044 gene loci represented by 33,123 isoforms were identified, named and tagged for quality of evidential support. Of these, 3114 (9.4%) were identical to a protein within ‘Hongyang’ The Kiwifruit Information Resource (KIR v2). Some proportion of the differences will be varietal polymorphisms. However, as most computationally predicted Red5 models required manual re-annotation, this proportion is expected to be small. The quality of the new gene models was tested by fully sequencing 550 cloned ‘Hort16A’ cDNAs and comparing with the predicted protein models for Red5 and both the original ‘Hongyang’ assembly and the revised annotation from KIR v2. Only 48.9% and 63.5% of the cDNAs had a match with 90% identity or better to the original and revised ‘Hongyang’ annotation, respectively, compared with 90.9% to the Red5 models. Our study highlights the need to take a cautious approach to draft genomes and computationally predicted genes. Our use of the manual annotation tool WebApollo facilitated manual checking and correction of gene models, enabling improvement of computational prediction. This utility was especially relevant for certain types of gene families such as the EXPANSIN-like genes. Finally, this high quality gene set will supply the kiwifruit and general plant community with a new tool for genomics and other comparative analysis.

129 citations

Journal Article

Identification of Mendel's White Flower Character

Roger P. Hellens¹, Carol Moreau², Kui Lin-Wang¹, Kathy E. Schwinn¹, Susan Thomson¹, Mark Fiers¹, Tonya J. Frew¹, Sarah R. Murray¹, Julie M.I. Hofer², Jeanne M. E. Jacobs¹, Kevin M. Davies¹, Andrew C. Allan¹, Abdelhafid Bendahmane, Clarice J. Coyne³, Gail M. Timmerman-Vaughan¹, T. H. Noel Ellis²

Plant & Food Research¹, John Innes Centre², United States Department of Agriculture³

11 Oct 2010-PLOS ONE

TL;DR: In this paper, a combination of genetic mapping, fast neutron mutant analysis, allelic diversity, transcript quantification and transient expression complementation studies was used to identify the pea genes A and A2.

Abstract: Background: The genetic regulation of flower color has been widely studied, notably as a character used by Mendel and his predecessors in the study of inheritance in pea. Methodology/Principal Findings: We used the genome sequence of model legumes, together with their known synteny to the pea genome to identify candidate genes for the A and A2 loci in pea. We then used a combination of genetic mapping, fast neutron mutant analysis, allelic diversity, transcript quantification and transient expression complementation studies to confirm the identity of the candidates. Conclusions/Significance: We have identified the pea genes A and A2. A is the factor determining anthocyanin pigmentation in pea that was used by Gregor Mendel 150 years ago in his study of inheritance. The A gene encodes a bHLH transcription factor. The white flowered mutant allele most likely used by Mendel is a simple G to A transition in a splice donor site that leads to a mis-spliced mRNA with a premature stop codon, and we have identified a second rare mutant allele. The A2 gene encodes a WD40 protein that is part of an evolutionarily conserved regulatory complex.

113 citations

Journal Article

The chemopreventive agent phenethyl isothiocyanate sensitizes cells to Fas-mediated apoptosis.

Juliet M. Pullar, Susan Thomson, Monica J. King, Christopher I. Turnbull, Robyn G. Midwinter, Mark B. Hampton

19 Dec 2003-Carcinogenesis

TL;DR: Evidence is provided for a new mechanism of chemoprevention, wherein sublethal doses of phenethyl isothiocyanate (PEITC) sensitize cells to Fas-mediated apoptosis, and it is proposed that PEITC promotes apoptosis by directly modifying intracellular thiol proteins.

Abstract: The chemopreventive properties of the isothiocyanates have been attributed to their ability to inhibit phase I enzymes that activate procarcinogens, induce phase II protective enzymes and trigger apoptosis in transformed cells. In this study we provide evidence for a new mechanism of chemoprevention, wherein sublethal doses of phenethyl isothiocyanate (PEITC) sensitize cells to Fas-mediated apoptosis. The phenomenon was observed in the Fas-resistant T24 bladder carcinoma cell line and in Jurkat T cells overexpressing the anti-apoptotic protein Bcl-2. Caspase-3-like activity was increased up to 20-fold of that observed with either PEITC or anti-Fas antibody alone. While PEITC activated ERK, JNK and p38, inhibitors of these MAP kinases did not block apoptosis. PEITC transiently depleted cellular glutathione, providing a putative mechanism for sensitizing the cells to apoptosis. However, lowering glutathione with buthionine sulfoximine did not mimic the effect of PEITC. Instead, we propose that PEITC promotes apoptosis by directly modifying intracellular thiol proteins. The ability of PEITC to sensitize cells to receptor-mediated apoptosis provides an additional mechanism to explain its chemopreventive properties.

63 citations

Cited by

Journal Article

Phytozome: a comparative platform for green plant genomics

David Goodstein¹, Shengqiang Shu¹, Russell Howson¹, Rochak Neupane¹, Richard D. Hayes¹, Joni Fazo¹, Therese Mitros¹, William Dirks¹, Uffe Hellsten¹, Nicholas H. Putnam¹, Daniel S. Rokhsar¹

United States Department of Energy¹

01 Jan 2012-Nucleic Acids Research

TL;DR: Phytozome provides a view of the evolutionary history of every plant gene at the level of sequence, gene structure, gene family and genome organization, while at the same time providing access to the sequences and functional annotations of a growing number of complete plant genomes.

Abstract: The number of sequenced plant genomes and associated genomic resources is growing rapidly with the advent of both an increased focus on plant genomics from funding agencies, and the application of inexpensive next generation sequencing. To interact with this increasing body of data, we have developed Phytozome (http://www.phytozome.net), a comparative hub for plant genome and gene family data and analysis. Phytozome provides a view of the evolutionary history of every plant gene at the level of sequence, gene structure, gene family and genome organization, while at the same time providing access to the sequences and functional annotations of a growing number (currently 25) of complete plant genomes, including all the land plants and selected algae sequenced at the Joint Genome Institute, as well as selected species sequenced elsewhere. Through a comprehensive plant genome database and web portal, these data and analyses are available to the broader plant science research community, providing powerful comparative genomics tools that help to link model systems with other plants of economic and ecological importance.

3,728 citations

Journal Article

The tomato genome sequence provides insights into fleshy fruit evolution

Shusei Sato, Satoshi Tabata, Hideki Hirakawa, Erika Asamizu, and 320 more authors (51 institutions)

31 May 2012-Nature

TL;DR: A high-quality genome sequence of domesticated tomato and a draft sequence of its closest wild relative, Solanum pimpinellifolium, are presented and compared to each other and to the potato genome.

Abstract: Tomato (Solanum lycopersicum) is a major crop plant and a model system for fruit development. Solanum is one of the largest angiosperm genera and includes annual and perennial plants from diverse habitats. Here we present a high-quality genome sequence of domesticated tomato, a draft sequence of its closest wild relative, Solanum pimpinellifolium, and compare them to each other and to the potato genome (Solanum tuberosum). The two tomato genomes show only 0.6% nucleotide divergence and signs of recent admixture, but show more than 8% divergence from potato, with nine large and several smaller inversions. In contrast to Arabidopsis, but similar to soybean, tomato and potato small RNAs map predominantly to gene-rich chromosomal regions, including gene promoters. The Solanum lineage has experienced two consecutive genome triplications: one that is ancient and shared with rosids, and a more recent one. These triplications set the stage for the neofunctionalization of genes controlling fruit characteristics, such as colour and fleshiness.

2,687 citations

Integrative Genomics Viewer

James T. Robinson¹, Helga Thorvaldsdottir¹, Wendy Winckler¹, Mitchell Guttman¹, Eric S. Lander¹,², Gad Getz¹, Jill P. Mesirov¹

Massachusetts Institute of Technology¹, Harvard University²

01 Jan 2011

TL;DR: The sheer volume and scope of genome-wide data pose a significant challenge to the development of efficient and intuitive visualization tools that can scale to very large data sets and flexibly integrate multiple data types, including clinical data.

Abstract: Rapid improvements in sequencing and array-based platforms are resulting in a flood of diverse genome-wide data, including data from exome and whole-genome sequencing, epigenetic surveys, expression profiling of coding and noncoding RNAs, single nucleotide polymorphism (SNP) and copy number profiling, and functional assays. Analysis of these large, diverse data sets holds the promise of a more comprehensive understanding of the genome and its relation to human disease. Experienced and knowledgeable human review is an essential component of this process, complementing computational approaches. This calls for efficient and intuitive visualization tools able to scale to very large data sets and to flexibly integrate multiple data types, including clinical data. However, the sheer volume and scope of data pose a significant challenge to the development of such tools.

2,187 citations

Journal Article

The oyster genome reveals stress adaptation and complexity of shell formation

Guofan Zhang¹, Xiaodong Fang, Ximing Guo², Li Li, Ruibang Luo, Fei Xu, Pengcheng Yang, Linlin Zhang, Xiaotong Wang, Haigang Qi, Zhiqiang Xiong, Huayong Que, Yinlong Xie, Peter W. H. Holland³, Jordi Paps³, Yabing Zhu, Fucun Wu, Yuanxin Chen, Jiafeng Wang, Chunfang Peng, Jie Meng, Lan Yang, Jun Liu, Bo Wen, Na Zhang, Zhiyong Huang, Qihui Zhu, Yue Feng, Andrew S. Mount⁴, Dennis Hedgecock⁵, Zhe Xu⁶, Yunjie Liu, Tomislav Domazet-Lošo, Yishuai Du, Xiaoqing Sun, Shoudu Zhang, Binghang Liu, Peizhou Cheng, Xuanting Jiang, Juan Li, Dingding Fan, Wei Wang, Wenjing Fu, Tong Wang, Bo Wang, Jibiao Zhang, Zhiyu Peng, Yingxiang Li, Na Li, Jinpeng Wang, Maoshan Chen, Yan He², Fengji Tan, Xiaorui Song, Qiumei Zheng, Ronglian Huang, Hailong Yang, Du Xuedi, Li Chen, Mei Yang, Patrick M. Gaffney⁷, Shan Wang², Longhai Luo, Zhicai She, Yao Ming, Huang Wen, Shu Zhang, Baoyu Huang, Yong Zhang, Tao Qu, Peixiang Ni, Guoying Miao, Junyi Wang, Qiang Wang, Christian E. W. Steinberg⁸, Haiyan Wang, Ning Li, Lumin Qian², Guojie Zhang, Yingrui Li, Huanming Yang, Xiao Liu, Jian Wang, Ye Yin, Jun Wang⁹

Chinese Academy of Sciences¹, Rutgers University², University of Oxford³, Clemson University⁴, University of Southern California⁵, Atlantic Cape Community College⁶, University of Delaware⁷, Humboldt University of Berlin⁸, University of Copenhagen⁹

04 Oct 2012-Nature

TL;DR: The sequencing and assembly of the oyster genome using short reads and a fosmid-pooling strategy are reported, along with transcriptomes of development and stress response and the proteome of the shell; the analyses show that shell formation in molluscs is more complex than currently understood and involves extensive participation of cells and their exosomes.

Abstract: The Pacific oyster Crassostrea gigas belongs to one of the most species-rich but genomically poorly explored phyla, the Mollusca. Here we report the sequencing and assembly of the oyster genome using short reads and a fosmid-pooling strategy, along with transcriptomes of development and stress response and the proteome of the shell. The oyster genome is highly polymorphic and rich in repetitive sequences, with some transposable elements still actively shaping variation. Transcriptome studies reveal an extensive set of genes responding to environmental stress. The expansion of genes coding for heat shock protein 70 and inhibitors of apoptosis is probably central to the oyster's adaptation to sessile life in the highly stressful intertidal zone. Our analyses also show that shell formation in molluscs is more complex than currently understood and involves extensive participation of cells and their exosomes. The oyster genome sequence fills a void in our understanding of the Lophotrochozoa.

1,806 citations

Journal Article

Repetitive DNA and next-generation sequencing: computational challenges and solutions.

Todd J. Treangen¹, Steven L. Salzberg¹,²

Johns Hopkins University School of Medicine¹, Johns Hopkins University²

01 Jan 2012-Nature Reviews Genetics

TL;DR: The computational problems surrounding repeats are discussed and strategies used by current bioinformatics systems to solve them are described.

Abstract: Repetitive DNA sequences are abundant in a broad range of species, from bacteria to mammals, and they cover nearly half of the human genome. Repeats have always presented technical challenges for sequence alignment and assembly programs. Next-generation sequencing projects, with their short read lengths and high data volumes, have made these challenges more difficult. From a computational perspective, repeats create ambiguities in alignment and assembly, which, in turn, can produce biases and errors when interpreting results. Simply ignoring repeats is not an option, as this creates problems of its own and may mean that important biological phenomena are missed. We discuss the computational problems surrounding repeats and describe strategies used by current bioinformatics systems to solve them.

1,451 citations
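The alignment ambiguity described in the abstract above can be illustrated with a toy sketch. It is not a method from the paper: the function `find_all`, the reference string, and both reads are made up for illustration, and real aligners use indexed, error-tolerant search rather than exact string matching.

```python
def find_all(reference, read):
    """Return every 0-based start position where `read` matches `reference` exactly."""
    positions = []
    start = reference.find(read)
    while start != -1:
        positions.append(start)
        # Advance by one base so that overlapping matches are also reported.
        start = reference.find(read, start + 1)
    return positions


# Toy reference containing the same 7-base repeat twice.
reference = "GATTACA" + "CCC" + "GATTACA" + "TTG"

repeat_read = "GATTACA"  # lies entirely inside the repeat
unique_read = "ACATTG"   # spans the repeat boundary into unique sequence

print(find_all(reference, repeat_read))  # prints [0, 10]
print(find_all(reference, unique_read))  # prints [14]
```

A read that falls wholly inside the repeat has two equally good placements, while a read that extends into flanking unique sequence has only one, which is why short read lengths make repeats harder to resolve.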
