Author

Thomas G. Marr

Bio: Thomas G. Marr is an academic researcher from Cold Spring Harbor Laboratory. The author has contributed to research on the topics String metric and Schizosaccharomyces pombe. The author has an h-index of 16 and has co-authored 21 publications receiving 1545 citations.

Papers

Proceedings Article

Combinatorial pattern discovery for scientific data: some preliminary results

Jason T. L. Wang¹, Gung-Wei Chirn¹, Thomas G. Marr², Bruce A. Shapiro³, Dennis Shasha⁴, Kaizhong Zhang⁵

New Jersey Institute of Technology¹, Cold Spring Harbor Laboratory², National Institutes of Health³, Courant Institute of Mathematical Sciences⁴, University of Western Ontario⁵

24 May 1994

TL;DR: This paper presents an example of combinatorial pattern discovery: the discovery of patterns in protein databases, which give information that is complementary to the best protein classifier available today.

Abstract: Suppose you are given a set of natural entities (e.g., proteins, organisms, weather patterns, etc.) that possess some important common externally observable properties. You also have a structural description of the entities (e.g., sequence, topological, or geometrical data) and a distance metric. Combinatorial pattern discovery is the activity of finding patterns in the structural data that might explain these common properties based on the metric. This paper presents an example of combinatorial pattern discovery: the discovery of patterns in protein databases. The structural representations we consider are strings and the distance metric is string edit distance permitting variable length don't cares. Our techniques incorporate string matching algorithms and novel heuristics for discovery and optimization, most of which generalize to other combinatorial structures. Experimental results of applying the techniques to both generated data and functionally related protein families obtained from the Cold Spring Harbor Laboratory show the effectiveness of the proposed techniques. When we apply the discovered patterns to perform protein classification, they give information that is complementary to the best protein classifier available today.

193 citations
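
The distance named in the abstract above, string edit distance permitting variable length don't cares, can be illustrated with a small dynamic program. This is a minimal sketch, not the paper's algorithm: it assumes a don't-care is written as `*` in the pattern and matches any substring (including the empty one) at zero cost, while all other edits cost 1. The function name `vldc_edit_distance` and the `wildcard` parameter are hypothetical.

```python
def vldc_edit_distance(pattern: str, text: str, wildcard: str = "*") -> int:
    """Edit distance in which each wildcard in `pattern` absorbs any substring for free.

    Assumption for illustration: unit costs for insert, delete, and substitute.
    """
    n = len(text)
    # prev[j] = distance between the pattern prefix seen so far and text[:j]
    prev = list(range(n + 1))  # empty pattern: insert j characters
    for p in pattern:
        if p == wildcard:
            # The wildcard either matches nothing (prev[j]) or
            # absorbs one more text character at no cost (curr[j - 1]).
            curr = [prev[0]]
            for j in range(1, n + 1):
                curr.append(min(prev[j], curr[j - 1]))
        else:
            curr = [prev[0] + 1]  # delete p against empty text
            for j in range(1, n + 1):
                cost = 0 if p == text[j - 1] else 1
                curr.append(min(prev[j] + 1,          # delete p
                                curr[j - 1] + 1,      # insert text[j - 1]
                                prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[n]
```

With this definition, a pattern such as `*AB*` is at distance 0 from any string containing `AB`, and at distance 1 from a string that contains `AC` but no `AB`.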

Journal Article

A weight array method for splicing signal analysis

Michael Q. Zhang¹, Thomas G. Marr

Cold Spring Harbor Laboratory¹

01 Oct 1993-Bioinformatics

TL;DR: The results show that there may exist weak pairwise correlations within the signals and that the proposed weight array method can help to better discriminate these signals.

Abstract: A new method of sequence analysis, using a weight array method (WAM), which generalizes the traditional Staden weight matrix method (WMM), is proposed. With the help of a statistical mechanical model, the discriminant function is identified with the energy function describing macromolecular interactions. The method is applied to the study of 5'-splice signals in Schizosaccharomyces pombe pre-mRNA sequences. The results show that there may exist weak pairwise correlations within the signals and that our method can help to better discriminate these signals. Experiments are proposed to test the predictions of the theory.

191 citations
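
The contrast between a weight matrix and a weight array described in the abstract above can be sketched in a few lines. This is an illustrative sketch only, under the assumption that the weight array conditions each position on the immediately preceding base, whereas the weight matrix treats positions independently. The function names, the pseudocount, and the use of plain log-probabilities (rather than the paper's discriminant function) are all assumptions made for the example.

```python
import math
from collections import Counter

ALPHABET = "ACGT"

def train_wmm(seqs, pseudo=1.0):
    """Weight matrix: independent log-probabilities log p_i(x) per position."""
    total = len(seqs) + pseudo * len(ALPHABET)
    model = []
    for i in range(len(seqs[0])):
        counts = Counter(s[i] for s in seqs)
        model.append({a: math.log((counts[a] + pseudo) / total) for a in ALPHABET})
    return model

def train_wam(seqs, pseudo=1.0):
    """Weight array (assumed first-order): log p_i(x | previous base) per position."""
    model = []
    for i in range(1, len(seqs[0])):
        pair_counts = Counter((s[i - 1], s[i]) for s in seqs)
        prev_counts = Counter(s[i - 1] for s in seqs)
        table = {}
        for a in ALPHABET:
            total = prev_counts[a] + pseudo * len(ALPHABET)
            for b in ALPHABET:
                table[(a, b)] = math.log((pair_counts[(a, b)] + pseudo) / total)
        model.append(table)
    return model

def score_wmm(wmm, seq):
    """Sum of independent per-position log-probabilities."""
    return sum(wmm[i][x] for i, x in enumerate(seq))

def score_wam(wmm, wam, seq):
    """First position scored by the matrix, the rest by adjacent-pair terms."""
    score = wmm[0][seq[0]]
    for i in range(1, len(seq)):
        score += wam[i - 1][(seq[i - 1], seq[i])]
    return score

# Toy aligned sites, invented for illustration (not data from the paper).
sites = ["GTAAGT", "GTAAGT", "GTGAGT", "GTATGT"]
wmm, wam = train_wmm(sites), train_wam(sites)
```

Both models have the same calling pattern, so a candidate site can be scored with `score_wmm(wmm, site)` and `score_wam(wmm, wam, site)` and the two compared; the array only adds information when neighbouring positions are correlated.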

Journal Article

Genome-wide scan of bipolar disorder in 65 pedigrees: supportive evidence for linkage at 8q24, 18q22, 4q32, 2p12, and 13q12

Melvin G. McInnis¹, T-H Lan¹,², Virginia L. Willour¹, Francis J. McMahon¹,³, Sylvia G. Simpson¹, Anjene Musick Addington¹, Dean F. MacKinnon¹, James B. Potash¹, A. T. Mahoney¹, Jennifer L. Chellis¹, Y Huo¹, Theresa Swift-Scanlan¹, Haiming Chen¹, Rebecca Koskela¹, O. Colin Stine¹,⁴, Kay Redfield Jamison¹, Peter Holmans⁵, Susan E. Folstein⁶, Koustubh Ranade⁷, Carl Friddle⁷, D Botstein⁶, Thomas G. Marr, T.H. Beaty¹, Peter P. Zandi¹, J. Raymond DePaulo¹

Johns Hopkins University¹, National Yang-Ming University², National Institutes of Health³, University of Maryland, Baltimore⁴, Medical Research Council⁵, Tufts University⁶, Stanford University⁷

01 Mar 2003-Molecular Psychiatry

TL;DR: A genome-wide scan of 65 pedigrees ascertained through a Bipolar I proband, using nonparametric linkage methods and several analyses of possible parent-of-origin effect, identified 15 loci with nominally significant evidence for increased allele sharing among affected relative pairs.

Abstract: The purpose of this study was to assess 65 pedigrees ascertained through a Bipolar I (BPI) proband for evidence of linkage, using nonparametric methods in a genome-wide scan, and for possible parent of origin effect, using several analytical methods. We identified 15 loci with nominally significant evidence for increased allele sharing among affected relative pairs. Eight of these regions, at 8q24, 18q22, 4q32, 13q12, 4q35, 10q26, 2p12, and 12q24, directly overlap with previously reported evidence of linkage to bipolar disorder. Five regions at 20p13, 2p22, 14q23, 9p13, and 1q41 are within several Mb of previously reported regions. We report our findings in rank order, and the top five markers had an NPL > 2.5. The peak findings in these regions were D8S256 at 8q24, NPL 3.13; D18S878 at 18q22, NPL 2.90; D4S1629 at 4q32, NPL 2.80; D2S99 at 2p12, NPL 2.54; and D13S1493 at 13q12, NPL 2.53. No locus produced statistically significant evidence for linkage at the genome-wide level. The parent of origin effect was studied and, consistent with our previous findings, evidence for a locus on 18q22 was predominantly from families wherein the father or paternal lineage was affected. There was evidence consistent with paternal imprinting at the loci on 13q12 and 1q41.

148 citations

Journal Article

A 13-Kb Resolution Cosmid Map of the 14-Mb Fission Yeast Genome by Nonrandom Sequence-Tagged Site Mapping

Toru Mizukami¹, William I. Chang², Igor Garkavtsev¹, Nancy Kaplan¹, Diane Lombardi¹, Tomohiro Matsumoto¹, Osami Niwa³, Asako Kounosu³, Mitsuhiro Yanagida³, Thomas G. Marr², David Beach¹

Howard Hughes Medical Institute¹, Cold Spring Harbor Laboratory², Kyoto University³

09 Apr 1993-Cell

TL;DR: This work presents the application of a nonrandom sequence-tagged site (STS) content detection method in mapping an entire genome, that of fission yeast, and develops powerful techniques, based on consistency analysis, for error detection and contig assembly.

146 citations

Journal Article

Understanding long-range correlations in DNA sequences

Wentian Li¹, Thomas G. Marr¹, Kunihiko Kaneko²

Cold Spring Harbor Laboratory¹, University of Tokyo²

01 Aug 1994-Physica D: Nonlinear Phenomena

TL;DR: It is concluded that a mixture of many length scales (including some relatively long ones) in DNA sequences is responsible for the observed 1/f-like spectral component.

136 citations

Cited by

Proceedings Article

Mining sequential patterns

Rakesh Agrawal¹, Ramakrishnan Srikant¹

IBM¹

06 Mar 1995

TL;DR: Three algorithms are presented to solve the problem of mining sequential patterns over databases of customer transactions, and empirically evaluating their performance using synthetic data shows that two of them have comparable performance.

Abstract: We are given a large database of customer transactions, where each transaction consists of customer-id, transaction time, and the items bought in the transaction. We introduce the problem of mining sequential patterns over such databases. We present three algorithms to solve this problem, and empirically evaluate their performance using synthetic data. Two of the proposed algorithms, AprioriSome and AprioriAll, have comparable performance, albeit AprioriSome performs a little better when the minimum number of customers that must support a sequential pattern is low. Scale-up experiments show that both AprioriSome and AprioriAll scale linearly with the number of customer transactions. They also have excellent scale-up properties with respect to the number of transactions per customer and the number of items in a transaction.

5,663 citations

Journal Article

Prediction of Complete Gene Structures in Human Genomic DNA

Christopher B. Burge¹, Samuel Karlin¹

Stanford University¹

25 Apr 1997-Journal of Molecular Biology

TL;DR: A general probabilistic model of the gene structure of human genomic sequences which incorporates descriptions of the basic transcriptional, translational and splicing signals, as well as length distributions and compositional features of exons, introns and intergenic regions is introduced.

3,709 citations

Book Chapter

Mining Sequential Patterns: Generalizations and Performance Improvements

Ramakrishnan Srikant¹,², Rakesh Agrawal²

University of Wisconsin-Madison¹, IBM²

25 Mar 1996

TL;DR: This work adds time constraints that specify a minimum and/or maximum time period between adjacent elements in a pattern, and relaxes the restriction that the items in an element of a sequential pattern must come from the same transaction.

Abstract: The problem of mining sequential patterns was recently introduced in [3]. We are given a database of sequences, where each sequence is a list of transactions ordered by transaction-time, and each transaction is a set of items. The problem is to discover all sequential patterns with a user-specified minimum support, where the support of a pattern is the number of data-sequences that contain the pattern. An example of a sequential pattern is “5% of customers bought ‘Foundation’ and ‘Ringworld’ in one transaction, followed by ‘Second Foundation’ in a later transaction”. We generalize the problem as follows. First, we add time constraints that specify a minimum and/or maximum time period between adjacent elements in a pattern. Second, we relax the restriction that the items in an element of a sequential pattern must come from the same transaction, instead allowing the items to be present in a set of transactions whose transaction-times are within a user-specified time window. Third, given a user-defined taxonomy (is-a hierarchy) on items, we allow sequential patterns to include items across all levels of the taxonomy.

2,973 citations
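
The basic support definition in the abstract above (before the time-constraint, time-window, and taxonomy generalizations) can be sketched as follows. This is a minimal illustration, not the paper's algorithm: it assumes a data-sequence is a time-ordered list of item sets, a pattern is a list of item sets, and a sequence contains a pattern when each pattern element is a subset of a distinct transaction, in order. The names `contains` and `support` and the three-customer database are hypothetical.

```python
def contains(data_sequence, pattern):
    """True if each pattern element fits inside a distinct transaction, in order."""
    pos = 0
    for element in pattern:
        # Greedily advance to the earliest remaining transaction that holds the element.
        while pos < len(data_sequence) and not element <= data_sequence[pos]:
            pos += 1
        if pos == len(data_sequence):
            return False
        pos += 1  # the next element must come from a later transaction
    return True

def support(database, pattern):
    """Number of data-sequences in the database that contain the pattern."""
    return sum(contains(seq, pattern) for seq in database)

# Invented toy database: one time-ordered list of transactions per customer.
database = [
    [{"Foundation", "Ringworld"}, {"Second Foundation"}],
    [{"Foundation"}, {"Ringworld"}, {"Second Foundation"}],
    [{"Second Foundation"}, {"Foundation", "Ringworld"}],
]
pattern = [{"Foundation", "Ringworld"}, {"Second Foundation"}]
print(support(database, pattern))  # 1
```

Only the first customer supports the pattern: the second bought the first two books in separate transactions, and the third bought them in the wrong order. Relaxing the same-transaction restriction with a time window, as the abstract describes, would change the outcome for the second customer.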

Journal Article

A guided tour to approximate string matching

Gonzalo Navarro¹

University of Chile¹

01 Mar 2001-ACM Computing Surveys

TL;DR: This work surveys the current techniques to cope with the problem of string matching that allows errors, and focuses on online searching and mostly on edit distance, explaining the problem and its relevance, its statistical behavior, its history and current developments, and the central ideas of the algorithms.

Abstract: We survey the current techniques to cope with the problem of string matching that allows errors. This is becoming a more and more relevant issue for many fast growing areas such as information retrieval and computational biology. We focus on online searching and mostly on edit distance, explaining the problem and its relevance, its statistical behavior, its history and current developments, and the central ideas of the algorithms and their complexities. We present a number of experiments to compare the performance of the different algorithms and show which are the best choices. We conclude with some directions for future work and open problems.

2,723 citations

Journal Article

Telomerase catalytic subunit homologs from fission yeast and human

Toru Nakamura¹, Gregg B. Morin¹,², Karen B. Chapman¹,², Scott L. Weinrich¹,², William H. Andrews¹,², Joachim Lingner¹,², Calvin B. Harley¹,², Thomas R. Cech¹,²

Howard Hughes Medical Institute¹, Geron Corporation²

15 Aug 1997-Science

TL;DR: In this paper, the homologous genes from the fission yeast Schizosaccharomyces pombe and human are identified and the proposed telomerase catalytic subunits represent a deep branch in the evolution of reverse transcriptases.

Abstract: Catalytic protein subunits of telomerase from the ciliate Euplotes aediculatus and the yeast Saccharomyces cerevisiae contain reverse transcriptase motifs. Here the homologous genes from the fission yeast Schizosaccharomyces pombe and human are identified. Disruption of the S. pombe gene resulted in telomere shortening and senescence, and expression of mRNA from the human gene correlated with telomerase activity in cell lines. Sequence comparisons placed the telomerase proteins in the reverse transcriptase family but revealed hallmarks that distinguish them from retroviral and retrotransposon relatives. Thus, the proposed telomerase catalytic subunits are phylogenetically conserved and represent a deep branch in the evolution of reverse transcriptases.

2,181 citations
