Author

Lei-Hoon See

Bio: Lei-Hoon See is an academic researcher from Cold Spring Harbor Laboratory. The author has contributed to research in topics: Genome & Genomics. The author has an h-index of 3 and has co-authored 3 publications receiving 6,653 citations.

Papers
Journal ArticleDOI
Sarah Djebali, Carrie A. Davis, Angelika Merkel, Alexander Dobin, Timo Lassmann, Ali Mortazavi, Andrea Tanzer, Julien Lagarde, Wei Lin, Felix Schlesinger, Chenghai Xue, Georgi K. Marinov, Jainab Khatun, Brian A. Williams, Chris Zaleski, Joel Rozowsky, Marion S. Röder, Felix Kokocinski, Rehab F. Abdelhamid, Tyler Alioto, Igor Antoshechkin, Michael T. Baer, Nadav Bar, Philippe Batut, Kimberly Bell, Ian Bell, Sudipto K. Chakrabortty, Xian Chen, Jacqueline Chrast, Joao Curado, Thomas Derrien, Jorg Drenkow, Erica Dumais, Jacqueline Dumais, Radha Duttagupta, Emilie Falconnet, Meagan Fastuca, Kata Fejes-Toth, Pedro G. Ferreira, Sylvain Foissac, Melissa J. Fullwood, Hui Gao, David Gonzalez, Assaf Gordon, Harsha P. Gunawardena, Cédric Howald, Sonali Jha, Rory Johnson, Philipp Kapranov, Brandon King, Colin Kingswood, Oscar Junhong Luo, Eddie Park, Kimberly Persaud, Jonathan B. Preall, Paolo Ribeca, Brian A. Risk, Daniel Robyr, Michael Sammeth, Lorian Schaffer, Lei-Hoon See, Atif Shahab, Jørgen Skancke, Ana Maria Suzuki, Hazuki Takahashi, Hagen Tilgner, Diane Trout, Nathalie Walters, Huaien Wang, John A. Wrobel, Yanbao Yu, Xiaoan Ruan, Yoshihide Hayashizaki, Jennifer Harrow, Mark Gerstein, Tim Hubbard, Alexandre Reymond, Stylianos E. Antonarakis, Gregory J. Hannon, Morgan C. Giddings, Yijun Ruan, Barbara J. Wold, Piero Carninci, Roderic Guigó, Thomas R. Gingeras
06 Sep 2012-Nature
TL;DR: Evidence that three-quarters of the human genome is capable of being transcribed is reported, as well as observations about the range and levels of expression, localization, processing fates, regulatory regions and modifications of almost all currently annotated and thousands of previously unannotated RNAs that prompt a redefinition of the concept of a gene.
Abstract: Eukaryotic cells make many types of primary and processed RNAs that are found either in specific subcellular compartments or throughout the cells. A complete catalogue of these RNAs is not yet available and their characteristic subcellular localizations are also poorly understood. Because RNA represents the direct output of the genetic information encoded by genomes and a significant proportion of a cell's regulatory capabilities are focused on its synthesis, processing, transport, modification and translation, the generation of such a catalogue is crucial for understanding genome function. Here we report evidence that three-quarters of the human genome is capable of being transcribed, as well as observations about the range and levels of expression, localization, processing fates, regulatory regions and modifications of almost all currently annotated and thousands of previously unannotated RNAs. These observations, taken together, prompt a redefinition of the concept of a gene.

4,450 citations

01 Sep 2012
TL;DR: The Encyclopedia of DNA Elements (ENCODE) project provides new insights into the organization and regulation of human genes and the human genome, and is an expansive resource of functional annotations for biomedical research.

2,767 citations

Journal Article
Feng Yue, Yong Cheng, Alessandra Breschi, Jeff Vierstra, Weisheng Wu, Tyrone Ryba, Richard Sandstrom, Zhihai Ma, Carrie A. Davis, Benjamin D. Pope, Yin Shen, Dmitri D. Pervouchine, Sarah Djebali, Robert Thurman, Rajinder Kaul, Eric Rynes, Anthony Kirilusha, Georgi K. Marinov, Brian A. Williams, Diane Trout, Henry Amrhein, Katherine I. Fisher-Aylor, Igor Antoshechkin, Gilberto DeSalvo, Lei-Hoon See, Megan Fastuca, Jorg Drenkow, Chris Zaleski, Alexander Dobin, Pablo Prieto, Julien Lagarde, Giovanni Bussotti, Andrea Tanzer, Olgert Denas, Kanwei Li, Michaël Bender, Miaohua Zhang, Rachel Byron, Mark Groudine, David F. McCleary, Long Pham, Zhen Ye, Samantha Kuan, Lee Edsall, Yi-Chieh Wu, Marie-Louise Hee Rasmussen, Mukul S. Bansal, Manolis Kellis, Cheryl A. Keller, Christopher T. Morrissey, Tejaswini Mishra, Deepti Jain, Nergiz Dogan, Raymond C. Harris, Philip Cayting, Trupti Kawli, Alan P. Boyle, Ghia Euskirchen, Anshul Kundaje, Shin Lin, Yiing Lin, Camden Jansen, Venkat S. Malladi, Melissa S. Cline, Drew T. Erickson, Vanessa M. Kirkup, Katrina Learned, Cricket A. Sloan, Kate R. Rosenbloom, d.B. Lacerda, Kathryn Beal, Miguel Pignatelli, Paul Flicek, Jin Lian, Tamer Kahveci, Dongwon Lee, W. J. Kent, S.M. Ramalho, Javier Herrero, Cedric Notredame, Andrew D. Johnson, Shinny Vong, Kristen Lee, Daniel Bates, Fidencio J. Neri, Morgan Diegel, T. Canfield, Peter J. Sabo, Matthew S. Wilken, Thomas A. Reh, Erika Giste, Anthony Shafer, Tanya Kutyavin, Eric Haugen, Douglas Dunn, Shane Neph, Richard Humbert, Robin L Hansen, M.H.L. de Bruijn, Licia Selleri, Alexander Y. Rudensky, Steven Z. Josefowicz, Robert M. Samstein, Evan E. Eichler, Stuart H. Orkin, Dana N. Levasseur, Thalia Papayannopoulou, Kai-Hsin Chang, Arthur I. Skoultchi, Srikanta Gosh, Christine M. Disteche, Piper R. Treuting, Yanli Wang, Mitchell G. Weiss, Gerd A. Blobel, Xiaoyi Cao, Sheng Zhong, Ting Wang, Peter Good, Rebecca F. Lowdon, Leslie B Adams, X. Zhou, Michael J. Pazin, Elise A. Feingold, Barbara J. Wold, Jeremy F. Taylor, Ali Mortazavi, Sherman M. Weissman, John A. Stamatoyannopoulos, Michael Snyder, Roderic Guigó, Thomas R. Gingeras, David M. Gilbert, Ross C. Hardison, Michael A. Beer, Bing Ren
01 Jan 2014-Nature

23 citations

Journal ArticleDOI
Joel Rozowsky, Jorg Drenkow, Yucheng T. Yang, Gamze Gursoy, Timur R. Galeev, Beatrice Borsari, Charles B. Epstein, Kun Xiong, Jinrui Xu, Jiahao Gao, Kai Yu, Ana Berthel, Zhanlin Chen, Fabio C. P. Navarro, Jason Liu, Maxwell S Sun, James C. Wright, Justin Chang, Christopher J. F. Cameron, Noam Shoresh, Elizabeth Gaskell, Jessika Adrian, Sergey Aganezov, François Aguet, Gabriela Balderrama-Gutierrez, Samridhi Banskota, G. Corona, Sora Chee, Surya B. Chhetri, Gabriel Conte Cortez Martins, Cassidy Danyko, Carrie A. Davis, Daniel Farid, Nina Farrell, Idan Gabdank, Yoel Gofin, David U. Gorkin, Mengting Gu, Vivian C. Hecht, Benjamin C. Hitz, Robbyn Issner, Melanie Kirsche, Xiangmeng Kong, Bonita R Lam, Shantao Li, Bian Li, Tianxiao Li, Xiqi Li, Khine Lin, Ruibang Luo, Mark Mackiewicz, Jill Moore, Jonathan M. Mudge, Nicholas C Nelson, Chad Nusbaum, Ioann O. Popov, Henry Pratt, Yunjiang Qiu, Srividya Ramakrishnan, Joe Raymond, Leonidas Salichos, Alexandra Scavelli, Jacob Schreiber, Fritz J. Sedlazeck, Lei-Hoon See, Rachel M. Sherman, Xu Shi, Minyi Shi, Cricket A. Sloan, J. Seth Strattan, Zhen Tan, Forrest Y. Tanaka, Anna Vlasova, Jun Wang, Jonathan D. Werner, Brian A. Williams, Min Xu, Chengfei Yan, Lu Yu, Chris Zaleski, Jing Zhang, Kristin G. Ardlie, J. M. Cherry, Eric M. Mendenhall, William Noble, Zhiping Weng, Morgan E. Levine, Alexander Dobin, Barbara J. Wold, Ali Mortazavi, Bing Ren, Jesse Gillis, Richard M. Myers, Michael Snyder, Jyoti S. Choudhary, Aleksandar Milosavljević, Michael C. Schatz, Roderic Guigó, Bradley E. Bernstein, Thomas R. Gingeras, Mark Gerstein 
22 Nov 2022-Cell
TL;DR: The EN-TEx resource contains 1,635 open-access datasets from four donors (∼30 tissues × ∼15 assays), mapped to matched, diploid genomes with long-read phasing and structural variants, and instantiates a catalog of >1 million allele-specific loci.

3 citations


Cited by
Journal ArticleDOI
TL;DR: The Spliced Transcripts Alignment to a Reference (STAR) software, based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays followed by a seed clustering and stitching procedure, outperforms other aligners by a factor of >50 in mapping speed.
Abstract: Motivation: Accurate alignment of high-throughput RNA-seq data is a challenging and yet unsolved problem because of the non-contiguous transcript structure, relatively short read lengths and constantly increasing throughput of the sequencing technologies. Currently available RNA-seq aligners suffer from high mapping error rates, low mapping speed, read length limitation and mapping biases. Results: To align our large (>80 billion reads) ENCODE Transcriptome RNA-seq dataset, we developed the Spliced Transcripts Alignment to a Reference (STAR) software based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays followed by seed clustering and stitching procedure. STAR outperforms other aligners by a factor of >50 in mapping speed, aligning to the human genome 550 million 2 × 76 bp paired-end reads per hour on a modest 12-core server, while at the same time improving alignment sensitivity and precision. In addition to unbiased de novo detection of canonical junctions, STAR can discover non-canonical splices and chimeric (fusion) transcripts, and is also capable of mapping full-length RNA sequences. Using Roche 454 sequencing of reverse transcription polymerase chain reaction amplicons, we experimentally validated 1960 novel intergenic splice junctions with an 80-90% success rate, corroborating the high precision of the STAR mapping strategy. Availability and implementation: STAR is implemented as a standalone C++ code. STAR is free open source software distributed under GPLv3 license and can be downloaded from http://code.google.com/p/rna-star/.

30,684 citations

Journal ArticleDOI
06 Sep 2012-Nature
TL;DR: The Encyclopedia of DNA Elements (ENCODE) project provides new insights into the organization and regulation of human genes and the human genome, and is an expansive resource of functional annotations for biomedical research.
Abstract: The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall, the project provides new insights into the organization and regulation of our genes and genome, and is an expansive resource of functional annotations for biomedical research.

13,548 citations

Journal ArticleDOI
23 Jan 2015-Science
TL;DR: The authors present a map of the human tissue proteome based on an integrated omics approach that combines quantitative transcriptomics at the tissue and organ level with tissue microarray-based immunohistochemistry to achieve spatial localization of proteins down to the single-cell level.
Abstract: Resolving the molecular details of proteome variation in the different tissues and organs of the human body will greatly increase our knowledge of human biology and disease. Here, we present a map of the human tissue proteome based on an integrated omics approach that involves quantitative transcriptomics at the tissue and organ level, combined with tissue microarray-based immunohistochemistry, to achieve spatial localization of proteins down to the single-cell level. Our tissue-based analysis detected more than 90% of the putative protein-coding genes. We used this approach to explore the human secretome, the membrane proteome, the druggable proteome, the cancer proteome, and the metabolic functions in 32 different tissues and organs. All the data are integrated in an interactive Web-based database that allows exploration of individual proteins, as well as navigation of global expression patterns, in all major tissues and organs in the human body.

9,745 citations

Journal Article
01 Jan 2012-Nature
TL;DR: The Encyclopedia of DNA Elements (ENCODE) project provides new insights into the organization and regulation of human genes and the human genome, and is an expansive resource of functional annotations for biomedical research.
Abstract: The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall, the project provides new insights into the organization and regulation of our genes and genome, and is an expansive resource of functional annotations for biomedical research.

8,106 citations

Journal ArticleDOI
TL;DR: The Gene Expression Omnibus is an international public repository for high-throughput microarray and next-generation sequence functional genomic data sets submitted by the research community and supports archiving of raw data, processed data and metadata which are indexed, cross-linked and searchable.
Abstract: The Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo/) is an international public repository for high-throughput microarray and next-generation sequence functional genomic data sets submitted by the research community. The resource supports archiving of raw data, processed data and metadata which are indexed, cross-linked and searchable. All data are freely available for download in a variety of formats. GEO also provides several web-based tools and strategies to assist users to query, analyse and visualize data. This article reports current status and recent database developments, including the release of GEO2R, an R-based web application that helps users analyse GEO data.

6,683 citations