Institution

Pompeu Fabra University

Education, Barcelona, Spain
About: Pompeu Fabra University is an education organization based in Barcelona, Spain. It is known for research contributions in the topics Population and Gene. The organization has 8093 authors who have published 23570 publications receiving 858431 citations. The organization is also known as Universitat Pompeu Fabra and UPF.


Papers
Journal Article
TL;DR: An exhaustive review of research on automatic classification of sounds from musical instruments presents and discusses different techniques for similarity-based clustering of sounds and for classification into pre-defined instrumental categories.
Abstract: We present an exhaustive review of research on automatic classification of sounds from musical instruments. Two different but complementary approaches are examined, the perceptual approach and the taxonomic approach. The former is targeted to derive perceptual similarity functions in order to use them for timbre clustering and for searching and retrieving sounds by timbral similarity. The latter is targeted to derive indexes for labeling sounds after culture- or user-biased taxonomies. We review the relevant features that have been used in the two areas and then we present and discuss different techniques for similarity-based clustering of sounds and for classification into pre-defined instrumental categories.

204 citations

Journal Article
TL;DR: Though gene prediction will improve with every new protein that is discovered and through improvements in the current set of tools, there is a long way to go before the authors can decipher the precise exonic structure of every gene in the human genome using purely computational methodology.
Abstract: One of the first useful products from the human genome will be a set of predicted genes. Besides its intrinsic scientific interest, the accuracy and completeness of this data set is of considerable importance for human health and medicine. Though progress has been made on computational gene identification in terms of both methods and accuracy evaluation measures, most of the sequence sets in which the programs are tested are short genomic sequences, and there is concern that these accuracy measures may not extrapolate well to larger, more challenging data sets. Given the absence of experimentally verified large genomic data sets, we constructed a semiartificial test set comprising a number of short single-gene genomic sequences with randomly generated intergenic regions. This test set, which should still present an easier problem than real human genomic sequence, mimics the approximately 200kb long BACs being sequenced. In our experiments with these longer genomic sequences, the accuracy of GENSCAN, one of the most accurate ab initio gene prediction programs, dropped significantly, although its sensitivity remained high. Conversely, the accuracy of similarity-based programs, such as GENEWISE, PROCRUSTES, and BLASTX was not affected significantly by the presence of random intergenic sequence, but depended on the strength of the similarity to the protein homolog. As expected, the accuracy dropped if the models were built using more distant homologs, and we were able to quantitatively estimate this decline. However, the specificities of these techniques are still rather good even when the similarity is weak, which is a desirable characteristic for driving expensive follow-up experiments. 
Our experiments suggest that though gene prediction will improve with every new protein that is discovered and through improvements in the current set of tools, we still have a long way to go before we can decipher the precise exonic structure of every gene in the human genome using purely computational methodology.

204 citations

Journal Article
TL;DR: A new computational approach for mammalian BP prediction is presented and it is suggested that BP recognition is more flexible than previously assumed, and it appears highly dependent on the presence of downstream polypyrimidine tracts.
Abstract: The branch point (BP) is one of the three obligatory signals required for pre-mRNA splicing. In mammals, the degeneracy of the motif combined with the lack of a large set of experimentally verified BPs complicates the task of modeling it in silico, and therefore of predicting the location of natural BPs. Consequently, BPs have been disregarded in a considerable fraction of the genome-wide studies on the regulation of splicing in mammals. We present a new computational approach for mammalian BP prediction. Using sequence conservation and positional bias we obtained a set of motifs with good agreement with U2 snRNA binding stability. Using a Support Vector Machine algorithm, we created a model complemented with polypyrimidine tract features, which considerably improves the prediction accuracy over previously published methods. Applying our algorithm to human introns, we show that BP position is highly dependent on the presence of AG dinucleotides in the 3′ end of introns, with distance to the 3′ splice site and BP strength strongly correlating with alternative splicing. Furthermore, experimental BP mapping for five exons preceded by long AG-dinucleotide exclusion zones revealed that, for a given intron, more than one BP can be chosen throughout the course of splicing. Finally, the comparison between exons of different evolutionary ages and pseudo exons suggests a key role of the BP in the pathway of exon creation in human. Our computational and experimental analyses suggest that BP recognition is more flexible than previously assumed, and it appears highly dependent on the presence of downstream polypyrimidine tracts. The reported association between BP features and the splicing outcome suggests that this, so far disregarded but yet crucial, element buries information that can complement current acceptor site models.

203 citations

Journal Article
TL;DR: It is proposed that triggering inappropriate liquid phase separation may be an important cause of dosage sensitivity and a determinant of human disease.

203 citations

Journal Article
TL;DR: Strong evidence is provided that at least 4%-5% of the tandem gene pairs in the human genome can be eventually transcribed into a single RNA sequence encoding a putative chimeric protein, and that this phenomenon is a common mechanism with the potential of generating hundreds of additional proteins in the human genome.
Abstract: The "one-gene, one-protein" rule, coined by Beadle and Tatum, has been fundamental to molecular biology. The rule implies that the genetic complexity of an organism depends essentially on its gene number. The discovery, however, that alternative gene splicing and transcription are widespread phenomena dramatically altered our understanding of the genetic complexity of higher eukaryotic organisms; in these, a limited number of genes may potentially encode a much larger number of proteins. Here we investigate yet another phenomenon that may contribute to generate additional protein diversity. Indeed, by relying on both computational and experimental analysis, we estimate that at least 4%-5% of the tandem gene pairs in the human genome can be eventually transcribed into a single RNA sequence encoding a putative chimeric protein. While the functional significance of most of these chimeric transcripts remains to be determined, we provide strong evidence that this phenomenon does not correspond to mere technical artifacts and that it is a common mechanism with the potential of generating hundreds of additional proteins in the human genome.

202 citations


Authors

Showing all 8248 results

Name                      H-index  Papers  Citations
Andrei Shleifer           171      514     271880
Paul Elliott              153      773     103839
Bert Brunekreef           124      806     81938
Philippe Aghion           122      507     73438
Anjana Rao                118      337     61395
Jordi Sunyer              115      798     57211
Kenneth J. Arrow          113      411     111221
Xavier Estivill           110      673     59568
Roderic Guigó             108      304     106914
Mark J. Nieuwenhuijsen    107      647     49080
Jordi Alonso              107      523     64058
Alfonso Valencia          106      542     55192
Luis Serrano              105      452     42515
Vadim N. Gladyshev        102      490     34148
Josep M. Antó             100      493     38663

Network Information
Related Institutions (5)
University College London
210.6K papers, 9.8M citations

90% related

University of Pennsylvania
257.6K papers, 14.1M citations

90% related

Columbia University
224K papers, 12.8M citations

90% related

University of Amsterdam
140.8K papers, 5.9M citations

89% related

University of Edinburgh
151.6K papers, 6.6M citations

89% related

Performance
Metrics
No. of papers from the Institution in previous years
Year  Papers
2023  49
2022  248
2021  1,903
2020  1,930
2019  1,763
2018  1,660