Institution
Turku Centre for Computer Science
Facility • Turku, Finland
About: Turku Centre for Computer Science is a facility based in Turku, Finland. It is known for research contributions on the topics Decidability and Word (group theory). The organization has 382 authors who have published 1027 publications, which have received 19560 citations.
Papers
TL;DR: This study investigates how fully automated text mining of complex biomolecular events can be augmented with a normalization strategy that identifies biological concepts in text, mapping them to identifiers at varying levels of granularity, ranging from canonicalized symbols to unique genes and proteins and broad gene families.
Abstract: Text mining for the life sciences aims to aid database curation, knowledge summarization and information retrieval through the automated processing of biomedical texts. To provide comprehensive coverage and enable full integration with existing biomolecular database records, it is crucial that text mining tools scale up to millions of articles and that their analyses can be unambiguously linked to information recorded in resources such as UniProt, KEGG, BioGRID and NCBI databases. In this study, we investigate how fully automated text mining of complex biomolecular events can be augmented with a normalization strategy that identifies biological concepts in text, mapping them to identifiers at varying levels of granularity, ranging from canonicalized symbols to unique genes and proteins and broad gene families. To this end, we have combined two state-of-the-art text mining components, previously evaluated on two community-wide challenges, and have extended and improved upon these methods by exploiting their complementary nature. Using these systems, we perform normalization and event extraction to create a large-scale resource that is publicly available, unique in semantic scope, and covers all 21.9 million PubMed abstracts and 460 thousand PubMed Central open access full-text articles. This dataset contains 40 million biomolecular events involving 76 million gene/protein mentions, linked to 122 thousand distinct genes from 5032 species across the full taxonomic tree. Detailed evaluations and analyses reveal promising results for application of this data in database and pathway curation efforts. The main software components used in this study are released under an open-source license. Further, the resulting dataset is freely accessible through a novel API, providing programmatic and customized access (http://www.evexdb.org/api/v001/). Finally, to allow for large-scale bioinformatic analyses, the entire resource is available for bulk download from http://evexdb.org/download/, under the Creative Commons – Attribution – Share Alike (CC BY-SA) license.
113 citations
TL;DR: Diverse concepts from the theory of concurrency can be introduced and studied in this framework, with example applications to the fairness property and to the parallelization of non-context-free languages in terms of context-free and even regular languages.
112 citations
TL;DR: This study demonstrates that a number of novel genes are regulated by IL-4 through Stat6-dependent and -independent pathways, and elucidation of the kinetics of gene expression at early stages of cell differentiation reveals several genes regulated rapidly during the process, suggesting their importance for the differentiation process.
Abstract: IL-4, primarily produced by T cells, mast cells, and basophils, is a cytokine which has pleiotropic effects on the immune system. IL-4 induces T cells to differentiate to Th2 cells and activated B lymphocytes to proliferate and to synthesize IgE and IgG1. IL-4 is particularly important for the development and perpetuation of asthma and allergy. Stat6 is the protein activated by signal transduction through the IL-4R, and studies with knockout mice demonstrate that Stat6 is critical for a number of IL-4-mediated functions including Th2 development and production of IgE. In the present study, novel IL-4- and Stat6-regulated genes were discovered by using Stat6(-/-) mice and Affymetrix oligonucleotide arrays. Genes regulated by IL-4 were identified by comparing the gene expression profile of the wild-type T cells induced to polarize to the Th2 direction (CD3/CD28 activation + IL-4) to the gene expression profile of the cells induced to proliferate (CD3/CD28 activation alone). Stat6-regulated genes were identified by comparing the cells isolated from the wild-type and Stat6(-/-) mice; in this experiment the cells were induced to differentiate to the Th2 direction (CD3/CD28 activation + IL-4). Our study demonstrates that a number of novel genes are regulated by IL-4 through Stat6-dependent and -independent pathways. Moreover, elucidation of the kinetics of gene expression at early stages of cell differentiation reveals several genes regulated rapidly during the process, suggesting their importance for the differentiation process.
109 citations
01 Sep 2014
TL;DR: This paper presents the final version of a publicly available treebank of Finnish, the Turku Dependency Treebank, along with the first open source Finnish dependency parser, trained on the newly introduced treebank.
Abstract: In this paper, we present the final version of a publicly available treebank of Finnish, the Turku Dependency Treebank. The treebank contains 204,399 tokens (15,126 sentences) from 10 different text sources and has been manually annotated in a Finnish-specific version of the well-known Stanford Dependency scheme. The morphological analyses of the treebank have been assigned using a novel machine learning method to disambiguate readings given by an existing tool. As the second main contribution, we present the first open source Finnish dependency parser, trained on the newly introduced treebank. The parser achieves a labeled attachment score of 81 %. The treebank data as well as the parsing pipeline are available under an open license at http://bionlp.utu.fi/ .
108 citations
TL;DR: Two maps that mimic the rough approximation operators are defined in a lattice-theoretical setting, and this setting is noted to be suitable also for other operators based on binary relations.
Abstract: We study rough approximations based on indiscernibility relations which are not necessarily reflexive, symmetric or transitive. For this, we define in a lattice-theoretical setting two maps which mimic the rough approximation operators and note that this setting is suitable also for other operators based on binary relations. Properties of the ordered sets of the upper and the lower approximations of the elements of an atomic Boolean lattice are studied.
108 citations
Authors
Name | H-index | Papers | Citations |
---|---|---|---|
José A. Teixeira | 101 | 1414 | 47329 |
Cunsheng Ding | 61 | 254 | 11116 |
Jun'ichi Tsujii | 59 | 389 | 15985 |
Arto Salomaa | 56 | 374 | 17706 |
Tero Aittokallio | 52 | 271 | 8689 |
Risto Lahdelma | 48 | 149 | 6637 |
Hannu Tenhunen | 45 | 819 | 11661 |
Mats Gyllenberg | 44 | 204 | 8029 |
Sampo Pyysalo | 42 | 153 | 8839 |
Olli Polo | 42 | 140 | 5303 |
Pasi Liljeberg | 40 | 306 | 6959 |
Tapio Salakoski | 38 | 231 | 7271 |
Filip Ginter | 37 | 156 | 7294 |
Robert Fullér | 37 | 152 | 5848 |
Juha Plosila | 35 | 342 | 4917 |