Author

Kristina Hettne

Bio: Kristina Hettne is an academic researcher from Leiden University Medical Center. The author has contributed to research on topics including Workflow and Metadata. The author has an h-index of 21 and has co-authored 64 publications receiving 1306 citations. Previous affiliations of Kristina Hettne include Leiden University and Maastricht University.


Papers
Journal Article
TL;DR: A dictionary for the identification of small molecules and drugs in text, combining information from UMLS, MeSH, ChEBI, DrugBank, KEGG, HMDB and ChemIDplus is developed.
Abstract: Motivation: The scientific community has spent a lot of effort on the correct identification of gene and protein names in text, while less effort has been spent on the correct identification of chemical names. Dictionary-based term identification has the power to recognize the diverse representation of chemical information in the literature and map the chemicals to their database identifiers. Results: We developed a dictionary for the identification of small molecules and drugs in text, combining information from UMLS, MeSH, ChEBI, DrugBank, KEGG, HMDB and ChemIDplus. Rule-based term filtering, manual check of highly frequent terms and disambiguation rules were applied. We tested the combined dictionary and the dictionaries derived from the individual resources on an annotated corpus, and conclude the following: (i) each of the different processing steps increases precision with a minor loss of recall; (ii) the overall performance of the combined dictionary is acceptable (precision 0.67, recall 0.40 (0.80 for trivial names)); (iii) the combined dictionary performed better than the dictionary in the chemical recognizer OSCAR3; (iv) the performance of a dictionary based on ChemIDplus alone is comparable to the performance of the combined dictionary. Availability: The combined dictionary is freely available as an XML file in Simple Knowledge Organization System format on the web site http://www.biosemantics.org/chemlist. Contact: k.hettne@erasmusmc.nl. Supplementary information: Supplementary data are available at Bioinformatics online.

151 citations

Journal Article
31 Jan 2020
TL;DR: The concept of FAIR implementation considerations is introduced to assist accelerated global participation and convergence towards accessible, robust, widespread and consistent FAIR implementations.
Abstract: The FAIR principles have been widely cited, endorsed and adopted by a broad range of stakeholders since their publication in 2016. By intention, the 15 FAIR guiding principles do not dictate specific technological implementations, but provide guidance for improving Findability, Accessibility, Interoperability and Reusability of digital resources. This has likely contributed to the broad adoption of the FAIR principles, because individual stakeholder communities can implement their own FAIR solutions. However, it has also resulted in inconsistent interpretations that carry the risk of leading to incompatible implementations. Thus, while the FAIR principles are formulated on a high level and may be interpreted and implemented in different ways, for true interoperability we need to support convergence in implementation choices that are widely accessible and (re)-usable. We introduce the concept of FAIR implementation considerations to assist accelerated global participation and convergence towards accessible, robust, widespread and consistent FAIR implementations. Any self-identified stakeholder community may either choose to reuse solutions from existing implementations, or when they spot a gap, accept the challenge to create the needed solution, which, ideally, can be used again by other communities in the future. Here, we provide interpretations and implementation considerations (choices and challenges) for each FAIR principle.

147 citations

Journal Article
TL;DR: A novel approach to the preservation of scientific workflows through the application of research objects, i.e. aggregations of data and metadata that enrich the workflow specifications, in support of creating workflow-centric research objects.

133 citations

Proceedings Article
08 Oct 2012
TL;DR: The authors recommend a minimal set of auxiliary resources to be preserved together with the workflows as an aggregation object and provide a software tool for end-users to create such aggregations and to assess their completeness.
Abstract: Workflows provide a popular means for preserving scientific methods by explicitly encoding their process. However, some of them are subject to a decay in their ability to be re-executed or reproduce the same results over time, largely due to the volatility of the resources required for workflow executions. This paper provides an analysis of the root causes of workflow decay based on an empirical study of a collection of Taverna workflows from the myExperiment repository. Although our analysis was based on a specific type of workflow, the outcomes and methodology should be applicable to workflows from other systems, at least those whose executions also rely largely on accessing third-party resources. Based on our understanding about decay we recommend a minimal set of auxiliary resources to be preserved together with the workflows as an aggregation object and provide a software tool for end-users to create such aggregations and to assess their completeness.

123 citations

Journal Article
TL;DR: The observed transcriptional changes in SCA3 mouse brain reveal parallels with previously reported neuropathology in patients, but also show brain-region-specific effects as well as involvement of adrenergic signalling and CREB pathway changes in SCA3.
Abstract: Spinocerebellar ataxia type 3 (SCA3) is a progressive neurodegenerative disorder caused by expansion of the polyglutamine repeat in the ataxin-3 protein. Expression of mutant ataxin-3 is known to result in transcriptional dysregulation, which can contribute to the cellular toxicity and neurodegeneration. Since the exact causative mechanisms underlying this process have not been fully elucidated, gene expression analyses in brains of transgenic SCA3 mouse models may provide useful insights. Here we characterised the MJD84.2 SCA3 mouse model expressing the mutant human ataxin-3 gene using a multi-omics approach on brain and blood. Gene expression changes in brainstem, cerebellum, striatum and cortex were used to study pathological changes in brain, while blood gene expression and metabolite/lipid levels were examined as potential biomarkers for disease. Despite normal motor performance at 17.5 months of age, transcriptional changes in brain tissue of the SCA3 mice were observed. Most transcriptional changes occurred in brainstem and striatum, whilst cerebellum and cortex were only modestly affected. The most significantly altered genes in SCA3 mouse brain were Tmc3, Zfp488, Car2, and Chdh. Based on the transcriptional changes, α-adrenergic and CREB pathways were most consistently altered for combined analysis of the four brain regions. When examining individual brain regions, axon guidance and synaptic transmission pathways were most strongly altered in striatum, whilst brainstem presented with strongest alterations in the PI-3K cascade and cholesterol biosynthesis pathways. Similar to other neurodegenerative diseases, reduced levels of tryptophan and increased levels of ceramides, di- and triglycerides were observed in SCA3 mouse blood.
The observed transcriptional changes in SCA3 mouse brain reveal parallels with previously reported neuropathology in patients, but also show brain-region-specific effects as well as involvement of adrenergic signalling and CREB pathway changes in SCA3. Importantly, the transcriptional changes occur prior to onset of motor and coordination deficits.

57 citations


Cited by
01 Feb 2015
TL;DR: In this article, the authors describe the integrative analysis of 111 reference human epigenomes generated as part of the NIH Roadmap Epigenomics Consortium, profiled for histone modification patterns, DNA accessibility, DNA methylation and RNA expression.
Abstract: The reference human genome sequence set the stage for studies of genetic variation and its association with human disease, but epigenomic studies lack a similar reference. To address this need, the NIH Roadmap Epigenomics Consortium generated the largest collection so far of human epigenomes for primary cells and tissues. Here we describe the integrative analysis of 111 reference human epigenomes generated as part of the programme, profiled for histone modification patterns, DNA accessibility, DNA methylation and RNA expression. We establish global maps of regulatory elements, define regulatory modules of coordinated activity, and their likely activators and repressors. We show that disease- and trait-associated genetic variants are enriched in tissue-specific epigenomic marks, revealing biologically relevant cell types for diverse human traits, and providing a resource for interpreting the molecular basis of human disease. Our results demonstrate the central role of epigenomic information for understanding gene regulation, cellular differentiation and human disease.

4,409 citations

Journal Article
TL;DR: ChemSpider is a free, online chemical database offering access to physical and chemical properties, molecular structure, spectral data, synthetic methods, safety information, and nomenclature for almost 25 million unique chemical compounds sourced and linked to almost 400 separate data sources on the Web.
Abstract: ChemSpider is a free, online chemical database offering access to physical and chemical properties, molecular structure, spectral data, synthetic methods, safety information, and nomenclature for almost 25 million unique chemical compounds sourced and linked to almost 400 separate data sources on the Web. ChemSpider is quickly becoming the primary chemistry Internet portal and it can be very useful for both chemical teaching and research.

859 citations

Journal Article
TL;DR: An update to the Taverna tool suite is provided, highlighting new features and developments in the workbench and the Taverna Server.
Abstract: The Taverna workflow tool suite (http://www.taverna.org.uk) is designed to combine distributed Web Services and/or local tools into complex analysis pipelines. These pipelines can be executed on local desktop machines or through larger infrastructure (such as supercomputers, Grids or cloud environments), using the Taverna Server. In bioinformatics, Taverna workflows are typically used in the areas of high-throughput omics analyses (for example, proteomics or transcriptomics), or for evidence gathering methods involving text mining or data mining. Through Taverna, scientists have access to several thousand different tools and resources that are freely available from a large range of life science institutions. Once constructed, the workflows are reusable, executable bioinformatics protocols that can be shared, reused and repurposed. A repository of public workflows is available at http://www.myexperiment.org. This article provides an update to the Taverna tool suite, highlighting new features and developments in the workbench and the Taverna Server.

724 citations

Journal Article
TL;DR: The number of annotated metabolite nodes in WikiPathways has doubled, and the authors introduce OpenAPI documentation of their web services and FAIR annotation of resources to increase the interoperability of the knowledge encoded in these pathways and experimental omics data.
Abstract: WikiPathways (wikipathways.org) captures the collective knowledge represented in biological pathways. By providing a database in a curated, machine-readable way, omics data analysis and visualization are enabled. WikiPathways and other pathway databases are used to analyze experimental data by research groups in many fields. Due to the open and collaborative nature of the WikiPathways platform, our content keeps growing and is getting more accurate, making WikiPathways a reliable and rich pathway database. Previously, however, the focus was primarily on genes and proteins, leaving many metabolites with only limited annotation. Recent curation efforts focused on improving the annotation of metabolism and metabolic pathways by associating unmapped metabolites with database identifiers and providing more detailed interaction knowledge. Here, we report the outcomes of the continued growth and curation efforts, such as a doubling of the number of annotated metabolite nodes in WikiPathways. Furthermore, we introduce OpenAPI documentation of our web services and the FAIR (Findable, Accessible, Interoperable and Reusable) annotation of resources to increase the interoperability of the knowledge encoded in these pathways and experimental omics data. New search options, monthly downloads, more links to metabolite databases, and new portals make pathway knowledge more effortlessly accessible to individual researchers and research communities.

675 citations