Published online 19 October 2016 Nucleic Acids Research, 2017, Vol. 45, Database issue D833–D839
doi: 10.1093/nar/gkw943
DisGeNET: a comprehensive platform integrating
information on human disease-associated genes and
variants
Janet Piñero, Àlex Bravo, Núria Queralt-Rosinach, Alba Gutiérrez-Sacristán, Jordi Deu-Pons, Emilio Centeno, Javier García-García, Ferran Sanz and Laura I. Furlong*
Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM),
Department of Experimental and Health Sciences (DCEXS), Universitat Pompeu Fabra (UPF), C/Dr Aiguader 88,
E-08003 Barcelona, Spain
Received August 11, 2016; Revised September 29, 2016; Editorial Decision October 07, 2016; Accepted October 18, 2016
ABSTRACT
The information about the genetic basis of human diseases lies at the heart of precision medicine and drug discovery. However, to realize its full potential to support these goals, several problems, such as fragmentation, heterogeneity, availability and different conceptualization of the data, must be overcome. To provide the community with a resource free of these hurdles, we have developed DisGeNET (http://www.disgenet.org), one of the largest available collections of genes and variants involved in human diseases. DisGeNET integrates data from expert curated repositories, GWAS catalogues, animal models and the scientific literature. DisGeNET data are homogeneously annotated with controlled vocabularies and community-driven ontologies. Additionally, several original metrics are provided to assist the prioritization of genotype–phenotype relationships. The information is accessible through a web interface, a Cytoscape App, an RDF SPARQL endpoint, scripts in several programming languages and an R package. DisGeNET is a versatile platform that can be used for different research purposes, including the investigation of the molecular underpinnings of specific human diseases and their comorbidities, the analysis of the properties of disease genes, the generation of hypotheses on drug therapeutic action and drug adverse effects, the validation of computationally predicted disease genes and the evaluation of the performance of text-mining methods.
INTRODUCTION
Research on the genetic causes of disease has accelerated as a result of both the completion of the human genome and the development of Next Generation Sequencing techniques, which have opened the promise of translating the alterations in individuals’ genomes into clinically relevant information to assist disease diagnostics and therapeutic decision-making. These efforts have generated a large volume of potentially useful information that has boosted biomedical research. Navigating and analyzing this information, however, is still cumbersome and time-consuming for researchers, due to four main hurdles. First, relevant information about the genetic causes of disease is scattered across specialized catalogues focused on specific disease classifications (e.g. Mendelian or rare diseases), on different model organisms, or on particular technological approaches (such as GWAS). Second, the fragmented nature of the processes generating this information has resulted in heterogeneous data, not always annotated with controlled vocabularies and ontologies, and provided in different formats that are often non-trivial to reconcile. Third, identifying and prioritizing the relevant information within the vast quantity of data these resources harbor is often a challenging task for the end user. Finally, access to most resources is usually limited to web interfaces and data downloads, hampering the usability of the information they contain. To further complicate this situation, a significant portion of the research in this field is only available as free text in scientific publications. These data are therefore not amenable to computational analysis, and their inspection by researchers and clinicians is slow and tedious.
Repositories integrating and homogeneously annotating our current knowledge of the genetic causes of diseases are therefore essential to expedite translational research. To address this need we have developed DisGeNET (1,2), a discovery platform that contains a comprehensive catalogue of genes and variants associated with human diseases. DisGeNET covers the whole landscape of human diseases, including Mendelian, complex, environmental and rare diseases, and disease-related traits. DisGeNET collects data on genotype–phenotype relationships from several of the most popular resources in this area. These data are complemented with information extracted from the scientific literature using NLP-based text-mining tools. A variety of annotations and several metrics are offered to support and prioritize the information and to facilitate its retrieval and interpretation. Furthermore, the platform offers several tools to interact with the data, including a web interface, a Cytoscape App, an R package and a SPARQL endpoint. The data are available for download in several formats. In this way, DisGeNET offers one of the most complete repositories on the genetic causes of human diseases to a wide range of users and purposes.
*To whom correspondence should be addressed. Tel: +34 93 316 0521; Fax: +34 93 316 0550; Email: laura.furlong@upf.edu
Present address: Núria Queralt-Rosinach, Department of Molecular and Experimental Medicine, The Scripps Research Institute, La Jolla, CA 92037, USA.
© The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
Downloaded from https://academic.oup.com/nar/article/45/D1/D833/2290909 by U.S. Department of Justice user on 16 August 2022
A COMPREHENSIVE CENTRALIZED REPOSITORY
OF DISEASE-ASSOCIATED GENES AND VARIANTS
In order to present the most complete landscape of our knowledge of the genetic underpinnings of human diseases, DisGeNET integrates data from expert curated databases with information gathered by text-mining the scientific literature. DisGeNET release 4.0 includes the following resources (for more details about the data processing and versions, please check the Supplementary Data and http://disgenet.org/web/DisGeNET/menu/dbinfo):
- The Comparative Toxicogenomics Database (CTD), which provides information about chemicals and genes, and their effects on human health and disease (3)
- UniProt, a repository centred on protein sequence and function, which also includes disease annotations from OMIM and from expert curation of the scientific literature (4)
- ClinVar*, a public archive of relationships between human variants and phenotypes, including diseases (5)
- Orphanet*, a portal on rare diseases and their genes (6)
- The GWAS Catalog*, a curated collection of published GWAS results (7)
- The Rat Genome Database (RGD), a repository for the genetic, genomic and phenotypic data of the laboratory rat (8)
- The Mouse Genome Database (MGD), the main community resource for the laboratory mouse, which includes information about mouse models of disease (9)
- The Genetic Association Database (GAD), an archive of genetic association studies (10)
Additionally, DisGeNET includes two data sets obtained using different text-mining approaches:
- The Literature Human Gene Derived Network (LHGDN), a data set of gene-disease associations obtained by text-mining the Entrez Gene Reference Into Function (GeneRIFs) using a Conditional Random Field approach (11)
- BeFree data, obtained using the BeFree System, which extracts gene-disease associations from MEDLINE abstracts (12,13)
* New sources with respect to the last published release
The data in DisGeNET are aggregated according to their original source into three sets: CURATED, which includes data from UniProt, ClinVar, Orphanet, the GWAS Catalog and CTD (human data); PREDICTED, which includes RGD, MGD and CTD (mouse and rat data); and ALL. For further details about the data processing, see the Supplementary Material.
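The source-based aggregation described above can be sketched as a simple lookup. This is only an illustration: the string labels used for the sources (for example "CTD_human") and the function name are hypothetical and are not taken from the DisGeNET distribution.

```python
# Membership of the aggregate data sets, following the description in the text.
# The labels below are hypothetical names chosen for this sketch.
CURATED = frozenset({"UNIPROT", "CLINVAR", "ORPHANET", "GWASCAT", "CTD_human"})
PREDICTED = frozenset({"RGD", "MGD", "CTD_mouse", "CTD_rat"})


def aggregates_for(source):
    """Return the set of aggregate data sets that a source contributes to."""
    groups = {"ALL"}  # every source is part of ALL
    if source in CURATED:
        groups.add("CURATED")
    if source in PREDICTED:
        groups.add("PREDICTED")
    return groups
```

Note that CTD contributes to two aggregates depending on the species of origin, which is why it appears under separate labels in this sketch.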
DisGeNET 4.0 (June 2016) contains 429 036 gene-disease associations (GDAs), linking 17 381 genes to 15 093 diseases, disorders and clinical or abnormal human phenotypes. Ninety-nine percent of the GDAs in DisGeNET are supported by at least one publication, and the information contained in DisGeNET corresponds to more than 289 000 publications. DisGeNET contains 72 870 variant-disease associations (VDAs) between 46 589 SNPs and 6356 diseases and phenotypes. The GDAs provided by the expert curated resources (DisGeNET curated) represent only 2% of all the information (Supplementary Figure S1, panel A), highlighting the need for text-mining algorithms to unlock the information available as free text in scientific publications. The data obtained from each resource are largely unique, as evidenced by the small overlap between sources (Supplementary Figure S1, panel B). This lack of redundancy across resources emphasizes the importance of integration, and is a consequence of the fragmented nature of data production, the different focus of individual resources and the different criteria for curation.
HOMOGENEOUS ANNOTATION OF THE DATA FOSTERS INTEROPERABILITY AND EASES INTERPRETATION
The data retrieved from the different resources are harmonized and standardized using community-driven controlled vocabularies and ontologies. Furthermore, genes, diseases, variants, GDAs and VDAs are enriched with additional information that expedites data interpretation and analysis, both manual and automatic.
Unified Medical Language System (UMLS) (14) Metathesaurus Concept Unique Identifiers are employed to homogeneously annotate diseases obtained from different sources. To allow interoperability, and to bridge the research and clinical settings, DisGeNET provides a wide variety of disease vocabularies. These include MeSH, OMIM, Disease Ontology (DO) (15) and ICD9. Disease attributes include the MeSH disease class, the UMLS semantic type, the Human Phenotype Ontology (16) and the DO upper level classes. In version 4.0, we distinguish abnormal phenotypes, traits, signs and symptoms from actual diseases using the UMLS semantic types followed by manual curation. In addition, disease entries referring to very general classes such as ‘Cardiovascular Diseases’, ‘Autoimmune Diseases’, ‘Neurodegenerative Diseases’ and ‘Lung Neoplasms’ are indicated as ‘disease group’.
Gene HGNC symbols and UniProt identifiers are mapped to Entrez gene identifiers. The genes are described with their full name, the UniProt accession, the Reactome (17) upper level pathway and the Panther (18) protein class.
DisGeNET 4.0 places a stronger emphasis on the relationship between variants and diseases. Variant-disease information in DisGeNET originates from ClinVar, the GWAS Catalog, UniProt, GAD and BeFree data. Variants are identified using NCBI Short Genetic Variations database (dbSNP) identifiers, and annotated with the chromosomal position, the reference and alternative alleles and the variant class, obtained from the dbSNP database (19). Additionally, variant allele frequencies are provided as computed by the Exome Aggregation Consortium, which aggregates and harmonizes exome sequencing data from a variety of large-scale sequencing projects (20), and by the 1000 Genomes Project, a public catalogue of human variation and genotype data (21). Finally, the most severe consequence type of each variant, obtained using the Ensembl Variant Effect Predictor (22), is displayed.
For both GDAs and VDAs, the publication(s) supporting the association, a representative sentence from each publication and the original source are available. In DisGeNET 4.0 the full collection of publications supporting each association has been made available (in previous releases only 10 representative papers were shown). Additionally, the associations can now be sorted or filtered by publication year. The new release also benefits from an improved disambiguation between genes and diseases implemented in the BeFree text-mining system. Furthermore, false positive GDAs have been semi-automatically removed from all text-mining data sets (BeFree, GAD and LHGDN). Lastly, each GDA is characterized with the DisGeNET gene-disease association type ontology, which has been updated to include new association types (highlighted in purple in Supplementary Figure S2).
PRIORITIZATION FEATURES TO GUIDE THE EXPLORATION OF THE GENETIC BASIS OF DISEASE
To help navigate the more than 400 000 GDAs in DisGeNET, these are rated with a confidence score (the DisGeNET score) that reflects the recurrence of a GDA across all data sources. The DisGeNET score takes into account the number of sources supporting the association and the reliability of each of them (for further details see http://disgenet.org/web/DisGeNET/menu/dbinfo#score). The score is updated in each release to include the new sources incorporated in the database.
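The general idea of a score that grows with the number of supporting sources, weighted by their reliability, can be sketched as follows. This is not the DisGeNET formula, which is documented at the URL above: the source categories, the weights and the function name are all invented for illustration.

```python
# Hypothetical reliability weights per source category.
# These are NOT the published DisGeNET values; they only illustrate the idea.
TOY_WEIGHTS = {"curated": 0.6, "animal_model": 0.2, "literature": 0.2}


def toy_confidence_score(source_categories):
    """Sum the weight of each distinct supporting category, capped at 1.0."""
    distinct = set(source_categories)  # repeated evidence of one kind counts once
    return min(1.0, sum(TOY_WEIGHTS[c] for c in distinct))
```

Under this toy scheme an association reported only in the literature receives a low score, while one also backed by a curated source ranks higher, which matches the qualitative behaviour described in the text.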
In order to facilitate ranking the genes associated with a disease, in this release we introduced two new metrics. The Disease Specificity Index (DSI) ranges from 0 to 1, and is inversely proportional to the number of diseases associated with a particular gene. A gene associated with a large number of diseases (e.g. TNF, associated with more than 1500 diseases) will have a DSI close to zero, while a gene associated with only one disease is more ‘specific’ for that disease and has a DSI of 1. The Disease Pleiotropy Index (DPI) ranges from 0 to 1 and is proportional to the number of different (MeSH) disease classes a gene is associated with. Thus, a gene associated with diseases of diverse classes (such as APOE, associated with Cardiovascular Diseases, Mental Disorders, Neoplasms, Respiratory Tract Diseases, etc.) will have a DPI close to 1. Conversely, PSCA, associated with 58 diseases, most of which are neoplasms, has a relatively low DPI.
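The text gives only the range and direction of the two indices. One formulation consistent with that description, assumed here for illustration and not stated in this article, is DSI = log2(Nd/NT) / log2(1/NT), where Nd is the number of diseases associated with the gene and NT the total number of diseases, and DPI = Ndc/NTC, where Ndc is the number of disease classes associated with the gene and NTC the total number of classes. The DisGeNET documentation should be consulted for the definitions actually used.

```python
import math


def dsi(n_diseases, n_total_diseases):
    """Assumed DSI: 1 for a gene with a single disease, 0 for all diseases."""
    if not 1 <= n_diseases <= n_total_diseases:
        raise ValueError("need 1 <= n_diseases <= n_total_diseases")
    if n_total_diseases == 1:
        return 1.0  # avoid dividing by log2(1) == 0
    return math.log2(n_diseases / n_total_diseases) / math.log2(1 / n_total_diseases)


def dpi(n_classes, n_total_classes):
    """Assumed DPI: fraction of all disease classes that the gene touches."""
    if not 0 <= n_classes <= n_total_classes or n_total_classes == 0:
        raise ValueError("need 0 <= n_classes <= n_total_classes, total > 0")
    return n_classes / n_total_classes
```

With the 15 093 diseases reported for release 4.0 as NT, a gene linked to many diseases gets a lower DSI than a gene linked to few, in line with the TNF and single-disease examples in the text.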
FLEXIBLE DATA ACCESS TO ANSWER DIFFERENT
RESEARCH QUESTIONS
The DisGeNET platform includes several ways of accessing the data, which makes it a versatile resource to answer questions posed by different types of users, and to achieve diverse goals (Figure 1). The web interface is designed for user-friendly exploration of small portions of the data by users interested in a particular disease, gene or variant. Customized scripts allow querying the database for large lists of genes, diseases or variants, and including DisGeNET data in computational workflows. The DisGeNET Cytoscape App is especially suited to carrying out network medicine analyses and visualizing their results. Accessing the data using Semantic Web technologies makes it possible to combine DisGeNET data with other types of biological information available in the Linked Open Data (LOD) cloud (http://lod-cloud.net/). Finally, the disgenet2r R package facilitates exploring, analysing and visualizing the data using the powerful graphical and statistical capabilities of the R environment.
The web interface
The DisGeNET web interface allows searching by single gene, disease or variant, using different types of identifiers. Searching by short lists of genes, diseases and variants is also possible. The user can also browse the data using the Browser entry point. To ease the interpretation of the information, the different views of the web interface show the biological entities annotated with the aforementioned attributes. The interface also enables sorting and filtering the data using the DisGeNET score, the DPI, the DSI, the publication year and the DisGeNET gene-disease association type. Links to other reference resources, such as NCBI Gene, UniProt, UMLS and dbSNP, are also provided.
Programmatic access
As a result of exploring DisGeNET data using the web interface, scripts in R, Perl, Python and Bash are automatically generated. These scripts can be downloaded and customized to generate the same or similar queries, making it possible to reproduce the results of the analysis performed through the web interface and to incorporate DisGeNET data into automatic computational workflows. Additionally, in the downloads section of the website, several example scripts with use cases are provided for querying the DisGeNET database.
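As a rough idea of what incorporating the data into a workflow can look like, the sketch below parses a tab-separated table of GDAs and applies the kind of score and year filtering that the web interface offers. It is not one of the scripts generated by DisGeNET: the column names, the inline sample rows and the function names are hypothetical placeholders.

```python
import csv
import io

# Tiny inline stand-in for a downloaded tab-separated file of GDAs.
# Column names and all values are hypothetical placeholders.
SAMPLE_TSV = (
    "gene\tdisease\tscore\tyear\n"
    "GENE_A\tDISEASE_1\t0.70\t2014\n"
    "GENE_B\tDISEASE_1\t0.20\t2009\n"
    "GENE_C\tDISEASE_2\t0.45\t2016\n"
)


def load_gdas(handle):
    """Read GDA rows from a tab-separated file-like object."""
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        row["score"] = float(row["score"])
        row["year"] = int(row["year"])
        rows.append(row)
    return rows


def filter_and_rank(gdas, min_score=0.0, min_year=None):
    """Keep rows passing both thresholds, highest score first."""
    kept = [
        g for g in gdas
        if g["score"] >= min_score and (min_year is None or g["year"] >= min_year)
    ]
    return sorted(kept, key=lambda g: g["score"], reverse=True)


gdas = load_gdas(io.StringIO(SAMPLE_TSV))
top = filter_and_rank(gdas, min_score=0.4)
```

Replacing the `io.StringIO` object with an open file handle would apply the same steps to a real download, provided the column names are adjusted to those of the file.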
The DisGeNET Cytoscape App
The DisGeNET Cytoscape plugin (1) has been updated to work with Cytoscape versions 3.0 or higher. The DisGeNET Cytoscape App allows visualization of the GDAs as bipartite networks, and also exploration of the data in a disease- or gene-centric way. A variety of features can be used to construct the networks, such as filtering by disease class, the DisGeNET gene-disease association type, data source and score range. The networks can also be built around a particular gene or disease. The DisGeNET Cytoscape App, and a detailed tutorial on how to use it, are available for download at http://disgenet.org/web/DisGeNET/menu/app.
Figure 1. The DisGeNET platform provides several ways to access the data.
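The bipartite representation of GDAs, and a disease-centric view derived from it, can be sketched without Cytoscape using plain dictionaries. This is an illustration of the data structure only, not of how the App is implemented; the gene and disease labels are placeholders and the function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Placeholder gene-disease associations (no real biology implied).
GDAS = [
    ("GENE_A", "DISEASE_1"),
    ("GENE_A", "DISEASE_2"),
    ("GENE_B", "DISEASE_2"),
    ("GENE_B", "DISEASE_3"),
]


def bipartite_adjacency(gdas):
    """Return (genes per disease, diseases per gene) adjacency maps."""
    genes_of = defaultdict(set)
    diseases_of = defaultdict(set)
    for gene, disease in gdas:
        genes_of[disease].add(gene)
        diseases_of[gene].add(disease)
    return genes_of, diseases_of


def disease_projection(diseases_of):
    """Link two diseases whenever they share a gene; store the shared genes."""
    shared = defaultdict(set)
    for gene, diseases in diseases_of.items():
        for d1, d2 in combinations(sorted(diseases), 2):
            shared[(d1, d2)].add(gene)
    return dict(shared)


genes_of, diseases_of = bipartite_adjacency(GDAS)
projection = disease_projection(diseases_of)
```

Swapping the roles of genes and diseases in `disease_projection` gives the corresponding gene-centric network.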
DisGeNET-RDF
DisGeNET data are also published as machine-readable data through the DisGeNET-RDF (23) and nanopublications (24) linked data sets, which increases the FAIRness of the data (25). Entities and properties in DisGeNET-RDF are semantically defined, making extensive reuse of standard identifiers, vocabularies and ontologies such as the National Cancer Institute thesaurus for medical vocabulary and the Semanticscience Integrated Ontology (26) for general science. DisGeNET-RDF is interlinked with other biomedical databases available in the LOD cloud, which enables complex queries that require interrogating several resources. The DisGeNET SPARQL endpoint allows exploration of the DisGeNET-RDF data set and query federation to expand DisGeNET gene-disease association information with data on gene expression, drug activity and biological pathways, among others. Representative queries linking the data to other resources such as Wikipathways (27), ChEMBL (28) and the Gene Expression Atlas (29) are available at the RDF page of the website (http://www.disgenet.org/web/DisGeNET/menu/rdf#sparql-queries).
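A SPARQL endpoint is typically queried over HTTP by passing the query text in a `query` parameter, as specified by the SPARQL 1.1 Protocol. The sketch below only builds such a request URL and does not contact any server. The endpoint address is a placeholder, and the query is deliberately schema-agnostic because the DisGeNET-RDF class and property names are not given in this article.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Placeholder endpoint; substitute the address given on the DisGeNET RDF page.
ENDPOINT = "https://example.org/sparql"

# Schema-agnostic query: lists a few triples without assuming any
# DisGeNET-RDF class or property names.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"


def sparql_get_url(endpoint, query):
    """Build a 'query via GET' URL with the query text percent-encoded."""
    return endpoint + "?" + urlencode({"query": query})


def decode_query(url):
    """Recover the SPARQL text from a URL built by sparql_get_url."""
    return parse_qs(urlsplit(url).query)["query"][0]


url = sparql_get_url(ENDPOINT, QUERY)
```

The resulting URL can be fetched with any HTTP client. Federated queries of the kind mentioned above are written with the `SERVICE` keyword of SPARQL 1.1 inside the query text, so the same URL-building step applies to them.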
The disgenet2r R package
DisGeNET data can also be accessed via an R package, disgenet2r. The package contains a series of functions to retrieve gene-disease and variant-disease data, and to perform mappings between several biomedical vocabularies. The results of the queries can be visualized using a variety of plots, such as heatmaps, barplots and several types of networks. The package is especially well suited to exploring the genetic basis of diseases and disease comorbidities. Furthermore, the disgenet2r package makes it possible to benefit from Semantic Web technologies without special expertise in this area, through a set of functions that use DisGeNET-RDF and other resources available in the LOD cloud. These functions include retrieving the druggable targets for a disease of interest, or obtaining the biological pathways for a list of disease genes. The disgenet2r package also expedites the integration of DisGeNET data with other R packages. The source code and documentation of the disgenet2r package are available at https://bitbucket.org/ibi_group/disgenet2r.
CONCLUSIONS AND FUTURE PERSPECTIVES
New trends are starting to emerge in disease therapeutics. Some examples include immunotherapies, gene therapy, the use of siRNA and anti-sense oligonucleotides, and the more recent possibility of genome editing with CRISPR-Cas9 systems. The structural constraints imposed by ‘classical’ drug development will no longer be a limitation, marking the end of the so-called ‘druggable genome’ (30). With this, a new type of therapy based on disease mechanisms is starting to emerge (31). Precise knowledge of the molecular processes underlying disease pathophysiology will become the new limiting step in drug development. In order to elucidate these mechanisms, integrative approaches that aggregate all the available information on the genetic basis of diseases are an essential step. Currently, some of the more popular resources represent only a fraction of the available knowledge. For example, OMIM (32) covers only Mendelian diseases, Orphanet (6) is a compilation of rare diseases, while the GWAS Catalog (7) is a repository for GWAS data, involving mainly complex diseases and traits (for a detailed list of available resources, see Supplementary Table S1). Conversely, DisGeNET aims at integrating information on the genetic underpinnings of all disease therapeutic areas and, in this endeavor, at being a reference repository for closing the genotype–phenotype gap.
The DisGeNET platform has been used to study a variety of biomedical problems, which include investigating the molecular basis of specific diseases (33–36), annotating lists of genes produced by different types of omics and sequencing protocols (37–39), validating disease gene prediction methods (40–42), understanding disease mechanisms in the context of protein networks (43,44), gaining insight into drug action (45) and the mechanisms of adverse drug reactions (46), drug repurposing (47), exploring the molecular basis of disease comorbidities (48,49) and assessing the performance of text-mining algorithms (50). It has also been used as part of other resources (51–53).
DisGeNET is a well-established resource with four stable releases (as of October 2016). It is regularly growing, fuelled and kept up to date by new research, by the incorporation of new data sources, and by the interest of a growing community of users. The careful use of standards and state-of-the-art biomedical ontologies, the attention paid to keeping track of the provenance of the information, together with the extensive documentation of the data processing and the multiple access points, make DisGeNET a platform of choice to support translational research.
SUPPLEMENTARY DATA
Supplementary Data are available at NAR Online.
ACKNOWLEDGEMENTS
The authors would like to thank the Exome Aggregation Consortium and the groups that provided exome variant data for comparison. A full list of contributing groups can be found at http://exac.broadinstitute.org/about.
FUNDING
Instituto de Salud Carlos III-Fondo Europeo de Desarrollo Regional [CP10/00524 and PI13/00082]; Innovative Medicines Initiative Joint Undertaking [Open PHACTs No. 115191], resources of which are composed of financial contribution from the European Union’s Seventh Framework Programme [FP7/2007-2013] and EFPIA companies’ in kind contribution; European Union Horizon 2020 Programme 2014-2020 [MedBioinformatics No. 634143 and Elixir-Excelerate No. 676559]. The Research Programme on Biomedical Informatics (GRIB) is a member of the Spanish National Bioinformatics Institute (INB), PRB2-ISCIII and is supported by grant PT13/0001/0023, of the PE I+D+i 2013-2016, funded by ISCIII and FEDER. Funding for open access charge: Instituto de Salud Carlos III-Fondo Europeo de Desarrollo Regional [PI13/00082].
Conflict of interest statement. None declared.
REFERENCES
1. Bauer-Mehren,A., Rautschka,M., Sanz,F. and Furlong,L.I. (2010) DisGeNET: a Cytoscape plugin to visualize, integrate, search and analyze gene-disease networks. Bioinformatics, 26, 2924–2926.
2. Piñero,J., Queralt-Rosinach,N., Bravo,A., Deu-Pons,J., Bauer-Mehren,A., Baron,M., Sanz,F. and Furlong,L.I. (2015) DisGeNET: a discovery platform for the dynamical exploration of human diseases and their genes. Database, bav028.
3. Davis,A.P., Grondin,C.J., Lennon-Hopkins,K., Saraceni-Richards,C., Sciaky,D., King,B.L., Wiegers,T.C. and Mattingly,C.J. (2015) The Comparative Toxicogenomics Database’s 10th year anniversary: update 2015. Nucleic Acids Res., 43, D914–D920.
4. The UniProt Consortium (2014) UniProt: a hub for protein information. Nucleic Acids Res., 43, D204–D212.
5. Landrum,M.J., Lee,J.M., Benson,M., Brown,G., Chao,C., Chitipiralla,S., Gu,B., Hart,J., Hoffman,D., Hoover,J. et al. (2016) ClinVar: public archive of interpretations of clinically relevant variants. Nucleic Acids Res., 44, D862–D868.