Showing papers by "Carl Kesselman" published in 2020


Journal ArticleDOI
TL;DR: A scalable footprinting workflow is developed that uses two state-of-the-art algorithms, Wellington and HINT, to detect footprints in 192 ENCODE DNase-seq experiments and predict the genomic occupancy of 1,515 human TFs in 27 human tissues.

25 citations


Journal ArticleDOI
TL;DR: The FaceBase Consortium provides dynamic, freely available resources, including comprehensive datasets on human craniofacial development and animal models, and develops innovative tools for data visualization and analysis.
Abstract: The FaceBase Consortium was established by the National Institute of Dental and Craniofacial Research in 2009 as a 'big data' resource for the craniofacial research community. Over the past decade, researchers have deposited hundreds of annotated and curated datasets on both normal and disordered craniofacial development in FaceBase, all freely available to the research community on the FaceBase Hub website. The Hub has developed numerous visualization and analysis tools designed to promote integration of multidisciplinary data while remaining dedicated to the FAIR principles of data management (findability, accessibility, interoperability and reusability) and providing a faceted search infrastructure for locating desired data efficiently. Summaries of the datasets generated by the FaceBase projects from 2014 to 2019 are provided here. FaceBase 3 now welcomes contributions of data on craniofacial and dental development in humans, model organisms and cell lines. Collectively, the FaceBase Consortium, along with other NIH-supported data resources, provides a continuously growing, dynamic and current resource for the scientific community while improving data reproducibility and fulfilling data sharing requirements.

21 citations


Proceedings ArticleDOI
07 Jul 2020
TL;DR: This paper presents an architecture for data-centric ecosystems that allows the components to seamlessly co-evolve by centralizing the models and mappings at the data service and pushing model-adaptive interactions to the database clients.
Abstract: Database evolution is a notoriously difficult task, and it is exacerbated by the necessity to evolve database-dependent applications. As science becomes increasingly dependent on sophisticated data management, the need to evolve an array of database-driven systems will only intensify. In this paper, we present an architecture for data-centric ecosystems that allows the components to seamlessly co-evolve by centralizing the models and mappings at the data service and pushing model-adaptive interactions to the database clients. Boundary objects fill the gap where applications are unable to adapt and need a stable interface to interact with the components of the ecosystem. Finally, evolution of the ecosystem is enabled via integrated schema modification and model management operations. We present use cases from actual experiences that demonstrate the utility of our approach.

9 citations


Proceedings ArticleDOI
26 Jul 2020
TL;DR: The current identifier ecosystem is summarized, best-practices recommendations for identifier use are presented, and the FAIR Research Identifiers service is described, which supports multiple identifier providers and uses Globus Auth to implement a rich user- and group-based authorization model for identifier creation.
Abstract: Persistent identifiers (PIDs) are essential for making data Findable, Accessible, Interoperable, and Reusable, or FAIR. While the advantages of PIDs for data publication and citation are well understood, and Digital Object Identifiers (DOIs) are increasingly applied to data, there are two gaps in the current identifier ecosystem: 1) services that provide a consistent baseline of capabilities encompassing key aspects of the research data lifecycle, including canonical landing pages and machine-readable metadata via the same URL; and 2) support for identifiers to be applied to ephemeral data, particularly as data move across system boundaries, such as during workflows. To address these gaps, we have implemented the FAIR Research Identifiers service. This service supports multiple identifier providers (ARK, Handle, DOIs via DataCite, etc.) and uses Globus Auth to implement a rich user- and group-based authorization model for identifier creation. This paper summarizes the current identifier ecosystem, presents best-practices recommendations for identifier use, and describes our FAIR Research Identifiers service.

8 citations


Posted Content
TL;DR: This work revisits from this perspective the development and application of grid computing from the mid-1990s onwards, and finds that a translational framing is useful for understanding the technology’s development and impact.
Abstract: A growing gap between progress in biological knowledge and improved health outcomes inspired the new discipline of translational medicine, in which the application of new knowledge is an explicit part of a research plan. Abramson and Parashar argue that a similar gap between complex computational technologies and ever-more-challenging applications demands an analogous discipline of translational computer science, in which the deliberate movement of research results into large-scale practice becomes a central research focus rather than an afterthought. We revisit from this perspective the development and application of grid computing from the mid-1990s onwards, and find that a translational framing is useful for understanding the technology's development and impact. We discuss how the development of grid computing infrastructure, and the Globus Toolkit, in particular, benefited from a translational approach. We identify lessons learned that can be applied to other translational computer science initiatives.

1 citation