
Showing papers on "Software published in 2019"


Journal ArticleDOI
TL;DR: A series of major new developments in the BEAST 2 core platform and model hierarchy that have occurred since the first release of the software, culminating in the recent 2.5 release are described.
Abstract: Elaboration of Bayesian phylogenetic inference methods has continued at pace in recent years with major new advances in nearly all aspects of the joint modelling of evolutionary data. It is increasingly appreciated that some evolutionary questions can only be adequately answered by combining evidence from multiple independent sources of data, including genome sequences, sampling dates, phenotypic data, radiocarbon dates, fossil occurrences, and biogeographic range information among others. Including all relevant data into a single joint model is very challenging both conceptually and computationally. Advanced computational software packages that allow robust development of compatible (sub-)models which can be composed into a full model hierarchy have played a key role in these developments. Developing such software frameworks is increasingly a major scientific activity in its own right, and comes with specific challenges, from practical software design, development and engineering challenges to statistical and conceptual modelling challenges. BEAST 2 is one such computational software platform, and was first announced over 4 years ago. Here we describe a series of major new developments in the BEAST 2 core platform and model hierarchy that have occurred since the first release of the software, culminating in the recent 2.5 release.

2,045 citations


Book
11 Jul 2019
TL;DR: The aim of the present paper is to provide an overview of state-of-the-art dose-response analysis, both in terms of general concepts that have evolved and matured over the years and by means of concrete examples.
Abstract: Dose-response analysis can be carried out using multi-purpose commercial statistical software, but except for a few special cases the analysis easily becomes cumbersome as relevant, non-standard output requires manual programming. The extension package drc for the statistical environment R provides a flexible and versatile infrastructure for dose-response analyses in general. The present version of the package, reflecting extensions and modifications over the last decade, provides a user-friendly interface to specify the model assumptions about the dose-response relationship and comes with a number of extractors for summarizing fitted models and carrying out inference on derived parameters. The aim of the present paper is to provide an overview of state-of-the-art dose-response analysis, both in terms of general concepts that have evolved and matured over the years and by means of concrete examples.

1,827 citations
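
The drc package itself is written for R. As a language-neutral illustration of the underlying idea, the sketch below fits the same four-parameter log-logistic (LL.4) dose-response model with SciPy; the dose and response values are made up, and this is not the drc interface.

# Illustrative only: a four-parameter log-logistic (LL.4) dose-response fit,
# the kind of model the drc R package fits, sketched here with SciPy.
import numpy as np
from scipy.optimize import curve_fit

def ll4(dose, b, c, d, e):
    """LL.4 model: c + (d - c) / (1 + exp(b * (log(dose) - log(e))))."""
    return c + (d - c) / (1.0 + np.exp(b * (np.log(dose) - np.log(e))))

# Hypothetical dose-response measurements (dose in uM, response in %)
dose = np.array([0.01, 0.03, 0.1, 0.3, 1.0, 3.0, 10.0])
resp = np.array([98.0, 95.0, 88.0, 60.0, 30.0, 12.0, 5.0])

# Initial guesses: slope, lower limit, upper limit, ED50
p0 = [1.0, 0.0, 100.0, 0.3]
params, cov = curve_fit(ll4, dose, resp, p0=p0)
b, c, d, e = params
print(f"slope={b:.2f}, lower={c:.1f}, upper={d:.1f}, ED50={e:.3g}")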


Journal ArticleDOI
Yuxing Liao1, Jing Wang1, Eric J. Jaehnig1, Zhiao Shi1, Bing Zhang1 
TL;DR: In the 2019 update, WebGestalt supports 12 organisms, 342 gene identifiers and 155,175 functional categories, as well as user-uploaded functional databases and has completely redesigned result visualizations and user interfaces to improve user-friendliness and to provide multiple types of interactive and publication-ready figures.
Abstract: WebGestalt is a popular tool for the interpretation of gene lists derived from large scale -omics studies. In the 2019 update, WebGestalt supports 12 organisms, 342 gene identifiers and 155,175 functional categories, as well as user-uploaded functional databases. To address the growing and unique need for phosphoproteomics data interpretation, we have implemented phosphosite set analysis to identify important kinases from phosphoproteomics data. We have completely redesigned result visualizations and user interfaces to improve user-friendliness and to provide multiple types of interactive and publication-ready figures. To facilitate comprehension of the enrichment results, we have implemented two methods to reduce redundancy between enriched gene sets. We introduced a web API for other applications to get data programmatically from the WebGestalt server or pass data to WebGestalt for analysis. We also wrapped the core computation into an R package called WebGestaltR for users to perform analysis locally or in third party workflows. WebGestalt can be freely accessed at http://www.webgestalt.org.

1,789 citations
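
The core computation in such tools is gene set enrichment. The sketch below shows a minimal over-representation analysis with a hypergeometric test in Python; the gene universe, functional category and gene list are invented, and the code is not the WebGestalt or WebGestaltR API.

# A minimal over-representation analysis (ORA) sketch with a hypergeometric
# test -- the basic statistic behind tools like WebGestalt. Gene sets and the
# gene list below are hypothetical; this is not the WebGestaltR API.
from scipy.stats import hypergeom

reference = {f"gene{i}" for i in range(1000)}        # background universe
category  = {f"gene{i}" for i in range(0, 50)}       # one functional category
hits      = {f"gene{i}" for i in range(0, 200, 4)}   # user's gene list (50 genes)

N = len(reference)                 # universe size
K = len(category & reference)      # category genes in the universe
n = len(hits & reference)          # list genes in the universe
k = len(hits & category)           # overlap between list and category

# P(X >= k) for X ~ Hypergeom(N, K, n)
p_value = hypergeom.sf(k - 1, N, K, n)
print(f"overlap={k}, enrichment p-value={p_value:.3g}")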


Posted Content
TL;DR: New design-criteria for next-generation hyperparameter optimization software are introduced, including define-by-run API that allows users to construct the parameter search space dynamically, and easy-to-setup, versatile architecture that can be deployed for various purposes.
Abstract: The purpose of this study is to introduce new design-criteria for next-generation hyperparameter optimization software. The criteria we propose include (1) a define-by-run API that allows users to construct the parameter search space dynamically, (2) efficient implementation of both searching and pruning strategies, and (3) an easy-to-setup, versatile architecture that can be deployed for various purposes, ranging from scalable distributed computing to light-weight experiments conducted via an interactive interface. In order to prove our point, we introduce Optuna, optimization software that is the culmination of our effort to develop next-generation optimization software. As optimization software designed around the define-by-run principle, Optuna is the first of its kind. We present the design techniques that became necessary in the development of software that meets the above criteria, and demonstrate the power of our new design through experimental results and real world applications. Our software is available under the MIT license (https://github.com/pfnet/optuna/).

1,448 citations
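
With current Optuna releases, the define-by-run API described above looks roughly like the following minimal example; the objective function is a toy stand-in for a real validation score.

# Minimal define-by-run example with Optuna: the search space is constructed
# dynamically inside the objective function, as described in the abstract.
import optuna

def objective(trial):
    # Parameters are declared on the fly, so the space can depend on earlier draws.
    classifier = trial.suggest_categorical("classifier", ["svm", "random_forest"])
    if classifier == "svm":
        c = trial.suggest_float("svm_c", 1e-3, 1e3, log=True)
        score = -abs(c - 1.0)            # stand-in for a real validation score
    else:
        depth = trial.suggest_int("rf_max_depth", 2, 32)
        score = -abs(depth - 8)          # stand-in for a real validation score
    return score

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print(study.best_params, study.best_value)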


Proceedings ArticleDOI
25 Jul 2019
TL;DR: Optuna as mentioned in this paper is a next-generation hyperparameter optimization software with define-by-run (DBR) API that allows users to construct the parameter search space dynamically.
Abstract: The purpose of this study is to introduce new design-criteria for next-generation hyperparameter optimization software. The criteria we propose include (1) a define-by-run API that allows users to construct the parameter search space dynamically, (2) efficient implementation of both searching and pruning strategies, and (3) an easy-to-setup, versatile architecture that can be deployed for various purposes, ranging from scalable distributed computing to light-weight experiments conducted via an interactive interface. In order to prove our point, we introduce Optuna, optimization software that is the culmination of our effort to develop next-generation optimization software. As optimization software designed around the define-by-run principle, Optuna is the first of its kind. We present the design techniques that became necessary in the development of software that meets the above criteria, and demonstrate the power of our new design through experimental results and real world applications. Our software is available under the MIT license (https://github.com/pfnet/optuna/).

1,248 citations


Journal ArticleDOI
TL;DR: The 2019 version of SwissTargetPrediction is described, which represents a major update in terms of underlying data, backend and web interface, and high levels of predictive performance were maintained despite more extended biological and chemical spaces to be explored.
Abstract: SwissTargetPrediction is a web tool, on-line since 2014, that aims to predict the most probable protein targets of small molecules. Predictions are based on the similarity principle, through reverse screening. Here, we describe the 2019 version, which represents a major update in terms of underlying data, backend and web interface. The bioactivity data were updated, the model retrained and similarity thresholds redefined. In the new version, the predictions are performed by searching for similar molecules, in 2D and 3D, within a larger collection of 376,342 compounds known to be experimentally active on an extended set of 3068 macromolecular targets. An efficient backend implementation speeds up the process, returning results for a drug-like molecule on human proteins in 15-20 s. The refreshed web interface enhances user experience with new features for easy input and improved analysis. Interoperability capacity enables straightforward submission of any input or output molecule to other on-line computer-aided drug design tools developed by the SIB Swiss Institute of Bioinformatics. High levels of predictive performance were maintained despite more extended biological and chemical spaces to be explored, e.g. achieving at least one correct human target in the top 15 predictions for >70% of external compounds. The new SwissTargetPrediction is available free of charge (www.swisstargetprediction.ch).

1,244 citations
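
The similarity principle behind reverse screening can be illustrated with a toy example: rank compounds with known targets by Tanimoto similarity of binary fingerprints to the query. The fingerprints and targets below are invented, and this is not SwissTargetPrediction's backend.

# Toy reverse screening by the similarity principle: rank compounds with known
# targets by Tanimoto similarity to a query fingerprint. Fingerprints here are
# made-up bit sets; real tools derive them from chemical structures.
def tanimoto(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if (a | b) else 0.0

# Hypothetical reference library: compound -> (fingerprint bits, known target)
library = {
    "cmpd_A": ({1, 4, 9, 23, 57}, "Carbonic anhydrase II"),
    "cmpd_B": ({2, 4, 9, 23, 60}, "Cyclooxygenase-1"),
    "cmpd_C": ({7, 11, 40, 81},   "Adenosine receptor A2a"),
}

query = {1, 4, 9, 23, 60}

ranked = sorted(
    ((tanimoto(query, fp), name, target) for name, (fp, target) in library.items()),
    reverse=True,
)
for score, name, target in ranked:
    print(f"{name}: similarity={score:.2f} -> predicted target: {target}")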


Journal ArticleDOI
TL;DR: A high-level overview of the features of the MRtrix3 framework and general-purpose image processing applications provided with the software is provided.

1,228 citations


Posted ContentDOI
15 Feb 2019-bioRxiv
TL;DR: A high-level overview of the features of the MRtrix3 framework and general-purpose image processing applications provided with the software is provided.
Abstract: MRtrix3 is an open-source, cross-platform software package for medical image processing, analysis and visualization, with a particular emphasis on the investigation of the brain using diffusion MRI. It is implemented using a fast, modular and flexible general-purpose code framework for image data access and manipulation, enabling efficient development of new applications, whilst retaining high computational performance and a consistent command-line interface between applications. In this article, we provide a high-level overview of the features of the MRtrix3 framework and general-purpose image processing applications provided with the software.

728 citations


Journal ArticleDOI
TL;DR: This Review describes different deep learning techniques and how they can be applied to extract biologically relevant information from large, complex genomic data sets.
Abstract: As a data-driven science, genomics largely utilizes machine learning to capture dependencies in data and derive novel biological hypotheses. However, the ability to extract new insights from the exponentially increasing volume of genomics data requires more expressive machine learning models. By effectively leveraging large data sets, deep learning has transformed fields such as computer vision and natural language processing. Now, it is becoming the method of choice for many genomics modelling tasks, including predicting the impact of genetic variation on gene regulatory mechanisms such as DNA accessibility and splicing. This Review describes different deep learning techniques and how they can be applied to extract biologically relevant information from large, complex genomic data sets.

685 citations
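
A common architecture discussed in this context is a convolutional network over one-hot encoded DNA. The PyTorch sketch below is a minimal, illustrative version of such a model; the sequence length and layer sizes are arbitrary choices, not taken from the review.

# Minimal sketch of a 1D convolutional network over one-hot encoded DNA,
# the kind of model the review discusses for regulatory genomics.
import torch
import torch.nn as nn

class TinySeqCNN(nn.Module):
    def __init__(self, seq_len: int = 200):
        super().__init__()
        self.conv = nn.Conv1d(in_channels=4, out_channels=16, kernel_size=12)  # 4 = A/C/G/T
        self.pool = nn.AdaptiveMaxPool1d(1)
        self.fc = nn.Linear(16, 1)       # e.g. a predicted accessibility score

    def forward(self, x):                # x: (batch, 4, seq_len), one-hot DNA
        h = torch.relu(self.conv(x))
        h = self.pool(h).squeeze(-1)
        return self.fc(h)

model = TinySeqCNN()
batch = torch.zeros(8, 4, 200)           # 8 one-hot encoded 200-bp sequences
print(model(batch).shape)                # torch.Size([8, 1])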


Journal ArticleDOI
TL;DR: Warp is described, a software that automates all preprocessing steps of cryo-EM data acquisition and enables real-time evaluation, and includes deep-learning-based models for accurate particle picking and image denoising.
Abstract: The acquisition of cryo-electron microscopy (cryo-EM) data from biological specimens must be tightly coupled to data preprocessing to ensure the best data quality and microscope usage. Here we describe Warp, a software that automates all preprocessing steps of cryo-EM data acquisition and enables real-time evaluation. Warp corrects micrographs for global and local motion, estimates the local defocus and monitors key parameters for each recorded micrograph or tomographic tilt series in real time. The software further includes deep-learning-based models for accurate particle picking and image denoising. The output from Warp can be fed into established programs for particle classification and 3D-map refinement. Our benchmarks show improvement in the nominal resolution, which went from 3.9 Å to 3.2 Å, of a published cryo-EM data set for influenza virus hemagglutinin. Warp is easy to install from http://github.com/cramerlab/warp and computationally inexpensive, and has an intuitive, streamlined user interface. The user-friendly software tool Warp enables automated, on-the-fly preprocessing of cryo-EM data, including motion correction, defocus estimation, particle picking and image denoising.

655 citations


Proceedings ArticleDOI
27 May 2019
TL;DR: A study conducted on observing software teams at Microsoft as they develop AI-based applications finds that various Microsoft teams have united this workflow into preexisting, well-evolved, Agile-like software engineering processes, providing insights about several essential engineering challenges that organizations may face in creating large-scale AI solutions for the marketplace.
Abstract: Recent advances in machine learning have stimulated widespread interest within the Information Technology sector on integrating AI capabilities into software and services. This goal has forced organizations to evolve their development processes. We report on a study that we conducted on observing software teams at Microsoft as they develop AI-based applications. We consider a nine-stage workflow process informed by prior experiences developing AI applications (e.g., search and NLP) and data science tools (e.g. application diagnostics and bug reporting). We found that various Microsoft teams have united this workflow into preexisting, well-evolved, Agile-like software engineering processes, providing insights about several essential engineering challenges that organizations may face in creating large-scale AI solutions for the marketplace. We collected some best practices from Microsoft teams to address these challenges. In addition, we have identified three aspects of the AI domain that make it fundamentally different from prior software application domains: 1) discovering, managing, and versioning the data needed for machine learning applications is much more complex and difficult than other types of software engineering, 2) model customization and model reuse require very different skills than are typically found in software teams, and 3) AI components are more difficult to handle as distinct modules than traditional software components --- models may be "entangled" in complex ways and experience non-monotonic error behavior. We believe that the lessons learned by Microsoft teams will be valuable to other organizations.

Journal ArticleDOI
TL;DR: Details describing the content of database entries are presented to enhance the use of the ICDD's Powder Diffraction File to serve a wide range of disciplines covering academic, industrial, and government laboratories.
Abstract: The ICDD's Powder Diffraction File™ (PDF®) is a database of inorganic and organic diffraction data used for phase identification and materials characterization by powder diffraction. The PDF has been available for over 75 years and finds application in X-ray, synchrotron, electron, and neutron diffraction analyses. With entries based on powder and single crystal data, the PDF is the only crystallographic database where every entry is editorially reviewed and marked with a quality mark that alerts the user to the reliability/quality of the submitted data. The editorial processes of ICDD's quality management system are unique in that they are ISO 9001:2015 certified. Initially offered as text on paper cards and books, the PDF evolved to a computer-readable database in the 1960s and today is both computer and web accessible. With data mining and phase identification software available in PDF products, and the databases’ compatibility with vendor (third party) software, the 1,000,000+ published PDF entries serve a wide range of disciplines covering academic, industrial, and government laboratories. Details describing the content of database entries are presented to enhance the use of the PDF.

Journal ArticleDOI
TL;DR: A literature review on the parameters' influence on the prediction performance and on variable importance measures is provided, and the application of one of the most established tuning strategies, model‐based optimization (MBO), is demonstrated.
Abstract: The random forest algorithm (RF) has several hyperparameters that have to be set by the user, e.g., the number of observations drawn randomly for each tree and whether they are drawn with or without replacement, the number of variables drawn randomly for each split, the splitting rule, the minimum number of samples that a node must contain and the number of trees. In this paper, we first provide a literature review on the parameters' influence on the prediction performance and on variable importance measures. It is well known that in most cases RF works reasonably well with the default values of the hyperparameters specified in software packages. Nevertheless, tuning the hyperparameters can improve the performance of RF. In the second part of this paper, after a brief overview of tuning strategies we demonstrate the application of one of the most established tuning strategies, model-based optimization (MBO). To make it easier to use, we provide the tuneRanger R package that tunes RF with MBO automatically. In a benchmark study on several datasets, we compare the prediction performance and runtime of tuneRanger with other tuning implementations in R and RF with default hyperparameters.
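
tuneRanger is an R package. As a rough Python analogue of the tuning problem it addresses, the sketch below searches over the same kinds of random forest hyperparameters with scikit-learn's randomized search; this stands in for, and is not, the model-based optimization used by tuneRanger.

# Python analogue of the tuning problem described above: search over random
# forest hyperparameters (number of trees, variables per split, node size).
# scikit-learn's randomized search stands in for model-based optimization here.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = load_breast_cancer(return_X_y=True)

param_distributions = {
    "n_estimators": [100, 300, 500],        # number of trees
    "max_features": ["sqrt", "log2", 0.5],  # variables drawn per split
    "min_samples_leaf": [1, 5, 10, 20],     # minimum node size
    "bootstrap": [True, False],             # draw observations with or without replacement
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions,
    n_iter=20,
    cv=5,
    random_state=0,
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))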

Proceedings ArticleDOI
15 Oct 2019
TL;DR: A light weight, standalone and offline software package that does not require any installation or setup and runs solely in a web browser, the VIA software allows human annotators to define and describe spatial regions in images or video frames, and temporal segments in audio or video.
Abstract: In this paper, we introduce a simple and standalone manual annotation tool for images, audio and video: the VGG Image Annotator (VIA). This is a light weight, standalone and offline software package that does not require any installation or setup and runs solely in a web browser. The VIA software allows human annotators to define and describe spatial regions in images or video frames, and temporal segments in audio or video. These manual annotations can be exported to plain text data formats such as JSON and CSV and therefore are amenable to further processing by other software tools. VIA also supports collaborative annotation of a large dataset by a group of human annotators. The BSD open source license of this software allows it to be used in any academic project or commercial application.
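
Because VIA exports plain JSON and CSV, annotations are easy to consume downstream. The sketch below walks a JSON project export; the key names ("filename", "regions", "shape_attributes") are assumptions based on the general layout of VIA 2.x exports and should be checked against a file produced by your own VIA version.

# Sketch of consuming a VIA JSON export downstream. Key names are assumptions
# about the VIA 2.x export layout; verify them against an actual export file.
import json

with open("via_export.json") as fh:          # hypothetical export file
    project = json.load(fh)

for entry in project.values():
    filename = entry.get("filename", "?")
    for region in entry.get("regions", []):
        shape = region.get("shape_attributes", {})   # e.g. rect/polygon geometry
        attrs = region.get("region_attributes", {})  # annotator-defined labels
        print(filename, shape.get("name"), attrs)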

Journal ArticleDOI
TL;DR: The new systematic review software, the Joanna Briggs Institute System for the Unified Management, Assessment and Review of Information (JBI SUMARI), was successfully developed through an iterative process of development, feedback, testing and review.
Abstract: Aim:Systematic reviews play an important role in ensuring trustworthy recommendations in healthcare. However, systematic reviews can be laborious to undertake and as such software has been developed to assist in the conduct and reporting of systematic reviews. The Joanna Briggs Institute and

Journal ArticleDOI
TL;DR: Kluger et al. as mentioned in this paper proposed a heatmap-style visualization for scRNA-seq based on one-dimensional t-distributed stochastic neighbor embedding (t-SNE) for simultaneously visualizing the expression patterns of thousands of genes.
Abstract: t-distributed stochastic neighbor embedding (t-SNE) is widely used for visualizing single-cell RNA-sequencing (scRNA-seq) data, but it scales poorly to large datasets. We dramatically accelerate t-SNE, obviating the need for data downsampling, and hence allowing visualization of rare cell populations. Furthermore, we implement a heatmap-style visualization for scRNA-seq based on one-dimensional t-SNE for simultaneously visualizing the expression patterns of thousands of genes. Software is available at https://github.com/KlugerLab/FIt-SNE and https://github.com/KlugerLab/t-SNE-Heatmaps .
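
For orientation, a typical t-SNE workflow on a cells-by-genes matrix looks like the sketch below. scikit-learn's TSNE is used here purely as a stand-in for FIt-SNE, and random counts stand in for real scRNA-seq data.

# Minimal t-SNE embedding of a cells-by-genes matrix. scikit-learn's TSNE is
# a stand-in for FIt-SNE; random numbers stand in for real expression values.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
expression = rng.poisson(1.0, size=(500, 2000)).astype(float)  # 500 cells x 2000 genes

# Common practice: reduce to ~50 principal components before running t-SNE.
pcs = PCA(n_components=50, random_state=0).fit_transform(np.log1p(expression))
embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(pcs)
print(embedding.shape)   # (500, 2)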

Journal ArticleDOI
01 Sep 2019
TL;DR: This work reviews the latest version of SmartPLS and discusses its various features, and offers researchers with concrete guidance regarding their choice of a PLS-SEM software that fits their analytical needs.
Abstract: In their effort to better understand consumer behavior, marketing researchers often analyze relationships between latent variables, measured by sets of observed variables. Partial least squares structural equation modeling (PLS-SEM) has become a popular tool for analyzing such relationships. Particularly the availability of SmartPLS, a comprehensive software program with an intuitive graphical user interface, helped popularize the method. We review the latest version of SmartPLS and discuss its various features. Our aim is to offer researchers with concrete guidance regarding their choice of a PLS-SEM software that fits their analytical needs.

Journal ArticleDOI
TL;DR: The pLink 2 as discussed by the authors is a search engine with higher speed and reliability for proteome-scale identification of cross-linked peptides, with a two-stage open search strategy facilitated by fragment indexing.
Abstract: We describe pLink 2, a search engine with higher speed and reliability for proteome-scale identification of cross-linked peptides. With a two-stage open search strategy facilitated by fragment indexing, pLink 2 is ~40 times faster than pLink 1 and 3-10 times faster than Kojak. Furthermore, using simulated datasets, synthetic datasets, 15N metabolically labeled datasets, and entrapment databases, four analysis methods were designed to evaluate the credibility of ten state-of-the-art search engines. This systematic evaluation shows that pLink 2 outperforms these methods in precision and sensitivity, especially at proteome scales. Lastly, re-analysis of four published proteome-scale cross-linking datasets with pLink 2 required only a fraction of the time used by pLink 1, with up to 27% more cross-linked residue pairs identified. pLink 2 is therefore an efficient and reliable tool for cross-linking mass spectrometry analysis, and the systematic evaluation methods described here will be useful for future software development. The identification of cross-linked peptides at a proteome scale for interactome analyses represents a complex challenge. Here the authors report an efficient and reliable search engine pLink 2 for proteome-scale cross-linking mass spectrometry analyses, and demonstrate how to systematically evaluate the credibility of search engines.
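
The speed gain comes from fragment indexing in the first stage of the open search. The toy sketch below shows the general idea of a fragment index (binned fragment masses mapping to candidate peptides); it is an illustration only, not pLink 2's actual data structure or tolerances.

# Toy illustration of fragment indexing: bin theoretical fragment masses and
# look up candidate peptides for an observed spectrum peak in O(1).
from collections import defaultdict

BIN_WIDTH = 0.02  # Da; illustrative bin size

def build_fragment_index(peptide_fragments):
    """peptide_fragments: dict peptide_id -> list of theoretical fragment masses."""
    index = defaultdict(set)
    for pep_id, masses in peptide_fragments.items():
        for m in masses:
            index[round(m / BIN_WIDTH)].add(pep_id)
    return index

def candidates_for_peak(index, observed_mass):
    """Return peptide ids whose theoretical fragments fall near an observed peak."""
    b = round(observed_mass / BIN_WIDTH)
    return index[b - 1] | index[b] | index[b + 1]

frags = {"PEPTIDEA": [147.11, 244.17, 359.19], "PEPTIDEB": [147.11, 276.15]}
index = build_fragment_index(frags)
print(candidates_for_peak(index, 147.12))   # both peptides share this fragment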

Journal ArticleDOI
TL;DR: yambo as mentioned in this paper is an open source project aimed at studying excited state properties of condensed matter systems from first principles using many-body methods using ground state electronic structure data as computed by density functional theory codes such as Quantum ESPRESSO and Abinit.
Abstract: yambo is an open source project aimed at studying excited state properties of condensed matter systems from first principles using many-body methods. As input, yambo requires ground state electronic structure data as computed by density functional theory codes such as Quantum ESPRESSO and Abinit. yambo's capabilities include the calculation of linear response quantities (both independent-particle and including electron-hole interactions), quasi-particle corrections based on the GW formalism, optical absorption, and other spectroscopic quantities. Here we describe recent developments ranging from the inclusion of important but oft-neglected physical effects such as electron-phonon interactions to the implementation of a real-time propagation scheme for simulating linear and non-linear optical properties. Improvements to numerical algorithms and the user interface are outlined. Particular emphasis is given to the new and efficient parallel structure that makes it possible to exploit modern high performance computing architectures. Finally, we demonstrate the possibility to automate workflows by interfacing with the yambopy and AiiDA software tools.

Journal ArticleDOI
TL;DR: The experimental results indicated that the predictors developed by the BioSeq-Analysis2.0 can achieve comparable or even better performance than the existing state-of-the-art predictors.
Abstract: As the first web server to analyze various biological sequences at sequence level based on machine learning approaches, many powerful predictors in the field of computational biology have been developed with the assistance of the BioSeq-Analysis. However, the BioSeq-Analysis can be only applied to the sequence-level analysis tasks, preventing its applications to the residue-level analysis tasks, and an intelligent tool that is able to automatically generate various predictors for biological sequence analysis at both residue level and sequence level is highly desired. In this regard, we decided to publish an important updated server covering a total of 26 features at the residue level and 90 features at the sequence level called BioSeq-Analysis2.0 (http://bliulab.net/BioSeq-Analysis2.0/), by which the users only need to upload the benchmark dataset, and the BioSeq-Analysis2.0 can generate the predictors for both residue-level analysis and sequence-level analysis tasks. Furthermore, the corresponding stand-alone tool was also provided, which can be downloaded from http://bliulab.net/BioSeq-Analysis2.0/download/. To the best of our knowledge, the BioSeq-Analysis2.0 is the first tool for generating predictors for biological sequence analysis tasks at residue level. Specifically, the experimental results indicated that the predictors developed by BioSeq-Analysis2.0 can achieve comparable or even better performance than the existing state-of-the-art predictors.
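
Sequence-level feature modes in such servers are typically compositions such as k-mer frequencies. The sketch below computes normalized k-mer frequencies for a DNA sequence as a generic illustration; it is not the BioSeq-Analysis2.0 implementation.

# Generic sequence-level feature extraction: normalized k-mer frequencies for
# a DNA sequence, the kind of feature mode such servers offer.
from collections import Counter
from itertools import product

def kmer_features(seq: str, k: int = 2) -> dict:
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = max(sum(counts.values()), 1)
    alphabet = "ACGT"
    return {"".join(p): counts["".join(p)] / total for p in product(alphabet, repeat=k)}

features = kmer_features("ACGTACGTGGCC", k=2)
print(features["AC"], features["GG"])   # relative frequencies of two dinucleotides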

Journal ArticleDOI
TL;DR: This work generated a realistic benchmark experiment that included single cells and admixtures of cells or RNA to create ‘pseudo cells’ from up to five distinct cancer cell lines and provided a comprehensive framework for benchmarking most common scRNA-seq analysis steps.
Abstract: Single cell RNA-sequencing (scRNA-seq) technology has undergone rapid development in recent years, leading to an explosion in the number of tailored data analysis methods. However, the current lack of gold-standard benchmark datasets makes it difficult for researchers to systematically compare the performance of the many methods available. Here, we generated a realistic benchmark experiment that included single cells and admixtures of cells or RNA to create 'pseudo cells' from up to five distinct cancer cell lines. In total, 14 datasets were generated using both droplet and plate-based scRNA-seq protocols. We compared 3,913 combinations of data analysis methods for tasks ranging from normalization and imputation to clustering, trajectory analysis and data integration. Evaluation revealed pipelines suited to different types of data for different tasks. Our data and analysis provide a comprehensive framework for benchmarking most common scRNA-seq analysis steps.

Journal ArticleDOI
TL;DR: This study reports results from the second community-wide single-molecule localization microscopy software challenge, which tested over 30 software packages on realistic simulated data for multiple popular 3D image acquisition modes, as well as 2D localization microscopes.
Abstract: With the widespread uptake of two-dimensional (2D) and three-dimensional (3D) single-molecule localization microscopy (SMLM), a large set of different data analysis packages have been developed to generate super-resolution images. In a large community effort, we designed a competition to extensively characterize and rank the performance of 2D and 3D SMLM software packages. We generated realistic simulated datasets for popular imaging modalities (2D, astigmatic 3D, biplane 3D and double-helix 3D) and evaluated 36 participant packages against these data. This provides the first broad assessment of 3D SMLM software and provides a holistic view of how the latest 2D and 3D SMLM packages perform in realistic conditions. This resource allows researchers to identify optimal analytical software for their experiments, allows 3D SMLM software developers to benchmark new software against the current state of the art, and provides insight into the current limits of the field.

Journal ArticleDOI
TL;DR: GDS format provides efficient storage and retrieval of genotypes measured by microarrays and sequencing, and GENESIS implements highly flexible mixed models, allowing for different link functions, multiple variance components, and phenotypic heteroskedasticity.
Abstract: Summary: The Genomic Data Structure (GDS) format provides efficient storage and retrieval of genotypes measured by microarrays and sequencing. We developed GENESIS to perform various single- and aggregate-variant association tests using genotype data stored in GDS format. GENESIS implements highly flexible mixed models, allowing for different link functions, multiple variance components and phenotypic heteroskedasticity. GENESIS integrates cohesively with other R/Bioconductor packages to build a complete genomic analysis workflow entirely within the R environment. Availability and implementation: https://bioconductor.org/packages/GENESIS; vignettes included. Supplementary information: Supplementary data are available at Bioinformatics online.

Journal ArticleDOI
TL;DR: ROAST is released as an open-source, easy-to-install and fully-automated pipeline for individualized TES modeling and its performance with commercial FEM software, and SimNIBS, a well-established open- source modeling pipeline is compared.
Abstract: Objective: Research in the area of transcranial electrical stimulation (TES) often relies on computational models of current flow in the brain. Models are built based on magnetic resonance images (MRI) of the human head to capture detailed individual anatomy. To simulate current flow on an individual, the subject's MRI is segmented, virtual electrodes are placed on this anatomical model, the volume is tessellated into a mesh, and a finite element model (FEM) is solved numerically to estimate the current flow. Various software tools are available for each of these steps, as well as processing pipelines that connect these tools for automated or semi-automated processing. The goal of the present tool, a realistic volumetric approach to simulate transcranial electric stimulation (ROAST), is to provide an end-to-end pipeline that can automatically process individual heads with realistic volumetric anatomy, leveraging open-source software and custom scripts to improve segmentation and execute electrode placement. Approach: ROAST combines the segmentation algorithm of SPM12, a Matlab script for touch-up and automatic electrode placement, the finite element mesher iso2mesh and the solver getDP. We compared its performance with commercial FEM software, and SimNIBS, a well-established open-source modeling pipeline. Main results: The electric fields estimated with ROAST differ little from the results obtained with commercial meshing and FEM solving software. We also do not find large differences between the various automated segmentation methods used by ROAST and SimNIBS. We do find bigger differences when volumetric segmentations are converted into surfaces in SimNIBS. However, evaluation on intracranial recordings from human subjects suggests that ROAST and SimNIBS are not significantly different in predicting field distribution, provided that users have detailed knowledge of SimNIBS. Significance: We hope that the detailed comparisons presented here of various choices in this modeling pipeline can provide guidance for future tool development. We released ROAST as an open-source, easy-to-install and fully-automated pipeline for individualized TES modeling.

Book ChapterDOI
01 Jan 2019
TL;DR: This chapter introduces the TIRA Integrated Research Architecture, its design requirements, its workflows from both the participants’ and the organizers’ perspectives, alongside a report on user experience and usage scenarios.
Abstract: Data and software are immaterial. Scientists in computer science hence have the unique chance to let other scientists easily reproduce their findings. Similarly, and with the same ease, the organization of shared tasks, i.e., the collaborative search for new algorithms given a predefined problem, is possible. Experience shows that the potential of reproducibility is hardly tapped in either case. Based on this observation, and driven by the ambitious goal to find the best solutions for certain problems in our research field, we have been developing the TIRA Integrated Research Architecture. Within TIRA, the reproducibility requirement got top priority right from the start. This chapter introduces the platform, its design requirements, its workflows from both the participants’ and the organizers’ perspectives, alongside a report on user experience and usage scenarios.

Proceedings ArticleDOI
TL;DR: The VGG Image Annotator (VIA) as discussed by the authors is a simple and standalone manual annotation tool for images, audio and video that allows human annotators to define and describe spatial regions in images or video frames and temporal segments in audio or video.
Abstract: In this paper, we introduce a simple and standalone manual annotation tool for images, audio and video: the VGG Image Annotator (VIA). This is a light weight, standalone and offline software package that does not require any installation or setup and runs solely in a web browser. The VIA software allows human annotators to define and describe spatial regions in images or video frames, and temporal segments in audio or video. These manual annotations can be exported to plain text data formats such as JSON and CSV and therefore are amenable to further processing by other software tools. VIA also supports collaborative annotation of a large dataset by a group of human annotators. The BSD open source license of this software allows it to be used in any academic project or commercial application.

Journal ArticleDOI
TL;DR: This work consists of explaining the reasons behind the new rise of AR and VR and why their actual adoption in education will be a reality in a near fu-ture.
Abstract: Augmented Reality and Virtual Reality are not new technologies. But several constraints prevented their actual adoption. Recent technological progress, added to the proliferation of affordable hardware and software, has made AR and VR more viable and desirable in many domains, including education; they have been relaunched with new promises previously unimaginable. The nature of AR and VR promises new teaching and learning models that better meet the needs of the 21st century learner. We’re now on a path to re-invent education. This work consists of explaining the reasons behind the new rise of AR and VR and why their actual adoption in education will be a reality in the near future.

Posted Content
TL;DR: MLPerf as discussed by the authors is an ML benchmark that overcomes three unique benchmarking challenges absent from other domains: optimizations that improve training throughput can increase the time to solution, training is stochastic and time-to-solution exhibits high variance, and software and hardware systems are so diverse that fair benchmarking with the same binary, code, and even hyperparameters is difficult.
Abstract: Machine learning (ML) needs industry-standard performance benchmarks to support design and competitive evaluation of the many emerging software and hardware solutions for ML. But ML training presents three unique benchmarking challenges absent from other domains: optimizations that improve training throughput can increase the time to solution, training is stochastic and time to solution exhibits high variance, and software and hardware systems are so diverse that fair benchmarking with the same binary, code, and even hyperparameters is difficult. We therefore present MLPerf, an ML benchmark that overcomes these challenges. Our analysis quantitatively evaluates MLPerf's efficacy at driving performance and scalability improvements across two rounds of results from multiple vendors.

Journal ArticleDOI
TL;DR: The EVcouplings framework is presented, a fully integrated open-source application and Python package for coevolutionary analysis that enables generation of sequence alignments, calculation and evaluation of evolutionary couplings, and de novo prediction of structure and mutation effects.
Abstract: Summary: Coevolutionary sequence analysis has become a commonly used technique for de novo prediction of the structure and function of proteins, RNA, and protein complexes. We present the EVcouplings framework, a fully integrated open-source application and Python package for coevolutionary analysis. The framework enables generation of sequence alignments, calculation and evaluation of evolutionary couplings (ECs), and de novo prediction of structure and mutation effects. The combination of an easy to use, flexible command line interface and an underlying modular Python package makes the full power of coevolutionary analyses available to entry-level and advanced users. Availability and implementation: https://github.com/debbiemarkslab/evcouplings.
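
Evolutionary couplings quantify covariation between alignment columns. As a toy illustration of that idea only, the sketch below scores two columns of a small alignment by mutual information; EVcouplings itself uses much more sophisticated global models (e.g. pseudo-likelihood maximization).

# Toy covariation score between two columns of a multiple sequence alignment:
# mutual information. This only illustrates the idea of an evolutionary coupling.
import math
from collections import Counter

msa = ["ACDEK", "ACDER", "ASDGK", "ASDGR", "ACDEK"]  # toy alignment, 5 sequences

def mutual_information(msa, i, j):
    n = len(msa)
    pi = Counter(s[i] for s in msa)
    pj = Counter(s[j] for s in msa)
    pij = Counter((s[i], s[j]) for s in msa)
    mi = 0.0
    for (a, b), c in pij.items():
        p_ab = c / n
        mi += p_ab * math.log(p_ab / ((pi[a] / n) * (pj[b] / n)))
    return mi

print(round(mutual_information(msa, 1, 3), 3))   # columns 1 and 3 covary strongly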

Book ChapterDOI
15 Jul 2019
TL;DR: VerifAI particularly addresses challenges with applying formal methods to ML components such as perception systems based on deep neural networks, as well as systems containing them, and to model and analyze system behavior in the presence of environment uncertainty.
Abstract: We present VerifAI, a software toolkit for the formal design and analysis of systems that include artificial intelligence (AI) and machine learning (ML) components. VerifAI particularly addresses challenges with applying formal methods to ML components such as perception systems based on deep neural networks, as well as systems containing them, and to model and analyze system behavior in the presence of environment uncertainty. We describe the initial version of VerifAI, which centers on simulation-based verification and synthesis, guided by formal models and specifications. We give examples of several use cases, including temporal-logic falsification, model-based systematic fuzz testing, parameter synthesis, counterexample analysis, and data set augmentation.
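
Simulation-based falsification, as described above, amounts to sampling environment parameters, running the system in simulation, and checking a specification. The toy loop below illustrates that pattern with an invented simulator and property; it is not the VerifAI API.

# Toy simulation-based falsification loop in the spirit described above:
# sample environment parameters, simulate, check a specification, and record
# counterexamples.
import random

def simulate(initial_gap, lead_braking):
    """Hypothetical simulator: returns the final (minimum) gap to a braking lead car."""
    gap = initial_gap
    speed_diff = 5.0
    for _ in range(50):
        speed_diff += lead_braking * 0.1      # lead car keeps decelerating
        gap -= speed_diff * 0.1               # gap shrinks each time step
    return gap

def specification(min_gap):
    return min_gap > 0.0                      # "never collide"

random.seed(0)
counterexamples = []
for _ in range(1000):
    params = (random.uniform(5.0, 50.0), random.uniform(0.0, 3.0))
    if not specification(simulate(*params)):
        counterexamples.append(params)

print(f"{len(counterexamples)} falsifying environments found")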