
Showing papers on "Software published in 2019"


Journal ArticleDOI
TL;DR: A series of major new developments in the BEAST 2 core platform and model hierarchy that have occurred since the first release of the software, culminating in the recent 2.5 release are described.
Abstract: Elaboration of Bayesian phylogenetic inference methods has continued at pace in recent years with major new advances in nearly all aspects of the joint modelling of evolutionary data. It is increasingly appreciated that some evolutionary questions can only be adequately answered by combining evidence from multiple independent sources of data, including genome sequences, sampling dates, phenotypic data, radiocarbon dates, fossil occurrences, and biogeographic range information among others. Including all relevant data into a single joint model is very challenging both conceptually and computationally. Advanced computational software packages that allow robust development of compatible (sub-)models which can be composed into a full model hierarchy have played a key role in these developments. Developing such software frameworks is increasingly a major scientific activity in its own right, and comes with specific challenges, from practical software design, development and engineering challenges to statistical and conceptual modelling challenges. BEAST 2 is one such computational software platform, and was first announced over 4 years ago. Here we describe a series of major new developments in the BEAST 2 core platform and model hierarchy that have occurred since the first release of the software, culminating in the recent 2.5 release.

2,045 citations


Book
11 Jul 2019
TL;DR: The aim of the present paper is to provide an overview of state-of-the-art dose-response analysis, both in terms of general concepts that have evolved and matured over the years and by means of concrete examples.
Abstract: Dose-response analysis can be carried out using multi-purpose commercial statistical software, but except for a few special cases the analysis easily becomes cumbersome as relevant, non-standard output requires manual programming. The extension package drc for the statistical environment R provides a flexible and versatile infrastructure for dose-response analyses in general. The present version of the package, reflecting extensions and modifications over the last decade, provides a user-friendly interface to specify the model assumptions about the dose-response relationship and comes with a number of extractors for summarizing fitted models and carrying out inference on derived parameters. The aim of the present paper is to provide an overview of state-of-the-art dose-response analysis, both in terms of general concepts that have evolved and matured over the years and by means of concrete examples.

1,827 citations
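
The drc package itself is written for R. As a language-neutral illustration of the underlying idea, the sketch below fits the same four-parameter log-logistic (LL.4) dose-response model with SciPy; the dose and response values are made up, and this is not the drc interface.

# Illustrative only: a four-parameter log-logistic (LL.4) dose-response fit,
# the kind of model the drc R package fits, sketched here with SciPy.
import numpy as np
from scipy.optimize import curve_fit

def ll4(dose, b, c, d, e):
    """LL.4 model: c + (d - c) / (1 + exp(b * (log(dose) - log(e))))."""
    return c + (d - c) / (1.0 + np.exp(b * (np.log(dose) - np.log(e))))

# Hypothetical dose-response measurements (dose in uM, response in %)
dose = np.array([0.01, 0.03, 0.1, 0.3, 1.0, 3.0, 10.0])
resp = np.array([98.0, 95.0, 88.0, 60.0, 30.0, 12.0, 5.0])

# Initial guesses: slope, lower limit, upper limit, ED50
p0 = [1.0, 0.0, 100.0, 0.3]
params, cov = curve_fit(ll4, dose, resp, p0=p0)
b, c, d, e = params
print(f"slope={b:.2f}, lower={c:.1f}, upper={d:.1f}, ED50={e:.3g}")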


Journal ArticleDOI
Yuxing Liao1, Jing Wang1, Eric J. Jaehnig1, Zhiao Shi1, Bing Zhang1 
TL;DR: In the 2019 update, WebGestalt supports 12 organisms, 342 gene identifiers and 155,175 functional categories, as well as user-uploaded functional databases and has completely redesigned result visualizations and user interfaces to improve user-friendliness and to provide multiple types of interactive and publication-ready figures.
Abstract: WebGestalt is a popular tool for the interpretation of gene lists derived from large scale -omics studies. In the 2019 update, WebGestalt supports 12 organisms, 342 gene identifiers and 155,175 functional categories, as well as user-uploaded functional databases. To address the growing and unique need for phosphoproteomics data interpretation, we have implemented phosphosite set analysis to identify important kinases from phosphoproteomics data. We have completely redesigned result visualizations and user interfaces to improve user-friendliness and to provide multiple types of interactive and publication-ready figures. To facilitate comprehension of the enrichment results, we have implemented two methods to reduce redundancy between enriched gene sets. We introduced a web API for other applications to get data programmatically from the WebGestalt server or pass data to WebGestalt for analysis. We also wrapped the core computation into an R package called WebGestaltR for users to perform analysis locally or in third party workflows. WebGestalt can be freely accessed at http://www.webgestalt.org.

1,789 citations
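
The core computation in such tools is gene set enrichment. The sketch below shows a minimal over-representation analysis with a hypergeometric test in Python; the gene universe, functional category and gene list are invented, and the code is not the WebGestalt or WebGestaltR API.

# A minimal over-representation analysis (ORA) sketch with a hypergeometric
# test -- the basic statistic behind tools like WebGestalt. Gene sets and the
# gene list below are hypothetical; this is not the WebGestaltR API.
from scipy.stats import hypergeom

reference = {f"gene{i}" for i in range(1000)}        # background universe
category  = {f"gene{i}" for i in range(0, 50)}       # one functional category
hits      = {f"gene{i}" for i in range(0, 200, 4)}   # user's gene list (50 genes)

N = len(reference)                 # universe size
K = len(category & reference)      # category genes in the universe
n = len(hits & reference)          # list genes in the universe
k = len(hits & category)           # overlap between list and category

# P(X >= k) for X ~ Hypergeom(N, K, n)
p_value = hypergeom.sf(k - 1, N, K, n)
print(f"overlap={k}, enrichment p-value={p_value:.3g}")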


Posted Content
TL;DR: New design-criteria for next-generation hyperparameter optimization software are introduced, including define-by-run API that allows users to construct the parameter search space dynamically, and easy-to-setup, versatile architecture that can be deployed for various purposes.
Abstract: The purpose of this study is to introduce new design-criteria for next-generation hyperparameter optimization software. The criteria we propose include (1) a define-by-run API that allows users to construct the parameter search space dynamically, (2) efficient implementation of both searching and pruning strategies, and (3) an easy-to-setup, versatile architecture that can be deployed for various purposes, ranging from scalable distributed computing to light-weight experiments conducted via an interactive interface. In order to prove our point, we introduce Optuna, optimization software that is the culmination of our effort to develop next-generation optimization software. As optimization software designed around the define-by-run principle, Optuna is the first of its kind. We present the design techniques that became necessary in the development of software that meets the above criteria, and demonstrate the power of our new design through experimental results and real world applications. Our software is available under the MIT license (https://github.com/pfnet/optuna/).

1,448 citations
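
With current Optuna releases, the define-by-run API described above looks roughly like the following minimal example; the objective function is a toy stand-in for a real validation score.

# Minimal define-by-run example with Optuna: the search space is constructed
# dynamically inside the objective function, as described in the abstract.
import optuna

def objective(trial):
    # Parameters are declared on the fly, so the space can depend on earlier draws.
    classifier = trial.suggest_categorical("classifier", ["svm", "random_forest"])
    if classifier == "svm":
        c = trial.suggest_float("svm_c", 1e-3, 1e3, log=True)
        score = -abs(c - 1.0)            # stand-in for a real validation score
    else:
        depth = trial.suggest_int("rf_max_depth", 2, 32)
        score = -abs(depth - 8)          # stand-in for a real validation score
    return score

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print(study.best_params, study.best_value)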


Proceedings ArticleDOI
25 Jul 2019
TL;DR: Optuna as mentioned in this paper is a next-generation hyperparameter optimization software with define-by-run (DBR) API that allows users to construct the parameter search space dynamically.
Abstract: The purpose of this study is to introduce new design-criteria for next-generation hyperparameter optimization software. The criteria we propose include (1) a define-by-run API that allows users to construct the parameter search space dynamically, (2) efficient implementation of both searching and pruning strategies, and (3) an easy-to-setup, versatile architecture that can be deployed for various purposes, ranging from scalable distributed computing to light-weight experiments conducted via an interactive interface. In order to prove our point, we introduce Optuna, optimization software that is the culmination of our effort to develop next-generation optimization software. As optimization software designed around the define-by-run principle, Optuna is the first of its kind. We present the design techniques that became necessary in the development of software that meets the above criteria, and demonstrate the power of our new design through experimental results and real world applications. Our software is available under the MIT license (https://github.com/pfnet/optuna/).

1,248 citations


Journal ArticleDOI
TL;DR: The 2019 version of SwissTargetPrediction is described, which represents a major update in terms of underlying data, backend and web interface, and high levels of predictive performance were maintained despite more extended biological and chemical spaces to be explored.
Abstract: SwissTargetPrediction is a web tool, on-line since 2014, that aims to predict the most probable protein targets of small molecules. Predictions are based on the similarity principle, through reverse screening. Here, we describe the 2019 version, which represents a major update in terms of underlying data, backend and web interface. The bioactivity data were updated, the model retrained and similarity thresholds redefined. In the new version, the predictions are performed by searching for similar molecules, in 2D and 3D, within a larger collection of 376,342 compounds known to be experimentally active on an extended set of 3068 macromolecular targets. An efficient backend implementation speeds up the process, returning results for a drug-like molecule on human proteins in 15-20 s. The refreshed web interface enhances user experience with new features for easy input and improved analysis. Interoperability capacity enables straightforward submission of any input or output molecule to other on-line computer-aided drug design tools developed by the SIB Swiss Institute of Bioinformatics. High levels of predictive performance were maintained despite more extended biological and chemical spaces to be explored, e.g. achieving at least one correct human target in the top 15 predictions for >70% of external compounds. The new SwissTargetPrediction is available free of charge (www.swisstargetprediction.ch).

1,244 citations
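
The similarity principle behind reverse screening can be illustrated with a toy example: rank compounds with known targets by Tanimoto similarity of binary fingerprints to the query. The fingerprints and targets below are invented, and this is not SwissTargetPrediction's backend.

# Toy reverse screening by the similarity principle: rank compounds with known
# targets by Tanimoto similarity to a query fingerprint. Fingerprints here are
# made-up bit sets; real tools derive them from chemical structures.
def tanimoto(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if (a | b) else 0.0

# Hypothetical reference library: compound -> (fingerprint bits, known target)
library = {
    "cmpd_A": ({1, 4, 9, 23, 57}, "Carbonic anhydrase II"),
    "cmpd_B": ({2, 4, 9, 23, 60}, "Cyclooxygenase-1"),
    "cmpd_C": ({7, 11, 40, 81},   "Adenosine receptor A2a"),
}

query = {1, 4, 9, 23, 60}

ranked = sorted(
    ((tanimoto(query, fp), name, target) for name, (fp, target) in library.items()),
    reverse=True,
)
for score, name, target in ranked:
    print(f"{name}: similarity={score:.2f} -> predicted target: {target}")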


Journal ArticleDOI
TL;DR: A high-level overview of the features of the MRtrix3 framework and general-purpose image processing applications provided with the software is provided.

1,228 citations


Posted ContentDOI
15 Feb 2019-bioRxiv
TL;DR: A high-level overview of the features of the MRtrix3 framework and general-purpose image processing applications provided with the software is provided.
Abstract: MRtrix3 is an open-source, cross-platform software package for medical image processing, analysis and visualization, with a particular emphasis on the investigation of the brain using diffusion MRI. It is implemented using a fast, modular and flexible general-purpose code framework for image data access and manipulation, enabling efficient development of new applications, whilst retaining high computational performance and a consistent command-line interface between applications. In this article, we provide a high-level overview of the features of the MRtrix3 framework and general-purpose image processing applications provided with the software.

728 citations


Journal ArticleDOI
TL;DR: This Review describes different deep learning techniques and how they can be applied to extract biologically relevant information from large, complex genomic data sets.
Abstract: As a data-driven science, genomics largely utilizes machine learning to capture dependencies in data and derive novel biological hypotheses. However, the ability to extract new insights from the exponentially increasing volume of genomics data requires more expressive machine learning models. By effectively leveraging large data sets, deep learning has transformed fields such as computer vision and natural language processing. Now, it is becoming the method of choice for many genomics modelling tasks, including predicting the impact of genetic variation on gene regulatory mechanisms such as DNA accessibility and splicing. This Review describes different deep learning techniques and how they can be applied to extract biologically relevant information from large, complex genomic data sets.

685 citations
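
A common architecture discussed in this context is a convolutional network over one-hot encoded DNA. The PyTorch sketch below is a minimal, illustrative version of such a model; the sequence length and layer sizes are arbitrary choices, not taken from the review.

# Minimal sketch of a 1D convolutional network over one-hot encoded DNA,
# the kind of model the review discusses for regulatory genomics.
import torch
import torch.nn as nn

class TinySeqCNN(nn.Module):
    def __init__(self, seq_len: int = 200):
        super().__init__()
        self.conv = nn.Conv1d(in_channels=4, out_channels=16, kernel_size=12)  # 4 = A/C/G/T
        self.pool = nn.AdaptiveMaxPool1d(1)
        self.fc = nn.Linear(16, 1)       # e.g. a predicted accessibility score

    def forward(self, x):                # x: (batch, 4, seq_len), one-hot DNA
        h = torch.relu(self.conv(x))
        h = self.pool(h).squeeze(-1)
        return self.fc(h)

model = TinySeqCNN()
batch = torch.zeros(8, 4, 200)           # 8 one-hot encoded 200-bp sequences
print(model(batch).shape)                # torch.Size([8, 1])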


Journal ArticleDOI
TL;DR: Warp is described, a software that automates all preprocessing steps of cryo-EM data acquisition and enables real-time evaluation, and includes deep-learning-based models for accurate particle picking and image denoising.
Abstract: The acquisition of cryo-electron microscopy (cryo-EM) data from biological specimens must be tightly coupled to data preprocessing to ensure the best data quality and microscope usage. Here we describe Warp, a software that automates all preprocessing steps of cryo-EM data acquisition and enables real-time evaluation. Warp corrects micrographs for global and local motion, estimates the local defocus and monitors key parameters for each recorded micrograph or tomographic tilt series in real time. The software further includes deep-learning-based models for accurate particle picking and image denoising. The output from Warp can be fed into established programs for particle classification and 3D-map refinement. Our benchmarks show improvement in the nominal resolution, which went from 3.9 Å to 3.2 Å, of a published cryo-EM data set for influenza virus hemagglutinin. Warp is easy to install from http://github.com/cramerlab/warp and computationally inexpensive, and has an intuitive, streamlined user interface. The user-friendly software tool Warp enables automated, on-the-fly preprocessing of cryo-EM data, including motion correction, defocus estimation, particle picking and image denoising.

655 citations


Proceedings ArticleDOI
27 May 2019
TL;DR: A study conducted on observing software teams at Microsoft as they develop AI-based applications finds that various Microsoft teams have united this workflow into preexisting, well-evolved, Agile-like software engineering processes, providing insights about several essential engineering challenges that organizations may face in creating large-scale AI solutions for the marketplace.
Abstract: Recent advances in machine learning have stimulated widespread interest within the Information Technology sector on integrating AI capabilities into software and services. This goal has forced organizations to evolve their development processes. We report on a study that we conducted on observing software teams at Microsoft as they develop AI-based applications. We consider a nine-stage workflow process informed by prior experiences developing AI applications (e.g., search and NLP) and data science tools (e.g. application diagnostics and bug reporting). We found that various Microsoft teams have united this workflow into preexisting, well-evolved, Agile-like software engineering processes, providing insights about several essential engineering challenges that organizations may face in creating large-scale AI solutions for the marketplace. We collected some best practices from Microsoft teams to address these challenges. In addition, we have identified three aspects of the AI domain that make it fundamentally different from prior software application domains: 1) discovering, managing, and versioning the data needed for machine learning applications is much more complex and difficult than other types of software engineering, 2) model customization and model reuse require very different skills than are typically found in software teams, and 3) AI components are more difficult to handle as distinct modules than traditional software components --- models may be "entangled" in complex ways and experience non-monotonic error behavior. We believe that the lessons learned by Microsoft teams will be valuable to other organizations.

Journal ArticleDOI
TL;DR: Details describing the content of database entries are presented to enhance the use of the ICDD's Powder Diffraction File to serve a wide range of disciplines covering academic, industrial, and government laboratories.
Abstract: The ICDD's Powder Diffraction File™ (PDF®) is a database of inorganic and organic diffraction data used for phase identification and materials characterization by powder diffraction. The PDF has been available for over 75 years and finds application in X-ray, synchrotron, electron, and neutron diffraction analyses. With entries based on powder and single crystal data, the PDF is the only crystallographic database where every entry is editorially reviewed and marked with a quality mark that alerts the user to the reliability/quality of the submitted data. The editorial processes of ICDD's quality management system are unique in that they are ISO 9001:2015 certified. Initially offered as text on paper cards and books, the PDF evolved to a computer-readable database in the 1960s and today is both computer and web accessible. With data mining and phase identification software available in PDF products, and the databases’ compatibility with vendor (third party) software, the 1,000,000+ published PDF entries serve a wide range of disciplines covering academic, industrial, and government laboratories. Details describing the content of database entries are presented to enhance the use of the PDF.

Journal ArticleDOI
TL;DR: A literature review on the parameters' influence on the prediction performance and on variable importance measures is provided, and the application of one of the most established tuning strategies, model‐based optimization (MBO), is demonstrated.
Abstract: The random forest algorithm (RF) has several hyperparameters that have to be set by the user, e.g., the number of observations drawn randomly for each tree and whether they are drawn with or without replacement, the number of variables drawn randomly for each split, the splitting rule, the minimum number of samples that a node must contain and the number of trees. In this paper, we first provide a literature review on the parameters' influence on the prediction performance and on variable importance measures. It is well known that in most cases RF works reasonably well with the default values of the hyperparameters specified in software packages. Nevertheless, tuning the hyperparameters can improve the performance of RF. In the second part of this paper, after a brief overview of tuning strategies we demonstrate the application of one of the most established tuning strategies, model-based optimization (MBO). To make it easier to use, we provide the tuneRanger R package that tunes RF with MBO automatically. In a benchmark study on several datasets, we compare the prediction performance and runtime of tuneRanger with other tuning implementations in R and RF with default hyperparameters.
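
tuneRanger is an R package. As a rough Python analogue of the tuning problem it addresses, the sketch below searches over the same kinds of random forest hyperparameters with scikit-learn's randomized search; this stands in for, and is not, the model-based optimization used by tuneRanger.

# Python analogue of the tuning problem described above: search over random
# forest hyperparameters (number of trees, variables per split, node size).
# scikit-learn's randomized search stands in for model-based optimization here.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = load_breast_cancer(return_X_y=True)

param_distributions = {
    "n_estimators": [100, 300, 500],        # number of trees
    "max_features": ["sqrt", "log2", 0.5],  # variables drawn per split
    "min_samples_leaf": [1, 5, 10, 20],     # minimum node size
    "bootstrap": [True, False],             # draw observations with or without replacement
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions,
    n_iter=20,
    cv=5,
    random_state=0,
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))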

Proceedings ArticleDOI
15 Oct 2019
TL;DR: A light weight, standalone and offline software package that does not require any installation or setup and runs solely in a web browser, the VIA software allows human annotators to define and describe spatial regions in images or video frames, and temporal segments in audio or video.
Abstract: In this paper, we introduce a simple and standalone manual annotation tool for images, audio and video: the VGG Image Annotator (VIA). This is a light weight, standalone and offline software package that does not require any installation or setup and runs solely in a web browser. The VIA software allows human annotators to define and describe spatial regions in images or video frames, and temporal segments in audio or video. These manual annotations can be exported to plain text data formats such as JSON and CSV and therefore are amenable to further processing by other software tools. VIA also supports collaborative annotation of a large dataset by a group of human annotators. The BSD open source license of this software allows it to be used in any academic project or commercial application.
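
Because VIA exports plain JSON and CSV, annotations are easy to consume downstream. The sketch below walks a JSON project export; the key names ("filename", "regions", "shape_attributes") are assumptions based on the general layout of VIA 2.x exports and should be checked against a file produced by your own VIA version.

# Sketch of consuming a VIA JSON export downstream. Key names are assumptions
# about the VIA 2.x export layout; verify them against an actual export file.
import json

with open("via_export.json") as fh:          # hypothetical export file
    project = json.load(fh)

for entry in project.values():
    filename = entry.get("filename", "?")
    for region in entry.get("regions", []):
        shape = region.get("shape_attributes", {})   # e.g. rect/polygon geometry
        attrs = region.get("region_attributes", {})  # annotator-defined labels
        print(filename, shape.get("name"), attrs)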

Journal ArticleDOI
TL;DR: The new systematic review software, the Joanna Briggs Institute System for the Unified Management, Assessment and Review of Information (JBI SUMARI), was successfully developed through an iterative process of development, feedback, testing and review.
Abstract: Aim:Systematic reviews play an important role in ensuring trustworthy recommendations in healthcare. However, systematic reviews can be laborious to undertake and as such software has been developed to assist in the conduct and reporting of systematic reviews. The Joanna Briggs Institute and

Journal ArticleDOI
TL;DR: Kluger et al. as mentioned in this paper proposed a heatmap-style visualization for scRNA-seq based on one-dimensional t-distributed stochastic neighbor embedding (t-SNE) for simultaneously visualizing the expression patterns of thousands of genes.
Abstract: t-distributed stochastic neighbor embedding (t-SNE) is widely used for visualizing single-cell RNA-sequencing (scRNA-seq) data, but it scales poorly to large datasets. We dramatically accelerate t-SNE, obviating the need for data downsampling, and hence allowing visualization of rare cell populations. Furthermore, we implement a heatmap-style visualization for scRNA-seq based on one-dimensional t-SNE for simultaneously visualizing the expression patterns of thousands of genes. Software is available at https://github.com/KlugerLab/FIt-SNE and https://github.com/KlugerLab/t-SNE-Heatmaps .
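
For orientation, a typical t-SNE workflow on a cells-by-genes matrix looks like the sketch below. scikit-learn's TSNE is used here purely as a stand-in for FIt-SNE, and random counts stand in for real scRNA-seq data.

# Minimal t-SNE embedding of a cells-by-genes matrix. scikit-learn's TSNE is
# a stand-in for FIt-SNE; random numbers stand in for real expression values.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
expression = rng.poisson(1.0, size=(500, 2000)).astype(float)  # 500 cells x 2000 genes

# Common practice: reduce to ~50 principal components before running t-SNE.
pcs = PCA(n_components=50, random_state=0).fit_transform(np.log1p(expression))
embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(pcs)
print(embedding.shape)   # (500, 2)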

Journal ArticleDOI
01 Sep 2019
TL;DR: This work reviews the latest version of SmartPLS and discusses its various features, and offers researchers with concrete guidance regarding their choice of a PLS-SEM software that fits their analytical needs.
Abstract: In their effort to better understand consumer behavior, marketing researchers often analyze relationships between latent variables, measured by sets of observed variables. Partial least squares structural equation modeling (PLS-SEM) has become a popular tool for analyzing such relationships. Particularly the availability of SmartPLS, a comprehensive software program with an intuitive graphical user interface, helped popularize the method. We review the latest version of SmartPLS and discuss its various features. Our aim is to offer researchers with concrete guidance regarding their choice of a PLS-SEM software that fits their analytical needs.

Journal ArticleDOI
TL;DR: The pLink 2 as discussed by the authors is a search engine with higher speed and reliability for proteome-scale identification of cross-linked peptides, with a two-stage open search strategy facilitated by fragment indexing.
Abstract: We describe pLink 2, a search engine with higher speed and reliability for proteome-scale identification of cross-linked peptides. With a two-stage open search strategy facilitated by fragment indexing, pLink 2 is ~40 times faster than pLink 1 and 3-10 times faster than Kojak. Furthermore, using simulated datasets, synthetic datasets, 15N metabolically labeled datasets, and entrapment databases, four analysis methods were designed to evaluate the credibility of ten state-of-the-art search engines. This systematic evaluation shows that pLink 2 outperforms these methods in precision and sensitivity, especially at proteome scales. Lastly, re-analysis of four published proteome-scale cross-linking datasets with pLink 2 required only a fraction of the time used by pLink 1, with up to 27% more cross-linked residue pairs identified. pLink 2 is therefore an efficient and reliable tool for cross-linking mass spectrometry analysis, and the systematic evaluation methods described here will be useful for future software development. The identification of cross-linked peptides at a proteome scale for interactome analyses represents a complex challenge. Here the authors report an efficient and reliable search engine pLink 2 for proteome-scale cross-linking mass spectrometry analyses, and demonstrate how to systematically evaluate the credibility of search engines.
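
The speed gain comes from fragment indexing in the first stage of the open search. The toy sketch below shows the general idea of a fragment index (binned fragment masses mapping to candidate peptides); it is an illustration only, not pLink 2's actual data structure or tolerances.

# Toy illustration of fragment indexing: bin theoretical fragment masses and
# look up candidate peptides for an observed spectrum peak in O(1).
from collections import defaultdict

BIN_WIDTH = 0.02  # Da; illustrative bin size

def build_fragment_index(peptide_fragments):
    """peptide_fragments: dict peptide_id -> list of theoretical fragment masses."""
    index = defaultdict(set)
    for pep_id, masses in peptide_fragments.items():
        for m in masses:
            index[round(m / BIN_WIDTH)].add(pep_id)
    return index

def candidates_for_peak(index, observed_mass):
    """Return peptide ids whose theoretical fragments fall near an observed peak."""
    b = round(observed_mass / BIN_WIDTH)
    return index[b - 1] | index[b] | index[b + 1]

frags = {"PEPTIDEA": [147.11, 244.17, 359.19], "PEPTIDEB": [147.11, 276.15]}
index = build_fragment_index(frags)
print(candidates_for_peak(index, 147.12))   # both peptides share this fragment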

Journal ArticleDOI
TL;DR: yambo as mentioned in this paper is an open source project aimed at studying excited state properties of condensed matter systems from first principles using many-body methods using ground state electronic structure data as computed by density functional theory codes such as Quantum ESPRESSO and Abinit.
Abstract: yambo is an open source project aimed at studying excited state properties of condensed matter systems from first principles using many-body methods. As input, yambo requires ground state electronic structure data as computed by density functional theory codes such as Quantum ESPRESSO and Abinit. yambo's capabilities include the calculation of linear response quantities (both independent-particle and including electron-hole interactions), quasi-particle corrections based on the GW formalism, optical absorption, and other spectroscopic quantities. Here we describe recent developments ranging from the inclusion of important but oft-neglected physical effects such as electron-phonon interactions to the implementation of a real-time propagation scheme for simulating linear and non-linear optical properties. Improvements to numerical algorithms and the user interface are outlined. Particular emphasis is given to the new and efficient parallel structure that makes it possible to exploit modern high performance computing architectures. Finally, we demonstrate the possibility to automate workflows by interfacing with the yambopy and AiiDA software tools.

Journal ArticleDOI
TL;DR: The experimental results indicated that the predictors developed by the BioSeq-Analysis2.0 can achieve comparable or even better performance than the existing state-of-the-art predictors.
Abstract: As the first web server to analyze various biological sequences at sequence level based on machine learning approaches, many powerful predictors in the field of computational biology have been developed with the assistance of the BioSeq-Analysis. However, the BioSeq-Analysis can be only applied to the sequence-level analysis tasks, preventing its applications to the residue-level analysis tasks, and an intelligent tool that is able to automatically generate various predictors for biological sequence analysis at both residue level and sequence level is highly desired. In this regard, we decided to publish an important updated server covering a total of 26 features at the residue level and 90 features at the sequence level called BioSeq-Analysis2.0 (http://bliulab.net/BioSeq-Analysis2.0/), by which the users only need to upload the benchmark dataset, and the BioSeq-Analysis2.0 can generate the predictors for both residue-level analysis and sequence-level analysis tasks. Furthermore, the corresponding stand-alone tool was also provided, which can be downloaded from http://bliulab.net/BioSeq-Analysis2.0/download/. To the best of our knowledge, the BioSeq-Analysis2.0 is the first tool for generating predictors for biological sequence analysis tasks at residue level. Specifically, the experimental results indicated that the predictors developed by BioSeq-Analysis2.0 can achieve comparable or even better performance than the existing state-of-the-art predictors.
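
Sequence-level feature modes in such servers are typically compositions such as k-mer frequencies. The sketch below computes normalized k-mer frequencies for a DNA sequence as a generic illustration; it is not the BioSeq-Analysis2.0 implementation.

# Generic sequence-level feature extraction: normalized k-mer frequencies for
# a DNA sequence, the kind of feature mode such servers offer.
from collections import Counter
from itertools import product

def kmer_features(seq: str, k: int = 2) -> dict:
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = max(sum(counts.values()), 1)
    alphabet = "ACGT"
    return {"".join(p): counts["".join(p)] / total for p in product(alphabet, repeat=k)}

features = kmer_features("ACGTACGTGGCC", k=2)
print(features["AC"], features["GG"])   # relative frequencies of two dinucleotides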

Journal ArticleDOI
TL;DR: This work generated a realistic benchmark experiment that included single cells and admixtures of cells or RNA to create ‘pseudo cells’ from up to five distinct cancer cell lines and provided a comprehensive framework for benchmarking most common scRNA-seq analysis steps.
Abstract: Single cell RNA-sequencing (scRNA-seq) technology has undergone rapid development in recent years, leading to an explosion in the number of tailored data analysis methods. However, the current lack of gold-standard benchmark datasets makes it difficult for researchers to systematically compare the performance of the many methods available. Here, we generated a realistic benchmark experiment that included single cells and admixtures of cells or RNA to create 'pseudo cells' from up to five distinct cancer cell lines. In total, 14 datasets were generated using both droplet and plate-based scRNA-seq protocols. We compared 3,913 combinations of data analysis methods for tasks ranging from normalization and imputation to clustering, trajectory analysis and data integration. Evaluation revealed pipelines suited to different types of data for different tasks. Our data and analysis provide a comprehensive framework for benchmarking most common scRNA-seq analysis steps.

Journal ArticleDOI
TL;DR: This study reports results from the second community-wide single-molecule localization microscopy software challenge, which tested over 30 software packages on realistic simulated data for multiple popular 3D image acquisition modes, as well as 2D localization microscopes.
Abstract: With the widespread uptake of two-dimensional (2D) and three-dimensional (3D) single-molecule localization microscopy (SMLM), a large set of different data analysis packages have been developed to generate super-resolution images. In a large community effort, we designed a competition to extensively characterize and rank the performance of 2D and 3D SMLM software packages. We generated realistic simulated datasets for popular imaging modalities (2D, astigmatic 3D, biplane 3D and double-helix 3D) and evaluated 36 participant packages against these data. This provides the first broad assessment of 3D SMLM software and provides a holistic view of how the latest 2D and 3D SMLM packages perform in realistic conditions. This resource allows researchers to identify optimal analytical software for their experiments, allows 3D SMLM software developers to benchmark new software against the current state of the art, and provides insight into the current limits of the field.

Journal ArticleDOI
TL;DR: GDS format provides efficient storage and retrieval of genotypes measured by microarrays and sequencing, and GENESIS implements highly flexible mixed models, allowing for different link functions, multiple variance components, and phenotypic heteroskedasticity.
Abstract: Summary: The Genomic Data Structure (GDS) format provides efficient storage and retrieval of genotypes measured by microarrays and sequencing. We developed GENESIS to perform various single- and aggregate-variant association tests using genotype data stored in GDS format. GENESIS implements highly flexible mixed models, allowing for different link functions, multiple variance components and phenotypic heteroskedasticity. GENESIS integrates cohesively with other R/Bioconductor packages to build a complete genomic analysis workflow entirely within the R environment. Availability and implementation: https://bioconductor.org/packages/GENESIS; vignettes included. Supplementary information: Supplementary data are available at Bioinformatics online.

Journal ArticleDOI
TL;DR: ROAST is released as an open-source, easy-to-install and fully-automated pipeline for individualized TES modeling and its performance with commercial FEM software, and SimNIBS, a well-established open- source modeling pipeline is compared.
Abstract: Objective: Research in the area of transcranial electrical stimulation (TES) often relies on computational models of current flow in the brain. Models are built based on magnetic resonance images (MRI) of the human head to capture detailed individual anatomy. To simulate current flow on an individual, the subject's MRI is segmented, virtual electrodes are placed on this anatomical model, the volume is tessellated into a mesh, and a finite element model (FEM) is solved numerically to estimate the current flow. Various software tools are available for each of these steps, as well as processing pipelines that connect these tools for automated or semi-automated processing. The goal of the present tool, a realistic volumetric approach to simulate transcranial electric stimulation (ROAST), is to provide an end-to-end pipeline that can automatically process individual heads with realistic volumetric anatomy, leveraging open-source software and custom scripts to improve segmentation and execute electrode placement. Approach: ROAST combines the segmentation algorithm of SPM12, a Matlab script for touch-up and automatic electrode placement, the finite element mesher iso2mesh and the solver getDP. We compared its performance with commercial FEM software, and SimNIBS, a well-established open-source modeling pipeline. Main results: The electric fields estimated with ROAST differ little from the results obtained with commercial meshing and FEM solving software. We also do not find large differences between the various automated segmentation methods used by ROAST and SimNIBS. We do find bigger differences when volumetric segmentations are converted into surfaces in SimNIBS. However, evaluation on intracranial recordings from human subjects suggests that ROAST and SimNIBS are not significantly different in predicting field distribution, provided that users have detailed knowledge of SimNIBS. Significance: We hope that the detailed comparisons presented here of various choices in this modeling pipeline can provide guidance for future tool development. We released ROAST as an open-source, easy-to-install and fully-automated pipeline for individualized TES modeling.

Book ChapterDOI
01 Jan 2019
TL;DR: This chapter introduces the TIRA Integrated Research Architecture, its design requirements, its workflows from both the participants’ and the organizers’ perspectives, alongside a report on user experience and usage scenarios.
Abstract: Data and software are immaterial. Scientists in computer science hence have the unique chance to let other scientists easily reproduce their findings. Similarly, and with the same ease, the organization of shared tasks, i.e., the collaborative search for new algorithms given a predefined problem, is possible. Experience shows that the potential of reproducibility is hardly tapped in either case. Based on this observation, and driven by the ambitious goal to find the best solutions for certain problems in our research field, we have been developing the TIRA Integrated Research Architecture. Within TIRA, the reproducibility requirement got top priority right from the start. This chapter introduces the platform, its design requirements, its workflows from both the participants’ and the organizers’ perspectives, alongside a report on user experience and usage scenarios.

Proceedings ArticleDOI
TL;DR: The VGG Image Annotator (VIA) as discussed by the authors is a simple and standalone manual annotation tool for images, audio and video that allows human annotators to define and describe spatial regions in images or video frames and temporal segments in audio or video.
Abstract: In this paper, we introduce a simple and standalone manual annotation tool for images, audio and video: the VGG Image Annotator (VIA). This is a light weight, standalone and offline software package that does not require any installation or setup and runs solely in a web browser. The VIA software allows human annotators to define and describe spatial regions in images or video frames, and temporal segments in audio or video. These manual annotations can be exported to plain text data formats such as JSON and CSV and therefore are amenable to further processing by other software tools. VIA also supports collaborative annotation of a large dataset by a group of human annotators. The BSD open source license of this software allows it to be used in any academic project or commercial application.

Journal ArticleDOI
TL;DR: This work consists of explaining the reasons behind the new rise of AR and VR and why their actual adoption in education will be a reality in a near fu-ture.
Abstract: Augmented Reality and Virtual Reality are not new technologies. But several constraints prevented their actual adoption. Recent technological progress, added to the proliferation of affordable hardware and software, has made AR and VR more viable and desirable in many domains, including education; they have been relaunched with new promises previously unimaginable. The nature of AR and VR promises new teaching and learning models that better meet the needs of the 21st century learner. We’re now on a path to re-invent education. This work consists of explaining the reasons behind the new rise of AR and VR and why their actual adoption in education will be a reality in the near future.

Posted Content
TL;DR: MLPerf as discussed by the authors is an ML benchmark that overcomes three unique benchmarking challenges absent from other domains: optimizations that improve training throughput can increase the time to solution, training is stochastic and time-to-solution exhibits high variance, and software and hardware systems are so diverse that fair benchmarking with the same binary, code, and even hyperparameters is difficult.
Abstract: Machine learning (ML) needs industry-standard performance benchmarks to support design and competitive evaluation of the many emerging software and hardware solutions for ML. But ML training presents three unique benchmarking challenges absent from other domains: optimizations that improve training throughput can increase the time to solution, training is stochastic and time to solution exhibits high variance, and software and hardware systems are so diverse that fair benchmarking with the same binary, code, and even hyperparameters is difficult. We therefore present MLPerf, an ML benchmark that overcomes these challenges. Our analysis quantitatively evaluates MLPerf's efficacy at driving performance and scalability improvements across two rounds of results from multiple vendors.

Journal ArticleDOI
TL;DR: The EVcouplings framework is presented, a fully integrated open-source application and Python package for coevolutionary analysis that enables generation of sequence alignments, calculation and evaluation of evolutionary couplings, and de novo prediction of structure and mutation effects.
Abstract: Summary: Coevolutionary sequence analysis has become a commonly used technique for de novo prediction of the structure and function of proteins, RNA, and protein complexes. We present the EVcouplings framework, a fully integrated open-source application and Python package for coevolutionary analysis. The framework enables generation of sequence alignments, calculation and evaluation of evolutionary couplings (ECs), and de novo prediction of structure and mutation effects. The combination of an easy to use, flexible command line interface and an underlying modular Python package makes the full power of coevolutionary analyses available to entry-level and advanced users. Availability and implementation: https://github.com/debbiemarkslab/evcouplings.
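
Evolutionary couplings quantify covariation between alignment columns. As a toy illustration of that idea only, the sketch below scores two columns of a small alignment by mutual information; EVcouplings itself uses much more sophisticated global models (e.g. pseudo-likelihood maximization).

# Toy covariation score between two columns of a multiple sequence alignment:
# mutual information. This only illustrates the idea of an evolutionary coupling.
import math
from collections import Counter

msa = ["ACDEK", "ACDER", "ASDGK", "ASDGR", "ACDEK"]  # toy alignment, 5 sequences

def mutual_information(msa, i, j):
    n = len(msa)
    pi = Counter(s[i] for s in msa)
    pj = Counter(s[j] for s in msa)
    pij = Counter((s[i], s[j]) for s in msa)
    mi = 0.0
    for (a, b), c in pij.items():
        p_ab = c / n
        mi += p_ab * math.log(p_ab / ((pi[a] / n) * (pj[b] / n)))
    return mi

print(round(mutual_information(msa, 1, 3), 3))   # columns 1 and 3 covary strongly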

Book ChapterDOI
15 Jul 2019
TL;DR: VerifAI particularly addresses challenges with applying formal methods to ML components such as perception systems based on deep neural networks, as well as systems containing them, and to model and analyze system behavior in the presence of environment uncertainty.
Abstract: We present VerifAI, a software toolkit for the formal design and analysis of systems that include artificial intelligence (AI) and machine learning (ML) components. VerifAI particularly addresses challenges with applying formal methods to ML components such as perception systems based on deep neural networks, as well as systems containing them, and to model and analyze system behavior in the presence of environment uncertainty. We describe the initial version of VerifAI, which centers on simulation-based verification and synthesis, guided by formal models and specifications. We give examples of several use cases, including temporal-logic falsification, model-based systematic fuzz testing, parameter synthesis, counterexample analysis, and data set augmentation.
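
Simulation-based falsification, as described above, amounts to sampling environment parameters, running the system in simulation, and checking a specification. The toy loop below illustrates that pattern with an invented simulator and property; it is not the VerifAI API.

# Toy simulation-based falsification loop in the spirit described above:
# sample environment parameters, simulate, check a specification, and record
# counterexamples.
import random

def simulate(initial_gap, lead_braking):
    """Hypothetical simulator: returns the final (minimum) gap to a braking lead car."""
    gap = initial_gap
    speed_diff = 5.0
    for _ in range(50):
        speed_diff += lead_braking * 0.1      # lead car keeps decelerating
        gap -= speed_diff * 0.1               # gap shrinks each time step
    return gap

def specification(min_gap):
    return min_gap > 0.0                      # "never collide"

random.seed(0)
counterexamples = []
for _ in range(1000):
    params = (random.uniform(5.0, 50.0), random.uniform(0.0, 3.0))
    if not specification(simulate(*params)):
        counterexamples.append(params)

print(f"{len(counterexamples)} falsifying environments found")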