
Showing papers in "Nature Machine Intelligence in 2019"


Journal ArticleDOI
Cynthia Rudin
TL;DR: This Perspective clarifies the chasm between explaining black boxes and using inherently interpretable models, outlines several key reasons why explainable black boxes should be avoided in high-stakes decisions, identifies challenges to interpretable machine learning, and provides several example applications where interpretable models could potentially replace black box models in criminal justice, healthcare and computer vision.
Abstract: Black box machine learning models are currently being used for high-stakes decision making throughout society, causing problems in healthcare, criminal justice and other domains. Some people hope that creating methods for explaining these black box models will alleviate some of the problems, but trying to explain black box models, rather than creating models that are interpretable in the first place, is likely to perpetuate bad practice and can potentially cause great harm to society. The way forward is to design models that are inherently interpretable. This Perspective clarifies the chasm between explaining black boxes and using inherently interpretable models, outlines several key reasons why explainable black boxes should be avoided in high-stakes decisions, identifies challenges to interpretable machine learning, and provides several example applications where interpretable models could potentially replace black box models in criminal justice, healthcare and computer vision. There has been a recent rise of interest in developing methods for ‘explainable AI’, where models are created to explain how a first ‘black box’ machine learning model arrives at a specific decision. It can be argued that instead efforts should be directed at building inherently interpretable models in the first place, in particular where they are deployed in domains that directly affect human lives, such as healthcare and criminal justice.

3,609 citations


Journal ArticleDOI
TL;DR: In this article, the authors map and analyse the current corpus of principles and guidelines on ethical AI, revealing a global convergence emerging around five ethical principles (transparency, justice and fairness, non-maleficence, responsibility and privacy), with substantive divergence in relation to how these principles are interpreted, why they are deemed important, what issue, domain or actors they pertain to, and how they should be implemented.
Abstract: In the past five years, private companies, research institutions and public sector organizations have issued principles and guidelines for ethical artificial intelligence (AI). However, despite an apparent agreement that AI should be ‘ethical’, there is debate about both what constitutes ‘ethical AI’ and which ethical requirements, technical standards and best practices are needed for its realization. To investigate whether a global agreement on these questions is emerging, we mapped and analysed the current corpus of principles and guidelines on ethical AI. Our results reveal a global convergence emerging around five ethical principles (transparency, justice and fairness, non-maleficence, responsibility and privacy), with substantive divergence in relation to how these principles are interpreted, why they are deemed important, what issue, domain or actors they pertain to, and how they should be implemented. Our findings highlight the importance of integrating guideline-development efforts with substantive ethical analysis and adequate implementation strategies.

1,419 citations


Journal ArticleDOI
TL;DR: This Review looks at several key aspects of modern neuroevolution, including large-scale computing, the benefits of novelty and diversity, the power of indirect encoding, and the field’s contributions to meta-learning and architecture search.
Abstract: Much of recent machine learning has focused on deep learning, in which neural network weights are trained through variants of stochastic gradient descent. An alternative approach comes from the field of neuroevolution, which harnesses evolutionary algorithms to optimize neural networks, inspired by the fact that natural brains themselves are the products of an evolutionary process. Neuroevolution enables important capabilities that are typically unavailable to gradient-based approaches, including learning neural network building blocks (for example activation functions), hyperparameters, architectures and even the algorithms for learning themselves. Neuroevolution also differs from deep learning (and deep reinforcement learning) by maintaining a population of solutions during search, enabling extreme exploration and massive parallelization. Finally, because neuroevolution research has (until recently) developed largely in isolation from gradient-based neural network research, it has developed many unique and effective techniques that should prove useful in other machine learning areas too. This Review looks at several key aspects of modern neuroevolution, including large-scale computing, the benefits of novelty and diversity, the power of indirect encoding, and the field’s contributions to meta-learning and architecture search. Our hope is to inspire renewed interest in the field as it meets the potential of the increasing computation available today, to highlight how many of its ideas can provide an exciting resource for inspiration and hybridization to the deep learning, deep reinforcement learning and machine learning communities, and to explain how neuroevolution could prove to be a critical tool in the long-term pursuit of artificial general intelligence. Deep neural networks have become very successful at certain machine learning tasks partly due to the widely adopted method of training called backpropagation. An alternative way to optimize neural networks is by using evolutionary algorithms, which, fuelled by the increase in computing power, offer a new range of capabilities and modes of learning.
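
As a concrete illustration of the population-based search the Review describes, here is a minimal neuroevolution sketch: a fixed-topology network whose weights are evolved purely by mutation and truncation selection on a toy regression task. All sizes and hyperparameters are arbitrary illustrations, not taken from the Review.

    import numpy as np

    rng = np.random.default_rng(0)

    def forward(weights, x):
        w1, w2 = weights
        return np.tanh(np.tanh(x @ w1) @ w2)

    def fitness(weights, x, y):
        return -np.mean((forward(weights, x) - y) ** 2)  # higher is better

    x = rng.normal(size=(64, 4))
    y = np.sin(x.sum(axis=1, keepdims=True))             # toy target function

    population = [(rng.normal(scale=0.5, size=(4, 8)),
                   rng.normal(scale=0.5, size=(8, 1))) for _ in range(50)]

    for generation in range(100):
        scored = sorted(population, key=lambda w: fitness(w, x, y), reverse=True)
        parents = scored[:10]                            # truncation selection
        population = [tuple(w + rng.normal(scale=0.05, size=w.shape)
                            for w in parents[i % 10])    # mutate copies of parents
                      for i in range(50)]

    print("best fitness:", fitness(scored[0], x, y))

Because each candidate is evaluated independently, the fitness loop parallelizes trivially, which is the property the Review highlights.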

463 citations


Journal ArticleDOI
TL;DR: Brent Mittelstadt highlights important differences between medical practice and AI development that suggest a principled approach, modelled on medical ethics, may not work in the case of AI.
Abstract: Artificial intelligence (AI) ethics is now a global topic of discussion in academic and policy circles. At least 84 public–private initiatives have produced statements describing high-level principles, values and other tenets to guide the ethical development, deployment and governance of AI. According to recent meta-analyses, AI ethics has seemingly converged on a set of principles that closely resemble the four classic principles of medical ethics. Despite the initial credibility granted to a principled approach to AI ethics by the connection to principles in medical ethics, there are reasons to be concerned about its future impact on AI development and governance. Significant differences exist between medicine and AI development that suggest a principled approach for the latter may not enjoy success comparable to the former. Compared to medicine, AI development lacks (1) common aims and fiduciary duties, (2) professional history and norms, (3) proven methods to translate principles into practice, and (4) robust legal and professional accountability mechanisms. These differences suggest we should not yet celebrate consensus around high-level principles that hide deep political and normative disagreement. AI ethics initiatives have seemingly converged on a set of principles that closely resemble the four classic principles of medical ethics. Despite this, Brent Mittelstadt highlights important differences between medical practice and AI development that suggest a principled approach may not work in the case of AI.

381 citations


Journal ArticleDOI
TL;DR: It is time to develop methods for systematically quantifying uncertainty underlying deep learning processes, which would lead to increased confidence in practical applicability of these approaches.
Abstract: Medicine, even from the earliest days of artificial intelligence (AI) research, has been one of the most inspiring and promising domains for the application of AI-based approaches. Equally, it has been one of the more challenging areas in which to achieve effective adoption. There are many reasons for this, primarily the reluctance to delegate decision making to machine intelligence in cases where patient safety is at stake. To address some of these challenges, medical AI, especially in its modern data-rich deep learning guise, needs to develop a principled and formal uncertainty quantification (UQ) discipline, just as we have seen in fields such as nuclear stockpile stewardship and risk management. The data-rich world of AI-based learning and the frequent absence of a well-understood underlying theory pose their own unique challenges to straightforward adoption of UQ. These challenges, while not trivial, also present significant new research opportunities for the development of new theoretical approaches, and for the practical applications of UQ in the area of machine-assisted medical decision making. Understanding prediction system structure and defensibly quantifying uncertainty is possible, and, if done, can significantly benefit both research and practical applications of AI in this critical domain. Arguably one of the most promising as well as critical applications of deep learning is in supporting medical sciences and decision making. It is time to develop methods for systematically quantifying uncertainty underlying deep learning processes, which would lead to increased confidence in the practical applicability of these approaches.
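
For readers unfamiliar with UQ in deep learning, one widely used (and deliberately generic) technique is Monte Carlo dropout, sketched below in PyTorch; the Perspective surveys UQ broadly and does not prescribe this particular method. The model and sizes here are hypothetical.

    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Dropout(p=0.2),
                          nn.Linear(64, 1))

    def mc_dropout_predict(model, x, n_samples=100):
        model.train()                  # keep dropout active at inference time
        with torch.no_grad():
            samples = torch.stack([model(x) for _ in range(n_samples)])
        # Spread across stochastic passes serves as a predictive uncertainty proxy.
        return samples.mean(dim=0), samples.std(dim=0)

    x = torch.randn(5, 10)
    mean, std = mc_dropout_predict(model, x)
    print(mean.squeeze(), std.squeeze())   # a wide std flags low-confidence predictions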

272 citations


Journal ArticleDOI
TL;DR: In this article, a modularized neural network for low-dose CT (LDCT) was proposed and compared with commercial iterative reconstruction methods from three leading CT vendors, and the learned workflow allows radiologists-in-the-loop to optimize the denoising depth in a task-specific fashion.
Abstract: Commercial iterative reconstruction techniques help to reduce CT radiation dose but altered image appearance and artifacts limit their adoptability and potential use. Deep learning has been investigated for low-dose CT (LDCT). Here we design a modularized neural network for LDCT and compare it with commercial iterative reconstruction methods from three leading CT vendors. While popular networks are trained for an end-to-end mapping, our network performs an end-to-process mapping so that intermediate denoised images are obtained with associated noise reduction directions towards a final denoised image. The learned workflow allows radiologists-in-the-loop to optimize the denoising depth in a task-specific fashion. Our network was trained with the Mayo LDCT Dataset, and tested on separate chest and abdominal CT exams from Massachusetts General Hospital. The best deep learning reconstructions were systematically compared to the best iterative reconstructions in a double-blinded reader study. This study confirms that our deep learning approach performed either favorably or comparably in terms of noise suppression and structural fidelity, and is much faster than the commercial iterative reconstruction algorithms.
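
The end-to-process idea can be pictured as a chain of residual denoising modules whose intermediate outputs are all retained. The PyTorch sketch below is a schematic stand-in: the module design, channel count and depth are illustrative assumptions, not the paper's exact architecture.

    import torch
    import torch.nn as nn

    class DenoiseModule(nn.Module):
        def __init__(self, channels=32):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(),
                nn.Conv2d(channels, 1, 3, padding=1))
        def forward(self, x):
            return x - self.net(x)          # residual: subtract estimated noise

    modules = nn.ModuleList([DenoiseModule() for _ in range(5)])

    def denoise(ldct_image, depth):
        """Return the intermediate result after `depth` modules, so a
        radiologist-in-the-loop can pick a task-specific denoising depth."""
        x = ldct_image
        for module in modules[:depth]:
            x = module(x)
        return x

    scan = torch.randn(1, 1, 64, 64)        # stand-in for an LDCT slice
    lightly, heavily = denoise(scan, 1), denoise(scan, 5)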

265 citations


Journal ArticleDOI
TL;DR: It is demonstrated experimentally that the synaptic weights shared in different time steps in an LSTM can be implemented with a memristor crossbar array, which has a small circuit footprint, can store a large number of parameters and offers in-memory computing capability that contributes to circumventing the ‘von Neumann bottleneck’.
Abstract: Recent breakthroughs in recurrent deep neural networks with long short-term memory (LSTM) units have led to major advances in artificial intelligence. However, state-of-the-art LSTM models with significantly increased complexity and a large number of parameters have a bottleneck in computing power resulting from both limited memory capacity and limited data communication bandwidth. Here we demonstrate experimentally that the synaptic weights shared in different time steps in an LSTM can be implemented with a memristor crossbar array, which has a small circuit footprint, can store a large number of parameters and offers in-memory computing capability that contributes to circumventing the ‘von Neumann bottleneck’. We illustrate the capability of our crossbar system as a core component in solving real-world problems in regression and classification, which shows that memristor LSTM is a promising low-power and low-latency hardware platform for edge inference. Deep neural networks are increasingly popular in data-intensive applications, but are power-hungry. New types of computer chips that are suited to the task of deep learning, such as memristor arrays where data handling and computing take place within the same unit, are required. A well-used deep learning model called long short-term memory, which can handle temporal sequential data analysis, is now implemented in a memristor crossbar array, promising an energy-efficient and low-footprint deep learning platform.
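
The in-memory computing argument rests on a crossbar performing a matrix-vector product in a single analogue step: row voltages V and stored conductances G yield column currents I_j = Σ_i V_i G_ij by Ohm's and Kirchhoff's laws. The NumPy toy below mimics this, splitting signed weights across two conductance arrays (a common scheme); device non-idealities are ignored.

    import numpy as np

    rng = np.random.default_rng(1)
    W = rng.normal(size=(8, 4))                    # a shared LSTM weight block
    # Signed weights are mapped onto two non-negative conductance arrays.
    G_pos, G_neg = np.clip(W, 0, None), np.clip(-W, 0, None)

    def crossbar_matvec(v):
        """Column currents I_j = sum_i V_i * G_ij, computed where the weights live."""
        return v @ G_pos - v @ G_neg

    x = rng.normal(size=8)
    assert np.allclose(crossbar_matvec(x), x @ W)  # matches the digital matmul

Because the same conductances are read at every time step, the recurrent weights never travel between memory and processor, which is how the design sidesteps the von Neumann bottleneck.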

251 citations


Journal ArticleDOI
TL;DR: A novel pathology whole-slide diagnosis method, powered by artificial intelligence, to address the lack of interpretable diagnosis, which provides an innovative and reliable means for making diagnostic suggestions and can be deployed at low cost as next-generation, artificial intelligence-enhanced CAD technology for use in diagnostic pathology.
Abstract: Diagnostic pathology is the foundation and gold standard for identifying carcinomas. However, high inter-observer variability substantially affects productivity in routine pathology and is especially ubiquitous in diagnostician-deficient medical centres. Despite rapid growth in computer-aided diagnosis (CAD), the application of whole-slide pathology diagnosis remains impractical. Here, we present a novel pathology whole-slide diagnosis method, powered by artificial intelligence, to address the lack of interpretable diagnosis. The proposed method masters the ability to automate the human-like diagnostic reasoning process and translate gigapixels directly to a series of interpretable predictions, providing second opinions and thereby encouraging consensus in clinics. Moreover, using 913 collected examples of whole-slide data representing patients with bladder cancer, we show that our method matches the performance of 17 pathologists in the diagnosis of urothelial carcinoma. We believe that our method provides an innovative and reliable means for making diagnostic suggestions and can be deployed at low cost as next-generation, artificial intelligence-enhanced CAD technology for use in diagnostic pathology. Diagnostic pathology currently requires substantial human expertise, often with high inter-observer variability. A whole-slide pathology method automates the prediction process and provides computer-aided diagnosis using artificial intelligence.

197 citations


Journal ArticleDOI
TL;DR: A fully convolutional neural network is used to create time-resolved three-dimensional dense segmentations of heart images, from which a denoising autoencoder learns motion representations that efficiently predict human survival.
Abstract: Motion analysis is used in computer vision to understand the behaviour of moving objects in sequences of images. Optimising the interpretation of dynamic biological systems requires accurate and precise motion tracking as well as efficient representations of high-dimensional motion trajectories so that these can be used for prediction tasks. Here we use image sequences of the heart, acquired using cardiac magnetic resonance imaging, to create time-resolved three-dimensional segmentations using a fully convolutional network trained on anatomical shape priors. This dense motion model formed the input to a supervised denoising autoencoder (4Dsurvival), which is a hybrid network consisting of an autoencoder that learns a task-specific latent code representation trained on observed outcome data, yielding a latent representation optimised for survival prediction. To handle right-censored survival outcomes, our network used a Cox partial likelihood loss function. In a study of 302 patients, the predictive accuracy (quantified by Harrell's C-index) was significantly higher (P = 0.0012) for our model (C = 0.75; 95% CI: 0.70-0.79) than for the human benchmark (C = 0.59; 95% CI: 0.53-0.65). This work demonstrates how a complex computer vision task using high-dimensional medical image data can efficiently predict human survival.
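
For reference, the Cox partial likelihood loss named above takes the standard form, where h_θ(x_i) is the network's risk score for patient i, δ_i indicates an observed (uncensored) event and R(t_i) is the set of patients still at risk at time t_i:

\[ \ell(\theta) = -\sum_{i:\,\delta_i=1}\Bigl[ h_\theta(x_i) - \log\!\sum_{j\in R(t_i)} \exp\bigl(h_\theta(x_j)\bigr) \Bigr] \]

Censored patients never contribute their own event term but do appear in the risk sets, which is how the model trains on right-censored outcomes.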

189 citations


Journal ArticleDOI
TL;DR: The European Commission's report "Ethics guidelines for trustworthy AI" provides a clear benchmark to evaluate the responsible development of AI systems, and facilitates international support for AI solutions that are good for humanity and the environment, says Luciano Floridi.
Abstract: The European Commission’s report ‘Ethics guidelines for trustworthy AI’ provides a clear benchmark to evaluate the responsible development of AI systems, and facilitates international support for AI solutions that are good for humanity and the environment, says Luciano Floridi.

176 citations


Journal ArticleDOI
TL;DR: This Review describes state-of-the-art work on RL in biological and artificial agents and focuses on points of contact between these disciplines and identifies areas where future research can benefit from information flow between these fields.
Abstract: There is and has been a fruitful flow of concepts and ideas between studies of learning in biological and artificial systems. Much early work that led to the development of reinforcement learning (RL) algorithms for artificial systems was inspired by learning rules first developed in biology by Bush and Mosteller, and Rescorla and Wagner. More recently, temporal-difference RL, developed for learning in artificial agents, has provided a foundational framework for interpreting the activity of dopamine neurons. In this Review, we describe state-of-the-art work on RL in biological and artificial agents. We focus on points of contact between these disciplines and identify areas where future research can benefit from information flow between these fields. Most work in biological systems has focused on simple learning problems, often embedded in dynamic environments where flexibility and ongoing learning are important, similar to real-world learning problems faced by biological systems. In contrast, most work in artificial agents has focused on learning a single complex problem in a static environment. Moving forward, work in each field will benefit from a flow of ideas that represent the strengths within each discipline. Research on reinforcement learning in artificial agents focuses on a single complex problem within a static environment. In biological agents, research focuses on simple learning problems embedded in flexible, dynamic environments. The authors review the literature on these topics and suggest areas of synergy between them.
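
The temporal-difference framework mentioned above is compact enough to state here. In TD(0) learning, the value estimate V is updated by a reward prediction error δ_t:

\[ \delta_t = r_{t+1} + \gamma V(s_{t+1}) - V(s_t), \qquad V(s_t) \leftarrow V(s_t) + \alpha\,\delta_t \]

It is this prediction-error signal δ_t that phasic dopamine activity is widely interpreted as reporting.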

Journal ArticleDOI
TL;DR: A new framework for efficient recovery of image quality from sparse optoacoustic data based on a deep convolutional neural network is proposed and its performance with whole body mouse imaging in vivo is demonstrated.
Abstract: The rapidly evolving field of optoacoustic (photoacoustic) imaging and tomography is driven by a constant need for better imaging performance in terms of resolution, speed, sensitivity, depth and contrast. In practice, data acquisition strategies commonly involve sub-optimal sampling of the tomographic data, resulting in inevitable performance trade-offs and diminished image quality. We propose a new framework for efficient recovery of image quality from sparse optoacoustic data based on a deep convolutional neural network and demonstrate its performance with whole body mouse imaging in vivo. To generate accurate high-resolution reference images for optimal training, a full-view tomographic scanner capable of attaining superior cross-sectional image quality from living mice was devised. When provided with images reconstructed from substantially undersampled data or limited-view scans, the trained network was capable of enhancing the visibility of arbitrarily oriented structures and restoring the expected image quality. Notably, the network also eliminated some reconstruction artefacts present in reference images rendered from densely sampled data. No comparable gains were achieved when the training was performed with synthetic or phantom data, underlining the importance of training with high-quality in vivo images acquired by full-view scanners. The new method can benefit numerous optoacoustic imaging applications by mitigating common image artefacts, enhancing anatomical contrast and image quantification capacities, accelerating data acquisition and image reconstruction approaches, while also facilitating the development of practical and affordable imaging systems. The suggested approach operates solely on image-domain data and thus can be seamlessly applied to artefactual images reconstructed with other modalities. Optoacoustic imaging can achieve high spatial and temporal resolution but image quality is often compromised by suboptimal data acquisition. A new method employing deep learning to recover high-quality images from sparse or limited-view optoacoustic scans has been developed and demonstrated for whole-body mouse imaging in vivo.

Journal ArticleDOI
TL;DR: In this paper, the authors propose scDeepCluster, a single-cell model-based deep embedded clustering method, which simultaneously learns feature representation and clustering via explicit modelling of scRNA-seq data generation.
Abstract: Single-cell RNA sequencing (scRNA-seq) promises to provide higher resolution of cellular differences than bulk RNA sequencing. Clustering transcriptomes profiled by scRNA-seq has been routinely conducted to reveal cell heterogeneity and diversity. However, clustering analysis of scRNA-seq data remains a statistical and computational challenge, due to the pervasive dropout events obscuring the data matrix with prevailing ‘false’ zero count observations. Here, we have developed scDeepCluster, a single-cell model-based deep embedded clustering method, which simultaneously learns feature representation and clustering via explicit modelling of scRNA-seq data generation. Based on testing extensive simulated data and real datasets from four representative single-cell sequencing platforms, scDeepCluster outperformed state-of-the-art methods under various clustering performance metrics and exhibited improved scalability, with running time increasing linearly with sample size. Its accuracy and efficiency make scDeepCluster a promising algorithm for clustering large-scale scRNA-seq data. Clustering groups of cells in single-cell RNA sequencing datasets can produce high-resolution information for complex biological questions. However, it is statistically and computationally challenging due to the low RNA capture rate, which results in a high number of false zero count observations. A deep learning approach called scDeepCluster, which efficiently combines a model for explicitly characterizing missing values with clustering, shows high performance and improved scalability with a computing time increasing linearly with sample size.
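
The abstract does not spell out the objective, but model-based deep embedded clustering of this kind typically combines a likelihood term that explicitly models dropout-driven zeros (for example a zero-inflated negative binomial, ZINB, reconstruction loss) with a clustering term, roughly:

\[ L = L_{\mathrm{ZINB}}(X, \hat{X}) + \gamma\, \mathrm{KL}(P \,\|\, Q) \]

Here Q holds soft cluster assignments computed in the autoencoder's latent space, P is a sharpened target distribution and γ balances the two terms. This is an informed sketch of the general recipe, not a verbatim statement of scDeepCluster's loss.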

Journal ArticleDOI
TL;DR: In situ training of a five-level convolutional neural network that self-adapts to non-idealities of the one-transistor one-memristor array to classify the MNIST dataset is experimentally demonstrated, achieving a 75% reduction in weights without compromising accuracy.
Abstract: The explosive growth of machine learning is largely due to the recent advancements in hardware and architecture. The engineering of network structures, taking advantage of the spatial or temporal translational isometry of patterns, naturally leads to bio-inspired, shared-weight structures such as convolutional neural networks, which have markedly reduced the number of free parameters. State-of-the-art microarchitectures commonly rely on weight-sharing techniques, but still suffer from the von Neumann bottleneck of transistor-based platforms. Here, we experimentally demonstrate the in situ training of a five-level convolutional neural network that self-adapts to non-idealities of the one-transistor one-memristor array to classify the MNIST dataset, achieving similar accuracy to the memristor-based multilayer perceptron with a reduction in trainable parameters of ~75% owing to the shared weights. In addition, the memristors encoded both spatial and temporal translational invariance simultaneously in a convolutional long short-term memory network—a memristor-based neural network with intrinsic 3D input processing—which was trained in situ to classify a synthetic MNIST sequence dataset using just 850 weights. These proof-of-principle demonstrations combine the architectural advantages of weight sharing and the area/energy efficiency boost of the memristors, paving the way to future edge artificial intelligence. Memristive devices can provide energy-efficient neural network implementations, but they must be tailored to suit different network architectures. Wang et al. develop a trainable weight-sharing mechanism for memristor-based CNNs and ConvLSTMs, achieving a 75% reduction in weights without compromising accuracy.

Journal ArticleDOI
TL;DR: The key insight is to reduce state tomography to an unsupervised learning problem of the statistics of an informationally complete quantum measurement, which constitutes a modern machine learning approach to the validation of complex quantum devices.
Abstract: A major bottleneck in the development of scalable many-body quantum technologies is the difficulty in benchmarking state preparations, which suffer from an exponential ‘curse of dimensionality’ inherent to the classical description of quantum states. We present an experimentally friendly method for density matrix reconstruction based on neural network generative models. The learning procedure comes with a built-in approximate certificate of the reconstruction and makes no assumptions about the purity of the state under scrutiny. It can efficiently handle a broad class of complex systems including prototypical states in quantum information, as well as ground states of local spin models common to condensed matter physics. The key insight is to reduce state tomography to an unsupervised learning problem of the statistics of an informationally complete quantum measurement. This constitutes a modern machine learning approach to the validation of complex quantum devices, which may in addition prove relevant as a neural-network ansatz over mixed states suitable for variational optimization. Present day quantum technologies enable computations with tens and soon hundreds of qubits. A major outstanding challenge is to measure and benchmark the complete quantum state, a task that grows exponentially with the system size. Generative models based on restricted Boltzmann machines and recurrent neural networks can be employed to solve this quantum tomography problem in a scalable manner.
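
The stated reduction can be written down directly. An informationally complete (IC) measurement {M_a} maps the state ρ to an ordinary probability distribution, and the IC property makes the map invertible via a dual frame {M̃_a}:

\[ P(a) = \mathrm{Tr}[M_a\,\rho], \qquad \rho = \sum_a P(a)\,\widetilde{M}_a \]

Learning a generative model of P(a) from measurement samples therefore amounts to reconstructing ρ, which is what lets standard unsupervised learning machinery perform tomography.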

Journal ArticleDOI
TL;DR: A novel feedback-loop architecture is proposed, feedback GAN (FBGAN), to optimize the synthetic gene sequences for desired properties using an external function analyser, and it is demonstrated that the GAN-generated proteins have desirable biophysical properties.
Abstract: Generative adversarial networks (GANs) represent an attractive and novel approach to generate realistic data, such as genes, proteins or drugs, in synthetic biology. Here, we apply GANs to generate synthetic DNA sequences encoding for proteins of variable length. We propose a novel feedback-loop architecture, feedback GAN (FBGAN), to optimize the synthetic gene sequences for desired properties using an external function analyser. The proposed architecture also has the advantage that the analyser does not need to be differentiable. We apply the feedback-loop mechanism to two examples: generating synthetic genes coding for antimicrobial peptides, and optimizing synthetic genes for the secondary structure of their resulting peptides. A suite of metrics, calculated in silico, demonstrates that the GAN-generated proteins have desirable biophysical properties. The FBGAN architecture can also be used to optimize GAN-generated data points for useful properties in domains beyond genomics. Generative machine learning models are used in synthetic biology to find new structures such as DNA sequences, proteins and other macromolecules with applications in drug discovery, environmental treatment and manufacturing. Gupta and Zou propose and demonstrate in silico a feedback-loop architecture to optimize the output of a generative adversarial network that generates synthetic genes to produce ones specifically coding for antimicrobial peptides.
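
A simplified rendering of the feedback loop follows; all callables are supplied by the user, and the names and exact replacement policy are illustrative stand-ins rather than the paper's verbatim procedure.

    def feedback_loop(generator_sample, analyser, train_step, real_data,
                      n_epochs=50, n_candidates=960, threshold=0.8):
        """FBGAN-style feedback: each epoch, sequences sampled from the
        generator are scored by an external analyser (which need not be
        differentiable), and high-scoring ones displace the oldest entries
        in the 'real' training set, steering the GAN toward the property."""
        for _ in range(n_epochs):
            candidates = generator_sample(n_candidates)
            keep = [seq for seq in candidates if analyser(seq) > threshold]
            real_data = (real_data + keep)[len(keep):]   # FIFO replacement
            train_step(real_data)                        # one ordinary GAN epoch
        return real_data

Because the analyser only ranks finished sequences, any black-box scorer (a classifier, a physics simulator, a wet-lab proxy) can steer generation.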

Journal ArticleDOI
TL;DR: A new deep learning based search heuristic performs well on the iconic Rubik’s cube and can also generalize to puzzles in which optimal solvers are intractable.
Abstract: The Rubik’s cube is a prototypical combinatorial puzzle that has a large state space with a single goal state. The goal state is unlikely to be accessed using sequences of randomly generated moves, posing unique challenges for machine learning. We solve the Rubik’s cube with DeepCubeA, a deep reinforcement learning approach that learns how to solve increasingly difficult states in reverse from the goal state without any specific domain knowledge. DeepCubeA solves 100% of all test configurations, finding a shortest path to the goal state 60.3% of the time. DeepCubeA generalizes to other combinatorial puzzles and is able to solve the 15 puzzle, 24 puzzle, 35 puzzle, 48 puzzle, Lights Out and Sokoban, finding a shortest path in the majority of verifiable cases. For some combinatorial puzzles, solutions can be verified to be optimal, for others, the state space is too large to be certain that a solution is optimal. A new deep learning based search heuristic performs well on the iconic Rubik’s cube and can also generalize to puzzles in which optimal solvers are intractable.
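
The reverse curriculum plus value-iteration training described above can be sketched compactly. Here `scramble`, `successors` and `is_goal` are assumed domain helpers, and the solving step (an A*-style search guided by the learned cost-to-go J) is omitted.

    import random

    def make_training_example(J, goal_state, scramble, successors, is_goal,
                              max_scrambles):
        """Sample a state by scrambling the goal k times (harder as k grows)
        and compute a one-step value-iteration target for the cost-to-go
        network J; J is regressed toward these targets and later serves as
        the heuristic in an A*-style search."""
        k = random.randint(1, max_scrambles)
        s = scramble(goal_state, k)
        target = 0.0 if is_goal(s) else 1.0 + min(J(s2) for s2 in successors(s))
        return s, target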

Journal ArticleDOI
TL;DR: A new deep learning-based method for delineating organs at risk in the head and neck is faster and more accurate than human experts, significantly outperforming both the experts and the previous state-of-the-art method.
Abstract: Radiation therapy is one of the most widely used therapies for cancer treatment. A critical step in radiation therapy planning is to accurately delineate all organs at risk (OARs) to minimize potential adverse effects to healthy surrounding organs. However, manually delineating OARs based on computed tomography images is time-consuming and error-prone. Here, we present a deep learning model to automatically delineate OARs in head and neck, trained on a dataset of 215 computed tomography scans with 28 OARs manually delineated by experienced radiation oncologists. On a hold-out dataset of 100 computed tomography scans, our model achieves an average Dice similarity coefficient of 78.34% across the 28 OARs, significantly outperforming human experts and the previous state-of-the-art method by 10.05% and 5.18%, respectively. Our model takes only a few seconds to delineate an entire scan, compared to over half an hour by human experts. These findings demonstrate the potential for deep learning to improve the quality and reduce the treatment planning time of radiation therapy. To keep radiation therapy from damaging healthy tissue, expert radiologists have to segment CT scans into individual organs. A new deep learning-based method for delineating organs in the area of head and neck performs faster and more accurately than human experts.
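
The reported Dice similarity coefficient is twice the overlap between predicted and reference masks divided by their combined size; a short NumPy version, computed per organ:

    import numpy as np

    def dice(pred, truth):
        pred, truth = pred.astype(bool), truth.astype(bool)
        intersection = np.logical_and(pred, truth).sum()
        return 2.0 * intersection / (pred.sum() + truth.sum())

    a = np.zeros((64, 64)); a[10:30, 10:30] = 1   # toy predicted mask
    b = np.zeros((64, 64)); b[15:35, 15:35] = 1   # toy reference mask
    print(f"Dice = {dice(a, b):.4f}")             # 1.0 means perfect agreement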

Journal ArticleDOI
TL;DR: In this article, an approach for incorporating prior knowledge into machine learning algorithms is described, aimed at applications in physics and signal processing in which certain operations are known and must be embedded into the algorithm.
Abstract: We describe an approach for incorporating prior knowledge into machine learning algorithms. We aim at applications in physics and signal processing in which we know that certain operations must be embedded into the algorithm. Any operation that allows computation of a gradient or sub-gradient towards its inputs is suited for our framework. We derive a maximal error bound for deep nets which demonstrates that including prior knowledge reduces the bound. Furthermore, we also show experimentally that known operators reduce the number of free parameters. We apply this approach to various tasks ranging from CT image reconstruction over vessel segmentation to the derivation of previously unknown imaging algorithms. As such, the concept is widely applicable for many researchers in physics, imaging and signal processing, and we expect our analysis to support further investigation of known operators in related fields.
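
A minimal sketch of the idea in PyTorch, assuming the known operator is available as a fixed linear map A (here a random stand-in for, say, a discretized physics operator): gradients flow through it, but it contributes no trainable parameters.

    import torch
    import torch.nn as nn

    class KnownOperator(nn.Module):
        def __init__(self, A):
            super().__init__()
            self.register_buffer('A', A)   # fixed: excluded from model.parameters()
        def forward(self, x):
            return x @ self.A.T

    A = torch.randn(16, 32)                # the 'known' operator, assumed given
    model = nn.Sequential(nn.Linear(32, 32), nn.ReLU(),
                          KnownOperator(A),
                          nn.Linear(16, 1))
    loss = model(torch.randn(4, 32)).pow(2).mean()
    loss.backward()                        # gradients pass through the fixed operator

Only the layers around the known operator train, which is exactly how embedding it shrinks the number of free parameters.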

Journal ArticleDOI
TL;DR: In this article, orthogonal weights modification is proposed to overcome catastrophic forgetting, enabling deep neural networks to handle situations in which the mapping rules between inputs and outputs are not fixed but change according to context.
Abstract: Deep neural networks are powerful tools in learning sophisticated but fixed mapping rules between inputs and outputs, thereby limiting their application in more complex and dynamic situations in which the mapping rules are not kept the same but change according to different contexts. To lift such limits, we developed an approach involving a learning algorithm, called orthogonal weights modification, with the addition of a context-dependent processing module. We demonstrated that with orthogonal weights modification to overcome catastrophic forgetting, and the context-dependent processing module to learn how to reuse a feature representation and a classifier for different contexts, a single network could acquire numerous context-dependent mapping rules in an online and continual manner, with as few as approximately ten samples to learn each. Our approach should enable highly compact systems to gradually learn myriad regularities of the real world and eventually behave appropriately within it. When neural networks are retrained to solve more than one problem, they tend to forget what they have learned earlier. Here, the authors propose orthogonal weights modification, a method to avoid this so-called catastrophic forgetting problem. Capitalizing on such an ability, a new module is introduced to enable the network to continually learn context-dependent processing.
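
A compact NumPy sketch of the orthogonal-projection idea: gradient updates for a new task are projected onto the subspace orthogonal to inputs from earlier tasks, so old input-output mappings are preserved. The recursive projector update mirrors recursive least squares; learning-rate schedules, the context module and per-layer details are omitted.

    import numpy as np

    def owm_update(W, grad, P, x, lr=0.1, alpha=1e-3):
        x = x.reshape(-1, 1)
        Px = P @ x
        P = P - (Px @ Px.T) / (alpha + x.T @ Px)   # shrink P along input direction x
        W = W - lr * (grad @ P)                    # step only orthogonally to past inputs
        return W, P

    d_in, d_out = 8, 3
    W = np.zeros((d_out, d_in))
    P = np.eye(d_in)                               # projector starts as identity
    for _ in range(20):                            # consolidate some 'task 1' inputs
        x = np.random.randn(d_in)
        grad = np.random.randn(d_out, d_in)        # stand-in backprop gradient
        W, P = owm_update(W, grad, P, x)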

Journal ArticleDOI
TL;DR: An annotated image dataset of over 18,000 white blood cells is compiled and used to train a convolutional neural network for leukocyte classification, which classifies the most important cell types with high accuracy and can answer clinically relevant binary questions with human-level performance.
Abstract: Reliable recognition of malignant white blood cells is a key step in the diagnosis of haematologic malignancies such as acute myeloid leukaemia. Microscopic morphological examination of blood cells is usually performed by trained human examiners, making the process tedious, time-consuming and hard to standardize. Here, we compile an annotated image dataset of over 18,000 white blood cells, use it to train a convolutional neural network for leukocyte classification and evaluate the network’s performance by comparing to inter- and intra-expert variability. The network classifies the most important cell types with high accuracy. It also allows us to decide two clinically relevant questions with human-level performance: (1) if a given cell has blast character and (2) if it belongs to the cell types normally present in non-pathological blood smears. Our approach holds the potential to be used as a classification aid for examining much larger numbers of cells in a smear than can usually be done by a human expert. This will allow clinicians to recognize malignant cell populations with lower prevalence at an earlier stage of the disease. Deep learning is currently transforming digital pathology, helping to make more reliable and faster clinical diagnoses. A promising application is in the recognition of malignant white blood cells—an essential step for detecting acute myeloid leukaemia that is challenging even for trained human examiners. An annotated image dataset of over 18,000 white blood cells is compiled and used to train a convolutional neural network for leukocyte classification. The network classifies the most important cell types with high accuracy and can answer clinically relevant binary questions with human-level performance.

Journal ArticleDOI
TL;DR: In this paper, an intuitive interface for data annotation and the display of neural network predictions within a commonly used digital pathology whole-slide viewer was created to reduce the annotation burden that limits the adoption of neural networks in pathology.
Abstract: Neural networks promise to bring robust, quantitative analysis to medical fields. However, their adoption is limited by the technicalities of training these networks and the required volume and quality of human-generated annotations. To address this gap in the field of pathology, we have created an intuitive interface for data annotation and the display of neural network predictions within a commonly used digital pathology whole-slide viewer. This strategy used a 'human-in-the-loop' to reduce the annotation burden. We demonstrate that segmentation of human and mouse renal micro compartments is repeatedly improved when humans interact with automatically generated annotations throughout the training process. Finally, to show the adaptability of this technique to other medical imaging fields, we demonstrate its ability to iteratively segment human prostate glands from radiology imaging data.

Journal ArticleDOI
TL;DR: A fully portable, wireless, flexible scalp electronic system, incorporating a set of dry electrodes and a flexible membrane circuit, allowing for wireless, real-time, universal electroencephalography classification for an electric wheelchair, a motorized vehicle and a keyboard-less presentation.
Abstract: Variation in human brains creates difficulty in implementing electroencephalography into universal brain–machine interfaces. Conventional electroencephalography systems typically suffer from motion artefacts, extensive preparation time and bulky equipment, while existing electroencephalography classification methods require training on a per-subject or per-session basis. Here, we introduce a fully portable, wireless, flexible scalp electronic system, incorporating a set of dry electrodes and a flexible membrane circuit. Time-domain analysis using convolutional neural networks allows for accurate, real-time classification of steady-state visually evoked potentials in the occipital lobe. Compared to commercial systems, the flexible electronics show improved performance in the detection of evoked potentials, owing to a significant reduction of noise and electromagnetic interference. The two-channel scalp electronic system achieves a high information transfer rate (122.1 ± 3.53 bits per minute) with six human subjects, allowing for wireless, real-time, universal electroencephalography classification for an electric wheelchair, a motorized vehicle and a keyboard-less presentation. Brain–machine interfaces using steady-state visually evoked potentials (SSVEPs) show promise in therapeutic applications. With a combination of innovations in flexible and soft electronics and in deep learning approaches to classify potentials from two channels and from any subject, a compact, wireless and universal SSVEP interface is designed. Subjects can operate a wheelchair in real time with eye movements while wearing the new brain–machine interface.
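
The information transfer rate quoted above is conventionally computed with the Wolpaw formula for a brain–machine interface with N targets, classification accuracy P and selection time T (in seconds):

\[ \mathrm{ITR} = \frac{60}{T}\Bigl[\log_2 N + P\log_2 P + (1-P)\log_2\frac{1-P}{N-1}\Bigr] \text{ bits per minute} \]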

Journal ArticleDOI
TL;DR: A combination of engineering advances shows promise for myoelectric prosthetic hands that are controlled by a user’s remaining muscle activity and a shared control scheme in which robotic automation aids in object grasping by maximizing the contact area between the hand and the object.
Abstract: Myoelectric prostheses allow users to recover lost functionality by controlling a robotic device with their remaining muscle activity. Such commercial devices can give users a high level of autonomy, but still do not approach the dexterity of the intact human hand. Here we present a method to control a robotic hand, shared between user intention and robotic automation. The algorithm allows user-controlled movements when high dexterity is desired, but also assisted grasping when robustness is paramount. This combination of features is currently lacking in commercial prostheses and can greatly improve prosthesis usability. First, we design and test a myoelectric proportional controller that can predict multiple joint angles simultaneously and with high accuracy. We then implement online control with both able-bodied and amputee subjects. Finally, we present a shared control scheme in which robotic automation aids in object grasping by maximizing the contact area between the hand and the object, greatly increasing grasp success and object hold times in both a virtual and a physical environment. Our results present a viable method of prosthesis control implemented in real time, for reliable articulation of multiple simultaneous degrees of freedom. A combination of engineering advances shows promise for myoelectric prosthetic hands that are controlled by a user’s remaining muscle activity. Fine finger movements are decoded from surface electromyograms with machine learning algorithms and this is combined with a robotic controller that is active only during object grasping to assist in maximizing contact. This shared control scheme allows user-controlled movements when high dexterity is desired, but also assisted grasping when robustness is required.

Journal ArticleDOI
TL;DR: It is argued that perceptions of AI’s possibilities, though often quite detached from the reality of the technology, nonetheless influence how AI is developed, deployed and regulated, as well as scientific goals and public understanding.
Abstract: This paper categorizes some of the fundamental hopes and fears expressed in imaginings of artificial intelligence (AI), based on a survey of 300 fictional and non-fictional works. The categories are structured into four dichotomies, each comprising a hope and a parallel fear, mediated by the notion of control. These are: the hope for much longer lives (‘immortality’) and the fear of losing one’s identity (‘inhumanity’); the hope for a life free of work (‘ease’), and the fear of becoming redundant (‘obsolescence’); the hope that AI can fulfil one’s desires (‘gratification’), alongside the fear that humans will become redundant to each other (‘alienation’); and the hope that AI offers power over others (‘dominance’), with the fear that it will turn against us (‘uprising’). This Perspective further argues that these perceptions of AI’s possibilities, which may be quite detached from the reality of the technology, can influence how it is developed, deployed and regulated. A survey of 300 fictional and non-fictional works featuring artificial intelligence reveals that imaginings of intelligent machines may be grouped in four categories, each comprising a hope and a parallel fear. These perceptions are decoupled from what is realistically possible with current technology, yet influence scientific goals, public understanding and regulation of AI.

Journal ArticleDOI
Michael Davies
TL;DR: In order for the neuromorphic research field to advance into the mainstream of computing, it needs to start quantifying gains, standardize on benchmarks and focus on feasible application challenges.
Abstract: In order for the neuromorphic research field to advance into the mainstream of computing, it needs to start quantifying gains, standardize on benchmarks and focus on feasible application challenges.

Journal ArticleDOI
TL;DR: This study reviews the progress of ncRNA type classification, specifically lncRNA, lincRNA, circular RNA and small ncRNA, and presents a comprehensive comparison of six state-of-the-art deep learning based classification methods published in the past two years, examining their performance and architecture.
Abstract: Non-coding (nc) RNA plays a vital role in biological processes and has been associated with diseases such as cancer. Classification of ncRNAs is necessary for understanding the underlying mechanisms of these diseases and for designing effective treatments. Recently, deep learning has been employed for ncRNA identification and classification and has shown promising results. In this study, we review the progress of ncRNA type classification, specifically lncRNA, lincRNA, circular RNA and small ncRNA, and present a comprehensive comparison of six deep learning based classification methods published in the past two years. We identify research gaps and challenges in ncRNA type classification, such as the classification of subclasses of lncRNA, transcript length and compositional variation, dependency on database searches and the high false positive rate of existing approaches. We suggest future directions concerning cross-species performance deviation, deep learning model selection and sequence-intrinsic features. Many functions of RNA strands that do not code for proteins are still to be deciphered. Methods to classify different groups of non-coding RNA increasingly use deep learning, but the landscape is diverse and methods need to be categorized and benchmarked to move forward. The authors take a close look at six state-of-the-art deep learning non-coding RNA classifiers and compare their performance and architecture.

Journal ArticleDOI
TL;DR: A reservoir computer based on photonic neural networks recognizes different forms of human action from video streams with accuracy comparable to state-of-the-art digital implementations, while promising higher processing speed than existing hardware approaches.
Abstract: The recognition of human actions in video streams is a challenging task in computer vision, with important applications in, for example, brain-computer interfaces and surveillance. Deep learning has shown remarkable results recently, but can be hard to use in practice, as its training requires large datasets and special purpose, energy-consuming hardware. In this work, we propose a photonic hardware approach. Our experimental setup comprises off-the-shelf components and implements an easy-to-train recurrent neural network with 16,384 nodes, scalable up to hundreds of thousands of nodes. The system, based on the reservoir computing paradigm, is trained to recognise six human actions from the KTH video database using either raw frames as inputs, or a set of features extracted with the histograms of oriented gradients algorithm. We report a classification accuracy of 91.3%, comparable to state-of-the-art digital implementations, while promising a higher processing speed in comparison to the existing hardware approaches. Because of the massively parallel processing capabilities offered by photonic architectures, we anticipate that this work will pave the way towards simply reconfigurable and energy-efficient solutions for real-time video processing.
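
The reservoir computing paradigm the system implements trains only a linear readout on top of fixed random dynamics. The digital echo-state toy below illustrates that training step (ridge regression), whereas the paper realizes the reservoir optically; all sizes and the toy task here are arbitrary.

    import numpy as np

    rng = np.random.default_rng(0)
    n_in, n_res = 3, 200
    W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
    W = rng.normal(size=(n_res, n_res))
    W *= 0.9 / max(abs(np.linalg.eigvals(W)))      # spectral radius below 1

    def run_reservoir(inputs):
        x = np.zeros(n_res)
        states = []
        for u in inputs:
            x = np.tanh(W_in @ u + W @ x)          # fixed, untrained dynamics
            states.append(x)
        return np.array(states)

    U = rng.normal(size=(500, n_in))               # toy input sequence
    Y = np.roll(U[:, :1], 1, axis=0)               # toy target: recall last input
    X = run_reservoir(U)
    ridge = 1e-6                                   # train the readout only:
    W_out = np.linalg.solve(X.T @ X + ridge * np.eye(n_res), X.T @ Y)

Training reduces to one linear solve, which is why such systems are described as easy to train despite their tens of thousands of nodes.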

Journal ArticleDOI
TL;DR: DECREASE, an efficient machine learning model that requires only a limited set of pairwise dose–response measurements for the accurate prediction of synergistic and antagonistic drug combinations, is implemented.
Abstract: High-throughput drug combination screening provides a systematic strategy to discover unexpected combinatorial synergies in pre-clinical cell models. However, phenotypic combinatorial screening with multi-dose matrix assays is experimentally expensive, especially when the aim is to identify selective combination synergies across a large panel of cell lines or patient samples. Here we implemented DECREASE, an efficient machine learning model that requires only a limited set of pairwise dose-response measurements for accurate prediction of drug combination synergy and antagonism. Using a compendium of 23,595 drug combination matrices tested in various cancer cell lines, and malaria and Ebola infection models, we demonstrate how cost-effective experimental designs with DECREASE capture almost the same degree of information for synergy and antagonism detection as the fully-measured dose-response matrices. Measuring only the diagonal of the matrix provides an accurate and practical option for combinatorial screening. The open-source web-implementation enables applications of DECREASE to both pre-clinical and translational studies.

Journal ArticleDOI
TL;DR: A new class of machines with evaluation processes akin to feelings is proposed, based on the principles of homeostasis and developments in soft robotics and multisensory integration, to constitute a platform for investigating consciousness, intelligence and the feeling process itself.
Abstract: Attempts to create machines that behave intelligently often conceptualize intelligence as the ability to achieve goals, leaving unanswered a crucial question: whose goals? In a dynamic and unpredictable world, an intelligent agent should hold its own meta-goal of self-preservation, like living organisms whose survival relies on homeostasis: the regulation of body states aimed at maintaining conditions compatible with life. In organisms capable of mental states, feelings are a mental expression of the state of life in the body and play a critical role in regulating behaviour. Our goal here is to inquire about conditions that would potentially allow machines to care about what they do or think. Under certain conditions, machines capable of implementing a process resembling homeostasis might also acquire a source of motivation and a new means to evaluate behaviour, akin to that of feelings in living organisms. Drawing on recent developments in soft robotics and multisensory abstraction, we propose a new class of machines inspired by the principles of homeostasis. The resulting machines would (1) exhibit equivalents to feeling; (2) improve their functionality across a range of environments; and (3) constitute a platform for investigating consciousness, intelligence and the feeling process itself. Robots and machines are generally designed to perform specific tasks. Unlike humans, they lack the ability to generate feelings based on interactions with the world. The authors propose a new class of machines with evaluation processes akin to feelings, based on the principles of homeostasis and developments in soft robotics and multisensory integration.