
Showing papers published by "IBM" in 2019


Journal ArticleDOI
TL;DR: In this paper, the authors consider the problem of learning model parameters from data distributed across multiple edge nodes, without sending raw data to a centralized place, and propose a control algorithm that determines the best tradeoff between local update and global parameter aggregation to minimize the loss function under a given resource budget.
Abstract: Emerging technologies and applications including Internet of Things, social networking, and crowd-sourcing generate large amounts of data at the network edge. Machine learning models are often built from the collected data, to enable the detection, classification, and prediction of future events. Due to bandwidth, storage, and privacy concerns, it is often impractical to send all the data to a centralized location. In this paper, we consider the problem of learning model parameters from data distributed across multiple edge nodes, without sending raw data to a centralized place. Our focus is on a generic class of machine learning models that are trained using gradient-descent-based approaches. We analyze the convergence bound of distributed gradient descent from a theoretical point of view, based on which we propose a control algorithm that determines the best tradeoff between local update and global parameter aggregation to minimize the loss function under a given resource budget. The performance of the proposed algorithm is evaluated via extensive experiments with real datasets, both on a networked prototype system and in a larger-scale simulated environment. The experimentation results show that our proposed approach performs near to the optimum with various machine learning models and different data distributions.
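
The local-update versus global-aggregation tradeoff that the control algorithm tunes can be sketched in a few lines. This is a minimal illustration, not the paper's algorithm: a hypothetical scalar model, a fixed number tau of local gradient steps between aggregations, and plain averaging across nodes.

```python
# Sketch of the local-update / global-aggregation loop (hypothetical scalar
# model): each edge node i holds data and minimizes f_i(w) = mean((w*x - y)^2);
# tau is the number of local steps between global aggregations -- the knob the
# paper's control algorithm tunes under a resource budget.

def local_sgd(w, data, tau, lr=0.05):
    """Run tau local gradient steps on one node's data."""
    for _ in range(tau):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def federated_gd(node_data, rounds, tau):
    w = 0.0                                   # global model parameter
    for _ in range(rounds):
        locals_ = [local_sgd(w, d, tau) for d in node_data]
        w = sum(locals_) / len(locals_)       # global aggregation (average)
    return w

# Two edge nodes whose data follow y = 2x with different samples.
nodes = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0), (0.5, 1.0)]]
w = federated_gd(nodes, rounds=50, tau=5)
```

A larger tau means fewer (costly) aggregations per unit of progress but more local drift; the paper's control algorithm picks tau to minimize the loss under the given resource budget.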

1,441 citations


Journal ArticleDOI
13 Mar 2019-Nature
TL;DR: In this article, two quantum algorithms for machine learning on a superconducting processor are proposed and experimentally implemented, using a variational quantum circuit to classify the data in a way similar to the method of conventional SVMs.
Abstract: Machine learning and quantum computing are two technologies that each have the potential to alter how computation is performed to address previously untenable problems. Kernel methods for machine learning are ubiquitous in pattern recognition, with support vector machines (SVMs) being the best known method for classification problems. However, there are limitations to the successful solution to such classification problems when the feature space becomes large, and the kernel functions become computationally expensive to estimate. A core element in the computational speed-ups enabled by quantum algorithms is the exploitation of an exponentially large quantum state space through controllable entanglement and interference. Here we propose and experimentally implement two quantum algorithms on a superconducting processor. A key component in both methods is the use of the quantum state space as feature space. The use of a quantum-enhanced feature space that is only efficiently accessible on a quantum computer provides a possible path to quantum advantage. The algorithms solve a problem of supervised learning: the construction of a classifier. One method, the quantum variational classifier, uses a variational quantum circuit1,2 to classify the data in a way similar to the method of conventional SVMs. The other method, a quantum kernel estimator, estimates the kernel function on the quantum computer and optimizes a classical SVM. The two methods provide tools for exploring the applications of noisy intermediate-scale quantum computers3 to machine learning.
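
The quantum kernel estimator reduces to a familiar recipe: estimate kernel entries as state overlaps, then hand the Gram matrix to a classical SVM. The sketch below simulates a hypothetical single-qubit feature map exactly; on hardware, the overlaps would be estimated by repeated measurement.

```python
import cmath

# Toy sketch of the "quantum kernel estimator" idea: a feature map encodes a
# data point x into a quantum state |phi(x)>, and the kernel is the overlap
# K(x, z) = |<phi(x)|phi(z)>|^2. The single-qubit map below is hypothetical,
# chosen only so the kernel can be computed exactly in a few lines.

def phi(x):
    """Single-qubit feature state (cos x, e^{i x^2} sin x), normalized."""
    return [cmath.cos(x), cmath.exp(1j * x * x) * cmath.sin(x)]

def quantum_kernel(x, z):
    overlap = sum(a.conjugate() * b for a, b in zip(phi(x), phi(z)))
    return abs(overlap) ** 2    # on hardware: estimated by repeated measurement

K = [[quantum_kernel(x, z) for z in (0.1, 0.5, 1.2)] for x in (0.1, 0.5, 1.2)]
# K can now be passed to any classical SVM as a precomputed kernel.
```

The quantum advantage argued for in the paper comes from feature maps whose overlaps are hard to compute classically, unlike this toy map.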

1,140 citations


Journal ArticleDOI
TL;DR: An overview of recent advances in physical reservoir computing, classified according to the type of reservoir, with the aim of expanding practical applications and developing next-generation machine learning systems.

959 citations


Book
03 Feb 2019
TL;DR: A “branch and bound” algorithm is presented for solving the traveling salesman problem, where the set of all tours (feasible solutions) is broken up into increasingly small subsets by a procedure called branching.
Abstract: A “branch and bound” algorithm is presented for solving the traveling salesman problem. The set of all tours (feasible solutions) is broken up into increasingly small subsets by a procedure called branching. For each subset a lower bound on the length of the tours therein is calculated. Eventually, a subset is found that contains a single tour whose length is less than or equal to some lower bound for every tour. The motivation of the branching and the calculation of the lower bounds are based on ideas frequently used in solving assignment problems. Computationally, the algorithm extends the size of problem that can reasonably be solved without using methods special to the particular problem.
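
The branching-and-bounding loop described above can be sketched as follows; the lower bound used here (every unvisited city must be left via at least its cheapest outgoing edge) is a simplification of the assignment-based bounds the algorithm actually uses.

```python
import math

# Minimal branch-and-bound sketch in the spirit of the algorithm: partial
# tours are extended city by city ("branching"), and a subset of tours is
# discarded when its lower bound meets or exceeds the best tour found so far.

def tsp_branch_and_bound(dist):
    n = len(dist)
    best = {"cost": math.inf, "tour": None}
    # Bound contribution: each unvisited city must be departed via at least
    # its cheapest outgoing edge.
    cheapest = [min(d for j, d in enumerate(row) if j != i)
                for i, row in enumerate(dist)]

    def branch(city, visited, cost, tour):
        bound = cost + sum(cheapest[c] for c in range(n) if c not in visited)
        if bound >= best["cost"]:
            return                                 # prune this subset of tours
        if len(visited) == n:
            total = cost + dist[city][0]           # close the tour
            if total < best["cost"]:
                best["cost"], best["tour"] = total, tour + [0]
            return
        for nxt in range(n):
            if nxt not in visited:
                branch(nxt, visited | {nxt}, cost + dist[city][nxt], tour + [nxt])

    branch(0, {0}, 0, [0])
    return best["cost"], best["tour"]

dist = [[0, 2, 9, 10],
        [1, 0, 6, 4],
        [15, 7, 0, 8],
        [6, 3, 12, 0]]
cost, tour = tsp_branch_and_bound(dist)   # optimal tour 0 -> 2 -> 3 -> 1 -> 0
```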

813 citations


Journal ArticleDOI
TL;DR: This paper provides a tutorial on fog computing and its related computing paradigms, including their similarities and differences, and provides a taxonomy of research topics in fog computing.

783 citations


Journal ArticleDOI
01 Mar 2019-Nature
TL;DR: This work applies the error mitigation protocol to mitigate errors in canonical single- and two-qubit experiments and extends its application to the variational optimization of Hamiltonians for quantum chemistry and magnetism.
Abstract: Quantum computation, a paradigm of computing that is completely different from classical methods, benefits from theoretically proved speed-ups for certain problems and can be used to study the properties of quantum systems1. Yet, because of the inherently fragile nature of the physical computing elements (qubits), achieving quantum advantages over classical computation requires extremely low error rates for qubit operations, as well as a substantial number of physical qubits, to realize fault tolerance via quantum error correction2,3. However, recent theoretical work4,5 has shown that the accuracy of computation (based on expectation values of quantum observables) can be enhanced through an extrapolation of results from a collection of experiments with varying noise. Here we demonstrate this error mitigation protocol on a superconducting quantum processor, enhancing its computational capability, with no additional hardware modifications. We apply the protocol to mitigate errors in canonical single- and two-qubit experiments and then extend its application to the variational optimization6–8 of Hamiltonians for quantum chemistry and magnetism9. We effectively demonstrate that the suppression of incoherent errors helps to achieve an otherwise inaccessible level of accuracy in the variational solutions using our noisy processor. These results demonstrate that error mitigation techniques will enable substantial improvements in the capabilities of near-term quantum computing hardware. The accuracy of computations on noisy, near-term quantum systems can be enhanced by extrapolating results from experiments with various noise levels, without requiring additional hardware modifications.

690 citations


Proceedings ArticleDOI
11 Nov 2019
TL;DR: This paper presents an alternative approach that utilizes both differential privacy and SMC to balance these trade-offs, reducing the growth of noise injection as the number of parties increases without sacrificing privacy while maintaining a pre-defined rate of trust.
Abstract: Federated learning facilitates the collaborative training of models without the sharing of raw data. However, recent attacks demonstrate that simply maintaining data locality during training processes does not provide sufficient privacy guarantees. Rather, we need a federated learning system capable of preventing inference over both the messages exchanged during training and the final trained model while ensuring the resulting model also has acceptable predictive accuracy. Existing federated learning approaches either use secure multiparty computation (SMC) which is vulnerable to inference or differential privacy which can lead to low accuracy given a large number of parties with relatively small amounts of data each. In this paper, we present an alternative approach that utilizes both differential privacy and SMC to balance these trade-offs. Combining differential privacy with secure multiparty computation enables us to reduce the growth of noise injection as the number of parties increases without sacrificing privacy while maintaining a pre-defined rate of trust. Our system is therefore a scalable approach that protects against inference threats and produces models with high accuracy. Additionally, our system can be used to train a variety of machine learning models, which we validate with experimental results on 3 different machine learning algorithms. Our experiments demonstrate that our approach out-performs state of the art solutions.
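
The noise-reduction argument can be illustrated numerically. In this sketch (hypothetical parameters, Gaussian noise), SMC is stood in for by ordinary summation: because only the decrypted sum is ever revealed, each of the n parties may add noise with variance sigma^2/t, where t is the assumed minimum number of non-colluding parties, and the aggregate still carries at least the variance sigma^2 needed for the differential-privacy guarantee.

```python
import random

# Sketch of the DP + SMC noise-reduction idea (hypothetical parameters):
# without SMC, every party would have to add noise at the full DP scale sigma;
# with SMC, only the aggregate is visible, so per-party noise can shrink to
# sigma / sqrt(t), t being the trust parameter (minimum honest parties).

def noisy_share(update, sigma, t):
    return update + random.gauss(0.0, sigma / t ** 0.5)

def secure_aggregate(updates, sigma, t):
    # Summation stands in for the SMC protocol: only the total is revealed.
    shares = [noisy_share(u, sigma, t) for u in updates]
    return sum(shares) / len(shares)

random.seed(0)
parties = [0.9, 1.1, 1.0, 1.05, 0.95]             # local model updates
avg = secure_aggregate(parties, sigma=0.1, t=4)   # trust: >= 4 honest parties
```

As the number of parties grows with t fixed, the per-party noise stays bounded instead of accumulating, which is the scalability property the paper claims.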

538 citations


Journal ArticleDOI
Andrew W. Cross, Lev S. Bishop, Sarah Sheldon, P. D. Nation, Jay M. Gambetta
TL;DR: A single-number metric, quantum volume, is introduced that can be measured using a concrete protocol on near-term quantum computers of modest size; measured on several state-of-the-art transmon devices, it reaches values as high as 16.
Abstract: We introduce a single-number metric, quantum volume, that can be measured using a concrete protocol on near-term quantum computers of modest size (n ≲ 50), and measure it on several state-of-the-art transmon devices, finding values as high as 16. The quantum volume is linked to system error rates, and is empirically reduced by uncontrolled interactions within the system. It quantifies the largest random circuit of equal width and depth that the computer successfully implements. Quantum computing systems with high-fidelity operations, high connectivity, large calibrated gate sets, and circuit rewriting toolchains are expected to have higher quantum volumes. The quantum volume is a pragmatic way to measure and compare progress toward improved system-wide gate error rates for near-term quantum computation and error-correction experiments.
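
Reading off the quantum volume from measured data is simple arithmetic: for each square random-circuit size n (width = depth = n), the device passes if its heavy-output probability exceeds 2/3, and the quantum volume is 2^n for the largest consecutively passed n. The numbers below are hypothetical, and the full protocol additionally requires statistical confidence in each pass.

```python
# How the quantum-volume figure is read off (illustrative numbers only):
# sizes must pass consecutively, starting from the smallest.

def quantum_volume(heavy_output_probs, threshold=2 / 3):
    qv = 1
    for n in sorted(heavy_output_probs):
        if heavy_output_probs[n] > threshold:
            qv = 2 ** n
        else:
            break                      # a failed size ends the streak
    return qv

# Hypothetical heavy-output probabilities measured on a transmon device:
measured = {2: 0.78, 3: 0.74, 4: 0.69, 5: 0.61}
print(quantum_volume(measured))        # -> 16
```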

532 citations


Proceedings ArticleDOI
01 Jan 2019
TL;DR: The core idea of the FLAIR framework is to present a simple, unified interface for conceptually very different types of word and document embeddings, which effectively hides all embedding-specific engineering complexity and allows researchers to “mix and match” various embeddings with little effort.
Abstract: We present FLAIR, an NLP framework designed to facilitate training and distribution of state-of-the-art sequence labeling, text classification and language models. The core idea of the framework is to present a simple, unified interface for conceptually very different types of word and document embeddings. This effectively hides all embedding-specific engineering complexity and allows researchers to “mix and match” various embeddings with little effort. The framework also implements standard model training and hyperparameter selection routines, as well as a data fetching module that can download publicly available NLP datasets and convert them into data structures for quick set up of experiments. Finally, FLAIR also ships with a “model zoo” of pre-trained models to allow researchers to use state-of-the-art NLP models in their applications. This paper gives an overview of the framework and its functionality. The framework is available on GitHub at https://github.com/zalandoresearch/flair .
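
The "simple, unified interface" design can be illustrated with a stripped-down mock (these are not FLAIR's actual classes): every embedding type exposes the same embed() method, so a stacked combination of embeddings is itself just another embedding.

```python
# Mock illustration of the mix-and-match design, NOT FLAIR's real API:
# because all embedding types share one interface, stacking them is trivial.

class WordEmbedding:
    def embed(self, token):
        return [float(len(token))]          # toy stand-in for a word vector

class CharEmbedding:
    def embed(self, token):
        # toy stand-in for a character-level / contextual embedding
        return [float(ord(token[0])), float(ord(token[-1]))]

class StackedEmbedding:
    """Concatenates any embeddings sharing the unified embed() interface."""
    def __init__(self, embeddings):
        self.embeddings = embeddings

    def embed(self, token):
        return [v for e in self.embeddings for v in e.embed(token)]

stacked = StackedEmbedding([WordEmbedding(), CharEmbedding()])
vec = stacked.embed("flair")
```

In FLAIR itself, the same pattern lets researchers combine classic word embeddings with contextual string embeddings in one line, which is the "mix and match" claim of the abstract.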

499 citations


Journal ArticleDOI
16 May 2019-Cell
TL;DR: This large-scale, single-cell atlas deepens the understanding of breast tumor ecosystems and suggests that ecosystem-based patient classification will facilitate identification of individuals for precision medicine approaches targeting the tumor and its immunoenvironment.

470 citations


Proceedings ArticleDOI
01 Oct 2019
TL;DR: A Self-similarity Grouping (SSG) approach that exploits the potential similarity of unlabeled samples to build multiple clusters from different views automatically, together with a clustering-guided semi-supervised extension, SSG++, for one-shot domain adaptation in an open-set setting.
Abstract: Domain adaptation in person re-identification (re-ID) has always been a challenging task. In this work, we explore how to harness the similar natural characteristics existing in the samples from the target domain for learning to conduct person re-ID in an unsupervised manner. Concretely, we propose a Self-similarity Grouping (SSG) approach, which exploits the potential similarity (from the global body to local parts) of unlabeled samples to build multiple clusters from different views automatically. These independent clusters are then assigned with labels, which serve as the pseudo identities to supervise the training process. We repeatedly and alternatively conduct such a grouping and training process until the model is stable. Despite its apparent simplicity, our SSG outperforms the state of the art by more than 4.6% (DukeMTMC→Market1501) and 4.4% (Market1501→DukeMTMC) in mAP, respectively. Upon our SSG, we further introduce a clustering-guided semi-supervised approach named SSG++ to conduct the one-shot domain adaptation in an open-set setting (i.e., the number of independent identities from the target domain is unknown). Without spending much effort on labeling, our SSG++ can further promote the mAP upon SSG by 10.7% and 6.9%, respectively. Our code is available at: https://github.com/OasisYang/SSG .

Journal ArticleDOI
TL;DR: This study paves the way for operators of smart environments to monitor their IoT assets for presence, functionality, and cyber-security without requiring any specialized devices or protocols.
Abstract: The Internet of Things (IoT) is being hailed as the next wave revolutionizing our society, and smart homes, enterprises, and cities are increasingly being equipped with a plethora of IoT devices. Yet, operators of such smart environments may not even be fully aware of their IoT assets, let alone whether each IoT device is functioning properly and safe from cyber-attacks. In this paper, we address this challenge by developing a robust framework for IoT device classification using traffic characteristics obtained at the network level. Our contributions are fourfold. First, we instrument a smart environment with 28 different IoT devices spanning cameras, lights, plugs, motion sensors, appliances, and health-monitors. We collect and synthesize traffic traces from this infrastructure for a period of six months, a subset of which we release as open data for the community to use. Second, we present insights into the underlying network traffic characteristics using statistical attributes such as activity cycles, port numbers, signalling patterns, and cipher suites. Third, we develop a multi-stage machine-learning-based classification algorithm and demonstrate its ability to identify specific IoT devices with over 99 percent accuracy based on their network activity. Finally, we discuss the trade-offs between cost, speed, and performance involved in deploying the classification framework in real-time. Our study paves the way for operators of smart environments to monitor their IoT assets for presence, functionality, and cyber-security without requiring any specialized devices or protocols.
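
The classification idea can be illustrated with a deliberately simplified sketch (not the paper's multi-stage pipeline): each device class gets a traffic profile over network-level attributes such as remote ports and mean flow volume, and an unseen trace is matched to the most similar profile.

```python
# Illustrative sketch of IoT device classification from network-level traffic
# attributes. The profiles, ports, and similarity measure are hypothetical;
# the paper uses a multi-stage machine-learning classifier over richer
# features (activity cycles, signalling patterns, cipher suites, ...).

def profile(flows):
    """Summarize a list of (remote_port, byte_count) flow records."""
    ports = {p for p, _ in flows}
    mean_bytes = sum(b for _, b in flows) / len(flows)
    return ports, mean_bytes

def similarity(p1, p2):
    ports1, vol1 = p1
    ports2, vol2 = p2
    jaccard = len(ports1 & ports2) / len(ports1 | ports2)   # port-set overlap
    vol = 1 / (1 + abs(vol1 - vol2) / max(vol1, vol2))      # volume closeness
    return jaccard + vol

def classify(flows, known_profiles):
    p = profile(flows)
    return max(known_profiles, key=lambda name: similarity(p, known_profiles[name]))

known = {
    "camera": ({443, 554}, 50_000.0),      # RTSP + TLS, heavy video flows
    "smart_plug": ({123, 8883}, 300.0),    # NTP + MQTT heartbeats, tiny flows
}
label = classify([(8883, 280.0), (123, 90.0)], known)
```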

Journal ArticleDOI
TL;DR: This manuscript describes the most recommendable methodologies for the fabrication, characterization, and simulation of RS devices, as well as the proper methods to display the data obtained.
Abstract: Resistive switching (RS) is an interesting property shown by some materials systems that, especially during the last decade, has gained a lot of interest for the fabrication of electronic devices, with electronic nonvolatile memories being those that have received the most attention. The presence and quality of the RS phenomenon in a materials system can be studied using different prototype cells, performing different experiments, displaying different figures of merit, and developing different computational analyses. Therefore, the real usefulness and impact of the findings presented in each study for the RS technology will be also different. This manuscript describes the most recommendable methodologies for the fabrication, characterization, and simulation of RS devices, as well as the proper methods to display the data obtained. The idea is to help the scientific community to evaluate the real usefulness and impact of an RS study for the development of RS technology.

Posted Content
TL;DR: This work proposes EvolveGCN, which adapts the graph convolutional network (GCN) model along the temporal dimension without resorting to node embeddings, and captures the dynamism of the graph sequence through using an RNN to evolve the GCN parameters.
Abstract: Graph representation learning resurges as a trending research subject owing to the widespread use of deep learning for Euclidean data, which inspires various creative designs of neural networks in the non-Euclidean domain, particularly graphs. With the success of these graph neural networks (GNN) in the static setting, we approach further practical scenarios where the graph dynamically evolves. Existing approaches typically resort to node embeddings and use a recurrent neural network (RNN, broadly speaking) to regulate the embeddings and learn the temporal dynamics. These methods require the knowledge of a node in the full time span (including both training and testing) and are less applicable to the frequent change of the node set. In some extreme scenarios, the node sets at different time steps may completely differ. To resolve this challenge, we propose EvolveGCN, which adapts the graph convolutional network (GCN) model along the temporal dimension without resorting to node embeddings. The proposed approach captures the dynamism of the graph sequence by using an RNN to evolve the GCN parameters. Two architectures are considered for the parameter evolution. We evaluate the proposed approach on tasks including link prediction, edge classification, and node classification. The experimental results indicate a generally higher performance of EvolveGCN compared with related approaches. The code is available at this https URL.
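
The core idea, evolving the GCN weights rather than node embeddings, can be sketched with toy dimensions. The paper evolves the weights with a GRU (or LSTM); a plain tanh recurrence stands in here, so this is an illustration of the architecture shape, not the paper's exact cell.

```python
import math

# Sketch of EvolveGCN: the GCN weight matrix itself is the hidden state of a
# recurrent update, so no per-node embedding must persist across time steps.
# Toy dimensions: 3 nodes, 2-dim features, pure-Python matrix ops.

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def gcn_layer(A_t, H_t, W_t):
    """One graph convolution at time t: relu(A_t @ H_t @ W_t)."""
    Z = matmul(matmul(A_t, H_t), W_t)
    return [[max(0.0, z) for z in row] for row in Z]

def evolve_weights(W, U):
    """Recurrent weight evolution (a GRU in the paper; tanh cell here)."""
    return [[math.tanh(z) for z in row] for row in matmul(W, U)]

# Two snapshots of a 3-node graph with 2-dim node features.
A = [[[1, 1, 0], [1, 1, 1], [0, 1, 1]],
     [[1, 0, 1], [0, 1, 1], [1, 1, 1]]]
H = [[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]] * 2
W = [[0.5, -0.2], [0.3, 0.4]]     # initial GCN weights
U = [[0.9, 0.1], [0.0, 0.8]]      # recurrence parameters (learned in practice)

for A_t, H_t in zip(A, H):
    out = gcn_layer(A_t, H_t, W)  # node representations at time t
    W = evolve_weights(W, U)      # the weights carry the temporal state
```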

Journal ArticleDOI
TL;DR: This work shows that a multihead attention Molecular Transformer model outperforms all algorithms in the literature, achieving a top-1 accuracy above 90% on a common benchmark data set and is able to handle inputs without a reactant–reagent split and including stereochemistry, which makes the method universally applicable.
Abstract: Organic synthesis is one of the key stumbling blocks in medicinal chemistry. A necessary yet unsolved step in planning synthesis is solving the forward problem: Given reactants and reagents, predict the products. Similar to other work, we treat reaction prediction as a machine translation problem between simplified molecular-input line-entry system (SMILES) strings (a text-based representation) of reactants, reagents, and the products. We show that a multihead attention Molecular Transformer model outperforms all algorithms in the literature, achieving a top-1 accuracy above 90% on a common benchmark data set. Molecular Transformer makes predictions by inferring the correlations between the presence and absence of chemical motifs in the reactant, reagent, and product present in the data set. Our model requires no handcrafted rules and accurately predicts subtle chemical transformations. Crucially, our model can accurately estimate its own uncertainty, with an uncertainty score that is 89% accurate in terms of classifying whether a prediction is correct. Furthermore, we show that the model is able to handle inputs without a reactant-reagent split and including stereochemistry, which makes our method universally applicable.
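
The machine-translation framing starts by tokenizing SMILES strings into chemically meaningful symbols. The regex below follows the tokenization scheme commonly used with Molecular Transformer-style models (bracket atoms, two-letter elements, and ring-closure digits become single tokens); the paper's exact vocabulary may differ.

```python
import re

# Regex SMILES tokenizer in the style used for SMILES-to-SMILES translation
# models: bracketed atoms like [Na+] stay whole, Cl/Br are single tokens, and
# every bond, branch, and ring-closure symbol becomes its own token.

SMILES_TOKEN = re.compile(
    r"(\[[^\]]+\]|Br?|Cl?|N|O|S|P|F|I|b|c|n|o|s|p"
    r"|\(|\)|\.|=|#|-|\+|\\|/|:|~|@|\?|>|\*|\$|%[0-9]{2}|[0-9])"
)

def tokenize(smiles):
    tokens = SMILES_TOKEN.findall(smiles)
    # Round-trip check: every character must belong to exactly one token.
    assert "".join(tokens) == smiles, "untokenizable SMILES"
    return tokens

print(tokenize("CC(=O)Oc1ccccc1C(=O)O"))   # aspirin
```

The resulting token sequences for reactants and reagents are what a sequence-to-sequence Transformer consumes, with products as the target sequence.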

Proceedings ArticleDOI
01 Jun 2019
TL;DR: In this article, the authors present CommonsenseQA, a dataset for commonsense question answering, where crowd-workers are asked to create multiple-choice questions with complex semantics that often require prior knowledge.
Abstract: When answering a question, people often draw upon their rich world knowledge in addition to the particular context. Recent work has focused primarily on answering questions given some relevant document or context, and required very little general background. To investigate question answering with prior knowledge, we present CommonsenseQA: a challenging new dataset for commonsense question answering. To capture common sense beyond associations, we extract from ConceptNet (Speer et al., 2017) multiple target concepts that have the same semantic relation to a single source concept. Crowd-workers are asked to author multiple-choice questions that mention the source concept and discriminate in turn between each of the target concepts. This encourages workers to create questions with complex semantics that often require prior knowledge. We create 12,247 questions through this procedure and demonstrate the difficulty of our task with a large number of strong baselines. Our best baseline is based on BERT-large (Devlin et al., 2018) and obtains 56% accuracy, well below human performance, which is 89%.

Journal ArticleDOI
20 Sep 2019-Science
TL;DR: In this paper, cyclo[18]carbon (C18) was generated using atom manipulation on bilayer NaCl on Cu(111) at 5 kelvin by eliminating carbon monoxide from a cyclocarbon oxide molecule, C24O6.
Abstract: Carbon allotropes built from rings of two-coordinate atoms, known as cyclo[n]carbons, have fascinated chemists for many years, but until now they could not be isolated or structurally characterized because of their high reactivity. We generated cyclo[18]carbon (C18) using atom manipulation on bilayer NaCl on Cu(111) at 5 kelvin by eliminating carbon monoxide from a cyclocarbon oxide molecule, C24O6. Characterization of cyclo[18]carbon by high-resolution atomic force microscopy revealed a polyynic structure with defined positions of alternating triple and single bonds. The high reactivity of cyclocarbon and cyclocarbon oxides allows covalent coupling between molecules to be induced by atom manipulation, opening an avenue for the synthesis of other carbon allotropes and carbon-rich materials from the coalescence of cyclocarbon molecules.

Proceedings Article
06 Sep 2019
TL;DR: A highly automated platform is developed that enables gathering datasets with controls at scale; such datasets exercise models in new ways, providing valuable feedback to researchers.
Abstract: We collect a large real-world test set, ObjectNet, for object recognition with controls where object backgrounds, rotations, and imaging viewpoints are random. Most scientific experiments have controls, confounds which are removed from the data, to ensure that subjects cannot perform a task by exploiting trivial correlations in the data. Historically, large machine learning and computer vision datasets have lacked such controls. This has resulted in models that must be fine-tuned for new datasets and perform better on datasets than in real-world applications. When tested on ObjectNet, object detectors show a 40-45% drop in performance, with respect to their performance on other benchmarks, due to the controls for biases. Controls make ObjectNet robust to fine-tuning showing only small performance increases. We develop a highly automated platform that enables gathering datasets with controls by crowdsourcing image capturing and annotation. ObjectNet is the same size as the ImageNet test set (50,000 images), and by design does not come paired with a training set in order to encourage generalization. The dataset is both easier than ImageNet (objects are largely centered and unoccluded) and harder (due to the controls). Although we focus on object recognition here, data with controls can be gathered at scale using automated tools throughout machine learning to generate datasets that exercise models in new ways thus providing valuable feedback to researchers. This work opens up new avenues for research in generalizable, robust, and more human-like computer vision and in creating datasets where results are predictive of real-world performance.

Journal ArticleDOI
24 Apr 2019-Nature
TL;DR: In this paper, phase-dependent zero-bias conductance peaks were measured by tunnelling spectroscopy at the end of Josephson junctions realized on a heterostructure consisting of aluminium on indium arsenide.
Abstract: Majorana zero modes—quasiparticle states localized at the boundaries of topological superconductors—are expected to be ideal building blocks for fault-tolerant quantum computing1,2. Several observations of zero-bias conductance peaks measured by tunnelling spectroscopy above a critical magnetic field have been reported as experimental indications of Majorana zero modes in superconductor–semiconductor nanowires3–8. On the other hand, two-dimensional systems offer the alternative approach of confining Majorana channels within planar Josephson junctions, in which the phase difference φ between the superconducting leads represents an additional tuning knob that is predicted to drive the system into the topological phase at lower magnetic fields than for a system without phase bias9,10. Here we report the observation of phase-dependent zero-bias conductance peaks measured by tunnelling spectroscopy at the end of Josephson junctions realized on a heterostructure consisting of aluminium on indium arsenide. Biasing the junction to φ ≈ π reduces the critical field at which the zero-bias peak appears, with respect to φ = 0. The phase and magnetic-field dependence of the zero-energy states is consistent with a model of Majorana zero modes in finite-size Josephson junctions. As well as providing experimental evidence of phase-tuned topological superconductivity, our devices are compatible with superconducting quantum electrodynamics architectures11 and are scalable to the complex geometries needed for topological quantum computing9,12,13. Evidence is found for phase-tunable Majorana zero modes in scalable two-dimensional Josephson junctions produced by top-down fabrication.

Journal ArticleDOI
TL;DR: The authors adapt the image prior learned by GANs to the image statistics of an individual image, which lets them accurately reconstruct the input image and synthesize new content consistent with the appearance of the original image.
Abstract: Despite the recent success of GANs in synthesizing images conditioned on inputs such as a user sketch, text, or semantic labels, manipulating the high-level attributes of an existing natural photograph with GANs is challenging for two reasons. First, it is hard for GANs to precisely reproduce an input image. Second, after manipulation, the newly synthesized pixels often do not fit the original image. In this paper, we address these issues by adapting the image prior learned by GANs to image statistics of an individual image. Our method can accurately reconstruct the input image and synthesize new content, consistent with the appearance of the input image. We demonstrate our interactive system on several semantic image editing tasks, including synthesizing new objects consistent with background, removing unwanted objects, and changing the appearance of an object. Quantitative and qualitative comparisons against several existing methods demonstrate the effectiveness of our method.

Proceedings ArticleDOI
07 Jan 2019
TL;DR: In this article, distance metric learning (DML) is applied to object classification, both in the standard regime of rich training data and in the few-shot scenario, where each category is represented by only a few examples.
Abstract: Distance metric learning (DML) has been successfully applied to object classification, both in the standard regime of rich training data and in the few-shot scenario, where each category is represented by only a few examples. In this work, we propose a new method for DML that simultaneously learns the backbone network parameters, the embedding space, and the multi-modal distribution of each of the training categories in that space, in a single end-to-end training process. Our approach outperforms state-of-the-art methods for DML-based object classification on a variety of standard fine-grained datasets. Furthermore, we demonstrate the effectiveness of our approach on the problem of few-shot object detection, by incorporating the proposed DML architecture as a classification head into a standard object detection model. We achieve the best results on the ImageNet-LOC dataset compared to strong baselines, when only a few training examples are available. We also offer the community a new episodic benchmark based on the ImageNet dataset for the few-shot object detection task.

Journal ArticleDOI
17 Jul 2019
TL;DR: A simple yet effective Horizontal Pyramid Matching (HPM) approach to fully exploit various partial information of a given person, so that correct person candidates can still be identified even if some key parts are missing.
Abstract: Despite the remarkable progress in person re-identification (Re-ID), such approaches still suffer from the failure cases where the discriminative body parts are missing. To mitigate this type of failure, we propose a simple yet effective Horizontal Pyramid Matching (HPM) approach to fully exploit various partial information of a given person, so that correct person candidates can be identified even if some key parts are missing. With HPM, we make the following contributions to produce more robust feature representations for the Re-ID task: 1) we learn to classify using partial feature representations at different horizontal pyramid scales, which successfully enhance the discriminative capabilities of various person parts; 2) we exploit average and max pooling strategies to account for person-specific discriminative information in a global-local manner. To validate the effectiveness of our proposed HPM method, extensive experiments are conducted on three popular datasets including Market-1501, DukeMTMC-reID and CUHK03. Respectively, we achieve mAP scores of 83.1%, 74.5% and 59.7% on these challenging benchmarks, which set a new state of the art.
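
The horizontal pyramid pooling at the heart of HPM is easy to sketch on a toy feature map: the map is split into 1, 2, 4, ... horizontal strips, and each strip is summarized by average plus max pooling, so a missing body part corrupts only the strips covering it. Summing the two pooling results below is a simplification of how the paper combines them.

```python
# Toy sketch of Horizontal Pyramid Matching pooling: partition the feature
# map into horizontal strips at several scales and pool each strip, yielding
# 1 + 2 + 4 = 7 part descriptors for scales (1, 2, 4).

def hpm_features(feature_map, scales=(1, 2, 4)):
    h = len(feature_map)
    descriptors = []
    for s in scales:
        strip_h = h // s
        for i in range(s):
            strip = [v for row in feature_map[i * strip_h:(i + 1) * strip_h]
                     for v in row]
            avg = sum(strip) / len(strip)
            descriptors.append(avg + max(strip))  # avg + max pooling, combined
    return descriptors

# 8x4 toy feature map with a smooth vertical gradient.
fmap = [[0.1 * r + 0.01 * c for c in range(4)] for r in range(8)]
feats = hpm_features(fmap)
```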

Journal ArticleDOI
TL;DR: Direct prospective comparison of circulating tumor DNA and tissue biopsy sequencing shows the superiority of liquid biopsies for capturing clinically relevant alterations mediating resistance to targeted therapies in cancer patients.
Abstract: During cancer therapy, tumor heterogeneity can drive the evolution of multiple tumor subclones harboring unique resistance mechanisms in an individual patient1–3. Previous case reports and small case series have suggested that liquid biopsy (specifically, cell-free DNA (cfDNA)) may better capture the heterogeneity of acquired resistance4–8. However, the effectiveness of cfDNA versus standard single-lesion tumor biopsies has not been directly compared in larger-scale prospective cohorts of patients following progression on targeted therapy. Here, in a prospective cohort of 42 patients with molecularly defined gastrointestinal cancers and acquired resistance to targeted therapy, direct comparison of postprogression cfDNA versus tumor biopsy revealed that cfDNA more frequently identified clinically relevant resistance alterations and multiple resistance mechanisms, detecting resistance alterations not found in the matched tumor biopsy in 78% of cases. Whole-exome sequencing of serial cfDNA, tumor biopsies and rapid autopsy specimens elucidated substantial geographic and evolutionary differences across lesions. Our data suggest that acquired resistance is frequently characterized by profound tumor heterogeneity, and that the emergence of multiple resistance alterations in an individual patient may represent the ‘rule’ rather than the ‘exception’. These findings have profound therapeutic implications and highlight the potential advantages of cfDNA over tissue biopsy in the setting of acquired resistance. Direct prospective comparison of circulating tumor DNA and tissue biopsy sequencing shows the superiority of liquid biopsies for capturing clinically relevant alterations mediating resistance to targeted therapies in cancer patients.

Journal ArticleDOI
TL;DR: State-of-the-art machine-learning classifiers outperformed human experts in the diagnosis of pigmented skin lesions and should have a more important role in clinical practice.
Abstract: Summary Background Whether machine-learning algorithms can diagnose all pigmented skin lesions as accurately as human experts is unclear. The aim of this study was to compare the diagnostic accuracy of state-of-the-art machine-learning algorithms with human readers for all clinically relevant types of benign and malignant pigmented skin lesions. Methods For this open, web-based, international, diagnostic study, human readers were asked to diagnose dermatoscopic images selected randomly in 30-image batches from a test set of 1511 images. The diagnoses from human readers were compared with those of 139 algorithms created by 77 machine-learning labs, who participated in the International Skin Imaging Collaboration 2018 challenge and received a training set of 10 015 images in advance. The ground truth of each lesion fell into one of seven predefined disease categories: intraepithelial carcinoma including actinic keratoses and Bowen's disease; basal cell carcinoma; benign keratinocytic lesions including solar lentigo, seborrheic keratosis and lichen planus-like keratosis; dermatofibroma; melanoma; melanocytic nevus; and vascular lesions. The two main outcomes were the differences in the number of correct specific diagnoses per batch between all human readers and the top three algorithms, and between human experts and the top three algorithms. Findings Between Aug 4, 2018, and Sept 30, 2018, 511 human readers from 63 countries had at least one attempt in the reader study. 283 (55·4%) of 511 human readers were board-certified dermatologists, 118 (23·1%) were dermatology residents, and 83 (16·2%) were general practitioners. When comparing all human readers with all machine-learning algorithms, the algorithms achieved a mean of 2·01 (95% CI 1·97 to 2·04; p …). Interpretation State-of-the-art machine-learning classifiers outperformed human experts in the diagnosis of pigmented skin lesions and should have a more important role in clinical practice. However, a possible limitation of these algorithms is their decreased performance for out-of-distribution images, which should be addressed in future research. Funding None.

Journal ArticleDOI
TL;DR: It is proved that the Pockels effect remains strong even in nanoscale devices, and shown as a practical example data modulation up to 50 Gbit s−1.
Abstract: The electro-optical Pockels effect is an essential nonlinear effect used in many applications. The ultrafast modulation of the refractive index is, for example, crucial to optical modulators in photonic circuits. Silicon has emerged as a platform for integrating such compact circuits, but a strong Pockels effect is not available on silicon platforms. Here, we demonstrate a large electro-optical response in silicon photonic devices using barium titanate. We verify the Pockels effect to be the physical origin of the response, with r42 = 923 pm V−1, by confirming key signatures of the Pockels effect in ferroelectrics: the electro-optic response exhibits a crystalline anisotropy, remains strong at high frequencies, and shows hysteresis on changing the electric field. We prove that the Pockels effect remains strong even in nanoscale devices, and show as a practical example data modulation up to 50 Gbit s−1. We foresee that our work will enable novel device concepts with an application area largely extending beyond communication technologies. Electro-optic modulators based on epitaxial barium titanate (BTO) integrated on silicon exhibit speeds up to 50 Gbit s–1 while the Pockels coefficient of the BTO film is found to be approaching the bulk value.
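As background for the electro-optic response reported above, the Pockels effect produces a refractive-index change that is linear in the applied field. For an effective coefficient r_eff (here the measured r42), the textbook relations are (a generic sketch, not the device-specific model used in the paper):

```latex
% Linear electro-optic (Pockels) index change under an applied field E
\Delta n \approx -\tfrac{1}{2}\, n^{3}\, r_{\mathrm{eff}}\, E
% Resulting phase shift over an interaction length L at wavelength \lambda
\Delta\varphi = \frac{2\pi}{\lambda}\, \Delta n \, L
```

The cubic dependence on n and the linearity in E are why a large Pockels coefficient such as the reported 923 pm V−1 translates into compact, low-voltage modulators.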

Journal ArticleDOI
TL;DR: This article reviews the plethora of recent experimental results in this area and discusses the various theoretical models which have been used to describe the observations and summarises the current approaches to solving this fundamentally important problem in solid-state physics.
Abstract: Amorphous solids show surprisingly universal behaviour at low temperatures. The prevailing wisdom is that this can be explained by the existence of two-state defects within the material. The so-called standard tunneling model has become the established framework to explain these results, yet it still leaves the central question essentially unanswered-what are these two-level defects (TLS)? This question has recently taken on a new urgency with the rise of superconducting circuits in quantum computing, circuit quantum electrodynamics, magnetometry, electrometry and metrology. Superconducting circuits made from aluminium or niobium are fundamentally limited by losses due to TLS within the amorphous oxide layers encasing them. On the other hand, these circuits also provide a novel and effective method for studying the very defects which limit their operation. We can now go beyond ensemble measurements and probe individual defects-observing the quantum nature of their dynamics and studying their formation, their behaviour as a function of applied field, strain, temperature and other properties. This article reviews the plethora of recent experimental results in this area and discusses the various theoretical models which have been used to describe the observations. In doing so, it summarises the current approaches to solving this fundamentally important problem in solid-state physics.

Proceedings ArticleDOI
01 Jun 2019
TL;DR: This work proposes a method that dynamically aggregates contextualized embeddings of each unique string encountered, then applies a pooling operation to distill a "global" word representation from all contextualized instances.
Abstract: Contextual string embeddings are a recent type of contextualized word embedding that were shown to yield state-of-the-art results when utilized in a range of sequence labeling tasks. They are based on character-level language models which treat text as distributions over characters and are capable of generating embeddings for any string of characters within any textual context. However, such purely character-based approaches struggle to produce meaningful embeddings if a rare string is used in an underspecified context. To address this drawback, we propose a method in which we dynamically aggregate contextualized embeddings of each unique string that we encounter. We then use a pooling operation to distill a "global" word representation from all contextualized instances. We evaluate these "pooled contextualized embeddings" on common named entity recognition (NER) tasks such as CoNLL-03 and WNUT and show that our approach significantly improves the state-of-the-art for NER. We make all code and pre-trained models available to the research community for use and reproduction.
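The aggregation-and-pooling idea above can be sketched in a few lines. A minimal illustration assuming mean pooling and concatenation with the current contextual embedding (the class name and interface are hypothetical, not the authors' released API):

```python
import numpy as np

class PooledEmbedder:
    """Sketch of 'pooled contextualized embeddings': each time a word is
    seen, its contextual embedding is added to a per-word memory; the
    pooled ('global') vector is the element-wise mean over all instances
    seen so far, concatenated with the current contextual embedding."""

    def __init__(self):
        self.memory = {}  # word -> list of contextual embeddings seen so far

    def embed(self, word, contextual_vec):
        vec = np.asarray(contextual_vec, dtype=float)
        self.memory.setdefault(word, []).append(vec)
        pooled = np.mean(self.memory[word], axis=0)  # mean pooling over instances
        return np.concatenate([vec, pooled])         # local + global representation
```

On the first occurrence of a word the pooled part equals the contextual embedding itself; as more occurrences accumulate, the pooled part stabilizes toward a corpus-level representation of that string.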

Journal ArticleDOI
17 Jul 2019
TL;DR: AutoZOOM, a generic framework for query-efficient black-box attacks, combines an adaptive random gradient estimation strategy to balance query counts and distortion with an autoencoder that is either trained offline with unlabeled data or a bilinear resizing operation for attack acceleration.
Abstract: Recent studies have shown that adversarial examples in state-of-the-art image classifiers trained by deep neural networks (DNN) can be easily generated when the target model is transparent to an attacker, known as the white-box setting. However, when attacking a deployed machine learning service, one can only acquire the input-output correspondences of the target model; this is the so-called black-box attack setting. The major drawback of existing black-box attacks is the need for excessive model queries, which may give a false sense of model robustness due to inefficient query designs. To bridge this gap, we propose a generic framework for query-efficient black-box attacks. Our framework, AutoZOOM, which is short for Autoencoder-based Zeroth Order Optimization Method, has two novel building blocks towards efficient black-box attacks: (i) an adaptive random gradient estimation strategy to balance query counts and distortion, and (ii) an autoencoder that is either trained offline with unlabeled data or a bilinear resizing operation for attack acceleration. Experimental results suggest that, by applying AutoZOOM to a state-of-the-art black-box attack (ZOO), a significant reduction in model queries can be achieved without sacrificing the attack success rate and the visual quality of the resulting adversarial examples. In particular, when compared to the standard ZOO method, AutoZOOM can consistently reduce the mean query counts in finding successful adversarial examples (or reaching the same distortion level) by at least 93% on MNIST, CIFAR-10 and ImageNet datasets, leading to novel insights on adversarial robustness.
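The first building block, random gradient estimation, can be illustrated with a toy zeroth-order estimator. This is a simplified sketch (the scaling factor and sampling scheme here are assumptions, not AutoZOOM's exact adaptive scheme), showing how a gradient can be approximated using only function-value queries:

```python
import numpy as np

def random_gradient_estimate(f, x, q=10, beta=1e-4, rng=None):
    """Zeroth-order gradient estimate in the spirit of averaged random
    gradient estimation: average q one-sided finite differences of f
    along random unit directions, using only queries to f (no gradients)."""
    rng = np.random.default_rng() if rng is None else rng
    d = x.size
    g = np.zeros(d)
    fx = f(x)                                # one query for the base point
    for _ in range(q):
        u = rng.standard_normal(d)
        u /= np.linalg.norm(u)               # random direction on the unit sphere
        g += (f(x + beta * u) - fx) / beta * u  # one query per direction
    return (d / q) * g                       # scale so the estimate is unbiased
```

Each estimate costs q + 1 model queries, which is the quantity AutoZOOM's adaptive strategy and dimension-reduced (autoencoder/bilinear) search space are designed to minimize.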

Proceedings ArticleDOI
25 Jul 2019
TL;DR: This paper introduces a pooling operator based on the graph Fourier transform, which can exploit both node features and local structures during the pooling process, and designs pooling layers based on this operator, which are combined with traditional GCN convolutional layers to form a graph neural network framework for graph classification.
Abstract: Graph neural networks, which generalize deep neural network models to graph-structured data, have attracted increasing attention in recent years. They usually learn node representations by transforming, propagating and aggregating node features and have been proven to improve the performance of many graph-related tasks such as node classification and link prediction. To apply graph neural networks to the graph classification task, approaches to generate the graph representation from node representations are demanded. A common way is to globally combine the node representations. However, rich structural information is overlooked. Thus a hierarchical pooling procedure is desired to preserve the graph structure during the graph representation learning. There are some recent works on hierarchically learning graph representations analogous to the pooling step in conventional convolutional neural networks (CNNs). However, the local structural information is still largely neglected during the pooling process. In this paper, we introduce a pooling operator based on the graph Fourier transform, which can utilize the node features and local structures during the pooling process. We then design pooling layers based on this operator, which are further combined with traditional GCN convolutional layers to form a graph neural network framework for graph classification. Theoretical analysis is provided to understand the pooling operator from both local and global perspectives. Experimental results of the graph classification task on six commonly used benchmarks demonstrate the effectiveness of the proposed framework.
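A minimal global variant of graph-Fourier-transform pooling can be sketched as follows. This is an assumption-laden illustration (the paper's operator pools over local subgraphs, while this sketch projects the whole graph onto its low-frequency Laplacian eigenbasis):

```python
import numpy as np

def graph_fourier_pool(adj, X, k):
    """Pool node features via the graph Fourier transform: project the
    feature matrix X (num_nodes x num_features) onto the k
    lowest-frequency eigenvectors of the graph Laplacian, keeping the
    smooth 'low-pass' component of the signal on the graph."""
    deg = np.diag(adj.sum(axis=1))
    L = deg - adj                          # combinatorial Laplacian
    eigvals, eigvecs = np.linalg.eigh(L)   # eigenvalues in ascending order
    U_k = eigvecs[:, :k]                   # low-frequency Fourier basis
    return U_k.T @ X                       # (k, num_features) pooled signal
```

Because the lowest Laplacian eigenvectors vary slowly across edges, the retained coefficients summarize structure-aware feature information rather than discarding connectivity, which is the motivation the abstract gives for Fourier-based pooling.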

Proceedings ArticleDOI
01 Oct 2019
TL;DR: This work visualizes mode collapse at both the distribution level and the instance level, deploying a semantic segmentation network to compare the distribution of segmented objects in the generated images with the target distribution in the training set.
Abstract: Despite the success of Generative Adversarial Networks (GANs), mode collapse remains a serious issue during GAN training. To date, little work has focused on understanding and quantifying which modes have been dropped by a model. In this work, we visualize mode collapse at both the distribution level and the instance level. First, we deploy a semantic segmentation network to compare the distribution of segmented objects in the generated images with the target distribution in the training set. Differences in statistics reveal object classes that are omitted by a GAN. Second, given the identified omitted object classes, we visualize the GAN's omissions directly. In particular, we compare specific differences between individual photos and their approximate inversions by a GAN. To this end, we relax the problem of inversion and solve the tractable problem of inverting a GAN layer instead of the entire generator. Finally, we use this framework to analyze several recent GANs trained on multiple datasets and identify their typical failure cases.
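The distribution-level check can be sketched by comparing per-class segment frequencies between real and generated data. A minimal illustration assuming the segmentation network's outputs are already available as flat class-label arrays (the function name and return convention are hypothetical):

```python
import numpy as np

def mode_coverage_gap(real_labels, gen_labels, num_classes):
    """Compare per-class frequencies of segmented objects in real vs.
    generated data. Returns gen_freq - real_freq per class; strongly
    negative entries flag object classes the GAN tends to omit
    (i.e., dropped modes)."""
    real_freq = np.bincount(real_labels, minlength=num_classes) / len(real_labels)
    gen_freq = np.bincount(gen_labels, minlength=num_classes) / len(gen_labels)
    return gen_freq - real_freq  # < 0 means the class is under-generated
```

Classes with the most negative gap are then candidates for the instance-level analysis the abstract describes, where individual photos are compared against their approximate GAN inversions.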