
Showing papers by "IBM" published in 2018


Proceedings ArticleDOI
23 Apr 2018
TL;DR: This paper describes Fabric, its architecture, the rationale behind various design decisions, its most prominent implementation aspects, as well as its distributed application programming model, and shows that Fabric achieves end-to-end throughput of more than 3500 transactions per second in certain popular deployment configurations.
Abstract: Fabric is a modular and extensible open-source system for deploying and operating permissioned blockchains and one of the Hyperledger projects hosted by the Linux Foundation (www.hyperledger.org). Fabric is the first truly extensible blockchain system for running distributed applications. It supports modular consensus protocols, which allows the system to be tailored to particular use cases and trust models. Fabric is also the first blockchain system that runs distributed applications written in standard, general-purpose programming languages, without systemic dependency on a native cryptocurrency. This stands in sharp contrast to existing blockchain platforms that require "smart-contracts" to be written in domain-specific languages or rely on a cryptocurrency. Fabric realizes the permissioned model using a portable notion of membership, which may be integrated with industry-standard identity management. To support such flexibility, Fabric introduces an entirely novel blockchain design and revamps the way blockchains cope with non-determinism, resource exhaustion, and performance attacks. This paper describes Fabric, its architecture, the rationale behind various design decisions, its most prominent implementation aspects, as well as its distributed application programming model. We further evaluate Fabric by implementing and benchmarking a Bitcoin-inspired digital currency. We show that Fabric achieves end-to-end throughput of more than 3500 transactions per second in certain popular deployment configurations, with sub-second latency, scaling well to over 100 peers.

2,813 citations


Proceedings ArticleDOI
04 Apr 2018
TL;DR: This paper describes the most recent edition of the dermoscopic image analysis benchmark challenge, organized to support research and development of algorithms for automated diagnosis of melanoma, the most lethal skin cancer.
Abstract: This article describes the design, implementation, and results of the latest installment of the dermoscopic image analysis benchmark challenge. The goal is to support research and development of algorithms for automated diagnosis of melanoma, the most lethal skin cancer. The challenge was divided into 3 tasks: lesion segmentation, feature detection, and disease classification. Participation involved 593 registrations, 81 pre-submissions, 46 finalized submissions (including a 4-page manuscript), and approximately 50 attendees, making this the largest standardized and comparative study in this field to date. While the official challenge duration and ranking of participants has concluded, the dataset snapshots remain available for further research and development.

1,419 citations


Posted ContentDOI
Spyridon Bakas, Mauricio Reyes, Andras Jakab, Stefan Bauer, +435 more (111 institutions)
TL;DR: This study assesses the state-of-the-art machine learning methods used for brain tumor image analysis in mpMRI scans, during the last seven instances of the International Brain Tumor Segmentation (BraTS) challenge, i.e., 2012-2018, and investigates the challenge of identifying the best ML algorithms for each of these tasks.
Abstract: Gliomas are the most common primary brain malignancies, with different degrees of aggressiveness, variable prognosis and various heterogeneous histologic sub-regions, i.e., peritumoral edematous/invaded tissue, necrotic core, active and non-enhancing core. This intrinsic heterogeneity is also portrayed in their radio-phenotype, as their sub-regions are depicted by varying intensity profiles disseminated across multi-parametric magnetic resonance imaging (mpMRI) scans, reflecting varying biological properties. Their heterogeneous shape, extent, and location are some of the factors that make these tumors difficult to resect, and in some cases inoperable. The amount of resected tumor is a factor also considered in longitudinal scans, when evaluating the apparent tumor for potential diagnosis of progression. Furthermore, there is mounting evidence that accurate segmentation of the various tumor sub-regions can offer the basis for quantitative image analysis towards prediction of patient overall survival. This study assesses the state-of-the-art machine learning (ML) methods used for brain tumor image analysis in mpMRI scans, during the last seven instances of the International Brain Tumor Segmentation (BraTS) challenge, i.e., 2012-2018. Specifically, we focus on i) evaluating segmentations of the various glioma sub-regions in pre-operative mpMRI scans, ii) assessing potential tumor progression by virtue of longitudinal growth of tumor sub-regions, beyond use of the RECIST/RANO criteria, and iii) predicting the overall survival from pre-operative mpMRI scans of patients that underwent gross total resection. Finally, we investigate the challenge of identifying the best ML algorithms for each of these tasks, considering that apart from being diverse on each instance of the challenge, the multi-institutional mpMRI BraTS dataset has also been a continuously evolving/growing dataset.

1,165 citations


Proceedings Article
01 Aug 2018
TL;DR: This paper proposes to leverage the internal states of a trained character language model to produce a novel type of word embedding, referred to as contextual string embeddings, which fundamentally model words as sequences of characters and are contextualized by their surrounding text.
Abstract: Recent advances in language modeling using recurrent neural networks have made it viable to model language as distributions over characters. By learning to predict the next character on the basis of previous characters, such models have been shown to automatically internalize linguistic concepts such as words, sentences, subclauses and even sentiment. In this paper, we propose to leverage the internal states of a trained character language model to produce a novel type of word embedding which we refer to as contextual string embeddings. Our proposed embeddings have the distinct properties that they (a) are trained without any explicit notion of words and thus fundamentally model words as sequences of characters, and (b) are contextualized by their surrounding text, meaning that the same word will have different embeddings depending on its contextual use. We conduct a comparative evaluation against previous embeddings and find that our embeddings are highly useful for downstream tasks: across four classic sequence labeling tasks we consistently outperform the previous state-of-the-art. In particular, we significantly outperform previous work on English and German named entity recognition (NER), allowing us to report new state-of-the-art F1-scores on the CoNLL03 shared task. We release all code and pre-trained language models in a simple-to-use framework to the research community, to enable reproduction of these experiments and application of our proposed embeddings to other tasks: https://github.com/zalandoresearch/flair
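The core idea is easy to sketch: run a character-level recurrent model over the raw text and read off the hidden state after each word's last character as that word's forward contextual embedding. The tiny numpy RNN below uses random weights purely as a stand-in for the trained character language model, so the vectors are illustrative rather than meaningful:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical random weights stand in for a trained character LSTM.
VOCAB = {c: i for i, c in enumerate("abcdefghijklmnopqrstuvwxyz .")}
EMB, HID = 8, 16
E = rng.normal(0, 0.1, (len(VOCAB), EMB))   # character embeddings
Wx = rng.normal(0, 0.1, (HID, EMB))         # input-to-hidden weights
Wh = rng.normal(0, 0.1, (HID, HID))         # hidden-to-hidden weights

def char_states(text):
    """Run the character RNN, returning one hidden state per character."""
    h, states = np.zeros(HID), []
    for c in text:
        h = np.tanh(Wx @ E[VOCAB[c]] + Wh @ h)
        states.append(h)
    return states

def contextual_embeddings(sentence):
    """Forward contextual string embedding of each word: the hidden
    state right after that word's last character."""
    states = char_states(sentence)
    embs, pos = {}, 0
    for word in sentence.split(" "):
        end = pos + len(word) - 1       # index of the word's last char
        embs[word] = states[end]
        pos += len(word) + 1            # skip the following space
    return embs

# The same surface word gets different vectors in different contexts.
e1 = contextual_embeddings("the bank of the river")["bank"]
e2 = contextual_embeddings("cash at the bank today")["bank"]
print(np.allclose(e1, e2))  # False: the embedding depends on context
```

Because the state is accumulated character by character, the model needs no explicit notion of words, which is exactly the property the abstract highlights.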

1,152 citations


Journal ArticleDOI
TL;DR: The core opportunities and risks of AI for society are introduced; a synthesis of five ethical principles that should undergird its development and adoption are presented; and 20 concrete recommendations are offered to serve as a firm foundation for the establishment of a Good AI Society.
Abstract: This article reports the findings of AI4People, an Atomium—EISMD initiative designed to lay the foundations for a “Good AI Society”. We introduce the core opportunities and risks of AI for society; present a synthesis of five ethical principles that should undergird its development and adoption; and offer 20 concrete recommendations—to assess, to develop, to incentivise, and to support good AI—which in some cases may be undertaken directly by national or supranational policy makers, while in others may be led by other stakeholders. If adopted, these recommendations would serve as a firm foundation for the establishment of a Good AI Society.

855 citations


Posted Content
Jie Chen, Tengfei Ma, Cao Xiao
TL;DR: Enhanced with importance sampling, FastGCN not only is efficient for training but also generalizes well for inference, and is orders of magnitude more efficient while predictions remain comparably accurate.
Abstract: The graph convolutional networks (GCN) recently proposed by Kipf and Welling are an effective graph model for semi-supervised learning. This model, however, was originally designed to be learned with the presence of both training and test data. Moreover, the recursive neighborhood expansion across layers poses time and memory challenges for training with large, dense graphs. To relax the requirement of simultaneous availability of test data, we interpret graph convolutions as integral transforms of embedding functions under probability measures. Such an interpretation allows for the use of Monte Carlo approaches to consistently estimate the integrals, which in turn leads to a batched training scheme as we propose in this work---FastGCN. Enhanced with importance sampling, FastGCN not only is efficient for training but also generalizes well for inference. We show a comprehensive set of experiments to demonstrate its effectiveness compared with GCN and related models. In particular, training is orders of magnitude more efficient while predictions remain comparably accurate.
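The batched estimator can be illustrated with a small numpy sketch: sample t nodes per layer from an importance distribution proportional to the squared column norms of the normalized adjacency, then reweight their contributions so the estimate is unbiased. The graph, features, and weights below are random placeholders:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy graph: normalized adjacency A (n x n), features H (n x d).
n, d, t = 200, 16, 20                     # nodes, feature dim, sample size
A = rng.random((n, n)) * (rng.random((n, n)) < 0.05)
A /= A.sum(1, keepdims=True) + 1e-9       # crude row normalization
H = rng.normal(size=(n, d))
W = rng.normal(size=(d, d))

# Importance distribution: q(u) proportional to the squared column
# norm of A, so influential neighbors are sampled more often.
q = np.linalg.norm(A, axis=0) ** 2
q /= q.sum()

def layer_exact(A, H, W):
    """Full GCN layer with ReLU: touches every node."""
    return np.maximum(A @ H @ W, 0.0)

def layer_sampled(A, H, W, t):
    """Monte Carlo estimate of A @ H @ W from t sampled columns."""
    u = rng.choice(len(q), size=t, p=q)
    # Importance-weighted sum over only the sampled nodes.
    Z = (A[:, u] / (t * q[u])) @ (H[u] @ W)
    return np.maximum(Z, 0.0)

full = layer_exact(A, H, W)
est = layer_sampled(A, H, W, t)
print(full.shape == est.shape)   # True: same output shape, far less work
```

The sampled layer multiplies a (n, t) slice instead of the full (n, n) adjacency, which is where the claimed order-of-magnitude training speedup comes from.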

786 citations


Journal ArticleDOI
02 Feb 2018-Science
TL;DR: The HLA-I genotype of 1535 advanced cancer patients treated with immune checkpoint blockade is determined, and maximal heterozygosity at HLA-I loci improved overall survival after ICB compared with patients who were homozygous for at least one HLA locus.
Abstract: CD8 + T cell–dependent killing of cancer cells requires efficient presentation of tumor antigens by human leukocyte antigen class I (HLA-I) molecules. However, the extent to which patient-specific HLA-I genotype influences response to anti–programmed cell death protein 1 or anti–cytotoxic T lymphocyte–associated protein 4 is currently unknown. We determined the HLA-I genotype of 1535 advanced cancer patients treated with immune checkpoint blockade (ICB). Maximal heterozygosity at HLA-I loci (“A,” “B,” and “C”) improved overall survival after ICB compared with patients who were homozygous for at least one HLA locus. In two independent melanoma cohorts, patients with the HLA-B44 supertype had extended survival, whereas the HLA-B62 supertype (including HLA-B*15:01) or somatic loss of heterozygosity at HLA-I was associated with poor outcome. Molecular dynamics simulations of HLA-B*15:01 revealed different elements that may impair CD8 + T cell recognition of neoantigens. Our results have important implications for predicting response to ICB and for the design of neoantigen-based therapeutic vaccines.

739 citations


Journal ArticleDOI
06 Jun 2018-Nature
TL;DR: Mixed hardware–software neural-network implementations that involve up to 204,900 synapses and that combine long-term storage in phase-change memory, near-linear updates of volatile capacitors and weight-data transfer with ‘polarity inversion’ to cancel out inherent device-to-device variations are demonstrated.
Abstract: Neural-network training can be slow and energy intensive, owing to the need to transfer the weight data for the network between conventional digital memory chips and processor chips. Analogue non-volatile memory can accelerate the neural-network training algorithm known as backpropagation by performing parallelized multiply-accumulate operations in the analogue domain at the location of the weight data. However, the classification accuracies of such in situ training using non-volatile-memory hardware have generally been less than those of software-based training, owing to insufficient dynamic range and excessive weight-update asymmetry. Here we demonstrate mixed hardware-software neural-network implementations that involve up to 204,900 synapses and that combine long-term storage in phase-change memory, near-linear updates of volatile capacitors and weight-data transfer with 'polarity inversion' to cancel out inherent device-to-device variations. We achieve generalization accuracies (on previously unseen data) equivalent to those of software-based training on various commonly used machine-learning test datasets (MNIST, MNIST-backrand, CIFAR-10 and CIFAR-100). The computational energy efficiency of 28,065 billion operations per second per watt and throughput per area of 3.6 trillion operations per second per square millimetre that we calculate for our implementation exceed those of today's graphical processing units by two orders of magnitude. This work provides a path towards hardware accelerators that are both fast and energy efficient, particularly on fully connected neural-network layers.

693 citations


Journal ArticleDOI
10 Jan 2018-Nature
TL;DR: In this article, it was shown that the lowest exciton in caesium lead halide perovskites (CsPbX_3, with X = Cl, Br or I) involves a highly emissive triplet state.
Abstract: Nanostructured semiconductors emit light from electronic states known as excitons. For organic materials, Hund’s rules state that the lowest-energy exciton is a poorly emitting triplet state. For inorganic semiconductors, similar rules predict an analogue of this triplet state known as the ‘dark exciton’. Because dark excitons release photons slowly, hindering emission from inorganic nanostructures, materials that disobey these rules have been sought. However, despite considerable experimental and theoretical efforts, no inorganic semiconductors have been identified in which the lowest exciton is bright. Here we show that the lowest exciton in caesium lead halide perovskites (CsPbX_3, with X = Cl, Br or I) involves a highly emissive triplet state. We first use an effective-mass model and group theory to demonstrate the possibility of such a state existing, which can occur when the strong spin–orbit coupling in the conduction band of a perovskite is combined with the Rashba effect. We then apply our model to CsPbX_3 nanocrystals, and measure size- and composition-dependent fluorescence at the single-nanocrystal level. The bright triplet character of the lowest exciton explains the anomalous photon-emission rates of these materials, which emit about 20 and 1,000 times faster than any other semiconductor nanocrystal at room and cryogenic temperatures, respectively. The existence of this bright triplet exciton is further confirmed by analysis of the fine structure in low-temperature fluorescence spectra. For semiconductor nanocrystals, which are already used in lighting, lasers and displays, these excitons could lead to materials with brighter emission. More generally, our results provide criteria for identifying other semiconductors that exhibit bright excitons, with potential implications for optoelectronic devices.

661 citations


Proceedings ArticleDOI
01 Jun 2018
TL;DR: This paper proposes the Neuron Importance Score Propagation (NISP) algorithm to propagate the importance scores of final responses to every neuron in the network.
Abstract: To reduce the significant redundancy in deep Convolutional Neural Networks (CNNs), most existing methods prune neurons by only considering the statistics of an individual layer or two consecutive layers (e.g., prune one layer to minimize the reconstruction error of the next layer), ignoring the effect of error propagation in deep networks. In contrast, we argue that for a pruned network to retain its predictive power, it is essential to prune neurons in the entire neuron network jointly based on a unified goal: minimizing the reconstruction error of important responses in the "final response layer" (FRL), which is the second-to-last layer before classification. Specifically, we apply feature ranking techniques to measure the importance of each neuron in the FRL, formulate network pruning as a binary integer optimization problem, and derive a closed-form solution to it for pruning neurons in earlier layers. Based on our theoretical analysis, we propose the Neuron Importance Score Propagation (NISP) algorithm to propagate the importance scores of final responses to every neuron in the network. The CNN is pruned by removing neurons with least importance, and it is then fine-tuned to recover its predictive power. NISP is evaluated on several datasets with multiple CNN models and demonstrated to achieve significant acceleration and compression with negligible accuracy loss.
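The propagation rule itself is simple: a neuron's importance is the absolute-weight-weighted sum of the importances of the neurons it feeds. Below is a minimal numpy sketch on a hypothetical two-weight-matrix MLP, with uniform final-response scores standing in for the paper's feature-ranking step:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical 3-layer MLP weights; the idea applies to any feed-forward net.
W1 = rng.normal(size=(64, 32))   # layer 1 -> layer 2
W2 = rng.normal(size=(32, 16))   # layer 2 -> final response layer (FRL)

# Step 1: importance of each FRL neuron (the paper uses feature ranking;
# a uniform score stands in here).
s_frl = np.ones(16)

# Step 2: NISP-style propagation: importance flows backward through
# absolute weights, so error propagation is accounted for jointly.
s2 = np.abs(W2) @ s_frl          # scores for the 32 hidden neurons
s1 = np.abs(W1) @ s2             # scores for the 64 input-layer neurons

# Step 3: prune the least important neurons in the earlier layer.
keep = np.argsort(s1)[-32:]      # keep the top half of layer-1 neurons
print(len(keep))                 # 32
```

After pruning, the network would be fine-tuned to recover accuracy, as the abstract describes; the sketch only covers the scoring and selection step.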

658 citations


Posted Content
TL;DR: It is shown, for the first time, that both weights and activations can be quantized to 4-bits of precision while still achieving accuracy comparable to full precision networks across a range of popular models and datasets.
Abstract: Deep learning algorithms achieve high classification accuracy at the expense of significant computation cost. To address this cost, a number of quantization schemes have been proposed - but most of these techniques focused on quantizing weights, which are relatively smaller in size compared to activations. This paper proposes a novel quantization scheme for activations during training - that enables neural networks to work well with ultra low precision weights and activations without any significant accuracy degradation. This technique, PArameterized Clipping acTivation (PACT), uses an activation clipping parameter $\alpha$ that is optimized during training to find the right quantization scale. PACT allows quantizing activations to arbitrary bit precisions, while achieving much better accuracy relative to published state-of-the-art quantization schemes. We show, for the first time, that both weights and activations can be quantized to 4-bits of precision while still achieving accuracy comparable to full precision networks across a range of popular models and datasets. We also show that exploiting these reduced-precision computational units in hardware can enable a super-linear improvement in inferencing performance due to a significant reduction in the area of accelerator compute engines coupled with the ability to retain the quantized model and activation data in on-chip memories.
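The PACT forward pass is easy to sketch: clip each activation to the learned range [0, α], then quantize uniformly to k bits. A minimal numpy version follows (α is fixed here, whereas in the paper it is a trained parameter):

```python
import numpy as np

def pact(x, alpha, k):
    """PACT forward pass: clip activations to [0, alpha], then quantize
    uniformly to k bits (alpha is learned during training in the paper)."""
    y = np.clip(x, 0.0, alpha)             # parameterized clipping
    scale = (2 ** k - 1) / alpha
    return np.round(y * scale) / scale     # k-bit uniform quantization

x = np.array([-1.0, 0.2, 0.6, 2.0])
print(pact(x, alpha=1.0, k=2))   # maps onto the four 2-bit levels 0, 1/3, 2/3, 1
```

Learning α matters because it trades clipping error against quantization resolution: a smaller α clips more large activations but makes the k-bit levels finer.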

Journal ArticleDOI
TL;DR: This review provides a system-level analysis of sustainable polymers and outlines key criteria with respect to the feedstocks the polymers are derived from, the manner in which the polymers are generated, and the end-of-use options.
Abstract: The replacement of current petroleum-based plastics with sustainable alternatives is a crucial but formidable challenge for the modern society. Catalysis presents an enabling tool to facilitate the development of sustainable polymers. This review provides a system-level analysis of sustainable polymers and outlines key criteria with respect to the feedstocks the polymers are derived from, the manner in which the polymers are generated, and the end-of-use options. Specifically, we define sustainable polymers as a class of materials that are derived from renewable feedstocks and exhibit closed-loop life cycles. Among potential candidates, aliphatic polyesters and polycarbonates are promising materials due to their renewable resources and excellent biodegradability. The development of renewable monomers, the versatile synthetic routes to convert these monomers to polyesters and polycarbonate, and the different end-of-use options for these polymers are critically reviewed, with a focus on recent advances in c...

Journal ArticleDOI
James R. Lewis
TL;DR: The System Usability Scale (SUS) is the most widely used standardized questionnaire for the assessment of perceived usability, and it is likely that the SUS will continue to be a popular measure of perceived usability for the foreseeable future.
Abstract: The System Usability Scale (SUS) is the most widely used standardized questionnaire for the assessment of perceived usability. This review of the SUS covers its early history from inception in the ...

Proceedings ArticleDOI
01 Jan 2018
TL;DR: This paper used a population-based optimization algorithm to generate semantically and syntactically similar adversarial examples that fool well-trained sentiment analysis and textual entailment models with success rates of 97% and 70%, respectively.
Abstract: Deep neural networks (DNNs) are vulnerable to adversarial examples, perturbations to correctly classified examples which can cause the model to misclassify. In the image domain, these perturbations can often be made virtually indistinguishable to human perception, causing humans and state-of-the-art models to disagree. However, in the natural language domain, small perturbations are clearly perceptible, and the replacement of a single word can drastically alter the semantics of the document. Given these challenges, we use a black-box population-based optimization algorithm to generate semantically and syntactically similar adversarial examples that fool well-trained sentiment analysis and textual entailment models with success rates of 97% and 70%, respectively. We additionally demonstrate that 92.3% of the successful sentiment analysis adversarial examples are classified to their original label by 20 human annotators, and that the examples are perceptibly quite similar. Finally, we discuss an attempt to use adversarial training as a defense, but fail to yield improvement, demonstrating the strength and diversity of our adversarial examples. We hope our findings encourage researchers to pursue improving the robustness of DNNs in the natural language domain.
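The black-box search can be sketched as a toy genetic algorithm: keep a population of synonym-swapped candidates, select the ones that most reduce the model's score, and mutate the survivors. The word-count "model" and synonym table below are hypothetical stand-ins for the black-box classifier and the embedding-based substitution set the paper uses:

```python
import random

random.seed(0)

# Toy sentiment "model": a stand-in for the black-box DNN under attack.
POSITIVE = {"good", "great", "terrific", "fine", "superb"}
def score(words):                  # higher = predicted more positive
    return sum(w in POSITIVE for w in words)

# Hypothetical synonym table (the paper uses embedding nearest neighbors,
# filtered to keep swaps semantically and syntactically plausible).
SYNONYMS = {"good": ["fine", "decent", "ok"], "movie": ["film", "picture"],
            "plot": ["story", "storyline"]}

def mutate(words):
    """Swap one randomly chosen word for one of its synonyms, if any."""
    w = list(words)
    i = random.randrange(len(w))
    if w[i] in SYNONYMS:
        w[i] = random.choice(SYNONYMS[w[i]])
    return w

def attack(sentence, generations=50, pop_size=8):
    """Population-based search for a synonym-swapped sentence that
    lowers the model's score (i.e., flips the predicted sentiment)."""
    base = sentence.split()
    pop = [mutate(base) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score)                  # lower score = fitter here
        if score(pop[0]) < score(base):
            return " ".join(pop[0])          # adversarial example found
        survivors = pop[: pop_size // 2]
        pop = survivors + [mutate(random.choice(survivors))
                           for _ in range(pop_size - len(survivors))]
    return None

print(attack("the movie had a good plot"))
```

Only the model's scores are queried, never its gradients, which is what makes the attack black-box.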

Journal ArticleDOI
19 Jun 2018
TL;DR: In this article, a general description of variational algorithms is provided and the mapping from fermions to qubits is explained, and simple error-mitigation schemes are introduced that could improve the accuracy of determining ground-state energies.
Abstract: Universal fault-tolerant quantum computers will require error-free execution of long sequences of quantum gate operations, which is expected to involve millions of physical qubits. Before the full power of such machines will be available, near-term quantum devices will provide several hundred qubits and limited error correction. Still, there is a realistic prospect to run useful algorithms within the limited circuit depth of such devices. Particularly promising are optimization algorithms that follow a hybrid approach: the aim is to steer a highly entangled state on a quantum system to a target state that minimizes a cost function via variation of some gate parameters. This variational approach can be used both for classical optimization problems as well as for problems in quantum chemistry. The challenge is to converge to the target state given the limited coherence time and connectivity of the qubits. In this context, the quantum volume as a metric to compare the power of near-term quantum devices is discussed. With focus on chemistry applications, a general description of variational algorithms is provided and the mapping from fermions to qubits is explained. Coupled-cluster and heuristic trial wave-functions are considered for efficiently finding molecular ground states. Furthermore, simple error-mitigation schemes are introduced that could improve the accuracy of determining ground-state energies. Advancing these techniques may lead to near-term demonstrations of useful quantum computation with systems containing several hundred qubits.
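The hybrid loop is easy to illustrate on a single qubit: a classical optimizer varies a gate parameter θ to minimize the energy ⟨ψ(θ)|H|ψ(θ)⟩. A minimal numpy sketch with H = Pauli-Z and an Ry ansatz (the state-vector math here stands in for the circuit a quantum device would execute):

```python
import numpy as np

# Single-qubit toy Hamiltonian (Pauli Z); its ground-state energy is -1.
H = np.array([[1.0, 0.0], [0.0, -1.0]])

def ansatz(theta):
    """Trial state Ry(theta)|0> = [cos(theta/2), sin(theta/2)]."""
    return np.array([np.cos(theta / 2), np.sin(theta / 2)])

def energy(theta):
    """Cost function <psi(theta)| H |psi(theta)> = cos(theta)."""
    psi = ansatz(theta)
    return psi @ H @ psi

# Classical outer loop: vary the gate parameter to minimize the energy.
thetas = np.linspace(0, 2 * np.pi, 201)
best = thetas[np.argmin([energy(t) for t in thetas])]
print(round(energy(best), 6))   # -1.0 at theta = pi: the exact ground state
```

On real hardware the energy would be estimated from repeated measurements of a few-hundred-qubit circuit rather than computed exactly, but the classical variational loop has the same shape.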

Proceedings ArticleDOI
01 Jan 2018
TL;DR: This work presents ZEUS—a framework to verify the correctness and validate the fairness of smart contracts, which leverages both abstract interpretation and symbolic model checking, along with the power of constrained Horn clauses to quickly verify contracts for safety.
Abstract: A smart contract is hard to patch for bugs once it is deployed, irrespective of the money it holds. A recent bug caused losses worth around $50 million of cryptocurrency. We present ZEUS—a framework to verify the correctness and validate the fairness of smart contracts. We consider correctness as adherence to safe programming practices, while fairness is adherence to agreed upon higher-level business logic. ZEUS leverages both abstract interpretation and symbolic model checking, along with the power of constrained Horn clauses to quickly verify contracts for safety. We have built a prototype of ZEUS for Ethereum and Fabric blockchain platforms, and evaluated it with over 22.4K smart contracts. Our evaluation indicates that about 94.6% of contracts (containing cryptocurrency worth more than $0.5 billion) are vulnerable. ZEUS is sound with zero false negatives and has a low false positive rate, with an order of magnitude improvement in analysis time as compared to prior art.

Journal ArticleDOI
TL;DR: A multi-memristive synaptic architecture with an efficient global counter-based arbitration scheme to address challenges associated with the non-ideal memristive device behavior is proposed.
Abstract: Neuromorphic computing has emerged as a promising avenue towards building the next generation of intelligent computing systems. It has been proposed that memristive devices, which exhibit history-dependent conductivity modulation, could efficiently represent the synaptic weights in artificial neural networks. However, precise modulation of the device conductance over a wide dynamic range, necessary to maintain high network accuracy, is proving to be challenging. To address this, we present a multi-memristive synaptic architecture with an efficient global counter-based arbitration scheme. We focus on phase change memory devices, develop a comprehensive model and demonstrate via simulations the effectiveness of the concept for both spiking and non-spiking neural networks. Moreover, we present experimental results involving over a million phase change memory devices for unsupervised learning of temporal correlations using a spiking neural network. The work presents a significant step towards the realization of large-scale and energy-efficient neuromorphic computing systems.
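The counter-based arbitration can be sketched in a few lines: each synaptic weight is the sum of N device conductances, and a single global counter decides which device absorbs the next update before advancing. The class below is an illustrative model, not the paper's device-level implementation:

```python
import numpy as np

class MultiMemristiveSynapse:
    """One synaptic weight stored across N devices; a global counter
    picks which device receives each update (arbitration sketch)."""
    def __init__(self, n_devices, n_synapses):
        self.g = np.zeros((n_synapses, n_devices))   # device conductances
        self.counter = 0
        self.n_devices = n_devices

    def weight(self):
        """Effective weight = sum of the device conductances."""
        return self.g.sum(axis=1)

    def update(self, synapse_idx, delta):
        # The global counter selects one device slot per update and then
        # advances, spreading writes evenly across devices; in hardware
        # this averages out device-to-device variability and wear.
        self.g[synapse_idx, self.counter] += delta
        self.counter = (self.counter + 1) % self.n_devices

syn = MultiMemristiveSynapse(n_devices=4, n_synapses=3)
for _ in range(8):
    syn.update(0, 0.1)             # 8 updates cycle twice through 4 devices
print(np.round(syn.weight(), 3))   # synapse 0 has accumulated 0.8
```

Summing several devices also extends the effective dynamic range of the weight beyond what a single phase-change device can offer, which is the problem the abstract sets out.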

Proceedings ArticleDOI
01 Jun 2018
TL;DR: In this article, a generic classification network equipped with convolutional blocks of different dilated rates was designed to produce dense and reliable object localization maps and effectively benefit both weakly and semi-supervised semantic segmentation.
Abstract: Despite the remarkable progress, weakly supervised segmentation approaches are still inferior to their fully supervised counterparts. We observe that the performance gap mainly comes from their limitation on learning to produce high-quality dense object localization maps from image-level supervision. To mitigate such a gap, we revisit the dilated convolution [1] and reveal how it can be utilized in a novel way to effectively overcome this critical limitation of weakly supervised segmentation approaches. Specifically, we find that varying dilation rates can effectively enlarge the receptive fields of convolutional kernels and more importantly transfer the surrounding discriminative information to non-discriminative object regions, promoting the emergence of these regions in the object localization maps. Then, we design a generic classification network equipped with convolutional blocks of different dilated rates. It can produce dense and reliable object localization maps and effectively benefit both weakly- and semi- supervised semantic segmentation. Despite the apparent simplicity, our proposed approach obtains superior performance over state-of-the-arts. In particular, it achieves 60.8% and 67.6% mIoU scores on Pascal VOC 2012 test set in weakly- (only image-level labels are available) and semi- (1,464 segmentation masks are available) supervised settings, which are the new state-of-the-arts.
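The receptive-field effect is easy to verify in one dimension: with a fixed 3-tap kernel, spacing the taps `rate` apart grows the receptive field from 3 to 5 to 9 samples without adding a single parameter. A minimal numpy sketch:

```python
import numpy as np

def dilated_conv1d(x, w, rate):
    """1-D convolution with dilation: kernel taps are spaced `rate`
    apart, enlarging the receptive field without extra parameters."""
    k = len(w)
    span = (k - 1) * rate + 1                    # receptive field size
    out = np.zeros(len(x) - span + 1)
    for i in range(len(out)):
        out[i] = sum(w[j] * x[i + j * rate] for j in range(k))
    return out, span

x = np.arange(12, dtype=float)
w = np.array([1.0, 1.0, 1.0])                    # fixed 3-tap kernel
for rate in (1, 2, 4):
    _, span = dilated_conv1d(x, w, rate)
    print(rate, span)   # receptive field grows: 3, 5, 9
```

In the paper this widened view is what lets discriminative regions pass activation to surrounding non-discriminative object regions in the localization maps.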

Journal ArticleDOI
TL;DR: A systematic review of deep learning models for electronic health record (EHR) data is conducted, and various deep learning architectures for analyzing different data sources and their target applications are illustrated.

Journal ArticleDOI
26 Feb 2018
TL;DR: In this paper, the challenges and opportunities of blockchain for business process management (BPM) are outlined and a summary of seven research directions for investigating the application of blockchain technology in the context of BPM are presented.
Abstract: Blockchain technology offers a sizable promise to rethink the way interorganizational business processes are managed because of its potential to realize execution without a central party serving as a single point of trust (and failure). To stimulate research on this promise and the limits thereof, in this article, we outline the challenges and opportunities of blockchain for business process management (BPM). We first reflect how blockchains could be used in the context of the established BPM lifecycle and second how they might become relevant beyond. We conclude our discourse with a summary of seven research directions for investigating the application of blockchain technology in the context of BPM.

Journal ArticleDOI
TL;DR: It is shown that the top face-verification results from the Labeled Faces in the Wild data set were obtained with networks containing hundreds of millions of parameters, using a mix of convolutional, locally connected, and fully connected layers.
Abstract: In recent years, deep neural networks (DNNs) have received increased attention, have been applied to different applications, and achieved dramatic accuracy improvements in many tasks. These works rely on deep networks with millions or even billions of parameters, and the availability of graphics processing units (GPUs) with very high computation capability plays a key role in their success. For example, Krizhevsky et al. [1] achieved breakthrough results in the 2012 ImageNet Challenge using a network containing 60 million parameters with five convolutional layers and three fully connected layers. Usually, it takes two to three days to train the whole model on the ImageNet data set with an NVIDIA K40 machine. In another example, the top face-verification results from the Labeled Faces in the Wild (LFW) data set were obtained with networks containing hundreds of millions of parameters, using a mix of convolutional, locally connected, and fully connected layers [2], [3]. It is also very time-consuming to train such a model to obtain a reasonable performance. In architectures that only rely on fully connected layers, the number of parameters can grow to billions [4].
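The parameter counts quoted above follow from simple per-layer arithmetic: a k x k convolution holds k·k·c_in·c_out weights plus biases, while a fully connected layer holds one weight per input-output pair. The AlexNet-style layer sizes below are the commonly cited ones and show why fully connected layers dominate the total:

```python
def conv_params(k, c_in, c_out):
    """Weights of a k x k convolution plus one bias per output channel."""
    return k * k * c_in * c_out + c_out

def fc_params(n_in, n_out):
    """Fully connected layer: one weight per input-output pair, plus biases."""
    return n_in * n_out + n_out

# Illustrative AlexNet-style layer sizes (the commonly cited ones):
conv1 = conv_params(11, 3, 96)        # first convolutional layer
fc6 = fc_params(256 * 6 * 6, 4096)    # first fully connected layer
fc7 = fc_params(4096, 4096)           # second fully connected layer

print(f"{conv1:,}")   # 34,944
print(f"{fc6:,}")     # 37,752,832
print(f"{fc7:,}")     # 16,781,312
```

A single fully connected layer here carries roughly a thousand times the parameters of the first convolution, which is why fully-connected-only architectures can reach billions of parameters.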

Journal ArticleDOI
TL;DR: In this paper, the authors summarized the recent advances, challenges, and prospects of both fundamental and applied aspects of stress in thin films and engineering coatings and systems, based on recent achievements presented during the 2016 Stress Workshop entitled “Stress Evolution in Thin Films and Coatings: from Fundamental Understanding to Control.
Abstract: The issue of stress in thin films and functional coatings is a persistent problem in materials science and technology that has congregated many efforts, both from experimental and fundamental points of view, to get a better understanding on how to deal with, how to tailor, and how to manage stress in many areas of applications. With the miniaturization of device components, the quest for increasingly complex film architectures and multiphase systems and the continuous demands for enhanced performance, there is a need toward the reliable assessment of stress on a submicron scale from spatially resolved techniques. Also, the stress evolution during film and coating synthesis using physical vapor deposition (PVD), chemical vapor deposition, plasma enhanced chemical vapor deposition (PECVD), and related processes is the result of many interrelated factors and competing stress sources so that the task to provide a unified picture and a comprehensive model from the vast amount of stress data remains very challenging. This article summarizes the recent advances, challenges, and prospects of both fundamental and applied aspects of stress in thin films and engineering coatings and systems, based on recent achievements presented during the 2016 Stress Workshop entitled “Stress Evolution in Thin Films and Coatings: from Fundamental Understanding to Control.” Evaluation methods, implying wafer curvature, x-ray diffraction, or focused ion beam removal techniques, are reviewed. Selected examples of stress evolution in elemental and alloyed systems, graded layers, and multilayer-stacks as well as amorphous films deposited using a variety of PVD and PECVD techniques are highlighted. Based on mechanisms uncovered by in situ and real-time diagnostics, a kinetic model is outlined that is capable of reproducing the dependence of intrinsic (growth) stress on the grain size, growth rate, and deposited energy. The problems and solutions related to stress in the context of optical coatings, inorganic coatings on plastic substrates, and tribological coatings for aerospace applications are critically examined. This review also suggests strategies to mitigate excessive stress levels from novel coating synthesis perspectives to microstructural design approaches, including the ability to empower crack-based fabrication processes, pathways leading to stress relaxation and compensation, as well as management of the film and coating growth conditions with respect to energetic ion bombardment. Future opportunities and challenges for stress engineering and stress modeling are considered and outlined.
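The wafer-curvature method listed among the evaluation techniques is conventionally interpreted through the Stoney equation, which converts a measured curvature change into an average film stress. A minimal sketch with illustrative silicon-substrate values (the numbers are mine for demonstration, not taken from the article):

```python
def stoney_stress(E_s, nu_s, t_s, t_f, R):
    """Average film stress (Pa) from substrate curvature via the Stoney
    equation: sigma_f = E_s * t_s**2 / (6 * (1 - nu_s) * t_f * R),
    where E_s / (1 - nu_s) is the substrate biaxial modulus, t_s and t_f
    are substrate and film thicknesses (m), and R is the curvature
    radius (m) induced by the film."""
    return E_s * t_s ** 2 / (6.0 * (1.0 - nu_s) * t_f * R)

# Illustrative: 1 um film on a 500 um Si(100) wafer bent to R = 10 m
# yields a stress on the order of hundreds of MPa.
sigma = stoney_stress(E_s=130.2e9, nu_s=0.279, t_s=500e-6, t_f=1.0e-6, R=10.0)
```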

Journal ArticleDOI
TL;DR: This review focuses on the fundamentals of hyperspectral image analysis and its modern applications such as food quality and safety assessment, medical diagnosis and image guided surgery, forensic document examination, defense and homeland security, remote sensing applications such as precision agriculture and water resource management, and material identification and mapping of artworks.
Abstract: Over the past three decades, significant developments have been made in hyperspectral imaging due to which it has emerged as an effective tool in numerous civil, environmental, and military applications. Modern sensor technologies are capable of covering large surfaces of earth with exceptional spatial, spectral, and temporal resolutions. Due to these features, hyperspectral imaging has been effectively used in numerous remote sensing applications requiring estimation of physical parameters of many complex surfaces and identification of visually similar materials having fine spectral signatures. In the recent years, ground based hyperspectral imaging has gained immense interest in the research on electronic imaging for food inspection, forensic science, medical surgery and diagnosis, and military applications. This review focuses on the fundamentals of hyperspectral image analysis and its modern applications such as food quality and safety assessment, medical diagnosis and image guided surgery, forensic document examination, defense and homeland security, remote sensing applications such as precision agriculture and water resource management and material identification and mapping of artworks. Moreover, recent research on the use of hyperspectral imaging for examination of forgery detection in questioned documents, aided by deep learning, is also presented. This review can be a useful baseline for future research in hyperspectral image analysis.
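A simple building block behind the material-identification applications described above is comparing each pixel's spectrum to a library reference, classically via the spectral angle, which is insensitive to illumination scaling. A minimal numpy sketch (the function name is mine, not the review's):

```python
import numpy as np

def spectral_angle(pixel, reference):
    """Angle (radians) between a pixel spectrum and a reference spectrum.
    Small angles indicate spectrally similar materials; scaling either
    spectrum (e.g. brighter illumination) leaves the angle unchanged."""
    cos = np.dot(pixel, reference) / (
        np.linalg.norm(pixel) * np.linalg.norm(reference)
    )
    # Clip guards against tiny floating-point excursions outside [-1, 1].
    return np.arccos(np.clip(cos, -1.0, 1.0))
```

In a classifier, each pixel would be assigned the library material with the smallest angle, possibly subject to a rejection threshold.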

Proceedings Article
25 Apr 2018
TL;DR: In this article, the authors formulate the process of generating adversarial examples as an elastic-net regularized optimization problem, which can yield a distinct set of adversarial samples with small L1 distortion.
Abstract: Recent studies have highlighted the vulnerability of deep neural networks (DNNs) to adversarial examples — a visually indistinguishable adversarial image can easily be crafted to cause a well-trained model to misclassify. Existing methods for crafting adversarial examples are based on L2 and L∞ distortion metrics. However, despite the fact that L1 distortion accounts for the total variation and encourages sparsity in the perturbation, little has been developed for crafting L1-based adversarial examples. In this paper, we formulate the process of attacking DNNs via adversarial examples as an elastic-net regularized optimization problem. Our elastic-net attacks to DNNs (EAD) feature L1-oriented adversarial examples and include the state-of-the-art L2 attack as a special case. Experimental results on MNIST, CIFAR10 and ImageNet show that EAD can yield a distinct set of adversarial examples with small L1 distortion and attains similar attack performance to the state-of-the-art methods in different attack scenarios. More importantly, EAD leads to improved attack transferability and complements adversarial training for DNNs, suggesting novel insights on leveraging L1 distortion in adversarial machine learning and security implications of DNNs.
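The elastic-net regularizer combines an L1 and a squared L2 penalty on the perturbation, and iterative solvers such as ISTA handle the non-smooth L1 term with a soft-thresholding (proximal) step. A minimal sketch of both pieces (illustrative, not the paper's implementation):

```python
import numpy as np

def elastic_net_distortion(x, x0, beta):
    """Elastic-net penalty on the perturbation x - x0:
    beta * ||x - x0||_1 + ||x - x0||_2^2."""
    d = x - x0
    return beta * np.abs(d).sum() + np.square(d).sum()

def soft_threshold(z, x0, beta):
    """Element-wise shrinkage of z toward x0 by beta: the proximal
    operator of the L1 term, applied once per solver iteration.
    Components of the perturbation smaller than beta are zeroed,
    which is what encourages sparse (L1-oriented) perturbations."""
    d = z - x0
    return x0 + np.sign(d) * np.maximum(np.abs(d) - beta, 0.0)
```

A full attack would alternate a gradient step on the classification loss with this shrinkage step; the sketch shows only the regularization machinery.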

Journal ArticleDOI
TL;DR: It is found that IoT is foundational to any service transformation, although it is mostly needed to become an availability provider, and PA is essential for moving to the performance provider profile.
Abstract: The role of digital technologies in service business transformation is under-investigated. This paper contributes to filling this gap by addressing how the Internet of things (IoT), cloud computing...

Proceedings ArticleDOI
Jialong Zhang, Zhongshu Gu, Jiyong Jang, Hui Wu, Marc Ph. Stoecklin, Heqing Huang, Ian M. Molloy
29 May 2018
TL;DR: By extending the intrinsic generalization and memorization capabilities of deep neural networks, the models to learn specially crafted watermarks at training and activate with pre-specified predictions when observing the watermark patterns at inference, this paper generalizes the "digital watermarking'' concept from multimedia ownership verification to deep neural network (DNN) models.
Abstract: Deep learning technologies, which are the key components of state-of-the-art Artificial Intelligence (AI) services, have shown great success in providing human-level capabilities for a variety of tasks, such as visual analysis, speech recognition, and natural language processing. Building a production-level deep learning model is a non-trivial task, which requires a large amount of training data, powerful computing resources, and human expertise. Illegitimate reproduction, distribution, and derivation of proprietary deep learning models can therefore lead to copyright infringement and economic harm to model creators, so it is essential to devise a technique to protect the intellectual property of deep learning models and enable external verification of model ownership. In this paper, we generalize the "digital watermarking" concept from multimedia ownership verification to deep neural network (DNN) models. We investigate three DNN-applicable watermark generation algorithms, propose a watermark implanting approach to infuse watermarks into deep learning models, and design a remote verification mechanism to determine model ownership. By extending the intrinsic generalization and memorization capabilities of deep neural networks, we enable the models to learn specially crafted watermarks at training time and to activate with pre-specified predictions when observing the watermark patterns at inference time. We evaluate our approach with two image recognition benchmark datasets. Our framework accurately (100%) and quickly verifies the ownership of all the remotely deployed deep learning models without affecting model accuracy on normal input data. In addition, the embedded watermarks in DNN models are robust and resilient to different counter-watermark mechanisms, such as fine-tuning, parameter pruning, and model inversion attacks.
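The implanting step can be pictured as training-set augmentation: stamp a fixed trigger pattern onto a subset of images and relabel them with the pre-specified prediction, so the trained model answers that prediction whenever the trigger appears. A minimal sketch (function names and the mask convention are mine, not the paper's):

```python
import numpy as np

def stamp_watermark(images, pattern, mask):
    """Overlay a fixed trigger pattern wherever mask == 1.
    images: (N, H, W) array with pixel values in [0, 1];
    pattern, mask: (H, W) arrays broadcast over the batch."""
    return images * (1 - mask) + pattern * mask

def build_watermark_set(images, pattern, mask, target_label):
    """Pair every stamped image with the pre-specified prediction the
    model should produce when it sees the watermark at inference time.
    Mixing these pairs into normal training data implants the watermark."""
    stamped = stamp_watermark(images, pattern, mask)
    labels = np.full(len(images), target_label)
    return stamped, labels
```

Remote verification then amounts to submitting stamped inputs to the deployed model and checking that it returns the pre-specified label.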

Proceedings Article
01 Nov 2018
TL;DR: This paper introduces CROWN, a general framework to certify robustness of neural networks with general activation functions for given input data points and facilitates the search for a tighter certified lower bound by adaptively selecting appropriate surrogates for each neuron activation.
Abstract: Finding minimum distortion of adversarial examples and thus certifying robustness in neural network classifiers is known to be a challenging problem. Nevertheless, recently it has been shown to be possible to give a non-trivial certified lower bound of minimum distortion, and some recent progress has been made towards this direction by exploiting the piecewise linear nature of ReLU activations. However, a generic robustness certification for general activation functions still remains largely unexplored. To address this issue, in this paper we introduce CROWN, a general framework to certify robustness of neural networks with general activation functions. The novelty in our algorithm consists of bounding a given activation function with linear and quadratic functions, hence allowing it to tackle general activation functions including but not limited to the four popular choices: ReLU, tanh, sigmoid and arctan. In addition, we facilitate the search for a tighter certified lower bound by adaptively selecting appropriate surrogates for each neuron activation. Experimental results show that CROWN on ReLU networks can notably improve the certified lower bounds compared to the current state-of-the-art algorithm Fast-Lin, while having comparable computational efficiency. Furthermore, CROWN also demonstrates its effectiveness and flexibility on networks with general activation functions, including tanh, sigmoid and arctan.
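The core primitive here is sandwiching an activation between two linear functions over a neuron's pre-activation interval [l, u]. A minimal sketch for the one case where sigmoid is concave (l >= 0): the chord is then a valid lower bound and the tangent at the midpoint a valid upper bound. The full CROWN machinery handles all interval positions and propagates such bounds through the network; this shows only the single-neuron bounding step, with names of my choosing:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_linear_bounds(l, u):
    """Linear bounds a*x + b for sigmoid on [l, u] with 0 <= l < u,
    where sigmoid is concave. Returns ((a_low, b_low), (a_up, b_up))
    with  a_low*x + b_low <= sigmoid(x) <= a_up*x + b_up  on [l, u]."""
    assert 0 <= l < u, "this sketch covers only the concave region x >= 0"
    # Chord through (l, s(l)) and (u, s(u)): below a concave function.
    a_low = (sigmoid(u) - sigmoid(l)) / (u - l)
    b_low = sigmoid(l) - a_low * l
    # Tangent at the midpoint: above a concave function.
    m = 0.5 * (l + u)
    a_up = sigmoid(m) * (1.0 - sigmoid(m))  # sigmoid'(m)
    b_up = sigmoid(m) - a_up * m
    return (a_low, b_low), (a_up, b_up)
```

The "adaptive surrogate" idea in the abstract corresponds to choosing, per neuron, the bounding lines that make the final certified bound tightest.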

Proceedings ArticleDOI
24 Apr 2018
TL;DR: This paper introduces Kyber, a portfolio of post-quantum cryptographic primitives built around a key-encapsulation mechanism (KEM), based on hardness assumptions over module lattices, and introduces a CPA-secure public-key encryption scheme and eventually construct, in a black-box manner, CCA-secure encryption, key exchange, and authenticated-key-exchange schemes.
Abstract: Rapid advances in quantum computing, together with the announcement by the National Institute of Standards and Technology (NIST) to define new standards for digital-signature, encryption, and key-establishment protocols, have created significant interest in post-quantum cryptographic schemes. This paper introduces Kyber (part of CRYSTALS – Cryptographic Suite for Algebraic Lattices – a package submitted to the NIST post-quantum standardization effort in November 2017), a portfolio of post-quantum cryptographic primitives built around a key-encapsulation mechanism (KEM), based on hardness assumptions over module lattices. Our KEM is most naturally seen as a successor to the NEWHOPE KEM (Usenix 2016). In particular, the key and ciphertext sizes of our new construction are about half the size, the KEM offers CCA instead of only passive security, the security is based on a more general (and flexible) lattice problem, and our optimized implementation results in essentially the same running time as the aforementioned scheme. We first introduce a CPA-secure public-key encryption scheme, apply a variant of the Fujisaki–Okamoto transform to create a CCA-secure KEM, and eventually construct, in a black-box manner, CCA-secure encryption, key exchange, and authenticated-key-exchange schemes. The security of our primitives is based on the hardness of Module-LWE in the classical and quantum random oracle models, and our concrete parameters conservatively target more than 128 bits of post-quantum security.
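The Fujisaki–Okamoto-style step mentioned above turns a CPA-secure encryption scheme (made deterministic by deriving its randomness from a hash) into a CCA-secure KEM via a re-encryption check during decapsulation. The following toy sketch shows only that control flow; the "PKE" is an insecure stand-in with pk == sk, not Kyber's Module-LWE scheme, and all helper names are mine:

```python
import hashlib
import os

def H(*parts):
    """SHA-256 over concatenated byte strings, standing in for the random oracle."""
    h = hashlib.sha256()
    for p in parts:
        h.update(p)
    return h.digest()

# Toy deterministic "encryption": XOR with a hash-derived pad, coins in clear.
# No security whatsoever; it only makes the FO structure executable.
def toy_enc(pk, m, coins):
    pad = H(b"enc", pk, coins)
    return bytes(a ^ b for a, b in zip(m, pad)) + coins

def toy_dec(sk, c):
    body, coins = c[:32], c[32:]
    pad = H(b"enc", sk, coins)  # works because pk == sk in this toy
    return bytes(a ^ b for a, b in zip(body, pad))

def encaps(pk):
    """Pick a random message, derive key and coins from it, encrypt deterministically."""
    m = os.urandom(32)
    K = H(b"kdf", m, pk)[:16]       # shared key derived from m and pk
    coins = H(b"coins", m, pk)      # deterministic encryption randomness
    return toy_enc(pk, m, coins), K

def decaps(sk, pk, c):
    """Decrypt, then re-encrypt with re-derived coins; reject on mismatch.
    This re-encryption check is what lifts CPA to CCA security."""
    m = toy_dec(sk, c)
    if toy_enc(pk, m, H(b"coins", m, pk)) != c:
        return None  # real schemes often reject implicitly with a pseudorandom key
    return H(b"kdf", m, pk)[:16]
```

Honest encapsulation and decapsulation agree on the key, while any tampering with the ciphertext makes the re-encryption check fail.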

Journal ArticleDOI
TL;DR: Pd nanocrystals exhibit facet-dependent oxidase and peroxidase-like activities that endow them with excellent antibacterial properties via generation of reactive oxygen species, and a reverse trend of antibacterial activity is observed against Gram-negative bacteria.
Abstract: Noble metal-based nanomaterials have shown promise as potential enzyme mimetics, but the facet effect and underlying molecular mechanisms are largely unknown. Herein, with a combined experimental and theoretical approach, we unveil that palladium (Pd) nanocrystals exhibit facet-dependent oxidase and peroxidase-like activities that endow them with excellent antibacterial properties via generation of reactive oxygen species. The antibacterial efficiency of Pd nanocrystals against Gram-positive bacteria is consistent with the extent of their enzyme-like activity, that is {100}-faceted Pd cubes with higher activities kill bacteria more effectively than {111}-faceted Pd octahedrons. Surprisingly, a reverse trend of antibacterial activity is observed against Gram-negative bacteria, with Pd octahedrons displaying stronger penetration into bacterial membranes than Pd nanocubes, thereby exerting higher antibacterial activity than the latter. Our findings provide a deeper understanding of facet-dependent enzyme-like activities and might advance the development of noble metal-based nanomaterials with both enhanced and targeted antibacterial activities.