
Showing papers in "Ibm Journal of Research and Development in 2019"


Journal ArticleDOI
TL;DR: A new open-source Python toolkit for algorithmic fairness, AI Fairness 360 (AIF360), released under an Apache v2.0 license, to help facilitate the transition of fairness research algorithms for use in an industrial setting and to provide a common framework for fairness researchers to share and evaluate algorithms.
Abstract: Fairness is an increasingly important concern as machine learning models are used to support decision making in high-stakes applications such as mortgage lending, hiring, and prison sentencing. This article introduces a new open-source Python toolkit for algorithmic fairness, AI Fairness 360 (AIF360), released under an Apache v2.0 license ( https://github.com/ibm/aif360 ). The main objectives of this toolkit are to help facilitate the transition of fairness research algorithms for use in an industrial setting and to provide a common framework for fairness researchers to share and evaluate algorithms. The package includes a comprehensive set of fairness metrics for datasets and models, explanations for these metrics, and algorithms to mitigate bias in datasets and models. It also includes an interactive Web experience that provides a gentle introduction to the concepts and capabilities for line-of-business users, and it enables researchers and developers to extend the toolkit with new algorithms and improvements and to use it for performance benchmarking. A built-in testing infrastructure maintains code quality.
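
The kind of dataset fairness metric a toolkit like AIF360 exposes can be illustrated in a few lines. The sketch below computes statistical parity difference in plain Python for illustration only; it is not the AIF360 API, and the example data are invented.

```python
# Statistical parity difference:
#   P(favorable outcome | unprivileged group) - P(favorable outcome | privileged group).
# A value of 0 indicates parity; negative values disadvantage the unprivileged group.
# Plain-Python sketch, not the toolkit's actual API.

def statistical_parity_difference(labels, groups, favorable=1, privileged=1):
    """labels[i] is the outcome; groups[i] is the protected-attribute value."""
    def rate(g):
        outcomes = [l for l, grp in zip(labels, groups) if grp == g]
        return sum(1 for l in outcomes if l == favorable) / len(outcomes)
    return rate(1 - privileged) - rate(privileged)

# Invented example: the privileged group (1) gets the favorable outcome 3/4 of
# the time, the unprivileged group (0) only 1/4.
labels = [1, 1, 1, 0, 1, 0, 0, 0]
groups = [1, 1, 1, 1, 0, 0, 0, 0]
spd = statistical_parity_difference(labels, groups)  # 0.25 - 0.75 = -0.5
```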

356 citations


Journal ArticleDOI
TL;DR: This paper envisions an SDoC for AI services containing purpose, performance, safety, security, and provenance information, to be completed and voluntarily released by AI service providers for examination by consumers.
Abstract: Accuracy is an important concern for suppliers of artificial intelligence (AI) services, but considerations beyond accuracy, such as safety (which includes fairness and explainability), security, and provenance, are also critical elements to engender consumers’ trust in a service. Many industries use transparent, standardized, but often not legally required documents called supplier's declarations of conformity (SDoCs) to describe the lineage of a product along with the safety and performance testing it has undergone. SDoCs may be considered multidimensional fact sheets that capture and quantify various aspects of the product and its development to make it worthy of consumers’ trust. In this article, inspired by this practice, we propose FactSheets to help increase trust in AI services. We envision such documents to contain purpose, performance, safety, security, and provenance information to be completed by AI service providers for examination by consumers. We suggest a comprehensive set of declaration items tailored to AI in the Appendix of this article.

243 citations


Journal ArticleDOI
TL;DR: In this paper, the authors introduce the Fairness GAN, an approach for generating a dataset that is plausibly similar to a given multimedia dataset, but is more fair with respect to protected attributes in decision making.
Abstract: We introduce the Fairness GAN (generative adversarial network), an approach for generating a dataset that is plausibly similar to a given multimedia dataset, but is more fair with respect to protected attributes in decision making. We propose a novel auxiliary classifier GAN that strives for demographic parity or equality of opportunity and show empirical results on several datasets, including the CelebFaces Attributes (CelebA) dataset, the Quick, Draw! dataset, and a dataset of soccer player images and the offenses for which they were called. The proposed formulation is well suited to absorbing unlabeled data; we leverage this to augment the soccer dataset with the much larger CelebA dataset. The methodology tends to improve demographic parity and equality of opportunity while generating plausible images.
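
The two criteria the Fairness GAN strives for, demographic parity and equality of opportunity, reduce to group-rate differences. The sketch below states them in plain Python on invented toy data; it is illustrative only and unrelated to the paper's GAN training code.

```python
# Demographic parity compares positive-prediction rates across groups;
# equality of opportunity compares true-positive rates (recall) across groups.
# Both gaps are 0 when the criterion is met. Illustration only.

def demographic_parity_gap(pred, group):
    """Difference in positive-prediction rates between groups 0 and 1."""
    rate = lambda g: sum(p for p, a in zip(pred, group) if a == g) / group.count(g)
    return rate(0) - rate(1)

def equal_opportunity_gap(pred, label, group):
    """Difference in true-positive rates between groups 0 and 1."""
    def tpr(g):
        pos = [(p, l) for p, l, a in zip(pred, label, group) if a == g and l == 1]
        return sum(p for p, _ in pos) / len(pos)
    return tpr(0) - tpr(1)

# Invented toy data: two samples per group.
pred, label, group = [1, 0, 1, 1], [1, 1, 1, 0], [0, 0, 1, 1]
dp = demographic_parity_gap(pred, group)        # 0.5 - 1.0 = -0.5
eo = equal_opportunity_gap(pred, label, group)  # 0.5 - 1.0 = -0.5
```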

113 citations


Journal ArticleDOI
TL;DR: Peers encrypt their private data before storing it on the chain and use secure MPC whenever such private data are needed in a transaction on Hyperledger Fabric.
Abstract: Hyperledger Fabric is a “permissioned” blockchain architecture, providing a consistent distributed ledger, shared by a set of “peers” that must all have the same view of its state. For many applications, it is desirable to keep private data on the ledger, but the same-view principle makes it challenging to implement. In this paper, we explore supporting private data on Fabric using secure multiparty computation (MPC). In our solution, peers encrypt their private data before storing it on the chain and use secure MPC whenever such private data are needed in a transaction. We created a demo of our solution, implementing a bidding system where sellers list assets on the ledger with a secret reserve price, and bidders publish their bids on the ledger but keep secret the bidding price. We implemented a smart contract that runs the auction on this secret data, using a simple secure-MPC protocol that was built using the EMP-toolkit library. We identified two basic services that should be added to Hyperledger Fabric to support our solution, inspiring follow-up work to implement and add these services to the Hyperledger Fabric architecture.
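
The idea of keeping data secret on a shared ledger until a transaction needs it can be illustrated with a simple commit-reveal scheme. The sketch below is a conceptual toy, not secure MPC and not the EMP-toolkit protocol the paper uses; the auction values and party names are invented.

```python
# Commit-reveal toy: publish a hiding commitment to a secret value on the
# (public) ledger, reveal later, and let anyone verify the reveal. This only
# illustrates "secret data on a shared ledger"; real MPC computes on the
# secrets without ever revealing them.
import hashlib
import os

def commit(value: int, nonce: bytes) -> str:
    """SHA-256 commitment to an integer value."""
    return hashlib.sha256(nonce + value.to_bytes(8, "big")).hexdigest()

def run_auction(reserve, bids):
    """Settle a sealed-bid auction on revealed values: the highest bid wins
    only if it meets the seller's secret reserve price."""
    bidder, amount = max(bids, key=lambda b: b[1])
    return bidder if amount >= reserve else None

# Seller publishes a commitment to the reserve price on the ledger...
nonce = os.urandom(16)
ledger_entry = commit(100, nonce)
# ...and later reveals (value, nonce); anyone can check it matches.
assert commit(100, nonce) == ledger_entry
winner = run_auction(reserve=100, bids=[("alice", 90), ("bob", 120)])
```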

92 citations


Journal ArticleDOI
TL;DR: This article describes the work on systematically identifying opportunities for PIM in real applications and quantifies potential gains for popular emerging applications (e.g., machine learning, data analytics, genome analysis) and describes challenges that remain for the widespread adoption of PIM.
Abstract: Many modern and emerging applications must process increasingly large volumes of data. Unfortunately, prevalent computing paradigms are not designed to efficiently handle such large-scale data: The energy and performance costs to move this data between the memory subsystem and the CPU now dominate the total costs of computation. This forces system architects and designers to fundamentally rethink how to design computers. Processing-in-memory (PIM) is a computing paradigm that avoids most data movement costs by bringing computation to the data. New opportunities in modern memory systems are enabling architectures that can perform varying degrees of processing inside the memory subsystem. However, many practical system-level issues must be tackled to construct PIM architectures, including enabling workloads and programmers to easily take advantage of PIM. This article examines three key domains of work toward the practical construction and widespread adoption of PIM architectures. First, we describe our work on systematically identifying opportunities for PIM in real applications and quantify potential gains for popular emerging applications (e.g., machine learning, data analytics, genome analysis). Second, we aim to solve several key issues in programming these applications for PIM architectures. Third, we describe challenges that remain for the widespread adoption of PIM.

91 citations


Journal ArticleDOI
TL;DR: According to the experimental results, BlueConnect can outperform the leading industrial communication library by a wide margin, and the BlueConnect-integrated Caffe2 can reduce synchronization overhead by 87% on 192 GPUs for ResNet-50 training over prior schemes.
Abstract: As deep neural networks get more complex and input datasets get larger, it can take days or even weeks to train a deep neural network to the desired accuracy. Therefore, enabling distributed deep learning at a massive scale is critical since it offers the potential to reduce the training time from weeks to hours. In this article, we present BlueConnect, an efficient communication library for distributed deep learning that is highly optimized for popular GPU-based platforms. BlueConnect decomposes a single all-reduce operation into a large number of parallelizable reduce–scatter and all-gather operations to exploit the tradeoff between latency and bandwidth and adapt to a variety of network configurations. Therefore, each individual operation can be mapped to a different network fabric and take advantage of the best performing implementation for the corresponding fabric. According to our experimental results on two system configurations, BlueConnect can outperform the leading industrial communication library by a wide margin, and the BlueConnect-integrated Caffe2 can significantly reduce synchronization overhead by 87% on 192 GPUs for Resnet-50 training over prior schemes.
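
The decomposition BlueConnect exploits, an all-reduce rewritten as a reduce-scatter followed by an all-gather, can be simulated sequentially. The sketch below is a plain-Python illustration of the algebra only, not the library's pipelined, fabric-aware implementation.

```python
# All-reduce over N workers = reduce-scatter (worker i ends up owning the
# elementwise sum of chunk i from every worker) + all-gather (every owned
# chunk is redistributed to all workers). Simulated sequentially here; real
# implementations overlap these steps across the network fabric.

def reduce_scatter(buffers):
    n = len(buffers)
    chunk = len(buffers[0]) // n
    return [[sum(b[i * chunk + j] for b in buffers) for j in range(chunk)]
            for i in range(n)]

def all_gather(owned_chunks):
    flat = [x for c in owned_chunks for x in c]
    return [list(flat) for _ in owned_chunks]

def all_reduce(buffers):
    return all_gather(reduce_scatter(buffers))

# Two workers' gradient buffers; after all-reduce both hold the elementwise sum.
grads = [[1, 2, 3, 4], [10, 20, 30, 40]]
result = all_reduce(grads)  # every worker holds [11, 22, 33, 44]
```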

62 citations


Journal ArticleDOI
TL;DR: The authors describe how blockchain data can be combined with external data sources for secure and private analytics, how artificial intelligence (AI) models can be created over geographically dispersed data, and how a history of model creation can enable provenance and lineage tracking for trusted AI.
Abstract: Blockchain records track information about financial payments, movements of products through supply chains, identity verification information, and many other assets. Analytics on this data can provide provenance histories, predictive planning, fraud identification, and regulatory compliance. In this paper, we describe analytics engines connected to blockchains to provide easy-to-use configurable dashboards, predictive models, provenance histories, and compliance checking. We also describe how blockchain data can be combined with external data sources for secure and private analytics, enable artificial intelligence (AI) model creation over geographically dispersed data, and create a history of model creation enabling provenance and lineage tracking for trusted AI.

51 citations


Journal ArticleDOI
TL;DR: In this article, innovative microarchitectural designs for multilayer deep neural networks (DNNs) implemented in crossbar arrays of analog memories are presented; the current design could achieve up to 12–14 TOPs/s/W energy efficiency for training, while a projected scaled design could achieve up to 250 TOPs/s/W.
Abstract: In this article, we present innovative microarchitectural designs for multilayer deep neural networks (DNNs) implemented in crossbar arrays of analog memories. Data is transferred in a fully parallel manner between arrays without explicit analog-to-digital converters. Design ideas including source follower-based readout, array segmentation, and transmit-by-duration are adopted to improve the circuit efficiency. The execution energy and throughput, for both DNN training and inference, are analyzed quantitatively using circuit simulations of a full CMOS design in the 90-nm technology node. We find that our current design could achieve up to 12–14 TOPs/s/W energy efficiency for training, while a projected scaled design could achieve up to 250 TOPs/s/W. Key challenges in realizing analog AI systems are discussed.

34 citations


Journal ArticleDOI
TL;DR: The authors developed and deployed Chef Watson, a computational creativity system for culinary recipes and menus that can operate either autonomously or semiautonomously with human interaction, and present the basic system architecture, data engineering, and algorithms involved.
Abstract: Computational creativity is an emerging branch of artificial intelligence that places computers in the center of the creative process. Broadly, creativity involves a generative step to produce many ideas and a selective step to determine the ones that are the best. Many previous attempts at computational creativity, however, have not been able to achieve a valid selective step. This paper shows how bringing data sources from the creative domain and from hedonic psychophysics together with machine learning and data analytics techniques can overcome this shortcoming to yield a system that can produce novel and high-quality creative artifacts. To demonstrate our data-driven approach, we developed and deployed a computational creativity system for culinary recipes and menus, Chef Watson, which can operate either autonomously or semiautonomously with human interaction. We present the basic system architecture, data engineering, and algorithms that are involved. Experimental results demonstrate the system passes the test for creativity based on the consensual assessment technique, producing a novel and flavorful recipe. Large-scale deployments are also discussed.

32 citations


Journal ArticleDOI
TL;DR: In this article, the authors propose an infrastructure that allows CC researchers to build workflows that can be executed online and be easily reused by others through the workflow web address, leading to novel ways of software composition for computational purposes that were not expected in advance.
Abstract: Computational creativity (CC) is a multidisciplinary research field, studying how to engineer software that exhibits behavior that would reasonably be deemed creative. This paper shows how composition of software solutions in this field can effectively be supported through a CC infrastructure that supports user-friendly development of CC software components and workflows, their sharing, execution, and reuse. The infrastructure allows CC researchers to build workflows that can be executed online and be easily reused by others through the workflow web address. Moreover, it enables the building of procedures composed of software developed by different researchers from different laboratories, leading to novel ways of software composition for computational purposes that were not expected in advance. This capability is illustrated on a workflow that implements a Concept Generator prototype based on the Conceptual Blending framework. The prototype consists of a composition of modules made available as web services, and is explored and tested through experiments involving blending of texts from different domains, blending of images, and poetry generation.

31 citations


Journal ArticleDOI
TL;DR: The three-terminal electrochemical memory based on the redox transistor (RT) is introduced, which uses a gate to tune the redox state of the channel; storage of information as a charge-compensated redox reaction in the bulk of the transistor enables high-density information storage.
Abstract: Efficiency bottlenecks inherent to conventional computing in executing neural algorithms have spurred the development of novel devices capable of “in-memory” computing. Commonly known as “memristors,” a variety of device concepts including conducting bridge, vacancy filament, phase change, and other types have been proposed as promising elements in artificial neural networks for executing inference and learning algorithms. In this article, we review the recent advances in memristor technology for neuromorphic computing and discuss strategies for addressing the most significant performance challenges, including nonlinearity, high read/write currents, and endurance. As an alternative to two-terminal memristors, we introduce the three-terminal electrochemical memory based on the redox transistor (RT), which uses a gate to tune the redox state of the channel. Decoupling the “read” and “write” operations using a third terminal and storage of information as a charge-compensated redox reaction in the bulk of the transistor enables high-density information storage. These properties enable low-energy operation without compromising analog performance and nonvolatility. We discuss the RT operating mechanisms using organic and inorganic materials, approaches for array integration, and prospects for achieving the device density and switching speeds necessary to make electrochemical memory competitive with established digital technology.

Journal ArticleDOI
TL;DR: A novel approach that uses inverse reinforcement learning to learn a set of unspecified constraints from demonstrations and reinforcement learning to maximize environmental rewards, with a contextual-bandit-based orchestrator that allows the agent to mix policies in novel ways.
Abstract: Autonomous cyber-physical agents play an increasingly large role in our lives. To ensure that they behave in ways aligned with the values of society, we must develop techniques that allow these agents to not only maximize their reward in an environment, but also to learn and follow the implicit constraints of society. We detail a novel approach that uses inverse reinforcement learning to learn a set of unspecified constraints from demonstrations and reinforcement learning to learn to maximize environmental rewards. A contextual-bandit-based orchestrator then picks between the two policies: constraint-based and environment reward-based. The contextual bandit orchestrator allows the agent to mix policies in novel ways, taking the best actions from either a reward-maximizing or constrained policy. In addition, the orchestrator is transparent on which policy is being employed at each time step. We test our algorithms using Pac-Man and show that the agent is able to learn to act optimally, act within the demonstrated constraints, and mix these two functions in complex ways.
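
The orchestrator idea, a bandit choosing at each step between a reward-maximizing policy and a constraint-obeying policy, can be sketched with a simple epsilon-greedy stand-in for the paper's contextual bandit. The policies and reward values below are invented placeholders, not the Pac-Man setup from the article.

```python
# A two-armed bandit keeps a running value estimate per policy and picks the
# better one, exploring occasionally. Epsilon-greedy stands in for the paper's
# contextual-bandit orchestrator; the "policies" here are just labels.
import random

class Orchestrator:
    def __init__(self, policies, epsilon=0.1):
        self.policies = policies
        self.epsilon = epsilon
        self.counts = [0] * len(policies)
        self.values = [0.0] * len(policies)

    def choose(self):
        if random.random() < self.epsilon:
            return random.randrange(len(self.policies))   # explore
        return max(range(len(self.policies)), key=lambda i: self.values[i])

    def update(self, i, reward):
        self.counts[i] += 1
        self.values[i] += (reward - self.values[i]) / self.counts[i]  # running mean

random.seed(0)
orch = Orchestrator(policies=["reward_policy", "constraint_policy"])
for _ in range(200):
    i = orch.choose()
    reward = 1.0 if i == 0 else 0.6   # invented: reward policy happens to pay more
    orch.update(i, reward)
# orch now prefers the higher-paying policy while remaining transparent about
# which policy was used at each step (via the chosen index).
```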

Journal ArticleDOI
TL;DR: This paper proposes a novel reference software architecture to address the complex requirements of modern supply chains that also integrates blockchain into several layers of the stack and demonstrates through a use case in production that integrating blockchain technology helps with providing visibility, documenting provenance, and allowing permissioned data access to facilitate the automation of many high-volume tasks.
Abstract: Increasing globalization, e-commerce usage, and social awareness are leading to increased consumer demand for variety, value, convenience, immediacy, verifiable authenticity and provenance, ethical materials sourcing and manufacturing, regulatory compliance, and services after sales. Fulfilling this increased complexity of consumer demand has required supply chains to evolve into multienterprise networks with numerous flow paths in production, merchandising, and fulfillment involving many organizational/institutional handoffs, to effectively manage a large number of complex products with shorter life cycles and high transaction volumes. The supply chain management models of today place higher demands on automation and require a transition from the traditional paradigm of planning followed by long-loop execution for a handful of segments to a paradigm of managing a portfolio of end-to-end instrumented data-rich microsegmented supply chains that are monitored and adjusted in near real time. These essential aspects and challenges of supply chain management require the supporting information technology to also evolve. In this paper, we propose a novel reference software architecture to address the complex requirements of modern supply chains that also integrates blockchain into several layers of the stack. We present several examples where this reference architecture is applicable, and then demonstrate through a use case in production that integrating blockchain technology helps with providing visibility, documenting provenance, and allowing permissioned data access to facilitate the automation of many high-volume tasks such as reconciliations, payments, and settlements.

Journal ArticleDOI
TL;DR: This article focuses on mixed-precision deep learning training with in-memory computing, and shows how the precision of in-memory computing can be further improved through architectural and device-level innovations.
Abstract: Performing computations on conventional von Neumann computing systems results in a significant amount of data being moved back and forth between the physically separated memory and processing units. This costs time and energy, and constitutes an inherent performance bottleneck. In-memory computing is a novel non-von Neumann approach, where certain computational tasks are performed in the memory itself. This is enabled by the physical attributes and state dynamics of memory devices, in particular, resistance-based nonvolatile memory technology. Several computational tasks such as logical operations, arithmetic operations, and even certain machine learning tasks can be implemented in such a computational memory unit. In this article, we first introduce the general notion of in-memory computing and then focus on mixed-precision deep learning training with in-memory computing. The efficacy of this new approach is demonstrated by training a multilayer perceptron network on MNIST to high accuracy. Moreover, we show how the precision of in-memory computing can be further improved through architectural and device-level innovations. Finally, we present system aspects, such as high-level system architecture, including core-to-core interconnect technologies, and high-level ideas and concepts of the software stack.
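
The mixed-precision principle, accumulating weight updates in high precision while the memory device holds only coarsely quantized states, can be sketched at toy scale. The level count and learning rate below are assumptions for illustration, not the paper's design.

```python
# High-precision accumulator + coarse device states: gradients are accumulated
# exactly, but the "device" weight used in computation is snapped to one of a
# small number of levels (standing in for programmable conductance states).

LEVELS = 16    # assumed number of device states, for illustration
W_MAX = 1.0    # assumed weight range

def quantize(w):
    """Snap a weight to the nearest of LEVELS evenly spaced states in [-W_MAX, W_MAX]."""
    step = 2 * W_MAX / (LEVELS - 1)
    return round(max(-W_MAX, min(W_MAX, w)) / step) * step

def train_step(acc, grad, lr=0.1):
    """Accumulate in high precision; the device only ever holds quantize(acc)."""
    acc -= lr * grad
    return acc, quantize(acc)

acc, dev = 0.0, 0.0
for g in [-1.0, -0.5, 0.2]:          # invented gradient sequence
    acc, dev = train_step(acc, g)
# acc carries the exact sum of updates (0.13); dev is snapped to the nearest
# device state (2/15 ~ 0.1333), so small updates are not lost between steps.
```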

Journal ArticleDOI
TL;DR: A new feature called service discovery, presented in this paper, provides APIs that allow dynamic discovery of the configuration required for the client SDK to interact with the HLF platform; this enables the client to rapidly adapt to changes in the platform, improving the reliability of the application layer and making the HLF platform more consumable.
Abstract: Hyperledger Fabric (HLF) is a modular and extensible permissioned blockchain platform. The platform's design exhibits principles required by enterprise-grade business applications, such as supply chains, financial transactions, asset management, etc. For that end, HLF introduces several innovations, two of which are smart contracts in general-purpose languages (chaincode in HLF), and flexible endorsement policies, which govern whether a transaction is considered valid. Typical blockchain applications comprise two tiers: The “platform” tier defines the data schema and embedding of business rules by means of chaincode and endorsement policies; the “client-side” tier uses the HLF software development kit (SDK) to implement client application logic. The client side should be aware of the deployment address of chaincode and endorsement policies within the platform. In past releases, this was statically configured into the client side. As of HLF v1.2, a new feature called service discovery, presented in this paper, provides APIs that allow dynamic discovery of the configuration required for the client SDK to interact with the platform. This enables the client to rapidly adapt to changes in the platform, thus improving the reliability of the application layer and making the HLF platform more consumable.

Journal ArticleDOI
TL;DR: This paper implemented a toolchain that generates smart contracts of Hyperledger Fabric from template-based contract documents via a formal model and evaluated the feasibility of the approach through case studies of two types of real-world contracts in different domains.
Abstract: Smart contracts, which are widely recognized as key components of blockchain technology, enable automatic execution of agreements. Since each smart contract is a computer program that autonomously runs on a blockchain platform, their development requires much effort and care compared with the development of more common programs. In this paper, we propose a technique to automatically generate a smart contract from a human-understandable contract document that is created using a document template and a controlled natural language (CNL). The automation is based on a mapping from the document template and the CNL to a formal model that can define the terms and conditions in a contract including temporal constraints and procedures. The formal model is then translated into an executable smart contract. We implemented a toolchain that generates smart contracts of Hyperledger Fabric from template-based contract documents via a formal model. We then evaluated the feasibility of our approach through case studies of two types of real-world contracts in different domains.

Journal ArticleDOI
TL;DR: The silicon interconnect fabric (Si-IF) is introduced as a platform for integrating high-performance scaled-out systems, including artificial intelligence systems, and additional system-level methodologies to support heterogeneous ultralarge systems on the Si-IF platform are presented.
Abstract: The silicon interconnect fabric (Si-IF) as a platform for integration of high-performance scaled-out systems, including artificial intelligence systems, is introduced in this article. The Si-IF is a wafer-sized platform that enables integration of bare dies at fine pitch (2–10 µm) and small inter-die spacing (≤100 µm) comparable to on-die connectivity. The choice of materials, die size, and pitch for integration on the Si-IF is discussed. The assembly process of dies on the Si-IF, electrical and mechanical experimental results, and an approach to ensure reliability of the system are described in detail. Additional system-level methodologies for communication, power delivery, and heat extraction to support heterogeneous ultralarge systems on the Si-IF platform are presented.

Journal ArticleDOI
TL;DR: The prospects for designing hardware accelerators for neural networks using resistive crossbars are highlighted, and the key open challenges and some possible approaches to address them are underscored.
Abstract: Deep neural networks (DNNs) achieve best-known accuracies in many machine learning tasks involved in image, voice, and natural language processing and are being used in an ever-increasing range of applications. However, their algorithmic benefits are accompanied by extremely high computation and storage costs, sparking intense efforts in optimizing the design of computing platforms for DNNs. Today, graphics processing units (GPUs) and specialized digital CMOS accelerators represent the state-of-the-art in DNN hardware, with near-term efforts focusing on approximate computing through reduced precision. However, the ever-increasing complexities of DNNs and the data they process have fueled an active interest in alternative hardware fabrics that can deliver the next leap in efficiency. Resistive crossbars designed using emerging nonvolatile memory technologies have emerged as a promising candidate building block for future DNN hardware fabrics since they can natively execute massively parallel vector-matrix multiplications (the dominant compute kernel in DNNs) in the analog domain within the memory arrays. Leveraging in-memory computing and dense storage, resistive-crossbar-based systems cater to both the high computation and storage demands of complex DNNs and promise energy efficiency beyond current DNN accelerators by mitigating data transfer and memory bottlenecks. However, several design challenges need to be addressed to enable their adoption. For example, the overheads of peripheral circuits (analog-to-digital converters and digital-to-analog converters) and other components (scratchpad memories and on-chip interconnect) may significantly diminish the efficiency benefits at the system level. Additionally, the analog crossbar computations are intrinsically subject to noise due to a range of device- and circuit-level nonidealities, potentially leading to lower accuracy at the application level. 
In this article, we highlight the prospects for designing hardware accelerators for neural networks using resistive crossbars. We also underscore the key open challenges and some possible approaches to address them.
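
The crossbar's native operation, a vector-matrix multiply via Ohm's and Kirchhoff's laws with signed weights encoded as differential conductance pairs, can be sketched numerically. The sketch below is an idealized plain-Python stand-in with invented values; it ignores the noise, nonidealities, and peripheral-circuit overheads the article discusses.

```python
# Idealized crossbar VMM: inputs are applied as row voltages, matrix entries
# are stored as conductances, and column currents sum the products. Signed
# weights use a differential pair: w = G+ - G- with both conductances >= 0.

def crossbar_vmm(voltages, g_pos, g_neg):
    """Column currents for row voltages and a differential conductance pair."""
    cols = len(g_pos[0])
    return [sum(v * (g_pos[r][c] - g_neg[r][c]) for r, v in enumerate(voltages))
            for c in range(cols)]

# Invented weight matrix [[1, -2], [3, 0.5]] split into nonnegative pairs.
g_pos = [[1.0, 0.0], [3.0, 0.5]]
g_neg = [[0.0, 2.0], [0.0, 0.0]]
out = crossbar_vmm([1.0, 2.0], g_pos, g_neg)  # [1*1 + 2*3, 1*(-2) + 2*0.5]
```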

Journal ArticleDOI
TL;DR: This paper discusses the ethical implications of applying sophisticated analytical methods to questions in HR management, applying to HR analytics the ethical frameworks discussed in other fields, including medicine, robotics, learning analytics, and coaching.
Abstract: The systematic application of analytical methods on human resources (HR)-related (big) data is referred to as HR analytics or people analytics. Typical problems in HR analytics include the estimation of churn rates, the identification of knowledge and skill in an organization, and the prediction of success on a job. HR analytics, as opposed to the simple use of key performance indicators, is a growing field of interest because of the rapid growth of volume, velocity, and variety of HR data, driven by the digitalization of work processes. Personnel files used to be in steel lockers in the past. They are now stored in company systems, along with data from hiring processes, employee satisfaction surveys, e-mails, and process data. With the growing prevalence of HR analytics, a discussion around its ethics needs to occur. The objective of this paper is to discuss the ethical implications of the application of sophisticated analytical methods to questions in HR management. This paper builds on previous literature in algorithmic fairness that focuses on technical options to identify, measure, and reduce discrimination in data analysis. This paper applies to HR analytics the ethical frameworks discussed in other fields including medicine, robotics, learning analytics, and coaching.

Journal ArticleDOI
TL;DR: This work details a novel online agent that learns a set of behavioral constraints by observation and uses these learned constraints when making decisions in an online setting, while still being reactive to reward feedback.
Abstract: AI systems that learn through reward feedback about the actions they take are deployed in domains that have significant impact on our daily life. However, in many cases the online rewards should not be the only guiding criteria, as there are additional constraints and/or priorities imposed by regulations, values, preferences, or ethical principles. We detail a novel online agent that learns a set of behavioral constraints by observation and uses these learned constraints when making decisions in an online setting, while still being reactive to reward feedback. We propose a novel extension to the contextual multi-armed bandit setting and provide a new algorithm called Behavior Constrained Thompson Sampling (BCTS) that allows for online learning while obeying exogenous constraints. Our agent learns a constrained policy that implements observed behavioral constraints demonstrated by a teacher agent, and uses this constrained policy to guide its online exploration and exploitation. We characterize the upper bound on the expected regret of BCTS and provide a case study with real-world data in two application domains. Our experiments show that the designed agent is able to act within the set of behavior constraints without significantly degrading its overall reward performance.
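
The "explore only within learned constraints" idea can be sketched as Beta-Bernoulli Thompson sampling restricted to a teacher-permitted arm set. This is not the paper's BCTS algorithm, just the core principle; the arm payoffs and allowed set below are invented.

```python
# Thompson sampling over a constrained arm set: each round, sample from each
# allowed arm's Beta posterior and pull the arm with the highest sample.
# Arms outside the learned constraint set are never considered.
import random

def thompson_step(successes, failures, allowed):
    """Pick the allowed arm with the highest Beta(s+1, f+1) posterior sample."""
    samples = {a: random.betavariate(successes[a] + 1, failures[a] + 1)
               for a in allowed}
    return max(samples, key=samples.get)

random.seed(1)
arm_payoff = {0: 0.9, 1: 0.5, 2: 0.8}   # invented: arm 0 is best but forbidden
allowed = {1, 2}                          # constraints learned from the teacher
succ = {a: 0 for a in arm_payoff}
fail = {a: 0 for a in arm_payoff}
pulls = {a: 0 for a in arm_payoff}
for _ in range(500):
    a = thompson_step(succ, fail, allowed)
    pulls[a] += 1
    if random.random() < arm_payoff[a]:
        succ[a] += 1
    else:
        fail[a] += 1
# The agent concentrates on the best *allowed* arm (2) and never violates the
# constraint by pulling arm 0, at the cost of some reward.
```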

Journal ArticleDOI
TL;DR: This work describes HCLS use cases leveraging these facets of blockchain, including patient consent and health data exchange, outcome-based contracts, next-generation clinical trials, supply chain, and payments and claims, and describes a blockchain-based architecture and platform for enabling these use cases.
Abstract: Major trends in healthcare and life sciences (HCLS) include huge amounts of and longitudinal patient data, policies on a patient's rights to access and control their data, a move from fee-for-service to outcome-based contracts, and regulatory and privacy requirements. Blockchain, as a distributed transactional system of record, can provide underpinnings to enable these trends and enable transformative opportunities in HCLS by providing immutable data on a shared ledger, secure and authenticated transactions, and smart contracts that can represent rules that are executed with secure transactions. We describe HCLS use cases leveraging these facets of blockchain, including patient consent and health data exchange, outcome-based contracts, next-generation clinical trials, supply chain, and payments and claims. We then describe a blockchain-based architecture and platform for enabling these use cases. Finally, we outline a realization of this architecture in a case study and outline further research topics in this domain.

Journal ArticleDOI
TL;DR: A set of story generation systems developed by the authors of this contribution is reviewed, each focusing on different aspects and functions of stories, to provide an initial breakdown of how the term “storytelling” might be either instantiated or broken down into component processes.
Abstract: Narrative generation, understood as the task of constructing computational models of the way in which humans build stories, has been shown to involve a number of separate processes, related to different purposes to which it can be applied, and focusing on specific features that make stories valuable. This paper reviews a set of story generation systems developed by the authors of this contribution, each focusing on different aspects and functions of stories. These systems provide an initial breakdown of how the term “storytelling” might be either instantiated or broken down into component processes. The systems cover functionalities such as generating valid plot structures, simulating characters' behaviors or the evolution of affinities between them, either reporting or fictionalizing events observed in real life, and revising a story draft to maximize the suspense it induces in its readers. These functionalities are not intended to exhaust the set of possible operations involved in storytelling, but they constitute an initial set to understand the complexity of the task. The paper also includes two proposals—one theoretical and one technological—for understanding how a set of such functionalities might be composed into a broader operational process that produces more elaborate stories.

Journal ArticleDOI
TL;DR: This article discusses packaging technologies for efficient chip-to-chip communication and presents near-memory-processing architecture for AI accelerations that leverages 3D die-stacking and heterogeneous integration of CMOS and embedded non-volatile memory.
Abstract: The recent progress in artificial intelligence (AI) and machine learning (ML) has enabled computing platforms to solve highly complex problems in computer vision, robotics, finance, security, and science. The algorithmic progress in AI/ML has motivated new research in hardware accelerators. Dedicated accelerators promise high energy efficiency compared to software solutions running on CPUs. However, as AI/ML models become complex, the increasing memory demands and, hence, the high energy/time cost of communication between logic and memory pose a major challenge to energy efficiency. We review the potential of heterogeneous integration in addressing the preceding challenge and present different approaches to leverage heterogeneous integration for energy-efficient AI platforms. First, we discuss packaging technologies for efficient chip-to-chip communication. Second, we present near-memory-processing architecture for AI accelerations that leverages 3D die-stacking. Third, processing-in-memory architectures using heterogeneous integration of CMOS and embedded non-volatile memory are presented. Finally, the article presents case studies that integrate the preceding concepts to advance AI/ML hardware platforms for different application domains.

Journal ArticleDOI
TL;DR: In this paper, the authors propose ratings as a way to communicate bias risk and methods to rate AI services for bias in a black-box fashion without accessing the services' training data, which is designed not only to work on single services, but also on the composition of services.
Abstract: New decision-support systems are being built using AI services that draw insights from a large corpus of data and incorporate those insights in human-in-the-loop decision environments. They promise to transform businesses, such as health care, with better, affordable, and timely decisions. However, it would be unreasonable to expect people to trust AI systems out of the box when they have been shown to exhibit discrimination across a variety of data modalities: unstructured text, structured data, and images. Thus, AI systems come with certain risks, such as failing to recognize people or objects, introducing errors in their output, and leading to unintended harm. In response, we propose ratings as a way to communicate bias risk, along with methods to rate AI services for bias in a black-box fashion without accessing the services' training data. Our method is designed to work not only on single services, but also on the composition of services, which is how complex AI applications are built. Thus, the proposed method can be used to rate a composite application, like a chatbot, for the severity of its bias by rating its constituent services and then composing the ratings, rather than rating the whole system.
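The abstract's idea of rating a composite application from its constituent services can be illustrated with a toy composition rule. The rule below (a pipeline is only as unbiased as its worst constituent) and the severity scale are purely our own illustration; the paper's actual rating and composition methods may differ:

```python
from dataclasses import dataclass

@dataclass
class BiasRating:
    service: str
    severity: int   # hypothetical scale: 0 = no observed bias ... 3 = severe

def compose_rating(ratings):
    """Toy composition rule: the composite severity is the worst
    constituent severity. Illustrative only -- not the paper's method."""
    worst = max(ratings, key=lambda r: r.severity)
    return BiasRating(service="composite", severity=worst.severity)

# A chatbot-like pipeline of black-box services, each rated individually.
pipeline = [BiasRating("speech-to-text", 1),
            BiasRating("translator", 2),
            BiasRating("sentiment", 0)]
overall = compose_rating(pipeline)
```

The point of composing ratings, rather than re-rating the whole system, is that each constituent service can be audited once and its rating reused across every application that embeds it.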

Journal ArticleDOI
TL;DR: Here, a mathematical abstraction is defined to capture key aspects of combinatorial creativity and study fundamental tradeoffs between novelty and quality and finds that the maturity of the creative domain directly parameterizes the fundamental limit.
Abstract: Creativity is the generation of an idea or artifact judged to be novel and high-quality by a knowledgeable social group, and is often said to be the pinnacle of intelligence. Several computational creativity systems of various designs are now being demonstrated and deployed. These myriad design possibilities raise the natural question: Are there fundamental limits to creativity? Here, we define a mathematical abstraction to capture key aspects of combinatorial creativity and study fundamental tradeoffs between novelty and quality. The functional form of this fundamental limit resembles the capacity-cost relationship in information theory, especially when measuring novelty using Bayesian surprise—the relative entropy between the empirical distribution of an inspiration set and that set updated with the new idea or artifact. As such, we show how information geometry techniques provide insight into the limits of creativity and find that the maturity of the creative domain directly parameterizes the fundamental limit. This result is extended to the case when there is a diverse audience for creativity and when the quality function is not known but must be estimated from samples.
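The Bayesian surprise mentioned above is the relative entropy between the empirical distribution of an inspiration set and that distribution after the set is updated with the new artifact. A small numerical sketch (the toy motif corpus and direction of the divergence are our own choices):

```python
import math
from collections import Counter

def empirical_dist(items):
    """Empirical distribution over the symbols in a corpus."""
    counts = Counter(items)
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}

def kl_divergence(p, q):
    """Relative entropy D(p || q) in bits; assumes support(p) is a
    subset of support(q), which holds when q extends p's corpus."""
    return sum(p[k] * math.log2(p[k] / q[k]) for k in p if p[k] > 0)

# Toy "inspiration set" of motifs, then the same set updated with a new idea.
inspiration = list("AABBC")
updated = inspiration + list("D")   # a novel motif shifts the distribution

p = empirical_dist(inspiration)
q = empirical_dist(updated)
surprise = kl_divergence(p, q)      # Bayesian surprise of the new artifact
```

A genuinely novel motif (one absent from the inspiration set) shifts the empirical distribution and yields positive surprise, while an artifact that merely repeats existing motifs yields surprise near zero — which is what lets surprise serve as a novelty measure in the capacity-cost tradeoff the abstract describes.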

Journal ArticleDOI
TL;DR: A computational model for generating visual conceptual blends in the domain of sketching is presented, motivated by the desire to encourage curiosity, facilitate creative ideation, and overcome design fixation.
Abstract: Computational creative systems have the potential to augment the creative design process, particularly in co-creative contexts. We present a computational model for generating visual conceptual blends in the domain of sketching. Our system is motivated by the desire to encourage curiosity, facilitate creative ideation, and overcome design fixation. We start with a model for “conceptual shift”: when a sketch recognized as belonging to one category is visually similar to a sketch from a semantically distinct category. Identifying a potential conceptual shift enables visual analogy and/or conceptual blending. The intent is that these conceptually shifted or blended sketches could be presented to designers to encourage analogical reasoning and creative outcomes. We define and demonstrate our model for the conceptual shift task and alternative approaches for enabling conceptual blends. We describe future plans for evaluation in co-creative contexts.

Journal ArticleDOI
TL;DR: In this article, the authors explore a strengths-based approach to predictive model building in the context of child welfare, using a risk-and-resilience framework applied in the field of social work.
Abstract: Artificial intelligence (AI), when combined with statistical techniques such as predictive analytics, has been increasingly applied in high-stakes decision-making systems seeking to predict and/or classify the risk of clients experiencing negative outcomes while receiving services. One such system is child welfare, where the disproportionate involvement of marginalized and vulnerable children and families raises ethical concerns about building fair and equitable models. One central issue in this debate is the over-representation of risk factors in algorithmic inputs and outputs, as well as the concomitant over-reliance on predicting risk. Would models perform better across groups if variables represented risk and protective factors associated with outcomes of interest? In addition, would models be more equitable across groups if they predicted alternative service outcomes? Using a risk-and-resilience framework applied in the field of social work, and the child welfare system as an illustrative example, this article explores a strengths-based approach to predictive model building. We define risk and protective factors, and then identify and illustrate how protective factors perform in a model trained to predict an alternative outcome of child welfare service involvement: the unsubstantiation of an allegation of maltreatment.

Journal ArticleDOI
TL;DR: It is shown that a co-designed neural net model can yield an improvement of 2.6/8.3× in inference speed and 2.25/7.5× in energy as compared to SqueezeNet/AlexNet, while improving the accuracy of the model.
Abstract: Deep Learning is arguably the most rapidly evolving research area in recent years. As a result, it is not surprising that the design of state-of-the-art deep neural net models often proceeds without much consideration of the latest hardware targets, and the design of neural net accelerators proceeds without much consideration of the characteristics of the latest deep neural net models. Nevertheless, in this article, we show that there are significant improvements available if deep neural net models and neural net accelerators are co-designed. In particular, we show that a co-designed neural net model can yield an improvement of 2.6/8.3× in inference speed and 2.25/7.5× in energy as compared to SqueezeNet/AlexNet, while improving the accuracy of the model. We also demonstrate that a careful tuning of the neural net accelerator architecture to a deep neural net model can lead to a 1.9–6.3× improvement in inference speed.

Journal ArticleDOI
TL;DR: Summit as mentioned in this paper is an accelerated-node architecture with 4,608 nodes, each with two IBM P9 and six NVIDIA Volta V100 GPU processors, significant DRAM footprint, robust HBM quantities supporting the GPUs, nonvolatile memory, and fast NVLink and Infiniband interconnects.
Abstract: Oak Ridge National Laboratory (ORNL) installed the Summit supercomputer in 2018. Summit is an accelerated-node architecture with 4,608 nodes, each with two IBM P9 and six NVIDIA Volta V100 GPU processors, significant DRAM footprint, robust HBM quantities supporting the GPUs, nonvolatile memory, and fast NVLink and InfiniBand interconnects. This machine was designed to deliver over 200 peak double-precision petaflops for scientific modeling and simulation applications and over 3 peak reduced-precision ExaOps. Summit features impact application performance depending on whether the codes are simulation-oriented, write-intensive, data-analysis-oriented, read-intensive, or communication-intensive codes. In the context of artificial intelligence (AI) and machine learning (ML), these features support data-intensive applications that infer and predict statistical relationships in complex datasets. This article presents recent experiences at ORNL using Summit for applications in AI and ML and describes example code and algorithmic changes necessary to use Summit effectively. Finally, this article discusses research directions in scalable ML, including algorithms research and combining data analysis with modeling and simulation in an accelerated-node, exascale environment.

Journal ArticleDOI
TL;DR: BSRA provides a foundational set of architectural artifacts that can help accelerate development of solutions based on Blockchain technologies and provides a starter list of architectural decisions that every project will have to make during development.
Abstract: Enterprise Blockchain network solutions facilitate the extension and optimization of processes across organizational boundaries. This increased level of sharing provides greater visibility and provenance of transactions to all parties involved, and reduces friction that arises due to organizational boundaries. The Blockchain Solution Reference Architecture (BSRA) provides comprehensive guidance to architect and build end-to-end solutions based on Blockchain technologies. BSRA was primarily developed based on client engagements across industries such as retail, financial, supply chain, and telco, and addresses two major parts of a Blockchain solution: 1) building a Blockchain business network that receives, builds, and shares blocks in a secure manner; and 2) onboarding members on to the business network with the appropriate level of access and collaboration privileges. BSRA provides a foundational set of architectural artifacts that can help accelerate development of solutions based on Blockchain technologies. It also provides a starter list of architectural decisions that every project will have to make during development.