Author

William G. Macready

Bio: William G. Macready is an academic researcher from D-Wave Systems. The author has contributed to research in topics: Quantum computer & Optimization problem. The author has an h-index of 34 and has co-authored 91 publications receiving 13,024 citations. Previous affiliations of William G. Macready include IBM & Santa Fe Institute.


Papers
Journal ArticleDOI
TL;DR: This work develops techniques for effectively encoding SAT (and, with some limitations, MaxSAT) into Ising problems compatible with sparse QA architectures, and provides the theoretical foundations for this mapping.
Abstract: Quantum annealers (QAs) are specialized quantum computers that minimize objective functions over discrete variables by physically exploiting quantum effects. Current QA platforms allow for the optimization of quadratic objectives defined over binary variables (qubits), also known as Ising problems. In the last decade, QA systems as implemented by D-Wave have scaled with Moore-like growth. Current architectures provide 2048 sparsely-connected qubits, and continued exponential growth is anticipated, together with increased connectivity. We explore the feasibility of such architectures for solving SAT and MaxSAT problems as QA systems scale. We develop techniques for effectively encoding SAT (and, with some limitations, MaxSAT) into Ising problems compatible with sparse QA architectures. We provide the theoretical foundations for this mapping, and present encoding techniques that combine offline Satisfiability and Optimization Modulo Theories with on-the-fly placement and routing. Preliminary empirical tests on a current generation 2048-qubit D-Wave system support the feasibility of the approach for certain SAT and MaxSAT problems.
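
The abstract above rests on turning Boolean constraints into quadratic penalties over binary variables. As a rough illustration of that idea only (not the paper's SMT-based encoding or its placement-and-routing step), the sketch below gives each 2-literal OR clause a quadratic penalty that vanishes exactly when the clause is satisfied and brute-forces the resulting QUBO; a QUBO maps to an Ising objective via the change of variables x = (1 - s) / 2.

```python
# Minimal sketch (not the paper's encoding): express 2-literal OR clauses as
# QUBO penalties that are 0 iff the clause is satisfied, then brute-force the
# minimum over all binary assignments.
from itertools import product

def clause_penalty(x, lit_a, lit_b):
    """Penalty for clause (a OR b); each literal is (variable index, negated?)."""
    va = 1 - x[lit_a[0]] if lit_a[1] else x[lit_a[0]]
    vb = 1 - x[lit_b[0]] if lit_b[1] else x[lit_b[0]]
    return (1 - va) * (1 - vb)        # quadratic in x; 0 when the clause holds

# (x0 OR x1) AND (NOT x0 OR x2) AND (NOT x1 OR NOT x2)
clauses = [((0, False), (1, False)), ((0, True), (2, False)), ((1, True), (2, True))]

best = min(product((0, 1), repeat=3),
           key=lambda x: sum(clause_penalty(x, a, b) for a, b in clauses))
print("assignment:", best,
      "unsatisfied clauses:", sum(clause_penalty(best, a, b) for a, b in clauses))
```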

22 citations

Posted Content
TL;DR: In this paper, the authors introduce a technology landscape into the search-theoretic framework and show that early in the search for technological improvements, if the initial position is poor or average, it is optimal to search far away on the technology landscape; but as the firm succeeds in finding technological improvements, it is optimal to confine search to a local region of the landscape.
Abstract: Technological change at the firm-level has commonly been modeled as random sampling from a fixed distribution of possibilities. Such models typically ignore empirically important aspects of the firm's search process, namely the related observations that a firm's current technology constrains future innovation and that firms' technological search tends to be local in nature. In this paper we explicitly treat these aspects of the firm's search for technological improvements by introducing a technology landscape into the search-theoretic framework. Technological search is modeled as movement over a technology landscape with the firm's adaptive walk constrained by the firm's location on the landscape, the correlation structure of the landscape and the cost of innovation. We show that the standard search model is attained as a limiting case of a more general landscape search model. We obtain two key results, otherwise unavailable in the standard search model: the presence of local optima in space of technological possibilities and the determination of the optimal search distance. We find that early in the search for technological improvements, if the initial position is poor or average, it is optimal to search far away on the technology landscape; but as the firm succeeds in finding technological improvements it is optimal to confine search to a local region of the landscape. Notably, we obtain diminishing returns to search without having to make the assumption that the firm's repeated draws from the search space are independent and identically distributed. The distinction between dramatic technological improvements ("innovations") and minor technological improvements hinges on the distance at which a firm decides to sample the technology landscape. Submitted to J. Pol. Econ.
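
As a toy numerical companion to this argument (a stand-in model, not the paper's landscape formalism), the sketch below scores a candidate technology at distance d from the current one as a Gaussian whose correlation with the current payoff decays with distance, and compares the expected improvement of near versus far search from a poor and from a good starting position; the correlation form exp(-d/ell) and the parameter values are assumptions for illustration.

```python
# Toy correlated-landscape search: a site at distance d from the current
# technology (value v) pays N(rho(d) * v, 1 - rho(d)^2) with rho(d) = exp(-d/ell).
# Expected improvement over v then favours distant search when v is poor and
# local search when v is already good.
import numpy as np

rng = np.random.default_rng(0)
ell = 2.0
distances = np.linspace(0.1, 8.0, 40)

def expected_improvement(v, d, n=200_000):
    rho = np.exp(-d / ell)
    draws = rng.normal(rho * v, np.sqrt(1.0 - rho**2), size=n)
    return np.maximum(draws - v, 0.0).mean()

for v in (-1.0, 2.0):   # poor vs. good current technology
    best_d = max(distances, key=lambda d: expected_improvement(v, d))
    print(f"current value {v:+.1f}: best search distance ~ {best_d:.1f}")
```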

19 citations

Posted Content
TL;DR: An analytically simple bandit model is provided that is more directly applicable to optimization theory than the traditional bandit problem, and a near-optimal strategy is determined for that model.
Abstract: We explore the 2-armed bandit with Gaussian payoffs as a theoretical model for optimization. We formulate the problem from a Bayesian perspective, and provide the optimal strategy for both 1 and 2 pulls. We present regions of parameter space where a greedy strategy is provably optimal. We also compare the greedy and optimal strategies to a genetic-algorithm-based strategy. In doing so we correct a previous error in the literature concerning the Gaussian bandit problem and the supposed optimality of genetic algorithms for this problem. Finally, we provide an analytically simple bandit model that is more directly applicable to optimization theory than the traditional bandit problem, and determine a near-optimal strategy for that model.
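
A minimal sketch of the Bayesian Gaussian-bandit setup described above, assuming unit observation variance and standard-normal priors; the greedy rule is compared against uniform random pulls here, not against the paper's exact optimal 1- and 2-pull strategies.

```python
# Two arms pay N(theta_i, 1) with standard-normal priors on theta_i, so
# conjugate updates keep each posterior Gaussian.  Compare a greedy rule
# (pull the arm with the higher posterior mean) with uniform random pulls.
import numpy as np

rng = np.random.default_rng(1)

def run(policy, horizon=20, trials=5_000):
    total = 0.0
    for _ in range(trials):
        theta = rng.normal(0.0, 1.0, size=2)    # true means, drawn from the prior
        mu, prec = np.zeros(2), np.ones(2)      # posterior mean / precision, prior N(0, 1)
        for _ in range(horizon):
            arm = policy(mu, prec)
            reward = rng.normal(theta[arm], 1.0)
            total += reward
            prec[arm] += 1.0                    # each unit-variance observation adds precision 1
            mu[arm] += (reward - mu[arm]) / prec[arm]   # precision-weighted posterior mean
    return total / (trials * horizon)

greedy = lambda mu, prec: int(np.argmax(mu))
random_pull = lambda mu, prec: int(rng.integers(2))
print("greedy mean reward:", round(run(greedy), 3))
print("random mean reward:", round(run(random_pull), 3))
```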

17 citations

Patent
01 Apr 2015
TL;DR: In this article, a sampling device is summarized as drawing a sample from a probability distribution realized by an analog processor, updating a set of samples to include that sample, and returning the set of samples.
Abstract: The systems, devices, articles, and methods generally relate to sampling from an available probability distribution. The samples may be used to create a desirable probability distribution, for instance for use in computing values used in computational techniques including Importance Sampling and Markov chain Monte Carlo systems. An analog processor may operate as a sample generator, for example by: programming the analog processor with a configuration of the number of programmable parameters for the analog processor, which corresponds to a probability distribution over the qubits of the analog processor; evolving the analog processor; and reading out states for the qubits. The states for the qubits in the plurality of qubits correspond to a sample from the probability distribution. Operation of the sampling device may be summarized as including updating a set of samples to include the sample from the probability distribution, and returning the set of samples.
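
A rough sketch of the workflow the abstract describes, with the analog processor replaced by a classical stand-in sampler; the tiny Ising model, the temperatures, and the function names below are illustrative assumptions, not D-Wave's interface. The returned sample set is then reused as an importance-sampling proposal for a different target distribution.

```python
# Stand-in "sample generator": a Metropolis sampler over an 8-spin Ising ring
# plays the role of the analog processor; the sample set is built up, returned,
# and then reweighted toward a different inverse temperature (importance sampling).
import numpy as np

rng = np.random.default_rng(2)
n, J = 8, 0.5                                 # 8 spins on a ring, uniform coupling

def energy(s):
    return -J * np.sum(s * np.roll(s, 1))     # nearest-neighbour Ising ring

def draw_samples(beta, num_samples, sweeps=50):
    """Return a set of spin configurations sampled at inverse temperature beta."""
    samples = []
    s = rng.choice([-1, 1], size=n)
    for _ in range(num_samples):
        for _ in range(sweeps * n):
            i = rng.integers(n)
            dE = 2 * J * s[i] * (s[(i - 1) % n] + s[(i + 1) % n])
            if dE <= 0 or rng.random() < np.exp(-beta * dE):
                s[i] = -s[i]
        samples.append(s.copy())              # update the sample set with the new sample
    return samples

beta_proposal, beta_target = 1.0, 1.5
samples = draw_samples(beta_proposal, 500)
# Self-normalized importance weights correct samples toward the target temperature.
w = np.array([np.exp(-(beta_target - beta_proposal) * energy(s)) for s in samples])
mean_E = np.sum(w * np.array([energy(s) for s in samples])) / np.sum(w)
print("estimated mean energy at beta =", beta_target, ":", round(mean_E, 3))
```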

17 citations

Journal ArticleDOI
TL;DR: The rms hole-hole separation in the two-hole ground states is determined, and evidence of important finite-size effects is found for t/J ≳ 1, for which the rms hole-hole separation is clearly constrained by the 4×4 lattice.
Abstract: We present accurate numerical results for low-lying one- and two-hole states in the t-J model on a 4×4 lattice. We find six level crossings in the one-hole ground state for 0 < t/J < ∞; accurate t/J values of these crossings and the associated ground-state quantum numbers are given. A degeneracy of k=(0,0), S=1/2, and S=3/2 one-hole levels at t/J=1/2 is noted, which is consistent with a recent analytical result. For small t/J, the S=1/2 one-hole and S=0 two-hole bandwidths on the 4×4 lattice are W_h = 1.1904457(1) t and W_hh = 2.575(4) t^2/J, respectively. The origin of these qualitatively different behaviors is discussed, and a simple relation is found between the small-t/J one-hole bandwidth and a static-hole ground-state matrix element. The linear-t term in W_h is apparently a finite-lattice artifact. As a measure of finite-size effects we determined the rms hole-hole separation in the two-hole ground states; we find evidence of important finite-size effects for t/J ≳ 1, for which the rms hole-hole separation is clearly constrained by the 4×4 lattice. Intermediate-t/J hole separations and binding energies for 0.3 ≲ t/J ≲ 1, however, scale approximately as powers of t/J, and can be used to give bulk-limit estimates for t/J = 3. In particular, we estimate that the bulk-limit ground-state rms hole-hole separation at t/J = 3 is approximately 1.8 a_0, corresponding to 7 Å in the high-temperature superconductors.

14 citations


Cited by
Journal ArticleDOI
01 Oct 2001
TL;DR: Internal estimates monitor error, strength, and correlation; these are used to show the response to increasing the number of features used in the splitting, and the ideas are also applicable to regression.
Abstract: Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error for forests converges a.s. to a limit as the number of trees in the forest becomes large. The generalization error of a forest of tree classifiers depends on the strength of the individual trees in the forest and the correlation between them. Using a random selection of features to split each node yields error rates that compare favorably to Adaboost (Y. Freund & R. Schapire, Machine Learning: Proceedings of the Thirteenth International Conference, 148–156), but are more robust with respect to noise. Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the splitting. Internal estimates are also used to measure variable importance. These ideas are also applicable to regression.
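
A quick way to see the two randomizations described above (bootstrap-sampled trees plus a random feature subset at each split) is scikit-learn's RandomForestClassifier, assuming that library is available; the dataset below is synthetic and the parameter values are arbitrary.

```python
# Each tree is grown on a bootstrap sample; max_features controls how many
# randomly chosen features each split may consider.  oob_score_ is an internal
# (out-of-bag) estimate of generalization accuracy.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, n_informative=6, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for m in (1, 4, "sqrt", None):               # number of features tried at each split
    forest = RandomForestClassifier(n_estimators=200, max_features=m,
                                    oob_score=True, random_state=0)
    forest.fit(X_tr, y_tr)
    print(f"max_features={m!r:6}  oob={forest.oob_score_:.3f}  test={forest.score(X_te, y_te):.3f}")
```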

79,257 citations

Book
18 Nov 2016
TL;DR: Deep learning as mentioned in this paper is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts, and it is used in many applications such as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames.
Abstract: Deep learning is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts. Because the computer gathers knowledge from experience, there is no need for a human computer operator to formally specify all the knowledge that the computer needs. The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones; a graph of these hierarchies would be many layers deep. This book introduces a broad range of topics in deep learning. The text offers mathematical and conceptual background, covering relevant concepts in linear algebra, probability theory and information theory, numerical computation, and machine learning. It describes deep learning techniques used by practitioners in industry, including deep feedforward networks, regularization, optimization algorithms, convolutional networks, sequence modeling, and practical methodology; and it surveys such applications as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames. Finally, the book offers research perspectives, covering such theoretical topics as linear factor models, autoencoders, representation learning, structured probabilistic models, Monte Carlo methods, the partition function, approximate inference, and deep generative models. Deep Learning can be used by undergraduate or graduate students planning careers in either industry or research, and by software engineers who want to begin using deep learning in their products or platforms. A website offers supplementary material for both readers and instructors.
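
As a bare-bones illustration of the "hierarchy of concepts" idea (not an example from the book), the numpy sketch below composes a few simple layers into a deep feedforward map; the layer sizes and random weights are arbitrary assumptions.

```python
# A deep feedforward network is just a composition of simple parameterized
# layers, each building a new representation from the one below it.
import numpy as np

rng = np.random.default_rng(0)

def layer(x, w, b):
    return np.maximum(0.0, x @ w + b)        # affine map followed by ReLU

sizes = [4, 16, 16, 3]                       # input -> two hidden layers -> output
params = [(rng.normal(0, sizes[i] ** -0.5, (sizes[i], sizes[i + 1])), np.zeros(sizes[i + 1]))
          for i in range(len(sizes) - 1)]

x = rng.normal(size=(5, sizes[0]))           # a batch of 5 inputs
h = x
for w, b in params:
    h = layer(h, w, b)                       # each layer builds on the previous representation
print("input shape:", x.shape, "-> output shape:", h.shape)
```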

38,208 citations

Book
01 Nov 2008
TL;DR: Numerical Optimization presents a comprehensive and up-to-date description of the most effective methods in continuous optimization, responding to the growing interest in optimization in engineering, science, and business by focusing on the methods that are best suited to practical problems.
Abstract: Numerical Optimization presents a comprehensive and up-to-date description of the most effective methods in continuous optimization. It responds to the growing interest in optimization in engineering, science, and business by focusing on the methods that are best suited to practical problems. For this new edition the book has been thoroughly updated throughout. There are new chapters on nonlinear interior methods and derivative-free methods for optimization, both of which are used widely in practice and the focus of much current research. Because of the emphasis on practical methods, as well as the extensive illustrations and exercises, the book is accessible to a wide audience. It can be used as a graduate text in engineering, operations research, mathematics, computer science, and business. It also serves as a handbook for researchers and practitioners in the field. The authors have strived to produce a text that is pleasant to read, informative, and rigorous - one that reveals both the beautiful nature of the discipline and its practical side.
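
For a concrete taste of the two method families highlighted above, the sketch below runs a quasi-Newton method and a derivative-free method from scipy.optimize (assumed available) on the Rosenbrock test function; it is an illustration, not material from the book.

```python
# BFGS (quasi-Newton, gradients estimated numerically here) versus Nelder-Mead
# (derivative-free) on the Rosenbrock function from the standard starting point.
import numpy as np
from scipy.optimize import minimize, rosen

x0 = np.array([-1.2, 1.0])
for method in ("BFGS", "Nelder-Mead"):
    res = minimize(rosen, x0, method=method)
    print(f"{method:12s} x* = {np.round(res.x, 4)}  f* = {res.fun:.2e}  evals = {res.nfev}")
```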

17,420 citations

Journal ArticleDOI
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).
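
The mail-filtering example in the fourth category can be made concrete with a toy learned filter; the sketch below uses scikit-learn (assumed available), and the handful of labelled messages is invented purely for illustration.

```python
# Learn a per-user mail filter from labelled examples instead of hand-written
# rules: bag-of-words features plus a naive Bayes classifier.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

messages = ["win a free prize now", "cheap meds online", "limited offer win cash",
            "meeting moved to 3pm", "lunch tomorrow?", "draft report attached"]
labels = ["reject", "reject", "reject", "keep", "keep", "keep"]

filter_model = make_pipeline(CountVectorizer(), MultinomialNB())
filter_model.fit(messages, labels)           # the system learns which mail the user rejects
print(filter_model.predict(["free cash offer", "report for tomorrow's meeting"]))
```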

13,246 citations

Journal ArticleDOI
TL;DR: A framework is developed to explore the connection between effective optimization algorithms and the problems they are solving and a number of "no free lunch" (NFL) theorems are presented which establish that for any algorithm, any elevated performance over one class of problems is offset by performance over another class.
Abstract: A framework is developed to explore the connection between effective optimization algorithms and the problems they are solving. A number of "no free lunch" (NFL) theorems are presented which establish that for any algorithm, any elevated performance over one class of problems is offset by performance over another class. These theorems result in a geometric interpretation of what it means for an algorithm to be well suited to an optimization problem. Applications of the NFL theorems to information-theoretic aspects of optimization and benchmark measures of performance are also presented. Other issues addressed include time-varying optimization problems and a priori "head-to-head" minimax distinctions between optimization algorithms, distinctions that result despite the NFL theorems' enforcing of a type of uniformity over all algorithms.
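
The "offset on average" claim can be checked by brute force on a tiny search space. The sketch below (an illustration of the theorem's flavour, with lower values treated as better) enumerates every function from a 5-point domain to 3 values and shows that a blind left-to-right scan and a value-adaptive walk achieve exactly the same average best-so-far at every step, because neither revisits points.

```python
# Average the best value found after m evaluations over ALL functions
# f: {0,...,4} -> {0,1,2}, for two different non-revisiting search strategies.
from itertools import product

POINTS, VALUES = range(5), (0, 1, 2)

def fixed_scan(f):
    """Evaluate the points left to right, ignoring what is observed."""
    return [f[x] for x in POINTS]

def adaptive_scan(f):
    """Start in the middle and let the last observed value steer the walk."""
    seen, trace, x = set(), [], 2
    while len(trace) < len(POINTS):
        seen.add(x)
        trace.append(f[x])
        left = [p for p in POINTS if p < x and p not in seen]
        right = [p for p in POINTS if p > x and p not in seen]
        if left or right:
            go_left = (f[x] == 0 and bool(left)) or not right
            x = max(left) if go_left else min(right)
    return trace

for algo in (fixed_scan, adaptive_scan):
    totals = [0] * len(POINTS)
    for f in product(VALUES, repeat=len(POINTS)):      # all 3**5 = 243 functions
        trace = algo(f)
        for m in range(len(POINTS)):
            totals[m] += min(trace[:m + 1])             # best-so-far after m+1 evaluations
    print(algo.__name__, [round(t / 3**5, 4) for t in totals])
```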

10,771 citations