Author
William G. Macready
Other affiliations: IBM, Santa Fe Institute, University of Toronto
Bio: William G. Macready is a researcher at D-Wave Systems. He has contributed to research on quantum computing and optimization problems, has an h-index of 34, and has co-authored 91 publications receiving 13,024 citations. His previous affiliations include IBM and the Santa Fe Institute.
Papers published on a yearly basis
Papers
•
TL;DR: In this paper, the authors address the question "Are some classes of combinatorial optimization problems intrinsically harder than others, without regard to the algorithm one uses, or can difficulty only be assessed relative to particular algorithms?", define a measure of a problem's hardness for a particular algorithm, and present two algorithm-independent quantities that use this measure to answer the question.
Abstract: We address the question "Are some classes of combinatorial optimization problems intrinsically harder than others, without regard to the algorithm one uses, or can difficulty only be assessed relative to particular algorithms?" We provide a measure of the hardness of a particular optimization problem for a particular optimization algorithm. We then present two algorithm-independent quantities that use this measure to provide answers to our question. In the first of these we average hardness over all possible algorithms for the optimization problem at hand. We show that according to this quantity, there is no distinction between optimization problems, and in this sense no problems are intrinsically harder than others. For the second quantity, rather than average over all algorithms we consider the level of hardness of a problem (or class of problems) for the algorithm that is optimal for that problem (or class of problems). Here there are classes of problems that are intrinsically harder than others.
11 citations
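The averaging result is easy to check on a toy example. The sketch below (my illustration, not code from the paper) identifies each black-box algorithm on a tiny domain with a fixed visiting order, defines hardness as the number of evaluations before the global optimum is first seen, and shows that hardness averaged over all visiting orders is identical for a smooth landscape and a needle-in-a-haystack landscape. Identifying algorithms with fixed visiting orders is a simplification; the paper's framework also covers adaptive algorithms.

```python
from itertools import permutations

def hardness(f, order):
    """Number of evaluations until the global optimum of f is first seen."""
    best = max(f)
    for steps, x in enumerate(order, start=1):
        if f[x] == best:
            return steps

def mean_hardness(f):
    """Hardness averaged over all deterministic, non-retracing visiting orders."""
    orders = list(permutations(range(len(f))))
    return sum(hardness(f, o) for o in orders) / len(orders)

smooth = [0, 1, 2, 3, 4]   # an "easy-looking" landscape
needle = [0, 0, 0, 0, 4]   # a needle-in-a-haystack landscape
print(mean_hardness(smooth), mean_hardness(needle))  # both 3.0: no intrinsic difference
```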
•
27 Jul 2007
TL;DR: Analog processors such as quantum processors are employed to predict the native structure of a protein from its primary structure; a target graph may be created of sufficient size to permit embedding of all possible native multi-dimensional topologies of the protein.
Abstract: Analog processors such as quantum processors are employed to predict the native structures of proteins based on a primary structure of a protein. A target graph may be created of sufficient size to permit embedding of all possible native multi-dimensional topologies of the protein. At least one location in the target graph may be assigned to represent a respective amino acid forming the protein. An energy function is generated based on the assigned locations in the target graph. The energy function is mapped onto an analog processor, which is evolved from an initial state to a final state, the final state predicting a native structure of the protein.
11 citations
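To make the patent's mapping more tangible, here is a minimal sketch of the kind of energy function it describes, in my own construction rather than the patent's: binary variables keyed by (a, s) indicate that amino acid a occupies lattice site s, contact energies reward favourable adjacencies, and penalty terms enforce that each amino acid is assigned exactly one site. The resulting QUBO is what would be mapped onto the analog processor. All numbers here are hypothetical.

```python
from collections import defaultdict

# Toy instance: 3 amino acids on a 2x2 lattice; contact[(a, b)] is the
# interaction energy when residues a and b sit on adjacent sites.
sites = [(0, 0), (0, 1), (1, 0), (1, 1)]
amino_acids = range(3)
contact = {(0, 1): -1.0, (1, 2): -0.5, (0, 2): 0.0}
PENALTY = 10.0  # weight of the one-site-per-amino-acid constraint

def adjacent(s, t):
    return abs(s[0] - t[0]) + abs(s[1] - t[1]) == 1

# QUBO coefficients, keyed by pairs of binary variables (a, s).
Q = defaultdict(float)

# Interaction terms: contact energy when a and b occupy adjacent sites.
for (a, b), e in contact.items():
    for s in sites:
        for t in sites:
            if adjacent(s, t):
                Q[((a, s), (b, t))] += e

# Constraint terms: expanding (sum_s x[a,s] - 1)^2 gives a linear reward
# for occupying a site and a quadratic penalty for occupying two.
for a in amino_acids:
    for s in sites:
        Q[((a, s), (a, s))] += -PENALTY
        for t in sites:
            if s < t:
                Q[((a, s), (a, t))] += 2 * PENALTY

print(f"built a QUBO with {len(Q)} nonzero terms")
# A full encoding would also penalize two residues on one site and
# enforce chain connectivity between consecutive residues.
```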
•
18 Nov 2011
TL;DR: In this article, the weights in an objective comprising a set of weights and a dictionary are cast as Boolean variables; a quantum processor optimizes the objective over the Boolean weights by minimizing the resulting QUBO, and a non-quantum processor optimizes the objective for the dictionary by updating at least some of its columns.
Abstract: Methods for solving a computational problem including minimizing an objective including a set of weights and a dictionary by casting the weights as Boolean variables and alternately using a quantum processor and a non-quantum processor to successively optimize the weights and the dictionary, respectively. A first set of values for the dictionary is guessed and the objective is mapped to a QUBO. A quantum processor is used to optimize the objective for the Boolean weights based on the first set of values for the dictionary by minimizing the resulting QUBO. A non-quantum processor is used to optimize the objective for the dictionary based on the Boolean weights by updating at least some of the columns of the dictionary. These processes are successively repeated until a solution criterion is met. Minimization of the objective may be used to generate features in a learning problem and/or in data compression.
10 citations
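A minimal classical simulation of this alternating scheme is sketched below, assuming a dictionary small enough that the Boolean subproblem can be brute-forced; in the patent that subproblem is instead expressed as a QUBO and handed to the quantum processor. All names and sizes here are illustrative.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)
m, k, n = 6, 4, 20              # signal dimension, dictionary atoms, samples
Y = rng.normal(size=(m, n))     # signals to reconstruct
D = rng.normal(size=(m, k))     # guessed initial dictionary

def best_boolean_weights(y, D):
    """Brute-force the Boolean weights minimizing ||y - D w||^2 -- the
    subproblem the patent maps to a QUBO for the quantum processor."""
    candidates = (np.array(c, dtype=float)
                  for c in product([0, 1], repeat=D.shape[1]))
    return min(candidates, key=lambda w: np.sum((y - D @ w) ** 2))

for sweep in range(10):  # the patent repeats until a solution criterion is met
    # "Quantum" step: optimize the Boolean weights for the fixed dictionary.
    W = np.column_stack([best_boolean_weights(Y[:, i], D) for i in range(n)])
    # Non-quantum step: update the dictionary columns for the fixed weights,
    # D = argmin_D ||Y - D W||_F^2 = Y W^+ (least squares via pseudo-inverse).
    D = Y @ np.linalg.pinv(W)

print(f"final residual: {np.linalg.norm(Y - D @ W):.3f}")
```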
•
TL;DR: The existence of a phase transition in combinatorial optimization problems is demonstrated in this paper: as local search algorithms are parallelized, solution quality first improves and then sharply degrades to no better than random search, a transition characterized with finite-size scaling for a family of generalized spin-glass models and the traveling salesman problem.
Abstract: We demonstrate the existence of a phase transition in combinatorial optimization problems. For many of these problems, as local search algorithms are parallelized, the quality of solutions first improves and then sharply degrades to no better than random search. This transition can be successfully characterized using finite-size scaling, a technique borrowed from statistical physics. We demonstrate our results for a family of generalized spin-glass models and the Traveling Salesman Problem. We determine critical exponents, investigate the effects of noise, and discuss conditions for the existence of the phase transition.
9 citations
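The degradation with parallelism can be reproduced qualitatively in a few lines. The sketch below (my stand-in, not the paper's models) runs greedy local search on a random symmetric spin glass, flipping the tau most-improving spins simultaneously; for small tau quality is good, while for large tau the simultaneous flips interfere and final energies deteriorate. The paper characterizes the transition properly with finite-size scaling.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 200
J = rng.normal(size=(N, N))
J = (J + J.T) / 2               # symmetric couplings
np.fill_diagonal(J, 0)

def energy(s):
    return -0.5 * s @ J @ s

def parallel_descent(tau, steps=200):
    """Greedy search flipping the tau most-improving spins at once."""
    s = rng.choice([-1.0, 1.0], size=N)
    for _ in range(steps):
        dE = 2 * s * (J @ s)                 # energy change of each single flip
        improving = np.flatnonzero(dE < 0)
        if improving.size == 0:
            break                            # single-flip local minimum
        idx = improving[np.argsort(dE[improving])[:tau]]
        s[idx] = -s[idx]                     # simultaneous flips: gains interfere
    return energy(s)

for tau in (1, 5, 20, 80):
    mean_e = np.mean([parallel_descent(tau) for _ in range(5)])
    print(f"tau = {tau:3d}   mean final energy = {mean_e:9.1f}")
```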
•
TL;DR: Taking a system's self-dissimilarity over various scales as a complexity "signature" of the system, the authors compare the complexity signatures of wholly different kinds of systems (e.g., systems involving information density in a digital computer, species densities in a rainforest, or capital density in an economy).
Abstract: For systems usually characterized as complex/living/intelligent, the spatio-temporal patterns exhibited on different scales differ markedly from one another. (E.g., the biomass distribution of a human body looks very different depending on the spatial scale at which one examines that biomass.) Conversely, the density patterns at different scales in non-living/simple systems (e.g., gases, mountains, crystals) do not vary significantly from one another. Such self-dissimilarity can be empirically measured on almost any real-world data set involving spatio-temporal densities, be they mass densities, species densities, or symbol densities. Accordingly, taking a system's (empirically measurable) self-dissimilarity over various scales as a complexity "signature" of the system, we can compare the complexity signatures of wholly different kinds of systems (e.g., systems involving information density in a digital computer vs. systems involving species densities in a rainforest, vs. capital density in an economy etc.). Signatures can also be clustered, to provide an empirically determined taxonomy of kinds of systems that share organizational traits. Many of our candidate self-dissimilarity measures can also be calculated (or at least approximated) for physical models. The measure of dissimilarity between two scales that we finally choose is the amount of extra information on one of the scales beyond that which exists on the other scale. It is natural to determine this "added information" using a maximum entropy inference of the pattern at the second scale, based on the pattern provided at the first scale. We briefly discuss using our measure with other inference mechanisms (e.g., Kolmogorov complexity-based inference, fractal-dimension preserving inference, etc.).
9 citations
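The chosen measure, the extra information at one scale beyond a maximum-entropy inference from another, can be illustrated very simply. In the sketch below (my toy construction, not the paper's estimators) the coarse scale is a block-summed density, the max-entropy inference spreads each block's mass uniformly, and the added information is the KL divergence between the true fine-scale distribution and that inference.

```python
import numpy as np

def added_information(density, block):
    """Bits of extra information in a fine-scale density beyond what a
    maximum-entropy inference from its coarse (block-summed) version predicts."""
    p = density / density.sum()                 # fine-scale distribution
    coarse = p.reshape(-1, block).sum(axis=1)   # coarse-scale distribution
    q = np.repeat(coarse / block, block)        # max-entropy reconstruction
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))  # KL divergence

structured = np.tile([4.0, 0.0, 0.0, 0.0], 16)  # strong sub-block structure
featureless = np.ones(64)                        # identical at every scale
print(added_information(structured, block=4))    # 2.0 bits: scales dissimilar
print(added_information(featureless, block=4))   # 0.0 bits: self-similar
```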
Cited by
•
TL;DR: Internal estimates monitor error, strength, and correlation; these are used to show the response to increasing the number of features used in the splitting, and the ideas are also applicable to regression.
Abstract: Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error for forests converges a.s. to a limit as the number of trees in the forest becomes large. The generalization error of a forest of tree classifiers depends on the strength of the individual trees in the forest and the correlation between them. Using a random selection of features to split each node yields error rates that compare favorably to Adaboost (Y. Freund & R. Schapire, Machine Learning: Proceedings of the Thirteenth International Conference, 1996, 148–156), but are more robust with respect to noise. Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the splitting. Internal estimates are also used to measure variable importance. These ideas are also applicable to regression.
79,257 citations
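Every quantity the abstract mentions, random feature selection at each split, internal (out-of-bag) error estimates, and variable importance, is exposed by standard implementations. A minimal sketch using scikit-learn (a later reimplementation, not Breiman's original code):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)

# max_features="sqrt" is the random feature selection at each split;
# oob_score=True enables the internal (out-of-bag) error estimate.
forest = RandomForestClassifier(
    n_estimators=500, max_features="sqrt", oob_score=True, random_state=0)
forest.fit(X, y)

print(f"out-of-bag accuracy: {forest.oob_score_:.3f}")
# Internal estimate of variable importance:
top3 = forest.feature_importances_.argsort()[::-1][:3]
print("three most important feature indices:", top3)
```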
•
TL;DR: This book introduces deep learning, a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts, and surveys applications such as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames.
Abstract: Deep learning is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts. Because the computer gathers knowledge from experience, there is no need for a human computer operator to formally specify all the knowledge that the computer needs. The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones; a graph of these hierarchies would be many layers deep. This book introduces a broad range of topics in deep learning. The text offers mathematical and conceptual background, covering relevant concepts in linear algebra, probability theory and information theory, numerical computation, and machine learning. It describes deep learning techniques used by practitioners in industry, including deep feedforward networks, regularization, optimization algorithms, convolutional networks, sequence modeling, and practical methodology; and it surveys such applications as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames. Finally, the book offers research perspectives, covering such theoretical topics as linear factor models, autoencoders, representation learning, structured probabilistic models, Monte Carlo methods, the partition function, approximate inference, and deep generative models. Deep Learning can be used by undergraduate or graduate students planning careers in either industry or research, and by software engineers who want to begin using deep learning in their products or platforms. A website offers supplementary material for both readers and instructors.
38,208 citations
•
01 Nov 2008
TL;DR: Numerical Optimization presents a comprehensive and up-to-date description of the most effective methods in continuous optimization, responding to the growing interest in optimization in engineering, science, and business by focusing on the methods that are best suited to practical problems.
Abstract: Numerical Optimization presents a comprehensive and up-to-date description of the most effective methods in continuous optimization. It responds to the growing interest in optimization in engineering, science, and business by focusing on the methods that are best suited to practical problems. For this new edition the book has been thoroughly updated throughout. There are new chapters on nonlinear interior methods and derivative-free methods for optimization, both of which are used widely in practice and the focus of much current research. Because of the emphasis on practical methods, as well as the extensive illustrations and exercises, the book is accessible to a wide audience. It can be used as a graduate text in engineering, operations research, mathematics, computer science, and business. It also serves as a handbook for researchers and practitioners in the field. The authors have strived to produce a text that is pleasant to read, informative, and rigorous - one that reveals both the beautiful nature of the discipline and its practical side.
17,420 citations
•
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).
13,246 citations
•
IBM
TL;DR: A framework is developed to explore the connection between effective optimization algorithms and the problems they are solving and a number of "no free lunch" (NFL) theorems are presented which establish that for any algorithm, any elevated performance over one class of problems is offset by performance over another class.
Abstract: A framework is developed to explore the connection between effective optimization algorithms and the problems they are solving. A number of "no free lunch" (NFL) theorems are presented which establish that for any algorithm, any elevated performance over one class of problems is offset by performance over another class. These theorems result in a geometric interpretation of what it means for an algorithm to be well suited to an optimization problem. Applications of the NFL theorems to information-theoretic aspects of optimization and benchmark measures of performance are also presented. Other issues addressed include time-varying optimization problems and a priori "head-to-head" minimax distinctions between optimization algorithms, distinctions that result despite the NFL theorems' enforcing of a type of uniformity over all algorithms.
10,771 citations
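The core NFL claim can be verified exhaustively on a tiny search space. The sketch below (an illustration restricted to non-adaptive deterministic algorithms; the theorems themselves cover far more general algorithms) enumerates every objective function on a 3-point domain and shows that two different fixed search orders have identical average performance after any number of evaluations.

```python
from itertools import product

X = range(3)          # search space
Y = (0, 1)            # possible objective values

def best_after(f, order, m):
    """Best objective value seen after m evaluations in a fixed visiting order."""
    return max(f[x] for x in order[:m])

order_a = (0, 1, 2)   # one deterministic "algorithm"
order_b = (2, 0, 1)   # a different one

functions = list(product(Y, repeat=len(X)))   # all 8 objective functions
for m in (1, 2, 3):
    avg_a = sum(best_after(f, order_a, m) for f in functions) / len(functions)
    avg_b = sum(best_after(f, order_b, m) for f in functions) / len(functions)
    print(m, avg_a, avg_b)   # equal at every m: gains on one class are offset elsewhere
```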