TL;DR: A practical adaptive step size random search algorithm is proposed, and experimental experience shows the superiority of random search over other methods for sufficiently high dimension.
Abstract: Fixed step size random search for minimization of functions of several parameters is described and compared with the fixed step size gradient method for a particular surface. A theoretical technique, using the optimum step size at each step, is analyzed. A practical adaptive step size random search algorithm is then proposed, and experimental experience is reported that shows the superiority of random search over other methods for sufficiently high dimension.
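The fixed step size scheme described in this abstract is simple enough to sketch. Below is a minimal illustration, not the paper's exact procedure; the function and parameter names are invented for the example.

```python
import numpy as np

def fixed_step_random_search(f, x0, step, n_iters=10_000, seed=0):
    """Fixed step size random search: try a uniformly random direction,
    move by a fixed step, and keep the move only if it lowers f."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    fx = f(x)
    for _ in range(n_iters):
        d = rng.standard_normal(x.shape)
        d /= np.linalg.norm(d)            # uniform direction on the unit sphere
        candidate = x + step * d
        f_cand = f(candidate)
        if f_cand < fx:                   # accept only improving moves
            x, fx = candidate, f_cand
    return x, fx

# Example on the sphere function, where the method's behavior is well understood.
x_best, f_best = fixed_step_random_search(lambda x: np.sum(x**2),
                                          x0=np.ones(20), step=0.1)
```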
TL;DR: Two general convergence proofs for random search algorithms are given, and it is shown how these extend the results available for specific variants of the conceptual algorithm studied here.
Abstract: We give two general convergence proofs for random search algorithms. We review the literature and show how our results extend those available for specific variants of the conceptual algorithm studied here. We then exploit the convergence results to examine convergence rates and to actually design implementable methods. Finally we report on some computational experience.
1,550 citations
Cites background or methods from "Adaptive step size random search"
...where p is the optimal step size, see [12]....
[...]
...Since the expected decrease in the function value is p2p [12], we see that the expected step in the direction of the solution is cp/n....
[...]
...This "linearity" was first observed by Schumer and Steiglitz [12], the algorithm that they propose has K ' 80....
[...]
...Schumer and Steiglitz [12] introduce adaptive step size methods; here p_k is increased or decreased depending on the number of successes or failures in finding lower values of f on S in the preceding iterations....
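The success/failure rule quoted in the last excerpt can be sketched as follows. This is a hedged illustration: the window length and the grow/shrink factors are invented for the example and are not the constants used by Schumer and Steiglitz.

```python
import numpy as np

def adaptive_step_random_search(f, x0, step=1.0, n_iters=10_000,
                                window=20, grow=1.5, shrink=0.6, seed=0):
    """Random search whose step size grows after a run of successes and
    shrinks after a run of failures (illustrative constants)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    fx = f(x)
    successes = 0
    for k in range(1, n_iters + 1):
        d = rng.standard_normal(x.shape)
        d /= np.linalg.norm(d)
        candidate = x + step * d
        f_cand = f(candidate)
        if f_cand < fx:
            x, fx = candidate, f_cand
            successes += 1
        if k % window == 0:               # adapt from the recent success rate
            step *= grow if successes > window // 2 else shrink
            successes = 0
    return x, fx
```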
TL;DR: The numerical performance of the BGA is demonstrated on a test suite of multimodal functions, and the number of function evaluations needed to locate the optimum scales only as n ln(n), where n is the number of parameters.
Abstract: In this paper a new genetic algorithm called the Breeder Genetic Algorithm (BGA) is introduced. The BGA is based on artificial selection similar to that used by human breeders. A predictive model for the BGA is presented that is derived from quantitative genetics. The model is used to predict the behavior of the BGA for simple test functions. Different mutation schemes are compared by computing the expected progress to the solution. The numerical performance of the BGA is demonstrated on a test suite of multimodal functions. The number of function evaluations needed to locate the optimum scales only as n ln(n) where n is the number of parameters. Results up to n = 1000 are reported.
1,267 citations
Cites background from "Adaptive step size random search"
...For r_h = 1.225 r/√n the expected progress was computed for large n and small r in (Schumer & Steiglitz, 1968): E(n, r)/r = 0.2/n (21). We now turn to normally distributed mutation....
[...]
...The optimal normalized average progress for uniformly distributed mutation decreases exponentially with n. The reason for this behavior is the well-known fact that the volume of the unit sphere in n dimensions goes to zero for n → ∞. Better results can be obtained if the uniform distribution is restricted to a hypersphere with radius r_h. For r_h = 1.225 r/√n the expected progress was computed for large n and small r in Schumer and Steiglitz....
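The restriction to a hypersphere mentioned in the excerpt is straightforward to implement. A sketch, assuming the distance r to the optimum is known (it is known only for analysis; the names here are illustrative):

```python
import numpy as np

def uniform_in_ball(n, radius, rng):
    """Uniform sample inside an n-dimensional ball: uniform direction
    scaled by U**(1/n) so that volume is covered uniformly."""
    d = rng.standard_normal(n)
    d /= np.linalg.norm(d)
    return radius * rng.uniform() ** (1.0 / n) * d

# Mutation radius from the excerpt: r_h = 1.225 * r / sqrt(n).
rng = np.random.default_rng(0)
n, r = 100, 1.0
mutation = uniform_in_ball(n, 1.225 * r / np.sqrt(n), rng)
```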
TL;DR: In this paper, the authors present a set of 175 benchmark functions for unconstrained optimisation problems with diverse properties in terms of modality, separability, and valley landscape.
Abstract: Test functions are important to validate and compare the performance of optimisation algorithms. There have been many test or benchmark functions reported in the literature; however, there is no standard list or set of benchmark functions. Ideally, test functions should have diverse properties to be truly useful for testing new algorithms in an unbiased way. For this purpose, we have reviewed and compiled a rich set of 175 benchmark functions for unconstrained optimisation problems with diverse properties in terms of modality, separability, and valley landscape. This is by far the most complete set of functions so far in the literature, and it can be expected that this complete set of functions can be used for validation of new optimisation algorithms in the future.
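For concreteness, here are three functions of the kind such suites collect, annotated with the properties the authors catalogue (modality, separability, valley landscape). The definitions are the standard textbook forms and may differ in constants from the compiled set.

```python
import numpy as np

def sphere(x):
    """Unimodal and separable; global minimum 0 at the origin."""
    return np.sum(x**2)

def rosenbrock(x):
    """Non-separable, with a long narrow curved valley; minimum 0 at (1, ..., 1)."""
    return np.sum(100.0 * (x[1:] - x[:-1]**2)**2 + (1.0 - x[:-1])**2)

def rastrigin(x):
    """Highly multimodal yet separable; global minimum 0 at the origin."""
    return np.sum(x**2 - 10.0 * np.cos(2.0 * np.pi * x) + 10.0)
```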
TL;DR: The Square Attack is a score-based black-box attack that does not rely on local gradient information and thus is not affected by gradient masking, and can outperform gradient-based white-box attacks on the standard benchmarks achieving a new state-of-the-art in terms of the success rate.
Abstract: We propose the Square Attack, a score-based black-box $l_2$- and $l_\infty$-adversarial attack that does not rely on local gradient information and thus is not affected by gradient masking. Square Attack is based on a randomized search scheme which selects localized square-shaped updates at random positions so that at each iteration the perturbation is situated approximately at the boundary of the feasible set. Our method is significantly more query efficient and achieves a higher success rate compared to the state-of-the-art methods, especially in the untargeted setting. In particular, on ImageNet we improve the average query efficiency in the untargeted setting for various deep networks by a factor of at least $1.8$ and up to $3$ compared to the recent state-of-the-art $l_\infty$-attack of Al-Dujaili & O'Reilly. Moreover, although our attack is black-box, it can also outperform gradient-based white-box attacks on the standard benchmarks achieving a new state-of-the-art in terms of the success rate. The code of our attack is available at this https URL.
362 citations
Cites background from "Adaptive step size random search"
...The Square Attack exploits random search [46,48], which is one of the simplest approaches to black-box optimization....
[...]
...Many variants of random search have been introduced [38,48,47], which differ mainly in how the random perturbation is chosen at each iteration (the original...
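As a rough illustration of how random search drives a score-based attack, here is a deliberately simplified l∞ loop. It keeps the perturbation on the boundary of the feasible set, as the abstract describes, but it omits the square-shaped localized updates and the schedules that define the actual Square Attack; all names are invented for the example.

```python
import numpy as np

def linf_random_search_attack(loss, x, eps, n_queries=1_000, seed=0):
    """Toy score-based black-box attack (NOT the Square Attack update rule):
    flip the perturbation sign on a random subset of coordinates and keep
    the change whenever the attack loss decreases."""
    rng = np.random.default_rng(seed)
    delta = eps * rng.choice([-1.0, 1.0], size=x.shape)  # start on the l_inf boundary
    best = loss(x + delta)
    for _ in range(n_queries):
        mask = rng.random(x.shape) < 0.05                # resample ~5% of coordinates
        candidate = np.where(mask, -delta, delta)        # stays on the boundary
        val = loss(x + candidate)
        if val < best:
            delta, best = candidate, val
    return x + delta   # valid-pixel-range clipping omitted for brevity
```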
TL;DR: A method is described for the minimization of a function of n variables, which depends on the comparison of function values at the (n + 1) vertices of a general simplex, followed by the replacement of the vertex with the highest value by another point.
Abstract: A method is described for the minimization of a function of n variables, which depends on the comparison of function values at the (n + 1) vertices of a general simplex, followed by the replacement of the vertex with the highest value by another point. The simplex adapts itself to the local landscape, and contracts on to the final minimum. The method is shown to be effective and computationally compact. A procedure is given for the estimation of the Hessian matrix in the neighbourhood of the minimum, needed in statistical estimation problems.
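This is the Nelder–Mead simplex method. Rather than re-implementing the reflection, expansion, and contraction steps, here is a usage sketch via SciPy's implementation (assuming SciPy is available; tolerances are illustrative):

```python
import numpy as np
from scipy.optimize import minimize, rosen  # rosen is SciPy's Rosenbrock function

# The simplex reflects, expands, and contracts around its best vertices,
# adapting to the local landscape and contracting onto the minimum.
result = minimize(rosen, x0=np.zeros(5), method="Nelder-Mead",
                  options={"xatol": 1e-8, "fatol": 1e-8, "maxiter": 20_000})
print(result.x, result.fun)   # should approach (1, 1, 1, 1, 1) and 0
```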
TL;DR: In the design of experiments for the purpose of seeking maxima, random methods are shown to have an important place in the consideration of the experimenter; the rationale, application, and relative merits of random methods are discussed.
Abstract: In the design of experiments for the purpose of seeking maxima, random methods are shown to have an important place in the consideration of the experimenter. For a rather large class of experimental situations, an elementary probability formulation leads to an exact statement for the number of trials required in the experiment. The rationale, application, and relative merits of random methods are discussed.
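The "elementary probability formulation" is presumably the standard best-of-N argument for pure random sampling; a hedged reconstruction:

```latex
% Probability that at least one of N independent uniform trials lands in a
% target set occupying a fraction p of the search region:
P = 1 - (1 - p)^N .
% Requiring P \ge q yields the exact number of trials:
N \ge \frac{\ln(1 - q)}{\ln(1 - p)} .
% Example: p = 0.01, q = 0.99 gives N \ge 459 trials.
```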
TL;DR: An iterative method is described that is not unlike the conjugate gradient method of Hestenes and Stiefel (1952), finds stationary values of a general function, and has second-order convergence.
Abstract: Eighteen months ago Rosenbrock (1960) published a paper in this journal on finding the greatest or least value of a function of several variables. A number of methods were listed and they all have first-order convergence. Six months ago Martin and Tee (1961) published a paper in which they mentioned gradient methods which have second-order convergence for finding the minimum of a quadratic positive definite function. In this paper will be described an iterative method which is not unlike the conjugate gradient method of Hestenes and Stiefel (1952), and which finds stationary values of a general function. It has second-order convergence, so near a stationary value it converges more quickly than Rosenbrock's variation of the steepest descents method and, although each iteration is rather longer because the method is applicable to a general function, the rate of convergence is comparable to that of the more powerful of the gradient methods described by Martin and Tee.