Showing papers on "Maxima and minima published in 2015"

PDF

Open Access

Proceedings Article•

Escaping From Saddle Points --- Online Stochastic Gradient for Tensor Decomposition

[...]

Rong Ge¹, Furong Huang², Chi Jin³, Yang Yuan⁴•Institutions (4)

Microsoft¹, University of California, Irvine², University of California, Berkeley³, Cornell University⁴

26 Jun 2015

TL;DR: In this article, the authors show that stochastic gradient descent converges to a local minimum in a polynomial number of iterations for orthogonal tensor decomposition.

...read moreread less

Abstract: We analyze stochastic gradient descent for optimizing non-convex functions. In many cases for non-convex functions the goal is to find a reasonable local minimum, and the main concern is that gradient updates are trapped in saddle points. In this paper we identify strict saddle property for non-convex problem that allows for efficient optimization. Using this property we show that from an arbitrary starting point, stochastic gradient descent converges to a local minimum in a polynomial number of iterations. To the best of our knowledge this is the first work that gives global convergence guarantees for stochastic gradient descent on non-convex functions with exponentially many local minima and saddle points. Our analysis can be applied to orthogonal tensor decomposition, which is widely used in learning a rich class of latent variable models. We propose a new optimization formulation for the tensor decomposition problem that has strict saddle property. As a result we get the first online algorithm for orthogonal tensor decomposition with global convergence guarantee.

...read moreread less

1,016 citations

Proceedings Article•

The Loss Surfaces of Multilayer Networks

[...]

Anna Choromanska¹, Mikael Henaff², Michael Mathieu², Gérard Ben Arous², Yann LeCun² - Show less +1 more•Institutions (2)

Wrocław Medical University¹, New York University²

21 Feb 2015

TL;DR: In this paper, the authors study the connection between the loss function of a simple model of the fully-connected feed-forward neural network and the Hamiltonian of the spherical spin-glass model under the assumptions of variable independence, redundancy in network parametrization, and uniformity.

...read moreread less

Abstract: We study the connection between the highly non-convex loss function of a simple model of the fully-connected feed-forward neural network and the Hamiltonian of the spherical spin-glass model under the assumptions of: i) variable independence, ii) redundancy in network parametrization, and iii) uniformity. These assumptions enable us to explain the complexity of the fully decoupled neural network through the prism of the results from random matrix theory. We show that for large-size decoupled networks the lowest critical values of the random loss function form a layered structure and they are located in a well-defined band lower-bounded by the global minimum. The number of local minima outside that band diminishes exponentially with the size of the network. We empirically verify that the mathematical model exhibits similar behavior as the computer simulations, despite the presence of high dependencies in real networks. We conjecture that both simulated annealing and SGD converge to the band of low critical points, and that all critical points found there are local minima of high quality measured by the test error. This emphasizes a major difference between largeand small-size networks where for the latter poor quality local minima have nonzero probability of being recovered. Finally, we prove that recovering the global minimum becomes harder as the network size increases and that it is in practice irrelevant as global minimum often leads to overfitting.

...read moreread less

970 citations

Proceedings Article•DOI•

Information-theoretic mapping using Cauchy-Schwarz Quadratic Mutual Information

[...]

Benjamin Charrow¹, Sikang Liu¹, Vijay Kumar¹, Nathan Michael²•Institutions (2)

University of Pennsylvania¹, Carnegie Mellon University²

26 May 2015

TL;DR: A computationally efficient control policy for active perception that incorporates explicit models of sensing and mobility to build 3D maps with ground and aerial robots is developed.

...read moreread less

Abstract: We develop a computationally efficient control policy for active perception that incorporates explicit models of sensing and mobility to build 3D maps with ground and aerial robots Like previous work, our policy maximizes an information-theoretic objective function between the discrete occupancy belief distribution (eg, voxel grid) and future measurements that can be made by mobile sensors However, our work is unique in three ways First, we show that by using Cauchy-Schwarz Quadratic Mutual Information (CSQMI), we get significant gains in efficiency Second, while most previous methods adopt a myopic, gradient-following approach that yields poor convergence properties, our algorithm searches over a set of paths and is less susceptible to local minima In doing so, we explicitly incorporate models of sensors, and model the dependence (and independence) of measurements over multiple time steps in a path Third, because we consider models of sensing and mobility, our method naturally applies to both ground and aerial vehicles The paper describes the basic models, the problem formulation and the algorithm, and demonstrates applications via simulation and experimentation

...read moreread less

158 citations

Journal Article•DOI•

Consistent use of effective potentials

[...]

Anders Andreassen¹, William Frost¹, Matthew D. Schwartz¹•Institutions (1)

Harvard University¹

28 Jan 2015-Physical Review D

TL;DR: In this article, it was shown that an infinite class of loops can be summed (and must be summed) to give a gauge-invariant value for the potential at its minimum, and the exact potential depends on both the scale at which it is calculated and the normalization of the fields, but the vacuum energy does not.

...read moreread less

Abstract: It is well known that effective potentials can be gauge dependent while their values at extrema should be gauge invariant. Unfortunately, establishing this invariance in perturbation theory is not straightforward, since contributions from arbitrarily high-order loops can be of the same size. We show in massless scalar QED that an infinite class of loops can be summed (and must be summed) to give a gauge-invariant value for the potential at its minimum. In addition, we show that the exact potential depends on both the scale at which it is calculated and the normalization of the fields, but the vacuum energy does not. Using these insights, we propose a method to extract some physical quantities from effective potentials which is self-consistent order by order in perturbation theory, including improvement with the renormalization group.

...read moreread less

119 citations

Journal Article•DOI•

Particle Swarm Optimization with Sequential Niche Technique for Dynamic Finite Element Model Updating

[...]

Faisal Shabbir¹, Piotr Omenzetter²•Institutions (2)

University of Engineering and Technology¹, University of Aberdeen²

01 May 2015-Computer-aided Civil and Infrastructure Engineering

TL;DR: A methodology based on particle swarm optimization (PSO), a GOA, with sequential niche technique (SNT) for FE model updating is proposed and explored and considerably increases the confidence in finding the global minimum.

...read moreread less

Abstract: Due to uncertainties associated with material properties, structural geometry, boundary conditions, and connectivity of structural parts as well as inherent simplifying assumptions in the development of finite element (FE) models, actual behavior of structures often differs from model predictions FE model updating comprises a multitude of techniques that systematically calibrate FE models in order to match experimental results Updating of structural models can be posed as an optimization problem where model parameters that minimize the errors between the responses of the model and actual structure are sought However, due to limited number of experimental responses and measurement errors, the optimization problem may have multiple admissible solutions in the search domain Global optimization algorithms (GOAs) are useful and efficient tools in such situations as they try to find the globally optimal solution out of many possible local minima, but are not totally immune to missing the right minimum in complex problems such as those encountered in updating A methodology based on particle swarm optimization (PSO), a GOA, with sequential niche technique (SNT) for FE model updating is proposed and explored in this article The combination of PSO and SNT enables a systematic search for multiple minima and considerably increases the confidence in finding the global minimum The method is applied to FE model updating of a pedestrian cable-stayed bridge using modal data from full-scale dynamic testing

...read moreread less

115 citations

Journal Article•DOI•

Cognition-Driven Formulation of Space Mapping for Equal-Ripple Optimization of Microwave Filters

[...]

Chao Zhang¹, Feng Feng¹, Venu-Madhav-Reddy Gongal-Reddy¹, Qi-Jun Zhang¹, John W. Bandler² - Show less +1 more•Institutions (2)

Carleton University¹, McMaster University²

02 Jun 2015-IEEE Transactions on Microwave Theory and Techniques

TL;DR: In this paper, a cognition-driven space mapping method was proposed for microwave filter optimization, which utilizes two sets of intermediate feature space parameters, including feature frequency parameters and ripple height parameters.

...read moreread less

Abstract: Space mapping is a recognized method for speeding up electromagnetic (EM) optimization. Existing space-mapping approaches belong to the class of surrogate-based optimization methods. This paper proposes a cognition-driven formulation of space mapping that does not require explicit surrogates. The proposed method is applied to EM-based filter optimization. The new technique utilizes two sets of intermediate feature space parameters, including feature frequency parameters and ripple height parameters. The design variables are mapped to the feature frequency parameters, which are further mapped to the ripple height parameters. By formulating the cognition-driven optimization directly in the feature space, our method increases optimization efficiency and the ability to avoid being trapped in local minima. The technique is suitable for design of filters with equal-ripple responses. It is illustrated by two microwave filter examples.

...read moreread less

111 citations

Journal Article•DOI•

Parametric Robust Structured Control Design

[...]

Pierre Apkarian, Minh N. Dao¹, Dominikus Noll•Institutions (1)

Hanoi National University of Education¹

29 Jan 2015-IEEE Transactions on Automatic Control

TL;DR: In this paper, a nonsmooth minimization method tailored to functions which are semi-infinite minima of smooth functions is presented, which can deal with complex problems involving multiple possibly repeated uncertain parameters.

...read moreread less

Abstract: We present a new approach to parametric robust controller design, where we compute controllers of arbitrary order and structure which minimize the worst-case $H_{\infty} $ norm over a pre-specified set of uncertain parameters. At the core of our method is a nonsmooth minimization method tailored to functions which are semi-infinite minima of smooth functions. A rich test bench and a more detailed example illustrate the potential of the technique, which can deal with complex problems involving multiple possibly repeated uncertain parameters.

...read moreread less

109 citations

Journal Article•DOI•

An automated transition state search using classical trajectories initialized at multiple minima.

[...]

Emilio Martínez-Núñez¹•Institutions (1)

University of Santiago de Compostela¹

27 May 2015-Physical Chemistry Chemical Physics

TL;DR: An improved automated search procedure is developed, which consists of iteratively running different ensembles of trajectories initialized at different minima, and it is suggested that at least these two channels: three-body dissociation and CO elimination occur on the ground electronic state.

...read moreread less

Abstract: Very recently, we proposed an automated method for finding transition states of chemical reactions using dynamics simulations; the method has been termed Transition State Search using Chemical Dynamics Simulations (TSSCDS) (E. Martinez-Nunez, J. Comput. Chem., 2015, 36, 222-234). In the present work, an improved automated search procedure is developed, which consists of iteratively running different ensembles of trajectories initialized at different minima. The iterative TSSCDS method is applied to the complex C3H4O system, obtaining a total of 66 different minima and 276 transition states. With the obtained transition states and paths, statistical RRKM calculations and Kinetic Monte Carlo simulations are carried out to study the fragmentation dynamics of propenal, which is the global minimum of the system. The kinetic simulations provide a (three-body dissociation)/(CO elimination) ratio of 1.49 for an excitation energy of 148 kcal mol(-1), which agrees well with the corresponding value obtained in the photolysis of propenal at 193 nm (1.1), suggesting that at least these two channels: three-body dissociation (to give H2 + CO + C2H2) and CO elimination occur on the ground electronic state.

...read moreread less

104 citations

Journal Article•

Open Problem: The landscape of the loss surfaces of multilayer networks

[...]

Anna Choromanska, Yann LeCun, Gérard Ben Arous

01 Jan 2015-Journal of Machine Learning Research

TL;DR: The question is whether it is possible to drop some of these assumptions to establish a stronger connection between both models.

...read moreread less

Abstract: Deep learning has enjoyed a resurgence of interest in the last few years for such applications as image and speech recognition, or natural language processing. The vast majority of practical applications of deep learning focus on supervised learning, where the supervised loss function is minimized using stochastic gradient descent. The properties of this highly non-convex loss function, such as its landscape and the behavior of critical points (maxima, minima, and saddle points), as well as the reason why largeand small-size networks achieve radically different practical performance, are however very poorly understood. It was only recently shown that new results in spin-glass theory potentially may provide an explanation for these problems by establishing a connection between the loss function of the neural networks and the Hamiltonian of the spherical spin-glass models. The connection between both models relies on a number of possibly unrealistic assumptions, yet the empirical evidence suggests that the connection may exist in real. The question we pose is whether it is possible to drop some of these assumptions to establish a stronger connection between both models.

...read moreread less

85 citations

Proceedings Article•DOI•

A Versatile Scene Model with Differentiable Visibility Applied to Generative Pose Estimation

[...]

Helge Rhodin¹, Nadia Robertini¹, Christian Richardt¹, Christian Richardt², Hans-Peter Seidel¹, Christian Theobalt¹ - Show less +2 more•Institutions (2)

Max Planck Society¹, Intel²

07 Dec 2015

TL;DR: A new scene representation is presented that enables an analytically differentiable closed-form formulation of surface visibility and results in a new image formation model that represents opaque objects by a translucent medium with a smooth Gaussian density distribution which turns visibility into a smooth phenomenon.

...read moreread less

Abstract: Generative reconstruction methods compute the 3D configuration (such as pose and/or geometry) of a shape by optimizing the overlap of the projected 3D shape model with images. Proper handling of occlusions is a big challenge, since the visibility function that indicates if a surface point is seen from a camera can often not be formulated in closed form, and is in general discrete and non-differentiable at occlusion boundaries. We present a new scene representation that enables an analytically differentiable closed-form formulation of surface visibility. In contrast to previous methods, this yields smooth, analytically differentiable, and efficient to optimize pose similarity energies with rigorous occlusion handling, fewer local minima, and experimentally verified improved convergence of numerical optimization. The underlying idea is a new image formation model that represents opaque objects by a translucent medium with a smooth Gaussian density distribution which turns visibility into a smooth phenomenon. We demonstrate the advantages of our versatile scene model in several generative pose estimation problems, namely marker-less multi-object pose estimation, marker-less human motion capture with few cameras, and image-based 3D geometry estimation.

...read moreread less

85 citations

Proceedings Article•

Open Problem: The landscape of the loss surfaces of multilayer networks

[...]

Anna Choromanska¹, Yann LeCun¹, Gérard Ben Arous¹•Institutions (1)

New York University¹

26 Jun 2015

TL;DR: In this article, it was shown that new results in spin-glass theory potentially may provide an explanation for these problems by establishing a connection between the loss function of the neural networks and the Hamiltonian of the spherical spinglass models.

...read moreread less

Journal Article•DOI•

Robust identification of continuous-time models with arbitrary time-delay from irregularly sampled data

[...]

Fengwei Chen¹, Fengwei Chen², Hugues Garnier², Hugues Garnier¹, Marion Gilson², Marion Gilson¹ - Show less +2 more•Institutions (2)

Centre national de la recherche scientifique¹, University of Lorraine²

01 Jan 2015-Journal of Process Control

TL;DR: In this paper, a separable nonlinear least-squares method is proposed to identify continuous-time systems with arbitrary time-delay from irregularly sampled input-output data, which combines in a bootstrap manner the iterative optimal instrumental variable method for transfer function model estimation with an adaptive gradient-based technique that searches for the optimal time delay.

...read moreread less

Proceedings Article•

Escaping the Local Minima via Simulated Annealing: Optimization of Approximately Convex Functions

[...]

Alexandre Belloni¹, Tengyuan Liang², Hariharan Narayanan³, Alexander Rakhlin²•Institutions (3)

Duke University¹, University of Pennsylvania², University of Washington³

26 Jun 2015

TL;DR: The problem of optimizing an approximately convex function over a bounded convex set in $\mathbb{R}^n$ using only function evaluations is reduced to sampling from an \emph{approximately} log-concave distribution using the Hit-and-Run method, which is shown to have the same $\mathcal{O}^*$ complexity as sampling from log- Concave distributions.

...read moreread less

Abstract: We consider the problem of optimizing an approximately convex function over a bounded convex set in R n using only function evaluations. The problem is reduced to sampling from an approximately log-concave distribution using the Hit-and-Run method, with query complexity of O ⁄ (n 4.5 ). In the context of zeroth order stochastic convex optimization, the proposed method produces an †minimizer after O ⁄ (n 7.5 † i2 ) noisy function evaluations by inducing a O (†/n)-approximately log concave distribution. We also consider the case when the “amount of non-convexity” decays towards the optimum of the function. Other applications of the random walk method include private computation of empirical risk minimizers, two-stage stochastic programming, and approximate dynamic programming for online learning.

...read moreread less

Journal Article•DOI•

Proximal point method for a special class of nonconvex functions on Hadamard manifolds

[...]

Glaydston de Carvalho Bento¹, Orizon P. Ferreira¹, Paulo Roberto de Oliveira²•Institutions (2)

Universidade Federal de Goiás¹, Federal University of Rio de Janeiro²

08 Jan 2015-Optimization

TL;DR: In this article, the proximal point method for finding minima of a special class of nonconvex functions on a Hadamard manifold is presented, and it is proved that each accumulation point of this sequence satisfies the necessary optimality conditions.

...read moreread less

Abstract: In this article, we present the proximal point method for finding minima of a special class of nonconvex function on a Hadamard manifold. The well definedness of the sequence generated by the proximal point method is established. Moreover, it is proved that each accumulation point of this sequence satisfies the necessary optimality conditions and, under additional assumptions, its convergence for a minima is obtained.

...read moreread less

Journal Article•DOI•

On Smooth 3D Frame Field Design

[...]

Nicolas Ray, Dmitry Sokolov

13 Jul 2015-arXiv: Graphics

TL;DR: In this paper, the 2D and 3D optimization problems are derived from the same formulation (based on representing frames by functions), and their energies share some similarities from an optimization point of view (smoothness, local minima, bounds of partial derivatives, etc.).

...read moreread less

Abstract: We analyze actual methods that generate smooth frame fields both in 2D and in 3D. We formalize the 2D problem by representing frames as functions (as it was done in 3D), and show that the derived optimization problem is the one that previous work obtain via " representation vectors ". We show (in 2D) why this non linear optimization problem is easier to solve than directly minimizing the rotation angle of the field, and observe that the 2D algorithm is able to find good fields. Now, the 2D and the 3D optimization problems are derived from the same formulation (based on representing frames by functions). Their energies share some similarities from an optimization point of view (smoothness , local minima, bounds of partial derivatives, etc.), so we applied the 2D resolution mechanism to the 3D problem. Our evaluation of all existing 3D methods suggests to initialize the field by this new algorithm, but possibly use another method for further smoothing.

...read moreread less

Journal Article•DOI•

Locating landmarks on high-dimensional free energy surfaces.

[...]

Ming Chen, Tang Qing Yu¹, Mark E. Tuckerman², Mark E. Tuckerman¹•Institutions (2)

New York University¹, East China Normal University²

17 Mar 2015-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: Techniques from multiscale modeling, stochastic optimization, and machine learning are used to devise a strategy for locating minima and saddle points on a high-dimensional free energy surface “on the fly” and without requiring prior knowledge of or an explicit form for the surface.

...read moreread less

Abstract: Coarse graining of complex systems possessing many degrees of freedom can often be a useful approach for analyzing and understanding key features of these systems in terms of just a few variables. The relevant energy landscape in a coarse-grained description is the free energy surface as a function of the coarse-grained variables, which, despite the dimensional reduction, can still be an object of high dimension. Consequently, navigating and exploring this high-dimensional free energy surface is a nontrivial task. In this paper, we use techniques from multiscale modeling, stochastic optimization, and machine learning to devise a strategy for locating minima and saddle points (termed “landmarks”) on a high-dimensional free energy surface “on the fly” and without requiring prior knowledge of or an explicit form for the surface. In addition, we propose a compact graph representation of the landmarks and connections between them, and we show that the graph nodes can be subsequently analyzed and clustered based on key attributes that elucidate important properties of the system. Finally, we show that knowledge of landmark locations allows for the efficient determination of their relative free energies via enhanced sampling techniques.

...read moreread less

Journal Article•DOI•

Grand solar minima and maxima deduced from 10 Be and 14 C: magnetic dynamo configuration and polarity reversal

[...]

Fadil Inceoglu¹, R. Simoniello², Mads Faurschou Knudsen¹, Christoffer Karoff¹, Jesper V. Olsen¹, Sylvaine Turck-Chièze², Bo Holm Jacobsen¹ - Show less +3 more•Institutions (2)

Aarhus University¹, Paris Diderot University²

01 May 2015-Astronomy and Astrophysics

TL;DR: In this paper, the authors investigated the nature of grand minima and maxima periods identified based on the criteria as well as the variance and significance of the Hale cycle during these kinds of events throughout the Holocene epoch.

...read moreread less

Abstract: Aims. This study aims to improve our understanding of the occurrence and origin of grand solar maxima and minima. Methods. We first investigate the statistics of peaks and dips simultaneously occurring in the solar modulation potentials reconstructed using the Greenland Ice Core Project (GRIP) 10 Be and IntCal13 14 C records for the overlapping time period spanning between 1650 AD to 6600 BC. Based on the distribution of these events, we propose a method to identify grand minima and maxima periods. By using waiting time distribution analysis, we investigate the nature of grand minima and maxima periods identified based on the criteria as well as the variance and significance of the Hale cycle during these kinds of events throughout the Holocene epoch. Results. Analysis of grand minima and maxima events occurring simultaneously in the solar modulation potentials, reconstructed based on the 14 C and the 10 Be records, shows that the majority of events characterized by periods of moderate activity levels tend to last less than 50 years: grand maxima periods do not last longer than 100 years, while grand minima can persist slightly longer. The power and the variance of the 22-year Hale cycle increases during grand maxima and decreases during grand minima, compared to periods characterized by moderate activity levels. Conclusions. We present the first reconstruction of the occurrence of grand solar maxima and minima during the Holocene based on simultaneous changes in records of past solar variability derived from tree-ring 14 C and ice-core 10 Be, respectively. This robust determination of the occurrence of grand solar minima and maxima periods will enable systematic investigations of the influence of grand solar minima and maxima episodes on Earth’s climate.

...read moreread less

Journal Article•DOI•

Second Order and Stability Analysis for Optimal Sparse Control of the FitzHugh--Nagumo Equation

[...]

Eduardo Casas, Christopher Ryll, Fredi Tröltzsch

30 Jul 2015-Siam Journal on Control and Optimization

TL;DR: Optimal sparse control problems are considered for the FitzHugh--Nagumo system including the so-called Schlogl model and a theory of second order sufficient optimality conditions is established for Tikhonov regularization parameter $ u>0$ and also for the case of 0$.

...read moreread less

Abstract: Optimal sparse control problems are considered for the FitzHugh--Nagumo system including the so-called Schlogl model. The nondifferentiable objective functional of tracking type includes a quadratic Tikhonov regularization term and the $L^1$-norm of the control that accounts for the sparsity. Though the objective functional is not differentiable, a theory of second order sufficient optimality conditions is established for Tikhonov regularization parameter $ u>0$ and also for the case $ u = 0$. In this context, also local minima are discussed that are strong in the sense of the calculus of variations. The second order conditions are used as the main assumption for proving the stability of locally optimal solutions with respect to $ u \to 0$ and with respect to perturbations of the desired state functions. The theory is confirmed by numerical examples that are resolved with high precision to confirm that the optimal solution obeys the system of necessary optimality conditions.

...read moreread less

Journal Article•DOI•

Novel fuzzy active contour model with kernel metric for image segmentation

[...]

Yue Wu¹, Wenping Ma¹, Maoguo Gong¹, Hao Li¹, Licheng Jiao¹ - Show less +1 more•Institutions (1)

Xidian University¹

01 Sep 2015

TL;DR: A novel region-based fuzzy active contour model with kernel metric and fuzzy logic for image segmentation which makes the updating of region centers more robust against the noise and outliers in an image.

...read moreread less

Abstract: Graphical abstractDisplay Omitted HighlightsWe propose a model which incorporates kernel metric and fuzzy logic for image segmentation.The updating of region prototypes is more robust against outliers and noise.The evolution of contour in our model is stable and accurate.The proposed model achieves good balance between accuracy and efficiency. In this paper, a novel region-based fuzzy active contour model with kernel metric is proposed for a robust and stable image segmentation. This model can detect the boundaries precisely and work well with images in the presence of noise, outliers and low contrast. It segments an image into two regions - the object and the background by the minimization of a predefined energy function. Due to the kernel metric incorporated in the energy and the fuzziness of the energy, the active contour evolves very stably without the reinitialization for the level set function during the evolution. Here the fuzziness provides the model with a strong ability to reject local minima and the kernel metric is employed to construct a nonlinear version of energy function based on a level set framework. This new fuzzy and nonlinear version of energy function makes the updating of region centers more robust against the noise and outliers in an image. Theoretical analysis and experimental results show that the proposed model achieves a much better balance between accuracy and efficiency compared with other active contour models.

...read moreread less

Proceedings Article•DOI•

Inference for Generalized Linear Models via alternating directions and Bethe Free Energy minimization

[...]

Sundeep Rangan¹, Alyson K. Fletcher², Philip Schniter³, Ulugbek S. Kamilov⁴•Institutions (4)

New York University¹, University of California, Santa Cruz², Ohio State University³, École Polytechnique Fédérale de Lausanne⁴

14 Jun 2015

TL;DR: This paper presents a convergent approach to the generalized AMP (GAMP) algorithm based on direct minimization of a large-system limit approximation of the Bethe Free Energy (LSL-BFE), and shows that for strictly convex, smooth penalties, ADMM-GAMP is guaranteed to converge to a local minima of the LSL-B FE.

...read moreread less

Abstract: Generalized Linear Models (GLMs), where a random vector x is observed through a noisy, possibly nonlinear, function of a linear transform z = Ax arise in a range of applications in nonlinear filtering and regression. Approximate Message Passing (AMP) methods, based on loopy belief propagation, are a promising class of approaches for approximate inference in these models. AMP methods are computationally simple, general, and admit precise analyses with testable conditions for optimality for large i.i.d. transforms A. However, the algorithms can easily diverge for general transforms. This paper presents a convergent approach to the generalized AMP (GAMP) algorithm based on direct minimization of a large-system limit approximation of the Bethe Free Energy (LSL-BFE). The proposed method uses a double-loop procedure, where the outer loop successively linearizes the LSL-BFE and the inner loop minimizes the linearized LSL-BFE using the Alternating Direction Method of Multipliers (ADMM). The proposed method, called ADMM-GAMP, is similar in structure to the original GAMP method, but with an additional least-squares minimization. It is shown that for strictly convex, smooth penalties, ADMM-GAMP is guaranteed to converge to a local minima of the LSL-BFE, thus providing a convergent alternative to GAMP that is stable under arbitrary transforms. Simulations are also presented that demonstrate the robustness of the method for non-convex penalties as well.

...read moreread less

Journal Article•DOI•

Full waveform inversion in the frequency domain using direct iterative T-matrix methods

[...]

Morten Jakobsen¹, Bjørn Ursin²•Institutions (2)

University of Bergen¹, Norwegian University of Science and Technology²

01 Jun 2015-Journal of Geophysics and Engineering

TL;DR: In this article, the authors present two direct iterative solutions to the nonlinear seismic waveform inversion problem that are based on volume integral equation methods for seismic forward modelling in the acoustic approximation.

...read moreread less

Abstract: We present two direct iterative solutions to the nonlinear seismic waveform inversion problem that are based on volume integral equation methods for seismic forward modelling in the acoustic approximation. The solutions are presented in the frequency domain, where accurate inversion results can often be obtained using a relatively low number of frequency components. Our inverse scattering approach effectively replaces an ill-posed nonlinear inverse problem with a series of linear ill-posed inverse problems, for which there already exist efficient (regularized) solution methods. Both these solutions update the wavefield within the scattering domain after each iteration. The main difference is that the background medium Green functions are kept fixed in the first solution, but updated after each iteration in the second solution. This means that our solutions are very similar to the Born iterative (BI) and the distorted Born iterative (DBI) methods that are commonly used in acoustic and electromagnetic inverse scattering. However, we have eliminated the need to perform a full forward simulation (or to invert a huge matrix) at each iteration via the use of an iterative T-matrix method for fixed background media for the BI method and a variational T-matrix method for dynamic background media for the DBI method. The T-matrix (variation) is linearly related with the seismic wavefield data (residuals), but related with the unknown scattering potential model parameter (updates) in a non-linear manner, which is independent of the source-receiver configuration. This mathematical structure, which allows one to peel off the effects of the source-receiver configuration, is very attractive when dealing with multiple (simultaneous) sources, and is also compatible with the (future) use of renormalization methods for dealing with local minima problems. To illustrate the performance and potential of the two direct iterative methods for FWI, we performed a series of numerical experiments on synthetic seismic waveform data associated with a simple 2D model and the more complicated Marmousi model. The results of these numerical experiments suggest that the use of a fixed (e.g. smooth and ray-tracing friendly) background medium may be adequate for some applications with moderately large velocity contrasts, but the solution based on a dynamic (non-smooth and constantly updated) background medium will normally provide superiour inversion results; also in the case of low signal-to-noise ratios.

...read moreread less

Journal Article•DOI•

Multiple nonspherical structures from the extrema of Szekeres scalars

[...]

Roberto A. Sussman¹, I. Delgado Gaspar²•Institutions (2)

National Autonomous University of Mexico¹, Universidad Autónoma del Estado de Morelos²

13 Aug 2015-Physical Review D

TL;DR: In this article, the authors examined the spatial extrema (local maxima, minima and saddle points) of the covariant scalars (density, Hubble expansion, spatial curvature and eigenvalues of the shear and electric Weyl tensors) of quasispherical Szekeres dust models.

...read moreread less

Abstract: We examine the spatial extrema (local maxima, minima and saddle points) of the covariant scalars (density, Hubble expansion, spatial curvature and eigenvalues of the shear and electric Weyl tensors) of the quasispherical Szekeres dust models. Sufficient conditions are obtained for the existence of distributions of multiple extrema in spatial comoving locations that can be prescribed through initial conditions. These distributions evolve without shell crossing singularities at least for ever expanding models (with or without cosmological constant) in the full evolution range where the models are valid. By considering the local maxima and minima of the density, our results allow for setting up elaborated networks of ``pancake'' shaped evolving cold dark matter overdensities and density voids whose spatial distribution and amplitudes can be controlled from initial data compatible with standard early Universe initial conditions. We believe that these results have an enormous range of potential application by providing a fully relativistic nonperturbative coarse grained modeling of cosmic structure at all scales.

...read moreread less

Journal Article•DOI•

PARAFAC models of fluorescence data with scattering: A comparative study

[...]

Saioa Elcoroaristizabal¹, Rasmus Bro², José Antonio García¹, Lucio Alonso¹•Institutions (2)

University of the Basque Country¹, University of Copenhagen²

15 Mar 2015-Chemometrics and Intelligent Laboratory Systems

TL;DR: In this paper, the two most common methods of handling scatter are evaluated: replacing the scattering area with missing elements or with interpolated values, both in terms of stability of the models and quality of predictions.

...read moreread less

Posted Content•

Full and fast calibration of the Heston stochastic volatility model

[...]

Yiran Cui¹, Sebastian del Baño Rollin², Guido Germano¹, Guido Germano³•Institutions (3)

University College London¹, Queen Mary University of London², London School of Economics and Political Science³

27 Nov 2015-arXiv: Computational Finance

TL;DR: In this article, the authors present an algorithm for a complete and efficient calibration of the Heston stochastic volatility model, which is expressed as a nonlinear least squares problem and used to avoid discontinuities caused by branch switchings.

...read moreread less

Abstract: This paper presents an algorithm for a complete and efficient calibration of the Heston stochastic volatility model. We express the calibration as a nonlinear least squares problem. We exploit a suitable representation of the Heston characteristic function and modify it to avoid discontinuities caused by branch switchings of complex functions. Using this representation, we obtain the analytical gradient of the price of a vanilla option with respect to the model parameters, which is the key element of all variants of the objective function. The interdependency between the components of the gradient enables an efficient implementation which is around ten times faster than a numerical gradient. We choose the Levenberg-Marquardt method to calibrate the model and do not observe multiple local minima reported in previous research. Two-dimensional sections show that the objective function is shaped as a narrow valley with a flat bottom. Our method is the fastest calibration of the Heston model developed so far and meets the speed requirement of practical trading.

...read moreread less

Book Chapter•DOI•

Optimized Object Packings Using Quasi-Phi-Functions

[...]

Yuriy Stoyan¹, Tatiana E. Romanova¹, Alexander Pankratov¹, A. M. Chugay¹•Institutions (1)

National Academy of Sciences of Ukraine¹

01 Jan 2015

TL;DR: This chapter defines new functions, called quasi-phi-functions, that are used for analytic description of relations of geometric objects placed in a container taking into account their continuous rotations, translations, and distance constraints.

...read moreread less

Abstract: In this chapter we further develop the main tool of our studies,phi-functions. We define new functions, called quasi-phi-functions, that we use for analytic description of relations of geometric objects placed in a container taking into account their continuous rotations, translations, and distance constraints. The new functions are substantially simpler than phi-functions for some types of objects. They also are simple enough for some types of objects for which phi-functions could not be constructed. In particular, we derive quasi-phi-functions for certain 2D&3D-objects. We formulate a basic optimal packing problem and introduce its exact mathematical model in the form of a nonlinear continuous programming problem, using our quasi-phi-functions. We propose a general solution strategy, involving: a construction of feasible starting points, a generation of nonlinear subproblems of a smaller dimension and decreased number of inequalities; a search for local extrema of our problem using subproblems. To show the advantages of our quasi-phi-functions we apply them to two packing problems, which have a wide spectrum of industrial applications: packing of a given collection of ellipses into a rectangular container of minimal area taking into account distance constraints; packing of a given collection of 3D-objects, including cuboids, spheres, spherocylinders and spherocones, into a cuboid container of minimal height. Our efficient optimization algorithms allow us to get local optimal object packings and reduce considerably computational cost. We applied our algorithms to several inspiring instances: our new benchmark instances and known test cases.

...read moreread less

Proceedings Article•DOI•

Dynamic Multi-Heuristic A*

[...]

Fahad Islam¹, Venkatraman Narayanan¹, Maxim Likhachev¹•Institutions (1)

Carnegie Mellon University¹

26 May 2015

TL;DR: This work proposes a method for dynamically generating heuristics, in addition to the original heuristic(s) used, to guide the search out of local minima, and provides guarantees on completeness and bounds on suboptimality of the solution found.

...read moreread less

Abstract: Many motion planning problems in robotics are high dimensional planning problems. While sampling-based motion planning algorithms handle the high dimensionality very well, the solution qualities are often hard to control due to the inherent randomization. In addition, they suffer severely when the configuration space has several ‘narrow passages’. Search-based planners on the other hand typically provide good solution qualities and are not affected by narrow passages. However, in the absence of a good heuristic or when there are deep local minima in the heuristic, they suffer from the curse of dimensionality. In this work, our primary contribution is a method for dynamically generating heuristics, in addition to the original heuristic(s) used, to guide the search out of local minima. With the ability to escape local minima easily, the effect of dimensionality becomes less pronounced. On the theoretical side, we provide guarantees on completeness and bounds on suboptimality of the solution found. We compare our proposed method with the recently published Multi-Heuristic A* search, and the popular RRT-Connect in a full-body mobile manipulation domain for the PR2 robot, and show its benefits over these approaches.

...read moreread less

Journal Article•DOI•

Evaluation of Docking Target Functions by the Comprehensive Investigation of Protein-Ligand Energy Minima.

[...]

I. V. Oferkin, E. V. Katkova¹, A. V. Sulimov¹, Danil C. Kutov¹, Sergey I. Sobolev¹, Vladimir V. Voevodin¹, Vladimir B. Sulimov¹ - Show less +3 more•Institutions (1)

Moscow State University¹

26 Nov 2015-Advances in Bioinformatics

TL;DR: It was demonstrated that the docking target function on the base of the MMFF94 force field in vacuo can be used for discovery of native or near native ligand positions by finding the low-energy local minima spectrum of the target function.

...read moreread less

Abstract: The adequate choice of the docking target function impacts the accuracy of the ligand positioning as well as the accuracy of the protein-ligand binding energy calculation. To evaluate a docking target function we compared positions of its minima with the experimentally known pose of the ligand in the protein active site. We evaluated five docking target functions based on either the MMFF94 force field or the PM7 quantum-chemical method with or without implicit solvent models: PCM, COSMO, and SGB. Each function was tested on the same set of 16 protein-ligand complexes. For exhaustive low-energy minima search the novel MPI parallelized docking program FLM and large supercomputer resources were used. Protein-ligand binding energies calculated using low-energy minima were compared with experimental values. It was demonstrated that the docking target function on the base of the MMFF94 force field in vacuo can be used for discovery of native or near native ligand positions by finding the low-energy local minima spectrum of the target function. The importance of solute-solvent interaction for the correct ligand positioning is demonstrated. It is shown that docking accuracy can be improved by replacement of the MMFF94 force field by the new semiempirical quantum-chemical PM7 method.

...read moreread less

Journal Article•DOI•

Economic Dispatch With Non-Smooth Objectives—Part I: Local Minimum Analysis

[...]

Junpeng Zhan¹, Qinghua Wu², Chuangxin Guo¹, X. X. Zhou³•Institutions (3)

Zhejiang University¹, South China University of Technology², State Grid Corporation of China³

01 Mar 2015-IEEE Transactions on Power Systems

TL;DR: In this paper, a non-convex, non-differentiable, and multi-modal optimization model with many local minima is presented as a more accurate model of real problem compared to the conventional economic dispatch model.

...read moreread less

Abstract: Economic dispatch with valve-point effect (EDVPE) considered is presented as a more accurate model of the real problem compared to the conventional economic dispatch model. It is basically a non-convex, non-differentiable, and multi-modal optimization model with many local minima. Part I of the paper focuses on the local minimum analysis of the EDVPE. The analysis indicates that a local minimum consists of the singular points, the small convex regions, and the output of a slack unit that is dispatched to balance the load demand. Two types of local minima are identified and the second type could be ignored. To verify the rationality of the analyses, a traverse search has been performed to solve the EDVPE with and without considering the transmission loss on different test systems. All the simulation results support the analysis given in the paper. To effectively solve the EDVPE on a large-scale power system, based on the analysis presented in this paper, a new method, dimensional steepest decline method, is proposed in Part II of the paper.

...read moreread less

Journal Article•DOI•

Feedback stabilization of a nonholonomic system with potential fields: application to a two-wheeled mobile robot among obstacles

[...]

Takateru Urakubo¹•Institutions (1)

Kobe University¹

18 Apr 2015-Nonlinear Dynamics

TL;DR: In this article, a feedback controller for a nonholonomic system with three states and two inputs is derived using an artificial potential function that has no local minima, and the stability of equilibria of the system is analyzed.

...read moreread less

Abstract: In this paper, a feedback controller for a nonholonomic system with three states and two inputs is derived using an artificial potential function that has no local minima, and the stability of equilibria of the system is analyzed. Although the system with the controller has an infinite number of equilibria due to the nonholonomic constraint, those equilibria except the critical points of the potential function are unstable because of a skew-symmetric component of the controller. When the potential function has critical points of saddle type, the saddles may be stable equilibria in addition to the stable equilibrium at the minimum of the function. The controller is applied to a two-wheeled mobile robot among obstacles and modified by using a time-varying potential function in order to avoid convergence to the saddles. As a result, with the controller, the mobile robot converges to a desired position and orientation without collision with obstacles.

...read moreread less

Posted Content•

Escaping From Saddle Points --- Online Stochastic Gradient for Tensor Decomposition

[...]

Rong Ge¹, Furong Huang², Chi Jin³, Yang Yuan⁴•Institutions (4)

Microsoft¹, University of California, Irvine², University of California, Berkeley³, Cornell University⁴

06 Mar 2015-arXiv: Learning

TL;DR: In this article, the authors show that stochastic gradient descent converges to a local minimum in a polynomial number of iterations for orthogonal tensor decomposition with strict saddle property.

...read moreread less

Abstract: We analyze stochastic gradient descent for optimizing non-convex functions. In many cases for non-convex functions the goal is to find a reasonable local minimum, and the main concern is that gradient updates are trapped in saddle points. In this paper we identify strict saddle property for non-convex problem that allows for efficient optimization. Using this property we show that stochastic gradient descent converges to a local minimum in a polynomial number of iterations. To the best of our knowledge this is the first work that gives global convergence guarantees for stochastic gradient descent on non-convex functions with exponentially many local minima and saddle points. Our analysis can be applied to orthogonal tensor decomposition, which is widely used in learning a rich class of latent variable models. We propose a new optimization formulation for the tensor decomposition problem that has strict saddle property. As a result we get the first online algorithm for orthogonal tensor decomposition with global convergence guarantee.

...read moreread less

Collapse