
Showing papers on "Function (mathematics)" published in 2015


Proceedings Article
21 Feb 2015
TL;DR: In this paper, the authors study the connection between the loss function of a simple model of the fully-connected feed-forward neural network and the Hamiltonian of the spherical spin-glass model under the assumptions of variable independence, redundancy in network parametrization, and uniformity.
Abstract: We study the connection between the highly non-convex loss function of a simple model of the fully-connected feed-forward neural network and the Hamiltonian of the spherical spin-glass model under the assumptions of: i) variable independence, ii) redundancy in network parametrization, and iii) uniformity. These assumptions enable us to explain the complexity of the fully decoupled neural network through the prism of the results from random matrix theory. We show that for large-size decoupled networks the lowest critical values of the random loss function form a layered structure and they are located in a well-defined band lower-bounded by the global minimum. The number of local minima outside that band diminishes exponentially with the size of the network. We empirically verify that the mathematical model exhibits similar behavior to the computer simulations, despite the presence of high dependencies in real networks. We conjecture that both simulated annealing and SGD converge to the band of low critical points, and that all critical points found there are local minima of high quality measured by the test error. This emphasizes a major difference between large- and small-size networks, where for the latter poor-quality local minima have a nonzero probability of being recovered. Finally, we prove that recovering the global minimum becomes harder as the network size increases and that it is in practice irrelevant, as the global minimum often leads to overfitting.

970 citations
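As a rough illustration of the spherical spin-glass model invoked above, the following minimal numpy sketch samples the p = 3 spherical Hamiltonian at random points on the sphere of radius sqrt(n); the couplings J, the size n, and the normalization follow the standard p-spin convention and are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50                                   # p = 3 spherical spin-glass model
J = rng.standard_normal((n, n, n))       # random Gaussian couplings

def hamiltonian(w):
    # H(w) = n^{-(p-1)/2} * sum_{ijk} J_ijk w_i w_j w_k, with |w|^2 = n
    return np.einsum("ijk,i,j,k->", J, w, w, w) / n

ws = rng.standard_normal((500, n))
ws *= np.sqrt(n) / np.linalg.norm(ws, axis=1, keepdims=True)  # project onto sphere
H = np.array([hamiltonian(w) for w in ws])
print(H.min(), H.mean(), H.max())        # values at random points, far above the low band
```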


Journal ArticleDOI
TL;DR: A novel decentralized exact first-order algorithm (abbreviated as EXTRA) to solve the consensus optimization problem and uses a fixed, large step size, which can be determined independently of the network size or topology.
Abstract: Recently, there has been growing interest in solving consensus optimization problems in a multiagent network. In this paper, we develop a decentralized algorithm for the consensus optimization problem $\mathrm{minimize}_{x\in\mathbb{R}^p}~\bar{f}(x)=\frac{1}{n}\sum_{i=1}^n f_i(x),$ which is defined over a connected network of $n$ agents, where each function $f_i$ is held privately by agent $i$ and encodes the agent's data and objective. All the agents shall collaboratively find the minimizer while each agent can only communicate with its neighbors. Such a computation scheme avoids a data fusion center or long-distance communication and offers better load balance to the network. This paper proposes a novel decentralized exact first-order algorithm (abbreviated as EXTRA) to solve the consensus optimization problem. “Exact” means that it can converge to the exact solution. EXTRA uses a fixed, large step size, which can be determined independently of the network size or topology. The local variable of every a...

906 citations
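A minimal numpy sketch of the EXTRA iteration on a toy least-squares consensus problem; the ring topology, quadratic local objectives, and step size are illustrative assumptions, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 5, 3
A = rng.standard_normal((n, 4, p))          # agent i holds f_i(x) = 0.5||A_i x - b_i||^2
b = rng.standard_normal((n, 4))

def grad(X):
    # row i is grad f_i evaluated at agent i's local copy X[i]
    return np.stack([A[i].T @ (A[i] @ X[i] - b[i]) for i in range(n)])

W = np.zeros((n, n))                        # symmetric doubly stochastic mixing on a ring
for i in range(n):
    W[i, i] = 0.5
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.25
W_tilde = (np.eye(n) + W) / 2

alpha = 0.02                                # fixed step size
X_prev = np.zeros((n, p))
X = W @ X_prev - alpha * grad(X_prev)       # first EXTRA step
for _ in range(3000):
    # x^{k+2} = (I + W) x^{k+1} - W~ x^k - alpha (grad^{k+1} - grad^k)
    X, X_prev = (np.eye(n) + W) @ X - W_tilde @ X_prev - alpha * (grad(X) - grad(X_prev)), X

print(np.ptp(X, axis=0))                    # spread across agents: near zero at consensus
```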


Proceedings Article
06 Jul 2015
TL;DR: An efficient technique for supervised learning of universal value function approximators (UVFAs) V(s, g; θ) that generalise not just over states s but also over goals g is developed, and it is demonstrated that a UVFA can successfully generalise to previously unseen goals.
Abstract: Value functions are a core component of reinforcement learning systems. The main idea is to construct a single function approximator V(s; θ) that estimates the long-term reward from any state s, using parameters θ. In this paper we introduce universal value function approximators (UVFAs) V(s, g; θ) that generalise not just over states s but also over goals g. We develop an efficient technique for supervised learning of UVFAs, by factoring observed values into separate embedding vectors for state and goal, and then learning a mapping from s and g to these factored embedding vectors. We show how this technique may be incorporated into a reinforcement learning algorithm that updates the UVFA solely from observed rewards. Finally, we demonstrate that a UVFA can successfully generalise to previously unseen goals.

795 citations
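A minimal sketch of the factorization stage described in the abstract: a table of observed values V(s, g) is factored into state and goal embeddings so that V(s, g) ≈ φ(s)·ψ(g). The table, rank, and learning rate are illustrative assumptions; the paper additionally learns mappings from raw states and goals to these embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)
S, G, d = 40, 10, 4
V = rng.random((S, 1)) @ rng.random((1, G)) + 0.05 * rng.random((S, G))  # "observed" values

phi = 0.1 * rng.standard_normal((S, d))     # state embeddings
psi = 0.1 * rng.standard_normal((G, d))     # goal embeddings
lr = 0.05
for _ in range(20000):                      # SGD on (phi_s . psi_g - V[s, g])^2
    s, g = rng.integers(S), rng.integers(G)
    err = phi[s] @ psi[g] - V[s, g]
    phi[s], psi[g] = phi[s] - lr * err * psi[g], psi[g] - lr * err * phi[s]

print(np.abs(phi @ psi.T - V).max())        # the factored form fits the whole table
```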


Proceedings Article
23 Jul 2015
TL;DR: Deep Recurrent Q-Network (DRQN) as discussed by the authors replaces the first post-convolutional fully-connected layer with a recurrent LSTM, which integrates information through time and replicates DQN's performance on standard Atari games and partially observed equivalents featuring flickering game screens.
Abstract: Deep Reinforcement Learning has yielded proficient controllers for complex tasks. However, these controllers have limited memory and rely on being able to perceive the complete game screen at each decision point. To address these shortcomings, this article investigates the effects of adding recurrency to a Deep Q-Network (DQN) by replacing the first post-convolutional fully-connected layer with a recurrent LSTM. The resulting Deep Recurrent Q-Network (DRQN), although capable of seeing only a single frame at each timestep, successfully integrates information through time and replicates DQN's performance on standard Atari games and partially observed equivalents featuring flickering game screens. Additionally, when trained with partial observations and evaluated with incrementally more complete observations, DRQN's performance scales as a function of observability. Conversely, when trained with full observations and evaluated with partial observations, DRQN's performance degrades less than DQN's. Thus, given the same length of history, recurrency is a viable alternative to stacking a history of frames in the DQN's input layer and while recurrency confers no systematic advantage when learning to play the game, the recurrent net can better adapt at evaluation time if the quality of observations changes.

695 citations
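A hedged PyTorch sketch of the architectural change the abstract describes, with the first post-convolutional fully-connected layer replaced by an LSTM; the layer sizes follow the common DQN convention for 84×84 inputs and are assumptions, not necessarily the paper's exact configuration.

```python
import torch
import torch.nn as nn

class DRQN(nn.Module):
    """Sketch of the DRQN idea: conv stack -> LSTM -> Q-values per timestep."""
    def __init__(self, n_actions: int):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, 8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, 3, stride=1), nn.ReLU(),
        )
        self.lstm = nn.LSTM(64 * 7 * 7, 512, batch_first=True)  # replaces first FC layer
        self.q_head = nn.Linear(512, n_actions)

    def forward(self, frames, hidden=None):
        # frames: (batch, time, 1, 84, 84) -- a single frame per timestep
        b, t = frames.shape[:2]
        z = self.conv(frames.reshape(b * t, 1, 84, 84)).reshape(b, t, -1)
        z, hidden = self.lstm(z, hidden)     # integrates information through time
        return self.q_head(z), hidden

q, h = DRQN(4)(torch.zeros(2, 8, 1, 84, 84))
print(q.shape)  # torch.Size([2, 8, 4])
```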


Journal ArticleDOI
TL;DR: TMB as discussed by the authors is an open source R package that enables quick implementation of complex nonlinear random effect (latent variable) models in a manner similar to the established AD Model Builder package (ADMB, this http URL).
Abstract: TMB is an open source R package that enables quick implementation of complex nonlinear random effect (latent variable) models in a manner similar to the established AD Model Builder package (ADMB, this http URL). In addition, it offers easy access to parallel computations. The user defines the joint likelihood for the data and the random effects as a C++ template function, while all the other operations are done in R; e.g., reading in the data. The package evaluates and maximizes the Laplace approximation of the marginal likelihood where the random effects are automatically integrated out. This approximation, and its derivatives, are obtained using automatic differentiation (up to order three) of the joint likelihood. The computations are designed to be fast for problems with many random effects (~10^6) and parameters (~10^3). Computation times using ADMB and TMB are compared on a suite of examples ranging from simple models to large spatial models where the random effects are a Gaussian random field. Speedups ranging from 1.5 to about 100 are obtained with increasing gains for large problems. The package and examples are available at this http URL.

506 citations
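For intuition, here is a minimal Python sketch of the Laplace approximation that TMB automates (TMB itself does this with C++ templates and automatic differentiation). The toy model, y_i ~ N(mu + u_i, 1) with random effects u_i ~ N(0, sd_u²), and all names are illustrative assumptions, not TMB's API.

```python
import numpy as np
from scipy.optimize import minimize

y = np.array([2.1, -1.4, 3.3, 0.2, -2.0, 1.6])

def negjoint(u, mu, sd_u):
    # -log p(y, u) up to constants: y_i ~ N(mu + u_i, 1), u_i ~ N(0, sd_u^2)
    return (0.5 * np.sum((y - mu - u) ** 2)
            + 0.5 * np.sum(u ** 2) / sd_u ** 2 + y.size * np.log(sd_u))

def neg_log_marginal(theta):
    mu, log_sd = theta
    sd_u = np.exp(log_sd)
    inner = minimize(negjoint, np.zeros_like(y), args=(mu, sd_u))  # mode u-hat
    h = (1.0 + 1.0 / sd_u ** 2) * np.ones_like(y)  # Hessian diagonal (analytic in this
                                                   # toy; TMB obtains it by autodiff)
    return inner.fun + 0.5 * np.sum(np.log(h)) - 0.5 * y.size * np.log(2 * np.pi)

print(minimize(neg_log_marginal, np.array([0.0, 0.0])).x)  # (mu, log sd_u) estimates
```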


Journal ArticleDOI
TL;DR: This paper examines two types of splitting methods for solving this nonconvex optimization problem, the alternating direction method of multipliers and the proximal gradient algorithm, and gives simple sufficient conditions to guarantee boundedness of the sequence generated.
Abstract: We consider the problem of minimizing the sum of a smooth function $h$ with a bounded Hessian and a nonsmooth function. We assume that the latter function is a composition of a proper closed function $P$ and a surjective linear map $\mathcal{M}$, with the proximal mappings of $\tau P$, $\tau > 0$, simple to compute. This problem is nonconvex in general and encompasses many important applications in engineering and machine learning. In this paper, we examine two types of splitting methods for solving this nonconvex optimization problem: the alternating direction method of multipliers and the proximal gradient algorithm. For the direct adaptation of the alternating direction method of multipliers, we show that if the penalty parameter is chosen sufficiently large and the sequence generated has a cluster point, then it gives a stationary point of the nonconvex problem. We also establish convergence of the whole sequence under an additional assumption that the functions $h$ and $P$ are semialgebraic. Further...

337 citations
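A minimal sketch of the proximal gradient iteration analyzed in the paper, x ← prox_{τP}(x − τ∇h(x)), instantiated with the nonconvex choice P = λ‖·‖₀ and M = identity; the problem data and λ are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((60, 100))
x_true = np.zeros(100); x_true[:5] = 5.0
b = A @ x_true                              # h(x) = 0.5||Ax - b||^2

lam = 0.5
L = np.linalg.norm(A, 2) ** 2               # Lipschitz constant of grad h
tau = 1.0 / L
x = np.zeros(100)
for _ in range(1000):
    z = x - tau * (A.T @ (A @ x - b))       # gradient step on h
    x = np.where(np.abs(z) > np.sqrt(2 * tau * lam), z, 0.0)  # prox of tau*lam*||.||_0

print(np.nonzero(x)[0])                     # typically concentrates on {0,...,4}
print(np.linalg.norm(x - x_true) / np.linalg.norm(x_true))
```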


Proceedings ArticleDOI
17 Oct 2015
TL;DR: This paper investigates the partial and exact recovery of communities in the general SBM (in the constant and logarithmic degree regimes), and uses the generality of the results to tackle overlapping communities.
Abstract: New phase transition phenomena have recently been discovered for the stochastic block model, for the special case of two non-overlapping symmetric communities. This gives rise in particular to new algorithmic challenges driven by the thresholds. This paper investigates whether a general phenomenon takes place for multiple communities, without imposing symmetry. In the general stochastic block model SBM(n, p, W), n vertices are split into k communities of relative sizes {p_i}, i ∈ [k], and vertices in communities i and j connect independently with probability W_ij, for i, j ∈ [k]. This paper investigates the partial and exact recovery of communities in the general SBM (in the constant and logarithmic degree regimes), and uses the generality of the results to tackle overlapping communities. The contributions of the paper are: (i) an explicit characterization of the recovery threshold in the general SBM in terms of a new f-divergence function D+, which generalizes the Hellinger and Chernoff divergences, and which provides an operational meaning to a divergence function analog to the KL-divergence in the channel coding theorem, (ii) the development of an algorithm that recovers the communities all the way down to the optimal threshold and runs in quasi-linear time, showing that exact recovery has no information-theoretic-to-computational gap for multiple communities, (iii) the development of an efficient algorithm that detects communities in the constant degree regime with an explicit accuracy bound that can be made arbitrarily close to 1 when a prescribed signal-to-noise ratio (defined in terms of the spectrum of diag(p)W) tends to infinity.

333 citations
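A minimal numpy sketch of the model SBM(n, p, W) defined above, together with a basic spectral method for the symmetric two-community case (the paper's algorithms are different and reach the optimal threshold); all parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, W = 400, np.array([0.5, 0.5]), np.array([[0.10, 0.02], [0.02, 0.10]])

labels = rng.choice(len(p), size=n, p=p)        # community of each vertex
P = W[labels[:, None], labels[None, :]]         # edge probabilities
A = (rng.random((n, n)) < P).astype(float)
A = np.triu(A, 1); A = A + A.T                  # simple undirected graph

vals, vecs = np.linalg.eigh(A)
guess = (vecs[:, -2] > 0).astype(int)           # sign of second-largest eigenvector
acc = max(np.mean(guess == labels), np.mean(1 - guess == labels))
print(f"agreement: {acc:.2f}")
```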


Posted Content
TL;DR: In this paper, the authors propose a model that is based on decoding an image into a set of people detections and uses a recurrent LSTM layer for sequence generation and train their model end-to-end with a new loss function that operates on sets of detections.
Abstract: Current people detectors operate either by scanning an image in a sliding window fashion or by classifying a discrete set of proposals. We propose a model that is based on decoding an image into a set of people detections. Our system takes an image as input and directly outputs a set of distinct detection hypotheses. Because we generate predictions jointly, common post-processing steps such as non-maximum suppression are unnecessary. We use a recurrent LSTM layer for sequence generation and train our model end-to-end with a new loss function that operates on sets of detections. We demonstrate the effectiveness of our approach on the challenging task of detecting people in crowded scenes.

250 citations
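A minimal sketch of a set-level detection loss of the kind described, using Hungarian (minimum-cost bipartite) matching so the loss is invariant to the ordering of predictions; the L1 box cost and the data are illustrative assumptions, not the paper's exact loss.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def set_loss(pred_boxes, gt_boxes):
    # cost[i, j] = L1 distance between prediction i and ground-truth box j
    cost = np.abs(pred_boxes[:, None, :] - gt_boxes[None, :, :]).sum(-1)
    rows, cols = linear_sum_assignment(cost)    # optimal one-to-one matching
    return cost[rows, cols].sum()               # loss summed over matched pairs

pred = np.array([[0.10, 0.10, 0.30, 0.30], [0.60, 0.60, 0.90, 0.90]])
gt   = np.array([[0.62, 0.58, 0.90, 0.91], [0.10, 0.12, 0.28, 0.30]])
print(set_loss(pred, gt))   # unchanged if the prediction rows are permuted
```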


Journal ArticleDOI
TL;DR: These findings provide converging multimodal evidence for a model in which decision threshold in reward-based tasks is adjusted as a function of communication from pre-SMA to STN when choices differ subtly in reward values, allowing more time to choose the statistically more rewarding option.
Abstract: What are the neural dynamics of choice processes during reinforcement learning? Two largely separate literatures have examined dynamics of reinforcement learning (RL) as a function of experience but assuming a static choice process, or conversely, the dynamics of choice processes in decision making but based on static decision values. Here we show that human choice processes during RL are well described by a drift diffusion model (DDM) of decision making in which the learned trial-by-trial reward values are sequentially sampled, with a choice made when the value signal crosses a decision threshold. Moreover, simultaneous fMRI and EEG recordings revealed that this decision threshold is not fixed across trials but varies as a function of activity in the subthalamic nucleus (STN) and is further modulated by trial-by-trial measures of decision conflict and activity in the dorsomedial frontal cortex (pre-SMA BOLD and mediofrontal theta in EEG). These findings provide converging multimodal evidence for a model in which decision threshold in reward-based tasks is adjusted as a function of communication from pre-SMA to STN when choices differ subtly in reward values, allowing more time to choose the statistically more rewarding option.

212 citations
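A minimal simulation sketch of the model described: the drift of a diffusion-to-bound process is set by the learned value difference, and raising the decision threshold trades speed for accuracy; all parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def ddm_trial(q_left, q_right, threshold, dt=0.001, noise=1.0):
    drift, x, t = q_left - q_right, 0.0, 0.0
    while abs(x) < threshold:               # accumulate evidence until a bound is hit
        x += drift * dt + noise * np.sqrt(dt) * rng.standard_normal()
        t += dt
    return ("left" if x > 0 else "right"), t

# similar values = high conflict; a raised threshold yields slower, more
# accurate choices -- the pattern attributed to pre-SMA -> STN communication
print(ddm_trial(0.55, 0.45, threshold=1.0))
print(ddm_trial(0.55, 0.45, threshold=2.0))
```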


Proceedings ArticleDOI
01 Jul 2015
TL;DR: The end result is the generation of stable walking satisfying physical realizability constraints for a model of the bipedal robot AMBER2.
Abstract: This paper presents a methodology for the development of control barrier functions (CBFs) through a backstepping inspired approach. Given a set defined as the superlevel set of a function, h, the main result is a constructive means for generating control barrier functions that guarantee forward invariance of this set. In particular, if the function defining the set has relative degree n, an iterative methodology utilizing higher order derivatives of h provably results in a control barrier function that can be explicitly derived. To demonstrate these formal results, they are applied in the context of bipedal robotic walking. Physical constraints, e.g., joint limits, are represented by control barrier functions and unified with control objectives expressed through control Lyapunov functions (CLFs) via quadratic program (QP) based controllers. The end result is the generation of stable walking satisfying physical realizability constraints for a model of the bipedal robot AMBER2.

200 citations
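A minimal sketch of the higher-order construction for a relative-degree-2 constraint on a double integrator x'' = u: with h = x_max − x, define h2 = h' + k1*h and enforce h2' + k2*h2 ≥ 0, which is affine in u, so the QP has a closed-form clamp for a scalar input. The gains and dynamics are illustrative, not the AMBER2 setup.

```python
x_max, k1, k2 = 1.0, 2.0, 2.0
x, v, dt = 0.0, 0.0, 0.01

def safe_u(u_des, x, v):
    h, dh = x_max - x, -v                   # h has relative degree 2 under x'' = u
    # h2' + k2*h2 >= 0  <=>  -u + k1*dh + k2*(dh + k1*h) >= 0
    u_max = k1 * dh + k2 * (dh + k1 * h)
    return min(u_des, u_max)                # closed-form QP: project u_des onto the set

for _ in range(1000):                       # nominal controller pushes toward the wall
    u = safe_u(1.0, x, v)
    v += u * dt
    x += v * dt
print(x)                                    # stays below x_max
```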


Journal ArticleDOI
TL;DR: In this article, an optimal parabolic contour is selected on the basis of the distance and the strength of the singularities of the Laplace transform, with the aim of minimizing the computational effort and reducing the propagation of errors.
Abstract: The Mittag-Leffler (ML) function plays a fundamental role in fractional calculus but very few methods are available for its numerical evaluation. In this work we present a method for the efficient computation of the ML function based on the numerical inversion of its Laplace transform (LT): an optimal parabolic contour is selected on the basis of the distance and the strength of the singularities of the LT, with the aim of minimizing the computational effort and reducing the propagation of errors. Numerical experiments are presented to show accuracy and efficiency of the proposed approach. The application to the three parameter ML (also known as Prabhakar) function is also presented.
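For flavor, a minimal sketch of the Laplace-transform inversion idea on a parabolic Bromwich contour, applied to E_α(−t^α), whose transform is s^(α−1)/(s^α + 1); the contour and quadrature parameters below are crude heuristics, not the optimized choices derived in the paper.

```python
import numpy as np

def ml_neg(alpha, t, N=200):
    # invert F(s) = s^(alpha-1)/(s^alpha + 1), the transform of E_alpha(-t^alpha),
    # by the trapezoid rule on the parabola s(u) = mu (1 + iu)^2 (heuristic mu, range)
    mu = 5.0 / t
    u = np.linspace(-3.0, 3.0, N)
    h = u[1] - u[0]
    s = mu * (1.0 + 1j * u) ** 2
    ds = 2j * mu * (1.0 + 1j * u)           # ds/du
    F = s ** (alpha - 1) / (s ** alpha + 1.0)
    return (h / (2j * np.pi) * np.sum(np.exp(s * t) * F * ds)).real

print(ml_neg(1.0, 1.0), np.exp(-1.0))       # sanity check: E_1(-t) = exp(-t)
print(ml_neg(0.7, 1.0))                     # a genuinely fractional case
```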

Proceedings Article
25 Jan 2015
TL;DR: In this paper, the generalized singular value thresholding (GSVT) operator $\mathrm{Prox}_g^{\sigma}(\cdot)$ was proposed to solve the nonconvex low rank minimization problem by using GSVT in place of SVT.
Abstract: This work studies the Generalized Singular Value Thresholding (GSVT) operator $\mathrm{Prox}_g^{\sigma}(\cdot)$, $\mathrm{Prox}_g^{\sigma}(B) = \arg\min_X \sum_{i=1}^{m} g(\sigma_i(X)) + \frac{1}{2}\|X - B\|_F^2$, associated with a nonconvex function $g$ defined on the singular values of $X$. We prove that GSVT can be obtained by performing the proximal operator of $g$ (denoted as $\mathrm{Prox}_g(\cdot)$) on the singular values, since $\mathrm{Prox}_g(\cdot)$ is monotone when $g$ is lower bounded. If the nonconvex $g$ satisfies some conditions (many popular nonconvex surrogate functions, e.g., the $\ell_p$-norm with $0 < p < 1$, of the $\ell_0$-norm are special cases), a general solver to find $\mathrm{Prox}_g(b)$ is proposed for any $b \geq 0$. GSVT greatly generalizes the known Singular Value Thresholding (SVT), which is a basic subroutine in many convex low rank minimization methods. We are able to solve the nonconvex low rank minimization problem by using GSVT in place of SVT.
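A minimal numpy sketch of GSVT: compute an SVD and apply a monotone scalar proximal operator to the singular values; soft-thresholding recovers classical SVT, and a hard-threshold prox stands in for a nonconvex surrogate. The value of λ and the test matrix are illustrative.

```python
import numpy as np

def gsvt(B, prox_g):
    U, s, Vt = np.linalg.svd(B, full_matrices=False)
    return U @ np.diag(prox_g(s)) @ Vt      # prox applied to the singular values

lam = 1.0
soft = lambda s: np.maximum(s - lam, 0.0)                 # prox of lam*|.| (classical SVT)
hard = lambda s: np.where(s > np.sqrt(2 * lam), s, 0.0)   # prox of lam*|.|_0 (nonconvex)

B = np.random.default_rng(0).standard_normal((6, 4))
print(np.linalg.matrix_rank(gsvt(B, soft)), np.linalg.matrix_rank(gsvt(B, hard)))
```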

Journal ArticleDOI
TL;DR: In this article, the existence and the asymptotic behavior of non-negative solutions for a class of stationary Kirchhoff problems driven by a fractional integro-differential operator $L_K$ and involving a critical nonlinearity were analyzed.
Abstract: This paper deals with the existence and the asymptotic behavior of non-negative solutions for a class of stationary Kirchhoff problems driven by a fractional integro-differential operator $L_K$ and involving a critical nonlinearity. In particular, we consider the problem $-M(\|u\|^2)L_K u = \lambda f(x,u) + |u|^{2^*_s - 2}u$ in $\Omega$, $u = 0$ in $\mathbb{R}^n \setminus \Omega$, where $\Omega \subset \mathbb{R}^n$ is a bounded domain, $2^*_s$ is the critical exponent of the fractional Sobolev space $H^s(\mathbb{R}^n)$, the function $f$ is a subcritical term and $\lambda$ is a positive parameter. The main feature, as well as the main difficulty, of the analysis is the fact that the Kirchhoff function $M$ could be zero at zero, that is, the problem is degenerate. The adopted techniques are variational and the main theorems extend in several directions previous results that recently appeared in the literature.

Journal ArticleDOI
TL;DR: In this paper, a time-varying distributed convex optimization problem is studied for continuous-time multi-agent systems, and two discontinuous algorithms based on the signum function are proposed to solve the problem in each case.
Abstract: In this paper, a time-varying distributed convex optimization problem is studied for continuous-time multi-agent systems. Control algorithms are designed for the cases of single-integrator and double-integrator dynamics. Two discontinuous algorithms based on the signum function are proposed to solve the problem in each case. Then, in the case of double-integrator dynamics, two continuous algorithms based on, respectively, a time-varying and a fixed boundary layer are proposed as continuous approximations of the signum function. Also, to account for inter-agent collision avoidance for physical agents, a distributed convex optimization problem with swarm tracking behavior is introduced for both single-integrator and double-integrator dynamics.

Posted Content
24 Mar 2015
TL;DR: In this paper, a new analogue of Bernstein operators, the (p, q)-Bernstein operators, is introduced as a generalization of the q-Bernstein operators, and approximation properties are studied based on Korovkin's type approximation theorem.
Abstract: In this paper, we introduce a new analogue of Bernstein operators, which we call the (p, q)-Bernstein operators, as a generalization of the q-Bernstein operators. We also study approximation properties of the (p, q)-Bernstein operators based on Korovkin's type approximation theorem and establish some direct theorems. Furthermore, we show comparisons and some illustrative graphics for the convergence of the operators to a function.
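For reference, a minimal sketch of the classical Bernstein operator B_n(f; x) = Σ_k f(k/n) C(n,k) x^k (1−x)^(n−k) that the paper generalizes; the (p, q)-analogue replaces the integers k and n by (p, q)-integers [k]_{p,q} = (p^k − q^k)/(p − q) and uses (p, q)-binomials (not implemented here).

```python
import math
import numpy as np

def bernstein(f, n, x):
    k = np.arange(n + 1)
    binom = np.array([math.comb(n, j) for j in k], dtype=float)
    return float(np.sum(f(k / n) * binom * x**k * (1 - x)**(n - k)))

f = lambda t: np.abs(t - 0.5)               # continuous test function
for n in (4, 16, 64, 256):                  # error shrinks as n grows
    print(n, abs(bernstein(f, n, 0.4) - f(0.4)))
```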

Journal ArticleDOI
TL;DR: In this article, the authors consider the logarithmic negativity of a finite interval embedded in an infinite one-dimensional system at finite temperature and show that the naive approach based on the calculation of a two-point function of twist fields in a cylindrical geometry yields a wrong result.
Abstract: We consider the logarithmic negativity of a finite interval embedded in an infinite one-dimensional system at finite temperature. We focus on conformal invariant systems and we show that the naive approach based on the calculation of a two-point function of twist fields in a cylindrical geometry yields a wrong result. The correct result is obtained through a four-point function of twist fields in which two auxiliary fields are inserted far away from the interval, and they are sent to infinity only after having taken the replica limit. In this way, we find a universal scaling form for the finite temperature negativity which depends on the full operator content of the theory and not only on the central charge. In the limit of low and high temperatures, the expansion of this universal form can be obtained by means of the operator product expansion. We check our results against exact numerical computations for the critical harmonic chain.

Proceedings ArticleDOI
04 Jan 2015
TL;DR: In this paper, the authors study the private distributed optimization problem (PDOP) with the additional requirement that the cost function of the individual agents should remain differentially private, and propose a class of iterative algorithms for solving PDOP, which achieves differential privacy and convergence to a common value.
Abstract: In distributed optimization and iterative consensus literature, a standard problem is for N agents to minimize a function f over a subset of Euclidean space, where the cost function is expressed as a sum Σ f_i. In this paper, we study the private distributed optimization problem (PDOP) with the additional requirement that the cost function of the individual agents should remain differentially private. The adversary attempts to infer information about the private cost functions from the messages that the agents exchange. Achieving differential privacy requires that any change of an individual's cost function only results in unsubstantial changes in the statistics of the messages. We propose a class of iterative algorithms for solving PDOP, which achieves differential privacy and convergence to a common value. Our analysis reveals the dependence of the achieved accuracy and the privacy levels on the parameters of the algorithm. We observe that to achieve ε-differential privacy the accuracy of the algorithm is of the order of O(1/ε²).
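A loose numpy sketch of the kind of mechanism studied: consensus plus gradient steps in which each agent's outgoing message is perturbed with Laplace noise. The averaging matrix, step sizes, and noise schedule are illustrative assumptions, not the paper's algorithm or its privacy calibration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
targets = rng.standard_normal(n)            # f_i(x) = 0.5 (x - target_i)^2
W = np.full((n, n), 1.0 / n)                # complete-graph averaging, for simplicity

x = np.zeros(n)
for k in range(1, 500):
    msg = x + rng.laplace(scale=2.0 / k**0.8, size=n)    # privatized messages
    x = W @ msg - (1.0 / k) * (x - targets)              # consensus + gradient step

print(x)
print(targets.mean())                       # copies roughly agree near the optimum
```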

Journal ArticleDOI
TL;DR: A weighted least square method (WLSM) is applied that addresses the sample selection bias problem of single-regime models, and a theoretical investigation reveals that the deficiency associated with the LSM arises because the expected value of speed is nonlinear with respect to the density.
Abstract: The speed-density or flow-density relationship has been considered as the foundation of traffic flow theory. Existing single-regime models calibrated by the least square method (LSM) could not fit the empirical data consistently well both in light-traffic/free-flow conditions and congested/jam conditions. In this paper, first, we point out that the inaccuracy of single-regime models is not caused solely by their functional forms, but also by the sample selection bias. Second, we apply a weighted least square method (WLSM) that addresses the sample selection bias problem. The calibration results for six well-known single-regime models using the WLSM fit the empirical data reasonably well both in light-traffic/free-flow conditions and congested/jam conditions. Third, we conduct a theoretical investigation which reveals that the deficiency associated with the LSM arises because the expected value of speed (or a function of it) is nonlinear with respect to the density (or a function of it).
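A minimal sketch of the weighted least squares idea on synthetic data: observations are weighted (via curve_fit's sigma argument) by how many samples fall in their density bin, so the oversampled free-flow regime no longer dominates the fit. Underwood's model and all numbers are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(0)
k = np.concatenate([rng.uniform(5, 40, 300), rng.uniform(40, 120, 40)])  # density, veh/km
v = 90 * np.exp(-k / 60) + rng.normal(0, 4, k.size)                      # speed, km/h

counts, edges = np.histogram(k, bins=12)
w = counts[np.clip(np.digitize(k, edges) - 1, 0, len(counts) - 1)]  # samples in own bin
underwood = lambda k, vf, km: vf * np.exp(-k / km)

popt_ls,  _ = curve_fit(underwood, k, v, p0=(80.0, 50.0))
popt_wls, _ = curve_fit(underwood, k, v, p0=(80.0, 50.0), sigma=np.sqrt(w))
print(popt_ls, popt_wls)   # WLS downweights the oversampled free-flow points
```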

Journal ArticleDOI
TL;DR: This work develops an efficient, data-driven technique for estimating the parameters of these models from observed equilibria, and supports both parametric and nonparametric estimation by leveraging ideas from statistical learning (kernel methods and regularization operators).
Abstract: Equilibrium modeling is common in a variety of fields such as game theory and transportation science. The inputs for these models, however, are often difficult to estimate, while their outputs, i.e., the equilibria they are meant to describe, are often directly observable. By combining ideas from inverse optimization with the theory of variational inequalities, we develop an efficient, data-driven technique for estimating the parameters of these models from observed equilibria. We use this technique to estimate the utility functions of players in a game from their observed actions and to estimate the congestion function on a road network from traffic count data. A distinguishing feature of our approach is that it supports both parametric and nonparametric estimation by leveraging ideas from statistical learning (kernel methods and regularization operators). In computational experiments involving Nash and Wardrop equilibria in a nonparametric setting, we find that a) we effectively estimate the unknown demand or congestion function, respectively, and b) our proposed regularization technique substantially improves the out-of-sample performance of our estimators.

Journal ArticleDOI
TL;DR: It is found that, under mild conditions on the objective function, the Chebyshev scalarizing function has an almost identical effect to Pareto-dominance relations when the authors consider the probability of finding superior solutions for algorithms that follow a balanced trajectory.

Journal ArticleDOI
01 Feb 2015
TL;DR: A distributed cooperative optimization problem encountered in a computational multiagent network with delay is considered, where each agent has local access to its convex cost function and the agents jointly minimize the global cost function over the whole network.
Abstract: In this technical correspondence, we consider a distributed cooperative optimization problem encountered in a computational multiagent network with delay, where each agent has local access to its convex cost function and the agents jointly minimize the global cost function over the whole network. To solve this problem, we develop an algorithm that is based on dual averaging updates and delayed subgradient information, and analyze its convergence properties for a diminishing step size by utilizing Bregman-distance functions. Moreover, we provide sharp bounds on the convergence rates as a function of the network size and topology, embodied in the inverse spectral gap. Finally, we present a numerical example to evaluate our algorithm and compare its performance with several similar algorithms.

Journal ArticleDOI
TL;DR: In this article, a Renormalization Group (RG) equation for the function f in a theory of gravity in the f(R) truncation was proposed. But this equation differs from previous ones due to the exponential parametrization of the quantum fluctuations and to the choice of gauge.
Abstract: We write a Renormalization Group (RG) equation for the function f in a theory of gravity in the f(R) truncation. Our equation differs from previous ones due to the exponential parametrization of the quantum fluctuations and to the choice of gauge. The cutoff procedure depends on three free parameters, and we find that there exist discrete special choices of parameters for which the flow equation has fixed points where f=f_0+f_1 R+f_2 R^2. For other values of the parameters the solution seems to be continuously deformed.

Journal ArticleDOI
TL;DR: This paper slightly modifies Khojasteh et al.'s notion of simulation function and investigates the existence and uniqueness of coincidence points of two nonlinear operators using this kind of control function.

Journal ArticleDOI
TL;DR: In this paper, a simple Fourier transform (FT) method is presented for obtaining a Distribution Function of Relaxation Times (DFRT) for electrochemical impedance spectroscopy (EIS) data.

Journal ArticleDOI
TL;DR: A novel bottom-up salient object detection approach by exploiting the relationship between the saliency detection and the Markov absorption probability is presented and the robustness and efficiency of the proposed method against 17 state-of-the-art methods are demonstrated.
Abstract: In this paper, we present a novel bottom-up salient object detection approach by exploiting the relationship between saliency detection and the Markov absorption probability. First, we calculate a preliminary saliency map by the Markov absorption probability on a weighted graph via partial image borders as background prior. Unlike most of the existing background prior-based methods, which treat all image boundaries as background, we only use the left and top sides as background for simplicity. The saliency of each element is defined as the sum of the corresponding absorption probabilities by several left and top virtual boundary nodes which are most similar to it. Second, a better result is obtained by ranking the relevance of the image elements with foreground cues extracted from the preliminary saliency map, which can effectively emphasize the objects against the background; this computation proceeds similarly to that in the first stage and yet is substantially different from the former one. At last, three optimization techniques (a content-based diffusion mechanism, a superpixelwise depression function, and a guided filter) are utilized to further refine the saliency map generated at the second stage, and they are shown to be effective and complementary to one another. Both qualitative and quantitative evaluations on four publicly available benchmark data sets demonstrate the robustness and efficiency of the proposed method against 17 state-of-the-art methods.
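For intuition, a minimal sketch of the quantity the method is built on: for an absorbing Markov chain with transition matrix [[Q, R], [0, I]], the absorption probabilities are B = (I − Q)⁻¹R, with boundary nodes playing the role of absorbing states; the tiny chain below is illustrative.

```python
import numpy as np

Q = np.array([[0.0, 0.5, 0.2],      # transient -> transient transitions
              [0.3, 0.0, 0.3],
              [0.1, 0.4, 0.0]])
R = np.array([[0.2, 0.1],           # transient -> absorbing (boundary) nodes
              [0.1, 0.3],
              [0.3, 0.2]])

B = np.linalg.solve(np.eye(3) - Q, R)   # fundamental matrix (I - Q)^-1 times R
print(B)            # row i: probability node i is absorbed by each boundary node
print(B.sum(1))     # each row sums to 1 (absorption is certain)
```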

Posted Content
TL;DR: In this paper, the authors studied the problem of estimating the number of samples required to answer a sequence of adaptive queries about an unknown distribution, as a function of the type of queries and the desired level of accuracy.
Abstract: Adaptivity is an important feature of data analysis---the choice of questions to ask about a dataset often depends on previous interactions with the same dataset. However, statistical validity is typically studied in a nonadaptive model, where all questions are specified before the dataset is drawn. Recent work by Dwork et al. (STOC, 2015) and Hardt and Ullman (FOCS, 2014) initiated the formal study of this problem, and gave the first upper and lower bounds on the achievable generalization error for adaptive data analysis. Specifically, suppose there is an unknown distribution $\mathbf{P}$ and a set of $n$ independent samples $\mathbf{x}$ is drawn from $\mathbf{P}$. We seek an algorithm that, given $\mathbf{x}$ as input, accurately answers a sequence of adaptively chosen queries about the unknown distribution $\mathbf{P}$. How many samples $n$ must we draw from the distribution, as a function of the type of queries, the number of queries, and the desired level of accuracy? In this work we make two new contributions: (i) We give upper bounds on the number of samples $n$ that are needed to answer statistical queries. The bounds improve and simplify the work of Dwork et al. (STOC, 2015), and have been applied in subsequent work by those authors (Science, 2015, NIPS, 2015). (ii) We prove the first upper bounds on the number of samples required to answer more general families of queries. These include arbitrary low-sensitivity queries and an important class of optimization queries. As in Dwork et al., our algorithms are based on a connection with algorithmic stability in the form of differential privacy. We extend their work by giving a quantitatively optimal, more general, and simpler proof of their main theorem that stability implies low generalization error. We also study weaker stability guarantees such as bounded KL divergence and total variation distance.
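A minimal sketch of the stability connection the paper builds on: answering a statistical query with the empirical mean plus Laplace noise of scale 1/(nε) is ε-differentially private per query, which in turn bounds the generalization error under adaptivity; the query and numbers are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, eps = 10_000, 0.1
x = rng.random(n)                           # n samples from an unknown P on [0, 1]

def answer(query):
    # query maps a sample to [0, 1]; noise scale 1/(n*eps) gives eps-DP per query
    return query(x).mean() + rng.laplace(scale=1.0 / (n * eps))

print(answer(lambda t: t > 0.5))            # close to the population value 0.5
```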

Journal ArticleDOI
Jianpeng Wang, Huchuan Lu, Xiaohui Li, Na Tong, Wei Liu
TL;DR: A bottom-up visual saliency detection algorithm that takes both background and foreground into consideration and the two saliency maps are integrated by the proposed unified function is proposed.

Journal ArticleDOI
TL;DR: This paper is the first to analyze the convergence, admissibility, and optimality properties of the generalized policy iteration algorithm for DT nonlinear systems.
Abstract: In this paper, a novel iterative adaptive dynamic programming (ADP)-based infinite horizon self-learning optimal control algorithm, called the generalized policy iteration algorithm, is developed for nonaffine discrete-time (DT) nonlinear systems. Generalized policy iteration is a general scheme of ADP that interleaves the policy iteration and value iteration algorithms. The developed generalized policy iteration algorithm permits an arbitrary positive semidefinite function to initialize the algorithm, where two iteration indices are used for policy improvement and policy evaluation, respectively. This is the first time that the convergence, admissibility, and optimality properties of the generalized policy iteration algorithm for DT nonlinear systems are analyzed. Neural networks are used to implement the developed algorithm. Finally, numerical examples are presented to illustrate the performance of the developed algorithm.
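A minimal sketch of generalized policy iteration on a tiny known MDP, interleaving a few policy-evaluation sweeps with policy improvement; n_eval = 1 recovers value iteration and n_eval → ∞ recovers policy iteration. The MDP and the tabular setting are illustrative; the paper uses neural networks for nonaffine DT systems.

```python
import numpy as np

P = np.array([[[0.9, 0.1], [0.2, 0.8]],       # P[s, a, s'] transition probabilities
              [[0.7, 0.3], [0.05, 0.95]]])
Rw = np.array([[1.0, 0.0], [0.5, 2.0]])       # reward R[s, a]
gamma, n_eval = 0.9, 3

V = np.zeros(2)                               # arbitrary (here zero) initialization
for _ in range(200):
    pi = np.argmax(Rw + gamma * P @ V, axis=1)          # policy improvement
    for _ in range(n_eval):                             # partial policy evaluation
        V = Rw[np.arange(2), pi] + gamma * (P[np.arange(2), pi] @ V)

print(pi, V)
```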

Journal ArticleDOI
TL;DR: In this paper, the authors derive the Ward identities which relate the three point function of scalar perturbations produced during inflation to the scalar four point function, in a particular limit.
Abstract: Using symmetry considerations, we derive Ward identities which relate the three point function of scalar perturbations produced during inflation to the scalar four point function, in a particular limit. The derivation assumes approximate conformal invariance, and the conditions for the slow roll approximation, but is otherwise model independent. The Ward identities allow us to deduce that the three point function must be suppressed in general, being of the same order of magnitude as in the slow roll model. They also fix the three point function in terms of the four point function, up to one constant which we argue is generically suppressed. Our approach is based on analyzing the wave function of the universe, and the Ward identities arise by imposing the requirements of spatial and time reparametrization invariance on it.

Journal ArticleDOI
TL;DR: In this paper, the authors consider the problem of finding a positive ground state solution for a nonlinear problem of Kirchhoff type with general nonlinearities, and prove that the problem has a positive ground state solution under certain assumptions on V and f.