Journal ISSN: 2197-9847

Research in the Mathematical Sciences 

Springer Nature
About: Research in the Mathematical Sciences is an academic journal published by Springer Nature. The journal publishes mainly in the areas of computer science and modular forms. It has the ISSN identifier 2197-9847. Over its lifetime, 339 publications have appeared, receiving 4094 citations.

Papers published on a yearly basis

Papers
Journal Article
TL;DR: In particular, this paper showed that, under the generalized Elliott–Halberstam conjecture, for any admissible triple (h_1, h_2, h_3) there are infinitely many n for which at least two of n+h_1, n+h_2, n+h_3 are prime, and also showed that either the twin prime conjecture holds or the even Goldbach conjecture is asymptotically true if one allows an additive error of at most 2, or both.
Abstract: For any m ≥ 1, let H_m denote the quantity lim inf_{n→∞} (p_{n+m} − p_n), where p_n denotes the n-th prime. A celebrated recent result of Zhang showed the finiteness of H_1, with the explicit bound H_1 ≤ 70,000,000. This was then improved by us (the Polymath8 project) to H_1 ≤ 4680, and then by Maynard to H_1 ≤ 600, who also established for the first time a finiteness result for H_m for m ≥ 2, and specifically that H_m ≪ m^3 e^{4m}. If one also assumes the Elliott–Halberstam conjecture, Maynard obtained the bound H_1 ≤ 12, improving upon the previous bound H_1 ≤ 16 of Goldston, Pintz, and Yıldırım, as well as the bound H_m ≪ m^3 e^{2m}. In this paper, we extend the methods of Maynard by generalizing the Selberg sieve further and by performing more extensive numerical calculations. As a consequence, we can obtain the bound H_1 ≤ 246 unconditionally and H_1 ≤ 6 under the assumption of the generalized Elliott–Halberstam conjecture. Indeed, under the latter conjecture, we show the stronger statement that for any admissible triple (h_1, h_2, h_3), there are infinitely many n for which at least two of n+h_1, n+h_2, n+h_3 are prime, and also obtain a related disjunction asserting that either the twin prime conjecture holds or the even Goldbach conjecture is asymptotically true if one allows an additive error of at most 2, or both. We also modify the 'parity problem' argument of Selberg to show that the H_1 ≤ 6 bound is the best possible that one can obtain from purely sieve-theoretic considerations. For larger m, we use the distributional results obtained previously by our project to obtain the unconditional asymptotic bound H_m ≪ m e^{(4 − 28/157)m}, or H_m ≪ m e^{2m} under the assumption of the Elliott–Halberstam conjecture. We also obtain explicit upper bounds for H_m when m = 2, 3, 4, 5.
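The notion of an admissible tuple used above has a simple computational test: a tuple (h_1, …, h_k) is admissible if, for every prime p, the h_i avoid at least one residue class mod p, and only primes p ≤ k can fail this (k values cannot cover more than k classes). A minimal sketch, with helper names of our own choosing (not from the paper):

```python
def primes_up_to(n):
    """Primes <= n by trial division (fine for the tiny n needed here)."""
    return [p for p in range(2, n + 1)
            if all(p % d for d in range(2, int(p ** 0.5) + 1))]

def is_admissible(H):
    """Check admissibility of a tuple H = (h_1, ..., h_k): for every prime
    p <= k, the h_i must miss at least one residue class mod p."""
    k = len(H)
    for p in primes_up_to(k):
        if len({h % p for h in H}) == p:  # all residue classes mod p covered
            return False
    return True
```

For example, (0, 2, 6) is admissible, while (0, 2, 4) is not, since 0, 2, 4 cover all three residue classes mod 3.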

167 citations

Journal Article
TL;DR: In this article, the authors introduced the mathematical formulation of the population risk minimization problem in deep learning as a mean-field optimal control problem and proved optimality conditions of both the Hamilton-Jacobi-Bellman type and the Pontryagin type.
Abstract: Recent work linking deep neural networks and dynamical systems opened up new avenues to analyze deep learning. In particular, it is observed that new insights can be obtained by recasting deep learning as an optimal control problem on difference or differential equations. However, the mathematical aspects of such a formulation have not been systematically explored. This paper introduces the mathematical formulation of the population risk minimization problem in deep learning as a mean-field optimal control problem. Mirroring the development of classical optimal control, we state and prove optimality conditions of both the Hamilton–Jacobi–Bellman type and the Pontryagin type. These mean-field results reflect the probabilistic nature of the learning problem. In addition, by appealing to the mean-field Pontryagin’s maximum principle, we establish some quantitative relationships between population and empirical learning problems. This serves to establish a mathematical foundation for investigating the algorithmic and theoretical connections between optimal control and deep learning.
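The "deep learning as optimal control" viewpoint the abstract refers to can be made concrete in a few lines: a residual network is the explicit-Euler discretization of a controlled ODE dx/dt = f(x, θ), with the layer parameters playing the role of the control. A minimal illustrative sketch (our own, not the paper's code; the function name and tanh dynamics are assumptions):

```python
import numpy as np

def resnet_forward(x0, thetas, dt=0.1):
    """Explicit-Euler rollout x_{t+1} = x_t + dt * f(x_t, theta_t),
    i.e. a residual network where theta_t = (W_t, b_t) is the control
    applied at layer t."""
    x = x0
    for W, b in thetas:
        x = x + dt * np.tanh(W @ x + b)
    return x
```

In this picture, training (minimizing population risk over the controls θ) is exactly the mean-field optimal control problem the paper formulates.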

144 citations

Journal Article
TL;DR: Stochastic homogenization theory allows us to better understand the convergence of the algorithm, and a stochastic control interpretation is used to prove that a modified algorithm converges faster than SGD in expectation.
Abstract: Entropy-SGD is a first-order optimization method which has been used successfully to train deep neural networks. This algorithm, which was motivated by statistical physics, is now interpreted as gradient descent on a modified loss function. The modified, or relaxed, loss function is the solution of a viscous Hamilton–Jacobi partial differential equation (PDE). Experimental results on modern, high-dimensional neural networks demonstrate that the algorithm converges faster than the benchmark stochastic gradient descent (SGD). Well-established PDE regularity results allow us to analyze the geometry of the relaxed energy landscape, confirming empirical evidence. Stochastic homogenization theory allows us to better understand the convergence of the algorithm. A stochastic control interpretation is used to prove that a modified algorithm converges faster than SGD in expectation.
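In the zero-viscosity limit, the relaxed loss solving the Hamilton–Jacobi PDE reduces to the Hopf–Lax (Moreau envelope) formula u(x, t) = min_y [ f(y) + |x − y|²/(2t) ], a smoothed lower envelope of the original loss with the same global minimum. A brute-force one-dimensional illustration of this smoothing effect (our own sketch under that inviscid assumption, not the paper's algorithm):

```python
import numpy as np

def moreau_envelope(f_vals, xs, t):
    """Hopf-Lax / Moreau envelope u(x,t) = min_y [f(y) + (x-y)^2/(2t)],
    computed by brute-force minimization over the grid for each x."""
    return np.array([np.min(f_vals + (x - xs) ** 2 / (2 * t)) for x in xs])

xs = np.linspace(-2.0, 2.0, 401)
f = np.cos(5 * xs) + xs ** 2       # a wiggly, nonconvex toy loss
u = moreau_envelope(f, xs, t=0.5)  # smoothed relaxation of f
```

The envelope lies below the loss everywhere (take y = x in the minimum) yet attains the same minimum value, which is one way to see why gradient descent on the relaxed landscape can behave better than on the original.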

135 citations

Journal Article
TL;DR: In this article, the authors relate umbral moonshine to the Niemeier lattices, the 23 even unimodular positive-definite lattices of rank 24 with non-trivial root systems, recovering Mathieu moonshine as a special case.
Abstract: In this paper, we relate umbral moonshine to the Niemeier lattices - the 23 even unimodular positive-definite lattices of rank 24 with non-trivial root systems. To each Niemeier lattice, we attach a finite group by considering a naturally defined quotient of the lattice automorphism group, and for each conjugacy class of each of these groups, we identify a vector-valued mock modular form whose components coincide with mock theta functions of Ramanujan in many cases. This leads to the umbral moonshine conjecture, stating that an infinite-dimensional module is assigned to each of the Niemeier lattices in such a way that the associated graded trace functions are mock modular forms of a distinguished nature. These constructions and conjectures extend those of our earlier paper and in particular include the Mathieu moonshine observed by Eguchi, Ooguri and Tachikawa as a special case. Our analysis also highlights a correspondence between genus zero groups and Niemeier lattices. As a part of this relation, we recognise the Coxeter numbers of Niemeier root systems with a type A component as exactly those levels for which the corresponding classical modular curve has genus zero.

132 citations

Journal Article
TL;DR: Darbon et al. used the classical Hopf formulas to solve initial value problems for HJ PDEs and obtained methods that appear to be polynomial in the dimension.
Abstract: It is well known that time-dependent Hamilton–Jacobi–Isaacs partial differential equations (HJ PDEs) play an important role in analyzing continuous dynamic games and control theory problems. An important tool for such problems when they involve geometric motion is the level set method (Osher and Sethian in J Comput Phys 79(1):12–49, 1988). This was first used for reachability problems in Mitchell et al. (IEEE Trans Autom Control 50(171):947–957, 2005) and Mitchell and Tomlin (J Sci Comput 19(1–3):323–346, 2003). The cost of these algorithms, and in fact of all PDE numerical approximations, is exponential in the space dimension and time. In Darbon (SIAM J Imaging Sci 8(4):2268–2293, 2015), some connections between HJ PDEs and convex optimization in many dimensions are presented. In this work, we propose and test methods for solving a large class of the HJ PDEs relevant to optimal control problems without the use of grids or numerical approximations. Rather, we use the classical Hopf formulas for solving initial value problems for HJ PDEs (Hopf in J Math Mech 14:951–973, 1965). We have noticed that if the Hamiltonian is convex and positively homogeneous of degree one (the latter holds for all geometrically based level set motion and for control and differential game problems), then very fast methods exist to solve the resulting optimization problem. This is closely related to fast methods for solving problems in compressive sensing based on ℓ_1 optimization (Goldstein and Osher in SIAM J Imaging Sci 2(2):323–343, 2009; Yin et al. in SIAM J Imaging Sci 1(1):143–168, 2008). We seem to obtain methods which are polynomial in the dimension. Our algorithm is very fast, requires very low memory and is totally parallelizable. We can evaluate the solution and its gradient in very high dimensions at 10^{-4}–10^{-8} s per evaluation on a laptop.
We carefully explain how to compute numerically the optimal control from the numerical solution of the associated initial value HJ PDE for a class of optimal control problems. We show that our algorithms compute all the quantities needed to obtain the controller easily. In addition, as a step often needed in this procedure, we have developed a new and equally fast way to find, in very high dimensions, the closest point y, lying in the union of a finite number of compact convex sets Ω, to any point x exterior to Ω. We can also compute the distance to these sets much faster than Dijkstra-type "fast methods," e.g., Dijkstra (Numer Math 1:269–271, 1959). The term "curse of dimensionality" was coined by Bellman (Adaptive control processes, a guided tour. Princeton University Press, Princeton, 1961; Dynamic programming. Princeton University Press, Princeton, 1957) when considering problems in dynamic optimization.
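For a convex, positively 1-homogeneous Hamiltonian such as H(p) = |p|, the Hopf–Lax formula replaces the PDE solve by an optimization: since H*(q) = 0 for |q| ≤ 1 and +∞ otherwise, u(x, t) = min_{|y−x| ≤ t} J(y). A tiny one-dimensional sketch of this idea (our own illustration with a brute-force minimizer, not the authors' fast solver):

```python
import numpy as np

def hopf_lax_eikonal(J, x, t, n=2001):
    """Hopf-Lax solution of u_t + |u_x| = 0, u(x,0) = J(x):
    u(x,t) = min over the ball {y : |y - x| <= t} of J(y),
    approximated here by a dense grid search."""
    ys = np.linspace(x - t, x + t, n)  # feasible set of the minimization
    return np.min(J(ys))

J = lambda y: y ** 2                       # convex initial data
u = hopf_lax_eikonal(J, x=1.5, t=0.5)      # closed form: max(|x| - t, 0)^2
```

For J(y) = y², the exact solution is u(x, t) = max(|x| − t, 0)², so the grid search should return 1.0 at (x, t) = (1.5, 0.5); no grid in x is ever built, which is the point of the grid-free approach.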

116 citations

Performance
Metrics
No. of papers from the Journal in previous years
Year    Papers
2023    19
2022    80
2021    61
2020    34
2019    34
2018    43