Showing papers by "Stanley Osher" published in 2021


Journal ArticleDOI
31 Mar 2021-Nature
TL;DR: In this paper, an atomic electron tomography reconstruction method was developed to determine the 3D atomic positions of an amorphous solid using a multi-component glass-forming alloy as proof of principle.
Abstract: Amorphous solids such as glass, plastics and amorphous thin films are ubiquitous in our daily life and have broad applications ranging from telecommunications to electronics and solar cells [1-4]. However, owing to the lack of long-range order, the three-dimensional (3D) atomic structure of amorphous solids has so far eluded direct experimental determination [5-15]. Here we develop an atomic electron tomography reconstruction method to experimentally determine the 3D atomic positions of an amorphous solid. Using a multi-component glass-forming alloy as proof of principle, we quantitatively characterize the short- and medium-range order of the 3D atomic arrangement. We observe that, although the 3D atomic packing of the short-range order is geometrically disordered, some short-range-order structures connect with each other to form crystal-like superclusters and give rise to medium-range order. We identify four types of crystal-like medium-range order (face-centred cubic, hexagonal close-packed, body-centred cubic and simple cubic) coexisting in the amorphous sample, showing translational but not orientational order. These observations provide direct experimental evidence to support the general framework of the efficient cluster packing model for metallic glasses [10,12-14,16]. We expect that this work will pave the way for the determination of the 3D structure of a wide range of amorphous solids, which could transform our fundamental understanding of non-crystalline materials and related phenomena.

128 citations


Journal Article
TL;DR: SRSGD, a new NAG-style scheme for training DNNs, replaces the constant momentum in SGD by the increasing momentum in NAG but stabilizes the iterations by resetting the momentum to zero according to a schedule.
Abstract: Stochastic gradient descent (SGD) algorithms, with constant momentum and its variants such as Adam, are the optimization methods of choice for training deep neural networks (DNNs). There is great interest in speeding up the convergence of these methods due to their high computational expense. Nesterov accelerated gradient (NAG) with a time-varying momentum, denoted as NAG below, improves the convergence rate of gradient descent (GD) for convex optimization using a specially designed momentum; however, it accumulates error when an inexact gradient is used (such as in SGD), slowing convergence at best and diverging at worst. In this paper, we propose scheduled restart SGD (SRSGD), a new NAG-style scheme for training DNNs. SRSGD replaces the constant momentum in SGD by the increasing momentum in NAG but stabilizes the iterations by resetting the momentum to zero according to a schedule. Using a variety of models and benchmarks for image classification, we demonstrate that, in training DNNs, SRSGD significantly improves convergence and generalization; for instance, in training ResNet-200 for ImageNet classification, SRSGD achieves an error rate of 20.93% vs. the benchmark of 22.13%. These improvements become more significant as the network grows deeper. Furthermore, on both CIFAR and ImageNet, SRSGD reaches similar or even better error rates with significantly fewer training epochs compared to the SGD baseline.

35 citations
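
The scheduled-restart idea in the SRSGD abstract above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the momentum schedule `t / (t + 3)` is assumed here as a typical NAG-style increasing momentum, and the two-dimensional quadratic objective, the Gaussian gradient noise, the step size, and all function names are hypothetical.

```python
import random


def srsgd(grad, w0, lr, steps, restart_every):
    """NAG-style iteration whose momentum is reset to zero every `restart_every` steps."""
    w = list(w0)
    v_prev = list(w0)
    for k in range(steps):
        t = k % restart_every   # steps since the last scheduled restart
        mu = t / (t + 3)        # assumed schedule: 0 at a restart, then increasing
        g = grad(w)
        v = [wi - lr * gi for wi, gi in zip(w, g)]                # gradient step
        w = [vi + mu * (vi - pi) for vi, pi in zip(v, v_prev)]    # momentum step
        v_prev = v
    return w


# Hypothetical test problem: f(w) = 0.5 * sum(a_i * w_i^2) with noisy gradients.
CURVATURE = [1.0, 10.0]
_rng = random.Random(0)


def loss(w):
    return 0.5 * sum(a * wi * wi for a, wi in zip(CURVATURE, w))


def noisy_grad(w, sigma=0.01):
    # Inexact gradient: exact gradient plus small Gaussian noise.
    return [a * wi + _rng.gauss(0.0, sigma) for a, wi in zip(CURVATURE, w)]


w_start = [5.0, 5.0]
w_final = srsgd(noisy_grad, w_start, lr=0.1, steps=200, restart_every=40)
print(loss(w_start), loss(w_final))
```

Setting `restart_every` larger than `steps` recovers a plain NAG-style run with no restart, which gives a simple baseline for comparison on the same noisy problem.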


Posted Content
TL;DR: A new method for training generative adversarial networks by applying the Wasserstein-2 metric proximal on the generators is introduced, which defines a parametrization invariant natural gradient by pulling back optimal transport structures from probability space to parameter space.
Abstract: We introduce a new method for training generative adversarial networks by applying the Wasserstein-2 metric proximal on the generators. The approach is based on Wasserstein information geometry. It defines a parametrization invariant natural gradient by pulling back optimal transport structures from probability space to parameter space. We obtain easy-to-implement iterative regularizers for the parameter updates of implicit deep generative models. Our experiments demonstrate that this method improves the speed and stability of training in terms of wall-clock time and Frechet Inception Distance.

30 citations


Journal ArticleDOI
TL;DR: APAC-Net is an alternating population and agent control neural network for solving stochastic mean-field games (MFGs), geared toward high-dimensional instances of MFGs that are not approachable with existing solution methods.
Abstract: We present APAC-Net, an alternating population and agent control neural network for solving stochastic mean-field games (MFGs). Our algorithm is geared toward high-dimensional instances of MFGs that are not approachable with existing solution methods. We achieve this in two steps. First, we take advantage of the underlying variational primal-dual structure that MFGs exhibit and phrase it as a convex–concave saddle-point problem. Second, we parameterize the value and density functions by two neural networks, respectively. By phrasing the problem in this manner, solving the MFG can be interpreted as a special case of training a generative adversarial network (GAN). We show the potential of our method on up to 100-dimensional MFG problems.

24 citations


Journal ArticleDOI
TL;DR: In this article, the authors used atomic electron tomography to experimentally determine the three-dimensional atomic positions of monatomic amorphous solids, namely a Ta thin film and two Pd nanoparticles.
Abstract: Liquids and solids are two fundamental states of matter. However, our understanding of their three-dimensional atomic structure is mostly based on physical models. Here we use atomic electron tomography to experimentally determine the three-dimensional atomic positions of monatomic amorphous solids, namely a Ta thin film and two Pd nanoparticles. We observe that pentagonal bipyramids are the most abundant atomic motifs in these amorphous materials. Instead of forming icosahedra, the majority of pentagonal bipyramids arrange into pentagonal bipyramid networks with medium-range order. Molecular dynamics simulations further reveal that pentagonal bipyramid networks are prevalent in monatomic metallic liquids, which rapidly grow in size and form more icosahedra during the quench from the liquid to the glass state. These results expand our understanding of the atomic structures of amorphous solids and will encourage future studies on amorphous–crystalline phase and glass transitions in non-crystalline materials with three-dimensional atomic resolution. Atomic electron tomography is used to determine the three-dimensional atomic structure of monatomic amorphous solids with liquid-like structure, which is characterized by the existence of pentagonal bipyramid networks with medium-range order.

23 citations


Journal ArticleDOI
TL;DR: This paper introduces a mean-field game model for controlling the propagation of epidemics, motivated by the COVID-19 pandemic.
Abstract: The coronavirus disease 2019 (COVID-19) pandemic is changing and impacting lives on a global scale. In this paper, we introduce a mean-field game model in controlling the propagation of epidemics o

19 citations


Journal ArticleDOI
TL;DR: In this article, the joint task assignment and collision-free trajectory optimization problem for mobile vehicles is reformulated as a mean-field-game (MFG) problem and solved with a G-prox primal-dual hybrid gradient (PDHG) algorithm whose complexity is linear in the total number of grid points.
Abstract: With the increasing popularity of mobile vehicles, such as unmanned aerial vehicles (UAVs) and mobile robots, it is foreseen that they will play an important role in Internet-of-Things (IoT) networks due to their high mobility and rapid deployment. Specifically, mobile vehicles equipped with sensors act as IoT devices and can be dispatched to several sensing regions to perform sensing tasks. In this article, we consider mobile vehicles for sensing applications and investigate the corresponding joint task assignment and collision-free trajectory optimization problem. This problem is challenging as the number of involved vehicles can be very large, and to tackle the problem efficiently, we reformulate the original optimization problem into a mean-field-game (MFG) problem by simplifying the interaction between vehicles as a distribution over their state space, known as the mean-field term. To solve the MFG problem efficiently, we propose a G-prox primal-dual hybrid gradient (PDHG) algorithm that transforms the MFG problem into a saddle-point problem by defining a Lagrangian functional with a proximal operator. The complexity of this algorithm is shown to be linear with the total number of grid points in the proposed MFG problem. We provide a comprehensive theoretical analysis of the proposed model and algorithm. Numerical results together with the practical implementation on real mobile robots show that our proposed system model and algorithm are of significant effectiveness and efficiency.

16 citations
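
The saddle-point structure mentioned in the abstract above can be illustrated with a plain PDHG iteration on a toy problem. This sketch does not reproduce the G-prox variant or the MFG discretization from the paper. It assumes a hypothetical scalar problem, min over x of |k*x| + 0.5*(x - b)^2, written as a saddle point over x and a dual variable y constrained to [-1, 1]; the step sizes `tau` and `sigma` are assumed values satisfying tau*sigma*k^2 < 1.

```python
def clip(value, low, high):
    return max(low, min(high, value))


def pdhg_scalar(b, k=1.0, tau=0.5, sigma=0.5, iters=200):
    """Plain PDHG for min_x |k*x| + 0.5*(x - b)**2, using the saddle-point form
    min_x max_{|y| <= 1} y*k*x + 0.5*(x - b)**2."""
    x, y, x_bar = 0.0, 0.0, 0.0
    for _ in range(iters):
        # Dual ascent step followed by projection onto [-1, 1].
        y = clip(y + sigma * k * x_bar, -1.0, 1.0)
        # Primal step: closed-form proximal map of 0.5*(x - b)**2.
        x_new = (x - tau * k * y + tau * b) / (1.0 + tau)
        # Extrapolation of the primal variable.
        x_bar = 2.0 * x_new - x
        x = x_new
    return x, y


x_opt, y_opt = pdhg_scalar(b=3.0)
print(x_opt, y_opt)
```

For `k = 1` the minimizer is the soft-thresholded value of `b`, so `b = 3.0` should give `x` close to 2 and a dual variable close to 1, which makes the toy problem easy to check by hand.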


Journal ArticleDOI
TL;DR: In this article, fast algorithms for generalized unnormalized optimal transport were proposed to handle densities with different total mass, and the associated gradient flow is essentially related to an elliptic equation which can be solved efficiently.

13 citations


Journal ArticleDOI
03 Jun 2021
TL;DR: This paper considers a mobile-vehicle-assisted MCS system where vehicles are owned by different operators or individuals who compete against others for limited sensing resources, and proposes a multi-population Mean-Field Game (MPMFG) problem by simplifying the interaction between vehicles as a distribution over their strategy space, known as the mean-field term.
Abstract: With the increasing deployment of mobile vehicles, such as mobile robots and unmanned aerial vehicles (UAVs), it is foreseen that they will play an important role in mobile crowd sensing (MCS). Specifically, mobile vehicles equipped with sensors and communication modules are able to execute large scale tasks due to their fast and flexible mobility in MCS systems. However, the battery capacity of mobile vehicles imposes a significant limitation on their performance, and so energy efficiency is an important metric especially when a large number of mobile vehicles collect sensing data. In this paper, we consider a mobile-vehicle-assisted MCS system where vehicles are owned by different operators or individuals who compete against others for limited sensing resources. We investigate the joint task selection and route planning problem for such an MCS system from an energy-efficiency perspective. However, since the computational complexity of the original problem is very high due to the large number of vehicles, we propose a multi-population Mean-Field Game (MPMFG) problem by simplifying the interaction between vehicles as a distribution over their strategy space, known as the mean-field term. To solve the MPMFG problem efficiently, we propose a G-prox primal-dual hybrid gradient method (PDHG) algorithm whose computational complexity is independent of the number of vehicles. Numerical results verify the effectiveness and efficiency of the proposed MPMFG scheme and G-prox PDHG algorithm.

10 citations


Journal ArticleDOI
TL;DR: This work introduces a framework to model and solve first-order mean field game systems with nonlocal interactions, extending the results in [L. Nurbekyan and J. Saude, Port. Math., 75 (2018), pp. 36...
Abstract: We introduce a novel framework to model and solve first-order mean field game systems with nonlocal interactions, extending the results in [L. Nurbekyan and J. Saude, Port. Math., 75 (2018), pp. 36...

9 citations


Posted Content
23 Mar 2021
TL;DR: Fixed point networks (FPNs) are a simple setup for implicit depth learning that guarantees convergence of forward propagation to a unique limit defined by network weights and input data.
Abstract: A growing trend in deep learning replaces fixed depth models by approximations of the limit as network depth approaches infinity. This approach uses a portion of network weights to prescribe behavior by defining a limit condition. This makes network depth implicit, varying based on the provided data and an error tolerance. Moreover, existing implicit models can be implemented and trained with fixed memory costs in exchange for additional computational costs. In particular, backpropagation through implicit depth models requires solving a Jacobian-based equation arising from the implicit function theorem. We propose fixed point networks (FPNs), a simple setup for implicit depth learning that guarantees convergence of forward propagation to a unique limit defined by network weights and input data. Our key contribution is to provide a new Jacobian-free backpropagation (JFB) scheme that circumvents the need to solve Jacobian-based equations while maintaining fixed memory costs. This makes FPNs much cheaper to train and easy to implement. Our numerical examples yield state of the art classification results for implicit depth models and outperform corresponding explicit models.


Journal ArticleDOI
TL;DR: The stochastic gradient Langevin dynamics (SGLD) algorithm, an important Markov chain Monte Carlo (MCMC) method, has achieved great success in Bayesian learning and posterior sampling.
Abstract: As an important Markov chain Monte Carlo (MCMC) method, the stochastic gradient Langevin dynamics (SGLD) algorithm has achieved great success in Bayesian learning and posterior sampling. However, S...

Journal ArticleDOI
TL;DR: This paper replaces the softmax output activation of DNNs with a graph Laplacian-based high-dimensional interpolating function which, in the continuum limit, converges to the solution of a Laplace-Beltrami equation on a high-dimensional manifold.
Abstract: Improving the accuracy and robustness of deep neural nets (DNNs) and adapting them to small training data are primary tasks in deep learning (DL) research. In this paper, we replace the output activation function of DNNs, typically the data-agnostic softmax function, with a graph Laplacian-based high-dimensional interpolating function which, in the continuum limit, converges to the solution of a Laplace–Beltrami equation on a high-dimensional manifold. Furthermore, we propose end-to-end training and testing algorithms for this new architecture. The proposed DNN with graph interpolating activation integrates the advantages of both deep learning and manifold learning. Compared to the conventional DNNs with the softmax function as output activation, the new framework demonstrates the following major advantages: First, it is better applicable to data-efficient learning in which we train high capacity DNNs without using a large number of training data. Second, it remarkably improves both natural accuracy on the clean images and robust accuracy on the adversarial images crafted by both white-box and black-box adversarial attacks. Third, it is a natural choice for semi-supervised learning. This paper is a significant extension of our earlier work published in NeurIPS, 2018. For reproducibility, the code is available at https://github.com/BaoWangMath/DNN-DataDependentActivation .
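
A simplified picture of graph-based interpolation, as described in the abstract above, is a harmonic extension of known labels over a weighted graph. The sketch below is an assumption-laden toy: it uses one-dimensional features, Gaussian edge weights with a hypothetical bandwidth `sigma`, and Gauss-Seidel sweeps. It does not reproduce the paper's specific interpolating function or its end-to-end training.

```python
import math


def harmonic_interpolate(points, labels, sigma=0.3, sweeps=200):
    """labels maps node index -> known value; all other nodes are interpolated
    so that each equals the weighted average of the remaining nodes."""
    n = len(points)
    weight = [
        [math.exp(-((points[i] - points[j]) ** 2) / sigma ** 2) if i != j else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    u = [labels.get(i, 0.5) for i in range(n)]  # unlabeled nodes start at 0.5
    for _ in range(sweeps):
        for i in range(n):
            if i in labels:
                continue  # labeled nodes stay fixed
            total = sum(weight[i])
            u[i] = sum(weight[i][j] * u[j] for j in range(n)) / total
    return u


# Hypothetical data: two labeled points (class 0 at 0.0, class 1 at 1.0).
points = [0.0, 0.1, 0.2, 0.8, 0.9, 1.0]
labels = {0: 0.0, 5: 1.0}
u = harmonic_interpolate(points, labels)
predicted = [int(value > 0.5) for value in u]
print(predicted)
```

Unlike softmax, the prediction for each unlabeled point here depends on the labeled data through the graph weights, which is the sense in which such an output activation is data-dependent.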

Posted Content
TL;DR: This article proposes Jacobian-free backpropagation (JFB), a fixed-memory approach that circumvents the need to solve Jacobian-based equations, which makes implicit networks faster to train and significantly easier to implement without sacrificing test accuracy.
Abstract: A promising trend in deep learning replaces traditional feedforward networks with implicit networks. Unlike traditional networks, implicit networks solve a fixed point equation to compute inferences. Solving for the fixed point varies in complexity, depending on provided data and an error tolerance. Importantly, implicit networks may be trained with fixed memory costs in stark contrast to feedforward networks, whose memory requirements scale linearly with depth. However, there is no free lunch -- backpropagation through implicit networks often requires solving a costly Jacobian-based equation arising from the implicit function theorem. We propose Jacobian-Free Backpropagation (JFB), a fixed-memory approach that circumvents the need to solve Jacobian-based equations. JFB makes implicit networks faster to train and significantly easier to implement, without sacrificing test accuracy. Our experiments show implicit networks trained with JFB are competitive with feedforward networks and prior implicit networks given the same number of parameters.
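
The contrast drawn in the abstract above, between solving a Jacobian-based equation and skipping it, can be seen in a scalar toy model. The sketch below assumes a hypothetical one-parameter layer `tanh(w*z + x)` with |w| < 1 (so the fixed-point iteration is a contraction) and a squared-error loss. It compares the gradient from the implicit function theorem with a Jacobian-free gradient that differentiates only through one application of the layer at the fixed point.

```python
import math


def layer(z, x, w):
    """Hypothetical implicit layer; a contraction in z whenever |w| < 1."""
    return math.tanh(w * z + x)


def fixed_point(x, w, tol=1e-12, max_iter=1000):
    """Forward pass: iterate the layer until the state stops changing."""
    z = 0.0
    for _ in range(max_iter):
        z_next = layer(z, x, w)
        if abs(z_next - z) < tol:
            return z_next
        z = z_next
    return z


def gradients(x, w, target):
    """Gradient of 0.5*(z* - target)**2 with respect to w, computed two ways."""
    z = fixed_point(x, w)
    sech2 = 1.0 - math.tanh(w * z + x) ** 2
    dT_dw = sech2 * z            # partial derivative of the layer w.r.t. w
    dT_dz = sech2 * w            # partial derivative of the layer w.r.t. z
    dloss_dz = z - target
    exact = dloss_dz * dT_dw / (1.0 - dT_dz)   # implicit function theorem
    jfb = dloss_dz * dT_dw                     # Jacobian-free: drop the 1/(1 - dT_dz) factor
    return z, exact, jfb


z_star, grad_exact, grad_jfb = gradients(x=0.5, w=0.5, target=0.0)
print(z_star, grad_exact, grad_jfb)
```

In this scalar setting the dropped factor is positive, so the two gradients share a sign and differ only in magnitude; the higher-dimensional analysis is in the paper and is not reproduced here.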

Proceedings ArticleDOI
14 Jun 2021
TL;DR: In this paper, a multi-population mean field game (MFG) problem was proposed to solve the joint task selection and route planning problem for a mobile vehicle-based MCS system, where vehicles owned by different operators or individuals compete against others for limited sensing resources.
Abstract: With the increasing deployment of mobile vehicles, such as mobile robots and unmanned aerial vehicles (UAVs), it is foreseen that they will play an important role in mobile crowd sensing (MCS). Specifically, mobile vehicles equipped with sensors and computing devices are able to collect massive data due to their fast and flexible mobility in MCS systems. In this paper, we consider a mobile vehicle-based MCS system where vehicles owned by different operators or individuals compete against others for limited sensing resources. We investigate the joint task selection and route planning problem for such an MCS system. However, since the structural complexity and computational complexity of the original problem are very high, we propose a multi-population Mean-Field Game (MFG) problem by simplifying the interaction between vehicles as a distribution over their strategy space, known as the mean-field term. To solve the multi-population MFG problem efficiently, we propose a G-prox primal-dual hybrid gradient method (PDHG) algorithm whose computational complexity is independent of the number of vehicles. Numerical results show that the proposed multi-population MFG scheme and algorithm are effective and efficient.

Posted Content
TL;DR: In this article, a neural network approach for solving high-dimensional optimal control problems arising in real-time applications is proposed, which yields controls in a feedback form and can therefore handle uncertainties such as perturbations to the system's state.
Abstract: We propose a neural network approach for solving high-dimensional optimal control problems arising in real-time applications. Our approach yields controls in a feedback form and can therefore handle uncertainties such as perturbations to the system's state. We accomplish this by fusing the Pontryagin Maximum Principle (PMP) and Hamilton-Jacobi-Bellman (HJB) approaches and parameterizing the value function with a neural network. We train our neural network model using the objective function of the control problem and penalty terms that enforce the HJB equations. Therefore, our training algorithm does not involve data generated by another algorithm. By training on a distribution of initial states, we ensure the controls' optimality on a large portion of the state-space. Our grid-free approach scales efficiently to dimensions where grids become impractical or infeasible. We demonstrate the effectiveness of our approach on several multi-agent collision-avoidance problems in up to 150 dimensions. Furthermore, we empirically observe that the number of parameters in our approach scales linearly with the dimension of the control problem, thereby mitigating the curse of dimensionality.

Journal ArticleDOI
TL;DR: This work improves the robustness of Deep Neural Net to adversarial attacks by using an interpolating function as the output activation, and achieves an improvement in robust accuracy by 38.9% for ResNet56 under the strongest IFGSM attack.
Abstract: We improve the robustness of Deep Neural Net (DNN) to adversarial attacks by using an interpolating function as the output activation. This data-dependent activation remarkably improves both the generalization and robustness of DNN. In the CIFAR10 benchmark, we raise the robust accuracy of the adversarially trained ResNet20 from $\sim 46\%$ to $\sim 69\%$ under the state-of-the-art Iterative Fast Gradient Sign Method (IFGSM) based adversarial attack. When we combine this data-dependent activation with total variation minimization on adversarial images and training data augmentation, we achieve an improvement in robust accuracy by $38.9\%$ for ResNet56 under the strongest IFGSM attack. Furthermore, we provide an intuitive explanation of our defense by analyzing the geometry of the feature space.

Journal ArticleDOI
TL;DR: In this paper, a unified framework to study the coevolution of grain boundaries with bulk plasticity is developed, which is based on modeling grain boundaries as continuum dislocations governed by an energy based on the Kobayashi-Warren-Carter model.

Posted Content
TL;DR: Nash fixed point networks (N-FPNs) are a class of implicit-depth neural networks that output Nash equilibria of contextual games, where the context encodes additional information beyond the control of any agent.
Abstract: Systems of interacting agents can often be modeled as contextual games, where the context encodes additional information, beyond the control of any agent (e.g. weather for traffic and fiscal policy for market economies). In such systems, the most likely outcome is given by a Nash equilibrium. In many practical settings, only game equilibria are observed, while the optimal parameters for a game model are unknown. This work introduces Nash Fixed Point Networks (N-FPNs), a class of implicit-depth neural networks that output Nash equilibria of contextual games. The N-FPN architecture fuses data-driven modeling with provided constraints. Given equilibrium observations of a contextual game, N-FPN parameters are learnt to predict equilibria outcomes given only the context. We present an end-to-end training scheme for N-FPNs that is simple and memory efficient to implement with existing autodifferentiation tools. N-FPNs also exploit a novel constraint decoupling scheme to avoid costly projections. Provided numerical examples show the efficacy of N-FPNs on atomic and non-atomic games (e.g. traffic routing).

Posted Content
TL;DR: In this article, a class of mean-field information dynamics based on reaction diffusion equations was formulated and computed using primal-dual hybrid-gradient algorithms, and several numerical examples are provided.
Abstract: We formulate and compute a class of mean-field information dynamics based on reaction diffusion equations. Given a class of nonlinear reaction diffusion and entropy type Lyapunov functionals, we study their gradient flow formulations. We write the "mean-field" metric space formalisms and derive Hamiltonian flows therein. These Hamiltonian flows follow saddle point systems of the proposed mean-field control problems. We apply primal-dual hybrid-gradient algorithms to compute the mean field information dynamics. Several numerical examples are provided.

Journal ArticleDOI
TL;DR: In this paper, the authors demonstrate the rectification properties of tapered-channel thermal diodes relying on asymmetric heat flow brought about by thermal conductivity differences between the liquid and solid phases of suitably selected phase-change materials (PCM).
Abstract: Designing thermal diodes is attracting a considerable amount of interest recently due to the wide range of applications and potentially high impact in the transportation and energy industries. Advances in nanoscale synthesis and characterization are opening new avenues for design using atomic-level tools to take advantage of materials properties in confined volumes. In this paper, we demonstrate using advanced modeling and simulation the rectification properties of tapered-channel thermal diodes relying on asymmetric heat flow brought about by thermal conductivity differences between the liquid and solid phases of suitably selected phase-change materials (PCM). Our prototypical design considers Ga as PCM and anodized alumina as the structural material. First, we use a thresholding scheme to solve a Stefan problem in the device channel to study the interface shape and the hysteresis of the phase transformation when the temperature gradient is switched. We then carry out finite-element simulations to study the effect of several geometric parameters on diode efficiency, such as channel length and aspect ratio. Our analysis establishes physical limits on rectification efficiencies and points to design improvements using several materials to assess the potential of these devices as viable thermal diodes. Finally, we demonstrate the viability of proof-of-concept device fabrication by using a non-conformal atomic layer deposition process in anodic alumina membranes infiltrated with Ga metal.

Posted Content
TL;DR: FMMformers are a class of efficient and flexible transformers, inspired by the fast multipole method (FMM), that decompose attention into near-field and far-field parts, modeled by a banded matrix and a low-rank matrix, respectively.
Abstract: We propose FMMformers, a class of efficient and flexible transformers inspired by the celebrated fast multipole method (FMM) for accelerating interacting particle simulation. FMM decomposes particle-particle interaction into near-field and far-field components and then performs direct and coarse-grained computation, respectively. Similarly, FMMformers decompose the attention into near-field and far-field attention, modeling the near-field attention by a banded matrix and the far-field attention by a low-rank matrix. Computing the attention matrix for FMMformers requires linear complexity in computational time and memory footprint with respect to the sequence length. In contrast, standard transformers suffer from quadratic complexity. We analyze and validate the advantage of FMMformers over the standard transformer on the Long Range Arena and language modeling benchmarks. FMMformers can even outperform the standard transformer in terms of accuracy by a significant margin. For instance, FMMformers achieve an average classification accuracy of $60.74\%$ over the five Long Range Arena tasks, which is significantly better than the standard transformer's average accuracy of $58.70\%$.
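
The near-field plus far-field decomposition described in the abstract above can be sketched without any library. This is an illustrative toy, not the authors' architecture: the banded part is ordinary softmax attention restricted to a window, the low-rank part is assumed to be a kernelized linear attention with the feature map `elu(x) + 1`, and the blending weights `w_near` and `w_far` are hypothetical fixed constants.

```python
import math


def phi(x):
    """Positive feature map elu(x) + 1 (an assumed choice for the far field)."""
    return x + 1.0 if x > 0 else math.exp(x)


def near_field(Q, K, V, bandwidth):
    """Banded softmax attention: token i attends only to |i - j| <= bandwidth."""
    n, d = len(Q), len(Q[0])
    out = []
    for i in range(n):
        window = range(max(0, i - bandwidth), min(n, i + bandwidth + 1))
        scores = [sum(q * k for q, k in zip(Q[i], K[j])) / math.sqrt(d) for j in window]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        norm = sum(weights)
        out.append([sum(w * V[j][c] for w, j in zip(weights, window)) / norm
                    for c in range(len(V[0]))])
    return out


def far_field(Q, K, V):
    """Low-rank attention phi(Q) (phi(K)^T V), row-normalized; rank is at most d."""
    n, d, dv = len(K), len(K[0]), len(V[0])
    feat_k = [[phi(x) for x in row] for row in K]
    # Both summaries are built in a single pass over the sequence.
    kv = [[sum(feat_k[j][a] * V[j][c] for j in range(n)) for c in range(dv)]
          for a in range(d)]
    k_sum = [sum(feat_k[j][a] for j in range(n)) for a in range(d)]
    out = []
    for row in Q:
        fq = [phi(x) for x in row]
        denom = sum(f * s for f, s in zip(fq, k_sum))
        out.append([sum(fq[a] * kv[a][c] for a in range(d)) / denom for c in range(dv)])
    return out


def fmm_attention(Q, K, V, bandwidth=1, w_near=0.5, w_far=0.5):
    near = near_field(Q, K, V, bandwidth)
    far = far_field(Q, K, V)
    return [[w_near * a + w_far * b for a, b in zip(r1, r2)]
            for r1, r2 in zip(near, far)]


# Hypothetical inputs: 6 tokens, 2-dimensional queries/keys, constant values.
Q = [[0.1 * i, -0.2 * i] for i in range(6)]
K = [[0.3 * i, 0.1] for i in range(6)]
V = [[1.0, 2.0] for _ in range(6)]
output = fmm_attention(Q, K, V)
print(output[0])
```

Neither part forms the full n-by-n attention matrix: the banded part touches a fixed-width window per token and the low-rank part reuses two sequence-level summaries, so the cost grows linearly with sequence length for fixed bandwidth and feature dimension.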

Posted Content
TL;DR: In this article, heavy ball neural ordinary differential equations (HBNODEs) are proposed to accelerate both forward and backward ODE solvers, thus reducing the number of function evaluations and improving the utility of the trained models.
Abstract: We propose heavy ball neural ordinary differential equations (HBNODEs), leveraging the continuous limit of the classical momentum accelerated gradient descent, to improve neural ODEs (NODEs) training and inference. HBNODEs have two properties that imply practical advantages over NODEs: (i) The adjoint state of an HBNODE also satisfies an HBNODE, accelerating both forward and backward ODE solvers, thus significantly reducing the number of function evaluations (NFEs) and improving the utility of the trained models. (ii) The spectrum of HBNODEs is well structured, enabling effective learning of long-term dependencies from complex sequential data. We verify the advantages of HBNODEs over NODEs on benchmark tasks, including image classification, learning complex dynamics, and sequential modeling. Our method requires remarkably fewer forward and backward NFEs, is more accurate, and learns long-term dependencies more effectively than the other ODE-based neural network models. Code is available at this https URL.
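
The continuous limit of momentum gradient descent mentioned in the abstract above is commonly written as a second-order ODE, h'' + gamma*h' = f(h), or equivalently as the first-order system h' = m, m' = -gamma*m + f(h). The sketch below assumes that form, uses a hypothetical linear right-hand side in place of a neural network, starts from zero momentum, and integrates with explicit Euler purely for illustration.

```python
def integrate_heavy_ball(f, h0, gamma, t_end, dt=0.01):
    """Explicit Euler for the system h' = m, m' = -gamma*m + f(h), with m(0) = 0."""
    h, m = h0, 0.0
    for _ in range(int(round(t_end / dt))):
        # Both updates use the state from the previous step.
        h, m = h + dt * m, m + dt * (-gamma * m + f(h))
    return h, m


def linear_force(h):
    # Hypothetical stand-in for a learned network: the negative gradient of 0.5*h**2.
    return -h


h_end, m_end = integrate_heavy_ball(linear_force, h0=1.0, gamma=1.0, t_end=10.0)
print(h_end, m_end)
```

With this linear stand-in the system is a damped oscillator, so the state decays toward zero; in a real model `f` would be a trained network and an adaptive ODE solver would replace the fixed-step Euler loop.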

Proceedings Article
06 Dec 2021
TL;DR: FMMformers are a class of efficient and flexible transformers, inspired by the fast multipole method (FMM), that decompose attention into near-field and far-field parts, modeled by a banded matrix and a low-rank matrix, respectively.
Abstract: We propose FMMformers, a class of efficient and flexible transformers inspired by the celebrated fast multipole method (FMM) for accelerating interacting particle simulation. FMM decomposes particle-particle interaction into near-field and far-field components and then performs direct and coarse-grained computation, respectively. Similarly, FMMformers decompose the attention into near-field and far-field attention, modeling the near-field attention by a banded matrix and the far-field attention by a low-rank matrix. Computing the attention matrix for FMMformers requires linear complexity in computational time and memory footprint with respect to the sequence length. In contrast, standard transformers suffer from quadratic complexity. We analyze and validate the advantage of FMMformers over the standard transformer on the Long Range Arena and language modeling benchmarks. FMMformers can even outperform the standard transformer in terms of accuracy by a significant margin. For instance, FMMformers achieve an average classification accuracy of $60.74\%$ over the five Long Range Arena tasks, which is significantly better than the standard transformer's average accuracy of $58.70\%$.

Posted Content
TL;DR: This paper builds a mean-field variational problem in a spatial domain that controls the propagation of the pandemic through an optimal transportation strategy for vaccine distribution, integrating vaccine distribution into a mean-field SIR model.
Abstract: With the invention of the COVID-19 vaccine, shipping and distributing are crucial in controlling the pandemic. In this paper, we build a mean-field variational problem in a spatial domain, which controls the propagation of pandemic by the optimal transportation strategy of vaccine distribution. Here we integrate the vaccine distribution into the mean-field SIR model designed in our previous paper arXiv:2006.01249. Numerical examples demonstrate that the proposed model provides practical strategies in vaccine distribution on a spatial domain.

Posted Content
TL;DR: In this paper, an algorithmic and theoretical framework for improving neural network architecture design via momentum is presented and reviewed, which can improve the architecture design for recurrent neural networks (RNNs), neural ODEs, and transformers.
Abstract: We present and review an algorithmic and theoretical framework for improving neural network architecture design via momentum. As case studies, we consider how momentum can improve the architecture design for recurrent neural networks (RNNs), neural ordinary differential equations (ODEs), and transformers. We show that integrating momentum into neural network architectures has several remarkable theoretical and empirical benefits: 1) integrating momentum into RNNs and neural ODEs can overcome the vanishing gradient issues in training RNNs and neural ODEs, resulting in effective learning of long-term dependencies; 2) momentum in neural ODEs can reduce the stiffness of the ODE dynamics, which significantly enhances the computational efficiency in training and testing; 3) momentum can improve the efficiency and accuracy of transformers.

Proceedings ArticleDOI
14 Jun 2021
TL;DR: This paper formulates opinion evolution in social networks as a high-dimensional stochastic mean field game (MFG) and uses an alternating population and agent control neural network (APAC-Net) to solve the MFG.
Abstract: Belief and opinion evolution in social networks (SNs) can aid in understanding how people influence others’ decisions through social relationships as well as provide a solid foundation for many valuable social applications. As large numbers of users are involved in SNs, the complexity of traditional optimization techniques is high as they deal with the interactions between users separately. Moreover, the state variable (opinion) is high-dimensional because a person usually has opinions about many different social issues. To overcome those challenges, we formulate the opinion evolution in SNs as a high-dimensional stochastic mean field game (MFG). Numerical methods for high-dimensional MFGs are practically non-existent because of the need for grid-based spatial discretization. Thus, we propose a machine-learning based method, where we use an alternating population and agent control neural network (APAC-net), to tractably solve high-dimensional stochastic MFGs. Through APAC-net, solving MFGs can be regarded as a special case of training a generative adversarial network (GAN). To the best of our knowledge, the APAC-Net is the first model that can solve high-dimensional stochastic MFGs. The simulation results affirm the efficiency of the APAC-net.

Posted Content
TL;DR: In this paper, the authors proposed an efficient and flexible algorithm to solve dynamic mean-field planning problems based on an accelerated proximal gradient method, and they theoretically showed that the proposed discrete solution converges to the underlying continuous solution as the grid size increases.
Abstract: In this paper, we propose an efficient and flexible algorithm to solve dynamic mean-field planning problems based on an accelerated proximal gradient method. Besides an easy-to-implement gradient descent step in this algorithm, a crucial projection step becomes solving an elliptic equation whose solution can be obtained by conventional methods efficiently. By induction on iterations used in the algorithm, we theoretically show that the proposed discrete solution converges to the underlying continuous solution as the grid size increases. Furthermore, we generalize our algorithm to mean-field game problems and accelerate it using multilevel and multigrid strategies. We conduct comprehensive numerical experiments to confirm the convergence analysis of the proposed algorithm, to show its efficiency and mass preservation property by comparing it with state-of-the-art methods, and to illustrate its flexibility for handling various mean-field variational problems.

Posted ContentDOI
23 Mar 2021
Abstract: Liquids and solids are two fundamental states of matter. However, due to the lack of direct experimental determination, our understanding of the 3D atomic structure of liquids and amorphous solids remained speculative. Here we advance atomic electron tomography to determine for the first time the 3D atomic positions in monatomic amorphous materials, including a Ta thin film and two Pd nanoparticles. We observe that pentagonal bipyramids are the most abundant atomic motifs in these amorphous materials. Instead of forming icosahedra, the majority of pentagonal bipyramids arrange into a novel medium-range order, named the pentagonal bipyramid network. Molecular dynamics simulations further reveal that pentagonal bipyramid networks are prevalent in monatomic amorphous liquids, which rapidly grow in size and form icosahedra during the quench from the liquid state to the glass state. The experimental method and results are expected to advance the study of the amorphous-crystalline phase transition and glass transition at the single-atom level.