scispace - formally typeset
Search or ask a question
Author

Stanley Osher

Bio: Stanley Osher is an academic researcher from University of California, Los Angeles. The author has contributed to research in topics: Level set method & Hyperbolic partial differential equation. The author has an hindex of 114, co-authored 510 publications receiving 104028 citations. Previous affiliations of Stanley Osher include University of Minnesota & University of Innsbruck.


Papers
More filters
Posted Content
TL;DR: Using a Feynman-Kac formalism principled robust and sparse DNNs, this paper can at least double the channel sparsity of the adversarially trained ResNet20 for CIFAR10 classification and improve the natural accuracy and robust accuracy under the benchmark iterations of IFGSM attack.
Abstract: Deep neural nets (DNNs) compression is crucial for adaptation to mobile devices. Though many successful algorithms exist to compress naturally trained DNNs, developing efficient and stable compression algorithms for robustly trained DNNs remains widely open. In this paper, we focus on a co-design of efficient DNN compression algorithms and sparse neural architectures for robust and accurate deep learning. Such a co-design enables us to advance the goal of accommodating both sparsity and robustness. With this objective in mind, we leverage the relaxed augmented Lagrangian based algorithms to prune the weights of adversarially trained DNNs, at both structured and unstructured levels. Using a Feynman-Kac formalism principled robust and sparse DNNs, we can at least double the channel sparsity of the adversarially trained ResNet20 for CIFAR10 classification, meanwhile, improve the natural accuracy by $8.69$\% and the robust accuracy under the benchmark $20$ iterations of IFGSM attack by $5.42$\%. The code is available at \url{this https URL}.

6 citations

Posted Content
TL;DR: This work considers the reinforcement learning problem of training multiple agents in order to maximize a shared reward, and shows that one can obtain decentralized solutions to a multi-agent problem through imitation learning.
Abstract: We consider the reinforcement learning problem of training multiple agents in order to maximize a shared reward. In this multi-agent system, each agent seeks to maximize the reward while interacting with other agents, and they may or may not be able to communicate. Typically the agents do not have access to other agent policies and thus each agent observes a non-stationary and partially-observable environment. In order to obtain multi-agents that act in a decentralized manner, we introduce a novel algorithm under the framework of centralized learning, but decentralized execution. This training framework first obtains solutions to a multi-agent problem with a single centralized joint-space learner. This centralized expert is then used to guide imitation learning for independent decentralized multi-agents. This framework has the flexibility to use any reinforcement learning algorithm to obtain the expert as well as any imitation learning algorithm to obtain the decentralized agents. This is in contrast to other multi-agent learning algorithms that, for example, can require more specific structures. We present some theoretical error bounds for our method, and we show that one can obtain decentralized solutions to a multi-agent problem through imitation learning.

6 citations

Posted Content
TL;DR: This article proposed Jacobian-free backpropagation (JFB), a fixed-memory approach that circumvents the need to solve Jacobianbased equations, which makes implicit networks faster to train and significantly easier to implement without sacrificing test accuracy.
Abstract: A promising trend in deep learning replaces traditional feedforward networks with implicit networks. Unlike traditional networks, implicit networks solve a fixed point equation to compute inferences. Solving for the fixed point varies in complexity, depending on provided data and an error tolerance. Importantly, implicit networks may be trained with fixed memory costs in stark contrast to feedforward networks, whose memory requirements scale linearly with depth. However, there is no free lunch -- backpropagation through implicit networks often requires solving a costly Jacobian-based equation arising from the implicit function theorem. We propose Jacobian-Free Backpropagation (JFB), a fixed-memory approach that circumvents the need to solve Jacobian-based equations. JFB makes implicit networks faster to train and significantly easier to implement, without sacrificing test accuracy. Our experiments show implicit networks trained with JFB are competitive with feedforward networks and prior implicit networks given the same number of parameters.

6 citations

Journal ArticleDOI
TL;DR: In this article, a multiscale representation (MSR) for shapes via level set motions and PDEs is introduced, and a surface inpainting algorithm is designed to recover three-dimensional geometry of blood vessels.
Abstract: In this paper, we will first introduce a novel multiscale representation (MSR) for shapes via level set motions and PDEs. Based on the MSR, we will then design a surface inpainting algorithm to recover three-dimensional geometry of blood vessels. Because of the nature of irregular morphology in vessels and organs, both phantom and real inpainting scenarios were tested using our new algorithm. Successful vessel recoveries are demonstrated with numerical estimation of the degree of arteriosclerosis and vessel occlusion.

6 citations

Proceedings ArticleDOI
14 May 2008
TL;DR: An asymmetric version of a recently proposed unbiased registration method is investigated, using mutual information as the matching criterion, and it is demonstrated that the unbiased methods have higher reproducibility and less likely to detect changes in the absence of any real physiological change.
Abstract: This paper examines the power of different nonrigid registration models to detect changes in tensor based morphometry (TBM), and their stability when no real changes are present. Specifically, we investigate an asymmetric version of a recently proposed unbiased registration method, using mutual information as the matching criterion. We compare matching functionals (sum of squared differences and mutual information), as well as large deformation registration schemes (viscous fluid registration versus symmetric and asymmetric unbiased registration) for detecting changes in serial MRI scans of 10 elderly normal subjects and 10 patients with Alzheimer's disease scanned at 2-week and 1-year intervals. We demonstrated that the unbiased methods, both symmetric and asymmetric, have higher reproducibility. The unbiased methods were also less likely to detect changes in the absence of any real physiological change. Moreover, they measured biological deformations more accurately by penalizing bias in the corresponding statistical maps.

6 citations


Cited by
More filters
Proceedings ArticleDOI
07 Jun 2015
TL;DR: Inception as mentioned in this paper is a deep convolutional neural network architecture that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).
Abstract: We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14). The main hallmark of this architecture is the improved utilization of the computing resources inside the network. By a carefully crafted design, we increased the depth and width of the network while keeping the computational budget constant. To optimize quality, the architectural decisions were based on the Hebbian principle and the intuition of multi-scale processing. One particular incarnation used in our submission for ILSVRC14 is called GoogLeNet, a 22 layers deep network, the quality of which is assessed in the context of classification and detection.

40,257 citations

Journal ArticleDOI

[...]

08 Dec 2001-BMJ
TL;DR: There is, I think, something ethereal about i —the square root of minus one, which seems an odd beast at that time—an intruder hovering on the edge of reality.
Abstract: There is, I think, something ethereal about i —the square root of minus one. I remember first hearing about it at school. It seemed an odd beast at that time—an intruder hovering on the edge of reality. Usually familiarity dulls this sense of the bizarre, but in the case of i it was the reverse: over the years the sense of its surreal nature intensified. It seemed that it was impossible to write mathematics that described the real world in …

33,785 citations

01 May 1993
TL;DR: Comparing the results to the fastest reported vectorized Cray Y-MP and C90 algorithm shows that the current generation of parallel machines is competitive with conventional vector supercomputers even for small problems.
Abstract: Three parallel algorithms for classical molecular dynamics are presented. The first assigns each processor a fixed subset of atoms; the second assigns each a fixed subset of inter-atomic forces to compute; the third assigns each a fixed spatial region. The algorithms are suitable for molecular dynamics models which can be difficult to parallelize efficiently—those with short-range forces where the neighbors of each atom change rapidly. They can be implemented on any distributed-memory parallel machine which allows for message-passing of data between independently executing processors. The algorithms are tested on a standard Lennard-Jones benchmark problem for system sizes ranging from 500 to 100,000,000 atoms on several parallel supercomputers--the nCUBE 2, Intel iPSC/860 and Paragon, and Cray T3D. Comparing the results to the fastest reported vectorized Cray Y-MP and C90 algorithm shows that the current generation of parallel machines is competitive with conventional vector supercomputers even for small problems. For large problems, the spatial algorithm achieves parallel efficiencies of 90% and a 1840-node Intel Paragon performs up to 165 faster than a single Cray C9O processor. Trade-offs between the three algorithms and guidelines for adapting them to more complex molecular dynamics simulations are also discussed.

29,323 citations

Book
23 May 2011
TL;DR: It is argued that the alternating direction method of multipliers is well suited to distributed convex optimization, and in particular to large-scale problems arising in statistics, machine learning, and related areas.
Abstract: Many problems of recent interest in statistics and machine learning can be posed in the framework of convex optimization. Due to the explosion in size and complexity of modern datasets, it is increasingly important to be able to solve problems with a very large number of features or training examples. As a result, both the decentralized collection or storage of these datasets as well as accompanying distributed solution methods are either necessary or at least highly desirable. In this review, we argue that the alternating direction method of multipliers is well suited to distributed convex optimization, and in particular to large-scale problems arising in statistics, machine learning, and related areas. The method was developed in the 1970s, with roots in the 1950s, and is equivalent or closely related to many other algorithms, such as dual decomposition, the method of multipliers, Douglas–Rachford splitting, Spingarn's method of partial inverses, Dykstra's alternating projections, Bregman iterative algorithms for l1 problems, proximal methods, and others. After briefly surveying the theory and history of the algorithm, we discuss applications to a wide variety of statistical and machine learning problems of recent interest, including the lasso, sparse logistic regression, basis pursuit, covariance selection, support vector machines, and many others. We also discuss general distributed optimization, extensions to the nonconvex setting, and efficient implementation, including some details on distributed MPI and Hadoop MapReduce implementations.

17,433 citations