Author

Lingjiong Zhu

Bio: Lingjiong Zhu is an academic researcher from Florida State University. The author has contributed to research in topics: Large deviations theory & Central limit theorem. The author has an h-index of 19 and has co-authored 130 publications receiving 1,191 citations. Previous affiliations of Lingjiong Zhu include University of Minnesota & Courant Institute of Mathematical Sciences.


Papers
Journal ArticleDOI
TL;DR: In this article, a functional central limit theorem is obtained for the nonlinear Hawkes process, together with a functional law of the iterated logarithm (Strassen's invariance principle) under the same assumptions.
Abstract: The Hawkes process is a self-exciting point process with a clustering effect whose intensity depends on its entire past history. It has wide applications in neuroscience, finance, and many other fields. In this paper we obtain a functional central limit theorem for the nonlinear Hawkes process. Under the same assumptions, we also obtain Strassen's invariance principle, i.e. a functional law of the iterated logarithm.
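For background, the nonlinear Hawkes process is usually specified through its intensity. A standard formulation (notation assumed here, not quoted from the paper) is

```latex
\lambda_t \;=\; \psi\!\left(\int_{(-\infty,\,t)} h(t-s)\,dN_s\right),
```

where h >= 0 is the exciting function, psi is a nonnegative rate function, and N is the point process itself; the linear choice psi(x) = nu + x recovers the classical Hawkes process.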

92 citations

Journal ArticleDOI
TL;DR: This paper first proves a large deviation principle for a special class of nonlinear Hawkes processes, namely Markovian Hawkes processes with nonlinear rate and exponential exciting function, and then generalizes the result to sums of exponentials as exciting functions.
Abstract: The Hawkes process is a simple point process that is self-exciting and has a clustering effect. The intensity of this point process depends on its entire past history. It has wide applications in finance, neuroscience and many other fields. In this paper, we study large deviations for nonlinear Hawkes processes. Large deviations for linear Hawkes processes have been studied by Bordenave and Torrisi. We first prove a large deviation principle for a special class of nonlinear Hawkes processes, that is, a Markovian Hawkes process with nonlinear rate and exponential exciting function, and then generalize it to obtain the result for a sum of exponentials as the exciting function. We then provide an alternative proof of the large deviation principle for the linear Hawkes process. Finally, we use an approximation approach to prove the large deviation principle for a special class of nonlinear Hawkes processes with general exciting functions.
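The exponential exciting function h(t) = alpha e^{-beta t} is singled out because it makes the dynamics Markovian. A standard sketch of this reduction (notation assumed, not quoted from the paper):

```latex
Z_t = \int_0^t \alpha e^{-\beta(t-s)}\,dN_s
\quad\Longrightarrow\quad
dZ_t = -\beta Z_t\,dt + \alpha\,dN_t,
\qquad \lambda_t = \psi(Z_t),
```

so (Z_t) is a Markov process and Markovian large-deviations techniques apply; a sum of exponentials yields a finite-dimensional Markovian system in the same way.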

70 citations

Posted Content
TL;DR: In this article, the authors survey the known results about the theory and applications of both linear and nonlinear Hawkes processes and provide an alternative variational formula for the rate function of the level-1 large deviations in the Markovian case.
Abstract: The Hawkes process is a simple point process that has long memory, a clustering effect and a self-exciting property, and is in general non-Markovian. The future evolution of a self-exciting point process is influenced by the timing of past events. There are applications in finance, neuroscience, genome analysis, seismology, sociology, criminology and many other fields. We first survey the known results about the theory and applications of both linear and nonlinear Hawkes processes. Then, we obtain the central limit theorem and process-level, i.e. level-3, large deviations for nonlinear Hawkes processes. The level-1 large deviation principle holds as a result of the contraction principle. We also provide an alternative variational formula for the rate function of the level-1 large deviations in the Markovian case. Next, we drop the usual assumptions on the nonlinear Hawkes process and categorize it into different regimes: sublinear, sub-critical, critical, super-critical and explosive. We show the different time asymptotics in the different regimes and obtain other properties as well. Finally, we study the limit theorems of linear Hawkes processes with random marks.
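In the linear case psi(x) = nu + x, this classification reduces to the familiar criticality condition on the exciting function (standard background, not the paper's exact statement):

```latex
\|h\|_{L^1} = \int_0^\infty h(t)\,dt
\;\begin{cases}
< 1 & \text{subcritical: } N_t/t \to \nu/(1-\|h\|_{L^1}) \text{ a.s.},\\
= 1 & \text{critical},\\
> 1 & \text{supercritical: } N_t \text{ grows exponentially fast}.
\end{cases}
```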

63 citations

Journal ArticleDOI
TL;DR: In this paper, a stochastic process is proposed that is a Cox-Ingersoll-Ross process with Hawkes jumps, generalizing both the classical Cox-Ingersoll-Ross process and the classical Hawkes process with exponential exciting function.
Abstract: In this paper we propose a stochastic process, which is a Cox-Ingersoll-Ross process with Hawkes jumps. It can be seen as a generalization of the classical Cox-Ingersoll-Ross process and the classical Hawkes process with exponential exciting function. Our model is a special case of the affine point processes. We obtain Laplace transforms and limit theorems, including the law of large numbers, central limit theorems, and large deviations.
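A plausible form of such dynamics (notation assumed here rather than taken from the paper) makes the process mean-revert like CIR while each jump of the point process excites its own intensity:

```latex
d\lambda_t = \beta(\alpha - \lambda_t)\,dt + \sigma\sqrt{\lambda_t}\,dW_t + a\,dN_t,
```

where N is a simple point process with intensity lambda_t, W is a standard Brownian motion, and a >= 0. Setting a = 0 recovers the classical CIR process, while sigma = 0 gives a Markovian Hawkes process with exponential exciting function.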

59 citations

Posted Content
TL;DR: Two variants of SGHMC based on two alternative discretizations of the underdamped Langevin diffusion are studied, and results show that acceleration with momentum is possible in the context of global non-convex optimization.
Abstract: Stochastic gradient Hamiltonian Monte Carlo (SGHMC) is a variant of stochastic gradient descent with momentum in which a controlled and properly scaled Gaussian noise is added to the stochastic gradients to steer the iterates towards a global minimum. Many works have reported its empirical success for solving stochastic non-convex optimization problems; in particular, it has been observed to outperform overdamped Langevin Monte Carlo-based methods such as stochastic gradient Langevin dynamics (SGLD) in many applications. Although the asymptotic global convergence properties of SGHMC are well known, its finite-time performance is not well understood. In this work, we study two variants of SGHMC based on two alternative discretizations of the underdamped Langevin diffusion. We provide finite-time performance bounds, with explicit constants, for the global convergence of both SGHMC variants on stochastic non-convex optimization problems. Our results lead to non-asymptotic guarantees for both population and empirical risk minimization problems. For a fixed target accuracy level, on a class of non-convex problems, we obtain complexity bounds for SGHMC that can be tighter than those for SGLD. These results show that acceleration with momentum is possible in the context of global non-convex optimization.
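A minimal sketch of a generic SGHMC-style update, i.e. an Euler-type discretization of the underdamped Langevin diffusion dV = -gamma V dt - grad f(X) dt + sqrt(2 gamma / beta) dB, dX = V dt. The step size, friction, inverse temperature and toy objective below are illustrative assumptions, not the paper's exact variants:

```python
import numpy as np

def sghmc(grad_est, x0, eta=1e-3, gamma=1.0, beta=1.0, n_iter=10_000, seed=0):
    """Generic SGHMC-style iteration (assumed discretization, not the
    paper's exact scheme). `grad_est(x, rng)` returns a stochastic
    estimate of the gradient of the objective f at x."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)                       # momentum variable
    noise = np.sqrt(2.0 * gamma * eta / beta)  # scale of injected Gaussian noise
    for _ in range(n_iter):
        v = v - eta * gamma * v - eta * grad_est(x, rng) \
            + noise * rng.standard_normal(x.shape)
        x = x + eta * v
    return x

# Toy non-convex objective f(x) = (x^2 - 1)^2 with noisy gradients.
grad = lambda x, rng: 4.0 * x * (x**2 - 1.0) + 0.1 * rng.standard_normal(x.shape)
print(sghmc(grad, x0=[2.0]))  # should settle near one of the minima x = +/- 1
```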

57 citations


Cited by
Christopher M. Bishop
01 Jan 2006
TL;DR: This book covers probability distributions, linear models for regression and classification, neural networks, kernel methods, graphical models, mixture models and EM, approximate inference, sampling methods, and combining models.
Abstract: Contents: Probability Distributions; Linear Models for Regression; Linear Models for Classification; Neural Networks; Kernel Methods; Sparse Kernel Machines; Graphical Models; Mixture Models and EM; Approximate Inference; Sampling Methods; Continuous Latent Variables; Sequential Data; Combining Models.

10,141 citations

Book ChapterDOI
01 Jan 2011
TL;DR: Weak convergence methods in metric spaces are developed in this book, with applications sufficient to show their power and utility; the results of the first three chapters are used in Chapter 4 to derive a variety of limit theorems for dependent sequences of random variables.
Abstract: The author's preface gives an outline: "This book is about weak convergence methods in metric spaces, with applications sufficient to show their power and utility. The Introduction motivates the definitions and indicates how the theory will yield solutions to problems arising outside it. Chapter 1 sets out the basic general theorems, which are then specialized in Chapter 2 to the space C[0, 1] of continuous functions on the unit interval and in Chapter 3 to the space D[0, 1] of functions with discontinuities of the first kind. The results of the first three chapters are used in Chapter 4 to derive a variety of limit theorems for dependent sequences of random variables." The book develops and expands on Donsker's 1951 and 1952 papers on the invariance principle and empirical distributions. The basic random variables remain real-valued although, of course, measures on C[0, 1] and D[0, 1] are vitally used. Within this framework, there are various possibilities for a different and apparently better treatment of the material. More of the general theory of weak convergence of probabilities on separable metric spaces would be useful. Metrizability of the convergence is not brought up until late in the Appendix. The close relation of the Prokhorov metric and a metric for convergence in probability is (hence) not mentioned (see V. Strassen, Ann. Math. Statist. 36 (1965), 423-439; the reviewer, ibid. 39 (1968), 1563-1572). This relation would illuminate and organize such results as Theorems 4.1, 4.2 and 4.4, which give isolated, ad hoc connections between weak convergence of measures and nearness in probability. In the middle of p. 16, it should be noted that C*(S) consists of signed measures which need only be finitely additive if S is not compact. On p. 239, where the author twice speaks of separable subsets having nonmeasurable cardinal, he means "discrete" rather than "separable." Theorem 1.4 is Ulam's theorem that a Borel probability on a complete separable metric space is tight. Theorem 1 of Appendix 3 weakens completeness to topological completeness. After mentioning that probabilities on the rationals are tight, the author says it is an
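The prototypical result in this framework is Donsker's invariance principle: for i.i.d. random variables X_1, X_2, ... with mean zero and finite positive variance sigma^2, the rescaled partial-sum process converges weakly to Brownian motion (standard textbook statement, stated from memory rather than quoted from the book):

```latex
W_n(t) \;=\; \frac{1}{\sigma\sqrt{n}} \sum_{k=1}^{\lfloor nt \rfloor} X_k,
\qquad t \in [0,1],
\qquad\text{and}\qquad
W_n \Rightarrow W \ \text{in } D[0,1],
```

where W is a standard Brownian motion and the arrow denotes weak convergence.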

3,554 citations

Book ChapterDOI
01 Jan 1998
TL;DR: In this chapter, the authors explore questions of existence and uniqueness for solutions to stochastic differential equations and offer a study of their properties, using diffusion processes as a model of a Markov process with continuous sample paths.
Abstract: We explore in this chapter questions of existence and uniqueness for solutions to stochastic differential equations and offer a study of their properties. This endeavor is really a study of diffusion processes. Loosely speaking, the term diffusion is attributed to a Markov process which has continuous sample paths and can be characterized in terms of its infinitesimal generator.
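In common notation (one-dimensional case shown; standard background rather than the chapter's own statement), a diffusion solves the stochastic differential equation below, and its infinitesimal generator acts on smooth functions f as shown:

```latex
dX_t = b(X_t)\,dt + \sigma(X_t)\,dW_t,
\qquad
(\mathcal{A}f)(x) = b(x)\,f'(x) + \tfrac{1}{2}\,\sigma^2(x)\,f''(x).
```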

2,446 citations

01 Jan 2016
An Introduction to the Theory of Point Processes

903 citations

10 Dec 2016
TL;DR: The authors developed Doctor AI, a generic predictive model that covers observed medical conditions and medication uses, using recurrent neural networks (RNNs) applied to longitudinal time-stamped EHR data from 260K patients over 8 years.
Abstract: Leveraging large historical data in electronic health records (EHR), we developed Doctor AI, a generic predictive model that covers observed medical conditions and medication uses. Doctor AI is a temporal model using recurrent neural networks (RNNs) and was developed and applied to longitudinal time-stamped EHR data from 260K patients over 8 years. Encounter records (e.g. diagnosis codes, medication codes or procedure codes) were input to the RNN to predict (all) the diagnosis and medication categories for a subsequent visit. Doctor AI assesses the history of patients to make multilabel predictions (one label for each diagnosis or medication category). Based on separate blind test set evaluation, Doctor AI can perform differential diagnosis with up to 79% recall@30, significantly higher than several baselines. Moreover, we demonstrate the generalizability of Doctor AI by adapting the resulting models from one institution to another without losing substantial accuracy.
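The described architecture maps naturally to a small recurrent model. Below is a minimal hypothetical sketch in the same spirit (multi-hot visit vectors into a GRU, multilabel logits out); the vocabulary size, dimensions, layer choices and loss are illustrative assumptions, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class NextVisitRNN(nn.Module):
    """Hypothetical Doctor AI-style model: embed the multi-hot vector of
    codes at each visit, run a GRU over the visit sequence, and emit
    multilabel logits for the codes of the following visit."""
    def __init__(self, n_codes=2000, emb_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Linear(n_codes, emb_dim)    # multi-hot visit -> dense vector
        self.rnn = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, n_codes)  # one logit per code (multilabel)

    def forward(self, visits):                      # visits: (batch, n_visits, n_codes)
        h, _ = self.rnn(torch.relu(self.embed(visits)))
        return self.head(h)                         # per-step logits for the next visit

model = NextVisitRNN()
visits = torch.bernoulli(torch.full((4, 10, 2000), 0.01))  # fake multi-hot histories
logits = model(visits[:, :-1])                # predict visit t+1 from visits up to t
loss = nn.functional.binary_cross_entropy_with_logits(logits, visits[:, 1:])
loss.backward()
```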

714 citations