Showing papers on "Information geometry" published in 1995


Journal ArticleDOI
TL;DR: A unified information-geometrical framework is presented for studying stochastic models of neural networks, focusing on the EM and em algorithms, and a condition that guarantees their equivalence is proved.

339 citations


Book
01 Oct 1995
TL;DR: In this article, it is shown that a regular abnormal extremal can always be put in a special form by a suitable change of coordinates, and an inequality is established showing that, once a trajectory is in this special form, local optimality follows.
Abstract: We study length-minimizing arcs in sub-Riemannian manifolds (M, E, G) whose metric G is defined on a rank-two bracket-generating distribution E. It is well known that all length-minimizing arcs are extremals, and that these extremals are either "normal" or "abnormal." Normal extremals are locally optimal, in the sense that every sufficiently short piece of such an extremal is a minimizer. The question whether every length-minimizer is a normal extremal remained open for several years, and was recently settled by R. Montgomery, who exhibited a counterexample. But Montgomery's geometric optimality proof depends heavily on special properties of his example and still leaves open the question whether abnormal minimizers are an exceptional phenomenon or a common occurrence. We present an analytic technique for proving local optimality of a large class of abnormal extremals that we call "regular." Our technique is based on (a) a "normal form theorem," stating that, locally, a regular abnormal extremal can always be put in a special form by a suitable change of coordinates, and (b) an inequality showing that, once a trajectory is in this special form, then local optimality follows. Using this approach we prove that regular abnormal extremals are locally optimal. If E satisfies a mild additional restriction (valid in particular for all regular 2-dimensional distributions and for generic 2-dimensional distributions), then regular abnormal extremals are "typical" (in a sense made precise in the text), so our result implies that the abnormal minimizers are ubiquitous rather than exceptional. We also discuss some related issues, and in particular show, by means of an example, that a smooth abnormal extremal need not be locally optimal, even if in addition it belongs to the class, recently studied by Bryant and Hsu, of C^1-rigid curves. Keywords: Sub-Riemannian manifolds, Geodesics, Hamiltonians, Abnormal Extremals.
1 Introduction
The structure of sub-Riemannian minimizers and of the corresponding "geodesics" has recently attracted a great deal of attention (cf. [1], [5], [6], [7], [9], [10], [11], [12], [14], [17], [18], [19], [22], [23], [27], [28]; see Remark 1 below for a discussion of the use of the word "geodesic"), due to the delicate issues that arise because of the possibility of the existence of "abnormal" length-minimizing arcs. This phenomenon, well known in Optimal Control Theory, was not immediately recognized as possible in the more special case of sub-Riemannian geometry. For example, in 1986 it was stated, in [22], that all length-minimizing arcs for a sub-Riemannian manifold are characteristics of the associated Hamiltonian (i.e. "normal extremals," in our terminology). A proof was suggested for this result, relying on an application of the Pontryagin Maximum Principle from Optimal Control Theory. It turns out, however, that the Maximum Principle only makes it possible to draw the weaker conclusion that every minimizer is either a characteristic of the Hamiltonian (i.e. a normal extremal) or a member of another class of arcs known as "abnormal extremals." The possibility that a minimizer might be an abnormal extremal can easily be ruled out in the Riemannian case and, more generally, for the special class (introduced by R. Strichartz in [23]) of sub-Riemannian metrics defined on "strongly bracket-generating" distributions, but for general sub-Riemannian metrics there is no obvious way to go beyond the necessary conditions of the Maximum Principle and exclude abnormal extremals. This left open the question whether there can exist sub-Riemannian minimizers that are not normal extremals ("strictly abnormal minimizers," in the terminology introduced below).
The suggestion that this could indeed happen had in fact been made much earlier, in 1977, by B. Gaveau, who had studied a particular sub-Riemannian structure for which he asserted that there were pairs of points p1, p2 that could not be joined by a characteristic curve ([9], p. 133, Theorem 1). Since, for the system studied by Gaveau, the existence of a minimizer joining any two points is obvious, his result would have amounted to a non-constructive proof of the existence of a strictly abnormal minimizer. However, contrary to the statement of [9], the points considered by Gaveau can be joined by a characteristic curve, as was pointed out by R. Brockett in 1984 (cf. [6]). (Brockett established this by explicitly computing the minimizers and showing that they were characteristics. A brief discussion of the Gaveau-Brockett system is included below in Appendix A, where we provide an alternative proof of Brockett's result, by directly applying the Pontryagin Maximum Principle to show that all minimizers are characteristic curves.) Meanwhile, the claim that all minimizers are normal extremals was made again in 1988 in [1]. Subsequently, relying on this assertion, Hamenstädt stated in 1990, in [12], that "the critical points are geodesics, i.e. locally minimizing curves," and "every geodesic is a critical point; together this gives a complete description of the geodesics." (Hamenstädt's "critical points" are exactly the same as our "normal extremals.") However, the claim of [1] is also incorrect, and the "complete description of the geodesics" announced in [12] is in fact incomplete, because it omits a rather important and interesting class of minimizers that are rather different from the normal extremals.
Wensheng Liu and Héctor J. Sussmann. Work supported in part by the National Science Foundation under NSF grant DMS92-02554. Received by the editor November 1993, and in revised form February 1994.
Correct answers to these questions are in fact suggested in a rather natural way by Optimal Control Theory, which gives necessary conditions for trajectories of fairly general control systems to minimize functionals of a rather general type. The sub-Riemannian minimization problem just happens to be a special case of the much broader class of situations where the Maximum Principle of Optimal Control Theory applies. (This fact was recognized by Strichartz in [22], even though his result failed to include the abnormal extremals.) When the Maximum Principle is in fact applied to this case, the possibility that some minimizers might be "abnormal" emerges directly by mechanically writing the necessary conditions. This, of course, does not yet answer the question whether abnormal minimizers actually exist, but it provides a radically different perspective on the problem: whereas a differential geometer's natural inclination is to look for the true characterization of minimizers in the form of a modified version of the geodesic equation, based on making variations as in the classical derivation of the Euler-Lagrange equations, the control theorist's predisposition is to start from the opposite direction, applying the Maximum Principle to conclude that a minimizer is either a "normal extremal" or an "abnormal" one, inferring from this that abnormal extremal minimizers probably exist, and then setting out to prove that they do. The problem of the existence of strictly abnormal minimizers was finally settled in 1991, when R. Montgomery, in [17], produced an example of such a minimizer, thereby showing that the intuition derived from the control-theoretic point of view was in fact the right one, even though Montgomery himself was led to his discovery by physical rather than control-theoretic considerations. Montgomery's very long and ingenious optimality proof was somewhat simplified in 1992 by I. Kupka (cf. [14]).
However, neither of these proofs makes it possible to go beyond individual examples and prove, for instance, that large classes of abnormal extremals are optimal. In [16], we studied an example for which the optimality proof was much simpler. Moreover, this example had the extra virtue of being such that, after suitable changes of coordinates, one can transform fairly general situations into normal forms to which the proof of [16] applies. This was made precise in the preprint [25], where a general "optimality lemma" was announced, which basically describes the most general situation where a method similar to that of [16] can be applied to prove optimality. Using this optimality lemma plus a coordinate transformation, it was proved in [25] that, for a completely arbitrary four-dimensional sub-Riemannian manifold whose metric is defined on a regular two-dimensional distribution, there passes through each point exactly one locally simple abnormal extremal parametrized by arc-length, and all these abnormal extremals are locally optimal. (Here "simple" means "without double points.") It was also shown that, if an extra generic condition is satisfied, then these abnormal extremals are not normal. It then became apparent that abnormal minimizers are not pathological objects that can only be shown to exist in very special and elaborately constructed examples. They are in fact ubiquitous and easy to find (at least for generic cases), and can be proved optimal by relatively simple general techniques. The purpose of this work is to present the general theory of regular abnormal extremals for manifolds M of arbitrary dimensions equipped with sub-Riemannian structures (E, G) arising from two-dimensional distributions, and to prove in particular that these abnormal extremals are locally optimal.
(The fact that normal extremals are locally optimal is an immediate consequence of the Control Theory version of classical Hamilton-Jacobi theory, which gives sufficient conditions for optimality of "fields of extremals," as explained, for example, in [15]. An independent derivation was given by Hamenstädt in [12]. Since the proof of [12] is based on a formalism quite different from ours, and the control-theoretic proof is in our view a rather transparent illustration of the power of Optimal Control methods and the advantages of the Hamiltonia

228 citations


Journal ArticleDOI
TL;DR: This survey is devoted to singular curves in sub-Riemannian geometry, which can be length-minimizing geodesics independent of the choice of inner product.
Abstract: Sub-Riemannian geometry is the geometry of a distribution of k-planes on an n-dimensional manifold with a smoothly varying inner product on the k-planes. Singular curves are singularities of the space of paths tangent to the distribution and joining two fixed points. This survey is devoted to the singular curves, which can be length-minimizing geodesics, independent of the choice of inner product.

90 citations


01 Jan 1995
TL;DR: The extension of information divergence to positive normalisable measures reveals a remarkable relation between the δ-dual affine geometry of statistical manifolds and the geometry of the dual pair of Banach spaces Ld and Ldd, which offers conceptual simplification to information geometry.
Abstract: Neural networks can be regarded as statistical models, and can be analysed in a Bayesian framework. Generalisation is measured by the performance on independent test data drawn from the same distribution as the training data. Such performance can be quantified by the posterior average of the information divergence between the true and the model distributions. Averaging over the Bayesian posterior guarantees internal coherence; using information divergence guarantees invariance with respect to representation. The theory generalises the least mean squares theory for linear Gaussian models to general problems of statistical estimation. The main results are: (1) the ideal optimal estimate is always given by average over the posterior; (2) the optimal estimate within a computational model is given by the projection of the ideal estimate to the model. This incidentally shows some currently popular methods dealing with hyperpriors are in general unnecessary and misleading. The extension of information divergence to positive normalisable measures reveals a remarkable relation between the δ-dual affine geometry of statistical manifolds and the geometry of the dual pair of Banach spaces Ld and Ldd. It therefore offers conceptual simplification to information geometry. The general conclusion on the issue of evaluating neural network learning rules and other statistical inference methods is that such evaluations are only meaningful under three assumptions: the prior P(p), describing the environment of all the problems; the divergence Dd, specifying the requirement of the task; and the model Q, specifying available computing resources.
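Result (1) in the abstract above can be illustrated numerically with a minimal sketch. It assumes the Kullback-Leibler divergence KL(p || q) as one particular instance of the information divergence (the paper itself works with a more general family), and the candidate distributions and posterior weights are hypothetical values chosen only for illustration. Under that assumption, the estimate q minimising the posterior average of KL(p || q) is the posterior mean of p.

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def expected_divergence(candidates, weights, q):
    """Posterior average of KL(p || q) over candidate 'true' distributions p."""
    return sum(w * kl(p, q) for p, w in zip(candidates, weights))

def posterior_mean(candidates, weights):
    """Component-wise posterior average of the candidate distributions."""
    n = len(candidates[0])
    return [sum(w * p[i] for p, w in zip(candidates, weights)) for i in range(n)]

# Hypothetical posterior over three candidate distributions on three outcomes.
candidates = [[0.7, 0.2, 0.1], [0.3, 0.4, 0.3], [0.1, 0.1, 0.8]]
weights = [0.5, 0.3, 0.2]

q_mean = posterior_mean(candidates, weights)  # the posterior average
q_map = candidates[0]                         # the single most probable candidate

loss_mean = expected_divergence(candidates, weights, q_mean)
loss_map = expected_divergence(candidates, weights, q_map)
print(loss_mean, loss_map)
```

In this toy setting the posterior mean achieves a lower average divergence than the single most probable candidate, which is consistent with the abstract's claim for this choice of divergence.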

73 citations


Journal ArticleDOI
TL;DR: In this article, an intrinsic version of the Cramer-Rao lower bound is obtained, which depends on the intrinsic bias and the curvature of the statistical model, for the mean square of the Rao distance.
Abstract: The parametric statistical models with suitable regularity conditions have a natural Riemannian manifold structure, given by the information metric. Since the parameters are merely labels for the probability measures, an inferential statement should be formulated through intrinsic objects, invariant under reparametrizations. In this context the estimators will be random objects valued on the manifold corresponding to the statistical model. In spite of these considerations, classical measures of an estimator's performance, like the bias and the mean square error, are clearly dependent on the statistical model parametrizations. In this paper the authors work with extended notions of mean value and moments of random objects which take values on a Hausdorff and connected manifold, equipped with an affine connection. In particular, the Riemannian manifold case is considered. This extension is applied to the bias and the mean square error study in statistical point estimation theory. Under this approach an intrinsic version of the Cramér-Rao lower bound is obtained: a lower bound, which depends on the intrinsic bias and the curvature of the statistical model, for the mean square of the Rao distance, the invariant measure analogous to the mean square error. Further, the behavior of the mean square of the Rao distance of an estimator when conditioning with respect to a sufficient statistic is considered, obtaining intrinsic versions of the Rao-Blackwell and Lehmann-Scheffé theorems. Asymptotic properties complete the study.
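The invariance of the Rao distance under reparametrization, which the abstract above relies on, can be checked in a small example. The sketch below is not taken from the paper: it assumes the one-parameter Bernoulli model, whose Fisher information in the mean parameter p is 1/(p(1-p)), giving the closed-form Rao distance 2|arcsin(√p) - arcsin(√q)|. The same distance is then recomputed by numerically integrating the Fisher line element in the logit parametrization, where the Fisher information is p(1-p). The function names are illustrative.

```python
import math

def rao_distance_bernoulli(p, q):
    """Closed-form Rao (Fisher geodesic) distance between Bernoulli(p) and Bernoulli(q)."""
    return 2.0 * abs(math.asin(math.sqrt(p)) - math.asin(math.sqrt(q)))

def rao_distance_via_logit(p, q, steps=20000):
    """Same distance, computed in the logit parametrization theta = log(p / (1 - p)).

    In theta the Fisher information is s(theta) * (1 - s(theta)), with s the
    logistic function, so the length is the integral of its square root
    (midpoint rule).
    """
    logit = lambda x: math.log(x / (1.0 - x))
    a, b = sorted((logit(p), logit(q)))
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        theta = a + (i + 0.5) * h
        s = 1.0 / (1.0 + math.exp(-theta))
        total += math.sqrt(s * (1.0 - s)) * h
    return total

d_closed = rao_distance_bernoulli(0.2, 0.7)
d_logit = rao_distance_via_logit(0.2, 0.7)
print(d_closed, d_logit)
```

The two values agree up to quadrature error, whereas the squared error (p - q)^2 would change if it were computed in the logit coordinates instead.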

71 citations


Journal ArticleDOI
TL;DR: Certain properties of the Wheeler-DeWitt metric (for constant lapse) in canonical General Relativity associated with its non-definite nature are investigated.
Abstract: The configuration space of general relativity, called superspace or the space of three-geometries, inherits certain geometric structures from the Wheeler-DeWitt metric on the larger space of Riemannian metrics. We analytically investigate the signature properties of the particular geometric structure associated with the choice of constant lapse function. We point out that this metric has rather special properties and generically suffers from signature changes.

68 citations


Proceedings ArticleDOI
20 Jun 1995
TL;DR: A new result on single-view invariants based on 6 points is shown, certain relationships are shown to be impossible, and a general result on invariant functions of 4 views is obtained that has nontrivial implications for the understanding of N-view geometry.
Abstract: The paper unifies most of the current literature on 3D geometric invariants from point correspondences across multiple 2D views by using the tool of elimination from algebraic geometry. The technique allows one to predict results by counting parameters and reduces many complicated results obtained in the past (reconstruction from two and three views, epipolar geometry from seven points, trilinearity of three views, the use of a priori 3D information such as bilateral symmetry, shading and color constancy, and more) into a few lines of reasoning each. The tool of Gröbner basis computation is used in the elimination process. In the process we obtain several results on N-view geometry, and obtain a general result on invariant functions of 4 views and its corresponding quadlinear tensor: 4 views admit minimal sets of 16 invariant functions (of quadlinear forms) with 81 distinct coefficients that can be solved linearly from 6 corresponding points across 4 views. This result has nontrivial implications for the understanding of N-view geometry. We show a new result on single-view invariants based on 6 points and show that certain relationships are impossible. One of the appealing features of the elimination approach is that it is simple to apply and does not require any understanding of the underlying 3D from 2D geometry and algebra.

39 citations


Book ChapterDOI
01 Jan 1995
TL;DR: In this article, it is shown that the differential-geometrical formulation of statistics constructed by Amari, concerning the structure of a smooth manifold in the parameter space of classical probabilities, S = {p(·, θ)}, can be extended to the non-commutative framework, at least within finite-dimensional algebras, where the classical probabilities p(·, θ) are replaced by density matrices ρ(θ).
Abstract: It is shown that the differential-geometrical formulation of statistics concerning the structure of a smooth manifold in the parameter space of classical probabilities, S = {p(·, θ)}, which was constructed by Amari, can be extended to the non-commutative framework, at least within finite-dimensional algebras, where the classical probabilities p(·, θ) are replaced by density matrices ρ(θ). A brief outline of the framework is presented with the emphasis on elucidating the remarkable dualistic structure of the so-called α-families of ρ's, specifically in the case of |α| = 1 (i.e. the dualistic structure between log ρ and ρ).

39 citations


Journal ArticleDOI
TL;DR: In this article, Alexandrov spaces are used to obtain results in Riemannian geometry by forming spherical suspensions of positively curved spaces, an operation which is closed in Alexandrov geometry but not in Riemannian geometry.
Abstract: Singular spaces are playing an increasingly important role in Riemannian geometry. In particular this is true of the so-called Alexandrov spaces; i.e., finite-(Hausdorff) dimensional, complete, inner metric spaces with a lower curvature bound in the local triangle comparison sense. Interest in Alexandrov spaces is largely explained by the fact that the natural process of taking Gromov-Hausdorff limits is closed in Alexandrov geometry but not in Riemannian geometry. This simple observation has become a powerful tool in Riemannian geometry (see, for example, [FY], [GP2], or [GPW] and [Pe]). In this paper, we use a new means to obtain results in Riemannian geometry by studying Alexandrov spaces. The operation of forming spherical suspensions of positively curved spaces, an operation which is also closed in Alexandrov geometry but not in Riemannian geometry, will be applied to prove a differentiable sphere theorem which is unlike all previous solutions to the following basic problem.

38 citations


Book ChapterDOI
01 Jan 1995
TL;DR: In this article, a triplet structure (g, ∇(e), ∇(m)) on S is introduced via the specified covariance, where g is a Riemannian metric and ∇(e) and ∇(m) are affine connections.
Abstract: In this article, we treat the space S of finite-dimensional positive density operators (quantum states) from a differential-geometrical viewpoint. We suppose that a generalized covariance for arbitrary two observables (Hermitian operators) is specified at each state in S, which includes the symmetrized inner product and the Bogoliubov inner product as special (but important) cases, and introduce a triplet structure (g, ∇(e), ∇(m)) on S via the specified covariance, where g is a Riemannian metric and ∇(e) and ∇(m) are affine connections. The structure (g, ∇(e), ∇(m)) is regarded as a quantum analogue of the triplet of Fisher metric, exponential connection and mixture connection on a space of probability densities introduced in information geometry by S. Amari ([1]). Some aspects relating to quantum state estimation and the relative entropy are treated in terms of differential geometry, where the theory of dual connections developed by Nagaoka and Amari ([4], [1]) plays an essential role.

38 citations


Journal ArticleDOI
TL;DR: In this article, a procedure to test statistical hypotheses is proposed on the basis of geodesic distances, and the asymptotic distribution of the test statistics is obtained, so the method can be used in cases where it is not possible to get the exact distribution of the test statistics.


Journal ArticleDOI
TL;DR: In this paper, the optical metric for a strong-laser plasma is derived; the affine connection and curvature related to this metric are given, and their spatial distributions are studied numerically.
Abstract: The optical metric for a strong-laser plasma is derived. The affine connection and curvature related to the optical metric are given and their spatial distributions are studied numerically.

Journal ArticleDOI
TL;DR: In this paper, it is shown that there are no metric-compatible connections with zero torsion on properly Finslerian, i.e. post-Riemannian, metrics.
Abstract: It is shown that there are no metric-compatible connections with zero torsion on properly Finslerian, i.e. post-Riemannian, metrics. Since Finslerian connections exist on Riemannian metrics, the torsion rather than the metric becomes the object which determines whether the geometry is properly Finslerian or not. On the other hand, the solder forms and connection are determined by the torsion if the affine curvature is zero, the torsion then containing all the information about the geometric reality of spacetime. Since the metric curvature may still be Riemannian, the question arises of whether its present central role in spacetime physics is but a consequence of requiring that all the geometric content of spacetime be contained in the metric.

Proceedings ArticleDOI
11 Aug 1995
TL;DR: A statistical method is sketched for describing the variability of human cortical surface form, customized for the constraints of this complex data type, in which the surface is represented by its Riemannian metric tensor.
Abstract: Recent advances in computational geometry have greatly extended the range of neuroanatomical questions that can be approached by rigorous quantitative methods. One of the major current challenges in this area is to describe the variability of human cortical surface form and its implications for individual differences in neurophysiological functioning. Existing techniques for representation of stochastically invaginated surfaces do not conduce to the necessary parametric statistical summaries. In this paper, following a hint from David Van Essen and Heather Drury, I sketch a statistical method customized for the constraints of this complex data type. Cortical surface form is represented by its Riemannian metric tensor and averaged according to parameters of a smooth averaged surface. Sulci are represented by integral trajectories of the smaller principal strains of this metric, and their statistics follow the statistics of that relative metric. The diagrams visualizing this tensor analysis look like alligator leather but summarize all aspects of cortical surface form in between the principal sulci, the reliable ones; no flattening is required.


Journal ArticleDOI
TL;DR: For Newtonian dynamical systems on Riemannian manifolds that admit normal shift, the problem of the metrizability of these systems by means of a conformally equivalent metric is solved in this paper.
Abstract: For Newtonian dynamical systems on Riemannian manifolds that admit normal shift, the problem of the metrizability of these systems by means of a conformally equivalent metric is solved.

Proceedings ArticleDOI
07 Mar 1995
TL;DR: A measure of similarity between array response vectors is introduced and it is shown that for wideband arrays, the optimal array selection should be performed only once, at the highest frequency of operation.
Abstract: The array response in the presence of a single signal is given by the array manifold. The manifold should be different for different directions of arrival (DOA). If for a given set of widely separated DOAs the manifold is similar, large errors, usually referred to as ambiguity errors, are likely to occur. We introduce a measure of similarity between array response vectors. A tight lower bound of the similarity measure can be easily derived. The array geometry associated with the highest lower bound performs better than other arrays with the same aperture and the same number of sensors. Therefore, this bound can be used for selecting the best array configuration from a given set of candidate geometries. It is shown that for wideband arrays, the optimal array selection should be performed only once, at the highest frequency of operation. Unlike most of the results in the literature, our approach is not limited to linear arrays, and it can be applied successfully to any array configuration.