Journal ArticleDOI

Estimation of seismic moment tensors using variational inference machine learning

TL;DR: This work presents an approach for estimating, in near real time, full moment tensors of earthquakes and their parameter uncertainties from short time windows of recorded seismic waveform data.
Abstract: We present an approach for estimating in near real-time full moment tensors of earthquakes and their parameter uncertainties based on short time windows of recorded seismic waveform data by conside...
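The abstract is truncated above, but the references below name the broad ingredients: a Gaussian likelihood over predicted parameters, the Adam optimizer, and TensorFlow. A minimal sketch, assuming a network that maps multi-station waveform windows to the six independent moment-tensor components plus a per-component uncertainty, trained with a Gaussian negative log-likelihood and Adam (layer sizes and names here are illustrative assumptions, not the authors' architecture):

```python
import tensorflow as tf

N_STATIONS, N_SAMPLES, N_MT = 20, 512, 6  # hypothetical data dimensions

def gaussian_nll(y_true, y_pred):
    # y_pred packs a mean and a log-standard-deviation for each of the
    # six moment-tensor components; the constant log(2*pi)/2 is dropped.
    mean, log_std = tf.split(y_pred, 2, axis=-1)
    return tf.reduce_mean(
        log_std + 0.5 * tf.square((y_true - mean) / tf.exp(log_std)))

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(N_SAMPLES, N_STATIONS)),
    tf.keras.layers.Conv1D(32, 7, activation="relu"),
    tf.keras.layers.MaxPooling1D(4),
    tf.keras.layers.Conv1D(64, 7, activation="relu"),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(2 * N_MT),  # mean and log-std per component
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3), loss=gaussian_nll)
```

Minimizing this loss makes the predicted standard deviations meaningful uncertainty estimates: the network is penalized both for being wrong and for being overconfident.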
Citations
Journal ArticleDOI
12 Aug 2022-Science
TL;DR: Mousavi and Beroza provide a comprehensive review of the deep-learning techniques being applied to seismic datasets, covering approaches, limitations, and opportunities, together with a systematic overview of trends and challenges in applications of deep-learning methods in seismology.
Abstract: Seismic waves from earthquakes and other sources are used to infer the structure and properties of Earth’s interior. The availability of large-scale seismic datasets and the suitability of deep-learning techniques for seismic data processing have pushed deep learning to the forefront of fundamental, long-standing research investigations in seismology. However, some aspects of applying deep learning to seismology are likely to prove instructive for the geosciences, and perhaps other research areas more broadly. Deep learning is a powerful approach, but there are subtleties and nuances in its application. We present a systematic overview of trends, challenges, and opportunities in applications of deep-learning methods in seismology.

Description (Large-scale learning): The large amount and availability of datasets in seismology creates a great opportunity to apply machine learning and artificial intelligence to data processing. Mousavi and Beroza provide a comprehensive review of the deep-learning techniques being applied to seismic datasets, covering approaches, limitations, and opportunities. The trends in data processing and analysis can be instructive for geoscience and other research areas more broadly. —BG. The ways in which deep learning can help process and analyze large seismological datasets are reviewed.

BACKGROUND: Seismology is the study of seismic waves to understand their origin—most obviously, sudden fault slip in earthquakes, but also explosions, volcanic eruptions, glaciers, landslides, ocean waves, vehicular traffic, aircraft, trains, wind, air guns, and thunderstorms, for example. Seismology uses those same waves to infer the structure and properties of planetary interiors. Because sources can generate waves at any time, seismic ground motion is recorded continuously, at typical sampling rates of 100 points per second, for three components of motion, and on arrays that can include thousands of sensors. Although seismology is clearly a data-rich science, it often is a data-driven science as well, with new phenomena and unexpected behavior discovered with regularity. And for at least some tasks, the careful and painstaking work of seismic analysts over decades and around the world has also made seismology a data label–rich science. This facet makes it fertile ground for deep learning, which has entered almost every subfield of seismology and outperforms classical approaches, often dramatically, for many seismological tasks.

ADVANCES: Seismic wave identification and onset-time (first-break) determination for seismic P and S waves within continuous seismic data are foundational to seismology and particularly well suited to deep learning because of the availability of massive, labeled datasets. This task has received particularly close attention, which has led, for example, to the development of deep learning–based earthquake catalogs that can feature more than an order of magnitude more events than conventional catalogs. Deep learning has also shown the ability to outperform classical approaches for other important seismological tasks, including the discrimination of earthquakes from explosions and other sources, separation of seismic signals from background noise, seismic image processing and interpretation, and Earth model inversion.

OUTLOOK: The development of increasingly cost-effective sensors and emerging ground-motion sensing technologies, such as fiber optic cable and accelerometers in smart devices, portends a continuing acceleration of seismological data volumes, so that deep learning is likely to become essential to seismology’s future. Deep learning’s nonlinear mapping ability, sequential data modeling, automatic feature extraction, dimensionality reduction, and reparameterization are all advantageous for processing high-dimensional seismic data, particularly because those data are noisy and, from the point of view of mathematical inference, incomplete. Deep learning for scientific discovery and direct extraction of insight into seismological processes is clearly just getting started. Aspects of seismology pose interesting additional challenges for deep learning. Many of the most important problems in earthquake seismology—such as earthquake forecasting, ground motion prediction, and rapid earthquake alerting—concern large and damaging earthquakes that are (fortunately) rare. That rarity poses a fundamental challenge for the data-hungry methods of deep learning: How can we train reliable models, and how do we validate them well enough to rely on them, when data are scarce and opportunities to test models are infrequent? Further, how can we operationalize deep-learning techniques in such a situation, when the mechanisms by which they make predictions from data may not be easily explained and the consequences of incorrect models are high? Incorporating domain knowledge through physics-based and explainable deep learning and setting up standard benchmarking and evaluation protocols will help ensure progress, as will the nascent emergence of a seismological data science ecosystem. More generally, a combination of data science literacy for geoscientists and recruitment of data science expertise will help to ensure that deep-learning seismology reaches its full potential. Deep-learning processing of seismic data and incorporation of domain knowledge can lead to new capabilities and new insights across seismology.
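As a concrete illustration of the phase-picking task highlighted above: pickers of the kind reviewed typically map a window of three-component, 100-samples-per-second data to per-sample probabilities of a P onset, an S onset, or noise. A toy sketch in that spirit (sizes and depth are assumptions; this is not any specific published model):

```python
import tensorflow as tf

WIN = 3000  # 30 s of three-component data at 100 samples per second

# Toy picker: per-sample class probabilities for P onset, S onset, noise.
picker = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(WIN, 3)),
    tf.keras.layers.Conv1D(16, 11, padding="same", activation="relu"),
    tf.keras.layers.Conv1D(32, 11, padding="same", activation="relu"),
    tf.keras.layers.Conv1D(3, 11, padding="same", activation="softmax"),
])
picker.compile(optimizer="adam", loss="categorical_crossentropy")
```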

61 citations

Journal ArticleDOI
TL;DR: In this article, the authors propose an alternative machine-learning approach that requires no pre-existing observations other than a velocity model and is based on a feed-forward neural network trained on synthetic arrival times.
Abstract: Location of earthquakes is a primary task in seismology and microseismic monitoring, essential for almost any further analysis. Earthquake hypocenters can be determined by the inversion of arrival times of seismic waves observed at seismic stations, which is a non-linear inverse problem. Growing amounts of seismic data and real-time processing requirements imply the use of robust machine learning applications for characterization of seismicity. Convolutional neural networks have been proposed for hypocenter determination assuming training on previously processed seismic event catalogs. We propose an alternative machine learning approach, which does not require any pre-existing observations, except a velocity model. This is particularly important for microseismic monitoring when labeled seismic events are not available due to lack of seismicity before monitoring commenced (e.g., induced seismicity). The proposed algorithm is based on a feed-forward neural network trained on synthetic arrival times. Once trained, the neural network can be deployed for fast location of seismic events using observed P-wave (or S-wave) arrival times. We benchmark the neural network method against the conventional location technique and show that the new approach provides the same or better location accuracy. We study the sensitivity of the proposed method to the training dataset, noise in the arrival times of the detected events, and the size of the monitoring network. Finally, we apply the method to real microseismic monitoring data and show that it is able to deal with missing arrival times in an efficient way with the help of fine-tuning and early stopping. This is achieved by re-training the neural network for each individual set of picked arrivals; to reduce the training time, we reuse previously determined weights and fine-tune them. This allows us to obtain hypocenter locations in near real-time.
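The recipe in the abstract (generate synthetic arrival times from a velocity model, then train a feed-forward network to invert them) is simple enough to sketch end to end. The homogeneous velocity and straight-ray travel times below are simplifying assumptions for brevity; the method itself only requires some velocity model:

```python
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
V = 3.5  # km/s, homogeneous P-wave velocity (simplifying assumption)
stations = rng.uniform(0.0, 10.0, size=(8, 3))  # 8 hypothetical stations (km)
stations[:, 2] = 0.0                            # placed at the surface

def arrival_times(src):
    # Straight-ray travel times; origin time fixed at zero for simplicity.
    return np.linalg.norm(stations - src, axis=1) / V

# Synthetic training set: random hypocenters and their P arrival times.
srcs = rng.uniform([0, 0, 1], [10, 10, 8], size=(20000, 3))
times = np.array([arrival_times(s) for s in srcs])

net = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(8,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(3),  # x, y, z of the hypocenter (km)
])
net.compile(optimizer="adam", loss="mse")
net.fit(times, srcs, epochs=5, batch_size=256, verbose=0)

# Once trained, location is a single forward pass per event.
print(net.predict(times[:1], verbose=0), srcs[0])
```

Per the abstract, missing arrival times are handled by briefly re-training from these previously determined weights for each event's particular set of picked stations.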

3 citations

Journal ArticleDOI
TL;DR: In this article, the authors propose a methodology for learning the seismic source parameters (location and moment tensor) from compressed seismic records, which not only expedites data transmission from the field to the processing center but also removes the decompression overhead that traditional processing methods would require.
Abstract: Fast detection and characterization of seismic sources is crucial for decision-making and warning systems that monitor natural and induced seismicity. However, besides the laying out of ever denser monitoring networks of seismic instruments, the incorporation of new sensor technologies such as Distributed Acoustic Sensing (DAS) further challenges our processing capabilities to deliver short-turnaround answers from seismic monitoring. In response, this work describes a methodology for learning the seismological parameters (location and moment tensor) from compressed seismic records. In this method, data dimensionality is reduced by applying a general encoding protocol derived from the principles of compressive sensing (CS). The data in compressed form are then fed directly to a convolutional neural network (CNN) that outputs fast predictions of the seismic source parameters. Thus, the proposed methodology can not only expedite data transmission from the field to the processing center, but also remove the decompression overhead that would be required for the application of traditional processing methods. An autoencoder is also explored as an equivalent alternative to perform the same job. We observe that the CS-based compression requires only a fraction of the computing power, time, data, and expertise required to design and train an autoencoder to perform the same task. Implementation of the CS method with a continuous flow of data, together with generalization of the principles to other applications such as classification, is also discussed.
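The compressive-sensing encoding can be as simple as multiplying each trace by a fixed random matrix, a standard measurement scheme in that literature; whether the paper uses a Gaussian ensemble specifically is not stated here, so treat this as a generic sketch:

```python
import numpy as np

rng = np.random.default_rng(42)
n, m = 4096, 256  # samples per raw trace, compressed length (m << n)

# Fixed random Gaussian measurement matrix, shared by the field encoder
# and the network that consumes the compressed records.
Phi = rng.normal(size=(m, n)) / np.sqrt(m)

trace = rng.normal(size=n)  # stand-in for one recorded seismic trace
y = Phi @ trace             # compressed record transmitted from the field
print(y.shape)              # (256,) -- fed to the CNN with no decompression
```

Because the encoding is a single fixed linear map, it needs none of the design, training data, or compute that the autoencoder alternative requires, which is the trade-off the abstract reports.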

2 citations

Journal ArticleDOI
TL;DR: In this paper, the authors present a centroid moment tensor (CMT) catalog for the Amatrice-Visso-Norcia (AVN) seismic sequence based on a recently generated 3D wave-speed model for the Italian lithosphere.
Abstract: Moment tensor inversions of broadband velocity data are usually managed by adopting Green's functions for 1D layered seismic wave-speed models. This assumption can impact source parameter estimates in regions with complex 3D heterogeneous structures and discontinuities in rock properties. In this work, we present a new centroid moment tensor (CMT) catalog for the Amatrice-Visso-Norcia (AVN) seismic sequence based on a recently generated 3D wave-speed model for the Italian lithosphere. Forward synthetic seismograms and Fréchet derivatives for CMT-3D inversions of 159 earthquakes with Mw ≥ 3.0 are simulated using a spectral-element method (SEM) code. By comparing the retrieved solutions with those from the time-domain moment tensor (TDMT) catalog, obtained with a 1D wave-speed model calibrated for the Central Apennines (Italy), we observe a remarkable degree of consistency in terms of source geometry, kinematics, and magnitude. Significant differences are found in centroid depths, which are more accurately estimated using the 3D model. Finally, we present a newly designed parameter, τ, to better quantify and compare a posteriori the reliability of the obtained MT solutions. τ measures the goodness of fit between observed and synthetic seismograms, accounting for differences in amplitude, arrival time, percentage of fitted seconds, and the usual L2-norm estimate. The CMT-3D solutions represent the first Italian CMT catalog based on a full-waveform 3D wave-speed model. They provide reliable source parameters with potential implications for the structures activated during the sequence. The developed approach can be readily applied to more complex Italian regions where 1D models are underperforming and not representative of the area.
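The paper's exact definition of τ is not reproduced in this summary, but the four ingredients it names can each be computed from an observed/synthetic trace pair. A hedged sketch only (the published weighting and combination may differ, and the "percentage of fitted seconds" proxy below is an assumption):

```python
import numpy as np

def tau_ingredients(obs, syn, dt, fit_tol=0.1):
    """Ingredients named in the abstract for the tau quality parameter.
    How they are weighted and combined into tau follows the paper,
    not this sketch."""
    l2 = np.sum((obs - syn) ** 2) / np.sum(obs ** 2)            # L2-norm misfit
    amp = abs(np.max(np.abs(syn)) / np.max(np.abs(obs)) - 1.0)  # amplitude misfit
    xc = np.correlate(obs, syn, mode="full")                    # cross-correlation
    t_shift = (np.argmax(xc) - (len(obs) - 1)) * dt             # arrival-time shift (s)
    # Proxy for "percentage of fitted seconds": fraction of samples whose
    # residual stays below a tolerance relative to the peak observed amplitude.
    fitted = np.mean(np.abs(obs - syn) < fit_tol * np.max(np.abs(obs)))
    return l2, amp, t_shift, fitted
```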

1 citation

References
Proceedings Article
01 Jan 2015
TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
Abstract: We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. The method is straightforward to implement, is computationally efficient, has low memory requirements, is invariant to diagonal rescaling of the gradients, and is well suited for problems that are large in terms of data and/or parameters. The method is also appropriate for non-stationary objectives and problems with very noisy and/or sparse gradients. The hyper-parameters have intuitive interpretations and typically require little tuning. Some connections to related algorithms, on which Adam was inspired, are discussed. We also analyze the theoretical convergence properties of the algorithm and provide a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework. Empirical results demonstrate that Adam works well in practice and compares favorably to other stochastic optimization methods. Finally, we discuss AdaMax, a variant of Adam based on the infinity norm.
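The algorithm itself is compact; a plain-NumPy transcription of one Adam update, with the paper's default hyper-parameters:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update. m and v are running estimates of the first and
    second moments of the gradient; t is the 1-based step count."""
    m = b1 * m + (1 - b1) * grad          # biased first-moment estimate
    v = b2 * v + (1 - b2) * grad ** 2     # biased second-moment estimate
    m_hat = m / (1 - b1 ** t)             # bias-corrected moments
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```

The bias correction matters early in training, when m and v are still close to their zero initialization.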

111,197 citations


"Estimation of seismic moment tensor..." refers methods in this paper

  • ...As objective function (loss function), we use the negative log-likelihood and as optimizer the Adam algorithm (Kingma & Ba, 2014)....

    [...]

Proceedings ArticleDOI
02 Nov 2016
TL;DR: TensorFlow is a machine learning system that operates at large scale and in heterogeneous environments, using dataflow graphs to represent computation, shared state, and the operations that mutate that state.
Abstract: TensorFlow is a machine learning system that operates at large scale and in heterogeneous environments. TensorFlow uses dataflow graphs to represent computation, shared state, and the operations that mutate that state. It maps the nodes of a dataflow graph across many machines in a cluster, and within a machine across multiple computational devices, including multicore CPUs, general-purpose GPUs, and custom-designed ASICs known as Tensor Processing Units (TPUs). This architecture gives flexibility to the application developer: whereas in previous "parameter server" designs the management of shared state is built into the system, TensorFlow enables developers to experiment with novel optimizations and training algorithms. TensorFlow supports a variety of applications, with a focus on training and inference on deep neural networks. Several Google services use TensorFlow in production; we have released it as an open-source project, and it has become widely used for machine learning research. In this paper, we describe the TensorFlow dataflow model and demonstrate the compelling performance that TensorFlow achieves for several real-world applications.
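In the modern API, the dataflow-graph model described here is most visible through tf.function, which traces Python code into a graph that TensorFlow can optimize and place across devices:

```python
import tensorflow as tf

@tf.function  # traces this Python function into a TensorFlow dataflow graph
def affine(x, w, b):
    return tf.matmul(x, w) + b

x = tf.random.normal([4, 3])
w = tf.random.normal([3, 2])
b = tf.zeros([2])
print(affine(x, w, b))  # executes the traced graph
```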

10,913 citations

Proceedings Article
14 Jun 2011
TL;DR: This paper shows that rectifying neurons are an even better model of biological neurons and yield equal or better performance than hyperbolic tangent networks in spite of the hard non-linearity and non-differentiability.
Abstract: While logistic sigmoid neurons are more biologically plausible than hyperbolic tangent neurons, the latter work better for training multi-layer neural networks. This paper shows that rectifying neurons are an even better model of biological neurons and yield equal or better performance than hyperbolic tangent networks in spite of the hard non-linearity and non-differentiability.
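The "hard non-linearity" in question is the rectifier max(0, x); a two-line comparison with tanh shows the exact zeros that make rectifier activations sparse:

```python
import numpy as np

x = np.linspace(-3, 3, 7)
relu = np.maximum(0.0, x)  # hard nonlinearity: exact zeros for x < 0
tanh = np.tanh(x)          # smooth and saturating, never exactly zero
print(relu)  # [0. 0. 0. 0. 1. 2. 3.]
print(tanh)
```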

6,790 citations


"Estimation of seismic moment tensor..." refers background in this paper

  • ...Gaussian with mean E and a standard deviation Ê (e.g., Blundell et al., 2015; Graves, 2011; Wen et al., 2018)....

    [...]

Journal ArticleDOI
TL;DR: In this article, a Coulomb failure criterion is proposed for the production of aftershocks, in which the faults most likely to slip are those optimally orientated for failure as a result of the prevailing regional stress field and the stress change caused by the mainshock.
Abstract: To understand whether the 1992 M = 7.4 Landers earthquake changed the proximity to failure on the San Andreas fault system, we examine the general problem of how one earthquake might trigger another. The tendency of rocks to fail in a brittle manner is thought to be a function of both shear and confining stresses, commonly formulated as the Coulomb failure criterion. Here we explore how changes in Coulomb conditions associated with one or more earthquakes may trigger subsequent events. We first consider a Coulomb criterion appropriate for the production of aftershocks, where faults most likely to slip are those optimally orientated for failure as a result of the prevailing regional stress field and the stress change caused by the mainshock. We find that the distribution of aftershocks for the Landers earthquake, as well as for several other moderate events in its vicinity, can be explained by the Coulomb criterion as follows: aftershocks are abundant where the Coulomb stress on optimally orientated faults rose by more than one-half bar, and aftershocks are sparse where the Coulomb stress dropped by a similar amount. Further, we find that several moderate shocks raised the stress at the future Landers epicenter and along much of the Landers rupture zone by about a bar, advancing the Landers shock by 1 to 3 centuries. The Landers rupture, in turn, raised the stress at the site of the future M = 6.5 Big Bear aftershock by 3 bars. The Coulomb stress change on a specified fault is independent of regional stress but depends on the fault geometry, sense of slip, and the coefficient of friction. We use this method to resolve stress changes on the San Andreas and San Jacinto faults imposed by the Landers sequence. Together the Landers and Big Bear earthquakes raised the stress along the San Bernardino segment of the southern San Andreas fault by 2 to 6 bars, hastening the next great earthquake there by about a decade.
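The Coulomb criterion the abstract applies is commonly written ΔCFS = Δτ + μ′Δσn, with Δτ the shear stress change in the slip direction, Δσn the normal stress change (positive for unclamping), and μ′ an effective friction coefficient; a direct transcription with illustrative numbers:

```python
def coulomb_stress_change(d_shear, d_normal, mu_eff=0.4):
    """Change in Coulomb failure stress on a receiver fault (bars).
    mu_eff = 0.4 is a common choice of effective friction, an assumption
    here rather than a value quoted from this paper."""
    return d_shear + mu_eff * d_normal

# Per the abstract, changes of roughly +/- 0.5 bar already modulate
# aftershock abundance.
print(coulomb_stress_change(0.8, -0.5))  # 0.6 bars: loaded toward failure
```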

2,100 citations


"Estimation of seismic moment tensor..." refers methods in this paper

  • ...As objective function (loss function), we use the negative log-likelihood and as optimizer the Adam algorithm (Kingma & Ba, 2014)....

    [...]

Journal ArticleDOI
TL;DR: For the period 2004-2010, 13,017 new centroid-moment tensors are reported in this paper; the results are the product of the Global Centroid-Moment-Tensor (GCMT) project, which maintains and extends a catalog of global seismic moment tensors beginning with earthquakes in 1976.

2,099 citations