Journal Article

Characterization of anomalous diffusion classical statistics powered by deep learning (CONDOR)

06 Aug 2021 - Journal of Physics A (IOP Publishing) - Vol. 54, Iss. 31, pp 314003
TL;DR: In this paper, a novel method (CONDOR) combines feature engineering based on classical statistics with supervised deep learning to efficiently identify the underlying anomalous diffusion model with high accuracy and infer its exponent with a small mean absolute error in single 1D, 2D and 3D trajectories corrupted by localization noise.
Abstract: Diffusion processes are important in several physical, chemical, biological and human phenomena. Examples include molecular encounters in reactions, cellular signalling, the foraging of animals, the spread of diseases, as well as trends in financial markets and climate records. Deviations from Brownian diffusion, known as anomalous diffusion, can often be observed in these processes, when the growth of the mean square displacement in time is not linear. An ever-increasing number of methods has thus appeared to characterize anomalous diffusion trajectories based on classical statistics or machine learning approaches. Yet, characterization of anomalous diffusion remains challenging to date as testified by the launch of the Anomalous Diffusion (AnDi) Challenge in March 2020 to assess and compare new and pre-existing methods on three different aspects of the problem: the inference of the anomalous diffusion exponent, the classification of the diffusion model, and the segmentation of trajectories. Here, we introduce a novel method (CONDOR) which combines feature engineering based on classical statistics with supervised deep learning to efficiently identify the underlying anomalous diffusion model with high accuracy and infer its exponent with a small mean absolute error in single 1D, 2D and 3D trajectories corrupted by localization noise. Finally, we extend our method to the segmentation of trajectories where the diffusion model and/or its anomalous exponent vary in time.
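The abstract defines anomalous diffusion through a mean square displacement (MSD) that does not grow linearly in time. A minimal Python sketch of the classical baseline is given here: it computes the time-averaged MSD of a single trajectory and fits the exponent alpha in MSD ~ K * lag**alpha on a log-log scale. This is not the CONDOR method itself; the function names, the lag range and the test trajectories are illustrative assumptions.

```python
import numpy as np


def time_averaged_msd(traj, max_lag):
    """Time-averaged MSD of one trajectory of shape (T,) or (T, d), for lags 1..max_lag."""
    x = np.asarray(traj, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a 1D trajectory as (T, 1)
    lags = np.arange(1, max_lag + 1)
    msd = np.array(
        [np.mean(np.sum((x[lag:] - x[:-lag]) ** 2, axis=1)) for lag in lags]
    )
    return lags, msd


def fit_anomalous_exponent(lags, msd):
    """Least-squares slope of log(MSD) versus log(lag), assuming MSD ~ K * lag**alpha."""
    alpha, log_k = np.polyfit(np.log(lags), np.log(msd), 1)
    return alpha, np.exp(log_k)


# Illustrative trajectories (assumptions, not data from the paper).
rng = np.random.default_rng(0)
brownian = np.cumsum(rng.normal(size=(5000, 2)), axis=0)  # 2D Brownian motion
ballistic = np.outer(np.arange(200), [1.0, 0.5])          # constant velocity

alpha_brownian, _ = fit_anomalous_exponent(*time_averaged_msd(brownian, max_lag=20))
alpha_ballistic, _ = fit_anomalous_exponent(*time_averaged_msd(ballistic, max_lag=20))
print(f"Brownian alpha ~ {alpha_brownian:.2f}, ballistic alpha ~ {alpha_ballistic:.2f}")
```

Brownian motion should give an exponent close to 1 and ballistic motion an exponent of 2. For short or noisy trajectories this fit becomes unreliable, which is the limitation that motivates the methods listed on this page.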
Citations
Journal Article
TL;DR: The Anomalous Diffusion (AnDi) Challenge was an open competition to objectively compare methods for characterizing anomalous diffusion from a single measured trajectory, a task that traditionally relies on calculating the trajectory's mean squared displacement.
Abstract: Deviations from Brownian motion leading to anomalous diffusion are found in transport dynamics from quantum physics to life sciences. The characterization of anomalous diffusion from the measurement of an individual trajectory is a challenging task, which traditionally relies on calculating the trajectory mean squared displacement. However, this approach breaks down for cases of practical interest, e.g., short or noisy trajectories, heterogeneous behaviour, or non-ergodic processes. Recently, several new approaches have been proposed, mostly building on the ongoing machine-learning revolution. To perform an objective comparison of methods, we gathered the community and organized an open competition, the Anomalous Diffusion challenge (AnDi). Participating teams applied their algorithms to a commonly-defined dataset including diverse conditions. Although no single method performed best across all scenarios, machine-learning-based approaches achieved superior performance for all tasks. The discussion of the challenge results provides practical advice for users and a benchmark for developers.

66 citations

Journal Article
TL;DR: In this article, a Bayesian-Deep-Learning technique is used to train models for both the classification of the diffusion model and the regression of the anomalous diffusion exponent of single-particle-trajectories.
Abstract: Modern single-particle-tracking techniques produce extensive time-series of diffusive motion in a wide variety of systems, from single-molecule motion in living-cells to movement ecology. The quest is to decipher the physical mechanisms encoded in the data and thus to better understand the probed systems. We here augment recently proposed machine-learning techniques for decoding anomalous-diffusion data to include an uncertainty estimate in addition to the predicted output. To avoid the Black-Box-Problem a Bayesian-Deep-Learning technique named Stochastic-Weight-Averaging-Gaussian is used to train models for both the classification of the diffusion model and the regression of the anomalous diffusion exponent of single-particle-trajectories. Evaluating their performance, we find that these models can achieve a well-calibrated error estimate while maintaining high prediction accuracies. In the analysis of the output uncertainty predictions we relate these to properties of the underlying diffusion models, thus providing insights into the learning process of the machine and the relevance of the output.

12 citations

Journal Article
TL;DR: In this article, a feature-based machine learning method was developed in response to Task 2 of the Anomalous Diffusion Challenge, i.e. the classification of different types of diffusion.
Abstract: Understanding and identifying different types of single molecules' diffusion that occur in a broad range of systems (including living matter) is extremely important, as it can provide information on the physical and chemical characteristics of particles' surroundings. In recent years, an ever-growing number of methods have been proposed to overcome some of the limitations of the mean-squared displacements approach to tracer diffusion. In March 2020, the Anomalous Diffusion (AnDi) Challenge was launched by a community of international scientists to provide a framework for an objective comparison of the available methods for anomalous diffusion. In this paper, we introduce a feature-based machine learning method developed in response to Task 2 of the challenge, i.e. the classification of different types of diffusion. We discuss two sets of attributes that may be used for the classification of single-particle tracking data. The first one was proposed as our contribution to the AnDi Challenge. The latter is the result of our attempt to improve the performance of the classifier after the deadline of the competition. Extreme gradient boosting was used as the classification model. Although the deep-learning approach constitutes the state-of-the-art technology for data classification in many domains, we deliberately decided to pick this traditional machine learning algorithm due to its superior interpretability. After the extension of the feature set, our classifier achieved an accuracy of 0.83, which is comparable with the top methods based on neural networks.

8 citations

Journal Article
TL;DR: Wang et al. developed a WaveNet-based deep neural network (WADNet) that combines a modified WaveNet encoder with long short-term memory networks and requires no prior knowledge of anomalous diffusion.
Abstract: Anomalous diffusion, which shows a deviation of transport dynamics from the framework of standard Brownian motion, is involved in the evolution of various physical, chemical, biological, and economic systems. The study of such random processes is of fundamental importance in unveiling the physical properties of random walkers and complex systems. However, classical methods to characterize anomalous diffusion are often disqualified for individual short trajectories, leading to the launch of the Anomalous Diffusion (AnDi) Challenge. This challenge aims at objectively assessing and comparing new approaches for single trajectory characterization, with respect to three different aspects: the inference of the anomalous diffusion exponent; the classification of the diffusion model; and the segmentation of trajectories. In this article, to address the inference and classification tasks in the challenge, we develop a WaveNet-based deep neural network (WADNet) by combining a modified WaveNet encoder with long short-term memory networks, without any prior knowledge of anomalous diffusion. As the performance of our model has surpassed the current 1st places in the challenge leaderboard on both tasks for all dimensions (6 subtasks), WADNet could be part of the state-of-the-art techniques to decode the AnDi database. Our method presents a benchmark for future research, and could accelerate the development of a versatile tool for the characterization of anomalous diffusion.

7 citations

References
Journal Article
TL;DR: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems, focusing on bringing machine learning to non-specialists using a general-purpose high-level language.
Abstract: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a general-purpose high-level language. Emphasis is put on ease of use, performance, documentation, and API consistency. It has minimal dependencies and is distributed under the simplified BSD license, encouraging its use in both academic and commercial settings. Source code, binaries, and documentation can be downloaded from http://scikit-learn.sourceforge.net.

47,974 citations

Journal Article
10 Mar 2008 - Nature
TL;DR: In this article, the authors study the trajectory of 100,000 anonymized mobile phone users whose position is tracked for a six-month period and find that the individual travel patterns collapse into a single spatial probability distribution, indicating that humans follow simple reproducible patterns.
Abstract: The mapping of large-scale human movements is important for urban planning, traffic forecasting and epidemic prevention. Work in animals had suggested that their foraging might be explained in terms of a random walk, a mathematical rendition of a series of random steps, or a Levy flight, a random walk punctuated by occasional larger steps. The role of Levy statistics in animal behaviour is much debated, as explained in an accompanying News Feature, but the idea of extending it to human behaviour was boosted by a report in 2006 of Levy flight-like patterns in human movement tracked via dollar bills. A new human study, based on tracking the trajectory of 100,000 cell-phone users for six months, reveals behaviour close to a Levy pattern, but deviating from it as individual trajectories show a high degree of temporal and spatial regularity: work and other commitments mean we are not as free to roam as a foraging animal. But by correcting the data to accommodate individual variation, simple and predictable patterns in human travel begin to emerge. The cover photo (by Cesar Hidalgo) captures human mobility in New York's Grand Central Station. This study used a sample of 100,000 mobile phone users whose trajectory was tracked for six months to study human mobility patterns. Displacements across all users suggest behaviour close to the Levy-flight-like pattern observed previously based on the motion of marked dollar bills, but with a cutoff in the distribution. The origin of the Levy patterns observed in the aggregate data appears to be population heterogeneity and not Levy patterns at the level of the individual.
Despite their importance for urban planning [1], traffic forecasting [2] and the spread of biological [3,4,5] and mobile viruses [6], our understanding of the basic laws governing human motion remains limited owing to the lack of tools to monitor the time-resolved location of individuals. Here we study the trajectory of 100,000 anonymized mobile phone users whose position is tracked for a six-month period. We find that, in contrast with the random trajectories predicted by the prevailing Levy flight and random walk models [7], human trajectories show a high degree of temporal and spatial regularity, each individual being characterized by a time-independent characteristic travel distance and a significant probability to return to a few highly frequented locations. After correcting for differences in travel distances and the inherent anisotropy of each trajectory, the individual travel patterns collapse into a single spatial probability distribution, indicating that, despite the diversity of their travel history, humans follow simple reproducible patterns. This inherent similarity in travel patterns could impact all phenomena driven by human mobility, from epidemic prevention to emergency response, urban planning and agent-based modelling.

5,514 citations

Journal Article
TL;DR: Analysis of a recently developed family of formulas and statistics, approximate entropy (ApEn), suggests that ApEn can classify complex systems, given at least 1000 data values in diverse settings that include both deterministic chaotic and stochastic processes.
Abstract: Techniques to determine changing system complexity from data are evaluated. Convergence of a frequently used correlation dimension algorithm to a finite value does not necessarily imply an underlying deterministic model or chaos. Analysis of a recently developed family of formulas and statistics, approximate entropy (ApEn), suggests that ApEn can classify complex systems, given at least 1000 data values in diverse settings that include both deterministic chaotic and stochastic processes. The capability to discern changing complexity from such a relatively small amount of data holds promise for applications of ApEn in a variety of contexts.

5,055 citations
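
The approximate entropy (ApEn) statistic named in the entry above can be sketched in a few lines of Python. This is a minimal version of the commonly stated definition, ApEn(m, r) = Phi_m(r) - Phi_{m+1}(r), with self-matches included; the defaults m = 2 and r = 0.2 times the standard deviation are conventional choices assumed here, not values taken from the abstract, and the pairwise distance matrix makes it suitable only for series of a few thousand points.

```python
import numpy as np


def approximate_entropy(series, m=2, r=None):
    """Approximate entropy ApEn(m, r) of a 1D series (self-matches included)."""
    u = np.asarray(series, dtype=float)
    n = u.size
    if r is None:
        r = 0.2 * np.std(u)  # conventional tolerance (assumption)

    def phi(k):
        # All overlapping templates of length k.
        templates = np.array([u[i:i + k] for i in range(n - k + 1)])
        # Chebyshev (max-norm) distance between every pair of templates.
        dist = np.max(np.abs(templates[:, None, :] - templates[None, :, :]), axis=2)
        # Fraction of templates within tolerance r of each template.
        c = np.mean(dist <= r, axis=1)
        return np.mean(np.log(c))

    return phi(m) - phi(m + 1)


# Illustrative inputs (assumptions): a perfectly regular series and white noise.
regular = np.tile([0.0, 1.0], 100)                 # 0, 1, 0, 1, ...
noise = np.random.default_rng(0).normal(size=1000)

apen_regular = approximate_entropy(regular)
apen_noise = approximate_entropy(noise)
print(f"regular: {apen_regular:.3f}, noise: {apen_noise:.3f}")
```

A regular series should give a value near zero, while white noise should give a clearly larger value; the noise series uses 1000 points, in line with the data length mentioned in the abstract.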

Journal Article
TL;DR: In this paper, the authors developed a stochastic transport model for the transient photocurrent, which describes the dynamics of a carrier packet executing a time-dependent random walk in the presence of a field-dependent spatial bias and an absorbing barrier at the sample surface.
Abstract: Measurements of the transient photocurrent $I(t)$ in an increasing number of inorganic and organic amorphous materials display anomalous transport properties. The long tail of $I(t)$ indicates a dispersion of carrier transit times. However, the shape invariance of $I(t)$ to electric field and sample thickness (designated as universality for the classes of materials here considered) is incompatible with traditional concepts of statistical spreading, i.e., a Gaussian carrier packet. We have developed a stochastic transport model for $I(t)$ which describes the dynamics of a carrier packet executing a time-dependent random walk in the presence of a field-dependent spatial bias and an absorbing barrier at the sample surface. The time dependence of the random walk is governed by hopping time distribution $\Psi(t)$. A packet, generated with a $\Psi(t)$ characteristic of hopping in a disordered system [e.g., $\Psi(t)\sim t^{-(1+\alpha)}$, $0<\alpha<1$], is shown to propagate with a number of anomalous non-Gaussian properties. The calculated $I(t)$ associated with this packet not only obeys the property of universality but can account quantitatively for a large variety of experiments. The new method of data analysis advanced by the theory allows one to directly extract the transit time even for a featureless current trace. In particular, we shall analyze both an inorganic ($a$-$\mathrm{As}_{2}\mathrm{Se}_{3}$) and an organic (trinitrofluorenone-polyvinylcarbazole) system. Our function $\Psi(t)$ is related to a first-principles calculation. It is to be emphasized that these $\Psi(t)$'s characterize a realization of a non-Markoffian transport process. Moreover, the theory shows the limitations of the concept of a mobility in this dispersive type of transport.

2,610 citations
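
The random walk described in the entry above, with a hopping-time distribution that decays as t^-(1+alpha) for alpha between 0 and 1, can be illustrated with a short Python sketch. It is a simplified toy model under stated assumptions: waiting times are drawn from a Pareto law with minimum value 1, jumps are unit steps on a 1D lattice, the `bias` parameter is a hypothetical stand-in for the field-dependent spatial bias, and the absorbing barrier of the original model is not included.

```python
import numpy as np


def sample_waiting_times(alpha, size, rng):
    """Pareto waiting times with density alpha * t**-(1 + alpha) for t >= 1."""
    u = 1.0 - rng.random(size)  # uniform on (0, 1], avoids division by zero
    return u ** (-1.0 / alpha)  # inverse-transform sampling: P(T > t) = t**-alpha


def ctrw_positions(alpha, n_jumps, bias, sample_times, rng):
    """Positions of a biased 1D continuous-time random walk at the given times.

    `bias` in [-1, 1] sets the probability of a +1 step to 0.5 + bias / 2
    (a hypothetical parameterization chosen for this sketch).
    """
    jump_times = np.cumsum(sample_waiting_times(alpha, n_jumps, rng))
    steps = np.where(rng.random(n_jumps) < 0.5 + bias / 2, 1.0, -1.0)
    positions = np.concatenate(([0.0], np.cumsum(steps)))
    # Number of jumps completed at or before each sample time.
    n_done = np.searchsorted(jump_times, sample_times, side="right")
    return positions[n_done]


rng = np.random.default_rng(1)
waits = sample_waiting_times(alpha=0.5, size=100_000, rng=rng)
tail_fraction = np.mean(waits > 10.0)  # theory: 10**-0.5 for alpha = 0.5

times = np.array([0.5, 10.0, 100.0, 1000.0])
walk = ctrw_positions(alpha=0.5, n_jumps=10_000, bias=0.2, sample_times=times, rng=rng)
print(f"P(T > 10) ~ {tail_fraction:.3f}; positions at {times}: {walk}")
```

Because the mean waiting time diverges for alpha below 1, a few very long pauses dominate the elapsed time, which is the origin of the dispersive, non-Gaussian transport the abstract describes.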
