
Showing papers on "Predictability published in 2023"


Journal ArticleDOI
TL;DR: This paper examined the potential of ChatGPT and other large language models in predicting stock market returns using sentiment analysis of news headlines, and found that incorporating advanced language models into the investment decision-making process can yield more accurate predictions and enhance the performance of quantitative trading strategies.
Abstract: We examine the potential of ChatGPT, and other large language models, in predicting stock market returns using sentiment analysis of news headlines. We use ChatGPT to indicate whether a given headline is good, bad, or irrelevant news for firms' stock prices. We then compute a numerical score and document a positive correlation between these "ChatGPT scores" and subsequent daily stock market returns. Further, ChatGPT outperforms traditional sentiment analysis methods. We find that more basic models such as GPT-1, GPT-2, and BERT cannot accurately forecast returns, indicating return predictability is an emergent capacity of complex models. ChatGPT-4's implied Sharpe ratios are larger than ChatGPT-3's; however, the latter model has larger total returns. Our results suggest that incorporating advanced language models into the investment decision-making process can yield more accurate predictions and enhance the performance of quantitative trading strategies. Predictability is concentrated in smaller stocks and is more prominent for firms with bad news, consistent with limits-to-arbitrage arguments rather than market inefficiencies.
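The headline-to-score step described above lends itself to a short sketch. The mapping below (good = +1, bad = −1, anything else = 0, averaged per firm per day) is a plausible reading of the abstract, not the paper's exact procedure; `headline_score` and `daily_firm_score` are hypothetical helper names.

```python
def headline_score(label: str) -> int:
    """Map a model's label for a headline to a numeric sentiment score
    (assumed mapping: good=+1, bad=-1, irrelevant or unknown=0)."""
    return {"good": 1, "bad": -1}.get(label.lower(), 0)

def daily_firm_score(labels: list[str]) -> float:
    """Average the headline scores for one firm on one day."""
    if not labels:
        return 0.0
    return sum(headline_score(l) for l in labels) / len(labels)

# Toy example: three scored headlines plus one irrelevant headline.
print(daily_firm_score(["good", "bad", "good", "irrelevant"]))  # 0.25
```

In the study's setup, such a daily score would then be correlated with the next day's stock return, or used to form a long-short portfolio.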

20 citations


Journal ArticleDOI
TL;DR: In this article, artificial intelligence algorithms are proposed for estimating the compressive strength of hollow concrete block masonry prisms, including artificial neural networks (ANN), combinatorial group method of data handling (GMDH-Combi), and gene expression programming (GEP).

16 citations


Journal ArticleDOI
TL;DR: In this paper, the authors analyzed the relationship between leveraged loans and US debt markets by investigating the distributional and directional predictability between leveraged loans and treasury bonds, fixed-income securities, and corporate bonds in the U.S. economy.

15 citations


Journal ArticleDOI
TL;DR: Using a spectrum of approaches, including Granger causality across quantiles, cross-quantilograms, and dynamic connectedness, the authors provided novel evidence on the nexus between Bitcoin mining and climate change.

9 citations


Journal ArticleDOI
TL;DR: In this paper, an artificial neural network-based forecasting model employing a nonlinear focused time-delayed neural network (FTDNN) for energy commodity market forecasts is proposed.

7 citations


Journal ArticleDOI
TL;DR: In this article, a systematic study of deep learning-based prediction of genetic alterations from histology was performed using two large datasets of multiple tumor types, showing that an analysis pipeline integrating self-supervised feature extraction and attention-based multiple instance learning achieves robust predictability and generalizability.
Abstract: The histopathological phenotype of tumors reflects the underlying genetic makeup. Deep learning can predict genetic alterations from pathology slides, but it is unclear how well these predictions generalize to external datasets. We performed a systematic study on deep-learning-based prediction of genetic alterations from histology, using two large datasets of multiple tumor types. We show that an analysis pipeline that integrates self-supervised feature extraction and attention-based multiple instance learning achieves robust predictability and generalizability.

7 citations


Journal ArticleDOI
TL;DR: This article applied ancestral gene content reconstruction and machine learning techniques to ~3000 bacterial genomes to predict gene gain and loss evolution at the branches of the reference phylogenetic tree, suggesting that evolutionary pressures and constraints on metabolic systems are universally shared.
Abstract: Evolution prediction is a long-standing goal in evolutionary biology, with potential impacts on strategic pathogen control, genome engineering, and synthetic biology. While laboratory evolution studies have shown the predictability of short-term and sequence-level evolution, that of long-term and system-level evolution has not been systematically examined. Here, we show that the gene content evolution of metabolic systems is generally predictable by applying ancestral gene content reconstruction and machine learning techniques to ~3000 bacterial genomes. Our framework, Evodictor, successfully predicted gene gain and loss evolution at the branches of the reference phylogenetic tree, suggesting that evolutionary pressures and constraints on metabolic systems are universally shared. Investigation of pathway architectures and meta-analysis of metagenomic datasets confirmed that these evolutionary patterns have physiological and ecological bases as functional dependencies among metabolic reactions and bacterial habitat changes. Last, pan-genomic analysis of intraspecies gene content variations proved that even “ongoing” evolution in extant bacterial species is predictable in our framework.

6 citations


Journal ArticleDOI
TL;DR: The authors provide an overview of existing theory and evidence for such mutation-biased adaptation and consider the implications of these results for the problem of prediction, in regard to topics such as the evolution of infectious diseases, resistance to biochemical agents, as well as cancer and other kinds of somatic evolution.
Abstract: Predicting evolutionary outcomes is an important research goal in a diversity of contexts. The focus of evolutionary forecasting is usually on adaptive processes, and efforts to improve prediction typically focus on selection. However, adaptive processes often rely on new mutations, which can be strongly influenced by predictable biases in mutation. Here, we provide an overview of existing theory and evidence for such mutation-biased adaptation and consider the implications of these results for the problem of prediction, in regard to topics such as the evolution of infectious diseases, resistance to biochemical agents, as well as cancer and other kinds of somatic evolution. We argue that empirical knowledge of mutational biases is likely to improve in the near future, and that this knowledge is readily applicable to the challenges of short-term prediction. This article is part of the theme issue ‘Interdisciplinary approaches to predicting evolutionary biology’.

6 citations


Journal ArticleDOI
TL;DR: In this paper, the authors investigated cross-scale interactions within dry-to-hot and hot-to-dry extreme event networks and quantified the magnitude, temporal scale, and physical drivers of cascading effects (CEs) of drying-on-heating and vice versa across the globe.
Abstract: Climate change amplifies dry and hot extremes, yet the mechanism, extent, scope, and temporal scale of causal linkages between dry and hot extremes remain underexplored. Here using the concept of system dynamics, we investigate cross-scale interactions within dry-to-hot and hot-to-dry extreme event networks and quantify the magnitude, temporal-scale, and physical drivers of cascading effects (CEs) of drying-on-heating and vice-versa, across the globe. We find that locations exhibiting exceptionally strong CE (hotspots) for dry-to-hot and hot-to-dry extremes generally coincide. However, the CEs differ strongly in their timescale of interaction, hydroclimatic drivers, and sensitivity to changes in the soil-plant-atmosphere continuum and background aridity. The CE of drying-on-heating in the hotspot locations reaches its peak immediately driven by the compounding influence of vapor pressure deficit, potential evapotranspiration, and precipitation. In contrast, the CE of heating-on-drying peaks gradually dominated by concurrent changes in potential evapotranspiration, precipitation, and net-radiation with the effect of vapor pressure deficit being strongly controlled by ecosystem isohydricity and background aridity. Our results help improve our understanding of the causal linkages and the predictability of compound extremes and related impacts.

6 citations


Journal ArticleDOI
TL;DR: In this paper, a synthetic spatiotemporal framework for future research is proposed that pairs experimental and modeling approaches grounded in mechanism to improve the predictability and generalizability of plant-soil feedbacks.
Abstract: Feedbacks between plants and soil microbes form a keystone to terrestrial community and ecosystem dynamics. Recent advances in dissecting the spatial and temporal dynamics of plant-soil feedbacks have challenged longstanding assumptions of spatially well-mixed microbial communities and exceedingly fast microbial assembly dynamics relative to plant lifespans. Instead, plant-soil feedbacks emerge from interactions that are inherently mismatched in spatial and temporal scales, and explicitly considering these spatial and temporal dynamics is crucial to understanding the contribution of plant-soil feedbacks to foundational ecological patterns. I propose a synthetic spatiotemporal framework for future research that pairs experimental and modeling approaches grounded in mechanism to improve predictability and generalizability of plant-soil feedbacks.

6 citations




Journal ArticleDOI
TL;DR: In this article, the authors describe the prediction and development of a new mathematical model for two rheological parameters, plastic viscosity (PV) and yield stress (YS), through the application of a novel machine learning algorithm, gene expression programming (GEP).

Journal ArticleDOI
TL;DR: In this article, a machine learning algorithm, namely Random Forest, was used to predict the next-day closing price of four Greek systemic banks, based on Open and Close prices of stocks and Trading Volume.
Abstract: Objectives: Accurate prediction of stock market returns is a very challenging task due to the volatile and non-linear nature of the financial stock markets. In this work, we consider conventional time series analysis techniques with additional information from the Google Trends website to predict stock price returns. We further utilize a machine learning algorithm, namely Random Forest, to predict the next-day closing price of four Greek systemic banks. Methods/Analysis: The financial data considered in this work comprise Open and Close prices of stocks and Trading Volume. In the context of our analysis, these data are further used to create new variables that serve as additional inputs to the proposed machine learning based model. Specifically, we consider variables for each of the banks in the dataset, such as 7 DAYS MA, 14 DAYS MA, 21 DAYS MA, 7 DAYS STD DEV and Volume. One-step-ahead out-of-sample prediction following the rolling-window approach has been applied. Performance evaluation of the proposed model has been done using standard strategic indicators: RMSE and MAPE. Findings: Our results show that the proposed models effectively predict the stock market prices, providing insight into the applicability of the proposed methodology to various stock market price predictions. Novelty/Improvement: The originality of this study is that machine learning methods, highlighted by the Random Forest technique, were used to forecast the closing price of each stock in the banking sector for the following trading session. Doi: 10.28991/ESJ-2023-07-03-04
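To make the feature and metric definitions above concrete, here is a minimal plain-Python sketch of a trailing moving average plus the RMSE and MAPE indicators. The Random Forest model itself and the rolling-window loop are not reproduced, and the toy price series is invented.

```python
import math

def moving_average(prices, window):
    """Trailing moving average; None until enough history has accumulated."""
    return [None if i + 1 < window
            else sum(prices[i + 1 - window:i + 1]) / window
            for i in range(len(prices))]

def rmse(actual, predicted):
    """Root mean squared error of a forecast."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted))
                     / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a)
                       for a, p in zip(actual, predicted)) / len(actual)

prices = [10, 11, 12, 13, 14, 15, 16]
print(moving_average(prices, 7)[-1])   # 13.0 (the 7 DAYS MA feature)
print(rmse([100, 102], [101, 101]))    # 1.0
```

In a rolling-window setup, these features would be recomputed on each window of past prices before producing the one-step-ahead forecast.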

Journal ArticleDOI
01 Feb 2023-Sensors
TL;DR: In this paper, the three best-in-class models in terms of interdataset generalization were evaluated on CSE-CIC-IDS2018, a state-of-the-art intrusion detection dataset.
Abstract: Recently proposed methods in intrusion detection are iterating on machine learning methods as a potential solution. These novel methods are validated on one or more datasets from a sparse collection of academic intrusion detection datasets. Their recognition as improvements to the state-of-the-art is largely dependent on whether they can demonstrate a reliable increase in classification metrics compared to similar works validated on the same datasets. Whether these increases are meaningful outside of the training/testing datasets is rarely asked and never investigated. This work aims to demonstrate that strong general performance does not typically follow from strong classification on the current intrusion detection datasets. Binary classification models from a range of algorithmic families are trained on the attack classes of CSE-CIC-IDS2018, a state-of-the-art intrusion detection dataset. After establishing baselines for each class at various points of data access, the same trained models are tasked with classifying samples from the corresponding attack classes in CIC-IDS2017, CIC-DoS2017 and CIC-DDoS2019. Contrary to what the baseline results would suggest, the models have rarely learned a generally applicable representation of their attack class. Stability and predictability of generalized model performance are central issues for all methods on all attack classes. Focusing only on the three best-in-class models in terms of interdataset generalization reveals that for network-centric attack classes (brute force, denial of service and distributed denial of service), general representations can be learned with flat losses in classification performance (precision and recall) below 5%. Other attack classes vary in generalized performance from stark losses in recall (−35%) with intact precision (98+%) for botnets to total degradation of precision and moderate recall loss for Web attack and infiltration models.
The core conclusion of this article is a warning to researchers in the field. Expecting results of proposed methods on the test sets of state-of-the-art intrusion detection datasets to translate to generalized performance is likely a serious overestimation. Four proposals to reduce this overestimation are set out as future work directions.
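The interdataset comparison described above reduces to scoring the same trained model twice: once on its own dataset's test split and once on samples from a different dataset, then comparing precision and recall. A minimal helper (with invented toy labels, not the paper's data) illustrates the gap computation.

```python
def precision_recall(y_true, y_pred):
    """Binary precision and recall from parallel 0/1 label lists."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Same model, two evaluations: in-distribution vs. cross-dataset (toy labels).
p_in, r_in = precision_recall([1, 1, 0, 0], [1, 1, 0, 0])  # own test split
p_x, r_x = precision_recall([1, 1, 1, 0], [1, 0, 0, 0])    # other dataset
print(p_in - p_x, r_in - r_x)  # the generalization gap in precision, recall
```

In the article's terms, a model with a small gap on both metrics has learned a generally applicable representation of its attack class; a large recall gap with intact precision matches the botnet case it reports.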


Journal ArticleDOI
TL;DR: In this paper, an interpretable machine learning (ML) fire model (AttentionFire_v1.0) was developed to resolve the complex controls of climate and human activities on burned areas and to better predict burned areas over ASA regions.
Abstract: African and South American (ASA) wildfires account for more than 70 % of global burned areas and have strong connection to local climate for sub-seasonal to seasonal wildfire dynamics. However, representation of the wildfire–climate relationship remains challenging due to spatiotemporally heterogeneous responses of wildfires to climate variability and human influences. Here, we developed an interpretable machine learning (ML) fire model (AttentionFire_v1.0) to resolve the complex controls of climate and human activities on burned areas and to better predict burned areas over ASA regions. Our ML fire model substantially improved predictability of burned areas for both spatial and temporal dynamics compared with five commonly used machine learning models. More importantly, the model revealed strong time-lagged control from climate wetness on the burned areas. The model also predicted that, under a high-emission future climate scenario, the recently observed declines in burned area will reverse in South America in the near future due to climate changes. Our study provides a reliable and interpretable fire model and highlights the importance of lagged wildfire–climate relationships in historical and future predictions.


Posted ContentDOI
30 Jan 2023-bioRxiv
TL;DR: Couce et al. used saturated, genome-wide insertion libraries to quantify how the fitness effects of new mutations changed in two E. coli populations that adapted to a constant environment for 15,000 generations.
Abstract: The distribution of fitness effects of new mutations is central to predicting adaptive evolution, but observing how it changes as organisms adapt is challenging. Here we use saturated, genome-wide insertion libraries to quantify how the fitness effects of new mutations changed in two E. coli populations that adapted to a constant environment for 15,000 generations. The proportions of neutral and deleterious mutations remained constant, despite large fitness gains. In contrast, the beneficial fraction declined rapidly, approximating an exponential distribution, with strong epistasis profoundly changing the genetic identity of adaptive mutations. Despite this volatility, many important targets of selection were predictable from the ancestral distribution. This predictability occurs because genetic target size contributed to the fixation of beneficial mutations as much as or more than their effect sizes. Overall, our results demonstrate that short-term adaptation can be idiosyncratic but empirically predictable, and that long-term dynamics can be described by simple statistical principles. One-Sentence Summary: Couce et al. demonstrate that short-term bacterial adaptation is predictable at the scale of individual genes, while long-term adaptation is predictable at a global scale.

Journal ArticleDOI
TL;DR: In this paper, the authors focus on the detrimental effect that even state-of-the-art AI and ML systems could have on the validity of national exams of secondary education, and how lower validity would negatively affect trust in the system.
Abstract: This article considers the challenges of using artificial intelligence (AI) and machine learning (ML) to assist high-stakes standardised assessment. It focuses on the detrimental effect that even state-of-the-art AI and ML systems could have on the validity of national exams of secondary education, and how lower validity would negatively affect trust in the system. To reach this conclusion, three unresolved issues in AI (unreliability, low explainability and bias) are addressed, to show how each of them would compromise the interpretations and uses of exam results (i.e., exam validity). Furthermore, the article relates validity to trust, and specifically to the ABI+ model of trust. Evidence gathered as part of exam validation supports each of the four trust-enabling components of the ABI+ model (ability, benevolence, integrity and predictability). It is argued, therefore, that the three AI barriers to exam validity limit the extent to which an AI-assisted exam system could be trusted. The article suggests that addressing the issues of AI unreliability, low explainability and bias should be sufficient to put AI-assisted exams on par with traditional ones, but might not go as far as fully reassuring the public. To achieve this, it is argued that changes to the quality assurance mechanisms of the exam system will be required. This may involve, for example, integrating principled AI frameworks in assessment policy and regulation.


Journal ArticleDOI
TL;DR: In this article, the authors used feedforward and recurrent networks to predict the direction of monthly stock price movements, with past price data as predictors, and found statistically significant directional predictability for selected assets.
Abstract: The rapidly growing neural network literature continually reports successful stock price forecasting results. Many of these studies use relatively short evaluation periods, spanning only a couple of years. In this paper, sustainability of the neural network forecast quality over the long term is analysed. Feedforward and recurrent networks are used to predict the direction of monthly stock price movements, with past price data as predictors. The analysis is conducted on the NYSE stocks over the 1971–2015 period and all the evaluations are performed out-of-sample. Statistically significant directional predictability for selected assets is found. However, the trading simulations reveal that directional predictability does not guarantee trading performance better than the benchmark buy-and-hold strategy. The opportunities for investors to use the tested models for profit appear to be episodic and periodically enhanced e.g. in periods of recession.
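The gap the abstract highlights, between directional predictability and trading performance, can be made concrete with a toy simulation: a strategy that goes long only when its signal is positive, compared against buy-and-hold. The returns below are invented for illustration and `strategy_return` is a hypothetical helper, not the paper's simulation code.

```python
def strategy_return(returns, signals):
    """Compound return of going long only when the signal is +1,
    staying in cash (zero return) otherwise."""
    total = 1.0
    for r, s in zip(returns, signals):
        total *= (1 + r) if s == 1 else 1.0
    return total - 1.0

returns = [0.05, -0.02, 0.03]                          # toy monthly returns
print(round(strategy_return(returns, [1, 1, 1]), 4))   # 0.0599 (buy-and-hold)
print(round(strategy_return(returns, [1, 0, 1]), 4))   # 0.0815 (sidesteps loss)
```

A realistic version would also subtract transaction costs on each signal change, which is one reason statistically significant direction calls need not beat buy-and-hold.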

Journal ArticleDOI
TL;DR: Li et al. developed a conditional past return (CPR) indicator that adds direction information for investors' consistent belief, and examined the effectiveness of CPR as a predictor of stock market returns.

Journal ArticleDOI
TL;DR: In this article, the authors quantified the volatility spillover impact and the directional predictability from stock market indexes to Bitcoin and found that only six stock markets are powerful predictors of Bitcoin return in the short term.
Abstract: Purpose: This paper aims to quantify the volatility spillover impact and the directional predictability from stock market indexes to Bitcoin. Design/methodology/approach: Daily data of 15 developed and 15 emerging stock markets are used for the period March 2017–December 2021. The author uses a vector autoregressive (VAR) model, Granger causality tests and impulse response functions (IRF) to estimate the results of the study. Findings: Empirical results show a significant unidirectional volatility spillover impact from emerging markets to Bitcoin, and only six stock markets are powerful predictors of Bitcoin return in the short term. Additionally, there is no difference between developed and developing markets regarding directional predictability; however, there is a difference in the reaction of Bitcoin return to shocks in the emerging markets compared to developed ones. Originality/value: The paper proposes different econometric techniques from prior research and presents a comparative analysis between developed and emerging markets.

Journal ArticleDOI
TL;DR: This article examined the predictability of macroeconomic variables in China over a long period and found that five of nine macroeconomic factors have superior in- and out-of-sample predictability at monthly and longer horizons.

Journal ArticleDOI
TL;DR: This article decomposes the earnings yield into smoothing and residual components, which form a powerful predictor of dividend growth, showing significant in-sample predictive power for aggregate dividend growth at both monthly and annual frequencies over several forecast horizons.

Journal ArticleDOI
TL;DR: The authors used a neural language model, trained to compute the conditional probability of any word from the words that precede it, to operationalize contextual predictability via an information-theoretic construct known as surprisal.
Abstract: Theoretical accounts of the N400 are divided as to whether the amplitude of the N400 response to a stimulus reflects the extent to which the stimulus was predicted, the extent to which the stimulus is semantically similar to its preceding context, or both. We use state-of-the-art machine learning tools to investigate which of these three accounts is best supported by the evidence. GPT-3, a neural language model (LM) trained to compute the conditional probability of any word based on the words that precede it, was used to operationalize contextual predictability. In particular, we used an information theoretical construct known as surprisal (the negative logarithm of the conditional probability). Contextual semantic similarity was operationalized by using two high-quality co-occurrence-derived vector-based meaning representations for words: GloVe and fastText. The cosine between the vector representation of the sentence frame and final word was used to derive Contextual Cosine Similarity (CCS) estimates. A series of regression models were constructed, where these variables, along with cloze probability and plausibility ratings, were used to predict single trial N400 amplitudes recorded from healthy adults as they read sentences whose final word varied in its predictability, plausibility, and semantic relationship to the likeliest sentence completion. Statistical model comparison indicated GPT-3 surprisal provided the best account of N400 amplitude and suggested that apparently disparate N400 effects of expectancy, plausibility and contextual semantic similarity can be reduced to variations in the predictability of words. The results are argued to support predictive coding in the human language network.
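The two predictor quantities the study compares can be written down in a few lines. The sketch below substitutes toy probabilities and vectors for GPT-3 conditional probabilities and GloVe/fastText embeddings; it illustrates the definitions only, not the regression analysis.

```python
import math

def surprisal(p: float) -> float:
    """Surprisal in bits: the negative log of a word's conditional
    probability. Rarer continuations carry higher surprisal."""
    return -math.log2(p)

def cosine(u, v):
    """Cosine similarity between two word vectors, as used for the
    Contextual Cosine Similarity (CCS) estimates."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

print(surprisal(0.25))          # 2.0 bits for a p=0.25 continuation
print(cosine([1, 0], [0, 1]))   # 0.0 for orthogonal toy vectors
```

In the study, per-word surprisal from GPT-3 and CCS from GloVe/fastText entered regression models side by side as competing predictors of single-trial N400 amplitude.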

Journal ArticleDOI
23 Mar 2023-Axioms
TL;DR: In this article, the Renyi entropy of the residual lifetime of a coherent system when all system components have survived to a time t is defined, and several findings, including bounds and ordering properties, are studied for this entropy.
Abstract: The measurement of uncertainty across the lifetimes of engineering systems has drawn more attention in recent years. It is a helpful metric for assessing how predictable a system’s lifetime is. In these circumstances, Renyi entropy, a Shannon entropy extension, is particularly appealing. In this paper, we develop the system signature to give an explicit formula for the Renyi entropy of the residual lifetime of a coherent system when all system components have lived to a time t. In addition, several findings are studied for the aforementioned entropy, including the bounds and order characteristics. It is possible to compare the residual lifespan predictability of two coherent systems with known signatures using the findings of this study.
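For readers unfamiliar with the Renyi family: a toy discrete version of the entropy is easy to state. The paper itself works with the continuous residual-lifetime version via system signatures, which this sketch does not reproduce.

```python
import math

def renyi_entropy(probs, alpha):
    """Renyi entropy (in bits) of order alpha for a discrete distribution;
    reduces to Shannon entropy in the limit alpha -> 1 (handled here as a
    special case)."""
    if alpha == 1:
        return -sum(p * math.log2(p) for p in probs if p > 0)
    return math.log2(sum(p ** alpha for p in probs)) / (1 - alpha)

uniform = [0.25] * 4
print(renyi_entropy(uniform, 2))    # 2.0: maximal uncertainty over 4 outcomes
print(renyi_entropy([0.9, 0.1], 2)) # lower: a peaked, more predictable system
```

Intuitively, a system whose residual lifetime distribution has low Renyi entropy is more predictable, which is the comparison the paper's signature-based bounds enable.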

Journal ArticleDOI
TL;DR: In this article, the authors use a high-frequency data set of limit order book snapshots from the foreign exchange spot market to develop and test a methodology to assess the feasibility, and hence potential prevalence, of cross-market spoofing.

Journal ArticleDOI
TL;DR: In this article, the principal component analysis-based particle swarm optimization algorithm (PPSO) is applied in the Geophysical Fluid Dynamics Laboratory Climate Model version 2p1 (GFDL CM2p1) to obtain the optimal precursors (OPRs) for CP El Niño events, based on the conditional nonlinear optimal perturbation (CNOP) method.
Abstract: Compared to the canonical eastern Pacific El Niño, the understanding and ability to predict the central Pacific (CP) type event still need further improvement. In this study, the principal component analysis-based particle swarm optimization algorithm (PPSO) is applied in the Geophysical Fluid Dynamics Laboratory Climate Model version 2p1 (GFDL CM2p1) to obtain the optimal precursors (OPRs) for CP El Niño events, based on the conditional nonlinear optimal perturbation (CNOP) method. For this, three normal years with neither El Niño nor La Niña events, i.e., three cases, are chosen as the reference states. The obtained OPRs for these cases exhibit a consistent positive sea surface temperature (SST) perturbation distribution in the subtropical Northern Pacific (20°–40°N, 175°E–140°W), which is further proven to be crucial for the evolution of CP El Niño based on the northern and southern hemisphere significance test results. Mechanistically, these positive SST perturbations are enhanced and reach the equatorial Pacific via wind-evaporation-SST (WES) feedback to evolve into a CP El Niño at the end of the year. The nonlinear approach is adopted to investigate the predictability of CP El Niño events and can shed some light on future studies.