Showing papers on "Linear model" published in 2023

Open Access

Journal Article•DOI•

Decoding of the speech envelope from EEG using the VLAAI deep neural network

Bernd Accou¹, Bernd Accou², Pei‐Li Yu¹, Pei‐Li Yu²•Institutions (2)

University of Copenhagen Faculty of Science¹, University of Copenhagen²

16 Jan 2023-Dental science reports

TL;DR: This paper proposes the Very Large Augmented Auditory Inference (VLAAI) network for decoding the speech envelope from EEG; it outperformed state-of-the-art subject-independent models (median Pearson correlation of 0.19, p < 0.001).

Abstract: To investigate the processing of speech in the brain, simple linear models are commonly used to establish a relationship between brain signals and speech features. However, these linear models are ill-equipped to model a highly dynamic, complex non-linear system like the brain, and they often require a substantial amount of subject-specific training data. This work introduces a novel speech decoder architecture: the Very Large Augmented Auditory Inference (VLAAI) network. The VLAAI network outperformed state-of-the-art subject-independent models (median Pearson correlation of 0.19, p < 0.001), yielding an increase over the well-established linear model by 52%. Using ablation techniques, we identified the relative importance of each part of the VLAAI network and found that the non-linear components and output context module influenced model performance the most (10% relative performance increase). Subsequently, the VLAAI network was evaluated on a holdout dataset of 26 subjects and a publicly available unseen dataset to test generalization for unseen subjects and stimuli. No significant difference was found between the default test set and the holdout subjects, or between the default test set and the public dataset. The VLAAI network also significantly outperformed all baseline models on the public dataset. We evaluated the effect of training set size by training the VLAAI network on data from 1 up to 80 subjects and evaluating on 26 holdout subjects, revealing a relationship following a hyperbolic tangent function between the number of subjects in the training set and the performance on unseen subjects. Finally, the subject-independent VLAAI network was finetuned for 26 holdout subjects to obtain subject-specific VLAAI models. With 5 minutes of data or more, a significant performance improvement was found, up to 34% (from 0.18 to 0.25 median Pearson correlation) relative to the subject-independent VLAAI network.
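
The "well-established linear model" that the abstract uses as a baseline is typically a regularised linear map from time-lagged EEG to the speech envelope, scored with the Pearson correlation. The sketch below illustrates that general idea on synthetic data; it is not the VLAAI network or the authors' code. The function names, the toy signals, the number of lags, and the ridge penalty `alpha` are all assumptions made for illustration.

```python
import numpy as np


def lag_matrix(eeg, n_lags):
    """Stack time-lagged copies of each EEG channel.

    eeg has shape (n_samples, n_channels). Column block k of the result
    holds the EEG shifted k samples into the future, because the brain
    response follows the stimulus. The tail of each block is zero-padded.
    """
    n_samples, n_channels = eeg.shape
    lagged = np.zeros((n_samples, n_channels * n_lags))
    for k in range(n_lags):
        lagged[: n_samples - k, k * n_channels:(k + 1) * n_channels] = eeg[k:]
    return lagged


def fit_ridge_decoder(eeg, envelope, n_lags=8, alpha=1.0):
    """Closed-form ridge regression from lagged EEG to the envelope."""
    X = lag_matrix(eeg, n_lags)
    gram = X.T @ X + alpha * np.eye(X.shape[1])
    return np.linalg.solve(gram, X.T @ envelope)


def pearson(a, b):
    """Pearson correlation between two 1-D signals."""
    return float(np.corrcoef(a, b)[0, 1])


# Hypothetical toy data: each channel is a delayed envelope plus noise.
rng = np.random.default_rng(0)
n_samples, n_channels = 2000, 4
envelope = rng.standard_normal(n_samples)
eeg = np.zeros((n_samples, n_channels))
for c in range(n_channels):
    delay = c + 1
    eeg[delay:, c] = envelope[:-delay]
eeg += 0.5 * rng.standard_normal(eeg.shape)

# Fit on the first part of the recording, score on the rest.
train, test = slice(0, 1500), slice(1500, 2000)
weights = fit_ridge_decoder(eeg[train], envelope[train])
reconstruction = lag_matrix(eeg[test], 8) @ weights
score = pearson(reconstruction, envelope[test])
print(f"Pearson correlation on held-out samples: {score:.2f}")
```

Real EEG gives far lower correlations than this toy signal, which is why the abstract reports medians of around 0.2.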

3 citations

Journal Article•DOI•

Non-linear predictor outcome associations

Frederick K. Ho, Tim J Cole

01 Jan 2023-BMJ Medicine

TL;DR: In this paper, the authors discuss the consequences of ignoring non-linear associations between predictors and outcomes.

Abstract: Ignoring non-linear association can

2 citations

Journal Article•DOI•

Features from the photoplethysmogram and the electrocardiogram for estimating changes in blood pressure

Eoin Finnegan, Shaun Davidson, Mirae Harford, Peter J. Watkins, Lionel Tarassenko, Mauricio Villarroel

18 Jan 2023-Dental science reports

TL;DR: In this paper, the authors investigated the best features available from the photoplethysmogram (PPG) and electrocardiogram (ECG) for blood pressure (BP) estimation using both linear and non-linear machine learning models.

Abstract: There is a growing emphasis on the potential for cuffless blood pressure (BP) estimation through modelling of morphological features from the photoplethysmogram (PPG) and electrocardiogram (ECG). However, the appropriate features and models to use remain unclear. We investigated the best features available from the PPG and ECG for BP estimation using both linear and non-linear machine learning models. We conducted a clinical study in which changes in BP (ΔBP) were induced by an infusion of phenylephrine in 30 healthy volunteers (53.8% female, 28.0 (9.0) years old). We extracted a large and diverse set of features from both the PPG and the ECG and assessed their individual importance for estimating ΔBP through Shapley additive explanation values and a ranking coefficient. We trained, tuned, and evaluated linear (ordinary least squares, OLS) and non-linear (random forest, RF) machine learning models to estimate ΔBP in a nested leave-one-subject-out cross-validation framework. We reported the results as correlation coefficient ([Formula: see text]), root mean squared error (RMSE), and mean absolute error (MAE). The non-linear RF model significantly ([Formula: see text]) outperformed the linear OLS model using both the PPG and the ECG signals across all performance metrics. Estimating ΔSBP using the PPG alone ([Formula: see text] = 0.86 (0.23), RMSE = 5.66 (4.76) mmHg, MAE = 4.86 (4.29) mmHg) performed significantly better than using the ECG alone ([Formula: see text] = 0.69 (0.45), RMSE = 6.79 (4.76) mmHg, MAE = 5.28 (4.57) mmHg), all [Formula: see text]. The highest ranking features from the PPG largely modelled increasing reflected wave interference driven by changes in arterial stiffness. This finding was supported by changes observed in the PPG waveform in response to the phenylephrine infusion. However, a large number of features were required for accurate BP estimation, highlighting the high complexity of the problem. We conclude that the PPG alone may be further explored as a potential single-source, cuffless blood pressure estimator. The use of the ECG alone is not justified. Non-linear models may perform better as they are able to incorporate interactions between feature values and demographics. However, demographics may not adequately account for the unique and individualised relationship between the extracted features and BP.
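
The leave-one-subject-out comparison of OLS and random forest described above can be sketched with scikit-learn as follows. This is an illustration on synthetic data, not the study's pipeline. It omits the inner tuning loop of the nested scheme, and the feature matrix, the interaction-based target, and the helper name `loso_mae` are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import LeaveOneGroupOut


def loso_mae(model, X, y, subjects):
    """Mean absolute error for each held-out subject (leave-one-subject-out)."""
    errors = []
    for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=subjects):
        model.fit(X[train_idx], y[train_idx])
        predictions = model.predict(X[test_idx])
        errors.append(mean_absolute_error(y[test_idx], predictions))
    return np.array(errors)


# Hypothetical data: 6 subjects, 40 samples each, 3 features.
rng = np.random.default_rng(0)
n_subjects, n_per_subject = 6, 40
subjects = np.repeat(np.arange(n_subjects), n_per_subject)
X = rng.uniform(-1, 1, size=(n_subjects * n_per_subject, 3))
# The target contains an interaction term that an additive linear model cannot represent.
y = 10 * X[:, 0] * X[:, 1] + 2 * X[:, 2] + rng.normal(0, 0.5, size=len(X))

ols_mae = loso_mae(LinearRegression(), X, y, subjects)
rf_mae = loso_mae(RandomForestRegressor(n_estimators=50, random_state=0), X, y, subjects)
print("OLS MAE per subject:", np.round(ols_mae, 2))
print("RF MAE per subject: ", np.round(rf_mae, 2))
```

Grouping the folds by subject keeps every sample from the held-out person out of training, which is what makes the error an estimate for unseen subjects.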

2 citations

Journal Article•DOI•

A sparse additive model for high-dimensional interactions with an exposure variable

Sahir Bhatnagar¹•Institutions (1)

McGill University¹

01 Mar 2023

TL;DR: In this paper, a method called sail is proposed for detecting non-linear interactions with a key environmental or exposure variable in high-dimensional settings while respecting the strong or weak heredity constraints.

Abstract: A conceptual paradigm for onset of a new disease is often considered to be the result of changes in entire biological networks whose states are affected by a complex interaction of genetic and environmental factors. However, when modeling a relevant phenotype as a function of high dimensional measurements, power to estimate interactions is low, the number of possible interactions could be enormous and their effects may be non-linear. A method called sail for detecting non-linear interactions with a key environmental or exposure variable in high-dimensional settings which respects the strong or weak heredity constraints is proposed. It is proven that asymptotically, sail possesses the oracle property, i.e., it performs as well as if the true model were known in advance. A computationally efficient fitting algorithm with automatic tuning parameter selection, which scales to high-dimensional datasets is proposed. Simulation results show that sail outperforms existing penalized regression methods in terms of prediction accuracy and support recovery when there are non-linear interactions with an exposure variable. sail is applied to detect non-linear interactions between genes and a prenatal psychosocial intervention program on cognitive performance in children at 4 years of age. Results show that individuals who are genetically predisposed to lower educational attainment are those who stand to benefit the most from the intervention. The proposed algorithms are implemented in an R package available on CRAN (https://cran.r-project.org/package=sail).

1 citation

Journal Article•DOI•

Prediction of linear model on stunting prevalence with machine learning approach

Mambang Mambang, Finki Dona Marleny, Muhammad Zulfadhilah

01 Feb 2023-Bulletin of Electrical Engineering and Informatics

TL;DR: In this paper, the authors tested scikit-learn linear models and found that polynomial regression with a pipeline performed best, with MAPE 0.02, root mean square error (RMSE) 3.32, and coefficient of determination (R²) 1.00 for a minimum variable of 19.

Abstract: An increase in the number of residents should be anticipated, including in the health sector, especially the problem of stunting. Stunting in children disrupts height and lack of absorption of nutrients. Information and data drive change in many areas such as health, entertainment, economics, business, and other strategic areas. The stages carried out in this study are initiating, developing linear models, and making prediction results on linear machine learning models. Testing with the scikit-learn linear model with a minimum variable of 19 gave the best results for the polynomial regression with pipeline model, with mean absolute percentage error (MAPE) 0.02, root mean square error (RMSE) 3.32, and coefficient of determination (R²) 1.00. Testing with a maximum variable of 48 gave the best results for the polynomial regression with pipeline model, with MAPE 0.00, RMSE 3.79, and R² 1.00. Testing with an average variable of 32 gave the best results for the polynomial regression model, with MAPE 0.01, RMSE 3.32, and R² 1.00. Across the minimum, maximum, and average variables, the best test results were obtained with the polynomial regression with pipeline model.
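
A "polynomial regression with pipeline" model in scikit-learn is usually a `PolynomialFeatures` step chained to `LinearRegression`. The sketch below shows that construction and the three reported metrics on made-up data. The data, the polynomial degree, and the variable names are assumptions, and the paper's own configuration is not given in the abstract.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_percentage_error, r2_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical data: a quadratic trend with a little noise.
rng = np.random.default_rng(0)
x = np.linspace(1, 10, 60).reshape(-1, 1)
y = 3.0 + 2.0 * x.ravel() + 0.5 * x.ravel() ** 2 + rng.normal(0, 0.2, 60)

# The pipeline expands x into [1, x, x^2] and then fits ordinary least squares.
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(x, y)
pred = model.predict(x)

mape = mean_absolute_percentage_error(y, pred)  # a fraction, not a percentage
rmse = float(np.sqrt(np.mean((y - pred) ** 2)))
r2 = r2_score(y, pred)
print(f"MAPE={mape:.3f}  RMSE={rmse:.3f}  R2={r2:.4f}")
```

The metrics here are computed on the training data for brevity. Assessing generalisation would need a held-out split.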

1 citation

Posted Content•DOI•

Statistical power and false positive rates for interdependent outcomes are strongly influenced by test type: Implications for behavioral neuroscience

Michelle Frankot, Peyton Mueller, Michael E. Young, Cole Vonder Haar

01 May 2023-bioRxiv

TL;DR: In this article, Monte Carlo methods were used to simulate behavioral data for a task with four interdependent choices (i.e., increased choice of a given outcome decreases others); 16,000 datasets were simulated (1,000 each of 4 effect sizes by 4 sample sizes).

Abstract: Statistical errors in preclinical science are a barrier to reproducibility and translation. For instance, linear models (e.g., ANOVA, linear regression) may be misapplied to data that violate assumptions. In behavioral neuroscience and psychopharmacology, linear models are frequently applied to interdependent or compositional data, which includes behavioral assessments where animals concurrently choose between chambers, objects, outcomes, or types of behavior (e.g., forced swim, novel object, place/social preference). The current study simulated behavioral data for a task with four interdependent choices (i.e., increased choice of a given outcome decreases others) using Monte Carlo methods. 16,000 datasets were simulated (1,000 each of 4 effect sizes by 4 sample sizes) and statistical approaches evaluated for accuracy. Linear regression and linear mixed effects regression (LMER) with a single random intercept resulted in high false positives (>60%). Elevated false positives were attenuated in an LMER with random effects for all choice-levels and a binomial logistic mixed effects regression. However, these models were underpowered to reliably detect effects at common preclinical sample sizes. A Bayesian method using prior knowledge for control subjects increased power by up to 30%. These results were confirmed in a second simulation (8,000 datasets). These data suggest that statistical analyses may often be misapplied in preclinical paradigms, with common linear methods increasing false positives, but potential alternatives lacking power. Ultimately, using informed priors may balance statistical requirements with ethical imperatives to minimize the number of animals used. These findings highlight the importance of considering statistical assumptions and limitations when designing research studies.

1 citation

Journal Article•DOI•

A versatile workflow for linear modelling in R

Matteo Santon, Fränzi Korner-Nievergelt, Nico K. Michiels, Nils Anthes

24 Apr 2023-Frontiers in Ecology and Evolution

TL;DR: In this article, the authors present a generic R-workflow template that facilitates (Generalized) Linear (Mixed) Model analyses, guiding users from data exploration through model formulation, assessment and refinement to the graphical and numerical presentation of results.

Abstract: Linear models are applied widely to analyse empirical data. Modern software allows implementation of linear models with a few clicks or lines of code. While convenient, this increases the risk of ignoring essential assessment steps. Indeed, inappropriate application of linear models is an important source of inaccurate statistical inference. Despite extensive guidance and detailed demonstration of exemplary analyses, many users struggle to implement and assess their own models. To fill this gap, we present a versatile R-workflow template that facilitates (Generalized) Linear (Mixed) Model analyses. The script guides users from data exploration through model formulation, assessment and refinement to the graphical and numerical presentation of results. The workflow accommodates a variety of data types, distribution families, and dependency structures that arise from hierarchical sampling. To apply the routine, minimal coding skills are required for data preparation, naming of variables of interest, linear model formulation, and settings for summary graphs. Beyond that, default functions are provided for visual data exploration and model assessment. Focused on graphs, model assessment offers qualitative feedback and guidance on model refinement, pointing to more detailed or advanced literature where appropriate. With this workflow, we hope to contribute to research transparency, comparability, and reproducibility.

1 citation

Journal Article•DOI•

Expert Algorithm for Substance Identification Using Mass Spectrometry: Statistical Foundations in Unimolecular Reaction Rate Theory

Glen P. Jackson, Samantha A. Mehnert, J. Tyler Davidson

31 May 2023-Journal of the American Society for Mass Spectrometry

TL;DR: In this article, a general linear regression model was used to predict ion abundances of cocaine using the 20 most abundant fragments in a database of 128 training spectra collected over 6 months in an operational crime laboratory.

1 citation

Journal Article•DOI•

An adjusted coefficient of determination (R²) for generalized linear mixed models in one go.

Hans-Peter Piepho

01 May 2023-Biometrical Journal

TL;DR: In this paper, the author proposes a new coefficient of determination (R²) for generalized linear mixed models that accounts for correlation among the observed responses and requires only the fit of the model under consideration, with no need to also fit a null model.

Abstract: The coefficient of determination (R²) is a common measure of goodness of fit for linear models. Various proposals have been made for extension of this measure to generalized linear and mixed models. When the model has random effects or correlated residual effects, the observed responses are correlated. This paper proposes a new coefficient of determination for this setting that accounts for any such correlation. A key advantage of the proposed method is that it only requires the fit of the model under consideration, with no need to also fit a null model. Also, the approach entails a bias correction in the estimator assessing the variance explained by fixed effects. Three examples are used to illustrate the new measure. A simulation shows that the proposed estimator of the new coefficient of determination has only minimal bias.
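
For reference, the classical R² and adjusted R² for an ordinary linear model with independent errors can be computed as below. This is the familiar baseline that the paper sets out to extend, not the new measure it proposes for correlated responses. The function name and the example numbers are assumptions for illustration.

```python
import numpy as np


def r_squared(y, fitted, n_params):
    """Return (R2, adjusted R2) for a linear model that includes an intercept.

    n_params counts every estimated coefficient, the intercept included.
    """
    y = np.asarray(y, dtype=float)
    fitted = np.asarray(fitted, dtype=float)
    ss_res = np.sum((y - fitted) ** 2)      # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)    # total sum of squares about the mean
    n = y.size
    r2 = 1.0 - ss_res / ss_tot
    # The adjustment compares variance estimates, each divided by its degrees of freedom.
    adjusted = 1.0 - (ss_res / (n - n_params)) / (ss_tot / (n - 1))
    return r2, adjusted


# Hypothetical observations and fitted values from a straight-line fit.
r2, adjusted = r_squared([1, 2, 3, 4, 5], [1.1, 1.9, 3.2, 3.8, 5.0], n_params=2)
print(f"R2={r2:.4f}  adjusted R2={adjusted:.4f}")
```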

1 citation

Posted Content•DOI•

Confidence Intervals and Regions for Proportions Under Various Three-Endmember Linear Mixture Models

Katherine Schmirler¹•Institutions (1)

NICTA¹

24 Jan 2023

TL;DR: In this article, the author shows how to produce confidence intervals (CIs) and joint confidence regions (JCRs) for the proportions associated with various linear mixture models, assuming that the coefficients in the model are non-negative.

Abstract: Many papers in recent years have been devoted to estimating the per pixel proportions of three broad classes of materials (e.g. photosynthetic vegetation, non-photosynthetic vegetation and bare soil) using data from multispectral sensors. Many of these papers use estimation methods based on the linear mixture model. Very few of these papers assess the accuracy of their estimators. I show how to produce confidence intervals (CIs) and joint confidence regions (JCRs) for the proportions associated with various linear mixture models. There are two main models, both of which assume that the coefficients in the model are non-negative. The first model assumes that the coefficients sum to 1. The second does not, but uses rescaling of the estimated coefficients to produce estimated proportions. Three variants of these two models are also analysed. JCRs are shown to be particularly informative, because they are typically better at localising the information than CIs are. The methodology is illustrated using examples from Landsat Thematic Mapper data at 1169 locations across Australia, each of which has associated field observations. There is also discussion about the extent to which the methodology can be extended to hyperspectral data.
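
The second model described above (non-negative coefficients that are rescaled to give proportions) can be sketched with SciPy's non-negative least squares routine. This shows only the point estimate, not the confidence intervals or regions that the paper derives. The endmember spectra, the pixel, and the function name `unmix` are hypothetical.

```python
import numpy as np
from scipy.optimize import nnls


def unmix(endmembers, pixel):
    """Estimate endmember proportions for one pixel.

    endmembers has shape (n_bands, n_endmembers), one spectrum per column.
    Non-negative coefficients are fitted by least squares and then rescaled
    so that they sum to 1.
    """
    coefficients, _residual_norm = nnls(endmembers, pixel)
    total = coefficients.sum()
    if total == 0:
        raise ValueError("all coefficients are zero; proportions are undefined")
    return coefficients / total


# Hypothetical reflectance spectra for three materials over six bands.
endmembers = np.array([
    [0.05, 0.20, 0.15],
    [0.08, 0.25, 0.20],
    [0.04, 0.30, 0.25],
    [0.50, 0.35, 0.30],
    [0.30, 0.45, 0.40],
    [0.15, 0.40, 0.45],
])
true_proportions = np.array([0.5, 0.3, 0.2])

# A noise-free mixed pixel, and the same pixel dimmed by a brightness factor.
pixel = endmembers @ true_proportions
proportions = unmix(endmembers, pixel)
dimmed_proportions = unmix(endmembers, 0.8 * pixel)
print(np.round(proportions, 3), np.round(dimmed_proportions, 3))
```

Because the coefficients are rescaled after fitting, an overall brightness change in the pixel leaves the estimated proportions unchanged.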

1 citation

Journal Article•DOI•

Statistical power and false positive rates for interdependent outcomes are strongly influenced by test type: Implications for behavioral neuroscience

Amitava Dutt¹•Institutions (1)

Ohio State University¹

04 May 2023-Neuropsychopharmacology

TL;DR: In this paper, Monte Carlo methods were used to simulate behavioral data for a task with four interdependent choices (i.e., increased choice of a given outcome decreases others); 16,000 datasets were simulated (1,000 each of 4 effect sizes by 4 sample sizes) and statistical approaches were evaluated for accuracy.

Journal Article•DOI•

A Data-Driven Linear Optimal Power Flow Model for Distribution Networks

01 Jan 2023-IEEE Transactions on Power Systems

TL;DR: In this paper, a data-driven linear power flow (PF) model incorporating the KCL constraints is proposed; it can be embedded in optimal power flow (OPF) calculations for distribution networks (DNs) and is robust against bad data in measurements.

Abstract: The linearized power flow (PF) model is mainly used to make the optimal power flow (OPF) problem convex. However, existing data-driven linear PF models are not applicable to OPF calculation because the Kirchhoff's current law (KCL) constraints are neglected. In this letter, we propose a data-driven linear PF model that incorporates the KCL constraints and can be embedded in OPF for distribution networks (DNs). By combining the support vector regression (SVR) and ridge regression (RR) algorithms, the proposed method is robust against bad data in measurements. Numerical tests show that the proposed model has much higher accuracy than the existing linear models, especially for OPF calculation.

Journal Article•DOI•

Bayesian model averaging to improve the yield prediction in wheat breeding trials

Roger Conover¹, Diptonil Banerjee•Institutions (1)

Harbin University of Science and Technology¹

01 Jan 2023-Agricultural and Forest Meteorology

TL;DR: In this article, the authors proposed a novel wheat yield prediction framework based on canopy hyperspectral reflectance (350-2500 nm) and adopted the ensemble Bayesian model averaging (EBMA) method to improve model performance.

Journal Article•DOI•

The impact of network connectivity on factor exposures, asset pricing, and portfolio diversification

Nicholas C. Price¹•Institutions (1)

University of Padua¹

01 Mar 2023

TL;DR: In this paper, the authors extend the classic factor-based asset pricing model by including network linkages, leading to a network-augmented linear factor model, which allows a better understanding of the determinants of systematic risk and shows that cross-sectional risk premia can be estimated more precisely.

Abstract: This paper extends the classic factor-based asset pricing model by including network linkages, leading to a network-augmented linear factor model. This extension of the model allows a better understanding of the determinants of systematic risk and shows that cross-sectional risk premia can be estimated more precisely. Moreover, we demonstrate that in the presence of network links a misspecified traditional linear factor model presents residuals that are correlated and heteroskedastic. We support our claims with an extensive simulation experiment and real data.

Journal Article•DOI•

When a joint model should be preferred over a linear mixed model for analysis of longitudinal health-related quality of life data in cancer clinical trials

C. Touraine, B. Cuer, Thierry Conroy, Beata Juzyna, Sophie Gourgou, Caroline Mollevi

10 Feb 2023-BMC Medical Research Methodology

TL;DR: In this paper, the authors investigate, in a clinical trial context, the consequences of using the most frequently used linear mixed model, the random intercept and slope model, rather than its corresponding joint model.

Abstract: Background: Patient-reported outcomes such as health-related quality of life (HRQoL) are increasingly used as endpoints in randomized cancer clinical trials. However, the patients often drop out so that observation of the HRQoL longitudinal outcome ends prematurely, leading to monotone missing data. The patients may drop out for various reasons including occurrence of toxicities, disease progression, or may die. In case of informative dropout, the usual linear mixed model analysis will produce biased estimates. Unbiased estimates cannot be obtained unless the dropout is jointly modeled with the longitudinal outcome, for instance by using a joint model composed of a linear mixed (sub)model linked to a survival (sub)model. Our objective was to investigate in a clinical trial context the consequences of using the most frequently used linear mixed model, the random intercept and slope model, rather than its corresponding joint model. Methods: We first illustrate and compare the models on data of patients with metastatic pancreatic cancer. We then perform a more formal comparison through a simulation study. Results: From the application, we derived hypotheses on the situations in which biases arise and on their nature. Through the simulation study, we confirmed and complemented these hypotheses and provided general explanations of the bias mechanisms. Conclusions: In particular, this article reveals how the linear mixed model fails in the typical situation where poor HRQoL is associated with an increased risk of dropout and the experimental treatment improves survival. Unlike the joint model, in this situation the linear mixed model will overestimate the HRQoL in both arms, but not equally, misestimating the difference between the HRQoL trajectories of the two arms to the disadvantage of the experimental arm.
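
The mechanism behind the overestimation can be illustrated with a small simulation: trajectories follow a random intercept and slope model, and patients with a low underlying HRQoL are more likely to drop out. The sketch compares the mean of the observed data at each visit with the mean that would be seen without dropout. It fits neither a linear mixed model nor a joint model, and every parameter value is an assumption chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_patients, n_visits = 2000, 6
times = np.arange(n_visits)

# Random intercept and slope for each patient (hypothetical values).
intercepts = 60.0 + 10.0 * rng.standard_normal(n_patients)
slopes = -1.0 + 1.0 * rng.standard_normal(n_patients)
latent = intercepts[:, None] + slopes[:, None] * times       # underlying trajectory
hrqol = latent + 3.0 * rng.standard_normal(latent.shape)     # measured score

# Informative dropout: the lower the underlying HRQoL, the higher the dropout risk.
p_drop = 1.0 / (1.0 + np.exp((latent - 45.0) / 5.0))
drops_here = rng.random(latent.shape) < p_drop
drops_here[:, 0] = False                                      # everyone is seen at baseline
observed = np.cumsum(drops_here, axis=1) == 0                 # monotone missingness

full_mean = hrqol.mean(axis=0)
observed_mean = np.array(
    [hrqol[observed[:, j], j].mean() for j in range(n_visits)]
)
bias = observed_mean - full_mean
print("bias of the observed-data mean per visit:", np.round(bias, 2))
```

Because dropout depends on the unobserved trajectory itself, the patients who remain are a healthier-than-average subset, and the gap widens at later visits.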

Book Chapter•DOI•

Potentials based on linear models

Xin Li

01 Jan 2023

TL;DR: In this chapter, linear methods for the construction of machine-learning interaction potentials are reviewed, and two case studies are presented to test the advantages and drawbacks of different linear regression methods.

Abstract: This chapter is dedicated to linear methods for the construction of machine-learning interaction potentials. After a short introduction, we describe different linear regression schemes and explain some of their mathematical foundations. Then, we review the current literature in the domain of machine-learning interaction potentials. Finally, two case studies are presented to test the advantages and drawbacks of different linear regression methods.

Journal Article•DOI•

Review of linear and nonlinear models in breath analysis by Cyranose 320

Maryan Arrieta, Barbara Swanson, Abhinav Bhushan

21 Apr 2023-Journal of Breath Research

TL;DR: In this article, a systematic review was conducted according to the guidelines of the Preferred Reporting Items for Systematic Review and Meta-Analyses using keywords related to e-nose and breath.

Abstract: Analysis of volatile organic compounds (VOCs) in breath specimens has potential for point of care (POC) screening due to ease of sample collection. While the electronic nose (e-nose) is a standard VOC measure across a wide range of industries, it has not been adopted for POC screening in healthcare. One limitation of the e-nose is the absence of mathematical models of data analysis that yield easily interpreted findings at POC. The purposes of this review were to (1) examine the sensitivity/specificity results from studies that analyzed breath smellprints using the Cyranose 320, a widely used commercial e-nose, and (2) determine whether linear or nonlinear mathematical models are superior for analyzing Cyranose 320 breath smellprints. This systematic review was conducted according to the guidelines of the Preferred Reporting Items for Systematic Review and Meta-Analyses using keywords related to e-nose and breath. Twenty-two articles met the eligibility criteria. Two studies used a linear model while the rest used nonlinear models. The two studies that used a linear model had a smaller range for mean of sensitivity and higher mean (71.0%–96.0%; M = 83.5%) compared to the studies that used nonlinear models (46.9%–100%; M = 77.0%). Additionally, studies that used linear models had a smaller range for mean of specificity and higher mean (83.0%–91.5%; M = 87.2%) compared to studies that used nonlinear models (56.9%–94.0%; M = 76.9%). Linear models achieved smaller ranges for means of sensitivity and specificity compared to nonlinear models, supporting additional investigations of their use for POC testing. Because our findings were derived from studies of heterogeneous medical conditions, it is not known if they generalize to specific diagnoses.

Journal Article•DOI•

A robust spline approach in partially linear additive models

01 Feb 2023

TL;DR: In this paper, a family of robust estimators for partially linear additive models that combines B-splines with robust linear MM-regression estimators is proposed. Under mild assumptions, consistency results and rates of convergence for the proposed estimators are derived.

Abstract: Partially linear additive models generalize linear regression models by assuming that the relationship between the response and a set of explanatory variables is linear on some of the covariates, while the other ones enter into the model through unknown univariate smooth functions. The harmful effect of outliers either in the residuals or in the covariates involved in the linear component has been described in the situation of partially linear models, that is, when only one nonparametric component is involved. When dealing with additive components, the problem of providing reliable estimators when atypical data arise is of practical importance motivating the need of robust procedures. Based on this fact, a family of robust estimators for partially linear additive models that combines B-splines with robust linear MM-regression estimators is proposed. Under mild assumptions, consistency results and rates of convergence for the proposed estimators are derived. Furthermore, the asymptotic normality for the linear regression estimators is obtained. A Monte Carlo study is carried out to compare, under different models and contamination schemes, the performance of the robust MM-proposal based on B-splines with its classical counterpart and also with a quantile approach. The obtained results show the benefits of using the robust MM-approach. The analysis of a real data set illustrates the usefulness of the proposed method.

Posted Content•DOI•

Long-term Forecasting with TiDE: Time-series Dense Encoder

17 Apr 2023

TL;DR: In this article, a Multi-layer Perceptron (MLP) based encoder-decoder model, Time-series Dense Encoder (TiDE), is proposed for long-term time-series forecasting.

Abstract: Recent work has shown that simple linear models can outperform several Transformer based approaches in long term time-series forecasting. Motivated by this, we propose a Multi-layer Perceptron (MLP) based encoder-decoder model, Time-series Dense Encoder (TiDE), for long-term time-series forecasting that enjoys the simplicity and speed of linear models while also being able to handle covariates and non-linear dependencies. Theoretically, we prove that the simplest linear analogue of our model can achieve near optimal error rate for linear dynamical systems (LDS) under some assumptions. Empirically, we show that our method can match or outperform prior approaches on popular long-term time-series forecasting benchmarks while being 5-10x faster than the best Transformer based model.
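
The "simple linear models" that motivate this work map a window of past values directly to a window of future values. A minimal least-squares version is sketched below. It is not TiDE itself and not any specific published baseline. The synthetic series, the window lengths, and the function names are assumptions.

```python
import numpy as np


def make_windows(series, lookback, horizon):
    """Slice a 1-D series into (past window, future window) training pairs."""
    n = len(series) - lookback - horizon + 1
    X = np.stack([series[i:i + lookback] for i in range(n)])
    Y = np.stack([series[i + lookback:i + lookback + horizon] for i in range(n)])
    return X, Y


def fit_linear_forecaster(series, lookback, horizon):
    """Least-squares map from the last `lookback` values (plus a bias) to the next `horizon`."""
    X, Y = make_windows(series, lookback, horizon)
    X1 = np.hstack([X, np.ones((len(X), 1))])
    W, *_ = np.linalg.lstsq(X1, Y, rcond=None)
    return W


def forecast(W, recent):
    """Predict the next values from the most recent `lookback` observations."""
    return np.append(recent, 1.0) @ W


# Hypothetical series: a noisy sinusoid with a period of 24 steps.
rng = np.random.default_rng(0)
t = np.arange(500)
series = np.sin(2 * np.pi * t / 24) + 0.01 * rng.standard_normal(t.size)

lookback, horizon = 24, 6
W = fit_linear_forecaster(series[:400], lookback, horizon)
prediction = forecast(W, series[400:400 + lookback])
truth = series[400 + lookback:400 + lookback + horizon]
max_error = float(np.max(np.abs(prediction - truth)))
print(f"largest absolute forecast error: {max_error:.3f}")
```

All horizon steps are predicted in a single matrix product, which is one reason such models are fast compared with autoregressive decoding.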

Journal Article•DOI•

Double-Estimation-Friendly Inference for High-Dimensional Misspecified Models

01 Feb 2023-Statistical Science

TL;DR: In this paper, the authors propose a methodology for high-dimensional regression settings that respects the double-estimation-friendly (DEF) property, which holds for Wald tests in generalised linear models.

Abstract: All models may be wrong—but that is not necessarily a problem for inference. Consider the standard t-test for the significance of a variable X for predicting response Y while controlling for p other covariates Z in a random design linear model. This yields correct asymptotic type I error control for the null hypothesis that X is conditionally independent of Y given Z under an arbitrary regression model of Y on (X,Z), provided that a linear regression model for X on Z holds. An analogous robustness to misspecification, which we term the “double-estimation-friendly” (DEF) property, also holds for Wald tests in generalised linear models, with some small modifications. In this expository paper, we explore this phenomenon, and propose methodology for high-dimensional regression settings that respects the DEF property. We advocate specifying (sparse) generalised linear regression models for both Y and the covariate of interest X; our framework gives valid inference for the conditional independence null if either of these hold. In the special case where both specifications are linear, our proposal amounts to a small modification of the popular debiased Lasso test. We also investigate constructing confidence intervals for the regression coefficient of X via inverting our tests; these have coverage guarantees even in partially linear models where the contribution of Z to Y can be arbitrary. Numerical experiments demonstrate the effectiveness of the methodology.

Journal Article•DOI•

Accounting for endogenous effects in decision-making with a non-linear diffusion decision model

[...]

Isabelle Hoxha, Sylvain Chevallier, Matteo Ciarchi, Stefan Glasauer, Arnaud Delorme, Michel-Ange Amorim

30 Jan 2023-Dental science reports

TL;DR: In this article, a non-linear drift-diffusion model (nl-DDM) is proposed to capture inter-trial dynamics at the single-trial level and endogenous influences; the model paves the way toward more accurate analysis of across-trial variability in perceptual decisions and accounts for peri-stimulus influences.


Abstract: The Drift-Diffusion Model (DDM) is widely accepted for two-alternative forced-choice decision paradigms thanks to its simple formalism and close fit to behavioral and neurophysiological data. However, this formalism presents strong limitations in capturing inter-trial dynamics at the single-trial level and endogenous influences. We propose a novel model, the non-linear Drift-Diffusion Model (nl-DDM), that addresses these issues by allowing the existence of several trajectories to the decision boundary. We show that the non-linear model performs better than the drift-diffusion model for an equivalent complexity. To give better intuition on the meaning of nl-DDM parameters, we compare the DDM and the nl-DDM through correlation analysis. This paper provides evidence of the functioning of our model as an extension of the DDM. Moreover, we show that the nl-DDM captures time effects better than the DDM. Our model paves the way toward more accurately analyzing across-trial variability for perceptual decisions and accounts for peri-stimulus influences.
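For readers unfamiliar with the baseline the abstract refers to, here is a minimal sketch of a standard (linear) drift-diffusion trial, simulated with a simple Euler-Maruyama step. It is not the nl-DDM proposed in the paper, whose equations the abstract does not give. The function name, the symmetric boundaries at plus and minus `boundary`, and every parameter value are illustrative assumptions.

```python
import math
import random


def simulate_ddm_trial(drift, boundary, noise=1.0, dt=0.001, max_time=5.0, rng=None):
    """Simulate one trial of a standard drift-diffusion model.

    Evidence starts at 0 and evolves as
        x <- x + drift * dt + noise * sqrt(dt) * N(0, 1)
    until it reaches +boundary (choice 1) or -boundary (choice 0).
    Returns (choice, decision_time); choice is None if no boundary
    is reached before max_time.
    """
    rng = rng or random.Random()
    x, t = 0.0, 0.0
    step_sd = noise * math.sqrt(dt)
    while t < max_time:
        x += drift * dt + step_sd * rng.gauss(0.0, 1.0)
        t += dt
        if x >= boundary:
            return 1, t
        if x <= -boundary:
            return 0, t
    return None, t


# Hypothetical parameter values, chosen only for illustration.
rng = random.Random(0)
trials = [simulate_ddm_trial(drift=1.5, boundary=1.0, rng=rng) for _ in range(200)]
decided = [(choice, rt) for choice, rt in trials if choice is not None]
upper_fraction = sum(choice for choice, _ in decided) / len(decided)
mean_rt = sum(rt for _, rt in decided) / len(decided)
print(f"upper-boundary choices: {upper_fraction:.2f}, mean decision time: {mean_rt:.3f} s")
```

With a positive drift, most simulated trials end at the upper boundary. In this linear formulation the drift is constant within a trial, which is the simplification the nl-DDM is said to relax.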


Proceedings Article•DOI•

Improved Accuracy for Exploring Text - Based Emotion Recognition in Social Media Conversation Generalized Linear Model Compared with Random Forest

[...]

06 Apr 2023

TL;DR: The authors evaluated the efficacy of the New Generalized Linear Model (GLM) and the Random Forest Algorithm (RFA) in identifying the sentiment of social media posts and found that the accuracy of the GLM algorithm was significantly higher than the RFA algorithm.


Abstract: Aim: The primary goal of this research was to evaluate the efficacy of the New Generalized Linear Model (GLM) and the Random Forest Algorithm in identifying the sentiment of social media posts. Materials and Methods: We estimate numerous times using the Generalized Linear Model with a sample size of 10, and using the Random Forest with a sample size of 10, to predict with an accuracy of 93.01%. Results: In this study, the accuracy of the Generalized Linear Model (GLM) Algorithm was found to be 70%, which is significantly higher than the accuracy of the Random Forest Algorithm (85.18%). With a pre-test probability of 80%, p=0.824 (p0.005) is not statistically significant. Conclusion: In summary, the Generalized Linear Model (GLM) outperformed the Random Forest algorithm when it came to examining text-based emotion recognition in social media interaction.


Posted Content•DOI•

A Linear Distflow Model Considering Line Shunts

[...]

23 Feb 2023

TL;DR: In this paper, a modified linear Distflow model with line shunts (LinDistS) is proposed to address the model errors caused by ignoring line shunts; it keeps a straightforward structure like LinDist and remains linear when extended to three-phase unbalanced systems.


Abstract: Line shunts are usually ignored by various power flow (PF) models in distribution system analysis, planning and optimization. However, "charging effects" from line shunts of underground/submarine power cables would cause non-negligible model errors for these commonly used PF models. In this brief, we propose a modified linear Distflow model (LinDist) with line shunts (LinDistS) to address relevant model errors. The strength of the proposed model lies not only in a straightforward structure like LinDist, but also in maintaining linearity after further considering three-phase unbalanced systems. The linearization error of the voltage component is theoretically analyzed. Case studies show that, compared with non-linear and linear models, LinDistS achieves decent calculation accuracy and efficiency in large-scale distribution systems.


Proceedings Article•DOI•

Research on Housing Price Forecasting Model Based on Multiple Linear Regression Model and Neural Network Model

[...]

RuiHong Xu

01 Jan 2023

TL;DR: Effective forecasting models can provide a reference for potential house buyers, helping them avoid blind purchases, and policymakers can adjust real estate policies based on model predictions to reduce stagnancy in the real estate market.


Abstract: There is huge demand in the housing market in China. Effective forecasting models can provide a reference for potential house buyers, thus avoiding blind purchases. In addition, policymakers can adjust real estate policies based on model predictions to reduce the stagnancy of the real estate market.


Journal Article•DOI•

Strategies for constructing mathematical models of nonlinear systems based on multiple linear regression models

[...]

Yongcun Shao, Cong Qin

28 Apr 2023-Applied mathematics and nonlinear sciences

TL;DR: In this paper, a mathematical model of nonlinear systems based on a multiple linear regression model is proposed to improve experimental prediction accuracy, and it is tested on the vibration dynamics of a deep-water trap-test pipe system.


Abstract: Mathematical systems often have nonlinear, time-varying, time-lagged, and uncertain factors, which affect the experimental prediction accuracy. In order to improve the experimental prediction accuracy, this paper inputs the independent and dependent variable data sets as the original samples into a multiple linear regression function, performs fitting calculations to obtain the nonlinear factors, and constructs a mathematical model of nonlinear systems based on a multiple linear regression model. In this model, the expected output value is calculated, and the input vector and output vector are continuously controlled for rolling operations to obtain the prediction results. A mathematical experiment on the nonlinear vibration dynamics of a deep-water trap-test pipe system is set up to test the prediction ability of the model. The results show that the nonlinear system mathematical model based on the multiple linear regression model has a very high prediction accuracy. In these experiments, the error of the model in the transverse flow vibration frequency of the trap pipe column is 2%, which is lower than that of the single trap pipe calculation model by 4%. The prediction accuracy of the nonlinear system mathematical model based on the multiple linear regression model is higher than that of the single test tube model calculation by 78%. This shows that the nonlinear system mathematical model based on the multiple linear regression model can improve the experimental prediction accuracy.


Journal Article•DOI•

Partial replacement imputation estimation for partially linear models with complex missing pattern covariates

[...]

Zishu Zhan, Xiangjie Li, Jingxiao Zhang

06 Jun 2023-Statistics and computing

Journal Article•DOI•

Model Selection with Coefficient of Determination in Linear Mixed Effects Model

[...]

Bonghee Lee

28 Feb 2023-Journal of the Korean data analysis society

TL;DR: In this article, the author briefly reviews several coefficients of determination proposed for linear mixed effects models and reviews in detail the definition and estimation of two extended coefficients of determination, which are based on decomposition of total variation and on conditional prediction.


Abstract: The coefficient of determination (R²) is the most popular criterion for model selection in linear regression models, since it is easy to use and explains what proportion of the total variation in the data the candidate model accounts for. In linear mixed effects models, however, it is not only difficult to extend the definition of R² but also hard to interpret it, since variation can be defined in various ways. This article provides a brief review of several different coefficients of determination proposed for linear mixed effects models. We provide a detailed review of the definition and estimation methods for two extended coefficients of determination, which are based on decomposition of total variation and on conditional prediction. We also demonstrate how to use and interpret different coefficients of determination for linear mixed effects models by providing a real data example from the national physical test 100 project.
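As a point of reference for the ordinary regression case the abstract starts from, the sketch below computes R² as one minus the ratio of residual to total sum of squares for a one-predictor least-squares fit. It does not implement the mixed-model extensions reviewed in the article, which the abstract does not define. The data values and function names are made up for illustration.

```python
def fit_simple_ols(xs, ys):
    """Least-squares intercept and slope for a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(ys, fitted):
    """R^2 = 1 - SS_residual / SS_total (decomposition of total variation)."""
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_residual = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    return 1.0 - ss_residual / ss_total


# Made-up data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]
intercept, slope = fit_simple_ols(xs, ys)
fitted = [intercept + slope * x for x in xs]
r2 = r_squared(ys, fitted)
print(f"slope={slope:.3f}, R^2={r2:.4f}")
```

In a mixed effects model the "total variation" in the denominator can be split between fixed and random effects in more than one way, which is why several competing definitions exist.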


Journal Article•DOI•

Robust hypothesis testing in functional linear models

[...]

Yan Zhang, Yuehua Wu

04 Apr 2023-Journal of Statistical Computation and Simulation

TL;DR: In this paper, the authors extend three robust tests (Wald-type, likelihood ratio-type and F-type) to functional linear models with a scalar dependent variable and a functional covariate.


Abstract: We extend three robust tests (Wald-type, likelihood ratio-type and F-type) to functional linear models with a scalar dependent variable and a functional covariate. Based on the percentage of variance explained criterion, we use functional principal components analysis to re-express a functional linear model as a finite regression. We investigate the theoretical properties of these robust testing procedures and assess their finite sample properties through numerical simulation. In our experiments, the power performance and Type I error rates are studied separately in the sparsely and densely functional linear models. The simulation results show that the robust test procedures are more stable and less sensitive to heavy-tailed error distributions than the classical ones. Two real datasets are analysed to compare the classical and robust testing procedures.
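The percentage-of-variance-explained criterion mentioned in the abstract is commonly applied by keeping the smallest number of principal components whose eigenvalues account for a chosen share of the total. A minimal sketch of that selection rule follows. The eigenvalues, the thresholds and the function name are hypothetical, and the abstract does not state which threshold the authors use.

```python
def components_for_pve(eigenvalues, threshold=0.95):
    """Smallest number of leading components whose cumulative share of
    the total variance (sum of eigenvalues) reaches the threshold."""
    total = sum(eigenvalues)
    cumulative = 0.0
    for k, value in enumerate(sorted(eigenvalues, reverse=True), start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return k
    return len(eigenvalues)


# Hypothetical eigenvalues of an estimated covariance operator.
eigenvalues = [5.0, 2.5, 1.5, 0.6, 0.3, 0.1]
for threshold in (0.85, 0.95):
    k = components_for_pve(eigenvalues, threshold)
    print(f"threshold {threshold:.2f}: keep {k} components")
```

The scores on the retained components then serve as the finite set of regressors in place of the functional covariate.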


Journal Article•DOI•

Simple Ways to Interpret Effects in Modeling Binary Data

[...]

Joan F. Burke¹•Institutions (1)

University of Florida¹

01 Jan 2023-Statistics for social and behavioral sciences

TL;DR: In this article, the authors survey probability-based effect measures that can be simpler to understand than logistic and probit regression model parameters and their corresponding effect measures, such as odds ratios.


Abstract: Traditional methods for the analysis of binary response data are generalized linear models that employ logistic or probit link functions. Unfortunately, effect measures for these types of models do not have a straightforward interpretation. Hence, in this paper we survey probability-based effect measures that can be simpler to understand than logistic and probit regression model parameters and their corresponding effect measures, such as odds ratios. For describing the effect of an explanatory variable while adjusting for others, it is sometimes possible to employ the identity and log link functions to generate simple effect measures. When such link functions are inappropriate, one can still construct analogous effect measures. For comparing groups that are levels of categorical explanatory variables or relevant values for quantitative explanatory variables, such measures can be based on average differences or log-ratios of the probability modeled. For quantitative explanatory variables, they can also be based on average instantaneous rates of change for the probability. We also propose analogous measures for interpreting effects in models with nonlinear predictors, such as generalized additive models. We illustrate the measures for two examples and show how to implement them with R software.
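Two of the measures named in the abstract, the average difference in the modeled probability between groups and the average instantaneous rate of change, can be sketched for a logistic model as follows. This is an illustrative Python sketch under assumed coefficients, not the authors' R implementation. The model form (one binary group indicator plus one quantitative covariate `z`), the coefficient values and the function names are all hypothetical.

```python
import math


def inverse_logit(eta):
    """Map a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def average_probability_difference(intercept, beta_group, beta_z, z_values):
    """Average over observed z of P(y=1 | group=1, z) - P(y=1 | group=0, z)."""
    diffs = [
        inverse_logit(intercept + beta_group + beta_z * z)
        - inverse_logit(intercept + beta_z * z)
        for z in z_values
    ]
    return sum(diffs) / len(diffs)


def average_rate_of_change(intercept, beta_z, z_values):
    """Average over observed z of dP/dz = beta_z * p * (1 - p) (group = 0)."""
    rates = []
    for z in z_values:
        p = inverse_logit(intercept + beta_z * z)
        rates.append(beta_z * p * (1.0 - p))
    return sum(rates) / len(rates)


# Hypothetical fitted coefficients and covariate values.
z_values = [-1.0, -0.5, 0.0, 0.5, 1.0, 2.0]
avg_diff = average_probability_difference(-0.5, 0.8, 0.6, z_values)
avg_rate = average_rate_of_change(-0.5, 0.6, z_values)
print(f"average probability difference: {avg_diff:.3f}")
print(f"average rate of change per unit z: {avg_rate:.3f}")
```

Both quantities are on the probability scale, so they can be read directly as changes in the chance of the outcome rather than as changes in log odds.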


Journal Article•DOI•

Dependences between the scales of deficient fear and actual self-perception at the stage of self-isolation

[...]

20 Jun 2023-Naučno-pedagogičeskoe obozrenie

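TL;DR: In this article, the authors analyze the dependencies between indicators of deficient fear and components of actual self-perception among students in self-isolation during the COVID-19 pandemic and conclude that these dependencies are nonlinear, so linear models are inadequate for studying them.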


Abstract: Aim: to reveal the nature of the causal relationship between the characteristics of deficient fear and indicators of actual self-perception. As part of an empirical study of students in self-isolation during the COVID-19 pandemic, the dependencies of deficient fear on the components of actual self-perception were analyzed for linearity versus nonlinearity using the author's method, and the conclusion was that these dependencies are nonlinear. All linear correlations between indicators of deficient fear and components of actual self-perception do not exceed 0.25 in absolute value, i.e. they are extremely weak, so the problem cannot be adequately addressed from the standpoint of linear models. To understand the nature of deficient fear, it is necessary to move away from linear models. For two indicators of deficient fear and 26 indicators of actual self-perception, within the framework of the model for quarters ("quarts") of the independent variable, five strong simplest nonlinear dependencies were identified that demonstrate a type 1 error: the correlation is extremely small, below even the threshold of "significant" values (0.17) in absolute value, and therefore there is no relationship within the linear model of correlation analysis. One dependence demonstrates a type 2 error: a strong nonlinear dependence that, within a linear model, would be treated by proponents of "significant" correlation as a "significant" linear relationship (a very weak correlation coefficient of -0.18 exceeds the significance threshold of 0.17 in absolute value). Going beyond linear models gives fundamentally new information about the phenomenon of deficient fear under study, whereas linear models are unacceptable in this case: they can only severely distort the results and lead to erroneous conclusions and interpretations. The article provides detailed descriptions and interpretations of two of the six strong dependencies found (the rest are presented in tables), and considers visual graphical representations as well as their most probable estimates under the traditional approach.
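The abstract's central point, that a strong nonlinear dependence can coexist with a negligible linear correlation, can be illustrated with synthetic data. The sketch below is not the author's method: it uses a made-up U-shaped relationship, computes the Pearson correlation, and then compares the mean of the dependent variable across four equal-sized groups ordered by the independent variable. All names and data are illustrative assumptions.

```python
import math


def pearson_r(xs, ys):
    """Pearson linear correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def quarter_means(xs, ys):
    """Mean of y within each quarter of the sample when ordered by x."""
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    bounds = [i * n // 4 for i in range(5)]
    means = []
    for start, end in zip(bounds[:-1], bounds[1:]):
        group = [y for _, y in pairs[start:end]]
        means.append(sum(group) / len(group))
    return means


# Synthetic U-shaped dependence on a grid symmetric around zero.
xs = [(2 * i - 99) / 100 for i in range(100)]
ys = [x * x for x in xs]
r = pearson_r(xs, ys)
means = quarter_means(xs, ys)
print(f"Pearson r = {r:.6f}")
print("mean of y by quarter of x:", [round(m, 3) for m in means])
```

Here the correlation is essentially zero, yet the outer quarters have clearly higher means than the inner ones, so a purely linear analysis would miss a dependence that a group-wise comparison makes visible.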
