
Showing papers in "Annals of Data Science in 2019"


Journal ArticleDOI
TL;DR: A new intelligent machine learning framework for predicting the results of NBA games is proposed, aiming to discover the influential feature set that affects game outcomes by comparing the performance of models derived from different feature sets related to basketball games.
Abstract: In recent years, sports outcome prediction has gained popularity, as demonstrated by massive financial transactions in sports betting. One of the world's most popular sports, which lures betting and attracts millions of fans worldwide, is basketball, particularly the National Basketball Association (NBA) of the United States. This paper proposes a new intelligent machine learning framework for predicting the results of NBA games, aiming to discover the influential feature set that affects game outcomes. We would like to identify whether machine learning methods are applicable to forecasting the outcome of an NBA game using historical data (previous games played), and what the significant factors that affect the outcome are. To achieve these objectives, several machine learning methods that utilise different learning schemes to derive the models, including Naive Bayes, artificial neural network, and Decision Tree, are selected. By comparing the performance of the models derived from different feature sets related to basketball games, we can discover the key features that contribute to better performance, such as accuracy and efficiency of the prediction model. Based on the results analysis, the DRB (defensive rebounds) feature was deemed the most significant factor influencing the results of an NBA game. Furthermore, other crucial factors such as TPP (three-point percentage), FT (free throws made), and TRB (total rebounds) were also selected, which subsequently increased the model's prediction accuracy rate by 2–4%.
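A minimal sketch of the kind of comparison the paper describes, assuming a tabular file of historical box scores; the file name, column names (including DRB, TPP, FT, TRB) and target column are hypothetical placeholders:

```python
# Sketch: compare the three learners across candidate feature sets.
# games.csv and all column names are hypothetical.
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

games = pd.read_csv("games.csv")  # historical NBA box scores
feature_sets = {
    "baseline": ["FG", "FGA", "AST", "STL"],
    "rebound-augmented": ["FG", "FGA", "AST", "STL", "DRB", "TPP", "FT", "TRB"],
}
models = {
    "naive_bayes": GaussianNB(),
    "ann": MLPClassifier(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(),
}
for fs_name, cols in feature_sets.items():
    X, y = games[cols], games["home_win"]
    for m_name, model in models.items():
        acc = cross_val_score(model, X, y, cv=10, scoring="accuracy").mean()
        print(f"{fs_name:>18} | {m_name:>13} | accuracy={acc:.3f}")
```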

64 citations


Journal ArticleDOI
TL;DR: In this article, a new three-parameter lifetime distribution named the inverse power Lomax distribution is proposed, which is the inverse form of the power Lomax distribution.
Abstract: We introduce and study a new three-parameter lifetime distribution named the inverse power Lomax distribution. The proposed distribution is obtained as the inverse form of the power Lomax distribution. Some statistical properties of the inverse power Lomax model are derived. Based on censored samples, maximum likelihood estimators of the model parameters are obtained. An intensive simulation study is performed to evaluate the behavior of the estimators in terms of their biases and mean square errors. Superiority of the new model over some well-known distributions is illustrated by means of real data sets. The results reveal that the suggested model can produce better fits than some well-known distributions.
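As a sketch of the construction, assuming the power Lomax CDF in the common parametrization, the inversion step works out as follows for $Y = 1/X$:

```latex
% Assuming the power Lomax CDF F_X(x) = 1 - (1 + x^{\beta}/\lambda)^{-\alpha}, x > 0,
% the inverse power Lomax variable Y = 1/X has CDF
F_Y(y) = P(X \ge 1/y) = 1 - F_X(1/y)
       = \left(1 + \frac{y^{-\beta}}{\lambda}\right)^{-\alpha}, \qquad y > 0 .
```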

44 citations


Journal ArticleDOI
TL;DR: In this paper, a new two-parameter distribution with decreasing failure rate is introduced, called Alpha Power Transformed Lindley (APTL), which provides better fits than the Lindley distribution and some of its known generalizations.
Abstract: The Lindley distribution has been generalized by many authors in recent years. A new two-parameter distribution with decreasing failure rate, called the Alpha Power Transformed Lindley (APTL) distribution, is introduced that provides better fits than the Lindley distribution and some of its known generalizations. The new model includes the Lindley distribution as a special case. Various properties of the proposed distribution are derived, including explicit expressions for the ordinary moments, incomplete and conditional moments, mean residual lifetime, mean deviations, L-moments, the moment generating function, cumulant generating function, characteristic function, Bonferroni and Lorenz curves, entropies, stress-strength reliability, stochastic ordering, and the statistics and distributions of sums, differences, ratios and products. The new distribution can have decreasing, increasing, and upside-down bathtub failure rate functions depending on its parameters. The model parameters are estimated by the method of maximum likelihood. We also obtain confidence intervals for the model parameters. A simulation study is carried out to examine the bias and mean squared error of the maximum likelihood estimators of the parameters. Finally, two data sets are analyzed to show how the proposed model works in practice.
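As a sketch of the construction, assuming the standard alpha power transformation applied to the Lindley CDF $G$ with parameter $\theta$:

```latex
% Alpha power transformation of a baseline CDF G (alpha > 0, alpha != 1):
F(x) = \frac{\alpha^{G(x)} - 1}{\alpha - 1},
\qquad
G(x) = 1 - \left(1 + \frac{\theta x}{1 + \theta}\right) e^{-\theta x}, \quad x > 0 ,
```

with the Lindley distribution recovered in the limit $\alpha \to 1$.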

43 citations


Journal ArticleDOI
TL;DR: It is found that correlated data, when associated with important variables, improve common regularisation methods in all aspects, and that the level of sparsity can be reflected not only in the number of important variables but also in their overall effect size and locations.
Abstract: High dimensional data are rapidly growing in many domains due to technological advances that help collect data with a large number of variables, in order to better understand a given phenomenon of interest. Particular examples appear in genomics, fMRI data analysis, large-scale healthcare analytics, text/image analysis and astronomy. In the last two decades regularisation approaches have become the methods of choice for analysing such high dimensional data. This paper aims to study the performance of regularisation methods, including the recently proposed de-biased lasso, for the analysis of high dimensional data under different sparse and non-sparse situations. Our investigation concerns prediction, parameter estimation and variable selection. We particularly study the effects of correlated variables, covariate location and effect size, which have not been well investigated. We find that correlated data, when associated with important variables, improve common regularisation methods in all aspects, and that the level of sparsity can be reflected not only in the number of important variables but also in their overall effect size and locations. The latter may be seen under a non-sparse data structure. We demonstrate that the de-biased lasso performs well, especially in low dimensional data; however, it still suffers from issues similar to those of the classical regression methods, such as multicollinearity and multiple hypothesis testing.
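A minimal sketch of one such experiment, assuming a synthetic high-dimensional design in which the important variables form an equi-correlated block (all settings are illustrative):

```python
# Sketch: how correlation among the important variables affects lasso selection.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p, k, rho = 100, 500, 5, 0.7  # n << p: high-dimensional regime

# Equi-correlated block for the k important variables; independent noise variables.
cov = rho * np.ones((k, k)) + (1 - rho) * np.eye(k)
block = rng.multivariate_normal(np.zeros(k), cov, size=n)
X = np.hstack([block, rng.standard_normal((n, p - k))])
beta = np.concatenate([np.full(k, 2.0), np.zeros(p - k)])
y = X @ beta + rng.standard_normal(n)

fit = Lasso(alpha=0.1).fit(X, y)
selected = np.flatnonzero(fit.coef_)
print("true support:", list(range(k)), "| selected:", selected[:10])
```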

33 citations


Journal ArticleDOI
TL;DR: In this paper, the inverse Gompertz distribution with two parameters is introduced and the model parameters are estimated by the method of maximum likelihood, bootstrap, least squares, weighted least squares and Cramer-von Mises.
Abstract: In this article, we introduce the inverse Gompertz distribution with two parameters. Some statistical properties are presented, such as the hazard rate function, quantiles, probability weighted moments, skewness, kurtosis, entropies, mean residual lifetime and mean inactive lifetime. The model parameters are estimated by the methods of maximum likelihood, bootstrap, least squares, weighted least squares and Cramer-von Mises. Further, Monte Carlo simulations are carried out to compare the long-run performance of the estimators based on complete and type II right censored data. Finally, we estimate the parameters for two real data sets, one from the behavioral sciences and one giving the fatigue life in hours of ten bearings of a certain type (censored data), which show that the model fits the data better than some competing models.
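As a sketch of the construction, assuming a Gompertz baseline in a common parametrization, the CDF of the inverse variable $Y = 1/X$ follows directly:

```latex
% Assuming the Gompertz CDF F_X(x) = 1 - exp{-(alpha/beta)(e^{beta x} - 1)}, x > 0,
% the inverse Gompertz variable Y = 1/X has CDF
F_Y(y) = P(X \ge 1/y)
       = \exp\!\left\{-\frac{\alpha}{\beta}\left(e^{\beta/y} - 1\right)\right\},
\qquad y > 0 .
```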

30 citations


Journal ArticleDOI
TL;DR: In this article, a new family of probability distributions generated from a power Lindley random variable is introduced, called the power Lindley-generated family.
Abstract: In this paper, we introduce a new family of probability distributions generated from a power Lindley random variable, called the power Lindley-generated family. The new family extends several classical distributions and generalizes the odd Lindley family proposed by Silva et al. (Austrian J Stat 46:65–87, 2017). Some of the mathematical properties are obtained, involving moments, incomplete moments, the quantile function and order statistics. Four new distributions are provided as special models from the family. The model parameters of the family are estimated by the maximum likelihood technique. An application to a real data set and a simulation study are provided to demonstrate the flexibility and interest of one special model of the suggested family.

26 citations


Journal ArticleDOI
TL;DR: This paper presents a random projection scheme for cancelable iris recognition that guarantees exclusion of eyelid and eyelash effects, and masking of the original Gabor features to increase the level of security.
Abstract: This paper presents a random projection scheme for cancelable iris recognition. Instead of using the original iris features, masked versions of the features are generated through random projection in order to increase the security of the iris recognition system. The proposed framework for iris recognition includes iris localization, sector selection of the iris to avoid eyelid and eyelash effects, normalization, segmentation of the normalized iris region into halves, selection of the upper half for further reduction of eyelid and eyelash effects, feature extraction with a Gabor filter, and finally random projection. This framework guarantees exclusion of eyelid and eyelash effects, and masking of the original Gabor features to increase the level of security. Matching is performed with a Hamming Distance (HD) metric. The proposed framework achieves a promising recognition rate of 99.67% and a leading Equal Error Rate (EER) of 0.58%.
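A minimal sketch of the cancelable-template idea, assuming a real-valued Gabor feature vector has already been extracted; the key, dimensions and threshold are illustrative, and the projection matrix acts as a revocable user-specific key:

```python
# Sketch: cancelable iris template = random projection + binarization,
# matched by Hamming distance. Gabor feature extraction itself is not shown.
import numpy as np

def make_projection(key: int, d_in: int, d_out: int) -> np.ndarray:
    """User-specific projection; issuing a new key cancels the old template."""
    return np.random.default_rng(key).standard_normal((d_out, d_in))

def protect(features: np.ndarray, P: np.ndarray) -> np.ndarray:
    """Mask the Gabor features by projection, then binarize."""
    return (P @ features > 0).astype(np.uint8)

def hamming(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.mean(a != b))

rng = np.random.default_rng(1)
gabor = rng.standard_normal(2048)                 # stand-in for real features
probe = gabor + 0.1 * rng.standard_normal(2048)   # same eye, slight noise

P = make_projection(key=42, d_in=2048, d_out=512)
hd = hamming(protect(gabor, P), protect(probe, P))
print(f"HD = {hd:.3f}  (accept if below a tuned threshold)")
```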

26 citations


Journal ArticleDOI
TL;DR: In this paper, a new class of bivariate distributions called the bivariate Gumbel-G family is proposed, whose marginal distributions are Gumbel-G families, and a special model of the new family is discussed in detail.
Abstract: In this paper, a new class of bivariate distributions called the bivariate Gumbel-G family is proposed, whose marginal distributions are Gumbel-G families. Several of its statistical properties are derived. After introducing the general class, a special model of the new family is discussed in detail. Bayesian and maximum likelihood techniques are used to estimate the model parameters. A simulation study is carried out to examine the bias and mean square error of the Bayesian and maximum likelihood estimators. Finally, a real data set is analyzed to illustrate the flexibility of the proposed bivariate family.

24 citations


Journal ArticleDOI
TL;DR: In this paper, a new family of continuous distributions which ensures model flexibility, based on the Frechet distribution and the Topp Leone-G family, is introduced, and the maximum likelihood estimates and the observed information matrix are obtained for the model parameters.
Abstract: A new family of continuous distributions which ensures model flexibility is introduced, based on the Frechet distribution and the Topp Leone-G family. Two special sub-models of the new family are discussed. We provide some distributional properties of this family in the general setting, such as series expansions of the density, moments, the generating function, the stress-strength model, Renyi and Shannon entropies, probability weighted moments and order statistics. Certain characterizations of the proposed family are presented. The maximum likelihood estimates and the observed information matrix are obtained for the model parameters. We assess the performance of the maximum likelihood estimators by means of a graphical simulation study. The potentiality of the new class is shown via two applications to real data sets.

24 citations


Journal ArticleDOI
TL;DR: A comparative study of fundamental and technical analysis based on different parameters, together with a comparative analysis of stock market prediction techniques such as time series analysis and machine learning algorithms, including the artificial neural network.
Abstract: The stock market is a popular investment option for investors because of its expected high returns. Stock market prediction is a complex task to achieve with the help of artificial intelligence, because stock prices depend on many factors, including trends and news in the market. However, in recent years, many creative techniques and models have been proposed and applied to efficiently and accurately forecast the behaviour of the stock market. This paper presents a comparative study of fundamental and technical analysis based on different parameters. We also discuss a comparative analysis of various prediction techniques used to predict stock prices. These strategies include technical analysis methods like time series analysis and machine learning algorithms such as the artificial neural network (ANN). Along with them, a few researchers have focused on textual analysis of stock prices by continuously analysing public sentiment from social media and other news sources. The various approaches are compared based on methodologies, datasets, and efficiency with the help of visualisation.
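A minimal sketch of the ANN strategy the survey covers, assuming daily closing prices turned into lagged features; the file name, window length and network size are illustrative:

```python
# Sketch: predict the next day's close from the previous w closes with an ANN.
# prices.csv is a hypothetical file with a "close" column.
import numpy as np
import pandas as pd
from sklearn.neural_network import MLPRegressor

close = pd.read_csv("prices.csv")["close"].to_numpy()
w = 10  # lag window
X = np.array([close[i:i + w] for i in range(len(close) - w)])
y = close[w:]

split = int(0.8 * len(X))  # preserve time order: no shuffling
ann = MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000, random_state=0)
ann.fit(X[:split], y[:split])
print("test R^2:", ann.score(X[split:], y[split:]))
```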

21 citations


Journal ArticleDOI
TL;DR: In this article, the statistical inference for the Gompertz distribution based on generalized progressively hybrid censored data is discussed, and the estimation of the parameters of the Gompertz distribution is discussed using the maximum likelihood method and Bayesian methods under different loss functions.
Abstract: In this paper, the statistical inference for the Gompertz distribution based on generalized progressively hybrid censored data is discussed. The estimation of the parameters of the Gompertz distribution is discussed using the maximum likelihood method and Bayesian methods under different loss functions. The existence and uniqueness of the maximum likelihood estimates are proved. Point and interval Bayesian predictions for unobserved failures from the same sample and from a future sample are derived. Monte Carlo simulation is applied to compare the proposed methods. A real data example is used to apply the methods of estimation and to construct the prediction intervals.
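A minimal sketch of the maximum likelihood step for a complete (uncensored) sample, assuming the Gompertz density $f(x) = \theta e^{\gamma x} \exp\{-(\theta/\gamma)(e^{\gamma x} - 1)\}$; the generalized progressive hybrid censoring terms of the paper are omitted:

```python
# Sketch: Gompertz MLE on a complete sample via scipy; censoring not included.
import numpy as np
from scipy.optimize import minimize

def neg_loglik(params, x):
    theta, gamma = params
    if theta <= 0 or gamma <= 0:
        return np.inf
    # log f(x) = log(theta) + gamma*x - (theta/gamma)*(exp(gamma*x) - 1)
    return -np.sum(np.log(theta) + gamma * x
                   - (theta / gamma) * np.expm1(gamma * x))

# Inverse-CDF sampling from Gompertz(theta=0.5, gamma=1.0) for a test sample:
rng = np.random.default_rng(0)
u = rng.uniform(size=200)
x = np.log1p(-np.log1p(-u) * (1.0 / 0.5)) / 1.0

res = minimize(neg_loglik, x0=[1.0, 1.0], args=(x,), method="Nelder-Mead")
print("MLE (theta, gamma):", res.x)
```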

Journal ArticleDOI
TL;DR: An efficient feature selection algorithm based on random forest is presented to improve the performance of the MLAs without sacrificing the guarantees on the accuracy while processing the large and complex datasets.
Abstract: Machine learning algorithms (MLAs) usually process large and complex datasets containing a substantial number of features to extract meaningful information about the target concept (a.k.a. class). In most cases, MLAs suffer from latency and computational complexity issues while processing such complex datasets due to the presence of lesser-weight (i.e., irrelevant or redundant) features. The computing time of the MLAs increases explosively with increases in the number of features, feature dependence, number of records, types of features, and nested feature categories present in such datasets. Appropriate feature selection before applying an MLA is a handy solution to effectively resolve the computing speed and accuracy trade-off while processing large and complex datasets. However, selecting the features that are sufficient, necessary, and highly correlated with the target concept is very challenging. This paper presents an efficient feature selection algorithm based on random forest to improve the performance of MLAs without sacrificing the guarantees on accuracy while processing large and complex datasets. The proposed feature selection algorithm yields unique features that are closely related to the target concept (i.e., class). The proposed algorithm significantly reduces the computing time of MLAs without much degrading the accuracy while learning the target concept from large and complex datasets. The simulation results fortify the efficacy and effectiveness of the proposed algorithm.
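A minimal sketch of the underlying idea, using scikit-learn's impurity-based random forest importances as the ranking criterion; the paper's actual selection rule may differ, and the dataset and cutoff here are illustrative:

```python
# Sketch: rank features by random-forest importance, keep the top ones,
# and compare downstream accuracy on the full vs. reduced feature set.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=50, n_informative=8,
                           random_state=0)
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
keep = np.argsort(rf.feature_importances_)[::-1][:8]  # top-8 features

full = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=5).mean()
reduced = cross_val_score(RandomForestClassifier(random_state=0),
                          X[:, keep], y, cv=5).mean()
print(f"accuracy full={full:.3f}  reduced={reduced:.3f}  ({len(keep)}/50 features)")
```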

Journal ArticleDOI
TL;DR: In this paper, a new generator of continuous distributions called Exponentiated Generalized Marshall-Olkin-G family with three additional parameters is proposed, which contains several known distributions as sub models.
Abstract: A new generator of continuous distributions called the Exponentiated Generalized Marshall–Olkin-G family, with three additional parameters, is proposed. This family of distributions contains several known distributions as sub-models. The probability density function and cumulative distribution function are expressed as infinite mixtures of the Marshall–Olkin distribution. Important properties like the quantile function, order statistics, moment generating function, probability weighted moments, entropy and shapes are investigated. The maximum likelihood method to estimate the model parameters is presented. A simulation study to assess the performance of the maximum likelihood estimation is briefly discussed. A distribution from this family is compared with two sub-models and some recently introduced lifetime models by considering three real-life data fitting applications.

Journal ArticleDOI
TL;DR: A minimum redundancy and maximum variance based unsupervised band selection methodology is proposed and is compared with four other existing state-of-the-art methods in the same field in terms of OA and execution time for evaluating the performance.
Abstract: The contiguous narrow bands of hyperspectral images greatly increase computational complexity. Redundancy reduction is therefore necessary. Here, a minimum redundancy and maximum variance based unsupervised band selection methodology is proposed. Discrete wavelet transformation is applied to the data to reduce spatial redundancy without much affecting the overall band correlations. This in turn makes the process more time-efficient and noise-resilient. Highly correlated bands are considered similar, and the one with higher variance is accepted as being more discriminating. Finally, classification is performed with the selected bands and the overall accuracy (OA) is calculated. The proposed method is compared with four other existing state-of-the-art methods in the same field in terms of OA and execution time for evaluating the performance.
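A minimal sketch of the min-redundancy/max-variance selection step, assuming the hyperspectral cube has already been wavelet-smoothed and flattened to a pixels-by-bands matrix; the correlation threshold is illustrative:

```python
# Sketch: greedy unsupervised band selection. Bands highly correlated with an
# already-kept band are dropped; higher-variance bands are preferred.
import numpy as np

def select_bands(cube, corr_thresh=0.95):
    """cube: (n_pixels, n_bands) array, e.g. wavelet-denoised reflectances."""
    order = np.argsort(cube.var(axis=0))[::-1]       # highest variance first
    corr = np.abs(np.corrcoef(cube, rowvar=False))   # band-to-band correlation
    selected = []
    for band in order:
        if all(corr[band, s] < corr_thresh for s in selected):
            selected.append(int(band))
    return selected

rng = np.random.default_rng(0)
fake_cube = rng.standard_normal((5000, 64))          # stand-in for real data
print("bands kept:", select_bands(fake_cube))
```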

Journal ArticleDOI
TL;DR: In this paper, a cubic transmuted Weibull (CTW) distribution has been proposed by using the general family of transmuted distributions introduced by Rahman et al. The parameter estimation and inference procedure for the proposed distribution have been discussed.
Abstract: In this paper, a cubic transmuted Weibull (CTW) distribution is proposed by using the general family of transmuted distributions introduced by Rahman et al. (Pak J Stat Oper Res 14:451–469, 2018). We explore the proposed CTW distribution in detail and study its statistical properties as well. The parameter estimation and inference procedure for the proposed distribution are discussed. We conduct a simulation study to observe the performance of the estimation technique. Finally, we consider two real-life data sets to investigate the practicality of the proposed CTW distribution.
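As a sketch of the construction, one common form of a cubic rank transmutation of a baseline CDF $G$ is given below; the exact parameter constraints in Rahman et al. may differ, and taking $G$ to be the Weibull CDF yields the CTW distribution:

```latex
% One common cubic rank transmutation of a baseline CDF G:
F(x) = \lambda_1 G(x) + (\lambda_2 - \lambda_1)\, G(x)^2 + (1 - \lambda_2)\, G(x)^3,
\qquad
G(x) = 1 - e^{-(x/\beta)^{\alpha}}, \quad x > 0 .
```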

Journal ArticleDOI
TL;DR: In this paper, a new one-parameter lifetime distribution named the Burr-Hatke exponential (BHE) distribution is introduced, and Monte Carlo simulations are performed to compare the performances of the obtained estimators in the mean square error sense.
Abstract: In this paper, we introduce a new one-parameter lifetime distribution, named the Burr–Hatke exponential (BHE) distribution, as an alternative to the exponential distribution. Classical and Bayesian estimation procedures for the BHE model parameter are discussed based on Type-II hybrid censored data. Monte Carlo simulations are performed to compare the performances of the obtained estimators in the mean square error sense. Two real data sets are analyzed for illustrative purposes. Additionally, a new log-location regression model based on the new distribution is introduced and studied.
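For orientation, the BHE distribution is often written with the one-parameter CDF below; this parametrization is our assumption rather than a quotation from the paper:

```latex
% Assumed Burr--Hatke exponential CDF with parameter lambda > 0:
F(x) = 1 - \frac{e^{-\lambda x}}{1 + \lambda x}, \qquad x > 0 .
```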

Journal ArticleDOI
TL;DR: In this paper, a new probability distribution, named inverse xgamma (IXG) distribution, was proposed and different mathematical and statistical properties, viz., reliability characteristics, inverse moments, quantile function, mean inverse residual life, stress-strength reliability, stochastic ordering and order statistics of the proposed distribution have been derived and discussed.
Abstract: The paper proposes a new probability distribution, named inverse xgamma (IXG) distribution. Different mathematical and statistical properties, viz., reliability characteristics, inverse moments, quantile function, mean inverse residual life, stress-strength reliability, stochastic ordering and order statistics of the proposed distribution have been derived and discussed. Estimation of the parameter of IXG distribution has been approached by different methods, namely, maximum likelihood estimation, least squares estimation, weighted least squares estimation, Cramer–von-Mises estimation and maximum product of spacing estimation (MPSE). A simulation study has been carried out to compare the performance of these estimators in terms of their mean squared errors. Asymptotic confidence interval of the parameter in terms of average widths and coverage probabilities is also obtained using MPSE of the parameter. Finally, a data set is used to demonstrate the applicability of IXG distribution in real life situations.

Journal ArticleDOI
TL;DR: In this paper, a new lifetime distribution based on the general odd hyperbolic cosine-FG model is introduced, which is shown to have better performance than other fundamental statistical distributions.
Abstract: In the present paper, we introduce a new lifetime distribution based on the general odd hyperbolic cosine-FG model. Some important properties of the proposed model, including the survival function, quantile function, hazard function and order statistics, are obtained. In addition, estimation of the unknown parameters of this model is examined from the perspective of both classical and Bayesian statistics. Moreover, a real data set is studied; point and interval estimates of all parameters are obtained by maximum likelihood, bootstrap (parametric and non-parametric) and Bayesian procedures. Finally, the superiority of the proposed model, with the exponential distribution as parent, over other fundamental statistical distributions is shown via the example of real observations.

Journal ArticleDOI
TL;DR: A multi-objective inventory model under both stock-dependent demand rate and holding cost rate with fuzzy random coefficients is investigated to determine the optimal order quantity and inventory level such that the total profit is maximized and the wastage cost is minimized for the retailer.
Abstract: In this paper, we investigate a multi-objective inventory model under both stock-dependent demand rate and holding cost rate with fuzzy random coefficients. A chance-constrained fuzzy random multi-objective model and a traditional solution procedure based on an interactive fuzzy satisfying method are discussed. In addition, the technique of fuzzy random simulation is applied to deal with general fuzzy random objective functions and fuzzy random constraints, which are usually difficult to convert into their crisp equivalents. The purpose of this study is to determine the optimal order quantity and inventory level such that the total profit is maximized and the wastage cost is minimized for the retailer. Finally, an illustrative example is given to show the application of the proposed model.

Journal ArticleDOI
TL;DR: The MLEs and corresponding Bayes estimators are compared in terms of their risks based on simulated samples from the Rayleigh distribution, and two sets of real data are analyzed to show the applicability of the methods.
Abstract: In this paper, we propose maximum likelihood estimators (MLEs) and Bayes estimators of the parameters of step-stress partially accelerated life testing for the Rayleigh distribution in the presence of progressive type-II censoring with a binomial removal scheme, under the squared error, general entropy, and linear exponential (LINEX) loss functions. The MLEs and corresponding Bayes estimators are compared in terms of their risks based on simulated samples from the Rayleigh distribution. We also analyze two sets of real data to show the applicability of the methods.
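For reference, the three loss functions named here are commonly defined as follows for an estimator $\delta$ of $\theta$ (standard textbook forms; the paper's constants may be scaled differently):

```latex
% Squared error (SE), general entropy (GE), and linear exponential (LINEX) losses:
L_{\mathrm{SE}}(\delta, \theta) = (\delta - \theta)^2, \qquad
L_{\mathrm{GE}}(\delta, \theta) = \Bigl(\frac{\delta}{\theta}\Bigr)^{q}
                                  - q \log\frac{\delta}{\theta} - 1, \qquad
L_{\mathrm{LINEX}}(\delta, \theta) = e^{c(\delta - \theta)} - c(\delta - \theta) - 1 .
```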

Journal ArticleDOI
TL;DR: In this paper, the authors analyzed the determinant factors of Ethiopia's coffee exports (ECE) performance, in the dimension of export sales, via a more realistic model application, dynamic panel gravity model.
Abstract: Ethiopia's coffee export earnings as a percentage share of total exports have been rapidly waning over the last decades, even though coffee is the country's first commodity in currency grossing. Hence, this study analyses the determinant factors of Ethiopia's coffee export (ECE) performance, in the dimension of export sales, via a more realistic model, the dynamic panel gravity model. It commences with the disintegration of the determinants into supply- and demand-side factors. It uses short panel data that comprise 71 consistent importing countries of Ethiopia's coffee over a period of 11 years, from 2005 to 2015. The Harris–Tzavalis panel unit root test was applied to each variable, and the first-difference transformation was applied to the variables that had a unit root. A system of linear dynamic panel gravity equations was specified and estimated with the two-step generalized method of moments estimation approach. The model results suggest that lagged ECE performance, the real gross domestic product (GDP) of importing countries, the Ethiopian population, Ethiopian real GDP, openness to trade of importing countries, Ethiopian institutional quality, and weighted distance are the determinant factors of Ethiopia's coffee export performance. The study also suggests policies that would promote institutional quality, permit favorable market environments, strengthen supply capacity and trade liberalization, and target destinations with relatively cheaper transportation costs in order to improve Ethiopia's coffee export performance.
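A stylized form of the dynamic panel gravity equation such a study estimates; the regressors mirror the determinants listed above, but the notation is illustrative rather than quoted from the paper:

```latex
% Stylized dynamic panel gravity model, estimated by two-step system GMM:
\ln X_{jt} = \rho \ln X_{j,t-1}
           + \beta_1 \ln GDP_{jt} + \beta_2 \ln GDP^{ETH}_{t}
           + \beta_3 \ln POP^{ETH}_{t} + \beta_4\, OPEN_{jt}
           + \beta_5\, INST^{ETH}_{t} + \beta_6 \ln DIST_{j}
           + \mu_j + \varepsilon_{jt},
```

where $X_{jt}$ is the value of coffee exports to importer $j$ in year $t$, $\mu_j$ is an importer effect, and $\rho$ captures the persistence of past export performance.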

Journal ArticleDOI
TL;DR: An improved LDA topic model based on partition (LDAP) is proposed, which preserves the benefits of the original LDA but also refines the modeled granularity from the document level to the semantic topic level, which is particularly suitable for the topic modeling of medium and long texts.
Abstract: Latent Dirichlet Allocation (LDA) is a topic model that represents a document as a distribution of multiple topics. It expresses each topic as a distribution of multiple words by mining semantic relationships hidden in text. However, traditional LDA ignores some of the semantic features hidden inside the document semantic structure of medium and long texts. Instead of using the original LDA to model the topic at the document level, it is better to refine the document into different semantic topic units. In this paper, we propose an improved LDA topic model based on partition (LDAP) for medium and long texts. LDAP not only preserves the benefits of the original LDA but also refines the modeled granularity from the document level to the semantic topic level, which is particularly suitable for the topic modeling of the medium and long text. The extensive experimental classification results on Fudan University corpus and Sougou Lab corpus demonstrate that LDAP achieves better performance compared with other topic models, such as LDA, HDP, LSA and doc2vec.
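A minimal sketch of the baseline modeling step using gensim's LDA; the partitioning into semantic topic units (the "P" in LDAP) is stubbed out here as a simple paragraph split:

```python
# Sketch: LDA over partitioned documents. LDAP partitions by semantic topic
# units; plain paragraph breaks stand in for those units in this sketch.
from gensim import corpora, models

docs = ["first paragraph about sports ...\n\nsecond paragraph about finance ..."]
units = [p.split() for d in docs for p in d.split("\n\n")]  # crude partition

dictionary = corpora.Dictionary(units)
corpus = [dictionary.doc2bow(u) for u in units]
lda = models.LdaModel(corpus, num_topics=2, id2word=dictionary, passes=10)
for topic_id, words in lda.print_topics():
    print(topic_id, words)
```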

Journal ArticleDOI
TL;DR: In this article, the authors use the Marshall Olkin alpha power transformation to introduce a new generalized Marshall Olkin alpha power inverse exponential (MOAPIE) distribution, whose characterization and statistical properties, such as reliability, entropy and order statistics, are obtained.
Abstract: In this paper, we use the Marshall Olkin alpha power transformation method to introduce a new generalized Marshall Olkin alpha power inverse exponential (MOAPIE) distribution. Its characterization and statistical properties, such as reliability, entropy and order statistics, are obtained. Moreover, estimation of the MOAPIE parameters is discussed using the maximum likelihood estimation method. Finally, an application of the proposed new distribution to real data representing the survival times in days of guinea pigs injected with different doses of tubercle bacilli is given, and its goodness-of-fit is demonstrated. In addition, comparisons to other models are carried out to illustrate the flexibility of the proposed model.

Journal ArticleDOI
TL;DR: In this paper, a new family of distributions called the exponentiated generalized power series family is proposed and studied, and statistical properties such as stochastic order, quantile function, entropy, mean residual life and order statistics are derived.
Abstract: In this paper, a new family of distributions called the exponentiated generalized power series family is proposed and studied. Statistical properties such as stochastic order, the quantile function, entropy, mean residual life and order statistics are derived. Bivariate and multivariate extensions of the family are proposed. The method of maximum likelihood is used for the estimation of the parameters. Some special distributions from the family are defined, and their applications are demonstrated with real data sets.

Journal ArticleDOI
TL;DR: In this article, a more detailed statistical analysis of the dependence across Nigeria's inflation, exchange rate, and stock market returns is provided by means of copulas, and a positive relationship is found to exist between Nigeria's inflation and the exchange rate of the Nigerian Naira versus the USD.
Abstract: For the first time, a more detailed statistical analysis of the dependence across Nigeria's inflation, exchange rate, and stock market returns is provided by means of copulas. A positive relationship is found to exist between Nigeria's inflation and the exchange rate of the Nigerian Naira versus the USD, a negligible positive relationship exists between Nigeria's inflation and her stock market returns, and a weak positive relationship exists between the exchange rate of the Nigerian Naira versus the USD and her stock market returns. Eighteen-month forecasts for each of the time series and value at risk estimates for the Nigerian stock market returns are given. The Nigerian stock market is confirmed to be weak-form inefficient.
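A minimal sketch of the dependence-measurement step, fitting a Gaussian copula by the usual rank transformation; the two series below are simulated stand-ins, not the actual Nigerian data:

```python
# Sketch: Gaussian-copula correlation between two series via
# rank -> uniform -> normal-scores transformation.
import numpy as np
from scipy.stats import norm, rankdata

def copula_corr(x, y):
    u = rankdata(x) / (len(x) + 1)  # pseudo-observations in (0, 1)
    v = rankdata(y) / (len(y) + 1)
    return float(np.corrcoef(norm.ppf(u), norm.ppf(v))[0, 1])

rng = np.random.default_rng(0)
inflation = rng.normal(12, 2, 132)            # stand-in monthly series
fx = 0.6 * inflation + rng.normal(0, 2, 132)  # series correlated with inflation
print(f"copula correlation: {copula_corr(inflation, fx):.2f}")
```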

Journal ArticleDOI
TL;DR: A Markov Chain Monte Carlo method is presented to obtain the posterior summaries of the Chen distribution assuming upper record values and a comparison between the Bayesian and frequentist approaches is given.
Abstract: This article presents the Bayesian and classical inferences for the Chen distribution assuming upper record values. As the posterior distribution is not in closed form, a Markov Chain Monte Carlo method is presented to obtain the posterior summaries. To assess the effect of the prior on the estimated parameters, a sensitivity analysis is also part of this study. Moreover, a comparison between the Bayesian and frequentist approaches is given. Besides the simulation studies, a real data example is also discussed to show the application of the study.
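For context, the likelihood based on the first $m$ upper record values $r_1 < \cdots < r_m$ has the standard form below, with $f$ and $F$ the Chen density and CDF:

```latex
% Likelihood from upper records r_1 < ... < r_m:
L(\theta \mid r_1, \ldots, r_m)
  = f(r_m; \theta) \prod_{i=1}^{m-1} \frac{f(r_i; \theta)}{1 - F(r_i; \theta)} .
```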

Journal ArticleDOI
TL;DR: In this paper, the expected values, second moments, variances and covariances of order statistics from samples of sizes up to 10 for various values of the parameters were tabulated, and the best linear unbiased estimates of the location and scale parameters based on Type-II right-censored samples were obtained.
Abstract: The power Lindley distribution was proposed recently by Ghitany et al. (Comput Stat Data Anal 64:20–33, 2013) as a simple and useful reliability model for analysing lifetime data. This model provides more flexibility than the Lindley distribution in terms of the shape of the density and hazard rate functions as well as its skewness and kurtosis. For this distribution, exact explicit expressions for the single moments, product moments, marginal moment generating functions and joint moment generating functions of order statistics are derived. By using these relations, we tabulate the expected values, second moments, variances and covariances of order statistics from samples of sizes up to 10 for various values of the parameters. We then use these moments to obtain the best linear unbiased estimates of the location and scale parameters based on Type-II right-censored samples. In addition, we carry out some numerical illustrations through Monte Carlo simulations to show the usefulness of the findings. Finally, we apply the findings of the paper to a real data set.

Journal ArticleDOI
TL;DR: In this paper, a new family of distributions, called the generalized Burr XII power series class, was defined and studied by compounding the generalized Burr XII and power series distributions, and the maximum likelihood estimation method was used to estimate the model parameters.
Abstract: We define and study a new family of distributions, called the generalized Burr XII power series class, by compounding the generalized Burr XII and power series distributions. Several properties of the new family are derived. The maximum likelihood estimation method is used to estimate the model parameters. The importance and potentiality of the new family are illustrated by means of three applications to real data sets.

Journal ArticleDOI
TL;DR: In this paper, the authors introduce a new lifetime distribution, the transmuted extended exponential distribution, which generalizes the extended exponential distribution with an additional parameter using the quadratic rank transmutation map.
Abstract: We introduce a new lifetime distribution, namely the transmuted extended exponential distribution, which generalizes the extended exponential distribution proposed by Nadarajah and Haghighi (Statistics 45:543–558, 2011) with an additional parameter, using the quadratic rank transmutation map studied by Shaw and Buckley (The alchemy of probability distributions: beyond Gram-Charlier expansions, and a skew-kurtotic-normal distribution from a rank transmutation map, 2009. arXiv:0901.0434), to provide greater flexibility in modeling data from a practical point of view. In this paper, our main focus is on estimation from a frequentist point of view; nevertheless, some statistical and reliability characteristics of the model are derived. We briefly describe different estimation procedures, namely the method of maximum likelihood estimation, maximum product of spacings estimation and least squares estimation. Monte Carlo simulations are performed to compare the performance of the proposed methods of estimation for both small and large samples. Finally, the potentiality of the model is analyzed by means of one real data set.
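As a sketch, combining the quadratic rank transmutation map with the Nadarajah-Haghighi baseline gives a CDF of the form below; both ingredients are written in their standard forms, though the paper's notation may differ:

```latex
% Quadratic rank transmutation of a baseline CDF G, |lambda| <= 1:
F(x) = (1 + \lambda)\, G(x) - \lambda\, G(x)^2,
\qquad
G(x) = 1 - e^{\,1 - (1 + \alpha x)^{\beta}}, \quad x > 0 .
```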

Journal ArticleDOI
TL;DR: This research provides an effective solution through crime ontologies and an enhanced ant-based crawler to extract the characteristics of and relationships among Web pages for the recreation and extraction of crime scenarios.
Abstract: Crime analysis is one of the important activities of information security agencies. They collect crime data from the Web with appropriate procedures and tools. The main challenge many of these agencies face is to analyze the increasing amount of crime information efficiently and accurately. The cybercrime information presented on Web pages is in the form of text and needs to be analyzed and investigated. Although some approaches have been presented to support Web crime mining, issues of efficiency and effectiveness still exist. Because much of the crime information on the Web can be described by ontologies, semantic technology can be used to study the patterns and processes of Web crimes. Therefore, in order to extract and reveal Internet crime, an improved Web ontology is useful for extracting the characteristics of and relationships among Web pages for the recreation and extraction of crime scenarios. The main purpose of this study is to develop an optimized ontology-based approach for Web crime mining. The proposed framework was designed based on an enhanced crime ontology using an ant-miner focused crawler, which draws inspiration from biological research on ant foraging behavior. Ant colony optimization was used to optimize the proposed framework. The proposed work was evaluated based on accuracy criteria. The evaluation results show that this research provides an effective solution through crime ontologies and an enhanced ant-based crawler.