
Showing papers in "American Journal of Mathematical and Management Sciences in 2008"


Journal ArticleDOI
TL;DR: In this paper, a nonparametric version of the Bartlett-Nanda-Pillai multivariate test has been introduced in the asymptotic context of a large number of treatments and small sample sizes per treatment (large a, small n).
Abstract: SYNOPTIC ABSTRACTWe consider a nonparametric version of the Bartlett-Nanda-Pillai multivariate test that has been introduced in Bathke and Harrar (2008) in the asymptotic context of a large number of treatments and small sample sizes per treatment (large a, small n). The test is based on separate rankings for the different variables. Here, we derive its asymptotic distribution for large ni and small a. Also, two small sample approximations are presented, and their performance is investigated in a simulation study. In the presence of outliers, the proposed nonparametric version shows far greater power than the parametric Bartlett-Nanda-Pillai test. Similar to the parametric case, there is no clear ordering when comparing the nonparametric versions of the Bartlett-Nanda-Pillai, Lawley-Hotelling, and ANOVA-type tests. We show how to apply the test in practice, using SAS. The application is demonstrated with two different data sets conforming to the two different asymptotic frameworks, large a and large ni respect...

23 citations


Journal ArticleDOI
TL;DR: This paper addresses the problem of class discovery which involves unsupervised clustering of tissue samples based on corresponding gene expression data, and describes a method for class discovery and dimensionality reduction using NMF, based on various measures of distance between two non-negative matrices.
Abstract: DNA microarray technology has made it possible to simultaneously measure the expression levels of tens of thousands of genes. In this paper, we address the problem of class discovery, which involves unsupervised clustering of tissue samples based on corresponding gene expression data. Non-negative matrix factorization (NMF) by the multiplicative updates algorithm is a powerful method for decomposing the gene expression matrix V into two matrices with nonnegative entries, V ≈ WH, where each column of W defines a metagene and each column of H represents the metagene expression pattern of the corresponding sample. We describe a method for class discovery and dimensionality reduction using NMF, based on various measures of distance between two non-negative matrices. Our approach provides a unique framework for class discovery. We demonstrate the applicability of this method using cancer microarray data as well as simulated data.
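For intuition, here is a minimal Python sketch (not the paper's exact procedure) of the V ≈ WH decomposition and a common NMF-based class-discovery heuristic: factor the expression matrix with scikit-learn's multiplicative-update NMF and assign each sample to its dominant metagene. The toy matrix, the choice of k, and the argmax rule are illustrative assumptions; the paper's distance measures between non-negative matrices are not reproduced.

```python
# Illustrative sketch only: factor a nonnegative gene-expression matrix
# V (genes x samples) as V ~ WH with multiplicative updates, then assign each
# sample to the metagene with the largest coefficient in H.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
n_genes, n_samples, k = 200, 30, 3              # k = assumed number of metagenes
V = rng.gamma(shape=2.0, scale=1.0, size=(n_genes, n_samples))  # toy data

model = NMF(n_components=k, init="nndsvda", solver="mu",
            beta_loss="frobenius", max_iter=1000, random_state=0)
W = model.fit_transform(V)                      # columns of W: metagenes
H = model.components_                           # columns of H: sample patterns

labels = H.argmax(axis=0)                       # cluster = dominant metagene
print("sample cluster labels:", labels)
print("reconstruction error ||V - WH||_F:", model.reconstruction_err_)
```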

16 citations


Journal ArticleDOI
TL;DR: This paper analyzes a single server Markovian queuing system with the added complexity of customers who are prone to giving up whenever their waiting time is larger than a random threshold - their patience time.
Abstract: This paper analyzes a single server Markovian queuing system with the added complexity of customers who are prone to giving up whenever their waiting time is larger than a random threshold - their pati...
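As a rough illustration of such a system (not the paper's analysis), the sketch below simulates a single-server Markovian FIFO queue in which each arriving customer abandons if the offered waiting time exceeds an exponentially distributed patience time; all rates are arbitrary assumptions.

```python
# Minimal simulation sketch: M/M/1 queue with reneging -- a customer gives up
# when the time until it would reach the server exceeds its patience time.
import random

def mm1_with_reneging(lam=0.9, mu=1.0, theta=0.5, n_customers=100_000, seed=1):
    random.seed(seed)
    t_arrival, served, reneged = 0.0, 0, 0
    server_free_at = 0.0
    for _ in range(n_customers):
        t_arrival += random.expovariate(lam)          # Poisson arrivals
        wait = max(0.0, server_free_at - t_arrival)   # offered FIFO wait
        patience = random.expovariate(theta)          # random threshold
        if wait > patience:
            reneged += 1                              # customer gives up
        else:
            served += 1
            server_free_at = t_arrival + wait + random.expovariate(mu)
    return served, reneged

s, r = mm1_with_reneging()
print(f"served: {s}, reneged: {r}, abandonment fraction: {r / (s + r):.3f}")
```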

10 citations


Journal ArticleDOI
TL;DR: In this paper, a model for mixtures of generalized lambda distributions (GLDs) was proposed; the GLD family can fit the normal well and can also do well in cases where the normal cannot, and the change of shapes of the mixtures with different mixing proportions is illustrated.
Abstract: SYNOPTIC ABSTRACTMixture models were studied by Karl Pearson in 1894 when he fitted a mixture of two normal distributions to data consisting of measurements on the ratio of forehead to body length in 1000 crabs. Most work since that time has used mixtures of normal distributions. In this paper, we consider a model for mixtures of generalized lambda distributions (GLDs). The advantage of using the GLD family is that the GLD can fit the normal well; hence, whenever a mixture of normals fits data well, so will a mixture of at most the same number of GLDs. Meanwhile, the GLD family is a much broader family, and can do well in cases where the normal cannot. In this paper, we fit Pearson's data by using a mixture of two GLDs. We also show the change of shapes of the mixtures with different mixing proportions. We include examples and computational comparisons with normal mixtures, using the Kullback-Leibler (KL) distance and the overlapping coefficient (δ).
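The two comparison measures named above can be evaluated numerically on a grid. The hedged sketch below does so for two example normal-mixture densities standing in for the fitted mixtures (the GLD fitting itself, and Pearson's crab data, are not reproduced).

```python
# Numerical sketch of the two measures: Kullback-Leibler distance
# KL(f||g) = integral of f*log(f/g), and overlapping coefficient
# delta = integral of min(f, g); f and g are arbitrary example mixtures.
import numpy as np
from scipy.stats import norm

x = np.linspace(-10, 10, 20001)
dx = x[1] - x[0]
f = 0.6 * norm.pdf(x, -1.0, 1.0) + 0.4 * norm.pdf(x, 2.5, 1.2)
g = 0.5 * norm.pdf(x, -0.8, 1.1) + 0.5 * norm.pdf(x, 2.3, 1.3)

kl = np.sum(f * np.log(f / g)) * dx        # KL distance (f, g > 0 on this grid)
delta = np.sum(np.minimum(f, g)) * dx      # overlapping coefficient, in [0, 1]
print(f"KL(f||g) = {kl:.4f}, overlap delta = {delta:.4f}")
```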

9 citations


Journal ArticleDOI
TL;DR: In this article, three biased but simple estimators for the mean of the normal distribution when the coefficient of variation is known are proposed and their properties are studied and compared with some other existing estimators.
Abstract: SYNOPTIC ABSTRACTIn many situations, the coefficient of variation is known though the mean and variance may not be known. This additional information on the coefficient of variation can be used to improve upon the usual estimator of the unknown mean. Three biased but simple estimators for the mean of the normal distribution when the coefficient of variation is known are proposed and their properties are studied. The performance of these estimators is compared with that of some other existing estimators, and it turns out that the newly proposed estimators compete well.
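The paper's three estimators are not reproduced here. As a hedged illustration of how a known coefficient of variation can improve on the sample mean, the sketch below uses the textbook shrinkage estimator c·X̄ with c = n/(n + CV²), which minimizes mean squared error among multiples of the sample mean, and compares it with X̄ by simulation.

```python
# Hedged illustration (not necessarily one of the paper's estimators): with
# CV = sigma/mu known, c = n/(n + CV^2) minimizes MSE among estimators c*Xbar.
import numpy as np

rng = np.random.default_rng(0)
mu, cv, n, reps = 5.0, 0.8, 10, 200_000
sigma = cv * mu

xbar = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)
shrunk = (n / (n + cv**2)) * xbar              # biased, but smaller MSE

print("MSE(sample mean)       :", np.mean((xbar - mu) ** 2))
print("MSE(shrunken estimator):", np.mean((shrunk - mu) ** 2))
```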

6 citations


Journal ArticleDOI
TL;DR: It is shown, by using the simulated and real mixed data sets, that the proposed algorithm provides a significant improvement in finding more accurate independent components in the presence of outliers.
Abstract: SYNOPTIC ABSTRACTIndependent Component Analysis (ICA) is a statistical and computational technique for decomposing complex multivariate data into independent components. Several methods have been proposed to find the independent components, and they all assume homogeneity (freedom from outliers) of the data, which is almost never true in practice. In this study, we propose an algorithm to improve ICA performance in the presence of outliers by introducing an additional step in the data pre-processing. We also show, using simulated and real mixed data sets, that the proposed algorithm provides a significant improvement in finding more accurate independent components in the presence of outliers.
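As a sketch of the general idea only (the paper's specific pre-processing step is not reproduced), the example below screens gross outliers with a robust median/MAD rule before running scikit-learn's FastICA on a toy two-source mixture; the cutoff of 6 robust standard deviations is an arbitrary assumption.

```python
# Outlier screening followed by ICA on a toy mixture of two sources.
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
t = np.linspace(0, 8, 2000)
S = np.c_[np.sin(2 * t), np.sign(np.sin(3 * t))]     # two independent sources
A = np.array([[1.0, 0.5], [0.7, 1.2]])               # mixing matrix
X = S @ A.T
X[rng.choice(len(X), 20, replace=False)] += 25.0     # inject gross outliers

med = np.median(X, axis=0)
mad = np.median(np.abs(X - med), axis=0)
robust_z = np.abs(X - med) / (1.4826 * mad)          # MAD-based z-scores
clean = X[(robust_z < 6).all(axis=1)]                # drop flagged rows

S_hat = FastICA(n_components=2, random_state=0).fit_transform(clean)
print("recovered component matrix shape:", S_hat.shape)
```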

5 citations


Journal ArticleDOI
TL;DR: Brown et al. as mentioned in this paper consider the problem of finding the optimal sample size for obtaining a confidence interval of a pre-assigned precision (or length) for the proportion parameter of a finite or infinite binary population.
Abstract: SYNOPTIC ABSTRACTIn this paper we consider the well-known problem of finding the optimal sample size for obtaining a confidence interval of a pre-assigned precision (or length) for the proportion parameter of a finite or infinite binary population. We illustrate some special problems that arise due to the discreteness of the population and precision being measured by the length of the interval rather than by the variance. Specifically, the confidence level of an interval of fixed length does not necessarily increase as the sample size increases. Practitioners usually associate precision with variance, and variance monotonically decreases as sample size increases, regardless of whether the population is discrete or continuous. However, other notions of precision, such as the length of the confidence interval discussed here, do not monotonically improve with increasing sample size. Although the results shown here are implicit in previous work [see Brown, Cai, and DasGupta (2001, 2002, 2003) and Cai (2005...
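The non-monotonicity can be checked directly. The sketch below computes the exact coverage probability of an interval of fixed length 2d centered at the sample proportion, for consecutive sample sizes; the choices p = 0.5 and d = 0.05 are arbitrary.

```python
# Exact coverage of the fixed-length interval (phat - d, phat + d) at a given
# p, for increasing n: the values oscillate rather than increase monotonically.
from scipy.stats import binom

p, d = 0.5, 0.05                      # true proportion and half-length
for n in range(10, 21):
    lo, hi = n * (p - d), n * (p + d)
    covered = [x for x in range(n + 1) if lo < x < hi]   # |x/n - p| < d
    coverage = sum(binom.pmf(x, n, p) for x in covered)
    print(f"n = {n:2d}  coverage = {coverage:.4f}")
```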

5 citations


Journal ArticleDOI
TL;DR: In this paper, the multivariate version of the generalized Tukey conjecture has been affirmatively proved in the case of three correlated mean vectors by Seo, Mano and Fujikoshi (1994).
Abstract: SYNOPTIC ABSTRACTIn this article, conservative simultaneous confidence intervals for pairwise comparisons among mean vectors in multivariate normal distributions are considered. In order to give the simultaneous confidence intervals, we need the value of the upper 100α percentile of the T²max·p statistic. However, it is difficult to find the exact value. So, as an approximation procedure, Seo, Mano and Fujikoshi (1994) proposed the multivariate Tukey-Kramer procedure, which is the multivariate version of the Tukey-Kramer procedure (Tukey (1953); Kramer (1956, 1957)). Also, the multivariate version of the generalized Tukey conjecture has been affirmatively proved in the case of three correlated mean vectors by Seo, Mano and Fujikoshi (1994). In this article, the affirmative proof of the multivariate generalized Tukey conjecture in the case of four mean vectors is completed. Further, the upper bound for the conservativeness of the multivariate Tukey-Kramer procedure is also given. Finally, numerical results b...

4 citations


Journal ArticleDOI
TL;DR: Comparative results from a computational experiment over a common set of benchmark problems show the proposed procedure to match or outperform some of the best heuristic routing methods while being fast and highly competitive.
Abstract: SYNOPTIC ABSTRACTRecent progress in information technology brings new challenges in dealing with the dynamic vehicle routing problem with time windows (DVRPTW). However, few alternate methods to the well-known Tabu Search-based techniques have been proposed so far to solve the DVRPTW efficiently. In this paper, a new hybrid genetic approach (HGA-LCS) to address the DVRPTW is presented. The basic scheme consists of concurrently evolving two populations of solutions to minimize customer service denial, lateness or temporal constraint violation, and total traveled distance. Combining variations of key concepts inspired from routing techniques and search strategies to define new hybrid genetic operators, the proposed approach also exploits a least-commitment routing policy in servicing scheduled customers to potentially improve solution quality. The strategy consists of delaying customer visits, and therefore premature route construction, as long as possible to deal with a larger number of customers all at onc...

4 citations


Journal ArticleDOI
TL;DR: In this article, stochastic orderings based on the failure rate function and likelihood ratio are discussed for non-negative integer valued random variables, and it is expected that the results presented could inspire further work on this little-researched area.
Abstract: SYNOPTIC ABSTRACTLittle is known about stochastic orderings of discrete random variables. In this note, stochastic orderings based on the failure rate function and the likelihood ratio are discussed for non-negative integer-valued random variables. It is expected that the results presented could inspire further work in this little-researched area.
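As a worked example of the two orderings named above (an illustration, not taken from the paper), the sketch below verifies numerically that for Poisson rates l1 ≤ l2 the pmf ratio p2(k)/p1(k) is non-decreasing in k (likelihood-ratio order) and that the discrete failure rates r(k) = P(X = k)/P(X ≥ k) are ordered accordingly.

```python
# Likelihood-ratio and failure-rate orderings for two Poisson distributions.
import numpy as np
from scipy.stats import poisson

l1, l2 = 1.0, 2.0
k = np.arange(0, 30)

ratio = poisson.pmf(k, l2) / poisson.pmf(k, l1)   # = exp(l1 - l2) * (l2/l1)**k
r1 = poisson.pmf(k, l1) / poisson.sf(k - 1, l1)   # P(X = k) / P(X >= k)
r2 = poisson.pmf(k, l2) / poisson.sf(k - 1, l2)

print("pmf ratio non-decreasing:", np.all(np.diff(ratio) >= 0))
print("failure rates ordered   :", np.all(r1 >= r2))
```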

4 citations


Journal ArticleDOI
TL;DR: In this article, a Monte Carlo study was conducted to study the behavior of bias and variance of estimates, and a procedure was developed to construct confidence interval estimates for these overlap measures.
Abstract: SYNOPTIC ABSTRACTThe exponential distribution is one of the most commonly used distributions for modeling survival functions. Even when the hypothesis testing procedure indicates differences in two exponential parameters, there may be considerable overlap between two distributions. In many practical situations, researchers compare survival functions for two populations using testing of hypothesis procedures. However, an overlap coefficient describes the amount of similarity between two distributions. Thus, instead of just looking at the equality of parameters one also needs to estimate the amount of overlap between two distributions. This paper presents estimation of overlap measures and bias and variance of their estimates. To study the behavior of bias and variance of estimates, a Monte Carlo study was conducted, the results of which are presented in this paper. A procedure was developed to construct confidence interval estimates for these overlap measures. Some results are demonstrated using survival d...
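For two exponential densities, the overlap coefficient has a simple closed form; the sketch below evaluates it and checks it by numerical integration (the paper's estimators, their bias and variance, and the confidence-interval procedure are not reproduced).

```python
# Overlapping coefficient OVL = integral of min(f1, f2) for two exponential
# densities with rates l1 < l2, which cross at x* = log(l2/l1)/(l2 - l1).
import numpy as np

def overlap_exponential(l1, l2):
    if l1 == l2:
        return 1.0
    l1, l2 = min(l1, l2), max(l1, l2)
    x_star = np.log(l2 / l1) / (l2 - l1)        # crossing point of the pdfs
    return (1 - np.exp(-l1 * x_star)) + np.exp(-l2 * x_star)

l1, l2 = 0.5, 1.5
x = np.linspace(0, 60, 600001)
f1, f2 = l1 * np.exp(-l1 * x), l2 * np.exp(-l2 * x)
numeric = np.sum(np.minimum(f1, f2)) * (x[1] - x[0])
print("closed form:", overlap_exponential(l1, l2), " numeric:", numeric)
```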

Journal ArticleDOI
TL;DR: In this article, a model is constructed for the dynamic covariance structure in the stock returns which allows us to measure the current diversification level of the investment portfolio of stocks and detect these unusually behaving stocks.
Abstract: SYNOPTIC ABSTRACTThis paper addresses the assessment of risk associated with an investment portfolio of stocks. We consider two issues concerning risk: the overall diversification level of the investment portfolio of stocks and stocks that behave unusually when compared to similar stocks in the portfolio. A model is constructed for the dynamic covariance structure in the stock returns which allows us to measure the current diversification level of the investment portfolio. Once we are able to estimate this dynamic covariance structure, we are able to detect these unusually behaving stocks.

Journal ArticleDOI
TL;DR: Initial results are described from ongoing research that statistically compares goodness-of-fit obtained from fitting several families of distributions to a sample of commonly applied distributions, and suggest that it is possible to rank widely used families of distributions in terms of their capability to serve as general platforms for distribution fitting.
Abstract: SYNOPTIC ABSTRACTResponse Modeling Methodology (RMM) is a new approach for empirical modeling of systematic variation and of random variation. Applied to various fields of science, engineering and operations management, RMM has been shown to deliver good modeling capabilities while preserving desirable “uniformity of practice” across widely divergent disciplines. In this paper, RMM is briefly outlined, and its basic philosophy, relative to other approaches, is discussed. A detailed numerical example demonstrates application of RMM for distribution fitting and compares the results to fitting by the generalized lambda distribution. Initial results are described from ongoing research that statistically compares goodness-of-fit obtained from fitting several families of distributions to a sample of commonly applied distributions. The results suggest that it is possible to rank widely used families of distributions in terms of their capability to serve as general platforms for distribution fitting.

Journal ArticleDOI
TL;DR: In this article, the α-cut approach is used to transform a fuzzy queue into a family of crisp queues in this context, and a set of parametric nonlinear programs is developed to describe the crisp queues with nonpreemptive priority.
Abstract: SYNOPTIC ABSTRACTThe nonpreemptive priority queues are effective for performance evaluations of production/manufacturing systems, inventory control and computer and telecommunication systems. Due to uncontrollable factors, parameters in the nonpreemptive priority queues may be fuzzy. This work constructs the membership functions of the system characteristics of a nonpreemptive priority queueing system with J type jobs, in which the arrival rate and service rate for jobs are all fuzzy numbers. The α-cut approach is used to transform a fuzzy queue into a family of conventional crisp (distinct) queues in this context. By means of the membership functions of the system characteristics, a set of parametric nonlinear programs is developed to describe the family of crisp queues with nonpreemptive priority. A numerical example is solved successfully to illustrate the validity of the proposed approach. Because the system characteristics are expressed and governed by the membership functions, the fuzzy nonpreemptiv...
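As a simplified illustration of the α-cut idea only, the sketch below uses a single-class M/M/1 queue in place of the paper's J-class nonpreemptive priority system: triangular fuzzy arrival and service rates are cut at each α, and because the mean queueing delay Wq = λ/(μ(μ − λ)) is monotone in both rates, the bounds of each α-cut are attained at the interval endpoints (the general case requires the parametric nonlinear programs described above).

```python
# Alpha-cuts of a fuzzy M/M/1 mean queueing delay with triangular fuzzy rates.
import numpy as np

def tri_cut(a, b, c, alpha):
    """alpha-cut [lower, upper] of a triangular fuzzy number (a, b, c)."""
    return a + alpha * (b - a), c - alpha * (c - b)

def wq(lam, mu):
    return lam / (mu * (mu - lam))

lam_tri, mu_tri = (3.0, 4.0, 5.0), (6.0, 7.0, 8.0)   # assumed fuzzy parameters
for alpha in np.linspace(0.0, 1.0, 6):
    lam_lo, lam_hi = tri_cut(*lam_tri, alpha)
    mu_lo, mu_hi = tri_cut(*mu_tri, alpha)
    # Wq increases in lambda and decreases in mu (and mu_lo > lam_hi here),
    # so the extremes occur at the corners of the (lambda, mu) rectangle.
    print(f"alpha={alpha:.1f}  Wq in [{wq(lam_lo, mu_hi):.4f}, {wq(lam_hi, mu_lo):.4f}]")
```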

Journal ArticleDOI
TL;DR: In this article, a convolution technique is used to smooth out the empirical density function of the combined samples, which is later used in the construction of a log likelihood function for the combined sample, and the likelihood function is minimized with respect to an arbitrary shift variable to find an estimate of the true shift parameter.
Abstract: SYNOPTIC ABSTRACTThis study estimates the shift parameter in the two-sample location problem. The proposed method combines sample X with sample Y shifted by an arbitrary shift variable, and a convolution technique is used to smooth out the empirical density function of the combined sample. The smoothed empirical density function is later used in the construction of a log likelihood function for the combined sample. The likelihood function is minimized with respect to an arbitrary shift variable to find an estimate of the true shift parameter. As shown in the study, the proposed estimator is asymptotically efficient and robust against contaminations and outliers. This result is supported by an example and by a simulation study.
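A rough sketch of the general idea (not the paper's exact construction): for each candidate shift, pool X with the back-shifted Y, smooth the pooled empirical density by convolution with a Gaussian kernel (a kernel density estimate), and choose the shift that minimizes the negative of the smoothed pooled log-likelihood.

```python
# Shift estimation by minimizing the negative log-likelihood of the pooled
# sample under its own kernel-smoothed density; data and bounds are arbitrary.
import numpy as np
from scipy.stats import gaussian_kde
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)
true_shift = 1.7
x = rng.standard_t(df=3, size=300)              # heavy-tailed example data
y = rng.standard_t(df=3, size=300) + true_shift

def neg_loglik(delta):
    pooled = np.concatenate([x, y - delta])
    kde = gaussian_kde(pooled)                  # convolution-smoothed density
    return -np.sum(np.log(kde(pooled)))

res = minimize_scalar(neg_loglik, bounds=(-5, 5), method="bounded")
print("estimated shift:", res.x, " (true shift:", true_shift, ")")
```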

Journal ArticleDOI
Xiao-Feng Wang
TL;DR: In this article, a semiparametric mixed regression model was proposed to decompose pattern variation by using the Karhunen-Loeve expansion of a spatial random process.
Abstract: SYNOPTIC ABSTRACTThis paper is concerned with analyzing image data with repeated measurements. Modeling the complex spatial variability in medical images can enhance the inference for comparing treatments and examining the time evolution. We propose a semiparametric mixed regression model that combines the robustness of a low-rank spline method with the efficiency of likelihood-based parameter estimation. In the convenient framework of a linear mixed model, we decompose pattern variation by using the Karhunen-Loeve expansion of a spatial random process. Additional spatial covariates, the time variable and other experimental factors are easily accommodated in the linear predictor, which enables comprehensive analyses underlying spatial patterns. The back-fitting algorithm and the smoothing parameter selections are investigated by simulation for our semiparametric model. We illustrate our method through application to data from a clinical study of neuromuscular electrical stimulation effects on the seati...

Journal ArticleDOI
TL;DR: In this article, the problem of simultaneous prediction of the actual and average values of the study variable in a linear regression model is considered when some prior exact restrictions binding the regression coefficients are available.
Abstract: SYNOPTIC ABSTRACTThis paper considers the problem of simultaneous prediction of the actual and average values of the study variable in a linear regression model when some prior exact restrictions, which bind the regression coefficients, are available. The performance properties of the predictors based on restricted least squares and improved estimators are analyzed. A comparison of these predictors with respect to risk as the performance criterion is then presented.
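For concreteness, a hedged sketch of the restricted least-squares estimator under exact linear restrictions Rβ = r, and the prediction it yields at a new point, is given below; the paper's improved estimators and the risk comparison are not reproduced.

```python
# Restricted least squares:
# b_RLS = b_OLS + (X'X)^{-1} R' [R (X'X)^{-1} R']^{-1} (r - R b_OLS).
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 3
X = np.c_[np.ones(n), rng.normal(size=(n, p - 1))]
beta_true = np.array([1.0, 2.0, -2.0])              # satisfies b1 + b2 = 0
y = X @ beta_true + rng.normal(scale=0.5, size=n)

R = np.array([[0.0, 1.0, 1.0]])                     # exact restriction: b1 + b2 = 0
r = np.array([0.0])

XtX_inv = np.linalg.inv(X.T @ X)
b_ols = XtX_inv @ X.T @ y                           # unrestricted OLS
b_rls = b_ols + XtX_inv @ R.T @ np.linalg.solve(R @ XtX_inv @ R.T, r - R @ b_ols)

x_new = np.array([1.0, 0.3, -0.8])
print("OLS prediction:", x_new @ b_ols, " RLS prediction:", x_new @ b_rls)
print("restriction satisfied by RLS:", np.allclose(R @ b_rls, r))
```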

Journal ArticleDOI
TL;DR: In this article, the authors define skew-symmetric distributions based on the reflected generalized uniform distribution, a double compound gamma distribution and a double generalized Pareto distribution, all of which have symmetric density about zero.
Abstract: SYNOPTIC ABSTRACTWe define skew-symmetric distributions based on the reflected generalized uniform distribution, a double compound gamma distribution and a double generalized Pareto distribution, all of which have symmetric density about zero. Expressions are derived for the probability density function (pdf), cumulative distribution function (cdf), and moments of the distributions. Several special functions are involved in the calculations.

Journal ArticleDOI
TL;DR: In this article, truncated D-optimal screening designs are proposed to reduce the number of experimental runs below that of a minimum D-optimal design, with covering arrays adapted to guide this reduction.
Abstract: SYNOPTIC ABSTRACTD-optimal designs have proved useful in analyzing common factorial experiments involving multilevel categorical factors. When analyzed by ANOVA, they allow the estimation of coefficients in a regression equation and the contributions to the variance by the main effects and interactions. If the measurement of contribution to variance is necessary but the estimation of all interaction coefficients in the regression equation is not, it is possible to reduce the number of experimental runs below a minimum D-optimal design, using what we call truncated D-optimal screening designs. D-efficiency calculations are not available due to the singularity of the design matrix; another method must be used to pare down the matrix while maintaining reasonable estimation of the original full factorial data. Covering arrays are adapted to guide this reduction. Combining properties of D-optimal designs and covering arrays produces designs that perform well at estimating full factorial results. A method is th...

Journal ArticleDOI
TL;DR: In this article, the authors derived explicit expressions for the moments of Y for any fixed r and the product moments of arbitrary higher orders of the variables X and Y. The results are derived by assuming that the distribution of the random variable is a member of the generalized lambda distribution family.
Abstract: SYNOPTIC ABSTRACTLet X be a unimodal continuous random variable with support (α, β) and Y = min(X, r), α < r < β, α and β being any reals. Then Y is a mixed-type random variable, continuous within the range α < Y < r and discrete at Y = r, and we call Y a mixed type of truncated random variable. In this article, we derive the explicit expressions for the moments of Y for any fixed r and the product moments of arbitrary higher orders of the variables X and Y. The results are derived by assuming that the distribution of the random variable is a member of the generalized lambda distribution family. Applications of the results to find the optimal deductibles in the purchase of an automobile insurance policy and the optimum order size maximizing the expected utility of an investor in an inventory model are given. As the method employed is independent of any specific distributional assumptions, it can be used for all unimodal continuous distributions.
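A worked numerical illustration of the mixed-type construction: E[Y^k] = ∫ y^k f(y) dy over (α, r) plus r^k·P(X ≥ r), the second term being the discrete mass at Y = r. A gamma distribution stands in for concreteness (the generalized lambda family's quantile-based evaluation is not reproduced), and the analytic value is checked against simulation.

```python
# Moments of Y = min(X, r) for continuous X: continuous part plus point mass at r.
import numpy as np
from scipy import integrate
from scipy.stats import gamma

dist, r_cap, k = gamma(a=2.0, scale=1.5), 3.0, 2      # assumed example values

continuous_part, _ = integrate.quad(lambda y: y**k * dist.pdf(y), 0, r_cap)
moment = continuous_part + r_cap**k * dist.sf(r_cap)  # add mass at Y = r

sim = np.minimum(dist.rvs(size=1_000_000, random_state=0), r_cap)
print("analytic E[Y^2]:", moment, " simulated:", np.mean(sim**k))
```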

Journal ArticleDOI
TL;DR: In this paper, a semi-parametric method is proposed to evaluate the risks of rare events from biased samples with non-ignorable missing values, and a modified iterative re-weighted least square estimator is developed to correct the biases caused by both the self-selection and the nonresponses.
Abstract: SYNOPTIC ABSTRACTSurvey data usually contain a large portion of missing values and may be subject to selection bias as well. In this paper, a semi-parametric method is proposed to evaluate the risks of rare events from biased samples with non-ignorable missing values. A logistic regression model is used and a modified iterative re-weighted least square estimator is developed to correct the biases caused by both the self-selection and the nonresponses. The performance of the new method is illustrated via simulations by comparing with the performances of three existing methods.

Journal ArticleDOI
TL;DR: In this paper, the authors provided variance formulae for the numbers of one- and two-step level-crossings in a strictly stationary ellipsoidal process whose finite-dimensional distributions are governed by scale mixtures of normal distributions.
Abstract: SYNOPTIC ABSTRACTThe number of one-step level-crossings corresponding to fixed levels υ1 and υ2 (−∞ < υ1 < υ2 < ∞) for a sample X1,…, XN is defined by the number of crossings such that Xk ≥ υ2 and υ1 ≤ Xk–1 < υ2, or υ1 ≤ Xk < υ2 and Xk–1 ≥ υ2, or υ1 ≤ Xk < υ2 and Xk–1 < υ1, or finally Xk < υ1 and υ1 ≤ Xk–1 < υ2, for k = 2,3,…,N. Similarly, the number of two-step level-crossings is defined by the number of crossings such that Xk ≥ υ2 and Xk–1 < υ1, or Xk < υ1 and Xk–1 ≥ υ2. The paper provides variance formulae for the numbers of one- and two-step level-crossings in a strictly stationary ellipsoidal process whose finite-dimensional distributions are governed by scale mixtures of normal distributions. A numerical illustration for Gaussian, Cauchy, t- and Laplace processes is given.
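A direct sketch of the crossing counts as defined above, for a sample path and two fixed levels υ1 < υ2 (the band being υ1 ≤ x < υ2); the variance formulae themselves are not reproduced.

```python
# Count one-step and two-step level-crossings of a sample path for levels v1 < v2.
import numpy as np

def crossing_counts(x, v1, v2):
    one_step = two_step = 0
    for prev, cur in zip(x[:-1], x[1:]):
        band_prev = v1 <= prev < v2
        band_cur = v1 <= cur < v2
        # one-step: moves between the band [v1, v2) and a neighbouring region
        if ((cur >= v2 and band_prev) or (band_cur and prev >= v2)
                or (band_cur and prev < v1) or (cur < v1 and band_prev)):
            one_step += 1
        # two-step: jumps across both levels at once
        if (cur >= v2 and prev < v1) or (cur < v1 and prev >= v2):
            two_step += 1
    return one_step, two_step

rng = np.random.default_rng(0)
x = rng.standard_t(df=3, size=10_000)        # heavy-tailed example path
print(crossing_counts(x, v1=-1.0, v2=1.0))
```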

Journal ArticleDOI
TL;DR: In this paper, a fully Bayesian approach to estimation of parameters for generalized Poisson data in a multiple population context has been proposed, which has applications in industrial, biological and sociological disciplines.
Abstract: SYNOPTIC ABSTRACTWe consider a fully Bayesian approach to estimation of parameters for generalized Poisson data in a multiple population context. The hierarchical model we consider here extends previous single population models. This hierarchical model has applications in industrial, biological and sociological disciplines. We also extend two recently developed subset selection criteria to the generalized Poisson hierarchical model we propose. The procedures are used to analyze two real data sets. Probabilities are approximated via Markov Chain Monte Carlo.

Journal ArticleDOI
TL;DR: In this paper, rotation criteria are used as penalty functions in a maximum penalized likelihood setting for principal component analysis, providing a way to measure the fidelity of rotated components to the data; the objective is to assign interpretations to the components that provide intuition and serve as a guide for further exploration.
Abstract: SYNOPTIC ABSTRACTPrincipal component analysis provides a ready exploratory tool for multivariate data when a priori models are unavailable. The objective is to assign interpretations to the components that provide intuition and serve as a guide for further exploration. However, principal components based on small samples are subject to high sampling variation that can obscure straightforward interpretations. Ad hoc techniques like Varimax rotation can enhance interpretability, at the expense of possibly losing fidelity to the data. These problems are alleviated if rotation criteria are instead used as penalty functions in a maximum penalized likelihood setting. Advantageous features of this approach include a smooth continuum of possible rotations, preferential rotation of components that are poorly defined, and a way to measure fidelity of rotated components to the data. Some computational challenges inherent in this technique have been alleviated by recent developments in algorithms for optimization sub...

Journal ArticleDOI
TL;DR: In this paper, a nonlinear time series model that can best be described as a complex-valued random coefficient regression is presented, with the design matrix depending nonlinearly on the direction and speed parameters.
Abstract: SYNOPTIC ABSTRACTAn important objective of an environmental monitoring study is to understand and characterize the effects of local transport of air pollutants through the diurnal or semidiurnal pattern of pollutant concentrations. Due to the complexity of spatial and temporal dynamics of pollutant data, this problem has not received serious attention in statistics. The problem involves developing a statistical model for a hypothesis test for no-transport and estimating the incoming direction and speed of the transport that crosses a set of monitoring stations at a given time period. This paper discusses a nonlinear time series model that can best be described as a complex-valued random coefficient regression, with the design matrix depending nonlinearly on the direction and speed parameters. Using observed pollutant data from a set of local monitoring stations, maximum likelihood estimation for the transport parameters and a hypothesis test for no-transport are considered under stochastic correlated signal ...