Book Chapter

Penalty Specialists Among Goalkeepers: A Nonparametric Bayesian Analysis of 44 Years of German Bundesliga

TL;DR: In this paper, the authors used Bayesian hierarchical random effects models to shrink the individual goalkeepers' estimates towards an overall estimate, with the degree of shrinkage depending on the amount of information available for each goalkeeper.
Abstract: Penalty saving abilities are of major importance for a goalkeeper in modern football. However, statistical investigations of the performance of individual goalkeepers in penalties, leading to a ranking or a clustering of the keepers, are rare in the scientific literature. In this paper we will perform such an analysis based on all penalties in the German Bundesliga from 1963 to 2007. A challenge when analyzing such a data set is the fact that the counts of penalties for the different goalkeepers are highly imbalanced, raising the question of how to compare goalkeepers who were involved in a disparate number of penalties. We will approach this issue by using Bayesian hierarchical random effects models. These models shrink the individual goalkeepers' estimates towards an overall estimate, with the degree of shrinkage depending on the amount of information that is available for each goalkeeper. The underlying random effects distribution will be modelled nonparametrically based on the Dirichlet process. Proceeding this way relaxes the assumptions underlying parametric random effect models and additionally makes it possible to find clusters among the goalkeepers.
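
The shrinkage idea in the abstract above can be illustrated with a minimal sketch. It assumes a simple parametric beta-binomial model, which is a stand-in chosen for brevity and not the Dirichlet-process model of the paper. The function name, the keepers' counts, and the prior parameters (`prior_mean`, `prior_strength`) are all hypothetical.

```python
def shrunken_save_rate(saves, penalties, prior_mean=0.2, prior_strength=20.0):
    """Posterior mean save probability under a Beta prior (illustrative only).

    prior_mean and prior_strength are hypothetical values, not estimates
    from the Bundesliga data.
    """
    a = prior_mean * prior_strength          # prior pseudo-count of saves
    b = (1.0 - prior_mean) * prior_strength  # prior pseudo-count of goals
    return (a + saves) / (a + b + penalties)


# Two hypothetical keepers with the same raw save rate (40%)
# but very different numbers of penalties faced.
few = shrunken_save_rate(saves=2, penalties=5)
many = shrunken_save_rate(saves=20, penalties=50)
print(round(few, 3), round(many, 3))  # prints: 0.24 0.343
```

The keeper with only five penalties is pulled strongly towards the assumed overall rate of 0.2, while the keeper with fifty penalties stays closer to the raw rate of 0.4.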


Citations
Journal Article
TL;DR: In this paper, the authors note that much of the existing goalkeeper (GK) research is based around GK performance but lacks a match analysis theme; instead, it has focused on physiology, psychology and injury prevention.
Abstract: Much of the existing goalkeeper (GK) research is based around GK’s performance but not with a match analysis theme. Research has focused on physiology, psychology and injury prevention. Performance...

20 citations

Journal Article
TL;DR: In this article, the authors investigate the impact of players' domestic/foreign status on performance-based pay offered to professional footballers in the Italian league Serie A, and find that the relationship between previous and current performance was partially mediated by the current salary.
Abstract: This study contributes to research on migrant pay disparities by analysing the impact of players' domestic/foreign status on performance-based pay offered to professional footballers, to understand if foreign players benefit from a preferential labour market. We used information from publicly available data of 275 footballers who played for two consecutive seasons in the Italian league Serie A. We found that the relationship between previous and current performance was partially mediated by the current salary. This result reinforced earlier findings on the pay-performance relationship, where seasonal performance is particularly relevant. Moreover, our results show that pay discrimination does not indicate a straightforward (dis)advantage for one group, but presents a more complex picture. We have examined possible underlying reasons for these disparities and offered suggestions for further research. We conclude by discussing how clubs and managers could consider incentives to strengthen pay-performance relationships by being sensitive to the complex influence of players' origins.

14 citations


Cites background or result from "Penalty Specialists Among Goalkeepe..."

  • ...This view is consistent with recent literature in football that focuses on the role of positional skills in determining the performance of both players and teams (e.g. Bornkamp et al., 2009; Seaton and Campos, 2011)....

    [...]

  • ...Nonetheless, using game statistics has the advantage of ensuring the data’s objectivity, but it might not entirely reflect each player’s performance in a particular game (Bornkamp et al., 2009)....

    [...]

  • ...Empirical evidence suggests that different roles require different skills to be successful (e.g. Bornkamp et al., 2009; Seaton and Campos, 2011)....

    [...]

01 Jan 2002
TL;DR: In this paper, the authors used a paradigm simulating a penalty kick in the laboratory to investigate the dynamics of the closed-loop strategy in these controlled conditions, and the probability of correctly responding to the simulated goalkeeper motion as a function of time available followed a logistic curve.
Abstract: Sport scientists have devoted relatively little attention to soccer penalty kicks, despite their decisive role in important competitions such as the World Cup. Two possible kicker strategies have been described: ignoring the goalkeeper action (open loop) or trying to react to the goalkeeper action (closed loop). We used a paradigm simulating a penalty kick in the laboratory to investigate the dynamics of the closed-loop strategy in these controlled conditions. The probability of correctly responding to the simulated goalkeeper motion as a function of time available followed a logistic curve. Kickers on average reached perfect performance only if the goalkeeper committed him or herself to one side about 400 ms before ball contact and showed chance performance if the goalkeeper motion occurred less than 150 ms before ball contact. Interestingly, coincidence judgement - another aspect of the laboratory responses - appeared to be affected for a much longer time (>500 ms) than was needed to correctly determine...

6 citations

Journal Article
TL;DR: This work investigates the potential occurrence of a 'hot shoe' effect for the performance of penalty takers in football based on data from the German Bundesliga, and considers hidden Markov models (HMMs) to model the (latent) forms of players.
Abstract: We propose a penalized likelihood approach in hidden Markov models (HMMs) to perform automated variable selection. To account for a potential large number of covariates, which also may be substanti...

3 citations


Cites methods from "Penalty Specialists Among Goalkeepe..."

  • ...Parts of the data have already been used in Bornkamp et al. (2009)....

    [...]

  • ...ata The considered data set comprises all taken penalty kicks in the German Bundesliga from its first season 1963/1964 until the end of the season 2016/2017. Parts of the data have already been used in Bornkamp et al. (2009). In the analysis, we include all players who took at least 5 penalty kicks during the time period considered, resulting in n = 3,482 penalty kicks taken by 310 different players. For these penalty kic...

    [...]

References
Journal Article
TL;DR: In this paper, the authors consider the problem of comparing complex hierarchical models in which the number of parameters is not clearly defined. They derive a measure pD for the effective number of parameters in a model as the difference between the posterior mean of the deviance and the deviance at the posterior means of the parameters of interest. Adding pD to the posterior mean deviance gives a deviance information criterion, which is related to other information criteria and has an approximate decision theoretic justification.
Abstract: Summary. We consider the problem of comparing complex hierarchical models in which the number of parameters is not clearly defined. Using an information theoretic argument we derive a measure pD for the effective number of parameters in a model as the difference between the posterior mean of the deviance and the deviance at the posterior means of the parameters of interest. In general pD approximately corresponds to the trace of the product of Fisher's information and the posterior covariance, which in normal models is the trace of the ‘hat’ matrix projecting observations onto fitted values. Its properties in exponential families are explored. The posterior mean deviance is suggested as a Bayesian measure of fit or adequacy, and the contributions of individual observations to the fit and complexity can give rise to a diagnostic plot of deviance residuals against leverages. Adding pD to the posterior mean deviance gives a deviance information criterion for comparing models, which is related to other information criteria and has an approximate decision theoretic justification. The procedure is illustrated in some examples, and comparisons are drawn with alternative Bayesian and classical proposals. Throughout it is emphasized that the quantities required are trivial to compute in a Markov chain Monte Carlo analysis.
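
The definitions of pD and the deviance information criterion given in this abstract can be sketched numerically. The example below assumes a toy normal model with known standard deviation and a flat prior on the mean, so that the posterior can be sampled directly instead of by Markov chain Monte Carlo; the data, the function names, and the seed are hypothetical.

```python
import math
import random


def deviance(mu, data, sigma=1.0):
    """Deviance, i.e. -2 * log-likelihood, for a normal model with known sigma."""
    n = len(data)
    sse = sum((y - mu) ** 2 for y in data)
    return n * math.log(2 * math.pi * sigma ** 2) + sse / sigma ** 2


def dic(posterior_draws, data, sigma=1.0):
    """Return (pD, DIC) computed from posterior draws of the mean."""
    mean_deviance = sum(deviance(m, data, sigma) for m in posterior_draws) / len(posterior_draws)
    posterior_mean = sum(posterior_draws) / len(posterior_draws)
    # pD = posterior mean deviance minus deviance at the posterior mean.
    p_d = mean_deviance - deviance(posterior_mean, data, sigma)
    # DIC = posterior mean deviance plus pD.
    return p_d, mean_deviance + p_d


rng = random.Random(0)
data = [rng.gauss(1.0, 1.0) for _ in range(50)]  # hypothetical observations
# Under a flat prior, the posterior of the mean is N(ybar, sigma^2 / n).
ybar = sum(data) / len(data)
draws = [rng.gauss(ybar, 1.0 / math.sqrt(len(data))) for _ in range(20000)]
p_d, dic_value = dic(draws, data)
print(p_d, dic_value)
```

In this toy model there is a single free parameter, so pD should come out close to 1, up to Monte Carlo error.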

11,691 citations

Journal Article
TL;DR: In this article, a class of prior distributions, called Dirichlet process priors, is proposed for nonparametric problems, for which treatment of many non-parametric statistical problems may be carried out, yielding results that are comparable to the classical theory.
Abstract: The Bayesian approach to statistical problems, though fruitful in many ways, has been rather unsuccessful in treating nonparametric problems. This is due primarily to the difficulty in finding workable prior distributions on the parameter space, which in nonparametric problems is taken to be a set of probability distributions on a given sample space. There are two desirable properties of a prior distribution for nonparametric problems. (I) The support of the prior distribution should be large, with respect to some suitable topology on the space of probability distributions on the sample space. (II) Posterior distributions given a sample of observations from the true probability distribution should be manageable analytically. These properties are antagonistic in the sense that one may be obtained at the expense of the other. This paper presents a class of prior distributions, called Dirichlet process priors, broad in the sense of (I), for which (II) is realized, and for which treatment of many nonparametric statistical problems may be carried out, yielding results that are comparable to the classical theory. In Section 2, we review the properties of the Dirichlet distribution needed for the description of the Dirichlet process given in Section 3. Briefly, this process may be described as follows. Let $\mathscr{X}$ be a space and $\mathscr{A}$ a $\sigma$-field of subsets, and let $\alpha$ be a finite non-null measure on $(\mathscr{X}, \mathscr{A})$. Then a stochastic process $P$ indexed by elements $A$ of $\mathscr{A}$, is said to be a Dirichlet process on $(\mathscr{X}, \mathscr{A})$ with parameter $\alpha$ if for any measurable partition $(A_1, \cdots, A_k)$ of $\mathscr{X}$, the random vector $(P(A_1), \cdots, P(A_k))$ has a Dirichlet distribution with parameter $(\alpha(A_1), \cdots, \alpha(A_k))$.
$P$ may be considered a random probability measure on $(\mathscr{X}, \mathscr{A})$. The main theorem states that if $P$ is a Dirichlet process on $(\mathscr{X}, \mathscr{A})$ with parameter $\alpha$, and if $X_1, \cdots, X_n$ is a sample from $P$, then the posterior distribution of $P$ given $X_1, \cdots, X_n$ is also a Dirichlet process on $(\mathscr{X}, \mathscr{A})$ with a parameter $\alpha + \sum^n_1 \delta_{x_i}$, where $\delta_x$ denotes the measure giving mass one to the point $x$. In Section 4, an alternative definition of the Dirichlet process is given. This definition exhibits a version of the Dirichlet process that gives probability one to the set of discrete probability measures on $(\mathscr{X}, \mathscr{A})$. This is in contrast to Dubins and Freedman [2], whose methods for choosing a distribution function on the interval [0, 1] lead with probability one to singular continuous distributions. Methods of choosing a distribution function on [0, 1] that with probability one is absolutely continuous have been described by Kraft [7]. The general method of choosing a distribution function on [0, 1], described in Section 2 of Kraft and van Eeden [10], can of course be used to define the Dirichlet process on [0, 1]. Special mention must be made of the papers of Freedman and Fabius. Freedman [5] defines a notion of tailfree for a distribution on the set of all probability measures on a countable space $\mathscr{X}$. For a tailfree prior, posterior distribution given a sample from the true probability measure may be fairly easily computed. Fabius [3] extends the notion of tailfree to the case where $\mathscr{X}$ is the unit interval [0, 1], but it is clear his extension may be made to cover quite general $\mathscr{X}$. With such an extension, the Dirichlet process would be a special case of a tailfree distribution for which the posterior distribution has a particularly simple form.
There are disadvantages to the fact that $P$ chosen by a Dirichlet process is discrete with probability one. These appear mainly because in sampling from a $P$ chosen by a Dirichlet process, we expect eventually to see one observation exactly equal to another. For example, consider the goodness-of-fit problem of testing the hypothesis $H_0$ that a distribution on the interval [0, 1] is uniform. If on the alternative hypothesis we place a Dirichlet process prior with parameter $\alpha$ itself a uniform measure on [0, 1], and if we are given a sample of size $n \geqq 2$, the only nontrivial nonrandomized Bayes rule is to reject $H_0$ if and only if two or more of the observations are exactly equal. This is really a test of the hypothesis that a distribution is continuous against the hypothesis that it is discrete. Thus, there is still a need for a prior that chooses a continuous distribution with probability one and yet satisfies properties (I) and (II). Some applications in which the possible doubling up of the values of the observations plays no essential role are presented in Section 5. These include the estimation of a distribution function, of a mean, of quantiles, of a variance and of a covariance. A two-sample problem is considered in which the Mann-Whitney statistic, equivalent to the rank-sum statistic, appears naturally. A decision theoretic upper tolerance limit for a quantile is also treated. Finally, a hypothesis testing problem concerning a quantile is shown to yield the sign test. In each of these problems, useful ways of combining prior information with the statistical observations appear. Other applications exist. In his Ph. D. dissertation [1], Charles Antoniak finds a need to consider mixtures of Dirichlet processes. He treats several problems, including the estimation of a mixing distribution, bio-assay, empirical Bayes problems, and discrimination problems.
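
The conjugacy result stated in this abstract (the posterior parameter is $\alpha + \sum \delta_{x_i}$) implies a simple formula for the posterior mean of $P(A)$: the prior mass $\alpha(A)$ plus the number of observations in $A$, divided by $\alpha(\mathscr{X}) + n$. The sketch below assumes, purely for illustration, that $\alpha$ is a multiple of the uniform measure on [0, 1]; the function name, `total_mass`, and the observations are hypothetical.

```python
def dp_posterior_mean_prob(interval, observations, total_mass=2.0):
    """Posterior mean of P(interval) for P ~ DP(alpha), alpha = total_mass * Uniform[0, 1].

    The interval is treated as half-open: [lo, hi).
    """
    lo, hi = interval
    prior_mass = total_mass * (hi - lo)              # alpha(A)
    count = sum(lo <= x < hi for x in observations)  # sum of delta_{x_i}(A)
    return (prior_mass + count) / (total_mass + len(observations))


# Hypothetical sample: three of four observations fall in [0, 0.5).
print(dp_posterior_mean_prob((0.0, 0.5), [0.1, 0.15, 0.2, 0.8]))
```

With no observations the function returns the prior mean (0.5 for this interval); as the sample grows, the empirical proportion dominates the prior mass.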

5,033 citations


"Penalty Specialists Among Goalkeepe..." refers background or methods in this paper

  • ...A flexible and convenient solution is to use the Dirichlet process, dating back to [5]....

    [...]

  • ...which allows for an efficient exact implementation in many cases (see [5] for details)....

    [...]

Report
01 May 1991
TL;DR: In this article, a class of priors known as Dirichlet measures is considered; these have been used for the distribution of a random variable X when it takes values in R sub K. Unless X takes values in a finite space, the unknown distribution varies in an infinite dimensional space, such as the space of all probability measures on a large space.
Abstract: The parameter in a Bayesian nonparametric problem is the unknown distribution P of the observation X. A Bayesian uses a prior distribution for P, and after observing X, solves the statistical inference problem by using the posterior distribution of P, which is the conditional distribution of P given X. For Bayesian nonparametrics to be successful one needs a large class of priors for which posterior distributions can be easily calculated. Unless X takes values in a finite space, the unknown distribution P varies in an infinite dimensional space. Thus one has to talk about measures in a complicated space like the space of all probability measures on a large space. This has always required more careful attention to the attendant measure theoretic problems. A class of priors known as Dirichlet measures have been used for the distribution of a random variable X when it takes values in R sub K.

2,162 citations


"Penalty Specialists Among Goalkeepe..." refers background in this paper

  • ...Another reason for the popularity of Dirichlet process priors is the constructive stick-breaking representation of the Dirichlet process given by [17]....

    [...]
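
The stick-breaking representation mentioned in the excerpt above can be sketched as follows. This is a truncated version, assuming the standard construction in which each break is drawn from a Beta(1, concentration) distribution; the truncation level, the uniform base measure for the atoms, and all names are illustrative assumptions.

```python
import random


def stick_breaking_weights(concentration, num_sticks, rng):
    """Truncated stick-breaking weights for a Dirichlet process (illustrative)."""
    weights = []
    remaining = 1.0  # length of the stick not yet broken off
    for _ in range(num_sticks):
        v = rng.betavariate(1.0, concentration)  # V_k ~ Beta(1, concentration)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    return weights, remaining


rng = random.Random(42)
weights, leftover = stick_breaking_weights(concentration=1.0, num_sticks=50, rng=rng)
# Atoms drawn i.i.d. from an assumed Uniform[0, 1] base measure.
atoms = [rng.random() for _ in weights]
print(sum(weights) + leftover)
```

The weights and the leftover stick always sum to one, and the pairs of atoms and weights define a discrete random probability measure, which matches the discreteness property noted in the Dirichlet process literature cited on this page.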

Journal Article
TL;DR: In this article, the conditional distribution of the random measure, given the observations, is no longer that of a simple Dirichlet process, but can be described as being a mixture of Dirichlet processes.
Abstract: process. This paper extends Ferguson's result to cases where the random measure is a mixing distribution for a parameter which determines the distribution from which observations are made. The conditional distribution of the random measure, given the observations, is no longer that of a simple Dirichlet process, but can be described as being a mixture of Dirichlet processes. This paper gives a formal definition for these mixtures and develops several theorems about their properties, the most important of which is a closure property for such mixtures. Formulas for computing the conditional distribution are derived and applications to problems in bio-assay, discrimination, regression, and mixing distributions are given.

2,146 citations


"Penalty Specialists Among Goalkeepe..." refers background in this paper

  • ...For a random sample of size n from a probability distribution realized by a Dirichlet process [1] has shown that the prior density of the number of distinct values (clusters/components) k in n realizations is...

    [...]
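
The number of distinct values among n draws from a Dirichlet process realization, referred to in the excerpt above, can be explored by simulation. The sketch below does not reproduce the elided formula; it assumes the Polya urn scheme, in which observation i + 1 starts a new cluster with probability concentration / (concentration + i), and compares the simulated average with the resulting expected value. All names and parameter values are hypothetical.

```python
import random


def simulate_num_clusters(n, concentration, rng):
    """Draw the number of distinct values among n samples via the Polya urn scheme."""
    clusters = 0
    for i in range(n):
        # Observation i + 1 opens a new cluster with probability alpha / (alpha + i).
        if rng.random() < concentration / (concentration + i):
            clusters += 1
    return clusters


def expected_num_clusters(n, concentration):
    """Expected number of distinct values implied by the urn probabilities."""
    return sum(concentration / (concentration + i) for i in range(n))


rng = random.Random(1)
sims = [simulate_num_clusters(100, 1.0, rng) for _ in range(5000)]
average = sum(sims) / len(sims)
print(average, expected_num_clusters(100, 1.0))
```

The expected number of clusters grows only slowly with n for a fixed concentration parameter, which is why the choice of that parameter matters when the model is used to find clusters.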

Journal Article
TL;DR: Two general types of Gibbs samplers that can be used to fit posteriors of Bayesian hierarchical models based on stick-breaking priors are presented: the Polya urn Gibbs sampler, and the blocked Gibbs sampler, which is based on an entirely different approach that works by directly sampling values from the posterior of the random measure.
Abstract: A rich and flexible class of random probability measures, which we call stick-breaking priors, can be constructed using a sequence of independent beta random variables. Examples of random measures that have this characterization include the Dirichlet process, its two-parameter extension, the two-parameter Poisson–Dirichlet process, finite dimensional Dirichlet priors, and beta two-parameter processes. The rich nature of stick-breaking priors offers Bayesians a useful class of priors for nonparametric problems, while the similar construction used in each prior can be exploited to develop a general computational procedure for fitting them. In this article we present two general types of Gibbs samplers that can be used to fit posteriors of Bayesian hierarchical models based on stick-breaking priors. The first type of Gibbs sampler, referred to as a Polya urn Gibbs sampler, is a generalized version of a widely used Gibbs sampling method currently employed for Dirichlet process computing. This method applies t...

1,701 citations