Author

Patrick E. Osborne

Other affiliations: University of Exeter, University of Stirling, Newbury College
Bio: Patrick E. Osborne is an academic researcher from the University of Southampton. The author has contributed to research in topics: Population & Bustard. The author has an h-index of 28 and has co-authored 81 publications receiving 7,733 citations. Previous affiliations of Patrick E. Osborne include University of Exeter & University of Stirling.


Papers
Journal Article
TL;DR: It was found that methods specifically designed for collinearity, such as latent variable methods and tree-based models, did not outperform the traditional GLM and threshold-based pre-selection. The results highlight the value of GLM in combination with penalised methods and threshold-based pre-selection when omitted variables are considered in the final interpretation.
Abstract: Collinearity refers to the non independence of predictor variables, usually in a regression-type analysis. It is a common feature of any descriptive ecological data set and can be a problem for parameter estimation because it inflates the variance of regression parameters and hence potentially leads to the wrong identification of relevant predictors in a statistical model. Collinearity is a severe problem when a model is trained on data from one region or time, and predicted to another with a different or unknown structure of collinearity. To demonstrate the reach of the problem of collinearity in ecology, we show how relationships among predictors differ between biomes, change over spatial scales and through time. Across disciplines, different approaches to addressing collinearity problems have been developed, ranging from clustering of predictors, threshold-based pre-selection, through latent variable methods, to shrinkage and regularisation. Using simulated data with five predictor-response relationships of increasing complexity and eight levels of collinearity we compared ways to address collinearity with standard multiple regression and machine-learning approaches. We assessed the performance of each approach by testing its impact on prediction to new data. In the extreme, we tested whether the methods were able to identify the true underlying relationship in a training dataset with strong collinearity by evaluating its performance on a test dataset without any collinearity. We found that methods specifically designed for collinearity, such as latent variable methods and tree based models, did not outperform the traditional GLM and threshold-based pre-selection. Our results highlight the value of GLM in combination with penalised methods (particularly ridge) and threshold-based pre-selection when omitted variables are considered in the final interpretation. 
However, all approaches tested yielded degraded predictions under change in collinearity structure, and the ‘folk lore’ threshold of |r| > 0.7 for correlation coefficients between predictor variables was an appropriate indicator of when collinearity begins to severely distort model estimation and subsequent prediction. The use of ecological understanding of the system in pre-analysis variable selection and the choice of the least sensitive statistical approaches reduce the problems of collinearity, but cannot ultimately solve them.
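
The threshold-based pre-selection mentioned in this abstract can be illustrated with a minimal sketch. It assumes predictors are stored as equal-length lists of numbers and applies the |r| > 0.7 rule of thumb quoted above. The function names `pearson_r` and `collinear_pairs` and the sample values are hypothetical and are not taken from the paper.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def collinear_pairs(predictors, threshold=0.7):
    """Return (name_a, name_b, r) for every predictor pair with |r| > threshold."""
    flagged = []
    for (name_a, col_a), (name_b, col_b) in combinations(predictors.items(), 2):
        r = pearson_r(col_a, col_b)
        if abs(r) > threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Hypothetical predictor values, invented purely for illustration.
predictors = {
    "temperature": [10.0, 12.0, 14.0, 16.0, 18.0, 20.0],
    "elevation": [900.0, 800.0, 650.0, 500.0, 420.0, 300.0],
    "rainfall": [50.0, 20.0, 45.0, 30.0, 55.0, 25.0],
}

flagged = collinear_pairs(predictors)
for name_a, name_b, r in flagged:
    print(f"{name_a} vs {name_b}: r = {r:.2f}")
```

A screen like this only flags candidate pairs. Which variable of a flagged pair to keep remains a judgment call that, as the abstract notes, should draw on ecological understanding of the system.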

6,199 citations

Journal Article
TL;DR: In this article, the authors presented predictive models for great bustards in central Spain based on readily available advanced very high resolution radiometer (AVHRR) satellite imagery combined with mapped features in the form of geographic information system (GIS) data layers.
Abstract: Summary 1. Many species are adversely affected by human activities at large spatial scales and their conservation requires detailed information on distributions. Intensive ground surveys cannot keep pace with the rate of land-use change over large areas and new methods are needed for regional-scale mapping. 2. We present predictive models for great bustards in central Spain based on readily available advanced very high resolution radiometer (AVHRR) satellite imagery combined with mapped features in the form of geographic information system (GIS) data layers. As AVHRR imagery is coarse-grained, we used a 12-month time series to improve the definition of habitat types. The GIS data comprised measures of proximity to features likely to cause disturbance and a digital terrain model to allow for preference for certain topographies. 3. We used logistic regression to model the above data, including an autologistic term to account for spatial autocorrelation. The results from models were combined using Bayesian integration, and model performance was assessed using receiver operating characteristics plots. 4. Sites occupied by bustards had significantly lower densities of roads, buildings, railways and rivers than randomly selected survey points. Bustards also occurred within a narrower range of elevations and at locations with significantly less variable terrain. 5. Logistic regression analysis showed that roads, buildings, rivers and terrain all contributed significantly to the difference between occupied and random sites. The Bayesian integrated probability model showed an excellent agreement with the original census data and predicted suitable areas not presently occupied. 6. The great bustard’s distribution is highly fragmented and vacant habitat patches may occur for a variety of reasons, including the species’ very strong fidelity to traditional sites through conspecific attraction. This may limit recolonization of previously occupied sites. 7. We conclude that AVHRR satellite imagery and GIS data sets have potential to map distributions at large spatial scales and could be applied to other species. While models based on imagery alone can provide accurate predictions of bustard habitats at some spatial scales, terrain and human influence are also significant predictors and are needed for finer scale modelling.

465 citations

Journal Article
TL;DR: In this article, the authors examined the effects of agricultural abandonment on birds during the breeding and non-breeding seasons in the Mediterranean and Eurosiberian regions of Spain using a successional gradient.

219 citations

Journal Article
TL;DR: In this paper, the uncertainty generated by using different climate predictor variable sets for modelling the impacts of climate change is assessed and the use of sound ecological theory and statistical methods to check predictor variables can reduce this uncertainty, but our knowledge of species may be too limited to make more than arbitrary choices.
Abstract: Aim: species distribution modelling is commonly used to guide future conservation policies in the light of potential climate change. However, arbitrary decisions during the model-building process can affect predictions and contribute to uncertainty about where suitable climate space will exist. For many species, the key climatic factors limiting distributions are unknown. This paper assesses the uncertainty generated by using different climate predictor variable sets for modelling the impacts of climate change. Location: Europe, 10° W to 50° E and 30° N to 60° N. Methods: using 1453 presence pixels at 30 arcsec resolution for the great bustard (Otis tarda), predictions of future distribution were made based on two emissions scenarios, three general climate models and 26 sets of predictor variables. Twenty-six current models were created, and 156 for both 2050 and 2080. Map comparison techniques were used to compare predictions in terms of the quantity and the location of presences (map comparison kappa, MCK) and using a range change index (RCI). Generalized linear models (GLMs) were used to partition explained deviance in MCK and RCI among sources of uncertainty. Results: the 26 different variable sets achieved high values of AUC (area under the receiver operating characteristic curve) and yet introduced substantial variation into maps of current distribution. Differences between maps were even greater when distributions were projected into the future. Some 64–78% of the variation between future maps was attributable to choice of predictor variable set alone. Choice of general climate model and emissions scenario contributed a maximum of 15% variation and their order of importance differed for MCK and RCI. Main conclusions: generalized variable sets produce an unmanageable level of uncertainty in species distribution models which cannot be ignored. 
The use of sound ecological theory and statistical methods to check predictor variables can reduce this uncertainty, but our knowledge of species may be too limited to make more than arbitrary choices. When all sources of modelling uncertainty are considered together, it is doubtful whether ensemble methods offer an adequate solution. Future studies should explicitly acknowledge uncertainty due to arbitrary choices in the model-building process and develop ways to convey the results to decision-makers.

206 citations

Journal Article
TL;DR: There was a significant degree of consistency among bird species in the ranking of crops, with oil-seed rape the most preferred and spring-sown cereal the least preferred.
Abstract: 1. Passerine birds were surveyed during the breeding season in hedgerows on 46 farms in lowland England. The incidence of each species was recorded in 50-m lengths of hedgerow and various attributes of these hedgerow sections were also recorded. 2. Logistic regression models were fitted to the data to describe the effects on the incidence of 18 bird species of the number of trees, hedge height and width, dominant plant species in the woody hedge, under the hedge and adjacent to the hedge in the uncultivated strip, the number of woody species in a standard length and other hedgerow characteristics. The effects of adjacent land use and cropping, reduced use of pesticides on cereal field edges and the geographical location of the study farms were also included in the models. 3. Most bird species preferred tall hedges with many trees, but there were some (dunnock, willow warbler and lesser whitethroat) which preferred tall hedges with few trees and others (whitethroat, linnet, yellowhammer) which preferred short hedges with few trees. 4. The differences among bird species in response to a sevenfold reduction in the height of hedges estimated from the models showed good agreement with the variation among species in the effects of severe hedge cutting on bird populations at one farm observed in an independent study. 5. The incidence of six bird species was positively influenced by the number of woody species in a standard length of hedgerow. 6. The incidence of two bird species was significantly affected by the identity of the dominant woody plant species in the hedge and one species by the identity of the dominant plant species at the base of the hedge. 7. Land use adjacent to the hedgerow, categorized as grass, tillage and roadside, had a significant influence on the incidence of five species. However, there was no evidence of consistency among species in the direction of effects. 8. The crops grown on tilled land adjacent to the hedgerow had a significant influence on the incidence of the blackbird. There was a significant degree of consistency among bird species in the ranking of crops, with oil-seed rape the most preferred and spring-sown cereal the least preferred. 9. The incidence of greenfinch, robin and song thrush was significantly lower in hedgerows adjacent to autumn-sown cereals which had received reduced levels of spraying of pesticides than in those adjacent to autumn-sown cereals which were fully sprayed. Most of the other species showed non-significant differences in the same direction. Most of the species studied also showed a non-significant tendency towards higher incidence in hedgerows adjacent to spring-sown cereals with reduced spraying than in those adjacent to fully sprayed spring-sown cereals.

204 citations


Cited by
Journal Article
Abstract: Preface to the Princeton Landmarks in Biology Edition; Preface; Symbols Used; 1. The Importance of Islands; 2. Area and Number of Species; 3. Further Explanations of the Area-Diversity Pattern; 4. The Strategy of Colonization; 5. Invasibility and the Variable Niche; 6. Stepping Stones and Biotic Exchange; 7. Evolutionary Changes Following Colonization; 8. Prospect; Glossary; References; Index

14,171 citations

Journal Article
25 Apr 2013-Nature
TL;DR: These new risk maps and infection estimates provide novel insights into the global, regional and national public health burden imposed by dengue and will help to guide improvements in disease control strategies using vaccine, drug and vector control methods, and in their economic evaluation.
Abstract: Dengue is a systemic viral infection transmitted between humans by Aedes mosquitoes. For some patients, dengue is a life-threatening illness. There are currently no licensed vaccines or specific therapeutics, and substantial vector control efforts have not stopped its rapid emergence and global spread. The contemporary worldwide distribution of the risk of dengue virus infection and its public health burden are poorly known. Here we undertake an exhaustive assembly of known records of dengue occurrence worldwide, and use a formal modelling framework to map the global distribution of dengue risk. We then pair the resulting risk map with detailed longitudinal information from dengue cohort studies and population surfaces to infer the public health burden of dengue in 2010. We predict dengue to be ubiquitous throughout the tropics, with local spatial variations in risk influenced strongly by rainfall, temperature and the degree of urbanization. Using cartographic approaches, we estimate there to be 390 million (95% credible interval 284-528) dengue infections per year, of which 96 million (67-136) manifest apparently (any level of disease severity). This infection total is more than three times the dengue burden estimate of the World Health Organization. Stratification of our estimates by country allows comparison with national dengue reporting, after taking into account the probability of an apparent infection being formally reported. The most notable differences are discussed. These new risk maps and infection estimates provide novel insights into the global, regional and national public health burden imposed by dengue. We anticipate that they will provide a starting point for a wider discussion about the global impact of this disease and will help to guide improvements in disease control strategies using vaccine, drug and vector control methods, and in their economic evaluation.

7,238 citations

Journal Article
TL;DR: A review of predictive habitat distribution modeling is presented, which shows that a wide array of models has been developed to cover aspects as diverse as biogeography, conservation biology, climate change research, and habitat or species management.

6,748 citations

Journal Article
TL;DR: Thirteen recommendations are made to enable the objective selection of an error assessment technique for ecological presence/absence models and a new approach to estimating prediction error, which is based on the spatial characteristics of the errors, is proposed.
Abstract: Predicting the distribution of endangered species from habitat data is frequently perceived to be a useful technique. Models that predict the presence or absence of a species are normally judged by the number of prediction errors. These may be of two types: false positives and false negatives. Many of the prediction errors can be traced to ecological processes such as unsaturated habitat and species interactions. Consequently, if prediction errors are not placed in an ecological context the results of the model may be misleading. The simplest, and most widely used, measure of prediction accuracy is the number of correctly classified cases. There are other measures of prediction success that may be more appropriate. Strategies for assessing the causes and costs of these errors are discussed. A range of techniques for measuring error in presence/absence models, including some that are seldom used by ecologists (e.g. ROC plots and cost matrices), are described. A new approach to estimating prediction error, which is based on the spatial characteristics of the errors, is proposed. Thirteen recommendations are made to enable the objective selection of an error assessment technique for ecological presence/absence models.
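
The two error types named in this abstract, false positives and false negatives, and the "number of correctly classified cases" can be made concrete with a minimal sketch. It assumes presence is coded as 1 and absence as 0. The function names `confusion_counts` and `accuracy_measures` and the sample data are hypothetical, and the measures shown are common ones rather than the paper's full set of recommendations.

```python
def confusion_counts(observed, predicted):
    """Count true/false positives and negatives for 0/1 presence-absence data."""
    tp = fp = fn = tn = 0
    for obs, pred in zip(observed, predicted):
        if obs == 1 and pred == 1:
            tp += 1  # species present and predicted present
        elif obs == 0 and pred == 1:
            fp += 1  # false positive: predicted present, observed absent
        elif obs == 1 and pred == 0:
            fn += 1  # false negative: predicted absent, observed present
        else:
            tn += 1  # species absent and predicted absent
    return tp, fp, fn, tn


def accuracy_measures(tp, fp, fn, tn):
    """Derive simple accuracy measures from the four confusion-matrix cells."""
    total = tp + fp + fn + tn
    return {
        "correct_classification_rate": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # share of presences correctly predicted
        "specificity": tn / (tn + fp),  # share of absences correctly predicted
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
    }


# Hypothetical survey data, invented purely for illustration.
observed = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

counts = confusion_counts(observed, predicted)
measures = accuracy_measures(*counts)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

Reporting sensitivity and specificity alongside the overall rate shows which kind of error dominates, which a single count of correctly classified cases hides.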

6,044 citations