Accounting for preferential sampling in species distribution models

Maria Grazia Pennino*, Iosu Paradinas, Janine B. Illian, Facundo Muñoz, José María Bellido, Antonio López-Quílez, David Conesa

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

61 Citations (Scopus)

Abstract

Species distribution models (SDMs) are now widely used in ecology for management and conservation purposes across terrestrial, freshwater, and marine realms. The increasing interest in SDMs has drawn the attention of ecologists to spatial models and, in particular, to geostatistical models, which are used to associate observations of species occurrence or abundance with environmental covariates in a finite number of locations in order to predict where (and how much of) a species is likely to be present in unsampled locations. Standard geostatistical methodology assumes that the choice of sampling locations is independent of the values of the variable of interest. However, in natural environments, due to practical limitations related to time and financial constraints, this theoretical assumption is often violated. In fact, data commonly derive from opportunistic sampling (e.g., whale or bird watching), in which observers tend to look for a specific species in areas where they expect to find it. These are examples of what is referred to as preferential sampling, which can lead to biased predictions of the distribution of the species. The aim of this study is to discuss an SDM that addresses this problem and that is more computationally efficient than existing MCMC methods. From a statistical point of view, we interpret the data as a marked point pattern, where the sampling locations form a point pattern and the measurements taken in those locations (i.e., species abundance or occurrence) are the associated marks. Inference and prediction of species distribution are performed using a Bayesian approach, and integrated nested Laplace approximation (INLA) methodology and software are used for model fitting to minimize the computational burden. We show that abundance is highly overestimated at low abundance locations when preferential sampling effects are not accounted for, in both a simulated example and a practical application using fishery data.
This highlights that ecologists should be aware of the potential bias resulting from preferential sampling and account for it in a model when a survey is based on non‐randomized and/or non‐systematic sampling.
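
The bias described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' marked point pattern model and does not use INLA. It assumes a hypothetical smooth log-abundance surface on a grid and a hypothetical preference parameter beta, draws sampling locations with probability proportional to exp(beta * S), and compares the naive mean abundance of the sample with the true mean over the whole grid. All function names, the surface, and the parameter values are assumptions chosen for illustration.

```python
import math
import random


def make_surface(n=40):
    """Smooth log-abundance surface S on an n x n grid (hypothetical, for illustration)."""
    return [
        [math.sin(3.0 * i / n) + math.cos(2.0 * j / n) for j in range(n)]
        for i in range(n)
    ]


def sample_cells(surface, k, beta, rng):
    """Draw k cells (with replacement) with probability proportional to exp(beta * S).

    beta = 0 gives uniform random sampling; beta > 0 favours cells where the
    log-abundance S is high, which mimics preferential sampling.
    """
    cells = [(i, j) for i in range(len(surface)) for j in range(len(surface[0]))]
    weights = [math.exp(beta * surface[i][j]) for i, j in cells]
    return rng.choices(cells, weights=weights, k=k)


def mean_abundance(surface, cells):
    """Naive mean of abundance exp(S) over the given cells."""
    return sum(math.exp(surface[i][j]) for i, j in cells) / len(cells)


rng = random.Random(1)  # fixed seed so the run is reproducible
surface = make_surface()
all_cells = [(i, j) for i in range(len(surface)) for j in range(len(surface[0]))]

true_mean = mean_abundance(surface, all_cells)
random_mean = mean_abundance(surface, sample_cells(surface, 2000, 0.0, rng))
pref_mean = mean_abundance(surface, sample_cells(surface, 2000, 2.0, rng))

print("true mean abundance:          ", round(true_mean, 3))
print("uniform-sample estimate:      ", round(random_mean, 3))
print("preferential-sample estimate: ", round(pref_mean, 3))
```

Under these assumptions the uniform sample should track the true mean closely, while the preferential sample overestimates it because high-abundance cells are over-represented. A model that accounts for preferential sampling treats the sampling locations themselves as informative rather than fixed.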
Original language: English
Pages (from-to): 653-663
Number of pages: 11
Journal: Ecology and Evolution
Volume: 9
Issue number: 1
Early online date: 26 Dec 2018
Publication status: Published - 1 Jan 2019

Keywords

  • Bayesian modelling
  • Integrated nested Laplace approximation
  • Point processes
  • Species Distribution Models (SDMs)
  • Stochastic partial differential equation
