Statistical inference in acoustic spatial capture-recapture: integrating machine learning for species identification

  • Yuheng Wang

Student thesis: Doctoral Thesis (PhD)

Abstract

Digital acoustic sensors can effectively monitor wildlife populations that are acoustically active but difficult to see. The rapid development of sensor technologies and the large volumes of data they generate offer great potential for better inferences about wildlife populations, but they also pose substantial challenges for data processing and inference: manually processing such data to identify target species can be labour-intensive, time-consuming, and subjective. Machine learning (ML) provides an alternative way to detect target species automatically in acoustic recordings with low resource consumption. However, integrating ML detection with existing wildlife density estimation methods is not trivial. Population density, the number of individuals per unit area, is crucial for understanding animal population status and the mechanisms of population change, and is thus of great importance to ecology and conservation studies.
This thesis addresses some key challenges in integrating modern ML techniques into inference by acoustic spatial capture-recapture (ASCR). It develops three methods that facilitate the production of a rigorous and flexible analysis pipeline to estimate wildlife population density from raw acoustic recordings. The first improves ML methods for identifying complex vocalizations in acoustic recordings, for which we propose two models. One combines convolutional neural network (CNN) models with a hidden Markov model (HMM), and the other uses a convolutional recurrent neural network (CRNN). Both models learn local acoustic features via a CNN and temporal correlations between features via either an HMM or a recurrent network. The second is a method for ASCR analysis when it is unknown which acoustic detections by different detectors are of the same call (as is usually the case). We propose a Monte Carlo expectation-maximization (MCEM) estimation method to resolve this unknown call identity problem. The third method integrates the uncertainty about species identity into inference by treating it as a latent variable. Individual-level outputs from ML techniques are treated as random variables whose distributions depend on the latent identity. This gives rise to a mixture model likelihood that we maximize to estimate call density. Together, these provide some of the key elements required for an automated analysis pipeline from acoustic recording to density estimates, which would provide conservationists and ecologists with better tools to monitor and manage wildlife populations.
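As a rough illustration of the mixture structure described for the third method, the generic two-component form can be written as below. This is only a sketch under stated assumptions, not the likelihood used in the thesis: the symbols $s_i$ (ML output for detection $i$), $z_i$ (latent identity, 1 for the target species), $\pi$ (probability that a detection is of the target species), and the component densities $f_1$, $f_0$ with parameters $\theta_1$, $\theta_0$ are all hypothetical notation, and the ASCR detection and location terms that a full model would include are omitted.

```latex
% Generic two-component mixture likelihood (illustrative notation only).
% s_i     : ML output (e.g. a classifier score) for detection i
% z_i     : latent species identity, z_i = 1 for the target species
% \pi     : assumed probability that a detection is of the target species
% f_1,f_0 : assumed densities of s_i given z_i = 1 and z_i = 0
\begin{align}
  p(s_i \mid \pi, \theta)
    &= \Pr(z_i = 1)\, f_1(s_i; \theta_1) + \Pr(z_i = 0)\, f_0(s_i; \theta_0) \\
    &= \pi\, f_1(s_i; \theta_1) + (1 - \pi)\, f_0(s_i; \theta_0), \\
  L(\pi, \theta)
    &= \prod_{i=1}^{n} \bigl[ \pi\, f_1(s_i; \theta_1) + (1 - \pi)\, f_0(s_i; \theta_0) \bigr].
\end{align}
```

Summing over the two possible values of $z_i$ is what removes the unobserved identity from the likelihood, so the parameters can be estimated without a hard species label for each detection.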
Date of Award: 3 Dec 2024
Original language: English
Awarding Institution
  • University of St Andrews
Supervisor: David Louis Borchers (Supervisor) & Juan Ye (Supervisor)

Keywords

  • Machine learning
  • Statistical inference
  • Acoustic survey
  • Spatial capture-recapture
  • Automated pipeline
  • Density estimation

Access Status

  • Full text embargoed until
  • Restricted until 13 Nov 2026
