TY - JOUR
T1 - Applying machine learning to predict reproductive condition in fish
AU - Flores, Andrés
AU - Wiff, Rodrigo
AU - Donovan, Carl R.
AU - Gálvez, Patricio
N1 - Publisher Copyright: © 2024 The Authors
PY - 2024/5
Y1 - 2024/5
N2 - Knowledge of reproductive traits in exploited marine populations is crucial for their management and conservation. The maturity status in fish is usually assigned by traditional methods such as macroscopy and histology. Macroscopic analysis, the assessment of maturity stages by the naked eye, usually introduces a large amount of error. In contrast, histology is the most accurate method for maturity staging but is expensive and unavailable for many stocks worldwide. Here, we use the Random Forest (RF) machine learning method for classification of reproductive condition in fish, using extensive data from Chilean hake (Merluccius gayi gayi). Gonads randomly collected from commercial industrial and acoustic surveys were classified as immature, mature-active and mature-inactive. A classifier for these three maturity classes was fitted using RFs, with the continuous covariates total length (TL), gonadosomatic index (GSI), condition factor (Krel), latitude, longitude, and depth, along with month as a factor variable. The RF model showed high accuracy (>82%) and a high proportion of agreement (>71%) compared to histology, with an out-of-bag (OOB) error rate lower than 15%. GSI and TL were the most important variables for predicting the reproductive condition in Chilean hake, and, to a lesser extent, depth when using survey data. The RF is a promising tool for assigning maturity stages in fishes when covariates are available, and for improving the accuracy of maturity classification when only macroscopic staging is available.
AB - Knowledge of reproductive traits in exploited marine populations is crucial for their management and conservation. The maturity status in fish is usually assigned by traditional methods such as macroscopy and histology. Macroscopic analysis, the assessment of maturity stages by the naked eye, usually introduces a large amount of error. In contrast, histology is the most accurate method for maturity staging but is expensive and unavailable for many stocks worldwide. Here, we use the Random Forest (RF) machine learning method for classification of reproductive condition in fish, using extensive data from Chilean hake (Merluccius gayi gayi). Gonads randomly collected from commercial industrial and acoustic surveys were classified as immature, mature-active and mature-inactive. A classifier for these three maturity classes was fitted using RFs, with the continuous covariates total length (TL), gonadosomatic index (GSI), condition factor (Krel), latitude, longitude, and depth, along with month as a factor variable. The RF model showed high accuracy (>82%) and a high proportion of agreement (>71%) compared to histology, with an out-of-bag (OOB) error rate lower than 15%. GSI and TL were the most important variables for predicting the reproductive condition in Chilean hake, and, to a lesser extent, depth when using survey data. The RF is a promising tool for assigning maturity stages in fishes when covariates are available, and for improving the accuracy of maturity classification when only macroscopic staging is available.
KW - Gonadosomatic index
KW - Histology
KW - Maturity
KW - Merluccius gayi gayi
KW - Random forest
UR - http://www.scopus.com/inward/record.url?scp=85182883207&partnerID=8YFLogxK
U2 - 10.1016/j.ecoinf.2024.102481
DO - 10.1016/j.ecoinf.2024.102481
M3 - Article
AN - SCOPUS:85182883207
SN - 1574-9541
VL - 80
JO - Ecological Informatics
JF - Ecological Informatics
M1 - 102481
ER -