TY - JOUR
T1 - Severity Index for Suspected Arbovirus (SISA)
T2 - machine learning for accurate prediction of hospitalization in subjects suspected of arboviral infection
AU - Sippy, Rachel
AU - Farrell, Daniel F
AU - Lichtenstein, Daniel A
AU - Nightingale, Ryan
AU - Harris, Megan A
AU - Toth, Joseph
AU - Hantztidiamantis, Paris
AU - Usher, Nicholas
AU - Cueva Aponte, Cinthya
AU - Barzallo Aguilar, Julio
AU - Puthumana, Anthony
AU - Lupone, Christina D
AU - Endy, Timothy
AU - Ryan, Sadie J
AU - Stewart Ibarra, Anna M
N1 - Funding: This study was supported, in part, by the Department of Defense Global Emerging Infection Surveillance (https://health.mil/Military-Health-Topics/Combat-Support/Armed-Forces-Health-Surveillance-Branch/Global-Emerging-Infections-Surveillance-and-Response) grant (P0220_13_OT) and the Department of Medicine of SUNY Upstate Medical University (http://www.upstate.edu/medicine/). D.F., M.H. and P.H. were supported by the Ben Kean Fellowship from the American Society of Tropical Medicine and Hygiene (https://www.astmh.org/awards-fellowships-medals/benjamin-h-keen-travel-fellowship-in-tropical-medi). S.J.R. and A.M.S-I. were supported by NSF DEB EEID 1518681 and NSF DEB RAPID 1641145 (https://www.nsf.gov/). A.M.S-I. was additionally supported by the Prometeo program of the National Secretary of Higher Education, Science, Technology, and Innovation of Ecuador (http://prometeo.educacionsuperior.gob.ec/).
PY - 2020/2/14
Y1 - 2020/2/14
N2 - Background: Dengue, chikungunya, and Zika are arboviruses of major global health concern. Decisions regarding the clinical management of suspected arboviral infection are challenging in resource-limited settings, particularly when deciding on patient hospitalization. The objective of this study was to determine if hospitalization of individuals with suspected arboviral infections could be predicted using subject intake data. Methodology/Principal findings: Two prediction models were developed using data from a surveillance study in Machala, a city in southern coastal Ecuador with a high burden of arboviral infections. Data were obtained from subjects who presented at sentinel medical centers with suspected arboviral infection (November 2013 to September 2017). The first prediction model, called the Severity Index for Suspected Arbovirus (SISA), used only demographic and symptom data. The second prediction model, called the Severity Index for Suspected Arbovirus with Laboratory (SISAL), incorporated laboratory data. These models were selected by comparing the prediction ability of seven machine learning algorithms; the area under the receiver operating characteristic curve (AUC) from the prediction of a test dataset was used to select the final algorithm for each model. After eliminating subjects with missing data, the SISA dataset had 534 subjects, and the SISAL dataset had 98 subjects. For SISA, the best prediction algorithm was the generalized boosting model, with an AUC of 0.91. For SISAL, the best prediction algorithm was the elastic net, with an AUC of 0.94. A sensitivity analysis revealed that SISA and SISAL are not directly comparable to one another. Conclusions/Significance: Both SISA and SISAL were able to predict arbovirus hospitalization with a high degree of accuracy in our dataset. These algorithms will need to be tested and validated on new data from future patients. Machine learning is a powerful prediction tool and provides an excellent option for new management tools and clinical assessment of arboviral infection.
AB - Background: Dengue, chikungunya, and Zika are arboviruses of major global health concern. Decisions regarding the clinical management of suspected arboviral infection are challenging in resource-limited settings, particularly when deciding on patient hospitalization. The objective of this study was to determine if hospitalization of individuals with suspected arboviral infections could be predicted using subject intake data. Methodology/Principal findings: Two prediction models were developed using data from a surveillance study in Machala, a city in southern coastal Ecuador with a high burden of arboviral infections. Data were obtained from subjects who presented at sentinel medical centers with suspected arboviral infection (November 2013 to September 2017). The first prediction model, called the Severity Index for Suspected Arbovirus (SISA), used only demographic and symptom data. The second prediction model, called the Severity Index for Suspected Arbovirus with Laboratory (SISAL), incorporated laboratory data. These models were selected by comparing the prediction ability of seven machine learning algorithms; the area under the receiver operating characteristic curve (AUC) from the prediction of a test dataset was used to select the final algorithm for each model. After eliminating subjects with missing data, the SISA dataset had 534 subjects, and the SISAL dataset had 98 subjects. For SISA, the best prediction algorithm was the generalized boosting model, with an AUC of 0.91. For SISAL, the best prediction algorithm was the elastic net, with an AUC of 0.94. A sensitivity analysis revealed that SISA and SISAL are not directly comparable to one another. Conclusions/Significance: Both SISA and SISAL were able to predict arbovirus hospitalization with a high degree of accuracy in our dataset. These algorithms will need to be tested and validated on new data from future patients. Machine learning is a powerful prediction tool and provides an excellent option for new management tools and clinical assessment of arboviral infection.
KW - Adolescent
KW - Arbovirus infections/epidemiology
KW - Arboviruses/genetics
KW - Child
KW - Child, Preschool
KW - Ecuador/epidemiology
KW - Female
KW - Hospitalization/statistics & numerical data
KW - Humans
KW - Infant
KW - Machine Learning
KW - Male
KW - Prospective studies
KW - Retrospective studies
KW - Severity of Illness Index
U2 - 10.1371/journal.pntd.0007969
DO - 10.1371/journal.pntd.0007969
M3 - Article
C2 - 32059026
SN - 1935-2735
VL - 14
JO - PLoS Neglected Tropical Diseases
JF - PLoS Neglected Tropical Diseases
IS - 2
M1 - e0007969
ER -