TY - JOUR
T1 - Marginal structural models using calibrated weights with SuperLearner
T2 - application to type II diabetes cohort
AU - Kalia, Sumeet
AU - Saarela, Olli
AU - Chen, Tao
AU - O'Neill, Braden
AU - Meaney, Christopher
AU - Gronsbell, Jessica
AU - Sejdic, Ervin
AU - Escobar, Michael
AU - Aliarzadeh, Babak
AU - Moineddin, Rahim
AU - Pow, Conrad
AU - Sullivan, Frank
AU - Greiver, Michelle
N1 - Funding: We would like to acknowledge the SOPR-CIHR funding for NDR and the AHRQ Inspire PHC award for the application of this study. This statistical research was supported by a Natural Sciences and Engineering Research Council (NSERC) PhD scholarship (CGS: 534600).
PY - 2022/5/19
Y1 - 2022/5/19
AB - As different scientific disciplines begin to converge on machine learning for causal inference, we demonstrate the application of machine learning algorithms in the context of longitudinal causal estimation using electronic health records. Our aim is to formulate a marginal structural model for estimating diabetes care provisions in which we envisioned hypothetical (i.e. counterfactual) dynamic treatment regimes using a combination of drug therapies to manage diabetes: metformin, sulfonylurea and SGLT-2i. The binary outcome of diabetes care provisions was defined using a composite measure of chronic disease prevention and screening elements [27] including (i) primary care visit, (ii) blood pressure, (iii) weight, (iv) hemoglobin A1c, (v) lipid, (vi) ACR, (vii) eGFR and (viii) statin medication. We used several statistical learning algorithms to describe causal relationships between the prescription of three common classes of diabetes medications and quality of diabetes care using the electronic health records contained in the National Diabetes Repository. In particular, we generated an ensemble of statistical learning algorithms using the SuperLearner framework based on the following base learners: (i) least absolute shrinkage and selection operator, (ii) ridge regression, (iii) elastic net, (iv) random forest, (v) gradient boosting machines, and (vi) neural network. Each statistical learning algorithm was fitted using the pseudo-population generated from the marginalization of the time-dependent confounding process. Covariate balance was assessed using the longitudinal (i.e. cumulative-time product) stabilized weights with calibrated restrictions. Our results indicated that the treatment drop-in cohorts (with respect to metformin, sulfonylurea and SGLT-2i) may have improved diabetes care provisions relative to the treatment-naive (i.e. no treatment) cohort. In terms of clinical utility, we hope that this article will facilitate discussions around the prevention of adverse chronic outcomes associated with type II diabetes through the improvement of diabetes care provisions in primary care.
KW - Causal inference
KW - Machine learning
KW - SuperLearner
KW - Longitudinal interventions
KW - Chronic disease prevention
KW - Electronic health records
KW - Primary care
U2 - 10.1109/JBHI.2022.3175862
DO - 10.1109/JBHI.2022.3175862
M3 - Article
C2 - 35588417
SN - 2168-2194
VL - Early Access
JO - IEEE Journal of Biomedical and Health Informatics
JF - IEEE Journal of Biomedical and Health Informatics
ER -