Marginal structural models using calibrated weights with SuperLearner: application to type II diabetes cohort

Sumeet Kalia*, Olli Saarela, Tao Chen, Braden O'Neill, Christopher Meaney, Jessica Gronsbell, Ervin Sejdic, Michael Escobar, Babak Aliarzadeh, Rahim Moineddin, Conrad Pow, Frank Sullivan, Michelle Greiver

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review



As different scientific disciplines begin to converge on machine learning for causal inference, we demonstrate the application of machine learning algorithms in the context of longitudinal causal estimation using electronic health records. Our aim is to formulate a marginal structural model for estimating diabetes care provisions under hypothetical (i.e. counterfactual) dynamic treatment regimes that combine drug therapies used to manage diabetes: metformin, sulfonylurea and SGLT-2i. The binary outcome of diabetes care provisions was defined using a composite measure of chronic disease prevention and screening elements [27], including (i) primary care visit, (ii) blood pressure, (iii) weight, (iv) hemoglobin A1c, (v) lipid, (vi) ACR, (vii) eGFR and (viii) statin medication. We used several statistical learning algorithms to describe causal relationships between the prescription of three common classes of diabetes medications and quality of diabetes care, using the electronic health records contained in the National Diabetes Repository. In particular, we generated an ensemble of statistical learning algorithms using the SuperLearner framework with the following base learners: (i) least absolute shrinkage and selection operator, (ii) ridge regression, (iii) elastic net, (iv) random forest, (v) gradient boosting machines, and (vi) neural network. Each statistical learning algorithm was fitted using the pseudo-population generated from the marginalization of the time-dependent confounding process. Covariate balance was assessed using the longitudinal (i.e. cumulative-time product) stabilized weights with calibrated restrictions. Our results indicated that the treatment drop-in cohorts (with respect to metformin, sulfonylurea and SGLT-2i) may have improved diabetes care provisions relative to the treatment-naive (i.e. no treatment) cohort.
In terms of clinical utility, we hope that this article will facilitate discussions around the prevention of adverse chronic outcomes associated with type II diabetes through the improvement of diabetes care provisions in primary care.
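
The "cumulative-time product" stabilized weights mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the per-visit probabilities of the observed treatment have already been estimated (for example by an ensemble learner) and are simply passed in. The function name `stabilized_weights` and its arguments are hypothetical, and the calibration step described in the abstract is omitted.

```python
from itertools import accumulate
from operator import mul


def stabilized_weights(numer_probs, denom_probs):
    """Cumulative-product stabilized weights for one subject (illustrative only).

    numer_probs[k]: estimated probability of the observed treatment at visit k,
                    given past treatment (and possibly baseline covariates).
    denom_probs[k]: estimated probability of the observed treatment at visit k,
                    given past treatment and time-varying confounders.
    Returns one weight per visit: the running product of numer/denom ratios.
    """
    if len(numer_probs) != len(denom_probs):
        raise ValueError("numerator and denominator must have the same length")
    ratios = []
    for n, d in zip(numer_probs, denom_probs):
        # Probabilities must lie in (0, 1]; a zero denominator would
        # signal a positivity violation.
        if not (0 < n <= 1) or not (0 < d <= 1):
            raise ValueError("probabilities must lie in (0, 1]")
        ratios.append(n / d)
    # Weight at visit t is the product of the ratios for visits 0..t.
    return list(accumulate(ratios, mul))


# Hypothetical subject with three visits.
weights = stabilized_weights([0.5, 0.5, 0.5], [0.25, 0.5, 1.0])
print(weights)
```

In a pseudo-population analysis, weights of this kind would then be supplied as observation weights when fitting the outcome model. How the authors calibrate or restrict them is not specified here.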

Original language: English
Journal: IEEE Journal of Biomedical and Health Informatics
Volume: Early Access
Early online date: 19 May 2022
Publication status: E-pub ahead of print - 19 May 2022


Keywords:
  • Causal inference
  • Machine learning
  • SuperLearner
  • Longitudinal interventions
  • Chronic disease prevention
  • Electronic health records
  • Primary care

