Using Electronic Health Records to identify risk of lung cancer in smokers.

Activity: Talk or presentation types: Presentation


In the UK, lung cancer is a leading cause of cancer-related death, accounting for 21% of all cancer-related mortality. To aid early diagnosis, modelling has been carried out to identify sub-populations at greater risk of developing lung cancer. Most predictive models examining risk in smokers use trial data. This study uses both trial data and administrative electronic health record data to investigate further possible risk factors and to determine the significance of established risk factors in a lung cancer risk model for smokers.

Data on current and former smokers were obtained from the Early Cancer of the Lung Scotland (ECLS) trial (N=12,139). These data were linked with the same participants' administrative electronic health records. Both stepwise and forward logistic regression were used to obtain measures of effect and to select predictors. Demographic predictors (e.g. age, smoking status) and clinical predictors (e.g. diagnosis of COPD, hospitalisation for heart disease) were included in the final model.

The model performed well, with an AUC of 0.895 and a negative predictive value of 96.8%. The analysis was possibly underpowered owing to the small number of lung cancer cases in the cohort, which resulted in the model producing a positive predictive value of only 0.2%.
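
As a general illustration of how predictive values behave when an outcome is rare, the short sketch below computes positive and negative predictive value from sensitivity, specificity and prevalence using Bayes' rule. The function name `predictive_values` and all numeric inputs are illustrative assumptions; none of them are taken from the ECLS analysis, and this is not a reconstruction of the study's model.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary classifier.

    All three arguments are proportions in (0, 1). The values used
    below are hypothetical and are NOT results from the study.
    """
    # Expected share of the population in each confusion-matrix cell.
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)

    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


if __name__ == "__main__":
    # Hold sensitivity and specificity fixed (hypothetical 0.80 each)
    # and lower the prevalence to see how PPV and NPV respond.
    for prevalence in (0.20, 0.05, 0.01):
        ppv, npv = predictive_values(0.80, 0.80, prevalence)
        print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

With sensitivity and specificity held constant, PPV falls and NPV rises as prevalence decreases, so a high NPV alongside a low PPV is an expected pattern when the outcome is uncommon.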
Period: 22 Jan 2022
Event title: ADEGS 2022
Event type: Conference