Detecting High-Risk Smokers in Primary Care Electronic Health Records

Activity: Talk or presentation types: Presentation


In the UK, lung cancer is a leading cause of cancer death, accounting for 21% of all cancer-related mortality. Between 74% and 76% of individuals are diagnosed at a late stage (i.e. stage III/IV). As a result, patient prognosis is poor, with only 16% surviving 5 or more years. However, screening has the potential to improve survival rates and reduce lung cancer mortality. To implement screening programmes effectively, a targeted approach has been recommended, which requires the development of criteria to identify individuals at risk. Previous risk models identified sub-populations at risk of lung cancer by utilising data from clinical trials or surveys, but access to GP patient records could achieve better targeting of high-risk groups, as demonstrated by Atkinson et al. using the SAIL databank in 2017. This study aims to develop a model that predicts the incidence of lung cancer using GP electronic health record data.

This project will be an observational study that examines factors contained in electronic health records (EHRs) that are associated with lung cancer and uses them to produce estimates of lung cancer risk. The project is limited to data collected as part of the Early detection of Cancer of the Lung Scotland (ECLS) trial and the EHR information of the same participants. Natural language processing will be used to extract information on smoking behaviour from the free text in participants' EHRs. These data will then be used to model the risk of lung cancer incidence.
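To illustrate what extracting smoking behaviour from free text can look like in its simplest form, the following is a minimal rule-based sketch. It is not the study's method, which is still at the planning stage. The patterns, the function name `extract_smoking` and the example notes are all hypothetical, and a real pipeline would need to handle negation, misspellings and clinical abbreviations.

```python
import re

# Hypothetical keyword patterns, for illustration only (not from the study).
NEVER = re.compile(r"\b(never smoked|never[- ]smoker|non[- ]smoker)\b", re.I)
EX = re.compile(
    r"\b(ex[- ]smoker|former smoker|stopped smoking|quit smoking|gave up smoking)\b",
    re.I,
)
CURRENT = re.compile(r"\b(current smoker|smokes|smoking)\b", re.I)
# Matches phrases such as "20 cigarettes per day", "10/day" or "15 a day".
PER_DAY = re.compile(r"\b(\d{1,3})\s*(?:cigarettes|cigs)?\s*(?:/|per|a)\s*day\b", re.I)


def extract_smoking(note: str) -> dict:
    """Return a coarse smoking status and daily consumption from one note."""
    # Check the most specific categories first, because "non-smoker" and
    # "stopped smoking" also contain the generic words "smoker"/"smoking".
    if NEVER.search(note):
        status = "never"
    elif EX.search(note):
        status = "ex"
    elif CURRENT.search(note):
        status = "current"
    else:
        status = "unknown"
    match = PER_DAY.search(note)
    per_day = int(match.group(1)) if match else None
    return {"status": status, "cigarettes_per_day": per_day}


if __name__ == "__main__":
    # Invented example notes; no real patient data.
    for note in ["Current smoker, 20 cigarettes per day",
                 "Ex-smoker, stopped smoking 2015",
                 "Never smoked"]:
        print(note, "->", extract_smoking(note))
```

Structured fields of this kind (status, cigarettes per day) are the sort of variables that could then feed a risk model. Note that this simple sketch would misclassify negated phrases such as "denies smoking" as current smoking.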

The primary outcome of the study is incidence of lung cancer. The objectives of the study are i) to identify pre-existing clinical and non-clinical factors that are predictive of lung cancer in individuals who smoke and ii) to develop a method to identify and categorise smoking behaviour in people who are at high risk of developing lung cancer.

As this project is still in the planning stage, there are no findings to report yet.

As the project will develop metrics for modelling that may not be available to researchers utilising data from clinical trials or surveys, this research will identify whether there are further risk factors that should be considered in risk modelling. The study has the potential to aid GP practices in identifying high-risk patients who may need referral or safety netting, reducing delays and improving patient prognosis. Moreover, as uptake of screening is an issue in this area, future research may look at interventions that use EHRs to target those at high risk.
Period: 30 Jun 2021 to 1 Jul 2021
Event title: Society for Academic Primary Care Conference
Event type: Conference