TY - JOUR
T1 - Genome-wide association neural networks identify genes linked to family history of Alzheimer's disease
AU - Ghose, Upamanyu
AU - Sproviero, William
AU - Winchester, Laura
AU - Amin, Najaf
AU - Zhu, Taiyu
AU - Newby, Danielle
AU - Ulm, Brittany S
AU - Papathanasiou, Angeliki
AU - Shi, Liu
AU - Liu, Qiang
AU - Fernandes, Marco
AU - Adams, Cassandra
AU - Albukhari, Ashwag
AU - Almansouri, Majid
AU - Choudhry, Hani
AU - van Duijn, Cornelia
AU - Nevado-Holgado, Alejo
N1 - Funding: This work was supported by Alzheimer’s Research UK [grant ID ARUK-PhD2022-031]; King Abdulaziz University and the University of Oxford Centre for Artificial Intelligence in Precision Medicine (KO-CAIPM); Janssen Research and Development (Johnson & Johnson); the John Fell Foundation [grant ID 0010659]; and the Virtual Brain Cloud from European Commission [grant number H2020-SC1-DTH-2018-1]. C.A. is funded by the National Institute for Health Research (NIHR) Oxford Biomedical Research Centre (BRC).
PY - 2025/1/8
Y1 - 2025/1/8
N2 - Augmenting traditional genome-wide association studies (GWAS) with advanced machine learning algorithms can allow the detection of novel signals in available cohorts. We introduce “genome-wide association neural networks (GWANN)”, a novel approach that uses neural networks (NNs) to perform a gene-level association study with family history of Alzheimer’s disease (AD). In UK Biobank, we defined cases (n = 42 110) as those with AD or family history of AD and sampled an equal number of controls. The data was split into an 80:20 ratio of training and testing samples, and GWANN was trained on the former followed by identifying associated genes using its performance on the latter. Our method identified 18 genes to be associated with family history of AD. APOE, BIN1, SORL1, ADAM10, APH1B, and SPI1 have been identified by previous AD GWAS. Among the 12 new genes, PCDH9, NRG3, ROR1, LINGO2, SMYD3, and LRRC7 have been associated with neurofibrillary tangles or phosphorylated tau in previous studies. Furthermore, there is evidence for differential transcriptomic or proteomic expression between AD and healthy brains for 10 of the 12 new genes. A series of post hoc analyses resulted in a significantly enriched protein–protein interaction network (P-value < 1 × 10⁻¹⁶), and enrichment of relevant disease and biological pathways such as focal adhesion (P-value = 1 × 10⁻⁴), extracellular matrix organization (P-value = 1 × 10⁻⁴), Hippo signaling (P-value = 7 × 10⁻⁴), Alzheimer’s disease (P-value = 3 × 10⁻⁴), and impaired cognition (P-value = 4 × 10⁻³). Applying NNs for GWAS illustrates their potential to complement existing algorithms and methods and enable the discovery of new associations without the need to expand existing cohorts.
AB - Augmenting traditional genome-wide association studies (GWAS) with advanced machine learning algorithms can allow the detection of novel signals in available cohorts. We introduce “genome-wide association neural networks (GWANN)”, a novel approach that uses neural networks (NNs) to perform a gene-level association study with family history of Alzheimer’s disease (AD). In UK Biobank, we defined cases (n = 42 110) as those with AD or family history of AD and sampled an equal number of controls. The data was split into an 80:20 ratio of training and testing samples, and GWANN was trained on the former followed by identifying associated genes using its performance on the latter. Our method identified 18 genes to be associated with family history of AD. APOE, BIN1, SORL1, ADAM10, APH1B, and SPI1 have been identified by previous AD GWAS. Among the 12 new genes, PCDH9, NRG3, ROR1, LINGO2, SMYD3, and LRRC7 have been associated with neurofibrillary tangles or phosphorylated tau in previous studies. Furthermore, there is evidence for differential transcriptomic or proteomic expression between AD and healthy brains for 10 of the 12 new genes. A series of post hoc analyses resulted in a significantly enriched protein–protein interaction network (P-value < 1 × 10⁻¹⁶), and enrichment of relevant disease and biological pathways such as focal adhesion (P-value = 1 × 10⁻⁴), extracellular matrix organization (P-value = 1 × 10⁻⁴), Hippo signaling (P-value = 7 × 10⁻⁴), Alzheimer’s disease (P-value = 3 × 10⁻⁴), and impaired cognition (P-value = 4 × 10⁻³). Applying NNs for GWAS illustrates their potential to complement existing algorithms and methods and enable the discovery of new associations without the need to expand existing cohorts.
KW - Alzheimer's disease
KW - Neural networks
KW - Artificial intelligence
KW - Machine learning
KW - GWAS
KW - UK Biobank
U2 - 10.1093/bib/bbae704
DO - 10.1093/bib/bbae704
M3 - Article
C2 - 39775791
SN - 1467-5463
VL - 26
JO - Briefings in Bioinformatics
JF - Briefings in Bioinformatics
IS - 1
M1 - bbae704
ER -