Abstract
Cancer is a disease driven and characterised by mutations in DNA. Different categorisations of DNA mutations have allowed the identification of patterns that can act as signatures of the processes that have governed the life of the cancer. Over the last decade, research groups have identified more than 100 such signatures. Mutational signature analyses are improving our understanding of cancer aetiology and have the potential to play a role in diagnosis, prognosis and treatment choice. These analyses consist of estimating probability mass functions, or the weights that determine non-negative weighted combinations, yet they are perhaps unique amongst comparable analyses in the medical literature in that no confidence intervals or other representations of uncertainty are demanded when reporting the results. Here, we review the key statistical challenges for the field, assess the potential of existing approaches to adapt to those challenges, and comment on what we think are promising directions. Because the data are noisy and heterogeneous, we evaluate how to present them so that models use all the information available. Although the problem is often posed as one of matrix factorisation, we argue that a fully probabilistic approach is required to quantify uncertainty around model parameters and to underpin principled study design. Lastly, we argue that novel methodology is required to evaluate uncertainties in analyses where prior information is available.
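To make the estimation problem concrete, the following is a minimal sketch, not the method of the chapter. It assumes the signatures are already known and fixed, treats one sample's mutation counts as multinomial draws from a non-negative weighted combination of those signatures, estimates the weights by expectation-maximisation, and uses a parametric bootstrap as one simple stand-in for the uncertainty quantification the abstract calls for. The function names `fit_exposures` and `bootstrap_intervals`, the two six-category signatures, and the simulated counts are all hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_exposures(counts, signatures, n_iter=200):
    """EM estimate of mixture weights for fixed, known signatures.

    counts:     (C,) mutation counts per mutation category.
    signatures: (K, C) array; each row is a probability mass function.
    Returns a (K,) vector of non-negative weights summing to 1.
    """
    counts = np.asarray(counts, dtype=float)
    n_sig = signatures.shape[0]
    weights = np.full(n_sig, 1.0 / n_sig)
    total = counts.sum()
    for _ in range(n_iter):
        # Fitted category probabilities under the current weights.
        mix = weights @ signatures
        # Responsibility of each signature for each category, shape (K, C).
        resp = weights[:, None] * signatures / np.maximum(mix, 1e-300)
        # Re-estimate weights as the expected share of mutations per signature.
        weights = resp @ counts / total
    return weights


def bootstrap_intervals(counts, signatures, n_boot=200, level=0.95, seed=0):
    """Percentile intervals for the weights from multinomial resampling."""
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts)
    total = int(counts.sum())
    probs = counts / total
    draws = np.array([
        fit_exposures(rng.multinomial(total, probs), signatures)
        for _ in range(n_boot)
    ])
    alpha = (1.0 - level) / 2.0
    return np.quantile(draws, [alpha, 1.0 - alpha], axis=0)


# Hypothetical toy setting: two signatures over six mutation categories.
signatures = np.array([
    [0.40, 0.30, 0.10, 0.10, 0.05, 0.05],
    [0.05, 0.05, 0.10, 0.10, 0.30, 0.40],
])
true_weights = np.array([0.7, 0.3])

# Simulate one sample with 2000 mutations from the weighted combination.
rng = np.random.default_rng(1)
counts = rng.multinomial(2000, true_weights @ signatures)

weights = fit_exposures(counts, signatures)
lower, upper = bootstrap_intervals(counts, signatures)
for k in range(len(weights)):
    print(f"signature {k}: weight {weights[k]:.3f} "
          f"(95% interval {lower[k]:.3f} to {upper[k]:.3f})")
```

Reporting the interval alongside each weight is the point of the sketch: the point estimate alone gives no indication of how strongly the data support it. A fully probabilistic treatment, as argued for in the abstract, would go further than this resampling scheme, for example by also accounting for uncertainty in the signatures themselves.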
Original language | English |
---|---|
Title of host publication | Recent developments in statistics and data science |
Subtitle of host publication | SPE2021, Évora, Portugal, October 13–16 |
Editors | Regina Bispo, Lígia Henriques-Rodrigues, Russell Alpizar-Jara, Miguel de Carvalho |
Place of Publication | Cham |
Publisher | Springer |
Chapter | 17 |
Pages | 241-258 |
ISBN (Electronic) | 9783031127663 |
ISBN (Print) | 9783031127656 |
Publication status | Published - 29 Nov 2022 |
Event | XXV Congress of the Portuguese Statistical Society - Online, Évora, Portugal. Duration: 13 Oct 2021 → 16 Oct 2021. Conference number: 25. http://www.spe2021.uevora.pt/en/inicio-english/ |
Publication series
Name | Springer Proceedings in Mathematics & Statistics |
---|---|
Volume | 398 |
ISSN (Print) | 2194-1009 |
ISSN (Electronic) | 2194-1017 |
Conference
Conference | XXV Congress of the Portuguese Statistical Society |
---|---|
Abbreviated title | SPE |
Country/Territory | Portugal |
City | Évora |
Period | 13/10/21 → 16/10/21 |
Keywords
- Biostatistics
- Bioinformatics
- Cancer
- Genomics
- Next generation sequencing
- Whole genome sequencing