PCR duplicate proportion estimation and consequences for DNA copy number calculations

Andy Lynch*, Mike Smith, Matthew Eldridge, Simon Tavaré

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution


The volume of DNA in a sequencing experiment is often amplified by PCR, leading to the possibility that the same original DNA fragment will be sequenced twice - a ‘PCR duplicate’. Sometimes indistinguishable from these are multiple sequences arising from identical but independent molecules, which can lead to an over-estimation of the PCR duplicate proportion. The PCR duplicate proportion, and other measures derived from it, are important statistics for quality assurance, experimental design, and interpretation of sequencing experiments. Here we provide a full likelihood basis for a combinatorial approach using heterozygous SNPs as implemented in our R package, and demonstrate the efficacy of the approach. We also discuss the association with DNA copy number, and demonstrate the impact on a question of inferring mitochondrial DNA copy number that has recently been a feature of several high-profile cancer studies. This is explored through a simulation study.
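
The role heterozygous SNPs play in separating true PCR duplicates from coincidentally identical molecules can be illustrated with a small simulation. This is a minimal sketch under simplifying assumptions, not the full-likelihood method or the R package described above: it considers only pairs of apparent duplicates overlapping a heterozygous SNP, and assumes both alleles are sampled equally often and there are no sequencing errors. A true PCR duplicate always carries the same allele as its partner, whereas an independent molecule matches only half the time, so the fraction of independent pairs is roughly twice the observed discordant fraction. The function names `simulate_pairs` and `estimate_true_dup_fraction` are hypothetical.

```python
import random


def simulate_pairs(n_pairs, true_dup_fraction, seed=0):
    """Simulate apparent duplicate pairs overlapping a heterozygous SNP.

    Each pair is (allele_of_first_read, allele_of_second_read).
    Assumption: alleles "A" and "B" are equally likely, no sequencing errors.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_pairs):
        first = rng.choice("AB")
        if rng.random() < true_dup_fraction:
            # A PCR copy of the same fragment carries the same allele.
            second = first
        else:
            # An independent molecule carries either allele with probability 1/2.
            second = rng.choice("AB")
        pairs.append((first, second))
    return pairs


def estimate_true_dup_fraction(pairs):
    """Estimate the fraction of apparent duplicate pairs that are true PCR duplicates.

    Independent pairs are discordant half the time, so the independent
    fraction is about 2 * (discordant fraction); the remainder is attributed
    to PCR duplication. The result is clipped at zero.
    """
    discordant = sum(1 for a, b in pairs if a != b)
    discordant_fraction = discordant / len(pairs)
    return max(0.0, 1.0 - 2.0 * discordant_fraction)


if __name__ == "__main__":
    pairs = simulate_pairs(100_000, true_dup_fraction=0.7, seed=1)
    print(round(estimate_true_dup_fraction(pairs), 3))
```

Counting every apparent pair as a PCR duplicate would give a proportion of 1.0 in this simulation, whereas the allele-based correction should recover a value close to the simulated 0.7. Regions with allelic imbalance or altered copy number would violate the equal-allele assumption made here.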
Original language: English
Title of host publication: Recent developments in statistics and data science
Subtitle of host publication: SPE2021, Évora, Portugal, October 13–16
Editors: Regina Bispo, Lígia Henriques-Rodrigues, Russell Alpizar-Jara, Miguel de Carvalho
Place of Publication: Cham
Number of pages: 21
ISBN (Electronic): 9783031127663
ISBN (Print): 9783031127656
Publication status: Published - 29 Nov 2022
Event: XXV Congress of the Portuguese Statistical Society - Online, Évora, Portugal
Duration: 13 Oct 2021 to 16 Oct 2021
Conference number: 25

Publication series

Name: Springer proceedings in mathematics & statistics
ISSN (Print): 2194-1009
ISSN (Electronic): 2194-1017


Conference: XXV Congress of the Portuguese Statistical Society
Abbreviated title: SPE

  • Whole-genome sequencing
  • DNA copy number
  • Likelihood
  • Quality control
  • Mitochondria
  • Cancer
