Hum Genet. 2020 May;139(5):569-574. doi: 10.1007/s00439-020-02135-5. Epub 2020 Feb 13.
Extremely rare diseases are increasingly recognized due to widespread, inexpensive genomic sequencing. Understanding the incidence of rare disease is important for appreciating its health impact and allocating resources for research. However, estimating the incidence of rare disease is challenging because the individual contributory alleles are themselves extremely rare. We propose a new method to determine the incidence of rare, severe, recessive disease in non-consanguineous populations that uses known allele frequencies, estimates the combined allele frequency of observed alleles, and estimates the number of causative alleles that are thus far unobserved in a disease cohort. Experiments on simulated and real data show that this approach is a feasible method to estimate the incidence of rare disease in European populations; however, because of several limitations in our ability to assess the full spectrum of pathogenic mutations, it serves as a useful tool to provide a lower bound on disease incidence.
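To make the link between allele frequency and incidence concrete, the sketch below applies the standard Hardy-Weinberg relation for a recessive disease in a randomly mating (non-consanguineous) population: if q is the combined frequency of all causative alleles, the expected incidence is q². This is not the authors' estimator, which the abstract does not specify. The function name `recessive_incidence`, the parameter `unobserved_freq`, and the example frequencies are all hypothetical and chosen only for illustration.

```python
def recessive_incidence(observed_allele_freqs, unobserved_freq=0.0):
    """Expected incidence of a recessive disease under Hardy-Weinberg equilibrium.

    observed_allele_freqs: population frequencies of known causative alleles.
    unobserved_freq: assumed combined frequency of causative alleles not yet
        observed (hypothetical parameter; how to estimate it is not shown here).
    """
    # Combined frequency q of all causative alleles at the locus.
    q = sum(observed_allele_freqs) + unobserved_freq
    if not 0.0 <= q <= 1.0:
        raise ValueError("combined allele frequency must lie in [0, 1]")
    # Affected individuals carry two causative alleles, so incidence is q^2.
    return q * q


# Illustrative frequencies only, not data from the study.
observed = [0.001, 0.002]
lower_bound = recessive_incidence(observed)
with_unobserved = recessive_incidence(observed, unobserved_freq=0.001)
print(f"observed alleles only: 1 in {1 / lower_bound:,.0f}")
print(f"with unobserved alleles: 1 in {1 / with_unobserved:,.0f}")
```

Because unobserved alleles can only increase q, the value computed from observed alleles alone is a lower bound on incidence, which matches the way the abstract frames the result.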