Scientific Publications

Automated prioritization of sick newborns for whole genome sequencing using clinical natural language processing and machine learning

Peterson B, Hernandez EJ, Hobbs C, Malone Jenkins S, Moore B, Rosales E, Zoucha S, Sanford E, Bainbridge MN, Frise E, Oriol A, Brunelli L, Kingsmore SF, Yandell M.

Genome Med. 2023 Mar 16;15(1):18. doi: 10.1186/s13073-023-01166-7. ABSTRACT BACKGROUND: Rapidly and efficiently identifying critically ill infants for whole genome sequencing (WGS) is a costly and challenging task currently performed by scarce, highly trained experts and is a major bottleneck for application of WGS in the NICU. There is a dire need for automated means to prioritize patients for WGS. METHODS: Institutional databases of electronic health records (EHRs) are logical starting points for identifying patients with undiagnosed Mendelian diseases. We have developed automated means to prioritize patients for rapid and whole genome sequencing (rWGS and WGS) directly from clinical notes. Our approach combines a clinical natural language processing (CNLP) workflow with a machine learning-based prioritization tool named Mendelian Phenotype Search Engine (MPSE). RESULTS: MPSE accurately and robustly identified NICU patients selected for WGS by clinical experts from Rady Children’s Hospital in San Diego (AUC 0.86) and the University of Utah (AUC 0.85). In addition to effectively identifying patients for WGS, MPSE scores also strongly prioritize diagnostic cases over non-diagnostic cases, with projected diagnostic yields exceeding 50% throughout the first and second quartiles of score-ranked patients. CONCLUSIONS: Our results indicate that an automated pipeline for selecting acutely ill infants in neonatal intensive care units (NICU) for WGS can meet or exceed diagnostic yields obtained through current selection procedures, which require time-consuming manual review of clinical notes and histories by specialized personnel. PMID:36927505 DOI:10.1186/s13073-023-01166-7

March 16, 2023

Artificial Intelligence in the Genetic Diagnosis of Rare Disease

James KN, Phadke S, Wong TC, Chowdhury S.

Clin Lab Med. 2023 Mar;43(1):127-143. doi: 10.1016/j.cll.2022.09.023. Part of special issue: Artificial Intelligence in the Clinical Laboratory: Current Practice and Emerging Opportunities PMID:36764805 DOI:10.1016/j.cll.2022.09.023

March 1, 2023

Scalable, high quality, whole genome sequencing from archived, newborn, dried blood spots

Ding Y, Owen M, Le J, Batalov S, Chau K, Kwon YH, Van Der Kraan L, Bezares-Orin Z, Zhu Z, Veeraraghavan N, Nahas S, Bainbridge M, Gleeson J, Baer RJ, Bandoli G, Chambers C, Kingsmore SF. 

NPJ Genom Med. 2023 Feb 14;8(1):5. doi: 10.1038/s41525-023-00349-w. ABSTRACT Universal newborn screening (NBS) is a highly successful public health intervention. Archived dried blood spots (DBS) collected for NBS represent a rich resource for population genomic studies. To fully harness this resource in such studies, DBS must yield high-quality genomic DNA (gDNA) for whole genome sequencing (WGS). In this pilot study, we hypothesized that gDNA of sufficient quality and quantity for WGS could be extracted from archived DBS up to 20 years old without PCR (polymerase chain reaction) amplification. We describe simple methods for gDNA extraction and WGS library preparation from several types of DBS. We tested these methods in DBS from 25 individuals who had previously undergone diagnostic, clinical WGS and 29 randomly selected DBS cards collected for NBS from the California State Biobank. While the gDNA yield from DBS was significantly lower than that from EDTA blood from the same individuals, it was of sufficient quality and quantity for WGS without PCR. All DBS samples yielded WGS that met quality control metrics for high-confidence variant calling. Twenty-eight variants of various types that had been reported clinically in 19 samples were recapitulated in WGS from DBS. There were no significant effects of age or paper type on WGS quality. Archived DBS appear to be a suitable sample type for WGS in population genomic studies. PMID:36788231 DOI:10.1038/s41525-023-00349-w

February 14, 2023
Newborn Screening

Reclassification of the Etiology of Infant Mortality With Whole-Genome Sequencing

Owen MJ, Wright MS, Batalov S, Kwon Y, Ding Y, Chau KK, Chowdhury S, Sweeney NM, Kiernan E, Richardson A, Batton E, Baer RJ, Bandoli G, Gleeson JG, Bainbridge M, Chambers CD, Kingsmore SF.

JAMA Netw Open. 2023 Feb 1;6(2):e2254069. doi: 10.1001/jamanetworkopen.2022.54069. ABSTRACT IMPORTANCE: Understanding the causes of infant mortality shapes public health, surveillance, and research investments. However, the association of single-locus (mendelian) genetic diseases with infant mortality is poorly understood. OBJECTIVE: To determine the association of genetic diseases with infant mortality. DESIGN, SETTING, AND PARTICIPANTS: This cohort study was conducted at a large pediatric hospital system in San Diego County (California) and included 546 infants (112 infant deaths [20.5%] and 434 infants [79.5%] with acute illness who survived; age, 0 to 1 year) who underwent diagnostic whole-genome sequencing (WGS) between January 2015 and December 2020. Data analysis was conducted between 2015 and 2022. EXPOSURE: Infants underwent WGS either premortem or postmortem with semiautomated phenotyping and diagnostic interpretation. MAIN OUTCOMES AND MEASURES: Proportion of infant deaths associated with single-locus genetic diseases. RESULTS: Among 112 infant deaths (54 girls [48.2%]; 8 [7.1%] African American or Black, 1 [0.9%] American Indian or Alaska Native, 8 [7.1%] Asian, 48 [42.9%] Hispanic, 1 [0.9%] Native Hawaiian or Pacific Islander, and 34 [30.4%] White infants) in San Diego County between 2015 and 2020, single-locus genetic diseases were the most common identifiable cause of infant mortality, with 47 genetic diseases identified in 46 infants (41%). Thirty-nine (83%) of these diseases had been previously reported to be associated with childhood mortality. Twenty-eight death certificates (62%) for 45 of the 46 infants did not mention a genetic etiology. Treatments that can improve outcomes were available for 14 (30%) of the genetic diseases. In 5 of 7 infants in whom genetic diseases were identified postmortem, death might have been avoided had rapid, diagnostic WGS been performed at time of symptom onset or regional intensive care unit admission. CONCLUSIONS AND RELEVANCE: In this cohort study of 112 infant deaths, the association of genetic diseases with infant mortality was higher than previously recognized. Strategies to increase neonatal diagnosis of genetic diseases and immediately implement treatment may decrease infant mortality. Additional study is required to explore the generalizability of these findings and measure reduction in infant mortality. PMID:36757698 DOI:10.1001/jamanetworkopen.2022.54069

February 9, 2023
Infant Mortality

25: A Multicenter Cohort Analysis of Rapid Genome Sequencing in the PICU

Rodriguez, Katherine; Kobayashi, Erica Sanford; VanDongen-Trimmer, Heather; Salz, Lisa; Foley, Jennifer; Whalen, Drewann; Oluchukwu, Okonkwo; Liu, Kuang Chuen; Burton, Jennifer; Syngal, Prachi; Kingsmore, Stephen; Coufal, Nicole.

Critical Care Medicine 51(1):p 13, January 2023. Genetic disorders contribute significantly to morbidity and mortality in pediatric critical care. Diagnostic rapid whole genome sequencing (rWGS) has dramatically impacted care in neonatal intensive care units (ICU). There remains a population of undiagnosed patients with rare genetic diseases who present critically ill to the pediatric ICU (PICU) and the application of rWGS in this setting is not yet fully described. This study evaluated the clinical utility of rWGS in the PICU. DOI: 10.1097/01.ccm.0000905976.97417.e4

January 31, 2023

Further delineation of the CWC27-associated spliceosomopathy: Case report and review of the literature

Yassin SH, Henderson R, Lenberg J, Murillo V, Murdock DR, Friedman J, Jones MC, Wigby K, Borooah S.

Am J Med Genet A. 2023 Jan 31. doi: 10.1002/ajmg.a.63134. Online ahead of print. ABSTRACT Pre-mRNA splicing factors are crucial in regulating transcript diversity, by removing introns from eukaryotic transcripts, an essential step in gene expression. Splicing of pre-mRNA is catalyzed by spliceosomes. CWC27 is a cyclophilin associated with spliceosome, in which genetic defects of its components have been linked to spliceosomopathies with clinical phenotypes including skeletal developmental defects, retinitis pigmentosa (RP), short stature, skeletal anomalies, and neurological disorders. We report two siblings (male and female) of Mexican descent with a novel homozygous frameshift variant in CWC27 and aim to highlight the cardinal features among the previously described 12 cases as well as expand the currently recognized phenotypic spectrum. Both siblings presented with a range of ocular and extraocular manifestations including novel features such as solitary kidney and tarsal coalition in the male sibling, together with gait abnormalities, and Hashimoto’s thyroiditis in the female sibling. Finally, we highlight ectodermal involvement including sparse scalp hair, eyebrows and lashes, pigmentary differences, nail dysplasia, and dental anomalies as a core phenotype associated with the CWC27 spliceosomopathy. PMID:36718996 DOI:10.1002/ajmg.a.63134

January 31, 2023

The genomic landscape of short tandem repeats across multiple ancestries

Vijayaraghavan P, Batalov S, Ding Y, Sanford E, Kingsmore SF, Dimmock D, Hobbs C, Bainbridge M. 

PLoS One. 2023 Jan 26;18(1):e0279430. doi: 10.1371/journal.pone.0279430. eCollection 2023. ABSTRACT Short Tandem Repeats (STRs) have been found to play a role in a myriad of complex traits and genetic diseases. We examined the variability in the lengths of over 850,000 STR loci in 996 children with suspected genetic disorders and 1,178 parents across six separate ancestral groups: Africans, Europeans, East Asians, Admixed Americans, Non-admixed Americans, and Pacific Islanders. For each STR locus we compared allele length between and within each ancestry group. In relation to Europeans, admixed Americans had the most similar STR lengths with only 623 positions either significantly expanded or contracted, while the divergence was highest in Africans, with 4,933 chromosomal positions contracted or expanded. We also examined probands to identify STR expansions at known pathogenic loci. The genes TCF4, AR, and DMPK showed significant expansions with lengths 250% greater than their various average allele lengths in 49, 162, and 11 individuals respectively. All 49 individuals containing an expansion in TCF4 and six individuals containing an expansion in DMPK presented with allele lengths longer than the known pathogenic length for these genes. Next, we identified individuals with significant expansions in highly conserved loci across all ancestries. Eighty loci in conserved regions met criteria for divergence. Two of these individuals were found to have exonic STR expansions: one in ZBTB4 and the other in SLC9A7, which is associated with X-linked mental retardation. Finally, we used parent-child trios to detect and analyze de novo mutations. In total, we observed 3,219 de novo expansions, where proband allele lengths are greater than twice the longest parental allele length. This work helps lay the foundation for understanding STR lengths genome-wide across ancestries and may help identify new disease genes and novel mechanisms of pathogenicity in known disease genes. PMID:36701310 DOI:10.1371/journal.pone.0279430

January 26, 2023
Gene Discovery

Are we prepared to deliver gene-targeted therapies for rare diseases?

Yu TW, Kingsmore SF, Green RC, MacKenzie T, Wasserstein M, Caggana M, Gold NB, Kennedy A, Kishnani PS, Might M, Brooks PJ, Morris JA, Parisi MA, Urv TK.

Am J Med Genet C Semin Med Genet. 2023 Jan 24. doi: 10.1002/ajmg.c.32029. Online ahead of print. ABSTRACT The cost and time needed to conduct whole-genome sequencing (WGS) have decreased significantly in the last 20 years. At the same time, the number of conditions with a known molecular basis has steadily increased, as has the number of investigational new drug applications for novel gene-based therapeutics. The prospect of precision gene-targeted therapy for all seems in reach… or is it? Here we consider practical and strategic considerations that need to be addressed to establish a foundation for the early, effective, and equitable delivery of these treatments. PMID:36691939 DOI:10.1002/ajmg.c.32029

January 24, 2023
Rare Disease

TMEM161B modulates radial glial scaffolding in neocortical development

Wang L, Heffner C, Vong KL, Barrows C, Ha YJ, Lee S, Lara-Gonzalez P, Jhamb I, Van Der Meer D, Loughnan R, Parker N, Sievert D, Mittal S, Issa MY, Andreassen OA, Dale A, Dobyns WB, Zaki MS, Murray SA, Gleeson JG.

Proc Natl Acad Sci U S A. 2023 Jan 24;120(4):e2209983120. doi: 10.1073/pnas.2209983120. Epub 2023 Jan 20. ABSTRACT TMEM161B encodes an evolutionarily conserved widely expressed novel 8-pass transmembrane protein of unknown function in human. Here we identify TMEM161B homozygous hypomorphic missense variants in our recessive polymicrogyria (PMG) cohort. Patients carrying TMEM161B mutations exhibit striking neocortical PMG and intellectual disability. Tmem161b knockout mice fail to develop midline hemispheric cleavage, whereas knock-in of patient mutations and patient-derived brain organoids show defects in apical cell polarity and radial glial scaffolding. We found that TMEM161B modulates actin filopodia, functioning upstream of the Rho-GTPase CDC42. Our data link TMEM161B with human PMG, likely regulating radial glia apical polarity during neocortical development. PMID:36669109 DOI:10.1073/pnas.2209983120

January 24, 2023

Stem Cell-Based Organoid Models of Neurodevelopmental Disorders

Wang L, Owusu-Hammond C, Sievert D, Gleeson JG.

Biol Psychiatry. 2023 Jan 24:S0006-3223(23)00039-2. doi: 10.1016/j.biopsych.2023.01.012. Online ahead of print. ABSTRACT The past decade has seen an explosion in the identification of genetic causes of neurodevelopmental disorders, including Mendelian, de novo, and somatic factors. These discoveries provide opportunities to understand cellular and molecular mechanisms as well as potential gene-gene and gene-environment interactions to support novel therapies. Stem cell-based models, particularly human brain organoids, can capture disease-associated alleles in the context of the human genome, engineered to mirror disease-relevant aspects of cellular complexity and developmental timing. These models have brought key insights into neurodevelopmental disorders as diverse as microcephaly, autism, and focal epilepsy. However, intrinsic organoid-to-organoid variability, low levels of certain brain-resident cell types, and long culture times required to reach maturity can impede progress. Several recent advances incorporate specific morphogen gradients, mixtures of diverse brain cell types, and organoid engraftment into animal models. Together with nonhuman primate organoid comparisons, mechanisms of human neurodevelopmental disorders are emerging. PMID:36759260 DOI:10.1016/j.biopsych.2023.01.012

January 24, 2023

