Fair Phenotype Annotation and Genomic Reinterpretation

  • Weng, Chunhua C (PI)
  • Wang, Kai K (CoPI)
  • Chung, Wendy W.K (CoPI)

Project: Research project

Project Details

Description

PROJECT SUMMARY

As genomic knowledge evolves rapidly, the need for genomic reinterpretation is increasing. However, there is not yet a standard approach to determining for whom, when, and how reinterpretation should be provided to ensure accuracy, cost-effectiveness, and fairness. Unequal access to genomic tests and genetic specialists has widened health disparities, which could be further exacerbated by limited ancestry-specific genetic data. Our overarching goal is to design a scalable and sustainable informatics framework to support continuous genomic reanalysis for symptomatic patients with non-diagnostic exome or genome sequencing in diverse populations.

Extending our previously published work on Doc2HPO, Criteria2Query, Phen2Gene, PhenCards, Phenominal, and phenotype-disease knowledge graphs, we will first develop a natural language processing (NLP) pipeline to create a multimodal phenome from clinical notes using the latest Phenopacket schema. By comparing phenotypes from longitudinal electronic health record (EHR) data over time and analyzing the changes in the context of new evidence for variants, we will identify the individuals who can benefit most from genomic reanalysis. We will then use these evolving clinical phenotypes to trigger automatic variant reinterpretation with an ancestry-aware and age-sensitive knowledge graph (PhenoKG). Unlike typical phenotype-based gene prioritization tools such as Phen2Gene, we will build this knowledge graph by extending our previous efforts and extracting phenotype-genotype relations from the EHR as well as from the literature. The knowledge graph will support querying, extraction, and inference of ancestry-aware and age-sensitive phenotype-genotype relationships.
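The idea of comparing longitudinal phenotypes to decide who should be reanalyzed can be illustrated with a minimal sketch. It is not the project's pipeline: the `Snapshot` type, the `should_reanalyze` rule, its `min_new_terms` threshold, and the dates are all hypothetical, and the example assumes phenotypes have already been extracted as HPO term IDs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """HPO term IDs extracted for one patient up to a given date (hypothetical type)."""
    date: str             # ISO date of the snapshot, made up for illustration
    hpo_terms: frozenset  # HPO IDs such as "HP:0001250" (Seizure)


def new_phenotypes(earlier: Snapshot, later: Snapshot) -> frozenset:
    """Terms present in the later snapshot but absent from the earlier one."""
    return later.hpo_terms - earlier.hpo_terms


def should_reanalyze(earlier, later, terms_with_new_evidence, min_new_terms=1):
    """Hypothetical trigger rule: flag the patient if enough new phenotypes
    appeared, or if any current phenotype is linked to newly published
    variant or gene evidence."""
    gained = new_phenotypes(earlier, later)
    matched = later.hpo_terms & frozenset(terms_with_new_evidence)
    return len(gained) >= min_new_terms or bool(matched)


# Toy patient: seizures and global developmental delay at baseline,
# microcephaly (HP:0000252) documented at follow-up.
baseline = Snapshot("2022-01-15", frozenset({"HP:0001250", "HP:0001263"}))
followup = Snapshot(
    "2023-09-30", frozenset({"HP:0001250", "HP:0001263", "HP:0000252"})
)

print(sorted(new_phenotypes(baseline, followup)))
print(should_reanalyze(baseline, followup, terms_with_new_evidence=set()))
```

A real trigger would likely also weigh term specificity, patient age, and ancestry, which this sketch deliberately leaves out.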
Using a multi-layer random-walk integrative network approach, we will incorporate this heterogeneous knowledge graph into a phenotype-driven gene and variant prioritization algorithm for continuous genomic reanalysis across diverse populations. With these methodological developments, we will implement a routine reanalysis informatics pipeline at two academic institutions, Columbia University Irving Medical Center (CUIMC) and Children’s Hospital of Philadelphia (CHOP). We will evaluate the improvement in diagnostic yield across a diverse set of clinical exome/genome sequencing data over a 3-year period. We will also assess how our approach to fair phenotyping and continuous variant reinterpretation can reduce genomic health disparities for underserved and underrepresented populations. Ultimately, these methods will enable informatics-driven, efficient, scalable, and fair genomic diagnostics through continuous genomic variant reinterpretation.
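To make the random-walk idea concrete, the following is a minimal sketch of a random walk with restart over a toy graph that mixes a phenotype layer and a gene layer. It is an illustration under stated assumptions, not the project's algorithm: the node names (`P1`..`P3`, `G1`..`G3`), the edges, the restart probability, and the simplification of collapsing all layers into one weighted graph are hypothetical.

```python
def add_edge(adj, u, v, weight=1.0):
    """Add an undirected weighted edge to a dict-of-dicts adjacency map."""
    adj.setdefault(u, {})[v] = weight
    adj.setdefault(v, {})[u] = weight


def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Return visiting probabilities for a walker that, at each step,
    jumps back to a seed node with probability `restart` and otherwise
    follows an edge chosen in proportion to its weight."""
    nodes = sorted(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    total_weight = {n: sum(adj[n].values()) for n in nodes}
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            if total_weight[u] == 0:
                # Node without edges: send its probability mass back to the seeds.
                for n in nodes:
                    nxt[n] += (1 - restart) * p[u] * p0[n]
                continue
            for v, w in adj[u].items():
                nxt[v] += (1 - restart) * p[u] * w / total_weight[u]
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Toy heterogeneous graph (all names and edges are placeholders).
graph = {}
add_edge(graph, "P1", "P2")  # phenotype layer
add_edge(graph, "P2", "P3")  # phenotype layer
add_edge(graph, "G1", "G2")  # gene layer
add_edge(graph, "P1", "G1")  # cross-layer phenotype-gene link
add_edge(graph, "P3", "G3")  # cross-layer phenotype-gene link

# Seed the walk with the patient's observed phenotype and rank the genes.
scores = random_walk_with_restart(graph, seeds={"P1"})
gene_ranking = sorted(
    (n for n in scores if n.startswith("G")), key=scores.get, reverse=True
)
print(gene_ranking)
```

Genes closer to the seed phenotypes in the graph receive higher scores. A genuinely multi-layer formulation would typically keep separate transition matrices per layer and add explicit inter-layer jump probabilities.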
Status: Finished
Effective start/end date: 7/1/23 - 4/30/24

ASJC Scopus Subject Areas

  • Genetics
  • Molecular Biology
