Project Details
Description
PROJECT SUMMARY/ABSTRACT
This K08 application proposes to establish robust phenotyping pipelines for detection of early Collagen Type
IV-Associated Nephropathy (COL4A-AN) using electronic health records (EHR). The study aims to overcome
diagnostic challenges associated with the diverse manifestations of COL4A-AN, enabling timely, personalized
interventions crucial for delaying chronic kidney disease (CKD) progression. Utilizing EHR data presents an
opportunity to support CKD subtype identification but faces hurdles such as data heterogeneity and semantic
gaps across health systems. The proposed solution leverages well-established, open-source natural
language processing (NLP) systems to convert unstructured clinical narratives into structured data, aiming to
extract nuanced phenotypic descriptions for early COL4A-AN patient identification. Moreover, the proposed
experiments use the Unified Medical Language System (UMLS) and the Observational Medical Outcomes
Partnership Common Data Model (OMOP-CDM v5.4) to standardize terminology and support data
interoperability, enabling broader research application.
The project endeavors to create precise, transferable EHR prediction models for diverse CKD subtypes, laying
the groundwork for automated decision support tools, with a broad impact on COL4A-AN patients, those with
genetic CKD subtypes, and the precision nephrology field. Through two experiments, it aims to develop highly
efficient early disease prediction models for COL4A-AN by mining unstructured text and leveraging nuanced
phenotype data from clinicians' narratives, distinct from ICD code-based models. The primary goal of this study
is to establish scalable phenotyping pipelines to identify patients with genetic CKD subtypes, using NLP and
standardized data models to enhance prediction accuracy and automate decision-support tools. Standardizing
the early COL4A-AN prediction models will facilitate research transparency and foster collaboration across
multiple institutions. This project facilitates the creation of EHR-embedded decision support tools. Moreover, it
lays the groundwork for a future R01 clinical trial application to investigate the diagnostic yield of genetic testing
among patients identified as highly likely to have COL4A-AN and subsequently referred for testing. These efforts
set the stage for future studies examining how early diagnosis influences the progression of CKD.
The comprehensive training plan is designed to provide Dr. Jordan Nestor with NLP expertise for
extracting nuanced phenotypes from unstructured EHR narratives and creating EHR-based prediction models,
and to support her transition to an independent NIH-funded investigator. Dr. Nestor's ultimate objective is to advance
patient care and enhance CKD outcomes through the direct implementation of precision nephrology at the
point of care.
| Field | Value |
|---|---|
| Status | Active |
| Effective start/end date | 9/1/24 → 8/31/25 |
ASJC Scopus Subject Areas
- Genetics
- Nephrology