Project Details
Description
PROJECT SUMMARY
The goal of this project is to apply deep-learning algorithms to Electronic Health Records (EHRs) to improve early detection of pancreatic ductal adenocarcinoma (PDAC), a malignancy with high mortality and morbidity. Although numerous risk factors have been identified, PDAC is most often found at later stages, when effective treatments are not feasible or their survival benefit is limited. In this R21, we aim to develop novel structured methodologies for systematically incorporating feature-grouping strategies derived from expert domain knowledge into the training procedure of deep-learning algorithms, with the aim of improving PDAC diagnosis. The overarching hypothesis of this study is that groups of highly correlated variables will combine to form superior and more interpretable predictors than individual clinical variables (current proposal). Furthermore, these new predictors, each represented by a group of related data, will be useful for downstream tasks such as risk factor identification via causal discovery (future research).
The proposed research presents an innovative approach to unifying human and artificial intelligence, using explainable algorithms to build interpretable prediction models, in contrast to conventional deep-learning algorithms, which are not traceable by humans because of their black-box nature. An optimal strategy for creating composite (grouped) variables should maximize both predictive power and human interpretability. We will therefore explore a variety of grouping strategies that rely heavily on human-expert knowledge (e.g. clinical workflows) as well as auto-correlation tests. An effective grouping strategy will allow our prediction model to learn the relative importance of both individual measurements and interpretable groups of measurements in predicting PDAC.
Examples in the literature show that such grouped predictors often have greater predictive power than their individual components, which can be attributed to the mutual information shared within the group. Depending on the characteristics of each group, different types of explainable (attention) neural networks may also be applied to further improve both interpretability and prediction accuracy. We believe that similar methodologies applied to predictive modeling of healthcare data have the potential to fundamentally advance clinical decision-making through improved model interpretability. The results of this proposal will feed into a larger ongoing project that aims to establish new causal relationships among the various risk factors associated with PDAC, using an advanced graph-based approach for building interpretable models. In that future research, our direct application of the causal discoveries will be a program for collecting patient-generated health data (PGHD) for early diagnosis of PDAC.
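The grouping idea described above, in which highly correlated variables are combined into composite predictors, can be illustrated with a minimal sketch. This is not the project's method: the threshold, the single-linkage grouping rule, the z-score averaging, and all variable names and values below are assumptions chosen for illustration, and the sketch groups only positively correlated variables so that averaging does not cancel them out.

```python
from math import sqrt
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no correlation signal
    return cov / sqrt(var_x * var_y)


def group_correlated(columns, threshold=0.8):
    """Group column names whose pairwise r >= threshold (single linkage)."""
    names = list(columns)
    parent = {n: n for n in names}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if pearson(columns[a], columns[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


def composite(columns, group):
    """Average the z-scored members of a group into one composite variable."""
    z_cols = []
    for name in group:
        col = columns[name]
        mu, sd = mean(col), pstdev(col)
        z_cols.append([(v - mu) / sd if sd else 0.0 for v in col])
    return [mean(vals) for vals in zip(*z_cols)]


# Hypothetical toy data: made-up values, not drawn from any real EHR.
toy = {
    "glucose": [90, 110, 130, 150, 170],
    "hba1c": [5.0, 5.6, 6.1, 6.9, 7.4],
    "weight_change": [0.5, -2.0, 1.0, -0.5, 0.0],
}
groups = group_correlated(toy, threshold=0.8)
glycemic = composite(toy, groups[0])
```

In this toy example the two glycemic measurements fall into one group and the third variable stays on its own. A knowledge-driven strategy, such as grouping by clinical workflow, would replace `group_correlated` with a hand-specified mapping from group names to variables.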
Field | Value |
---|---|
Status | Finished |
Effective start/end date | 7/28/21 → 6/30/22 |
Funding
- National Cancer Institute: US$176,874.00
ASJC Scopus Subject Areas
- Cancer Research
- Artificial Intelligence
- Oncology