Collaborative Research: SHF: Medium: Learning Semantics of Code To Automate Software Assurance Tasks

Kaiser, Gail E. (CoPI)
Ray, Baishakhi (PI)

Columbia University

Project

Description

Deep learning has demonstrated great potential for accomplishing software engineering tasks. However, its capabilities are limited for challenging yet very important software assurance tasks such as bug detection, debugging, test input generation, and test suite prioritization. These tasks are hard to formulate as learning problems, in large part because they require modeling program semantics. To the best of our knowledge, even state-of-the-art deep learning models have an insufficient understanding of program semantics. As a result, the models fail to achieve the precision and recall needed for wider deployment, do not generalize well to unseen projects, and are not robust to small perturbations in source code. Training them also takes large amounts of computational resources and data.

In this project, the team of researchers aims to improve the performance, robustness, generalizability, and efficiency of deep learning models for software assurance, and to enable deep learning for complex tasks where it has not yet been used successfully. Solutions will target encoding program semantics into the program representation, combining program analysis, software engineering, and deep learning expertise to develop novel formulations that effectively address software assurance problems via deep learning.

The project has three research thrusts. To learn with abstract semantics, the project will study how to combine static analysis algorithms, and the results of static analysis, with deep learning models. To learn with concrete semantics, the project will study how to use program execution traces to guide deep learning. Finally, the project will investigate how to identify spurious features used by current models and then apply causal learning to discourage models from relying on those features.

Research results, datasets, and tools will be disseminated to the research community, and workshops will be organized to strengthen the research community of deep learning for code.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.


Status	Active
Start date/End date	10/1/23 → 9/30/27

Keywords

Artificial Intelligence
Computer Networks and Communications
Engineering (all)
Electrical and Electronic Engineering
Communication

Access the project

https://www.nsf.gov/awardsearch/showAward?AWD_ID=2313055
