Generative and Discriminative Methods for Gene Finding and Functional Annotation

  • Noble, William (PI)

Project: Research project

Project Details

Description

The work in this project is divided into two phases, corresponding to the two tasks of gene prediction and functional annotation. In the first phase, a gene-finding system is developed and applied. This system is designed to be scalable and extensible with respect to the gene features it models, the machine learning algorithms it employs, and the range of experimental data from which it learns. The project first validates the system by applying it to the complete C. elegans genome, and then retrains the system for the more difficult task of recognizing genes in human DNA. The second phase consists of two parts. First, the software framework used for the gene-finding system from phase one is generalized to model families of related proteins. Second, in order to learn from non-sequential data, the project develops functional classification techniques using a discriminative learning method called support vector machines (SVMs). The statistics calculated by the sequence-based modeling system function as one set of features used by the SVM system. Additional features come from DNA microarray experiments, the upstream promoter region of each gene, phylogenetic profiles, and similarity scores to known protein families.

Status: Finished
Effective start/end date: 8/15/00 to 10/31/02

Funding

  • National Science Foundation: US$412,195.00

ASJC Scopus Subject Areas

  • Artificial Intelligence
  • Biochemistry, Genetics and Molecular Biology (all)
