Project Details
Description
High-dimensional non-Gaussian multivariate data represents a fundamental challenge in modern statistics and machine learning. While linear-Gaussian multivariate analysis has been well-established for decades, modern big data sets across all branches of science and engineering typically display various types of non-linear, non-Gaussian, and/or non-stationary structure. This project will develop new mathematical and computational tools that can extract information and structure from such data.

Objective: The focus of the proposed work is to develop new methods that can deal with non-Gaussian and non-stationary data and new methods for efficient inference.

Approach: The PI will address the following three aims:

Aim 1: Generalized factor analysis. He will develop new generalized factor analysis methods that can handle a broad variety of non-Gaussian, non-stationary data. These methods are based on a convex optimization framework that allows for efficient computation even in high-dimensional problems.

Aim 2: Exact Hamiltonian Monte Carlo methods. He will develop new Monte Carlo methods, based on an exact Hamiltonian Monte Carlo framework that his group has recently developed, to sample from and explore the structure of high-dimensional models with mixed discrete and continuous components.

Aim 3: New spike-and-slab models and methods. He will combine these approaches to develop new methods for performing inference in "spike-and-slab" Bayesian models for sparse regression and factor analysis.

Overall Merit and ONR Mission/Relevance: The proposed research will markedly enhance the realism, flexibility, and computational speed of models used for processing of complex information sources and for detection/classification of sparse objects of interest corrupted by non-Gaussian noise or with non-stationary or non-linear dynamics, and thus will improve the performance of various Naval information processing systems and decision making processes.
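For readers unfamiliar with the term in Aim 3, a "spike-and-slab" prior on a regression coefficient is commonly written as a mixture of a point mass at zero (the spike) and a diffuse Gaussian (the slab). The symbols below (the coefficients, the inclusion indicators, the slab variance, and the inclusion probability) are standard textbook notation assumed here for illustration; the project description does not specify the exact form used.

```latex
\beta_j \mid \gamma_j \;\sim\; \gamma_j\,\mathcal{N}(0,\tau^2) \;+\; (1-\gamma_j)\,\delta_0,
\qquad
\gamma_j \;\sim\; \mathrm{Bernoulli}(\pi),
\qquad
j = 1,\dots,p .
```

The binary indicators \gamma_j make the posterior a mix of discrete and continuous components, which is the setting Aim 2 targets.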
Progress: Latent variable time-series models are among the most heavily used tools from machine learning and applied statistics. These models have the advantage of learning latent structure both from noisy observations and from the temporal ordering in the data, where it is assumed that meaningful correlation structure exists across time. A few highly structured models, such as the linear dynamical system with linear-Gaussian observations, have closed-form inference procedures (e.g., the Kalman filter), but this case is an exception to the general rule that exact posterior inference in more complex generative models is intractable. Consequently, much work in time-series modeling focuses on approximate inference procedures for one particular class of models. The PI has extended recent developments in stochastic variational inference to develop a 'black-box' approximate inference technique for latent variable models with latent dynamical structure. He developed a structured Gaussian variational approximate posterior that carries the same intuition as the standard Kalman filter-smoother but, importantly, permits the same inference approach to be used to approximate the posterior of much more general, nonlinear latent variable generative models. The algorithm recovers accurate estimates in the case of basic models with closed-form posteriors and, more interestingly, performs well in comparison to variational approaches that were designed in a bespoke fashion for specific non-conjugate models.
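The Kalman-smoother intuition behind a structured Gaussian posterior can be illustrated with a small sketch. It assumes a scalar linear-Gaussian state-space model, for which the exact posterior over the latent path is Gaussian with a tridiagonal precision matrix; the project description does not state the form of the PI's approximation, so this structure, the function names `ar1_posterior` and `sample_posterior`, and all parameter values are illustrative assumptions rather than the PI's method.

```python
import numpy as np


def ar1_posterior(x, a=0.9, q=0.1, q0=1.0, r=0.5):
    """Exact posterior p(z | x) for the assumed toy model
         z_1 ~ N(0, q0),  z_t = a * z_{t-1} + N(0, q),  x_t = z_t + N(0, r).
    Returns the posterior mean and the (tridiagonal) precision matrix."""
    x = np.asarray(x, dtype=float)
    T = len(x)
    main = np.zeros(T)
    main[0] += 1.0 / q0        # initial-state term
    main[1:] += 1.0 / q        # each transition, "current state" side
    main[:-1] += a**2 / q      # each transition, "previous state" side
    off = np.full(T - 1, -a / q)
    # Prior precision is tridiagonal; the likelihood only adds to the diagonal.
    prec = np.diag(main) + np.diag(off, 1) + np.diag(off, -1) + np.eye(T) / r
    mean = np.linalg.solve(prec, x / r)
    return mean, prec


def sample_posterior(mean, prec, n, rng):
    """Draw n latent paths from N(mean, prec^{-1}) using a Cholesky factor
    of the precision: if prec = L L^T, then L^{-T} eps has covariance prec^{-1}."""
    L = np.linalg.cholesky(prec)
    eps = rng.standard_normal((len(mean), n))
    return mean[:, None] + np.linalg.solve(L.T, eps)


x = np.array([0.3, -0.1, 0.8, 1.2, 0.5])
mean, prec = ar1_posterior(x)
paths = sample_posterior(mean, prec, n=4, rng=np.random.default_rng(0))
```

Dense linear algebra is used here for brevity. Because the precision is banded, a banded Cholesky solver would bring the cost down to linear in the number of time steps, matching the cost of a Kalman smoother pass. In a variational setting, the mean and the tridiagonal entries would be free parameters (or outputs of a learned function of the observations) rather than fixed by the model as in this sketch.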
Status | Finished |
---|---|
Effective start/end date | 1/19/16 → 1/19/16 |
Funding
- Office of Naval Research: US$183,046.00
ASJC Scopus Subject Areas
- Statistics and Probability
- Energy(all)
- Engineering(all)