CI-ADDO-NEW: Stan, Scalable Software for Bayesian Modeling

  • Gelman, Andrew (PI)
  • Carpenter, Bob (CoPI)

Project: Research project

Project Details

Description

This award is to design, code, document, test, disseminate, and maintain Stan, an extensible open-source software framework and compiler for efficient and scalable Bayesian statistical modeling.

Stan is an extensible, open-source, cross-platform software framework for developing Bayesian statistical models. The first step in Bayesian modeling is setting up a full probability model for all quantities of interest. Stan facilitates this process by providing an expressive and extensible domain-specific programming language for specifying probabilistic models. By compiling a model specification into executable code, Stan fully automates the second step of Bayesian inference, calculating the probabilities of unobserved quantities, such as model parameters and future observations, conditional on observed data. The third step involves evaluating the fit of the model to the data and its predictions for unseen data. When the model is easy to encode and inferences are fast and automatic to compute, it is easy to iterate the specification, fit and evaluation steps in order to refine the scientific model.
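The three steps above can be illustrated with a toy example. The sketch below is not Stan code and does not use Stan's sampler: it assumes a conjugate beta-binomial model, chosen only because its posterior has a closed form, and the data values and function names (`update`, `predictive_pmf`) are hypothetical.

```python
import math

def log_beta(a, b):
    """Log of the Beta function, computed via log-gamma."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def update(prior_a, prior_b, successes, trials):
    """Step 2 (inference): Beta(a, b) prior plus binomial data gives a Beta posterior."""
    return prior_a + successes, prior_b + (trials - successes)

def predictive_pmf(k, n, a, b):
    """Step 3 (evaluation): beta-binomial probability of k successes in n new trials."""
    return math.comb(n, k) * math.exp(log_beta(k + a, n - k + b) - log_beta(a, b))

# Step 1 (model): theta ~ Beta(1, 1), y ~ Binomial(n, theta). The data are made up.
prior_a, prior_b = 1.0, 1.0
y, n = 7, 10

post_a, post_b = update(prior_a, prior_b, y, n)
post_mean = post_a / (post_a + post_b)

# Probability that replicated data would show at least as many successes as observed.
tail_prob = sum(predictive_pmf(k, n, post_a, post_b) for k in range(y, n + 1))
print(post_a, post_b, round(post_mean, 3), round(tail_prob, 3))
```

For models without a closed-form posterior, the inference step is the part a tool such as Stan would automate; the specification and evaluation steps keep the same shape.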

Stan improves on the existing state of the art in both algorithmic and implementation details. Unlike its predecessors, which interpret models on the fly, Stan compiles models to C++ code, which dramatically improves both scalability and efficiency. Stan provides a full algorithmic differentiation library for the functions required for statistical modeling. This method applies the chain rule from calculus to the program computing the probability function in order to calculate derivatives efficiently and accurately, at a cost that is a small multiple of the time taken to compute the function itself, independently of dimensionality. This allows Stan to fully automate the model fitting stage given only a specification of the probability function in Stan's modeling language.
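To make the chain-rule idea concrete, here is a minimal sketch of reverse-mode algorithmic differentiation applied to a normal log density. It is an illustration under stated assumptions, not Stan's C++ library: the `Var` class, `backward`, and `normal_log_density` are hypothetical names, and only the handful of operations this one function needs are implemented.

```python
import math

class Var:
    """A scalar node in a computation graph (hypothetical helper, not Stan's API)."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # pairs of (parent Var, local partial derivative)
        self.grad = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))
    __radd__ = __add__

    def __sub__(self, other):
        other = _lift(other)
        return Var(self.value - other.value, ((self, 1.0), (other, -1.0)))

    def __rsub__(self, other):
        return _lift(other) - self

    def __mul__(self, other):
        other = _lift(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))
    __rmul__ = __mul__

    def __truediv__(self, other):
        other = _lift(other)
        return Var(self.value / other.value,
                   ((self, 1.0 / other.value),
                    (other, -self.value / other.value ** 2)))

def _lift(x):
    """Wrap plain numbers as constant nodes."""
    return x if isinstance(x, Var) else Var(float(x))

def log(x):
    return Var(math.log(x.value), ((x, 1.0 / x.value),))

def backward(output):
    """One reverse sweep: propagate d(output)/d(node) to every node via the chain rule."""
    order, seen = [], set()
    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)  # parents are appended before their children
    visit(output)
    output.grad = 1.0
    for node in reversed(order):  # children are processed before their parents
        for parent, local in node.parents:
            parent.grad += local * node.grad

def normal_log_density(y, mu, sigma):
    """log Normal(y | mu, sigma), written with overloaded operators."""
    z = (y - mu) / sigma
    return -0.5 * (z * z) - log(sigma) - 0.5 * math.log(2 * math.pi)

mu, sigma = Var(0.0), Var(2.0)
lp = normal_log_density(1.0, mu, sigma)
backward(lp)
print(lp.value, mu.grad, sigma.grad)
```

The single call to `backward` fills in the derivative with respect to every input at once, which is why the cost stays a small multiple of evaluating the function regardless of how many parameters there are.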

To maximize Stan's accessibility to the scientific community, it is being coded using standards-compliant C++, so that it will run under Windows, Macintosh, and Unix/Linux. To make running Stan even easier, it is callable from R, MATLAB, and Python, the three most popular platforms for numerical analysis, including exploration and plotting.

Status: Finished
Effective start/end date: 6/1/12 to 5/31/15

Funding

  • National Science Foundation: US$499,637.00

ASJC Scopus Subject Areas

  • Statistics and Probability
  • Computer Networks and Communications
