Nonparametric Testing: Efficiency and Distribution-freeness via Optimal Transportation

  • Sen, Bodhisattva (PI)

Project: Research project

Project Details

Description

Statistical hypothesis testing is the formal setup a statistician employs to decide between two competing hypotheses about the underlying data-generating mechanism. Nonparametric methods have become increasingly popular in the theory and practice of statistics, primarily because of the greater flexibility they offer over parametric models. This research project investigates problems in nonparametric hypothesis testing for multi-dimensional data. Modern computational capabilities and the expanded data sets produced by modern scientific equipment have greatly increased the scope of such flexible statistical inference procedures. The investigator will develop a framework for "distribution-free" inference with multivariate data that generalizes many well-known and popular statistical ideas used for analyzing univariate data. On the collaborative front, the investigator will continue interdisciplinary research in astronomy. Further, some of these research problems will form the doctoral dissertation of a current PhD student at Columbia. The investigator also plans to continue the tradition of mentoring undergraduate summer interns.

The main thrust of this research is to study distribution-free methods for multivariate and Hilbert space-valued data, based on the theory of optimal transport, a branch of mathematics that has recently received much attention in applied mathematics, probability, and machine learning. These methods generalize the classical univariate rank-based methods to multivariate data. In the second part of the proposal, the investigator will study the asymptotic relative efficiency (ARE) of nonparametric tests and provide a characterization of ARE when the underlying test statistics converge weakly, under the null hypothesis, to an infinite mixture of chi-square distributions. This framework includes many interesting examples that arise in practice, including two-sample testing, independence testing, testing multivariate symmetry, and inference on directional data. The investigator will also develop a theoretical framework for estimating the ARE in this setting.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Status: Active
Effective start/end date: 7/1/23 to 6/30/26

Funding

  • National Science Foundation: US$300,000.00

ASJC Scopus Subject Areas

  • Statistics, Probability and Uncertainty
  • Statistics and Probability
  • Mathematics (all)
  • Physics and Astronomy (all)
