AF: Medium: The Trace Reconstruction Problem

  • Servedio, Rocco (PI)
  • Chen, Xi (CoPI)

Project: Research project

Project Details

Description

Problems that involve the transmission and recovery of information play an important role in computer science and many related areas. For example, the field of coding theory studies how to cleverly encode a message so that the underlying information can still be recovered even after the message has been subjected to some kind of corruption by noise. While coding theory has been hugely successful and influential, there are many real-world settings in which a noise process affects information 'in the wild' where coding schemes cannot be applied. For example, given a DNA sequence that is subject to mutations or deletions, there is no opportunity to encode the sequence using tools of coding theory because it occurs in nature in a way that is not under human control. So, there is a need for techniques that can reconstruct information that has *not* been cleverly encoded, and yet has been corrupted with different types of noise. One particularly challenging type of noise is 'deletion noise', meaning that some of the characters of the message are deleted with no indication being given as to where the deletions took place. Fortunately, in many scenarios of this sort it is natural to assume that multiple independent copies of the deletion-plagued data are available; for example, in a biological setting one might have many copies of a given DNA sequence where deletions have affected each copy in a different way. This project will study algorithms for performing reconstruction in scenarios such as the above. Another important goal of this project will be dissemination and outreach activities. Planned activities include new advanced courses on reconstruction problems, training graduate students and postdocs through research collaboration, disseminating research results through seminar talks, survey articles and other publications, and continuing ongoing outreach activities aimed at increasing interest in and awareness of theoretical computer science in a broader population.

In more detail, the research team will build on their preliminary results to develop new algorithms for the trace-reconstruction problem, where independent copies of data corrupted by deletions (such a deletion-corrupted data string is known as a 'trace') are available and the challenge is to reconstruct the original uncorrupted data. They will analyze a number of different natural variants of this problem, which are obtained by making different assumptions about the nature of the deletion process (low, intermediate, or high deletion rates, correlated versus uncorrelated deletions, and so on); the type of data that is being transmitted (worst-case data, average-case data, etc); and the desiderata for successful reconstruction (perfect reconstruction versus different notions of approximate reconstruction). The research team will also investigate new algorithmic problems related to deletion noise, including more challenging versions of the problem described above in which the reconstruction algorithm is given a mixture of traces from multiple different source strings and the goal is to reconstruct the entire underlying 'population' of source strings rather than just a single source string.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

StatusActive
Effective start/end date6/1/215/31/25

Funding

  • National Science Foundation: US$583,643.00

ASJC Scopus Subject Areas

  • Computer Science(all)
  • Computer Networks and Communications
  • Electrical and Electronic Engineering
  • Communication

Fingerprint

Explore the research topics touched on by this project. These labels are generated based on the underlying awards/grants. Together they form a unique fingerprint.