Standard Research Grant: Toxic Docs 2.0 Research Infrastructure Project

Project: Research project

Project Details

Description

The past two decades have seen an explosion of information technology innovations that allow for the large-scale digitization of documents. Toxic Docs leverages these innovations to provide users with access to the world's largest database of material on toxic substances (lead, asbestos, dioxin, PCBs, polyvinyl chloride, silica, and more). The material includes internal memos, unpublished scientific studies, boardroom minutes, and public relations documents. Almost all of it was previously inaccessible; it surfaced in high-stakes litigation and was then made public through our database. Users include academic researchers, environmental health scientists, journalists, and policy analysts. The database is open to all and free of charge.

This iteration of Toxic Docs builds on the previous one in several ways. First, it continues expanding the collection: the project is processing millions of additional pages of documents on asbestos and talc; PCBs; and silicosis, and it regularly receives more material. Second, a companion site, Toxic Tools, will offer users ways of probing the material beyond full-text search. These include, among others, a named entity recognition tool that pries out common names, organizations, and places; a geography parser that fishes out documents related to particular locales; and a network guesser that infers relationships among people mentioned in documents. Third, a new Freedom of Information Act (FOIA) wing will use responses to public records requests to build our collections. Finally, an Application Programming Interface (API) will give users access to plain-text versions of our data. Toxic Docs provides the raw data necessary to construct fresh narratives about the new world of 21st-century environmental health risk. Its openness allows users to make their own judgments about how these substances came to circulate so widely in the environment.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Status: Active
Effective start/end date: 4/1/22 to 3/31/25

Funding

  • National Science Foundation: US$115,676.00

ASJC Scopus Subject Areas

  • Health, Toxicology and Mutagenesis
  • Social Sciences (all)
  • Economics, Econometrics and Finance (all)
