High Performance Computing Cluster for Biomedical Research

  • Califano, Andrea A (PI)

Project: Research project

Project Details

Description

The Department of Systems Biology (DSB) at the Columbia University Irving Medical Center (CUIMC) hosts and operates a state-of-the-art high-performance computing environment (HPCE), assembled specifically as a Core resource to serve the needs of biomedical and systems biology research at Columbia University. The HPCE has been instrumental in supporting the research of DSB members, enabling demanding computational investigations that require millions of CPU hours every year and involve the processing of many terabytes of genomic data, and leading to numerous high-impact publications. Beyond its use by DSB faculty, the HPCE is available to the entire CUIMC community and is extensively used by many non-DSB investigators, serving more than 1000 registered users belonging to over 100 research groups across many departments. Further, it is crucial to the operation of the Columbia Genome Center, providing high-performance computing and storage capabilities that are essential for the bioinformatics analysis of next-generation sequencing data. Central to the HPCE operation is a high-performance computing cluster (HPCC) acquired with funding from a 2012 $2M S10 high-end instrumentation grant and put into production in 2014. With 6360 CPU cores, 150 NVIDIA GPUs, and 32 TB of RAM at deployment time, the HPCC has been instrumental in enabling NIH-funded research that has led to many significant discoveries. The cluster has been in service for over eight years and has now reached the end of its service life, with node failures that have occasionally caused the loss of long-running computations. Through this proposal, we seek to retire this aging system and replace it with a new cluster that can accommodate the current and future needs of our investigator community. Further, we aim to optimize the overall cost and performance profile of our architecture by leveraging lessons learned over the past several years.
Our goal is to provide a solution that is both more cost-effective and better aligned with the historical and projected usage patterns of our user base, including the increasing use of GPU-driven parallel computing, particularly in computationally demanding machine-learning applications. To that end, we have worked with Dell engineers to design an HPCC that we are confident will be able to meet the needs of our users for the next five years, comprising a hybrid CPU/GPU architecture with low- and high-memory nodes. The proposed configuration will provide a total of 4,736 CPU cores, 80 NVIDIA A100 GPUs, and 22 TB of memory, and will deliver about 1.6 PFLOPS at peak performance. The new equipment is mission-critical for our continuing ability to support the high-performance computing needs of NIH-funded research at CUIMC. No adequate HPCE alternatives exist at Columbia. Further, the cost of commercial cloud computing options is prohibitive when taking into account the sheer volume of our high-performance computing and data storage needs.
Status: Finished
Effective start/end date: 4/15/23 to 4/14/24

Funding

  • NIH Office of the Director: US$2,000,000.00

ASJC Scopus Subject Areas

  • Biochemistry, Genetics and Molecular Biology (all)
  • Medicine (all)
