The group visits the Center for High-Throughput Computing

UW-Plasma recently had the chance to visit the Center for High-Throughput Computing (CHTC)! Located in the basement of UW-Madison’s Discovery Building, the center’s 30,000 CPU cores are the backbone of hundreds of research projects both on and off campus. The center manages this as one of the world’s leaders in high-throughput computing: running multitudes of jobs at the same time, each small enough to fit on a single GPU.

CHTC sprang out of graduate student work in the 1990s called the Condor project. Back then, researchers borrowed workstations from the school’s computer science faculty, running jobs as soon as the faculty were done with their computers for the day. It was as if the researchers were scavenging birds circling above their food until the right moment, hence the name Condor. Eventually, the project was given the hundreds of thousands of dollars needed to create a proper center. Early work focused on developing the HTCondor software suite, and construction of the center then progressed rapidly thanks to interest from CERN. In the 2010s, it evolved from the work of a research group into a proper campus center for scientific computing. The center has since become so central that it has moved from grant-based funding to being funded directly by the university. Today, the center offers invaluable speed to researchers across the world and tests hardware and software for weeks so that scientists at top laboratories don’t have to deal with the bugs themselves.

The center itself is quite the spectacle. Its primary body consists of rows of rectangular cabinets called racks. These racks contain everything from wiring to temperature control (cooling fans and heat vents) to the actual CPUs themselves. There are “whips” all throughout the ceiling to supply electricity to the center, and along the back of the room is a wall of batteries the size of a small school bus which can power the whole facility in just 5 minutes. It is also extremely loud inside due to the many fans cooling the hardware, so anyone who enters is required to wear industrial-grade ear protection.

In terms of actual use, the center delivers its efficient computing through novel networking using InfiniBand (think Ethernet, but more focused on scientific computing). Researchers everywhere have easy access to the HTCondor software suite, which anybody can use to connect the power of the center to their own Raspberry Pis and GPUs. More specifically for UW-Plasma, JAX’s complete differentiability pairs really well with high-throughput computing, allowing bigger jobs to be run many times over on Condor’s CPUs. Other applications include parameter sweeps for optimizing models and forward and backward passes for future machine-learning uses. If any researcher in the group is interested in using Condor’s resources, any UW-Madison student or faculty member can request free training with a facilitator.
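
To give a rough feel for what a parameter sweep looks like from the job's side, here is a minimal sketch in plain Python. It assumes the scheduler hands each job a distinct integer (HTCondor can do this, for instance, through the $(Process) macro in a submit file), and the job turns that integer into one point on a grid. The parameter names, the grid values, and the toy_model function are all made up for illustration; they are not the group's actual simulation or CHTC's recommended workflow.

```python
import itertools
import sys

# Hypothetical sweep axes; the names and values are illustrative only.
DENSITIES = [1e18, 5e18, 1e19]
TEMPERATURES_EV = [10.0, 50.0, 100.0, 500.0]


def params_for_job(job_index):
    """Map a 0-based job index to one (density, temperature) grid point."""
    grid = list(itertools.product(DENSITIES, TEMPERATURES_EV))
    if not 0 <= job_index < len(grid):
        raise IndexError(f"job index {job_index} outside 0..{len(grid) - 1}")
    return grid[job_index]


def toy_model(density, temperature_ev):
    """Stand-in for a real simulation: returns a simple product."""
    return density * temperature_ev


def main(argv):
    # Use the first argument as the job index when it is a plain integer;
    # otherwise fall back to job 0 so the script still runs by hand.
    index = int(argv[1]) if len(argv) > 1 and argv[1].isdigit() else 0
    density, temperature_ev = params_for_job(index)
    result = toy_model(density, temperature_ev)
    print(index, density, temperature_ev, result)
    return result


if __name__ == "__main__":
    main(sys.argv)
```

With this layout, twelve independent jobs (indices 0 through 11) would cover the whole grid, and each one is small enough to run on its own without talking to the others, which is the kind of workload high-throughput computing is built for.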

All in all, it was a very cool trip for everyone involved, and the group looks forward to future collaboration with the center on plasma physics simulations.

 

Further reading/sources:

https://uw-madison-datascience.github.io/ML-X-Nexus/Toolbox/Compute/CHTC.html 

https://htcondor.org/ 

https://researchdata.wisc.edu/uncategorized/chtc-interview/ 

https://research.cs.wisc.edu/htcondor/doc/condorgrid.pdf 

https://www.fs.com/blog/infiniband-vs-ethernet-what-are-they-2740.html