During the CERN pilot beams at the end of October, CMS reconstructed LHC collision data online with Graphics Processing Units (GPUs) for the first time.
At CMS, the High Level Trigger (HLT) is responsible for analysing in real time up to 100’000 collision events per second, and selecting the most highly energetic, rare or otherwise interesting ones for permanent storage and offline analysis.
During Run-1 and Run-2 of the LHC, the HLT ran on a traditional computer farm, comprising over 30’000 Central Processing Unit (CPU) cores in 2018. However, as the studies for the Phase-2 upgrade of CMS¹ have shown, the use of GPUs will be instrumental in keeping the cost, size and power consumption of the HLT farm under control at the highest LHC luminosity. To gain experience with a heterogeneous farm and with the use of GPUs in a production environment, CMS will equip the whole HLT with GPUs from the start of Run-3: the new farm will comprise 200 nodes, each equipped with two AMD Milan 64-core CPUs and two NVIDIA Tesla T4 GPUs - for a total of 25’600 CPU cores and 400 GPUs.
A candidate HLT node for Run-3, equipped with two AMD Milan 64-core CPUs and two NVIDIA Tesla T4 GPUs. Credits: Felice Pantaleo
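The farm totals quoted above follow directly from the per-node configuration. The short Python sketch below reproduces the arithmetic; the `NodeSpec` class and the `farm_totals` function are names introduced here for illustration only and are not part of any CMS software.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class NodeSpec:
    """Hardware of a single HLT node (illustrative helper, not CMS code)."""
    cpus: int            # CPU sockets per node
    cores_per_cpu: int   # physical cores per CPU
    gpus: int            # GPUs per node


def farm_totals(nodes: int, spec: NodeSpec) -> Tuple[int, int]:
    """Return (total CPU cores, total GPUs) for a farm of identical nodes."""
    total_cores = nodes * spec.cpus * spec.cores_per_cpu
    total_gpus = nodes * spec.gpus
    return total_cores, total_gpus


# Run-3 node as described in the text: two 64-core CPUs and two GPUs.
run3_node = NodeSpec(cpus=2, cores_per_cpu=64, gpus=2)
cores, gpus = farm_totals(200, run3_node)
print(cores, gpus)  # 25600 400
```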
First-hand experience is not the only goal, however. The additional computing power provided by these GPUs will allow CMS not only to improve the quality of the online reconstruction² but also to extend its physics programme, running the online data scouting analysis³ at a much higher rate than before. And, because the online and offline reconstruction in CMS share the same implementation, rewriting the online reconstruction to run on GPUs also makes it available offline, allowing CMS to take advantage of the computing power of the latest supercomputers and High Performance Computing (HPC) centres.
Today about 30% of the HLT processing can be offloaded to GPUs: the local reconstruction of the calorimeters, the local reconstruction of the pixel tracker, and the pixel-only track and vertex reconstruction. The number of algorithms that can run on GPUs will grow during Run-3, as other components are already under development. At the High Luminosity LHC the goal will be to offload at least 50% of the HLT processing to computing accelerators in Run-4 and 80% in Run-5.
To ensure the reproducibility of physics results on any machine, the HLT can run seamlessly with and without GPUs: algorithms are automatically offloaded to a GPU when one is available, and otherwise fall back to running on the CPU. This configuration was tested in real time with stable-beam collisions during the LHC pilot beams at the end of October: while the rest of the farm was running on CPUs, five of the legacy nodes from 2018 were each equipped with a single T4 GPU. On these machines, the HLT ran two copies of the reconstruction of energy deposits in the calorimeters - the original one on CPUs, and the new one on GPUs - and used their output to reconstruct the high energy jets produced by the proton-proton collisions. The number of events with jets above a given threshold was then compared between the two copies, and the results were found to be identical.
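The "offload when a GPU is available, otherwise fall back to the CPU" behaviour described above can be illustrated with a minimal Python sketch. This is not the CMS implementation: the class `OffloadableModule`, the two calibration functions and the gain value are hypothetical, and the "GPU" function simply repeats the CPU arithmetic on the host so that the example runs on any machine.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Algorithm = Callable[[Sequence[float]], List[float]]

GAIN = 0.5  # hypothetical calibration constant, for illustration only


def calibrate_cpu(raw: Sequence[float]) -> List[float]:
    """Reference CPU implementation of a toy calibration step."""
    return [GAIN * x for x in raw]


def calibrate_gpu(raw: Sequence[float]) -> List[float]:
    """Stand-in for a GPU implementation.

    A real version would launch a device kernel; here the same
    arithmetic runs on the host so the sketch stays self-contained.
    """
    return [GAIN * x for x in raw]


class OffloadableModule:
    """Run the GPU implementation if possible, otherwise the CPU one."""

    def __init__(self, cpu_impl: Algorithm, gpu_impl: Optional[Algorithm] = None):
        self.cpu_impl = cpu_impl
        self.gpu_impl = gpu_impl

    def run(self, data: Sequence[float], gpu_available: bool) -> Tuple[str, List[float]]:
        # Offload only when a GPU is present AND a GPU version exists.
        if gpu_available and self.gpu_impl is not None:
            return "gpu", self.gpu_impl(data)
        # Fallback path: same interface, same output format.
        return "cpu", self.cpu_impl(data)


module = OffloadableModule(calibrate_cpu, calibrate_gpu)
print(module.run([2.0, 4.0], gpu_available=True))   # ('gpu', [1.0, 2.0])
print(module.run([2.0, 4.0], gpu_available=False))  # ('cpu', [1.0, 2.0])
```

Because both paths share one interface and return the same kind of output, the rest of the processing chain does not need to know where a given step ran, which is what makes a direct CPU-versus-GPU comparison possible.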
Number of events per second with at least one calorimetric jet reconstructed on GPU (blue) and CPU (orange) above a fixed threshold. The two lines overlap completely.
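The comparison behind this plot can be sketched in a few lines of Python. The function names, the threshold and the jet energies below are made-up toy values chosen for illustration, not CMS data or CMS code, and the sketch counts events in a fixed batch rather than per second.

```python
from typing import Sequence


def has_jet_above(jet_energies: Sequence[float], threshold: float) -> bool:
    """True if at least one jet in the event exceeds the threshold."""
    return any(energy > threshold for energy in jet_energies)


def count_events_with_jet(events: Sequence[Sequence[float]], threshold: float) -> int:
    """Count events containing at least one jet above the threshold."""
    return sum(1 for jets in events if has_jet_above(jets, threshold))


# Toy jet energies per event (arbitrary units), as they might come out of
# the CPU-based and the GPU-based copies of the reconstruction.
jets_from_cpu = [[12.0, 45.5], [8.0], [], [60.2, 31.0]]
jets_from_gpu = [[12.0, 45.5], [8.0], [], [60.2, 31.0]]

THRESHOLD = 40.0  # hypothetical threshold

n_cpu = count_events_with_jet(jets_from_cpu, THRESHOLD)
n_gpu = count_events_with_jet(jets_from_gpu, THRESHOLD)
print(n_cpu, n_gpu, n_cpu == n_gpu)  # 2 2 True
```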
This test sets the stage for the final commissioning of the HLT reconstruction running on GPUs, which will take place in 2022 during the startup of the LHC at the beginning of Run-3. Afterwards, the CMS HLT will be fully validated and capable of running on traditional machines as well as taking advantage of GPUs.
¹ The Phase-2 Upgrade of the CMS Data Acquisition and High Level Trigger, CMS-TDR-022, CERN-LHCC-2021-007, https://cds.cern.ch/record/2759072
² Heterogeneous Reconstruction of Tracks and Primary Vertices With the CMS Pixel Tracker, Front. Big Data, 21 December 2020, https://doi.org/10.3389/fdata.2020.601728
³ CMS data scouting and a search for low-mass dijet resonances, CERN Courier, 13 November 2015, https://cerncourier.com/a/cms-data-scouting-and-a-search-for-low-mass-dijet-resonances/