Triggering and Data Acquisition
lucas, Wed, 11/23/2011 - 17:32
When CMS is performing at its peak, about one billion proton-proton interactions will take place every second inside the detector. There is no way that data from all these events could be read out, and even if they could, most would be unlikely to reveal new phenomena; they might be low-energy glancing collisions, for instance, rather than energetic, head-on interactions.
We therefore need a “trigger” that can select the potentially interesting events, such as those which will produce the Higgs particle, and reduce the rate to just a few hundred “events” per second, which can be read out and stored on computer disk for subsequent analysis.
However, with groups of protons colliding 40 million times per second, there are only ever 25 nanoseconds (25 billionths of a second) before the next lot arrives. New waves of particles are being generated before those from the last event have even left the detector! The solution is to store the data in pipelines that can retain and process information from many interactions at the same time. To avoid confusing particles from two different events, the detectors must have very good time resolution, and the signals from the millions of electronic channels must be synchronised so that they can all be identified as being from the same event.
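The pipeline idea can be sketched in a few lines of Python. This is only an illustration of the principle, not the CMS front-end electronics: the `Pipeline` class, its methods and the depth of 128 crossings are assumptions made for the example, while the 40 million crossings per second come from the text above.

```python
from collections import deque

# From the text: bunches cross 40 million times per second.
BUNCH_CROSSING_RATE_HZ = 40_000_000
BUNCH_SPACING_NS = 1e9 / BUNCH_CROSSING_RATE_HZ  # 25 ns between crossings

# Hypothetical pipeline depth, counted in bunch crossings (an assumption).
PIPELINE_DEPTH = 128


class Pipeline:
    """Fixed-depth buffer holding one sample per bunch crossing."""

    def __init__(self, depth):
        self.depth = depth
        # With maxlen set, the oldest sample is dropped when a new one arrives.
        self.cells = deque(maxlen=depth)

    def push(self, crossing_id, sample):
        """Store the sample recorded for one bunch crossing."""
        self.cells.append((crossing_id, sample))

    def read(self, crossing_id):
        """Return the sample for a crossing, or None if it was overwritten."""
        for stored_id, sample in self.cells:
            if stored_id == crossing_id:
                return sample
        return None


def max_decision_time_ns(depth, spacing_ns=BUNCH_SPACING_NS):
    """Time available for a decision before a sample leaves the pipeline."""
    return depth * spacing_ns
```

Tagging every sample with its crossing number is what allows signals from different channels to be matched to the same event later. With the assumed depth of 128, a decision would have to arrive within 128 × 25 ns = 3200 ns.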
Level 1 of the trigger is an extremely fast and wholly automatic process that looks for simple signs of interesting physics, e.g. particles with a large amount of energy or in unusual combinations. This is like a reader simply scanning the headlines of a newspaper to see if anything catches their eye. In this way we select the best 100,000 events or “issues” each second from the billion available. For the next test, the higher-level trigger, we assimilate and synchronise information from different parts of the detector to recreate the entire event (like collating the different pages to form the full newspaper) and send it to a farm of more than 1000 standard computers.
Here the PCs are like speed readers who, with more detailed information, review each event for longer, though still for less than a tenth of a second. They run very complex physics tests to look for specific signatures, for instance matching tracks to hits in the muon chambers, or spotting photons through their high energy but lack of charge. Overall they select 100 events per second and the remaining 99,900 are thrown out. We are left with only the collision events that might teach us something new about physics. Even with the trigger system, CMS still records and analyses several petabytes of data, that is, millions of gigabytes. This is around the same amount of information as in 10,000 copies of the Encyclopaedia Britannica every second, although not all of these data are stored.
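The rates quoted above can be turned into rejection factors with a little arithmetic. The Python sketch below uses only the figures from the text (one billion interactions, 100,000 events after Level 1, 100 events after the higher-level trigger). The function names are illustrative, and the farm estimate assumes exactly 1000 computers each handling one event at a time, which is a simplification.

```python
# Rates taken from the text, in events per second.
INTERACTION_RATE_HZ = 1_000_000_000  # about one billion interactions
L1_OUTPUT_RATE_HZ = 100_000          # events kept by Level 1
HLT_OUTPUT_RATE_HZ = 100             # events kept by the higher-level trigger


def rejection_factor(rate_in_hz, rate_out_hz):
    """How many input events there are for each event that is kept."""
    return rate_in_hz / rate_out_hz


def farm_time_budget_s(n_computers, input_rate_hz):
    """Average time per event if each computer handles one event at a time.

    Assumption for illustration only: real farm nodes may process several
    events in parallel, which would lengthen the budget per event.
    """
    return n_computers / input_rate_hz


l1_rejection = rejection_factor(INTERACTION_RATE_HZ, L1_OUTPUT_RATE_HZ)
hlt_rejection = rejection_factor(L1_OUTPUT_RATE_HZ, HLT_OUTPUT_RATE_HZ)
overall_rejection = rejection_factor(INTERACTION_RATE_HZ, HLT_OUTPUT_RATE_HZ)
budget_s = farm_time_budget_s(1000, L1_OUTPUT_RATE_HZ)
```

Level 1 keeps one event in 10,000 and the higher-level trigger one in 1000, so only one interaction in ten million survives. Under the stated assumption, each computer has about a hundredth of a second per event, consistent with "less than a tenth of a second".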
For a more detailed account of CMS Triggering see:
CMS The TriDAS Project Technical Design Report
The History of CMS Data Acquisition and Triggering
Early History and Key Decisions
The CMS Trigger Group was originally led by Fritz Szoncsó. Wesley Smith was appointed Trigger Project Manager in 1994. One of the earliest decisions made by the group was not to follow existing three-level trigger-system designs.
Some years before this, in 1985, the ZEUS experiment was the first to adopt a three-level trigger system, consisting of pure hardware (L1), mostly custom hardware (L2) and a computer farm (L3). The Tevatron experiments also adopted this architecture.
At CMS, TriDAS (Trigger and Data Acquisition System) Project Manager Sergio Cittolin, TriDAS Institution Board Chair Paris Sphicas and Smith decided not to have a second level trigger. They would take the output of L1 straight to the computer farm for software processing. The main reason for doing this was that the L2 hardware was too restrictive. It was not fully programmable, and was only used at the time because there was no telecom switch that could convey the full L1 output of 100 kHz of 1 MB events to the farm.
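To see why the switch was the sticking point, it helps to work out the bandwidth implied by "100 kHz of 1 MB events". The Python sketch below does the arithmetic; it assumes 1 MB means 10^6 bytes, and the function name is illustrative.

```python
# Figures from the text: Level 1 outputs 100 kHz of 1 MB events.
L1_RATE_HZ = 100_000
EVENT_SIZE_BYTES = 1_000_000  # assumption: 1 MB taken as 10**6 bytes


def required_bandwidth_bytes_per_s(rate_hz, event_size_bytes):
    """Sustained throughput needed to move every event to the farm."""
    return rate_hz * event_size_bytes


bandwidth_bytes_per_s = required_bandwidth_bytes_per_s(L1_RATE_HZ, EVENT_SIZE_BYTES)
bandwidth_gigabytes_per_s = bandwidth_bytes_per_s / 1e9
bandwidth_gigabits_per_s = bandwidth_bytes_per_s * 8 / 1e9
```

The switch would therefore have to sustain about 100 gigabytes per second, or 800 gigabits per second.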
However, Cittolin, studying technology trends and extrapolating world-wide computing network infrastructure, was convinced that a switch with the required bandwidth would be available and affordable by the year 2000. So, when the technical proposal was written in 1994-95, a plan to go from L1 to the computer farm was laid out.
When the technical proposal was presented to the LHC Experiments Committee (LHCC) in 1994, a different approach was adopted to deal with the bandwidth problem, in order to pass the reviews. The proposal said that 10% of the data would go through the switch for processing and would be used as a basis to knock down the rate by a factor of 10, sufficient for the switches of 1994 to handle! This 10% (including main calorimeter information, a summary of tracking information and so on) would then be used to decide whether to keep the event or not. The group hoped, though, that they would not have to face the problem at all.
Prototyping and testing followed until the Technical Design Report (TDR) was published in 1999/2000 and defended in front of the LHCC. By 2002, network switches with the required bandwidth were available and the problem was solved.
Consequences of Two-Level Trigger
One consequence of the decision was that L1 has to be more efficient than in previous systems in order to perform certain tasks traditionally performed by L2.
Prior to the CMS Trigger, triggers were designed to count objects: the number of electrons or muons over a certain threshold and the like, providing a histogram. In CMS, characteristics of objects, including their energy and co-ordinates, would be retained, which required sorting the objects so that only the prime candidates would be selected.
However, sorting takes time, which is why the latency (the trigger processing time) is longer than it would have been for the L1 trigger in a three-level trigger system. Custom chips were built for the calorimeter trigger to perform the sorting fast enough.
The upside to this, Smith says, is that instead of asking how many jets or electrons we have, we can ask, “Is there an electron above an ET threshold back-to-back with a jet above that threshold?” and other such questions. This had never been done before and almost allows CMS to perform offline operations at the L1 stage itself.
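The difference between counting objects and keeping their characteristics can be illustrated in software. The Python sketch below is not the CMS firmware, which runs in custom hardware. The `TriggerObject` fields, the number of candidates kept (4) and the angular tolerance (0.5 radians) are all assumptions chosen for the example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TriggerObject:
    kind: str   # "electron" or "jet"
    et: float   # transverse energy in GeV
    phi: float  # azimuthal angle in radians


def count_above(objects, kind, et_threshold):
    """Counting-style trigger: how many objects of a kind pass the threshold."""
    return sum(1 for o in objects if o.kind == kind and o.et > et_threshold)


def top_candidates(objects, kind, n):
    """Sort objects of one kind by ET and keep only the n highest."""
    selected = [o for o in objects if o.kind == kind]
    return sorted(selected, key=lambda o: o.et, reverse=True)[:n]


def delta_phi(phi_a, phi_b):
    """Smallest angular separation between two azimuthal angles, in [0, pi]."""
    d = abs(phi_a - phi_b) % (2 * math.pi)
    return 2 * math.pi - d if d > math.pi else d


def electron_back_to_back_with_jet(objects, et_threshold,
                                   n_candidates=4, tolerance=0.5):
    """True if a leading electron and a leading jet, both above threshold,
    are separated by nearly pi in azimuth (hypothetical tolerance)."""
    electrons = top_candidates(objects, "electron", n_candidates)
    jets = top_candidates(objects, "jet", n_candidates)
    return any(
        e.et > et_threshold
        and j.et > et_threshold
        and delta_phi(e.phi, j.phi) > math.pi - tolerance
        for e in electrons
        for j in jets
    )
```

A counting trigger can only report that an event contains, say, one electron and two jets. Because the sorted candidates keep their energy and angle, the second function can ask whether a particular electron and a particular jet point in opposite directions.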
Wesley Smith adds: “I think it’s been proven that we really built it almost optimally. It was a long extrapolation but I think in retrospect it’s a fairly successful design. The higher level triggers worked out quite well, are well designed, and showed the flexibility of the system. In the end, if you look back at the history, we did make the right decisions. Although at the time, as always, it was not clear.”