Deployment Case Study: High-Luminosity LHC
The High-Luminosity Large Hadron Collider (HL-LHC) is an upgrade to the CERN LHC that will increase luminosity by a factor of ten, enabling deeper exploration of particle physics. This upgrade also drives a massive leap in data requirements, producing 400–600 PB of raw data per year. To handle this scale, the HL-LHC relies on one of the world’s largest distributed compute and storage systems: the Worldwide LHC Computing Grid (WLCG).
Overview
- Location: CERN, Geneva, Switzerland (accelerator complex)
- Experiments: ATLAS, CMS, LHCb, ALICE (major detector collaborations)
- Luminosity: 10× increase compared to LHC baseline
- Data Scale: ~50 PB/year today → ~400–600 PB/year with HL-LHC (see the rate calculation after this list)
- Processing: Worldwide LHC Computing Grid (170+ sites, 42 countries)
- Timeline: HL-LHC commissioning mid-2030s
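To make the Data Scale jump concrete, the back-of-envelope sketch below converts the yearly volumes quoted above into sustained average ingest rates. The arithmetic is simple; the 50/400/600 PB figures are the approximate values from this overview, not official WLCG projections.

```python
# Back-of-envelope: what 400-600 PB/year means as a sustained average rate.
# The 50/400/600 PB figures are the approximate values quoted in this
# overview, not official WLCG projections.

SECONDS_PER_YEAR = 365 * 24 * 3600  # ~3.15e7 s

def average_rate_gbps(petabytes_per_year: float) -> float:
    """Average ingest rate in gigabits per second for a yearly data volume."""
    bits_per_year = petabytes_per_year * 1e15 * 8  # PB -> bytes -> bits
    return bits_per_year / SECONDS_PER_YEAR / 1e9  # -> Gb/s

for pb in (50, 400, 600):
    print(f"{pb:>3} PB/year ~ {average_rate_gbps(pb):.0f} Gb/s sustained average")

# Approximate output:
#  50 PB/year ~ 13 Gb/s sustained average
# 400 PB/year ~ 101 Gb/s sustained average
# 600 PB/year ~ 152 Gb/s sustained average
```

Peak rates during data-taking runs are far higher than these yearly averages, which is part of why the multi-Tbps networking described later is needed.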
Data Pipeline
| Stage | Location | Function | Notes |
| --- | --- | --- | --- |
| Event Capture | Detectors (ATLAS, CMS, ALICE, LHCb) | Petabytes of collision data per run | Trigger systems filter most events in real time |
| Tier-0 Processing | CERN Data Center (Geneva) | First-pass reconstruction of detector events | High-throughput HPC cluster, 100k+ cores |
| Tier-1 Centres | ~13 global Tier-1 sites | Long-term storage, reprocessing | Connected via LHC Optical Private Network |
| Tier-2 Centres | ~160 sites at universities, labs | Simulation, analysis, user access | Federated compute across 42 countries |
| Archival & Science | Global WLCG | Exabyte-scale archive + distributed analysis | Accessible to 10,000+ physicists worldwide |
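The table above describes a fan-out from Tier-0 reconstruction to Tier-1 custodial storage and Tier-2 analysis sites. The minimal Python sketch below models that routing only at the level of which sites hold which datasets; the Site class, site names, and dataset label are hypothetical and stand in for the actual grid data-management middleware, which this sketch does not reproduce.

```python
# Illustrative model of the Tier-0 -> Tier-1 -> Tier-2 fan-out in the table
# above. The Site class, site names, and dataset label are hypothetical;
# this is not the actual WLCG data-management middleware.

from dataclasses import dataclass, field

@dataclass
class Site:
    name: str
    tier: int                       # 0 = CERN, 1 = national lab, 2 = university/regional
    datasets: set = field(default_factory=set)

def distribute(dataset: str, tier0: Site, tier1_sites: list[Site], tier2_sites: list[Site]) -> None:
    """Fan a reconstructed dataset out from Tier-0 to Tier-1 custodial copies,
    then make it visible to Tier-2 analysis sites."""
    tier0.datasets.add(dataset)                      # first-pass output stays at CERN
    for t1 in tier1_sites:                           # custodial replicas for storage/reprocessing
        t1.datasets.add(dataset)
    for t2 in tier2_sites:                           # analysis/simulation sites pull from Tier-1
        t2.datasets.add(f"{dataset} (cached from Tier-1)")

cern = Site("CERN Tier-0", tier=0)
t1s = [Site("Tier-1 site A", tier=1), Site("Tier-1 site B", tier=1)]
t2s = [Site("Tier-2 campus X", tier=2)]

distribute("hypothetical-run/RECO", cern, t1s, t2s)
for site in (cern, *t1s, *t2s):
    print(f"{site.name}: {sorted(site.datasets)}")
```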
HPC, Storage & Networking
- Compute: Exascale-class distributed computing across 170+ sites.
- Storage: 400–600 PB/year of new data, multi-exabyte archive by 2040s.
- Networking: LHC Optical Private Network (LHCOPN) at multi-Tbps capacity (see the transfer-time sketch after this list).
- Federation: WLCG is a precursor to cloud computing, delivering distributed, federated computing at science scale.
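As a rough illustration of why multi-Tbps capacity matters, the sketch below estimates bulk-replication time for a hypothetical 10 PB campaign over links of different nominal speed. The 10 PB size, the link speeds, and the 70% usable-capacity factor are all assumptions for illustration; real WLCG transfers share links and are managed by dedicated transfer services.

```python
# Rough illustration of link capacity vs. bulk-replication time.
# The 10 PB campaign size, the link speeds, and the 70% usable-capacity
# factor are assumptions for illustration only.

def transfer_days(petabytes: float, link_gbps: float, efficiency: float = 0.7) -> float:
    """Days needed to move `petabytes` over a `link_gbps` link when only a
    fraction `efficiency` of nominal capacity is actually usable."""
    bits = petabytes * 1e15 * 8
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 86400

for gbps in (100, 400, 1000):       # 100 Gb/s, 400 Gb/s, 1 Tb/s
    print(f"{gbps:>4} Gb/s link: {transfer_days(10, gbps):4.1f} days for 10 PB")

# Approximate output:
#  100 Gb/s link: 13.2 days for 10 PB
#  400 Gb/s link:  3.3 days for 10 PB
# 1000 Gb/s link:  1.3 days for 10 PB
```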
Partners & Stakeholders
| Partner | Role |
| --- | --- |
| CERN | Tier-0 data center, accelerator operation |
| Worldwide LHC Computing Grid (WLCG) | Global federated compute/storage infrastructure |
| Tier-1 Centres | National labs in U.S., Europe, Asia for storage + reprocessing |
| Tier-2 Centres | Universities and regional HPC centers |
| ATLAS, CMS, ALICE, LHCb Collaborations | Science users driving workload demands |
Key Challenges
- Data Volume: Scaling from ~50 PB/year to 400–600 PB/year.
- Distributed Management: Coordinating 170+ sites across 42 countries.
- Networking: Multi-Tbps optical links must grow further to avoid bottlenecks.
- Energy: HPC clusters across the globe must address sustainability goals.
- Longevity: Archives must remain usable for decades of physics re-analysis.
Strategic Importance
- Scientific Discovery: Enables high-precision measurements and potential new physics discoveries.
- Infrastructure: Largest distributed scientific data infrastructure in the world.
- Technology Transfer: WLCG pioneered concepts now common in cloud + distributed AI.
- Global Collaboration: 10,000+ scientists using a unified compute and storage backbone.
Future Outlook
- Mid-2030s: HL-LHC begins operations with 10× luminosity.
- 2035–2040s: Annual 400–600 PB growth builds an exabyte-class archive (see the accumulation sketch after this list).
- Beyond: WLCG evolves into a hybrid cloud + exascale federation model.
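To show how 400–600 PB/year compounds into the exabyte-class archive mentioned above, the sketch below accumulates annual volumes from an assumed 2035 start. The start year, the flat growth profile, and the zero-deletion assumption are illustrative, not an official projection.

```python
# Simple accumulation of annual HL-LHC data volumes into a cumulative archive.
# The 2035 start year, the flat 400-600 PB/year range, and the assumption
# that nothing is ever deleted are illustrative, not an official projection.

START_YEAR = 2035  # assumed first full HL-LHC data-taking year

for years in (1, 3, 5, 10):
    low_eb, high_eb = years * 400 / 1000, years * 600 / 1000   # PB -> EB
    print(f"by ~{START_YEAR + years}: {low_eb:.1f}-{high_eb:.1f} EB accumulated")

# Output:
# by ~2036: 0.4-0.6 EB accumulated
# by ~2038: 1.2-1.8 EB accumulated
# by ~2040: 2.0-3.0 EB accumulated
# by ~2045: 4.0-6.0 EB accumulated
```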
FAQ
- How much data will HL-LHC produce? 400–600 PB/year, an order of magnitude more than today’s LHC.
- How is the data processed? At CERN Tier-0, then across Tier-1/Tier-2 sites globally.
- What’s unique? Truly distributed, federated computing at exascale.
- How is this like AI data centers? Similar exascale compute/storage demands, but across a global federation rather than single campuses.
- Why important? Both for fundamental science and for advancing distributed computing architectures.