Deployment Case Study: High-Luminosity LHC


The High-Luminosity Large Hadron Collider (HL-LHC) is an upgrade to CERN's LHC that will increase the collider's integrated luminosity by roughly a factor of ten, enabling deeper exploration of particle physics. The upgrade also drives a massive leap in data requirements, with the experiments expected to produce 400–600 PB of raw data per year. To handle this scale, the HL-LHC relies on one of the world's largest distributed compute and storage systems: the Worldwide LHC Computing Grid (WLCG).
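
A quick back-of-envelope calculation makes that scale concrete: 400–600 PB/year works out to an average sustained ingest rate in the tens of gigabytes per second. The sketch below is illustrative arithmetic only, not an official CERN figure; real rates are bursty and track the accelerator's duty cycle.

```python
# Back-of-envelope: average ingest rate implied by 400-600 PB/year.
# Illustrative only; real data flow is bursty, not a smooth average.

SECONDS_PER_YEAR = 365 * 24 * 3600

for pb_per_year in (400, 600):
    data_bytes = pb_per_year * 1e15                 # PB -> bytes (decimal units)
    gbytes_per_s = data_bytes / SECONDS_PER_YEAR / 1e9
    gbits_per_s = gbytes_per_s * 8
    print(f"{pb_per_year} PB/yr ~ {gbytes_per_s:.1f} GB/s ~ {gbits_per_s:.0f} Gbit/s sustained")
```

At the upper end this is roughly 19 GB/s (about 150 Gbit/s) sustained year-round, before any replication between sites.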


Overview

  • Location: CERN, Geneva, Switzerland (accelerator complex)
  • Experiments: ATLAS, CMS, LHCb, ALICE (major detector collaborations)
  • Luminosity: 10× increase compared to LHC baseline
  • Data Scale: ~50 PB/year today → ~400–600 PB/year with HL-LHC
  • Processing: Worldwide LHC Computing Grid (170+ sites, 42 countries)
  • Timeline: HL-LHC commissioning mid-2030s

Data Pipeline

  • Event Capture: detectors (ATLAS, CMS, ALICE, LHCb) produce petabytes of collision data per run; trigger systems filter most events in real time.
  • Tier-0 Processing: the CERN Data Centre (Geneva) performs first-pass reconstruction of detector events on a high-throughput cluster with 100k+ cores.
  • Tier-1 Centres: ~13 national sites worldwide provide long-term storage and reprocessing, connected via the LHC Optical Private Network (LHCOPN).
  • Tier-2 Centres: ~160 sites at universities and labs handle simulation, analysis, and user access, with federated compute across 42 countries.
  • Archival & Science: the global WLCG maintains an exabyte-scale archive and distributed analysis, accessible to 10,000+ physicists worldwide.
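
As a conceptual summary of this pipeline, the sketch below models the tier hierarchy as a simple fan-out. The site counts and roles come from the list above; the `Tier` class and `walk` helper are hypothetical illustrations, not a real WLCG interface.

```python
from dataclasses import dataclass, field

# Conceptual model of the WLCG tier hierarchy described above.
# Site counts come from the text; the Tier class and walk() helper are
# hypothetical illustrations, not a real WLCG API.

@dataclass
class Tier:
    name: str
    sites: int
    roles: list[str]
    children: list["Tier"] = field(default_factory=list)

tier2 = Tier("Tier-2", sites=160, roles=["simulation", "analysis", "user access"])
tier1 = Tier("Tier-1", sites=13, roles=["long-term storage", "reprocessing"],
             children=[tier2])
tier0 = Tier("Tier-0 (CERN)", sites=1, roles=["first-pass reconstruction"],
             children=[tier1])

def walk(tier: Tier, depth: int = 0) -> None:
    """Print the fan-out from Tier-0 down through the tiers."""
    print("  " * depth + f"{tier.name}: ~{tier.sites} site(s); {', '.join(tier.roles)}")
    for child in tier.children:
        walk(child, depth + 1)

walk(tier0)
```

The point of the hierarchy is locality: raw data is reconstructed once at Tier-0, custodial copies live at the Tier-1 centres, and the wide Tier-2 layer absorbs simulation and user analysis.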

HPC, Storage & Networking

  • Compute: Exascale-class distributed computing across 170+ sites.
  • Storage: 400–600 PB/year of new data, multi-exabyte archive by 2040s.
  • Networking: LHC Optical Private Network (LHCOPN) at multi-Tbps capacity; a back-of-envelope capacity check follows this list.
  • Federation: the WLCG pioneered distributed, federated computing at science scale, a precursor to modern cloud platforms.
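
The networking and storage figures can be sanity-checked together: shipping custodial replicas of a year's new data to the Tier-1 centres implies a sustained rate of a few hundred Gbit/s, well below multi-Tbps, which is why the capacity concern is about bursts and future growth rather than averages. The replica count and link speed below are illustrative assumptions, not WLCG policy.

```python
# Rough LHCOPN capacity check. Assumptions (illustrative, not WLCG policy):
# 600 PB/year of new data, 2 custodial replicas shipped out of CERN,
# and 100 Gbit/s links.

SECONDS_PER_YEAR = 365 * 24 * 3600

new_data_bits = 600e15 * 8                      # 600 PB -> bits
replicas = 2                                    # assumed copies outside CERN
sustained_gbits = new_data_bits * replicas / SECONDS_PER_YEAR / 1e9

links_needed = sustained_gbits / 100            # assumed 100 Gbit/s per link
print(f"~{sustained_gbits:.0f} Gbit/s sustained, ~{links_needed:.0f} x 100G links before headroom")
```

Averages like this understate the real requirement: reprocessing campaigns and analysis traffic arrive in bursts, hence the multi-Tbps provisioning and the pressure to grow it further.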

Partners & Stakeholders

  • CERN: Tier-0 data centre and accelerator operation.
  • Worldwide LHC Computing Grid (WLCG): global federated compute/storage infrastructure.
  • Tier-1 Centres: national labs in the U.S., Europe, and Asia providing storage and reprocessing.
  • Tier-2 Centres: universities and regional HPC centers.
  • ATLAS, CMS, ALICE, LHCb Collaborations: science users driving workload demands.

Key Challenges

  • Data Volume: Scaling from ~50 PB/year to 400–600 PB/year.
  • Distributed Management: Coordinating 170+ sites across 42 countries.
  • Networking: Multi-Tbps optical links must grow further to avoid bottlenecks.
  • Energy: HPC clusters across the globe must address sustainability goals.
  • Longevity: Archives must remain usable for decades of physics re-analysis.

Strategic Importance

  • Scientific Discovery: Enables high-precision measurements and potential new physics discoveries.
  • Infrastructure: Largest distributed scientific data infrastructure in the world.
  • Technology Transfer: WLCG pioneered concepts now common in cloud + distributed AI.
  • Global Collaboration: 10,000+ scientists using a unified compute and storage backbone.

Future Outlook

  • Mid-2030s: HL-LHC begins operations with 10× luminosity.
  • 2035–2040s: Annual 400–600 PB growth builds an exabyte-class archive (a rough projection is sketched after this list).
  • Beyond: WLCG evolves into a hybrid cloud + exascale federation model.
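
The exabyte-class claim follows directly from the annual rates above: at 400–600 PB/year starting in the mid-2030s, the cumulative archive crosses 1 EB within two to three years. The projection below assumes constant annual growth, a simplification of real run schedules.

```python
# Cumulative archive growth at a constant 400-600 PB/year from 2035.
# Constant growth is a simplifying assumption; real rates track run schedules.

for rate_pb in (400, 600):
    total_pb = 0
    for year in range(2035, 2046):
        total_pb += rate_pb
        if total_pb >= 1000:                    # 1 EB = 1000 PB (decimal)
            print(f"At {rate_pb} PB/yr the archive passes 1 EB in {year}")
            break
```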

FAQ

  • How much data will the HL-LHC produce? 400–600 PB/year, an order of magnitude more than today's LHC.
  • How is the data processed? First at CERN's Tier-0, then across Tier-1 and Tier-2 sites globally.
  • What makes it unique? Truly distributed, federated computing at exascale.
  • How does it compare to AI data centers? Similar exascale compute and storage demands, but spread across a global federation rather than a single campus.
  • Why is it important? Both for fundamental science and for advancing distributed computing architectures.