Deployment Case Study: High-Luminosity LHC


The High-Luminosity Large Hadron Collider (HL-LHC) is an upgrade to CERN's LHC that will increase the collider's integrated luminosity by roughly a factor of ten, enabling deeper exploration of particle physics. The upgrade also drives a massive leap in data requirements, with the experiments expected to produce 400–600 PB of raw data per year. To handle this scale, the HL-LHC relies on one of the world's largest distributed compute and storage systems: the Worldwide LHC Computing Grid (WLCG).
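
A quick back-of-envelope calculation makes that scale concrete: 400–600 PB/year works out to an average sustained ingest rate in the tens of gigabytes per second. The sketch below is illustrative arithmetic only, not an official CERN figure; real rates are bursty and track the accelerator's duty cycle.

```python
# Back-of-envelope: average ingest rate implied by 400-600 PB/year.
# Illustrative only; real data flow is bursty, not a smooth average.

SECONDS_PER_YEAR = 365 * 24 * 3600

for pb_per_year in (400, 600):
    data_bytes = pb_per_year * 1e15                 # PB -> bytes (decimal units)
    gbytes_per_s = data_bytes / SECONDS_PER_YEAR / 1e9
    gbits_per_s = gbytes_per_s * 8
    print(f"{pb_per_year} PB/yr ~ {gbytes_per_s:.1f} GB/s ~ {gbits_per_s:.0f} Gbit/s sustained")
```

At the upper end this is roughly 19 GB/s (about 150 Gbit/s) sustained year-round, before any replication between sites.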


Overview

  • Location: CERN, Geneva, Switzerland (accelerator complex)
  • Experiments: ATLAS, CMS, LHCb, ALICE (major detector collaborations)
  • Luminosity: 10× increase compared to LHC baseline
  • Data Scale: ~50 PB/year today → ~400–600 PB/year with HL-LHC
  • Processing: Worldwide LHC Computing Grid (170+ sites, 42 countries)
  • Timeline: HL-LHC commissioning mid-2030s

Data Pipeline

  • Event Capture: detectors (ATLAS, CMS, ALICE, LHCb) produce petabytes of collision data per run; trigger systems filter most events in real time.
  • Tier-0 Processing: the CERN Data Centre (Geneva) performs first-pass reconstruction of detector events on a high-throughput cluster with 100k+ cores.
  • Tier-1 Centres: ~13 national sites worldwide provide long-term storage and reprocessing, connected via the LHC Optical Private Network (LHCOPN).
  • Tier-2 Centres: ~160 sites at universities and labs handle simulation, analysis, and user access, with federated compute across 42 countries.
  • Archival & Science: the global WLCG maintains an exabyte-scale archive and distributed analysis, accessible to 10,000+ physicists worldwide.
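
As a conceptual summary of this pipeline, the sketch below models the tier hierarchy as a simple fan-out. The site counts and roles come from the list above; the `Tier` class and `walk` helper are hypothetical illustrations, not a real WLCG interface.

```python
from dataclasses import dataclass, field

# Conceptual model of the WLCG tier hierarchy described above.
# Site counts come from the text; the Tier class and walk() helper are
# hypothetical illustrations, not a real WLCG API.

@dataclass
class Tier:
    name: str
    sites: int
    roles: list[str]
    children: list["Tier"] = field(default_factory=list)

tier2 = Tier("Tier-2", sites=160, roles=["simulation", "analysis", "user access"])
tier1 = Tier("Tier-1", sites=13, roles=["long-term storage", "reprocessing"],
             children=[tier2])
tier0 = Tier("Tier-0 (CERN)", sites=1, roles=["first-pass reconstruction"],
             children=[tier1])

def walk(tier: Tier, depth: int = 0) -> None:
    """Print the fan-out from Tier-0 down through the tiers."""
    print("  " * depth + f"{tier.name}: ~{tier.sites} site(s); {', '.join(tier.roles)}")
    for child in tier.children:
        walk(child, depth + 1)

walk(tier0)
```

The point of the hierarchy is locality: raw data is reconstructed once at Tier-0, custodial copies live at the Tier-1 centres, and the wide Tier-2 layer absorbs simulation and user analysis.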

HPC, Storage & Networking

  • Compute: Exascale-class distributed computing across 170+ sites.
  • Storage: 400–600 PB/year of new data, multi-exabyte archive by 2040s.
  • Networking: LHC Optical Private Network (LHCOPN) at multi-Tbps capacity; a back-of-envelope capacity check follows this list.
  • Federation: the WLCG pioneered distributed, federated computing at science scale, a precursor to modern cloud platforms.
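
The networking and storage figures can be sanity-checked together: shipping custodial replicas of a year's new data to the Tier-1 centres implies a sustained rate of a few hundred Gbit/s, well below multi-Tbps, which is why the capacity concern is about bursts and future growth rather than averages. The replica count and link speed below are illustrative assumptions, not WLCG policy.

```python
# Rough LHCOPN capacity check. Assumptions (illustrative, not WLCG policy):
# 600 PB/year of new data, 2 custodial replicas shipped out of CERN,
# and 100 Gbit/s links.

SECONDS_PER_YEAR = 365 * 24 * 3600

new_data_bits = 600e15 * 8                      # 600 PB -> bits
replicas = 2                                    # assumed copies outside CERN
sustained_gbits = new_data_bits * replicas / SECONDS_PER_YEAR / 1e9

links_needed = sustained_gbits / 100            # assumed 100 Gbit/s per link
print(f"~{sustained_gbits:.0f} Gbit/s sustained, ~{links_needed:.0f} x 100G links before headroom")
```

Averages like this understate the real requirement: reprocessing campaigns and analysis traffic arrive in bursts, hence the multi-Tbps provisioning and the pressure to grow it further.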

Partners & Stakeholders

  • CERN: Tier-0 data centre and accelerator operation.
  • Worldwide LHC Computing Grid (WLCG): global federated compute/storage infrastructure.
  • Tier-1 Centres: national labs in the U.S., Europe, and Asia providing storage and reprocessing.
  • Tier-2 Centres: universities and regional HPC centers.
  • ATLAS, CMS, ALICE, LHCb Collaborations: science users driving workload demands.

Key Challenges

  • Data Volume: Scaling from ~50 PB/year to 400–600 PB/year.
  • Distributed Management: Coordinating 170+ sites across 42 countries.
  • Networking: Multi-Tbps optical links must grow further to avoid bottlenecks.
  • Energy: HPC clusters across the globe must address sustainability goals.
  • Longevity: Archives must remain usable for decades of physics re-analysis.

Strategic Importance

  • Scientific Discovery: Enables high-precision measurements and potential new physics discoveries.
  • Infrastructure: Largest distributed scientific data infrastructure in the world.
  • Technology Transfer: WLCG pioneered concepts now common in cloud + distributed AI.
  • Global Collaboration: 10,000+ scientists using a unified compute and storage backbone.

Future Outlook

  • Mid-2030s: HL-LHC begins operations with 10× luminosity.
  • 2035–2040s: Annual 400–600 PB growth builds an exabyte-class archive (a rough projection is sketched after this list).
  • Beyond: WLCG evolves into a hybrid cloud + exascale federation model.
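
The exabyte-class claim follows directly from the annual rates above: at 400–600 PB/year starting in the mid-2030s, the cumulative archive crosses 1 EB within two to three years. The projection below assumes constant annual growth, a simplification of real run schedules.

```python
# Cumulative archive growth at a constant 400-600 PB/year from 2035.
# Constant growth is a simplifying assumption; real rates track run schedules.

for rate_pb in (400, 600):
    total_pb = 0
    for year in range(2035, 2046):
        total_pb += rate_pb
        if total_pb >= 1000:                    # 1 EB = 1000 PB (decimal)
            print(f"At {rate_pb} PB/yr the archive passes 1 EB in {year}")
            break
```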

FAQ

  • How much data will the HL-LHC produce? 400–600 PB/year, an order of magnitude more than today's LHC.
  • How is the data processed? First at CERN's Tier-0, then across Tier-1 and Tier-2 sites globally.
  • What makes it unique? Truly distributed, federated computing at exascale.
  • How does it compare to AI data centers? Similar exascale compute and storage demands, but spread across a global federation rather than a single campus.
  • Why is it important? Both for fundamental science and for advancing distributed computing architectures.