AEVIONTrust · IP · Globus
DemoAuthQRightQSignBureauPlanetAwardsBankChessPricingAPI
Перейти к содержимому
← К разделам
HPC Supercomputing

🖥️ HPC Supercomputing Center (Exascale)

Модуль #319. Astana Nazarbayev University Computer Center + Almaty Институт ИИВТ. Reference: Frontier ORNL (1.7 ExaFlops Top500 #1 2024), Fugaku Japan (442 PFlops), LUMI Finland (380 PFlops). HPC for climate modeling + drug discovery + AI training + nuclear physics simulation. Architecture: ~10 000 nodes × CPU AMD EPYC 9954 96-core + GPU NVIDIA H100/B200 80GB HBM3 × 4-8 per node, InfiniBand HDR 200 Gbps interconnect, parallel storage Lustre 100 PB. Liquid cooling direct-to-chip + adiabatic outdoor chillers. ISO 27001 + TIA-942 Tier IV + Green Grid PUE <1.2 + СН РК 4.04-12.

1. Состав HPC 100 PFlops cluster

  • Compute nodes — 2500 servers Dell PowerEdge XE9680 / HPE Cray EX235n + AMD EPYC 9954 96-core × 2 (192 cores/node);
  • GPU acceleration — NVIDIA H100 80GB HBM3 × 4 per node = 10 000 GPUs total, 60 PFlops FP16 compute;
  • Memory — 1.5 TB DDR5-4800 per node, 3.75 PB total;
  • Storage — Lustre parallel filesystem 100 PB (50 PB usable raid-6) + flash tier 1 PB NVMe;
  • Interconnect — InfiniBand NDR 400 Gbps fat-tree topology, Mellanox QM9700 switches;
  • Cooling — direct-to-chip liquid cooling DLC 100% (water-cooled cold plate Coolant Distribution Unit CDU);
  • Heat removal — outdoor adiabatic chiller Bell+Gossett 5 MW thermal capacity (PUE 1.05-1.15);
  • Power — 5 MW IT + 1 MW cooling + 0.5 MW UPS losses = 6.5 MW total; ATS automatic transfer от 2x utility feeds + 4× 1.5 MW DG;
  • UPS — Eaton 9395 1 MVA × 5 N+1 redundancy + 30-min battery + 5-min flywheel ride-through;
  • Networking — 100 Gbps fiber backbone к Almaty Internet eXchange + dark fiber до universities;
  • Security — TIA-942 Tier IV: biometric + mantrap + 24/7 NOC monitoring + cybersecurity Falcon Crowdstrike;
  • Cleanroom — ISO 14644 cl.8 (data center grade, dust control), pressurised positive +25 Pa;
  • Building — 5000 m² total, 2500 m² data hall raised-floor 1.5 m, ceiling 6 m hot aisle containment;
  • Office + visitor centre + classroom для researchers + workshops;
  • Green sustainability — PUE 1.10 target, 30% PV onsite + waste heat recovery to district heating;
  • Compliance — ISO 27001 + SOC 2 Type II + Green Grid + LEED Platinum BD+C.

Упражнение 1 — Cooling technology

Упражнение 2 — FP16 peak (PFlops)

Упражнение 3 — Капекс 100 PFlops HPC

  • Compute 2500 nodes EPYC + H100 GPUs × $25K avg = 30 млрд
  • Interconnect InfiniBand NDR fat-tree + switches Mellanox = 6 млрд
  • Storage Lustre 100 PB + flash tier = 8 млрд
  • Liquid cooling DLC + CDU + chillers Bell+Gossett 5 MW = 6 млрд
  • UPS Eaton 9395 5 MVA + flywheel + DG 6 MW = 5 млрд
  • Building 5000 м² + raised floor + structural reinforce = 5 млрд
  • Security TIA-942 Tier IV + cybersecurity + NOC = 2 млрд
  • Networking + IT + проект 5% + commissioning + insurance = 3 млрд

Упражнение 4 — Power efficiency

Нормативная база

  • TIA-942 — Data Center Standards
  • Green Grid PUE
  • ASHRAE TC9.9 — Mission Critical Facilities
  • Open Compute Project OCP
  • ISO 27001 + SOC 2 Type II
  • LEED Platinum BD+C
  • Top500 Standards
  • СН РК 4.04-12 — Связь и информатика