We already knew AMD would power the world’s fastest supercomputer: the US Department of Energy’s (DOE) El Capitan. Expected to be installed in 2023 at the Lawrence Livermore National Laboratory (LLNL), the HPE-built system was originally set to leverage AMD’s Zen 4 CPU cores and Instinct MI GPU accelerators, unlocking unheard-of performance above the 2 Exaflop mark. But there’s something the initial announcement didn’t say: the system won’t be leveraging disparate CPU and GPU accelerators. Instead, confirming our speculation, El Capitan will be leveraging AMD’s recently announced MI300 Accelerated Processing Units (APUs). It marks the first time an APU serves as a supercomputer’s central processing grunt, and at Exascale, no less.
“It’s the first time we’ve publicly stated this,” said Terri Quinn, associate director for HPC (High Performance Computing) at LLNL. In a world-first disclosure, made in a presentation delivered today to the 79th HPC User Forum at Oak Ridge National Laboratory (ORNL), Quinn added that the information came straight from the source: “I cut these words out of [AMD’s] investors document, and that’s what it says: it’s a 3D chiplet design with AMD CDNA3 GPUs, Zen 4 CPUs, cache memory and HBM chiplets.”
AMD’s MI300 APUs will feature CPU and GPU chiplets in the same 3D-enabled packaging with a coherent HBM3 memory architecture, powered by the company’s 4th-generation Infinity Fabric and next-generation Infinity Cache. Built on both Zen 4 and the CDNA 3 graphics acceleration architecture, MI300 APUs will use TSMC’s 5nm process technology (likely N5 or N5P). However, the balance of CPU and GPU cores per APU is still a wild guess.
Because it is built on APUs, El Capitan will benefit from what’s likely to be the densest performance profile ever achieved in the world of supercomputing. Make no mistake: El Capitan will represent the pinnacle of semiconductor performance, design, and integration. It’s not hyperbolic to say that it’s likely to be one of humanity’s most technologically complex endeavors.
It’s all thanks to tightly packaged AMD APUs, bundled into HPE Cray EX racks and tied together with Cray’s Slingshot-11 networking, powered by its 16-nanometer Rosetta controllers that can dish out 200 Gb/sec interconnects. The form factor and the number of accelerators per rack are still a question mark. When push comes to shove, El Capitan should also become one of the most energy-efficient systems (if not the most efficient), with operating power limited to 40 MW for an optimal performance/power balance. Workloads will run through El Capitan’s circuits starting from 2Q 2024, with the planned end of life set for 2030.
AMD’s continued roll into the Top500 list of the world’s most powerful supercomputers keeps advancing at a breakneck pace. The company is steamrolling Intel’s previous dominance, already powering five of the world’s top ten supercomputers (including first place, thanks to Frontier) against Intel’s single Xeon-based system powering China’s Tianhe-2A, currently ranking ninth. The company has come a long way from its infamous and nearly company-breaking Steamroller architecture family.
Not all news is bad news for Intel, however, as the company has also earned an Exascale contract with the Argonne National Laboratory. The Aurora supercomputer will likewise be a 2-exaflops system, built by HPE and Intel, and it has already undergone several revisions. Aurora’s installation is already underway, although the exact date it enters operation is still unclear. Intel’s delays on its Sapphire Rapids CPUs have already pushed back the supercomputer’s installation, so it remains to be seen how long the execution will take.
Nvidia, too, has a relevant presence in the world’s top-performing systems, although it currently operates only as a GPU provider, with three systems powered by its GPUs. But it recently won an important contract as a provider of both CPUs and GPUs for MareNostrum 5, to be installed at the Barcelona Supercomputing Centre (BSC) in Spain. Operation could start as early as 2023.
Unfortunately, Nvidia has already taken the “Superchip” nomenclature with its Arm-based Grace CPU product for High-Performance Computing (HPC) deployments. So perhaps AMD should be looking to claim an “Überchip” already?