Intel held its Data Center and AI Investor Webinar today, revealing new products and performance demos alongside a roadmap update that shows the company's Xeon lineup is moving along on schedule. Intel also announced that Sierra Forest, its first-gen efficiency Xeon, will come with an incredible 144 cores, thus offering better core density than AMD's competing 128-core EPYC Bergamo chips. The company even teased the chip in a demo at its event. Intel also revealed the first details of Clearwater Forest, its second-gen efficiency Xeon that will debut in 2025. Intel passed over its 20A process node in favor of the more performant 18A for this new chip, which speaks volumes about its faith in the health of its future node.
Intel also provided several demos, including head-to-head AI benchmarks against AMD's EPYC Genoa that show a 4X performance advantage for Xeon in a matchup of two 48-core chips, and a memory throughput benchmark that showed the next-gen Granite Rapids Xeon delivering an incredible 1.5 TB/s of bandwidth in a dual-socket server.
Intel's disclosures, which include many other developments that we'll cover below, come as the company executes on its audacious goal of delivering five new nodes in four years, an unprecedented pace that will power its broad data center and AI portfolio of CPUs, GPUs, FPGAs, and Gaudi AI accelerators.
Intel has lost the performance lead in the data center to AMD, and its path to redemption has been marred by delays to its Sapphire Rapids and GPU lineups. However, the company says it has solved the underlying issues in its process node tech and revamped its chip design methodology to prevent further delays to its next-gen products. Let's see what the roadmap looks like.
Intel Xeon Data Center CPU Roadmap
Intel's roadmap for its current Xeon products remains intact and on schedule since the last update in February 2022, but it now has a new entrant: Clearwater Forest. We'll cover that chip in detail further below.
Intel's data center roadmap is split into two swim lanes. The P-Core (Performance Core) models are the traditional Xeon data center processors, built solely from cores that deliver the full performance of Intel's fastest architectures. These chips are designed for peak per-core and AI workload performance. They also come paired with accelerators, as we see with Sapphire Rapids.
The E-Core (Efficiency Core) lineup consists of chips with only smaller efficiency cores, much like those on Intel's consumer chips, that eschew some features, like AMX and AVX-512, to offer increased density. These chips are designed for the high energy efficiency, core density, and total throughput that are attractive to hyperscalers. Intel's Xeon processors will not have any models with both P-cores and E-cores on the same silicon, so these are distinct families with different use cases.
| | 2023 | 2024 | 2025 |
| --- | --- | --- | --- |
| Intel P-Cores | Emerald Rapids – Intel 7 / Sapphire Rapids HBM | Granite Rapids – Intel 3 | — |
| AMD P-Cores | 5nm Genoa-X | Turin – Zen 5 | — |
| Intel E-Cores | — | 1H – Sierra Forest – Intel 3 | Clearwater Forest – Intel 18A |
| AMD E-Cores | 1H – Bergamo – 5nm – 128 Cores | — | — |
Here we can see how Intel's roadmap looks next to AMD's data center roadmap. The current high-performance battle rages on between AMD's EPYC Genoa, launched last year, and Intel's Sapphire Rapids, launched early this year. Intel has its Emerald Rapids refresh generation coming in Q4 of this year, which the company says will come with more cores and faster clock rates, along with its HBM-infused Xeon Max CPUs. AMD has its 5nm Genoa-X products slated for launch later this year. Next year, Intel's next-gen Granite Rapids will square off with AMD's Turin.
In the efficiency swim lane, AMD's Bergamo takes a very similar core-heavy approach to Sierra Forest by leveraging AMD's dense Zen 4c cores, but it will arrive in the first half of this year, while Intel's Sierra Forest won't arrive until the first half of 2024. AMD hasn't said when its second-gen e-core model will arrive, but Intel now has Clearwater Forest on the roadmap for 2025.
Intel E-Core Xeon CPUs: Sierra Forest and Clearwater Forest
Intel's e-core roadmap begins with the 144-core Sierra Forest, which will provide 288 cores in a single dual-socket server. The fifth-generation Xeon Sierra Forest's 144 cores also outweigh AMD's 128-core EPYC Bergamo in terms of core count, but it likely doesn't take the lead in thread count: Intel's e-cores for the consumer market are single-threaded, and the company hasn't divulged whether the e-cores for the data center will support hyperthreading. In contrast, AMD has shared that the 128-core Bergamo is hyperthreaded, thus providing a total of 256 threads per socket.
We also don't know the particulars of performance for Intel's or AMD's dense cores, so we won't know how these chips compare until silicon hits the market. However, we do know that Intel's e-cores don't support some of the ISA extensions it supports with its p-cores; Intel omits AVX-512 and AMX to ensure maximum density, while AMD's Bergamo Zen 4c cores support the same features as its standard cores.
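That ISA split matters for software that dispatches on CPU features at runtime. As a minimal sketch of what that check looks like (the flag names follow Linux's /proc/cpuinfo conventions; the sample flag strings below are hypothetical, not captured from real Sierra Forest or Sapphire Rapids hardware):

```python
def isa_features(flags: str) -> dict:
    """Map a /proc/cpuinfo-style 'flags' line to the extensions
    discussed here, using the Linux kernel's flag names."""
    present = set(flags.split())
    return {
        "AVX-512": "avx512f" in present,   # AVX-512 foundation subset
        "AMX": "amx_tile" in present,      # AMX tile registers
    }

# Hypothetical flag strings for illustration only
p_core = isa_features("sse4_2 avx2 avx512f avx512_bf16 amx_tile amx_int8")
e_core = isa_features("sse4_2 avx2")  # density-optimized: no AVX-512/AMX

print(p_core)  # {'AVX-512': True, 'AMX': True}
print(e_core)  # {'AVX-512': False, 'AMX': False}
```

Code compiled with AVX-512 or AMX paths would need this kind of gate (or an equivalent CPUID check) to fall back to AVX2 on the e-core parts.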
Intel's Sierra Forest is seemingly well on track for the first half of 2024, though: pictures of the Mountain Stream systems have already leaked online, including shots of the massive LGA7529 socket you can see below. This socket will house both the e-core Sierra Forest and p-core Granite Rapids processors (more on Granite below).
This indicates that the Sierra Forest platforms are already with Intel's partners, and the company also tells us that it has powered on the silicon and had an OS booting in less than 18 hours (a company record). This chip is the lead vehicle for the 'Intel 3' process node, so success is paramount. Intel is confident enough that it has already sampled the chips to its customers and demoed all 144 cores in action at the event. Intel initially aims the e-core Xeon models at specific types of cloud-optimized workloads but expects them to be adopted for a broader range of use cases once they're in the market.
Intel also announced Clearwater Forest for the first time. Intel didn't share many details beyond the release in the 2025 timeframe but did say it will use the 18A process for the chip, not the 20A process node that arrives half a year earlier. This will be the first Xeon chip on the 18A process. Intel tells us that the compressed nature of its process roadmap (the company plans to deliver five nodes in four years) gave it the option of choosing either the 20A process that's production-ready in the first half of 2024 or the 18A process that arrives in the second half of 2024.
The 18A node is Intel's second-gen 'Angstrom' node and is analogous to 1.8nm. Intel's first-gen Angstrom node, 20A, will incorporate RibbonFET, a gate-all-around (GAA) stacked nanosheet transistor tech, and Intel's PowerVia backside power delivery (BSP) technology. The 18A process that Intel will use for Clearwater Forest will have a 10% improvement in performance-per-watt over 20A, along with other enhancements, so Intel chose to go with this node as it's the best the company has to offer in the timeframe of the Clearwater launch.
The 18A process features all the leading-edge tech the industry intends to adopt in the future, like GAA and BSP, so it represents an incredibly advanced node. Intel claims that the 18A node is where it will gain clear process leadership over its rivals TSMC and AMD, and the company's decision to skip 20A and move to 18A for Xeon certainly speaks volumes about its confidence in the health of the node. Intel also tells us we won't see a Xeon model fabbed on 20A.
Intel P-Core Xeon CPUs: Emerald Rapids and Granite Rapids
Intel's next-gen Emerald Rapids is scheduled for launch in Q4 of this year, a compressed timeframe given that Sapphire Rapids launched just a few months ago. Emerald will drop into the same platforms as Sapphire Rapids, reducing validation time for its customers, and is largely a refresh of Sapphire Rapids. However, Intel says it will provide faster performance, better power efficiency, and, more importantly, more cores than its predecessor. Intel says it has the Emerald Rapids silicon in-house and that validation is progressing as expected, with the silicon either meeting or exceeding its performance and power targets.
Granite Rapids will arrive in 2024, closely following Sierra Forest. Intel will fab this chip on the 'Intel 3' process, a vastly improved version of the 'Intel 4' process that lacked the high-density libraries needed for Xeon. This is the first p-core Xeon on 'Intel 3,' and it will feature more cores than Emerald Rapids, higher memory bandwidth from DDR5-8800 memory, and other unspecified I/O innovations.
Notably, Sierra Forest, the first E-core-equipped family, will be socket compatible with the P-core-powered Granite Rapids; they even share the same BIOS and software. Intel enabled this by moving these chips to a tile-based design, with a central I/O tile handling memory and other connectivity features, much like we see with AMD's EPYC processors. This separates the core and uncore functions, so Intel can create different processor types by using different kinds of compute tiles. This provides several benefits, such as the ability to use the same systems to pack in more threaded heft with E-cores, but within the same TDP envelope as the P-core models.
During its webinar, Intel demoed a dual-socket Granite Rapids providing a beastly 1.5 TB/s of DDR5 memory bandwidth, a claimed 80% peak bandwidth improvement over current server memory. For perspective, Granite Rapids provides more throughput than Nvidia's 960 GB/s Grace CPU superchip designed specifically for memory bandwidth, and more than AMD's dual-socket Genoa, which has a theoretical peak of 920 GB/s. Intel achieved this feat using DDR5-8800 Multiplexer Combined Rank (MCR) DRAM, a new type of bandwidth-optimized memory it invented. Intel has already launched this memory with SK hynix.
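As a rough sanity check on those figures, theoretical peak DRAM bandwidth is simply channels × transfer rate × bytes per transfer. The sketch below assumes 12 DDR5 channels per Granite Rapids socket, each 64 bits wide; Intel didn't confirm the channel count in this disclosure, so treat that number as an assumption:

```python
def peak_bw_gbs(channels: int, mts: int, bus_bytes: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s: channels x mega-transfers/s
    x bytes moved per transfer (8 bytes for a 64-bit DDR5 channel)."""
    return channels * mts * bus_bytes / 1000

per_socket = peak_bw_gbs(12, 8800)   # DDR5-8800 MCR, assumed 12 channels
dual_socket = 2 * per_socket
print(per_socket, dual_socket)       # 844.8 1689.6
```

Under those assumptions a dual-socket system tops out at about 1.69 TB/s, which would put the demoed 1.5 TB/s at roughly 89% of theoretical peak, a plausible efficiency for a streaming bandwidth benchmark.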
Granite Rapids and Sierra Forest are the intercept point for Intel's recent restructuring of its chip design flow, which should help avoid the issues that saw the company doing multiple successive steppings of the Sapphire Rapids processors, leading to further delays. Intel says Granite Rapids is much further along in its development cycle than Sapphire Rapids was at this point, that it's hitting all engineering milestones, and that the first stepping is healthy. As such, it's already sampling to customers now.
Intel's data center and AI update focused on Xeon, but the company's portfolio also includes other elements, like FPGAs, GPUs, and purpose-built accelerators. Intel has many rivals in the custom silicon realm, like Google with its TPU and Argos video encoding chip (among many other companies), so the Gaudi accelerators and FPGAs are an important part of its portfolio. Intel said it will launch 15 new FPGAs this year, a record for its FPGA group. We've yet to hear of any major wins with the Gaudi chips, but Intel does continue to develop the lineup and has a next-gen accelerator on the roadmap. The Gaudi 2 AI accelerator is shipping, and Gaudi 3 has been taped in.
Intel also says that its Arctic Sound and Ponte Vecchio GPUs are shipping, but we aren't aware of any of the latter available on the general market; instead, the first Ponte Vecchio models appear to be headed to the oft-delayed Aurora supercomputer.
Intel recently updated its GPU roadmap, canceling its upcoming Rialto Bridge series of data center Max GPUs and moving to a two-year cadence for data center GPU releases. The company's next data center GPU offerings will come in the form of the Falcon Shores chiplet-based hybrid chips, but those won't arrive until 2025. The company also pared back its expectations for Falcon Shores, saying they will now arrive as a GPU-only architecture and won't include the option for CPU cores as originally intended; those "XPU" models now don't have a projected launch date.
Intel predicts that AI workloads will continue to run predominantly on CPUs, with 60% of all models, primarily the small- to medium-sized ones, running on CPUs. Meanwhile, the large models will comprise roughly 40% of the workloads and run on GPUs and other custom accelerators.
Intel is also working to build out a software ecosystem for AI that rivals Nvidia's CUDA. This includes taking an end-to-end approach spanning silicon, software, security, confidentiality, and trust mechanisms at every point in the stack. You can learn more about that here.
Intel's pivot to AI-centric designs for its CPUs began several years ago, and today's explosion of AI into the public eye with large language models (LLMs) like ChatGPT proves this was a solid bet. However, today's AI landscape is changing daily. It spans an entire constellation of lesser-known and smaller models, making it a fool's errand to optimize new silicon for any one algorithm. That's especially challenging when chip design cycles span up to four years; many of today's AI models didn't exist back then.
We spoke with Intel Senior Fellow Ronak Singhal, who explained that Intel chose long ago to focus on supporting the fundamental workload requirements of AI, like compute power, memory bandwidth, and memory capacity, thus laying a broadly applicable foundation that can support any number of algorithms. Intel has also steadily expanded its data type support through extensions like AVX-512 and its first-gen AMX tech, which is shipping now with support for 8-bit integer and bfloat16. Intel hasn't told us when its second-gen AMX will arrive, but it will support 16-bit integer and has the extensibility to support more data types in the future. This foundation has enabled Intel to provide impressive performance with Xeon in many different types of AI workloads, often exceeding AMD's EPYC.
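For readers unfamiliar with bfloat16, one of the two data types first-gen AMX accelerates: it's a float32 with the mantissa cut from 23 bits to 7, keeping the full 8-bit exponent and thus the full float32 dynamic range. A stdlib-only sketch of that conversion (shown here as simple truncation; real hardware typically rounds to nearest):

```python
import struct

def to_bfloat16(x: float) -> float:
    """Reduce a float32 to bfloat16 precision: keep the sign bit,
    the 8-bit exponent, and only the top 7 mantissa bits."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

print(to_bfloat16(3.14159265))  # 3.140625
```

Pi becomes 3.140625, about 0.03% off; that modest precision loss is acceptable for most neural network inference, which is why bfloat16 halves memory traffic at little accuracy cost.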
Yes, many AI models are far too large to run on CPUs, and most training workloads will remain in the domain of GPUs and custom silicon, but smaller models can run on CPUs (like Facebook's LLaMA, which can even run on a Raspberry Pi), and more of today's inference workloads run on CPUs than on any other type of compute, GPUs included. We expect that trend will continue, and Intel is well positioned with its P-core Xeon roadmap.
Intel has no shortage of rivals, and the Arm ecosystem is becoming far more prevalent in the data center, with Amazon's Graviton, Tencent's Yitian, Ampere Altra in Microsoft Azure, Oracle Cloud, and Google Cloud, Nvidia with Grace CPUs, Fujitsu, Alibaba, Huawei with Kunpeng, and Google's Maple and Cypress, to name a few. There are even two exascale-class supercomputer deployments planned with Arm Neoverse V1 chips: SiPearl "Rhea" and the ETRI K-AB21.
This means that Intel, like AMD, needs to field optimized chips that focus more on power efficiency and core density to assuage the hyperscalers and CSPs migrating to Arm. If AMD delivers on its roadmap, and there's no reason to believe it won't, it will beat Intel to market with its density-optimized Bergamo. That could put Intel at a disadvantage in the high-volume (but lower-margin) cloud market. On the other hand, Intel does plan to move to what *might* be a more advanced node than AMD will have for its follow-on Clearwater Forest models, making for interesting competition in 2025.
The fact that Intel remains steadfast in the Xeon roadmap it shared last year is encouraging, given the company's recent history. The accelerated adoption of the 18A node also speaks volumes about the health of the company's broader foundational process technology, which impacts all facets of its business.