Arm on Wednesday announced its next-generation general-purpose CPU cores for datacenter processors. The new Neoverse V3, Neoverse N3, and Neoverse E3 CPU cores are aimed at high-performance computing (HPC), general-purpose CPU instances and infrastructure applications, and edge computing and low-power applications, respectively. Alongside the new cores, Arm is also rolling out Compute Subsystems (CSS), which combine CPU cores, memory, I/O, and die-to-die interconnect interfaces to speed up processor development.
Arm's Neoverse Compute Subsystems (CSS) are integrated and verified platforms that bring together all the key components required for the heart of a system-on-chip (SoC). These subsystems are designed to provide a starting point for building custom solutions, enabling Arm's partners to enhance a CSS with their own IP and bring their designs to market rapidly; the company expects it to take about nine months from design start to tape-out. A CSS includes the CPU core complex, memory, and I/O interfaces, and is optimized for specific use cases within a particular market segment, such as cloud computing, networking, and AI.
By using CSS, partners can focus on system-level and workload-specific differentiation while relying on Arm's technology for the underlying compute capabilities. Meanwhile, Arm's Neoverse CSS supports Arm Total Design (a package of IP from 20 of Arm's partners) as well as Arm's Chiplet System Architecture (CSA) and UCIe interfaces for stitching a CSS together with compatible third-party silicon.
Arm's Neoverse V3 is the company's highest-performing CPU core ever. The core is based on the Armv9-A (v9.2) instruction set architecture (ISA) enhanced with the SVE2 SIMD extension, and is equipped with 64KB + 64KB (instruction + data) L1 cache as well as 1MB/2MB/3MB of L2 cache with ECC capability.
Arm says that, depending on the workload, a simulated 32-core Neoverse V3 offers a 9% to 16% performance uplift over a simulated 32-core Neoverse V2 in typical server workloads. That looks quite decent considering we're talking about cores that compete against AMD's Zen 4 and Intel's Raptor Cove, and we rarely see huge generation-to-generation performance gains in this market. In AI data analytics, the Neoverse V3 can offer a whopping 84% performance improvement over the Neoverse V2, according to simulations by Arm. This is, of course, a major improvement, and it will attract attention to the core.
What is important is that, along with the Neoverse V3 core itself, Arm is rolling out its Neoverse V3 Compute Subsystem (CSS), which includes 64 Neoverse V3 cores (with SVE/SVE2, BFloat16, and INT8 MatMul support), a memory subsystem with 12-channel DDR5/LPDDR5 and HBM memory support, 64 lanes of PCIe Gen5 with CXL support, and die-to-die interconnects using UCIe 1.1 and/or custom PHYs. The Neoverse V3 can scale to 128 cores per socket, enabling fairly formidable server CPUs.
Arm's Neoverse N3 cores are the company's first Armv9.2-based cores for general-purpose CPU instances and infrastructure applications that have to balance performance against power consumption. These Armv9.2 cores with SVE2 can be equipped with 32KB/64KB + 32KB/64KB (instruction + data) L1 cache as well as 128KB to 2MB of L2 cache with ECC capability.
From a performance point of view, Arm claims that a simulated 32-core Neoverse N3 processor outperforms a simulated 32-core Neoverse N2 processor by 9% to 30%, depending on the workload, which is quite good. In AI data analytics, the simulated Neoverse N3-based SoC is 196% faster than the simulated Neoverse N2 chip.
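To put the "percent faster" figures quoted above on a common footing, the short Python sketch below converts them into throughput multipliers. The helper name `speedup_factor` and the dictionary labels are illustrative only, and the sketch assumes "N% faster" means N% more work per unit of time, which is the usual reading but is not spelled out by Arm.

```python
def speedup_factor(percent_faster: float) -> float:
    """Convert an 'N% faster' claim into a throughput multiplier.

    Assumption: 'N% faster' means N% more throughput than the baseline,
    so 100% faster is 2x and 196% faster is 2.96x.
    """
    return 1.0 + percent_faster / 100.0


# Simulated 32-core figures quoted in the article (Arm's own claims).
claims = {
    "V3 vs V2, typical server workloads (low)": 9,
    "V3 vs V2, typical server workloads (high)": 16,
    "V3 vs V2, AI data analytics": 84,
    "N3 vs N2, typical workloads (low)": 9,
    "N3 vs N2, typical workloads (high)": 30,
    "N3 vs N2, AI data analytics": 196,
}

for label, percent in claims.items():
    print(f"{label}: +{percent}% is about {speedup_factor(percent):.2f}x")
```

Read this way, the 196% figure for the Neoverse N3 corresponds to just under three times the Neoverse N2's throughput in that test, not just under two times.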
Arm's Neoverse CSS N3 is aimed at workloads that do not need performance at all costs, so one N3 Compute Subsystem packs 32 N3 cores, four 40-bit DDR5/LPDDR5 memory channels, 32 PCIe Gen5 lanes with CXL support, high-speed die-to-die links, and UCIe 1.1 support. Such a solution has a TDP of 40W, according to Arm, which did not elaborate on the process technology used.
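As a rough way to compare the two subsystem configurations described in this article, the Python sketch below divides the quoted totals by the core count. The `Subsystem` class and its method names are hypothetical, and the watts-per-core figure is only a ceiling: the 40W TDP also has to cover memory controllers, I/O, and interconnect, so the actual per-core power would be lower. Arm did not quote a TDP for the V3 subsystem, so that field is left empty.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Subsystem:
    """Headline figures for one Compute Subsystem, as quoted in the article."""
    name: str
    cores: int
    pcie_gen5_lanes: int
    tdp_watts: Optional[float]  # None where the article gives no figure

    def lanes_per_core(self) -> float:
        return self.pcie_gen5_lanes / self.cores

    def watts_per_core_ceiling(self) -> Optional[float]:
        # Upper bound only: TDP covers the whole subsystem, not just cores.
        if self.tdp_watts is None:
            return None
        return self.tdp_watts / self.cores


css_v3 = Subsystem("Neoverse CSS V3", cores=64, pcie_gen5_lanes=64, tdp_watts=None)
css_n3 = Subsystem("Neoverse CSS N3", cores=32, pcie_gen5_lanes=32, tdp_watts=40.0)

for css in (css_v3, css_n3):
    ceiling = css.watts_per_core_ceiling()
    power = "not stated" if ceiling is None else f"at most {ceiling:.2f} W/core"
    print(f"{css.name}: {css.lanes_per_core():.1f} PCIe lanes/core, {power}")
```

Under these assumptions, both subsystems provide one PCIe Gen5 lane per core, and the N3 subsystem's power budget works out to no more than 1.25W per core.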
So far, Arm's Neoverse CSS has been adopted by Microsoft for its Cobalt 100 general-purpose server processor. However, Arm expects considerably broader adoption of its CSS offerings going forward.