8-GPU Open Platform for AI at Scale

The Intel Gaudi 3 OAM platform (HL-325L on HLB-325 baseboard) deploys eight Gaudi 3 accelerators with 1TB of combined HBM2e memory and 192× 200GbE RoCE v2 ports across the platform. Unlike proprietary 8-GPU systems that require NVLink and InfiniBand, Gaudi 3 scales entirely over standard Ethernet.

Available in the Dell PowerEdge XE9680 (the same chassis that supports NVIDIA H100/H200 and AMD MI300X), the Gaudi 3 OAM gives organizations a third option for large-scale AI training and inference without switching server platforms.

Gaudi 3 OAM vs. H100 SXM (8-GPU)

The XE9680 chassis takes three 8-GPU platforms. Here's how Gaudi 3 compares with the H100 SXM.

                               Gaudi 3 (×8)        H100 SXM (×8)
Memory (total)                 1,024 GB            640 GB
Memory bandwidth (per card)    3.7 TB/s            3.35 TB/s
BF16 (per card)                1,835 TFLOPS        1,979 TFLOPS*
FP8 (per card)                 1,835 TFLOPS        3,958 TFLOPS*
Scale-Up                       Ethernet (open)     NVLink (proprietary)
Software                       PyTorch (native)    CUDA

* NVIDIA specs with sparsity.
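
The asterisked H100 figures are peak numbers with sparsity, while the Gaudi 3 column carries no such footnote. The Python sketch below shows one way to put the two columns on a like-for-like footing. It rests on two assumptions that are not stated in the table: that the Gaudi 3 figures are dense, and that NVIDIA's sparsity figures are twice the dense ones (the usual ratio for 2:4 structured sparsity). The names `dense_equivalent` and `compare` are illustrative only.

```python
# Per-card peak TFLOPS copied from the comparison table above.
GAUDI3_TFLOPS = {"BF16": 1835.0, "FP8": 1835.0}
H100_SPARSE_TFLOPS = {"BF16": 1979.0, "FP8": 3958.0}

# ASSUMPTION: a "with sparsity" figure is 2x the dense figure.
# Check the vendor datasheet before relying on this ratio.
SPARSITY_FACTOR = 2.0


def dense_equivalent(sparse_tflops, factor=SPARSITY_FACTOR):
    """Convert a with-sparsity peak figure to its assumed dense equivalent."""
    return sparse_tflops / factor


def compare():
    """Return per-datatype Gaudi 3 vs. dense-equivalent H100 figures."""
    rows = {}
    for dtype, gaudi in GAUDI3_TFLOPS.items():
        h100_dense = dense_equivalent(H100_SPARSE_TFLOPS[dtype])
        rows[dtype] = {
            "gaudi3": gaudi,
            "h100_dense": h100_dense,
            # Ratio > 1 means the Gaudi 3 peak figure is higher.
            "ratio": round(gaudi / h100_dense, 2),
        }
    return rows


if __name__ == "__main__":
    for dtype, row in compare().items():
        print(f"{dtype}: Gaudi 3 {row['gaudi3']:.0f} vs "
              f"H100 dense {row['h100_dense']:.1f} TFLOPS "
              f"(ratio {row['ratio']})")
```

These are peak datasheet numbers under the stated assumptions. Delivered throughput depends on the model, batch size, and software stack, so treat the ratios as a starting point for benchmarking and not as a result.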

  • 1TB combined HBM2e (8-card)
  • 192 on-chip 200GbE ports (8-card)
  • 14.7 PFLOPS FP8 / BF16 (8-card)
  • 8 accelerators per platform

Dell PowerEdge Server for Gaudi 3 OAM

Dell PowerEdge XE9680

  • 8× Intel Gaudi 3 OAM accelerators via HLB-325 baseboard
  • 1TB combined HBM2e memory across all 8 accelerators
  • 192× 200GbE RoCE v2 on-chip networking (6× OSFP800 ports on baseboard)
  • Air-cooled (HL-325L) or liquid-cooled (HL-335) options
  • Same chassis supports NVIDIA H100/H200 and AMD MI300X
The XE9680 supports NVIDIA, AMD, and Intel accelerators. ServerMonkey can configure and quote any platform in the same chassis.

Where Gaudi 3 OAM Fits

Large-Scale LLM Training

With 1TB of combined HBM2e across eight accelerators, the Gaudi 3 OAM platform holds larger models in memory than an 8× H100 SXM system with 640 GB. The 192 on-chip Ethernet ports carry both accelerator-to-accelerator traffic inside the node and scale-out traffic, which leaves the baseboard through 6× OSFP800 connectors. Multi-node training clusters therefore run over standard Ethernet switching, scaling to hundreds of nodes without proprietary interconnect infrastructure.
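
As a rough illustration of what the memory difference means for model size, the Python sketch below estimates weight memory from a parameter count and bytes per parameter, then checks it against each platform's total. It is a back-of-the-envelope calculation, not a sizing tool. The model sizes in the usage example are hypothetical, 1 GB is taken as 10^9 bytes, and the estimate ignores gradients, optimizer state, activations, and KV cache, all of which add substantially to the real footprint, especially in training.

```python
# Total accelerator memory per 8-card platform, in GB, from the table above.
PLATFORM_MEMORY_GB = {"Gaudi 3 (x8)": 1024, "H100 SXM (x8)": 640}

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"BF16": 2, "FP8": 1}


def weight_footprint_gb(params_billion, dtype):
    """Weights-only footprint in GB.

    One billion parameters at one byte each is 1 GB (10^9 bytes),
    so billions of parameters times bytes per parameter gives GB.
    """
    return params_billion * BYTES_PER_PARAM[dtype]


def fits(params_billion, dtype, platform):
    """True if the weights alone fit in the platform's combined memory."""
    return weight_footprint_gb(params_billion, dtype) <= PLATFORM_MEMORY_GB[platform]


if __name__ == "__main__":
    # Hypothetical model sizes, in billions of parameters.
    for size in (70, 405):
        for dtype in BYTES_PER_PARAM:
            need = weight_footprint_gb(size, dtype)
            verdict = {p: fits(size, dtype, p) for p in PLATFORM_MEMORY_GB}
            print(f"{size}B @ {dtype}: {need} GB weights -> {verdict}")
```

Under these assumptions, a hypothetical 405-billion-parameter model in BF16 needs about 810 GB for weights alone, which fits within 1,024 GB but not within 640 GB. Real deployments need headroom beyond the weights, so fitting here is a necessary condition and not a sufficient one.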

Enterprise Inference at Scale

For organizations deploying inference across multiple nodes, Gaudi 3's Ethernet-native architecture means every accelerator already has its networking built in. No separate NIC purchases, no InfiniBand switches, no proprietary fabric licensing. The platform supports vLLM, Hugging Face TGI, and automated FP8 quantization for production LLM serving.

Multi-Vendor Flexibility

The Dell PowerEdge XE9680 supports Gaudi 3 OAM, NVIDIA H100/H200 SXM, and AMD MI300X, all in the same chassis with different baseboards. You can evaluate all three accelerator platforms on one server platform without locking in a vendor up front. ServerMonkey can configure and quote any of them, so you can compare real pricing and performance before you commit.

Intel Gaudi 3 OAM Platform

Specification                   Gaudi 3 OAM (per card)
Architecture                    Intel Gaudi 3 (5nm)
Compute Engines                 8 MME + 64 TPC
Memory                          128GB HBM2e
Memory Bandwidth                3.7 TB/s
On-Die SRAM                     96MB (12.8 TB/s)
FP8                             1,835 TFLOPS
BF16                            1,835 TFLOPS
Data Types                      FP8, BF16, FP16, TF32, FP32
Networking                      24× 200GbE RoCE v2 on-chip (4.8 Tb/s)
TDP                             450-900W (configurable, air or liquid)
Form Factor                     OAM (HL-325L air / HL-335 liquid)
Platform                        HLB-325 UBB (8× accelerators)
Combined Memory (8-card)        1,024 GB HBM2e
Combined Networking (8-card)    192× 200GbE (6× OSFP800)
Software                        Intel Gaudi Software, PyTorch, vLLM, Hugging Face
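
The 8-card figures quoted on this page follow directly from the per-card specifications. The Python sketch below reproduces that arithmetic so the totals can be checked or recomputed for a different card count. The function and key names are illustrative, and the OSFP line assumes each OSFP800 connector carries 800 Gb/s, which the table implies but does not state.

```python
# Per-card values copied from the specification table above.
PER_CARD = {
    "memory_gb": 128,     # HBM2e per accelerator
    "ports_200gbe": 24,   # on-chip 200GbE RoCE v2 ports
    "port_gbps": 200,     # line rate of each port
    "peak_tflops": 1835,  # FP8 and BF16 peak
}

CARDS_PER_PLATFORM = 8
OSFP_CONNECTORS = 6
OSFP_GBPS = 800  # ASSUMPTION: 800 Gb/s per OSFP800 connector


def platform_totals(cards=CARDS_PER_PLATFORM, spec=PER_CARD):
    """Derive platform-level totals from per-card figures."""
    return {
        "memory_gb": cards * spec["memory_gb"],
        "ports_200gbe": cards * spec["ports_200gbe"],
        "peak_pflops": cards * spec["peak_tflops"] / 1000,
        "per_card_network_tbps": spec["ports_200gbe"] * spec["port_gbps"] / 1000,
        "osfp_aggregate_tbps": OSFP_CONNECTORS * OSFP_GBPS / 1000,
    }


if __name__ == "__main__":
    for name, value in platform_totals().items():
        print(f"{name}: {value}")
```

With the table's inputs this gives 1,024 GB of memory, 192 ports, 14.68 PFLOPS (shown rounded as 14.7 elsewhere on the page), and 4.8 Tb/s of on-chip networking per card.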
 

Building an Open AI Cluster?

The Gaudi 3 OAM platform scales with standard Ethernet. ServerMonkey can help you evaluate it alongside NVIDIA and AMD options.

Request a Quote
