Intel Gaudi 3 PCIe: 128GB AI Accelerator

Overview

AI Acceleration Without Vendor Lock-In

The Intel Gaudi 3 PCIe card (HL-338) packs 128GB of HBM2e memory, 3.7 TB/s bandwidth, and 24 on-chip 200GbE RoCE v2 networking ports into a standard PCIe Gen5 dual-slot card. That last point is what makes it different from every other accelerator in this lineup: the networking is built into the chip, not bolted on as separate NICs.

This means you can scale Gaudi 3 clusters using the Ethernet switches you already own instead of investing in proprietary interconnects. It supports PyTorch natively, integrates with Hugging Face and vLLM, and handles LLM inference, fine-tuning, and training workloads with automated FP8 quantization.

Gaudi 3 PCIe vs. H100 NVL PCIe

The buyer's real question: open Ethernet scaling with more memory, or the established CUDA ecosystem?

	Gaudi 3 PCIe	H100 NVL PCIe
Memory	128GB HBM2e	94GB HBM3
Bandwidth	3.7 TB/s	3.9 TB/s
BF16	1,835 TFLOPS	1,671 TFLOPS*
FP8	1,835 TFLOPS	3,341 TFLOPS*
On-Chip NICs	24× 200GbE	None
TDP	600W	350-400W
Software	PyTorch (native)	CUDA

* NVIDIA figures are quoted with sparsity.

128GB HBM2e Memory

24 On-Chip 200GbE Ports

1.8 PFLOPS FP8 / BF16

600W PCIe Gen5 Dual-Slot

Compatible Servers

Dell PowerEdge Server for Gaudi 3 PCIe

Dell is the lead OEM and first to market with an integrated Gaudi 3 PCIe server configuration.

Dell PowerEdge XE7740

4U server, up to 8× Intel Gaudi 3 PCIe accelerators
Optional: two groups of 4-way bridged accelerators (RoCE v2)
1:1 accelerator-to-NIC ratio via 8 full-height PCIe slots + OCP module
Air-cooled, fits ~10kW racks without cooling upgrades
Optimized for Llama, DeepSeek, Phi, Qwen, Falcon, and more
Dell Smart Cooling, OpenManage Enterprise, APEX AIOps
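
As a rough sizing aid for a fully populated chassis, the per-card figures from the specification table can be multiplied out. This is a minimal sketch, not a vendor sizing tool: it counts accelerators only, uses the card's maximum TDP, and treats the ~10kW rack figure as a hard budget. CPU, memory, fan, and power-supply overhead are not included, and the names below are illustrative.

```python
# Per-card figures come from the spec table on this page.
CARDS_PER_SERVER = 8      # maximum cards in the XE7740
HBM_PER_CARD_GB = 128     # HBM2e per card
TDP_PER_CARD_W = 600      # maximum TDP per card
RACK_BUDGET_W = 10_000    # the ~10kW rack figure, treated here as a hard limit

def server_totals(cards=CARDS_PER_SERVER):
    """Return (aggregate HBM in GB, worst-case accelerator power in W)."""
    return cards * HBM_PER_CARD_GB, cards * TDP_PER_CARD_W

total_hbm_gb, accel_power_w = server_totals()
# Power left in the rack budget for everything that is not an accelerator.
headroom_w = RACK_BUDGET_W - accel_power_w

print(f"{total_hbm_gb} GB HBM, {accel_power_w} W for accelerators, "
      f"{headroom_w} W left in the rack budget")
```

With eight cards this works out to 1,024GB of HBM2e and 4,800W of accelerator power, leaving 5,200W of the rack budget for the rest of the system.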

Use Cases

Where Gaudi 3 PCIe Fits

LLM Inference & Fine-Tuning

128GB of HBM2e lets larger models fit on a single card, which reduces or avoids the overhead of splitting a model across devices. The 24 integrated 200GbE ports remove the need for separate NICs, cutting cost and latency in multi-node inference clusters. Native vLLM and Hugging Face support with automated FP8 quantization simplifies deployment of popular models including Llama, DeepSeek, and Falcon.
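
To see why the data type matters for single-card fit, a back-of-the-envelope check of weight size against the 128GB capacity can help. The sketch below is an estimate under stated assumptions: it counts model weights only, uses decimal gigabytes, and the `reserve_gb` parameter is a hypothetical placeholder for KV cache, activations, and runtime overhead, which vary by workload.

```python
# Bytes needed to store one parameter in each data type.
BYTES_PER_PARAM = {"fp8": 1, "bf16": 2, "fp32": 4}
HBM_GB = 128  # card capacity from the spec table

def weights_gb(params_billion, dtype):
    """Weight footprint in GB: 1e9 params times bytes per param, over 1e9 bytes per GB."""
    return params_billion * BYTES_PER_PARAM[dtype]

def fits_on_one_card(params_billion, dtype, reserve_gb=0.0):
    """True if weights plus an assumed reserve fit in one card's HBM."""
    return weights_gb(params_billion, dtype) + reserve_gb <= HBM_GB

for dtype in ("bf16", "fp8"):
    print(dtype, weights_gb(70, dtype), "GB, fits:", fits_on_one_card(70, dtype))
```

By this estimate a 70-billion-parameter model needs about 140GB for weights in BF16, which exceeds one card, and about 70GB in FP8, which fits with room left for cache and activations.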

Ethernet-Native Scaling

The other accelerators in this lineup rely on separate NICs for inter-node communication. Gaudi 3 integrates 24× 200GbE RoCE v2 ports directly on the chip, for 24 × 200 Gb/s = 4.8 Tb/s of networking bandwidth per card. This means you can build multi-node training and inference clusters using the standard Ethernet switches you already own, without investing in specialized interconnect hardware such as NVLink or InfiniBand.
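
The 4.8 Tb/s figure is the sum of the raw port line rates. A short sketch of the arithmetic, which also converts it to bytes per second for comparison with the memory bandwidth numbers on this page, is shown here. It assumes raw line rate in one direction and ignores Ethernet and RoCE protocol overhead, so achievable throughput will be lower.

```python
PORTS = 24          # on-chip Ethernet ports per card
PORT_GBPS = 200     # line rate per port, in gigabits per second

def aggregate_gbps(ports=PORTS, port_gbps=PORT_GBPS):
    """Total raw line rate across all ports, in Gb/s."""
    return ports * port_gbps

total_gbps = aggregate_gbps()
total_tbps = total_gbps / 1000       # gigabits to terabits
total_gbytes_per_s = total_gbps / 8  # bits to bytes

print(f"{total_tbps} Tb/s raw, about {total_gbytes_per_s:.0f} GB/s")
```

That is roughly 600 GB/s of raw network bandwidth per card, compared with 3.7 TB/s of HBM bandwidth.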

Open Software & No Lock-In

Gaudi 3 integrates natively with PyTorch, so your team works with the framework they already know. Hugging Face model hub support and automated FP8 quantization simplify deployment. Unlike proprietary ecosystems, Intel's software stack is open, and the hardware scales over standard Ethernet. For organizations building AI infrastructure that they want to own and control, Gaudi 3 removes the lock-in concern.

Specifications

Intel Gaudi 3 PCIe (HL-338)

Specification	Gaudi 3 PCIe
Architecture	Intel Gaudi 3 (5nm)
Compute Engines	8 MME + 64 TPC
Memory	128GB HBM2e
Memory Bandwidth	3.7 TB/s
On-Die SRAM	96MB (12.8 TB/s)
FP8	1,835 TFLOPS
BF16	1,835 TFLOPS
Data Types	FP8, BF16, FP16, TF32, FP32
Networking	24× 200GbE RoCE v2 on-chip (4.8 Tb/s)
Host Interface	PCIe Gen5 x16
TDP	Up to 600W
Form Factor	Dual-slot, FHFL PCIe
Thermal	Passive
Software	Intel Gaudi Software, PyTorch, vLLM, Hugging Face

Ready to Evaluate Intel Gaudi 3?

Open ecosystem, standard Ethernet, no vendor lock-in. ServerMonkey can configure the Dell PowerEdge XE7740 with Gaudi 3 PCIe.
