NVIDIA H200 Tensor Core GPU: 141GB HBM3e

Introduction

The GPU for Generative AI and HPC

The NVIDIA H200 GPU supercharges generative AI and high-performance computing (HPC) workloads with game-changing performance and memory capabilities. As the first GPU with HBM3e, the H200’s larger and faster memory fuels the acceleration of generative AI and large language models (LLMs) while advancing scientific computing for HPC workloads.

NVIDIA Supercharges Hopper, the World’s Leading AI Computing Platform

The NVIDIA HGX H200 features the NVIDIA H200 GPU with advanced memory to handle massive amounts of data for generative AI and high-performance computing workloads.

Highlights

Experience Next-Level Performance

1.9X faster Llama2 70B Inference

1.6X faster GPT-3 175B Inference

110X faster High-Performance Computing

Benefits

Higher Performance With Larger, Faster Memory

Based on the NVIDIA Hopper™ architecture, the NVIDIA H200 is the first GPU to offer 141 gigabytes (GB) of HBM3e memory at 4.8 terabytes per second (TB/s). That is nearly double the capacity of the NVIDIA H100 GPU, with 1.4X more memory bandwidth. The H200’s larger and faster memory accelerates generative AI and LLMs, while advancing scientific computing for HPC workloads with better energy efficiency and lower total cost of ownership.
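To see where the "nearly double" and "1.4X" comparisons come from, the arithmetic can be sketched in a few lines of Python. The H200 figures are the ones quoted above; the H100 figures (80GB and 3.35TB/s for the SXM variant) are not stated on this page and are assumed here from NVIDIA's public H100 specifications.

```python
# Rough comparison of H200 and H100 memory figures.
# H200 values come from the text above. The H100 SXM values are
# ASSUMPTIONS taken from NVIDIA's public H100 specifications.
H200_MEMORY_GB = 141
H200_BANDWIDTH_TBPS = 4.8
H100_MEMORY_GB = 80          # assumed, not stated on this page
H100_BANDWIDTH_TBPS = 3.35   # assumed, not stated on this page


def ratio(new: float, old: float) -> float:
    """Return how many times larger `new` is than `old`."""
    if old <= 0:
        raise ValueError("baseline must be positive")
    return new / old


capacity_ratio = ratio(H200_MEMORY_GB, H100_MEMORY_GB)
bandwidth_ratio = ratio(H200_BANDWIDTH_TBPS, H100_BANDWIDTH_TBPS)

print(f"Memory capacity:  {capacity_ratio:.2f}x")   # 1.76x
print(f"Memory bandwidth: {bandwidth_ratio:.2f}x")  # 1.43x
```

Under these assumptions the capacity ratio is about 1.76X and the bandwidth ratio about 1.43X, consistent with the rounded figures quoted above.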

Unlock Insights With High-Performance LLM Inference

In the ever-evolving landscape of AI, businesses rely on LLMs to address a diverse range of inference needs. An AI inference accelerator must deliver the highest throughput at the lowest TCO when deployed at scale for a massive user base.

The H200 boosts inference speed by up to 2X compared to H100 GPUs when handling LLMs like Llama2.

Supercharge High-Performance Computing

Memory bandwidth is crucial for HPC applications because it enables faster data transfer and reduces bottlenecks in complex processing. For memory-intensive HPC applications like simulations, scientific research, and artificial intelligence, the H200’s higher memory bandwidth ensures that data can be accessed and manipulated efficiently, leading to up to 110X faster time to results compared to CPUs.
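As a rough illustration of why bandwidth matters for memory-bound work, the sketch below estimates the minimum time to read a working set from GPU memory once at the quoted 4.8TB/s peak. The function name and the 100GB working set are hypothetical, chosen only for illustration. Real kernels rarely sustain peak bandwidth, so treat the result as a lower bound rather than a prediction.

```python
# Back-of-the-envelope lower bound for a memory-bound pass over data.
# Uses decimal units (1 GB = 1e9 bytes, 1 TB/s = 1e12 bytes/s).
H200_BANDWIDTH_TBPS = 4.8  # peak figure quoted on this page


def min_sweep_time_ms(working_set_gb: float,
                      bandwidth_tbps: float = H200_BANDWIDTH_TBPS) -> float:
    """Minimum time in milliseconds to read `working_set_gb` once,
    assuming the full peak bandwidth is sustained (an idealization)."""
    if working_set_gb < 0 or bandwidth_tbps <= 0:
        raise ValueError("working set must be >= 0 and bandwidth > 0")
    total_bytes = working_set_gb * 1e9
    bytes_per_second = bandwidth_tbps * 1e12
    return total_bytes / bytes_per_second * 1e3


example_gb = 100.0  # hypothetical working set, for illustration only
print(f"{example_gb:.0f} GB at {H200_BANDWIDTH_TBPS} TB/s: "
      f"at least {min_sweep_time_ms(example_gb):.1f} ms per full pass")
```

For the hypothetical 100GB working set this gives roughly 20.8 ms per pass. Any workload that must stream its data repeatedly scales with this figure, which is why higher bandwidth translates into shorter time to results.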

Reduce Energy and TCO

With the introduction of the H200, energy efficiency and TCO reach new levels. This cutting-edge technology offers unparalleled performance, all within the same power profile as the H100. AI factories and supercomputing systems that are not only faster but also more eco-friendly deliver an economic edge that propels the AI and scientific community forward.

H200 NVL

Accelerating AI for Mainstream Enterprise Servers With H200 NVL

NVIDIA H200 NVL is ideal for lower-power, air-cooled enterprise rack designs that require flexible configurations, delivering acceleration for every AI and HPC workload regardless of size. With up to four GPUs connected by NVIDIA NVLink™ and a 1.5X memory increase, large language model (LLM) inference can be accelerated up to 1.7X, and HPC applications achieve up to 1.3X more performance over the H100 NVL.

Server Compatibility

Compatible ServerMonkey servers

Dell PowerEdge XE7740

Supports up to 8× double-wide or 16× single-wide GPUs
Optimized for AI workloads

Dell PowerEdge XE7745

Supports up to 8× double-wide or 16× single-wide GPUs
Built for dense AI workloads

Dell PowerEdge R770

High-performance 2U enterprise server
Built for scalable and mixed workloads

Specifications

NVIDIA H200 GPU

	H200 SXM¹	H200 NVL¹
FP64	34 TFLOPS	30 TFLOPS
FP64 Tensor Core	67 TFLOPS	60 TFLOPS
FP32	67 TFLOPS	60 TFLOPS
TF32 Tensor Core²	989 TFLOPS	835 TFLOPS
BFLOAT16 Tensor Core²	1,979 TFLOPS	1,671 TFLOPS
FP16 Tensor Core²	1,979 TFLOPS	1,671 TFLOPS
FP8 Tensor Core²	3,958 TFLOPS	3,341 TFLOPS
INT8 Tensor Core²	3,958 TFLOPS	3,341 TFLOPS
GPU Memory	141GB	141GB
Decoders	7 NVDEC 7 JPEG	7 NVDEC 7 JPEG
Confidential Computing	Supported	Supported
Max Thermal Design Power (TDP)	Up to 700W (configurable)	Up to 600W (configurable)
Multi-Instance GPUs	Up to 7 MIGs @18GB each	Up to 7 MIGs @16.5GB each
Form Factor	SXM	PCIe Dual-slot air-cooled
Interconnect	NVIDIA NVLink™: 900GB/s PCIe Gen5: 128GB/s	2- or 4-way NVIDIA NVLink bridge: 900GB/s per GPU PCIe Gen5: 128GB/s
Server Options	NVIDIA HGX™ H200 partner and NVIDIA-Certified Systems™ with 4 or 8 GPUs	NVIDIA MGX™ H200 NVL partner and NVIDIA-Certified Systems with up to 8 GPUs
NVIDIA AI Enterprise	Add-on	Included

¹ Preliminary specifications. May be subject to change.
² With sparsity.
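Two quick sanity checks on the table can be sketched in Python. The first estimates dense Tensor Core throughput from the "with sparsity" figures, assuming the sparsity feature doubles peak throughput (as with NVIDIA's 2:4 structured sparsity); that halving rule is an assumption and is not stated on this page. The second confirms that the listed MIG slices fit within the 141GB of GPU memory. All function and variable names are illustrative.

```python
# Sanity checks on the specification table above.
# ASSUMPTION: "with sparsity" figures are 2x the dense peak, so the
# dense estimate is half the listed value. Not stated on this page.
SPARSE_TENSOR_TFLOPS = {
    "TF32": {"H200 SXM": 989, "H200 NVL": 835},
    "FP16": {"H200 SXM": 1979, "H200 NVL": 1671},
    "FP8": {"H200 SXM": 3958, "H200 NVL": 3341},
}
# (instance count, GB per instance) as listed in the table.
MIG_CONFIG = {"H200 SXM": (7, 18.0), "H200 NVL": (7, 16.5)}
GPU_MEMORY_GB = 141


def dense_estimate(sparse_tflops: float) -> float:
    """Estimate dense throughput as half the sparse figure."""
    return sparse_tflops / 2


def mig_total_gb(instances: int, gb_each: float) -> float:
    """Total memory claimed by all MIG instances."""
    return instances * gb_each


for fmt, by_model in SPARSE_TENSOR_TFLOPS.items():
    for model, sparse in by_model.items():
        print(f"{model} {fmt}: ~{dense_estimate(sparse):.1f} TFLOPS dense (estimated)")

for model, (count, size) in MIG_CONFIG.items():
    total = mig_total_gb(count, size)
    print(f"{model}: {count} x {size} GB = {total} GB "
          f"({GPU_MEMORY_GB - total} GB not assigned to MIG slices)")
```

Under these assumptions, seven 18GB slices on the SXM variant account for 126GB and seven 16.5GB slices on the NVL variant account for 115.5GB, both within the 141GB total.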

Interested in This GPU?

Get pricing, availability, and compatibility details from the ServerMonkey team.
