During its GTC Spring 2023 keynote, Nvidia revealed its latest product, the H100 NVL. Nvidia is adding something to this card that we haven't seen in some time: dual-GPU operation. Unlike its dual-GPU predecessors, this product is not aimed at the gaming market and does not support SLI or multi-GPU gaming. Instead, it is designed specifically for the rapidly expanding AI industry. The H100 NVL, also known as the H100 NVLink, carries three NVLink connectors along its top edge and occupies two separate PCIe slots, with the two adjacent cards operating together. Nvidia has released images and information about the product.
This is an interesting change of direction: Nvidia appears to be tailoring the product to servers that do not support the SXM form factor, with the emphasis on inference performance rather than training. To that end, the H100 NVL uses NVLink bridge connections to make up for the bandwidth that NVSwitch would otherwise provide on the SXM solutions. There are also some other notable differences in the product design.
The H100 NVL is the top-of-the-line product in Nvidia's Hopper series: a variant of the H100 data-center accelerator optimized for a single purpose, running AI language models such as ChatGPT. The "NVL" in its name stands for NVLink, the technology used to connect the two PCIe cards of the dual-GPU design; three NVLink Gen4 bridges facilitate this connection.
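For anyone who ends up deploying one of these cards, the NVLink bridge topology can be inspected programmatically through NVIDIA's NVML library. Below is a minimal sketch using the pynvml Python bindings (not something Nvidia has shown for the NVL specifically); it assumes pynvml is installed and simply reports which NVLink links on GPU 0 are active and what sits at the far end of each.

```python
import pynvml

pynvml.nvmlInit()
try:
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # first GPU in the system
    # Probe every possible NVLink link; absent links raise an NVMLError.
    for link in range(pynvml.NVML_NVLINK_MAX_LINKS):
        try:
            state = pynvml.nvmlDeviceGetNvLinkState(handle, link)
        except pynvml.NVMLError:
            continue  # link not present or not supported on this GPU
        if state == pynvml.NVML_FEATURE_ENABLED:
            # Report the PCI address of the device on the other end of the link.
            remote = pynvml.nvmlDeviceGetNvLinkRemotePciInfo(handle, link)
            print(f"link {link}: active, remote device at {remote.busId}")
finally:
    pynvml.nvmlShutdown()
```

On a bridged pair, the remote PCI address for each active link should resolve to the adjacent card, which is a quick way to confirm the bridges are seated and enabled.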
Aside from its dual-GPU design, the NVL variant holds an advantage over other H100 GPUs in memory capacity. Each GPU carries six active stacks of HBM3 memory, for a combined total of 188 GB of high-speed buffer across the pair. This works out to a somewhat unusual capacity, however, with only 94 GB available per GPU instead of the expected 96 GB.
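The arithmetic behind those figures is simple, assuming the commonly reported 16 GB per HBM3 stack; the short sketch below just makes the breakdown explicit.

```python
# Breakdown of the H100 NVL memory capacity, assuming 16 GB per HBM3 stack.
STACKS_PER_GPU = 6
GB_PER_STACK = 16
GPUS_PER_PAIR = 2

physical_per_gpu = STACKS_PER_GPU * GB_PER_STACK  # 96 GB on the package
usable_per_gpu = 94                               # what Nvidia actually exposes
total_usable = usable_per_gpu * GPUS_PER_PAIR     # 188 GB across both cards

print(f"physical per GPU: {physical_per_gpu} GB")
print(f"usable per GPU:   {usable_per_gpu} GB "
      f"({physical_per_gpu - usable_per_gpu} GB held back)")
print(f"total usable:     {total_usable} GB")
```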

With its full 6144-bit memory interface, which breaks down into 1024 bits per HBM3 stack, and memory running at up to 5.1 Gbps per pin, the H100 NVL achieves roughly 3.9 TB/s of bandwidth per GPU, or 7.8 TB/s combined, more than double the throughput of the H100 SXM. The ability to move large buffers quickly is crucial for language models, and the increased bandwidth provided by the H100 NVL is sure to have a significant impact on performance.
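As a sanity check on those numbers, here is the standard bandwidth calculation: interface width in bits, divided by eight to convert to bytes, multiplied by the per-pin data rate. Everything follows from the two reported specs.

```python
# Back-of-the-envelope H100 NVL memory bandwidth from the reported specs.
BUS_WIDTH_BITS = 6144   # 6 HBM3 stacks x 1024 bits each
DATA_RATE_GBPS = 5.1    # per-pin data rate, in gigabits per second
GPUS = 2                # the NVL is a two-card, two-GPU product

per_gpu_gbs = BUS_WIDTH_BITS * DATA_RATE_GBPS / 8  # bits -> bytes
print(f"per GPU:  {per_gpu_gbs:,.1f} GB/s (~{per_gpu_gbs / 1000:.1f} TB/s)")
print(f"combined: {per_gpu_gbs * GPUS:,.1f} GB/s "
      f"(~{per_gpu_gbs * GPUS / 1000:.1f} TB/s)")
# per GPU:  3,916.8 GB/s (~3.9 TB/s)
# combined: 7,833.6 GB/s (~7.8 TB/s)
```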