In March, Nvidia introduced its GH100, the first GPU based on the new “Hopper” architecture, which is aimed at both HPC and AI workloads, and importantly for the latter, supports an eight-bit FP8 ...
The chip designer says the Instinct MI325X data center GPU will best Nvidia’s H200 in memory capacity, memory bandwidth and peak theoretical performance for 8-bit floating point and 16-bit floating ...
Essentially all AI training is done with 32-bit floating point. But doing AI inference with 32-bit floating point is expensive, power-hungry and slow. And quantizing models for 8-bit-integer, which is ...