Earlier today, NVIDIA presented Volta, its next-generation GPU architecture, which is designed to increase performance by addressing one of the “eternal” problems that affects GPUs more than any other processor type: memory bandwidth. Volta will use RAM that is layered on top of the GPU silicon to reduce energy consumption and latency during memory accesses. With this, NVIDIA says that it will be able to push 1 terabyte of data per second through the chip. That is roughly 5X the bandwidth of the GeForce Titan, which tops out at 192GB/s.
But why do we need all this bandwidth to start with? While CPU programmers do everything they can to limit memory accesses, graphics programmers have little choice but to use textures, and their number and size have been growing over time. It all started with a single texture applied to polygons to add detail. It quickly grew to include bump maps, which simulate small geometric details, and then displacement maps, which are used during tessellation. To push the boundaries of realism, graphics programmers needed more and more data, and much of it came in the form of textures, which have to be fetched using that bandwidth. Additionally, anti-aliasing and the render-to-texture techniques used for reflections and shadows also consume bandwidth, and gigantic amounts of it.
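To get a feel for the scale, here is a back-of-envelope sketch in Python. The resolution, bytes per sample, anti-aliasing level, overdraw factor and frame rate are illustrative assumptions, not figures from NVIDIA, and the function name is made up for this example. It counts only the writes to a single render target and ignores texture fetches, compression and caches, so real traffic would be considerably higher.

```python
def render_target_traffic_gb_per_s(width, height, bytes_per_sample,
                                   msaa_samples, overdraw, fps):
    """Rough write traffic for one render target, in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_frame = width * height * bytes_per_sample * msaa_samples * overdraw
    return bytes_per_frame * fps / 1e9


# Assumed workload: 1080p, 8 bytes per sample (4 color + 4 depth),
# 4x multisample anti-aliasing, each pixel drawn 3 times on average, 60 fps.
traffic = render_target_traffic_gb_per_s(1920, 1080, 8, 4, 3, 60)

# Bandwidth figures quoted in the article, in GB/s.
titan_bandwidth = 192
volta_bandwidth = 1000

print(f"One render target: {traffic:.2f} GB/s")
print(f"Share of the quoted Titan bandwidth: {traffic / titan_bandwidth:.1%}")
print(f"Volta vs. Titan: {volta_bandwidth / titan_bandwidth:.1f}x")
```

Under these assumptions a single anti-aliased render target already accounts for about 12GB/s of writes. Add several shadow and reflection passes, plus the texture reads behind every pixel, and the total climbs quickly.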
Performance is often limited by bandwidth, so addressing this in such a radical way is going to be interesting to observe… and benchmark!