Researchers from North Carolina State University have published a paper that studies how multi-core processor performance could be improved through better management of bandwidth and prefetching.
For the novice, I’ll provide some context: processors can run at incredible frequencies of 3GHz+, but memory chips often run much slower than that. As a result, the processor may “stall”, sitting idle while it waits for data to arrive.
“Bandwidth” is the quantity of data that circulates through the chip, and prefetching is a technique that loads a piece of data before it is actually needed. This can prevent the processor from stalling later, which results in faster performance (see the sketch below). However, prefetching data that is never actually used wastes bandwidth.

In multi-core systems, things are even more complicated, as each core may be under a different workload. Depending on the processor’s architecture, some cores may have too much (unused) bandwidth available while others don’t have enough. The authors argue that a better allocation of bandwidth would lead to better performance, and there’s frankly no doubt about it.
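To make the prefetching idea concrete, here is a minimal C sketch using the `__builtin_prefetch` hint available in GCC and Clang. Note that the paper is about hardware prefetchers and bandwidth management inside the chip, not this builtin; the function name, array, and look-ahead distance below are purely illustrative.

```c
#include <stddef.h>

/* Sum an array while hinting the CPU to pull data into the cache a
 * few iterations ahead of where the loop is currently reading. If the
 * hint is right, the data is already in cache when the loop reaches
 * it and the processor never stalls; if the data is never used, the
 * prefetch just burned memory bandwidth for nothing. */
double sum_with_prefetch(const double *data, size_t n)
{
    const size_t ahead = 16;  /* illustrative look-ahead distance */
    double total = 0.0;

    for (size_t i = 0; i < n; i++) {
        if (i + ahead < n)
            /* args: address, 0 = read access, 1 = low temporal locality */
            __builtin_prefetch(&data[i + ahead], 0, 1);
        total += data[i];
    }
    return total;
}
```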
The research team says that, depending on the application, it’s possible to get a 10% to 40% performance improvement. That’s true, but while some websites were very happy to report such potential improvements, they forgot to mention that the applications tested were mainly pure number crunching and data compression (which is, in a way, pure number crunching too). Both workloads fundamentally stress bandwidth and the memory subsystem, and that explains those great performance percentages.
In short, this is a best-case scenario for the most bandwidth-hungry applications. For regular users like you and me, those gains are quite remote. That said, I don’t want to dismiss their work. On the contrary, it is with research just like this that processor manufacturers have steadily improved performance for everyone for decades. However, it’s only fair to deflate some of the hype that might make you think your computer is about to be 40% faster.
This is a great reminder that merely adding transistors to a chip does not do all the work. Chip architects and software researchers should get a lot of credit for the computing power boom that is still going on.
Link: research from Fang Liu and Yan Solihin of North Carolina State University. The complete paper will be presented on June 9 in San Jose, CA, but you can read the abstract now to get a taste of it.