The upcoming Tegra mobile chip from NVIDIA does not contain 4 cores, but 5, and NVIDIA has optimized its new chip to excel at both high performance and low power, two goals that are often contradictory. In a paper entitled “Variable SMP – A Multi-Core CPU Architecture for Low Power and High Performance”, NVIDIA reveals how it has designed a quad-core chip with a 5th “companion core” to achieve this seemingly impossible goal.
The root of the problem can be found in the process used to build computer chips. Every chip leaks some amount of power; this is a natural phenomenon. In the context of this article, power leakage is the electric current that is consumed when the chip is in idle state. Although undesirable, especially on mobile devices, it cannot be avoided, and the semiconductor industry has developed techniques to reduce it.
But chips optimized for low leakage in idle state tend to consume more power than non-optimized chips during intense workloads, and people obviously want a mobile device that consumes a minimal amount of power at all times.
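To make that trade-off concrete, here is a toy model that treats a core's power as static leakage plus dynamic power, with dynamic power approximated by the usual capacitance × voltage² × frequency relation. Every number below is hypothetical and chosen only for illustration. The sketch also assumes that a low-leakage core needs a higher voltage to reach high clock speeds, which the article itself does not state.

```python
def core_power(leakage_w, voltage, freq_ghz, cap=1.0):
    """Total power in watts: static leakage plus dynamic power (cap * V^2 * f)."""
    return leakage_w + cap * voltage ** 2 * freq_ghz

# Hypothetical figures, chosen only to illustrate the trade-off.
# A low-leakage core idles cheaply but is assumed to need more voltage at 1 GHz.
low_leak_light = core_power(leakage_w=0.01, voltage=0.8, freq_ghz=0.1)
low_leak_heavy = core_power(leakage_w=0.01, voltage=1.3, freq_ghz=1.0)

# A fast core leaks more but is assumed to reach 1 GHz at a lower voltage.
fast_light = core_power(leakage_w=0.10, voltage=0.8, freq_ghz=0.1)
fast_heavy = core_power(leakage_w=0.10, voltage=1.0, freq_ghz=1.0)

print(f"light load: low-leakage {low_leak_light:.3f} W, fast {fast_light:.3f} W")
print(f"heavy load: low-leakage {low_leak_heavy:.3f} W, fast {fast_heavy:.3f} W")
```

With these made-up values the low-leakage core wins at light load and loses at heavy load, which is the tension described above.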
Introducing the 5th “companion” core
The semiconductor reality is not going anywhere, so NVIDIA took a rather interesting approach to the problem: Tegra 3 uses a 5th, companion core to handle all the “low-power” tasks, like running the operating system in sleep mode, checking emails and notifications, and keeping the system alive when you are reading a book or playing media files. When that companion core is working, the “normal” high-performance cores can be shut down.
Because this companion core is optimized for low power, NVIDIA doesn’t want it to handle heavy workloads, or it would start consuming too much. To prevent that, its frequency is limited to a range of 0 to 0.5GHz. Whenever the companion core is overwhelmed by work, one or several high-performance cores wake up and pick up the work. This is NVIDIA’s definition and implementation of Variable Symmetric Multiprocessing (vSMP), which it has patented.
Automatic Core Switching
In its paper, NVIDIA says that the operating system (Android 3.0, aka Honeycomb) assumes that all CPU cores in the chip are identical instances, which is not true in this case. Therefore, special management had to be devised at both the hardware and software levels to make this heterogeneous group of cores completely transparent to the OS.
Cores are switched ON and OFF depending on a real-time analysis of the workload, as the diagram above shows. The only “limitation” seems to be that the companion core cannot be active while any of Cores 1-4 are. NVIDIA says that not allowing the companion core and the high-performance cores to run at the same time simplifies cache memory management and avoids performance penalties that would have hindered the high-performance cores.
Making this transparent to the OS is very important for many reasons, but for end-users, it means that OS updates don’t have to wait for NVIDIA to tweak some code.
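The switching rules described here can be sketched in a few lines. This is not NVIDIA's actual logic, only a minimal illustration of the two constraints in the article: the companion core is capped at 0.5GHz, and it never runs at the same time as the main cores. Expressing the workload as an aggregate "GHz of demand", the `select_cores` function and the 1.0GHz per-core ceiling for the main cores are all hypothetical simplifications.

```python
import math

COMPANION_MAX_GHZ = 0.5  # cap mentioned in the article
MAIN_CORES = 4           # high-performance cores

def select_cores(demand_ghz, main_core_max_ghz=1.0):
    """Return (companion_on, main_cores_on) for an aggregate demand in GHz.

    main_core_max_ghz is a hypothetical per-core ceiling, not an NVIDIA figure.
    """
    if demand_ghz <= COMPANION_MAX_GHZ:
        # Light work: the companion core runs alone, main cores are off.
        return True, 0
    # Heavier work: the companion core is switched off and just enough
    # main cores are woken up, up to all four.
    needed = math.ceil(demand_ghz / main_core_max_ghz)
    return False, min(MAIN_CORES, needed)

for demand in (0.0, 0.3, 0.8, 2.5, 10.0):
    print(demand, select_cores(demand))
```

Whatever the demand, the function never returns the companion core together with a non-zero number of main cores, which mirrors the mutual exclusion NVIDIA describes.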
vSMP Power benefits
Logically, Tegra 3 shows power benefits even when compared to the current generation Tegra 2 processor. According to NVIDIA, that is true during sleep state (LP0), media playback and even gaming. The graph above shows the power savings that NVIDIA has seen during its own tests. Remember that this shows only the power saved at the chip level, not at the system (including display) level.
NVIDIA also provides perf/Watt comparisons with other high-profile chips that are on the market, such as the OMAP4 and the Qualcomm QC8660. Note that NVIDIA is using CoreMark, a well-known benchmark that is very multi-core friendly (performance is more or less expected to scale with the number of cores). A quad-core Tegra 3 chip won’t have any difficulty winning the absolute score, but I find it very interesting to see that at comparable performance, Tegra 3 can consume only 1/3 of the electric power.
Conclusion
Today’s release provides a very interesting insight into the low-power/high-performance strategy that NVIDIA has adopted. It is quite innovative and while embedding CPU cores with different capabilities is not something new, it is the first time that it has been used in this way in a high-end mobile processor.
However, you should also keep in mind that the performance figures above reflect a best-case scenario that relies on applications being able to scale with more cores, which is not always true. Multi-threaded programming remains an active subject of research, and it remains to be seen how much real apps will scale. At this point, NVIDIA has remained tight-lipped about the graphics performance, so there is much more to learn about Tegra 3.
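To get a feel for why scaling is not guaranteed, the classic rule of thumb is Amdahl's law, which is not part of NVIDIA's paper but is a standard way to reason about this. It says that if only a fraction of a program can run in parallel, the serial remainder limits the overall speedup. The parallel fractions below are hypothetical examples, not measurements of any real app.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Ideal speedup on `cores` cores when only `parallel_fraction` of the work scales."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)

# Hypothetical parallel fractions, shown for 4 cores.
for fraction in (1.0, 0.9, 0.5, 0.0):
    print(f"{fraction:.0%} parallel -> {amdahl_speedup(fraction, 4):.2f}x on 4 cores")
```

Only a perfectly parallel workload gets the full 4x from four cores; anything with a meaningful serial portion falls short of that.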
Nevertheless, NVIDIA’s companion core approach should yield a tantalizing battery life for what every phone does the most: sleep. We can agree that every single mobile user cares about battery life, even if they may not care about high-performance. From that perspective, the architecture of Tegra 3 can bring real and significant benefits for all users.