Samsung discussed details of their Exynos Octa processor design at the 2013 IEEE International Solid-State Circuits Conference
Samsung raised the bar for CPU core count in mobile application processors at the 2013 International Consumer Electronics Show (CES) in Las Vegas, with their announcement of the next-generation Exynos Octa. The Exynos Octa will incorporate ARM's big.LITTLE configuration with two four-core clusters: a quartet of large, high-performance Cortex-A15 cores trading off operation with a quartet of smaller, power-saving Cortex-A7s. This week, at the 60th International Solid-State Circuits Conference (ISSCC) in San Francisco, Samsung's lead designer provided more detail on the new SoC, in a presentation titled "28nm High-K Metal Gate heterogeneous quad-core CPUs for high-performance and energy-efficient mobile application processor".
Samsung's Youngmin Shin said that the objective for the dual-quad architecture was to have the more energy-efficient A7s handle the majority of the workload in a mobile device, while the A15s take over for the most compute-intensive tasks. However, adding the second quad CPU for peak performance requirements comes at the expense of a large percentage of the SoC die area, with the A15s occupying 5X the silicon real estate (19 mm² vs. 3.8 mm²) of the A7 quad.
The cost in power appeared to be even greater in Samsung's ISSCC presentation, in which they rated the A15 CPU at nearly 6X the energy consumption of the A7 quad. Shin's power-performance graph indicated an even higher (worse) power ratio of about 12X at peak, with the big quad hitting a maximum of 30,000 DMIPS at 6 W, and the smaller quad maxing out in the neighborhood of 8,000 DMIPS at approximately 0.5 W. The Exynos Octa design relies on dynamic voltage and frequency scaling to manage power, and each quad can be stepped from complete shutdown, with all cores off, up to full performance by turning on cores individually through power and clock gating.
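The ratios implied by these figures can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the inputs are the peak area, DMIPS, and power values quoted above, the A7 numbers are approximate ("in the neighborhood of"), and the variable names are illustrative rather than anything from Samsung's paper.

```python
# Peak figures as reported from the ISSCC presentation (A7 values are approximate).
A15_AREA_MM2, A7_AREA_MM2 = 19.0, 3.8
A15_DMIPS, A15_WATTS = 30_000, 6.0
A7_DMIPS, A7_WATTS = 8_000, 0.5

# Big-quad to little-quad ratios.
area_ratio = A15_AREA_MM2 / A7_AREA_MM2    # silicon area
power_ratio = A15_WATTS / A7_WATTS         # peak power
perf_ratio = A15_DMIPS / A7_DMIPS          # peak Dhrystone throughput

# Peak efficiency of each quad in DMIPS per watt.
a15_dmips_per_watt = A15_DMIPS / A15_WATTS
a7_dmips_per_watt = A7_DMIPS / A7_WATTS

print(f"area ratio:  {area_ratio:.1f}x")
print(f"power ratio: {power_ratio:.1f}x")
print(f"perf ratio:  {perf_ratio:.2f}x")
print(f"A15 quad: {a15_dmips_per_watt:.0f} DMIPS/W, A7 quad: {a7_dmips_per_watt:.0f} DMIPS/W")
```

On these numbers the big quad delivers under 4X the peak throughput for roughly 12X the peak power, which is the tradeoff the audience questioned.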
As in the first 32nm quad-core Cortex-A9-based Exynos, which Samsung described in much greater detail at ISSCC last year, Samsung makes extensive use of analog circuit techniques to tune power and performance with dynamic forward and reverse body-bias controls. To further optimize performance and lower power, the Exynos designers tweaked their Register-Transfer Level library to reduce the gate count produced by logic synthesis, and customized their cell library for critical data paths.
Shin said that the "big" CPUs in Octa can operate from 200 MHz to greater than 1.8 GHz, while the "LITTLE" cores run from 200 MHz to greater than 1.2 GHz, depending on manufacturing process variability. The two sets of processors share a cache-coherent interconnect bus, with a 2 MB L2 data cache for the A15s, and a 512 KB L2 for the A7s. In a whitepaper on the benefits of the big.LITTLE architecture, Samsung says that the cache-coherent bus enables applications to be switched between CPUs in "less than 20,000 clock cycles".
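To put the "less than 20,000 clock cycles" figure in wall-clock terms, the sketch below converts cycles to microseconds at a few clock rates. The whitepaper quote does not say which clock the cycle count refers to, so the frequencies used here (the 200 MHz floor and the two quoted upper figures) are assumptions for illustration, and `switch_time_us` is a hypothetical helper name.

```python
SWITCH_CYCLES = 20_000  # upper bound quoted in the Samsung whitepaper

def switch_time_us(freq_hz, cycles=SWITCH_CYCLES):
    """Upper bound on switch time in microseconds at a given clock frequency."""
    return cycles / freq_hz * 1e6

# Assumed clock rates; the source does not state which clock applies.
for label, freq_hz in [("200 MHz", 200e6), ("1.2 GHz", 1.2e9), ("1.8 GHz", 1.8e9)]:
    print(f"{label}: under {switch_time_us(freq_hz):.1f} us")
```

Under any of these assumptions the switch completes within about 100 microseconds.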
While designers in attendance at the ISSCC presentation questioned the cost-benefit of the big.LITTLE architecture, based on the described power-performance-area tradeoffs, data from the conference paper could be misleading. As the speaker stated during the closing Q&A session, his perspective in the presentation was that of a circuit designer. The actual operation of the Exynos processor depends on the application software and operating system. The Samsung whitepaper authors provide more insight on the power tradeoffs in their discussion of peak versus average power. The Exynos big.LITTLE architecture depends on completing tasks faster, albeit at the expense of higher peak power. The assumption is that the ability to turn off cores for longer periods of time will more than offset that higher peak power and reduce average power.
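The peak-versus-average argument can be illustrated with a simple energy model: energy equals power multiplied by active time, and a faster cluster is active for less time. The sketch below is not Samsung's analysis. It assumes each quad runs at the peak DMIPS and power figures reported from the presentation, that a power-gated quad draws nothing, and that a hypothetical `other_watts` term stands in for the rest of the system that must stay awake while the CPU is busy; the workload size is also made up.

```python
def task_energy_joules(work_mdi, dmips, cpu_watts, other_watts=0.0):
    """Energy to finish a fixed workload, then power-gate.

    work_mdi:    workload in millions of Dhrystone instructions (hypothetical)
    dmips:       cluster throughput in millions of instructions per second
    cpu_watts:   cluster power while active
    other_watts: assumed power of everything that must stay on while the CPU runs
    """
    active_seconds = work_mdi / dmips
    return (cpu_watts + other_watts) * active_seconds

WORK = 3_000  # hypothetical workload

for other in (0.0, 2.0):
    big = task_energy_joules(WORK, 30_000, 6.0, other)
    little = task_energy_joules(WORK, 8_000, 0.5, other)
    print(f"other={other} W: A15 quad {big:.4f} J, A7 quad {little:.4f} J")
```

With no other power draw, the A7 quad finishes the job on less energy despite taking longer. Once the assumed system overhead is large enough (above 1.5 W in this model), finishing sooner on the A15 quad uses less total energy, which is the kind of benefit the whitepaper's assumption relies on.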
Nevertheless, the Exynos design objectives stand in sharp contrast to those of NVIDIA's Tegra 4 and Qualcomm's next-generation Snapdragon 600/800. The Samsung whitepaper authors share their perspective in stating that
"...applications such as 3D gaming and high-definition video playback are seldom used on mobile devices." (Samsung whitepaper "Benefits of the big.LITTLE architecture")
NVIDIA is clearly going after their sweet spot in the gaming market with the Cortex-A15-based Tegra 4 and Project Shield, with the new processor integrating 72 GPU cores and a computational photography engine. Qualcomm is targeting UltraHD video playback with the Snapdragon 800. NVIDIA and Qualcomm both customize ARM cores for their unique architectures. However, NVIDIA has backed off to a Cortex-A9-based architecture for their first processor with an integrated modem, the Tegra 4i. ARM's big.LITTLE architecture has yet to be tested in an actual mobile device, so time will tell which configuration of heterogeneous CPUs wins out in the next generation of mobile devices.