Mali-T658 GPU Extends Graphics and GPU Compute Leadership for High Performance Devices
ARM announced the ARM Mali-T658 Graphics Processing Unit (GPU), the latest member of the Midgard architecture-based GPU family, targeting high performance devices such as superphones, tablets and smart-TVs. The Mali-T658 GPU leverages the unique ARM system-level approach to multicore design that optimizes both performance and energy-efficiency. To address high-end consumer requirements, the Mali-T658 GPU delivers up to ten times the graphics performance of the Mali-400 MP GPU, found in a wide range of today's mainstream consumer products. It also features four times the GPU Compute performance of the Mali-T604 GPU, enabling a raft of new use-cases outside of traditional graphics processing, including computational photography, image-processing and augmented reality.
"Following the recent introduction of big.LITTLE processing and the ARMv8 architecture, the launch of Mali-T658 is another example of how ARM is seeking to redefine heterogeneous computing for the embedded space," said Jon Peddie, President of specialist graphics research consultancy, Jon Peddie Research. "This will provide high-performance graphics and compute systems for low-power applications."
Supported by lead ARM Partners, such as Fujitsu Semiconductor, LG Electronics, Nufront and Samsung, the Mali-T658 GPU enables an immersive visual computing experience on a range of 'always on, always connected' consumer devices. These include high-performance, energy-sensitive superphones, smartphones and tablets, as well as smart-TVs and automotive infotainment solutions. Building on the success of the Mali-T604 GPU, the Mali-T658 extends ARM performance leadership by scaling up to eight cores and by doubling the number of arithmetic pipelines within each of these cores. It is also compatible with the ARMv8 architecture.
"ARM is in a unique position to integrate CPU, GPU and interconnect technology into optimized, coherent systems, and by doing so improve performance and enable more efficient data sharing," said Rock Yang, VP Marketing, Nufront. "This will allow partners, such as Beijing Nufront, to make the most of the leading capabilities of each system component and maximize throughput in ARM technology-based compute sub-systems."
Delivery of a common set of compatible drivers for all Midgard architecture-based Mali GPUs enables faster time-to-market and minimizes software upgrade costs for future implementations. The ARM Mali-T658 GPU supports a wide range of graphics and compute APIs, including Microsoft DirectX 11, Khronos OpenGL ES, OpenVG, Khronos OpenCL, Google Renderscript and Microsoft DirectCompute.
The Mali-T658 has been designed to work seamlessly with the ARM Cortex-A15 and Cortex-A7 processors, either in standalone modes or in big.LITTLE processing mode. The autonomous nature of the Mali Job Manager, and its ability to carry on graphics processing with a reduced load on the CPU, means it is very well suited to work alongside a big.LITTLE CPU system. Because each task runs on the processor best suited to it, the Mali-T658 can handle GPU compute tasks in parallel while the CPU handles the always-on, always-connected tasks. The ability of the Mali-T658 GPU to scale up to eight cores provides unprecedented energy-efficiency, flexibility and scalability to match the CPU and GPU performance points through one coherent interface.
"Next generation consumer devices based on the Mali-T658 GPU will address the growing user expectation for slick user interfaces and desktop-class graphics," said Pete Hutton, general manager, Media Processing Division, ARM. "Intuitive user interfaces will mean that consumers can access the full functionality of their connected devices, for richer user experiences. This includes HD gaming and new compute-intensive applications, such as augmented reality."
Importance of cache coherency in multicore CPU/GPU designs
Cache coherency is essential in multicore computing devices to keep shared data consistent across the local caches that hold copies of it. This helps complex heterogeneous SoC designs achieve optimum performance and energy-efficiency, and addresses the needs of next-generation, high-performance computing devices. ARM system IP, such as CoreLink, enables system-level cache coherency across clusters of multicore processors, including the Cortex-A15 MPCore processors and Mali-T658 graphics processors.
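
As a rough illustration of the general idea, the following Python sketch models two local caches sharing one memory under a simple write-through, write-invalidate scheme. It is a toy model written for this article and does not describe how CoreLink or any ARM hardware implements coherency; the names SharedMemory, Cache and demo are hypothetical.

```python
class SharedMemory:
    """Backing store shared by every cache in this toy model (hypothetical)."""

    def __init__(self):
        self.data = {}    # address -> value
        self.caches = []  # every cache attached to this memory


class Cache:
    """A local cache that keeps only valid lines (hypothetical)."""

    def __init__(self, name, memory):
        self.name = name
        self.memory = memory
        self.lines = {}  # address -> cached value
        memory.caches.append(self)

    def read(self, addr):
        # On a miss, fetch the current value from shared memory.
        if addr not in self.lines:
            self.lines[addr] = self.memory.data.get(addr, 0)
        return self.lines[addr]

    def write(self, addr, value):
        # Invalidate every other cache's copy so no stale value survives.
        for other in self.memory.caches:
            if other is not self:
                other.lines.pop(addr, None)
        # Write through to shared memory and keep a local copy.
        self.lines[addr] = value
        self.memory.data[addr] = value


def demo():
    mem = SharedMemory()
    cpu = Cache("cpu", mem)
    gpu = Cache("gpu", mem)
    mem.data[0x10] = 1
    first = gpu.read(0x10)   # the GPU cache now holds a copy of the value
    cpu.write(0x10, 2)       # the CPU write invalidates the GPU's copy
    second = gpu.read(0x10)  # the GPU misses and refetches the new value
    return first, second
```

Without the invalidation step in write, the second GPU read would return the stale value from its own cache, which is the inconsistency that coherency mechanisms exist to prevent.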