"Accelerators are Key to Future Performance and Efficiency — but Amdahl’s Law Seems to Say Otherwise"
By Dr. Trevor E. CARLSON - National University of Singapore
Introduced by Arthur PERAIS (SLS team)
Abstract: While many computer architecture researchers now look only to accelerators to improve performance and efficiency for specific classes of workloads, the CPU remains one of the few solutions that continue to dominate the compute landscape, providing a level of flexibility and performance that is unmatched by other solutions. However, building a processor that can continue to find additional work and push the envelope in performance can be a challenge, especially for general-purpose applications. What is needed is a balanced approach to sustain the pace of improvement: a new way of looking at the applications at hand to enable fast and efficient application progress, combined with continued enhancement of spatial accelerator designs, within a holistic system that supplies acceleration targeted at those workloads.
 
 
In this talk, I will first discuss our recent work, NOREBA, a hardware-software co-designed processor that enables efficient, non-speculative out-of-order commit. By exposing additional non-speculative work, this processor can execute and commit instructions and reclaim precious hardware resources sooner, allowing it to continue to make forward progress. To enable higher performance and efficiency, we use an up-front compiler analysis pass to inform the hardware about non-speculative work after branches. We also enable the processor to efficiently commit instructions out of order with lightweight instruction re-ordering hardware. Taken together, the resulting NOREBA core can improve performance by up to 217% (22% on average), with just a 4% increase in power.
 
I will also discuss some of our recent work on spatial accelerators, starting with our new type of dynamic spatial accelerator that extends beyond traditional coarse-grained reconfigurable arrays (CGRAs). In addition, I will briefly discuss other recent works that accelerate AI processing (MnF), graph processing (GraphWave) and neuromorphic computing (YOSO). Combined with enhanced CPU designs, these new classes of accelerators have the potential to significantly improve performance and efficiency for future heterogeneous system designs.
 
Bio: Trevor E. Carlson is an Assistant Professor in the Department of Computer Science at the National University of Singapore. Dr. Carlson received his PhD from Ghent University in 2014 and his bachelor’s and master’s degrees from Carnegie Mellon University in 2002 and 2003, and completed a postdoc at Uppsala University in Sweden in 2017. Dr. Carlson’s research interests span several areas of computer architecture, including efficient microarchitectures and accelerators, performance modeling, fast and scalable simulation methodologies, and secure processor designs. He co-designed the Sniper Multi-core Simulator, which is used by hundreds of researchers to evaluate the performance and power efficiency of next-generation systems and to explore next-generation processor designs. Dr. Carlson’s research has been published in leading journals and conferences in computer architecture and simulation, such as the International Symposium on Computer Architecture (ISCA), the International Symposium on Microarchitecture (MICRO), the International Symposium on High Performance Computer Architecture (HPCA), IEEE Transactions on Computers (TC), USENIX Security, and others. He has recently been awarded Amazon, Intel and VMware Research Awards, and his work has received six Best Paper Awards or Best Paper Nominations at conferences such as the International Symposium on Microarchitecture (MICRO) and the International Symposium on Performance Analysis of Systems and Software (ISPASS).