Workshop #4
Efficient AI Acceleration
"Efficient and Robust Hardware Acceleration for Neural Networks"
Prof. Bing LI (Technical University of Munich)
Abstract
Deep neural networks (DNNs) have achieved breakthroughs in recent years and become a cornerstone of AI (Artificial Intelligence). The large size of DNNs and their corresponding compute demand pose a huge challenge to computer architectures that must execute them efficiently. In this talk, hardware architectures based on digital design and on emerging devices, including RRAM (Resistive Random Access Memory) and optical computing components, will be discussed. Solutions will be presented that effectively enhance the computational efficiency and the robustness of these platforms.
"Efficient Multi-Modal AI Acceleration"
Prof. Meng LI (Peking University)
Abstract
Recent years have witnessed the fast evolution of multi-modal AI, impelled by pervasive sensors of different modalities. Multi-modal AI learns the way humans sense and consolidates heterogeneous data from various inputs into a unified AI algorithm. Though promising, multi-modal AI suffers from efficiency constraints due to exponential network scaling and workload heterogeneity. In this talk, I will discuss some of my recent works that leverage network/hardware co-design and co-optimization to improve the inference efficiency of multi-modal AI. I will highlight neural architecture search (NAS) as a general algorithm that enables fast yet effective optimization of vision, language, and speech models as well as the hardware accelerator. We will also discuss interesting future directions to further improve the efficiency and security of multi-modal AI acceleration.
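To make the NAS idea mentioned above concrete, the following is a minimal, hypothetical sketch of hardware-aware architecture search by random sampling. The search space, the accuracy proxy, and the latency penalty are all illustrative assumptions, not the methods used in the talk; real NAS systems use far richer spaces and learned or measured cost models.

```python
import random

# Toy search space: depth and width choices for a hypothetical model
# (an illustrative stand-in for the multi-modal spaces discussed in the talk).
SPACE = {"depth": [2, 4, 6], "width": [32, 64, 128]}

def score(cfg):
    # Hypothetical objective: an accuracy proxy minus a latency penalty,
    # mimicking hardware-aware co-optimization of model and accelerator.
    acc_proxy = cfg["depth"] * 0.1 + cfg["width"] * 0.001
    latency_penalty = cfg["depth"] * cfg["width"] * 1e-4
    return acc_proxy - latency_penalty

def random_search(n_trials=50, seed=0):
    # Sample configurations at random and keep the best-scoring one.
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(n_trials):
        cfg = {k: rng.choice(v) for k, v in SPACE.items()}
        s = score(cfg)
        if s > best_score:
            best, best_score = cfg, s
    return best

print(random_search())
```

Real NAS methods replace the random sampler with evolutionary, reinforcement-learning, or differentiable search, but the structure (sample a configuration, evaluate a hardware-aware objective, keep the best) is the same.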
"Efficient Lookup-Table AI Acceleration"
Dr. Ngai WONG (HKU)
Abstract
Low-end processors may not come with neural processing units or AI accelerators, but they often provide a certain amount of memory (say, up to 1 MB) and can perform simple arithmetic operations. To enable edge AI on such resource-constrained devices, a viable way is to cast the inference of a deep neural network (DNN) as table lookups. This has motivated us to study lookup table (LUT)-based DNNs by transferring the output values of a learned deep model to the LUT, and to explore how new tricks can further compress the LUT size to facilitate tinyML in the context of computer vision. As an example, this talk will walk through the design of a single image super-resolution (SISR) network, and showcase our recent development of a family of models called Hundred-Kilobyte Lookup Tables (HKLUTs) that offer a compelling balance between memory and performance.
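The core idea of transferring a learned model's outputs to a LUT can be sketched in a few lines. The toy mapping below stands in for a learned network block; it is an assumption for illustration, not the SISR architecture or the HKLUT design from the talk. Quantized inputs index into a precomputed table, so runtime inference needs no multiplications.

```python
import numpy as np

def learned_fn(x):
    # Hypothetical learned pointwise mapping, standing in for a trained DNN block.
    return np.tanh(2.0 * x)

# Build the LUT: quantize the input range [-1, 1] into 256 bins (an 8-bit
# index) and precompute the model output at each bin center.
BINS = 256
LO, HI = -1.0, 1.0
centers = LO + (np.arange(BINS) + 0.5) * (HI - LO) / BINS
lut = learned_fn(centers)  # memory cost: 256 entries

def lut_infer(x):
    # Runtime inference: quantize the input to a bin index, then look up the
    # stored output. No DNN arithmetic is executed on-device.
    idx = np.clip(((x - LO) / (HI - LO) * BINS).astype(int), 0, BINS - 1)
    return lut[idx]

x = np.linspace(-1.0, 1.0, 5)
print(np.max(np.abs(lut_infer(x) - learned_fn(x))))  # small quantization error
```

For multi-input kernels the table grows exponentially with the number of quantized inputs, which is exactly why compressing the LUT, as in the HKLUT work, matters for fitting within a sub-megabyte budget.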