
Workshop #1

Architectural and System Integration for Efficient Edge AI Computing

"LLM Inference On Chip"

Prof. Hao YU (SUSTech)


Abstract

Emerging large-scale AI models such as ChatGPT and AlphaGo have outgrown the pace of Moore's-law scaling in chips, and localizing these large-scale models for embodied applications has become a challenge. This talk will introduce the design of the X-Edge chip, which heterogeneously integrates an in-memory computing architecture on top of systolic cubic arrays to achieve both high energy efficiency and high bandwidth for large-scale AI models on chip. Embodied AI applications, such as robotics driven by multimodal models, will also be presented.
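
To make the dataflow concrete, here is a minimal Python sketch of an output-stationary systolic array computing a matrix multiply, the core operation such arrays accelerate. It is a toy model for exposition only: the function name, the 2-D (rather than cubic) array, and the cycle loop are assumptions, not details of the X-Edge design or its in-memory layer.

import numpy as np

def systolic_matmul(A, B):
    # Toy output-stationary systolic array (illustrative; not X-Edge).
    # PE (i, j) keeps output C[i, j] stationary; A streams in skewed from
    # the left and B from the top, so the operand pair (A[i, s], B[s, j])
    # reaches PE (i, j) at cycle t = s + i + j.
    n, k = A.shape
    _, m = B.shape
    C = np.zeros((n, m))
    for t in range(n + m + k - 2):       # cycles until the array drains
        for i in range(n):
            for j in range(m):
                s = t - i - j            # operand index arriving at PE (i, j)
                if 0 <= s < k:
                    C[i, j] += A[i, s] * B[s, j]
    return C

A, B = np.random.randn(4, 8), np.random.randn(8, 5)
assert np.allclose(systolic_matmul(A, B), A @ B)

Because each partial sum stays inside its PE, only operands move between neighboring PEs each cycle, which is what gives systolic designs their bandwidth advantage over repeatedly fetching from a shared buffer.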


"Reconfigurable AI Processor: Fundamental Concepts, Application, and Future Trends"

Prof. Shouyi YIN (THU)


Abstract

A reconfigurable AI processor increases hardware flexibility to accommodate diverse AI algorithms, speeding up processing while consuming less power. Typically, a reconfigurable AI processor offers multiple reconfiguration hierarchies: chip-level, processing-element-array-level, and processing-element-level. Chip-level reconfiguration dynamically adjusts the parallelism of multi-chip systems to minimize computation latency and data access. Processing-element-array-level reconfiguration changes the dataflow or mapping of the computing engine to fully reuse on-chip data, reducing memory access. Processing-element-level reconfiguration changes the function of the computing unit, such as its computing precision or sparsity-processing pattern, to increase bit-wise hardware utilization. This talk explores the fundamental concepts of reconfigurable technology, discusses its applications in both digital and analog AI processors, and looks ahead to future trends in reconfigurable technology.
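
As a concrete illustration of the processing-element-level case, the short Python sketch below models a wide multiplier reconfigured to produce two independent low-precision products per cycle via operand packing. The lane widths, guard bits, and function names are hypothetical choices for exposition, not taken from any specific processor in the talk.

def mul8(a, b):
    # Baseline mode: one 8-bit multiply per cycle.
    return a * b

def mul4x2(a0, a1, b):
    # Reconfigured mode: two 4-bit x 4-bit products sharing operand b,
    # computed with a single wider multiply. Shifting a1 up by 8 bits
    # leaves guard bits, so the partial products never overlap
    # (a0 * b <= 15 * 15 = 225 fits in the low 8 bits).
    assert all(0 <= v < 16 for v in (a0, a1, b))
    prod = ((a1 << 8) | a0) * b          # one hardware multiply
    return prod & 0xFF, prod >> 8        # (a0 * b, a1 * b)

assert mul4x2(7, 13, 9) == (7 * 9, 13 * 9)

Doubling the number of products per multiply in low-precision layers is one way such reconfiguration raises bit-wise hardware utilization.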


"Reconfigurable Computing for Dynamic Vision Sensing in Edge Applications"

Prof. Hayden SO (HKU)


Abstract

Event-based vision represents a paradigm shift in how visual information is captured and processed. By responding only to dynamic intensity changes in the scene, event-based sensing produces far less data than conventional frame-based cameras, promising to springboard a new generation of high-speed, low-power machines for edge intelligence. However, to truly deliver the power-efficiency promise of event sensors, it is imperative to develop custom neuromorphic accelerators that can leverage the dynamic sparse data produced by the sensor. In this talk, I will briefly present two works toward that goal. In the first, a composable dynamic sparse dataflow architecture called ESDA is presented. ESDA is a modular system that allows customized sparse DNN accelerators to be constructed rapidly from a set of parametrizable modules. These modules share a uniform sparse token-feature interface and can thus be connected easily to compose an all-on-chip dataflow accelerator on FPGA for each user-supplied deep neural network model. In the second, we exploit the underlying reconfigurable fabric of FPGAs to implement a novel asynchronously triggered spiking neural network (SNN). Leveraging this asynchronous triggering mechanism, the temporally coded leaky integrate-and-fire (LIF) neurons in the proposed SNN can respond to spikes from the event sensors asynchronously. By using asynchronous circuits for inference, we have demonstrated extremely low-power, low-latency inference operation on FPGAs. Finally, I will conclude with some thoughts on how future hybrid systems may combine the ideas from these two works to build better AIoT devices with event sensing.
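
As a rough illustration of the asynchronous-triggering idea, the Python sketch below models a temporally coded LIF neuron whose state is updated only when an input event arrives, rather than on a fixed clock. The class name, parameter values, and exponential-leak model are assumptions for exposition, not the circuit presented in the talk.

import math

class EventDrivenLIF:
    # Illustrative event-driven LIF neuron (not the talk's FPGA circuit).
    def __init__(self, tau=20e-3, v_th=1.0):
        self.tau = tau          # membrane time constant in seconds
        self.v_th = v_th        # firing threshold
        self.v = 0.0            # membrane potential
        self.t_last = 0.0       # time of the previous update

    def on_event(self, t, weight):
        # Leak: decay the potential for the time elapsed since the last
        # event, then integrate the incoming spike's synaptic weight.
        self.v *= math.exp(-(t - self.t_last) / self.tau)
        self.t_last = t
        self.v += weight
        if self.v >= self.v_th:  # fire and reset
            self.v = 0.0
            return t             # output spike time (the temporal code)
        return None

neuron = EventDrivenLIF()
events = [(0.001, 0.6), (0.003, 0.6), (0.050, 0.6)]   # (time, weight)
print([neuron.on_event(t, w) for t, w in events])
# Fires on the second event; the long gap before the third lets v leak away.

No work is done between events, which is the property that lets asynchronous implementations reach very low power and latency on sparse event streams.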
