
Workshop #4

Cross-layer Optimization and Demonstration for Targeted Applications

"Approximate and Stochastic Ising Machines"

Prof. Jie HAN (University of Alberta)


Abstract

The Ising model has shown promise as an efficient computing approach to finding (sub-)optimal solutions to combinatorial optimization problems (COPs) [1]. The fully connected topology of the Ising model, in which each spin is connected to every other spin, provides flexibility in mapping a COP onto the Ising model. However, this flexibility comes with significant hardware overhead when constructing a domain-specific architecture, known as an Ising machine. Approximate computing, as a low-power technique, offers a way to reduce this hardware complexity [2]. Additionally, the emerging paradigm of stochastic computing is a highly efficient candidate for simulating the dynamics of the Ising model [3]. The approximations introduced by these methods can sometimes be beneficial, for example by helping the system escape local minima. In this talk, we explore the potential of approximate and stochastic computing to improve the performance of Ising machines.
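
To make the COP-to-Ising mapping concrete, here is a minimal sketch (not from the talk; the function name, cooling schedule, and example graph are illustrative assumptions) of stochastic single-spin annealing on a fully connected Ising model, with a small MAX-CUT instance mapped to couplings J = -W:

import numpy as np

def ising_anneal(J, h, steps=2000, T0=2.0, Tend=0.05, seed=0):
    # Stochastic single-spin updates on a fully connected Ising model
    # with energy E(s) = -0.5 * s^T J s - h^T s (J symmetric, zero diagonal).
    rng = np.random.default_rng(seed)
    s = rng.choice([-1, 1], size=len(h))            # random initial spins
    for t in range(steps):
        T = T0 * (Tend / T0) ** (t / (steps - 1))   # geometric cooling schedule
        i = rng.integers(len(h))                    # pick one spin at random
        dE = 2 * s[i] * (J[i] @ s + h[i])           # energy change if spin i flips
        if dE < 0 or rng.random() < np.exp(-dE / T):
            s[i] = -s[i]                            # accept flip (Metropolis rule)
    return s

# MAX-CUT on a 3-node triangle: minimizing the Ising energy with J = -W
# maximizes the cut, so any 2/1 split of the spins is an optimal solution.
W = np.array([[0., 1., 1.],
              [1., 0., 1.],
              [1., 1., 0.]])
print(ising_anneal(J=-W, h=np.zeros(3)))

The random accept/reject step is where approximation and stochastic computing can help in hardware: an inexact comparison or a stochastic bitstream can play the role of the exponential acceptance test while still allowing escapes from local minima.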

 

[Photo: Prof. Jie HAN (University of Alberta)]

"Wafer and Pannel Scale Computing: from the Communication Perspective"

Prof. Kaisheng MA (THU)


Abstract

In recent years, workloads, especially AI workloads, have grown much faster than Moore's Law, creating a large gap between workload demands and hardware performance and calling for more scalable architectures. We observe three opportunities: 1) advances in hardware technologies make a real difference; 2) a comprehensive view of networking is key; 3) domain-specific network architectures maximize the benefits. With these observations, we can innovate in computer architecture and chip design.

 

[Photo: Prof. Kaisheng MA (THU)]

"In-memory Nonlinear Function Approximation for efficient RNN and Transformers"

Prof. Arindam BASU (City University of Hong Kong)


Abstract

Analog in-memory computing (IMC) systems are well suited to implementing modern deep neural networks, since they can perform matrix-vector multiplications at very high energy and area efficiencies. However, many modern networks, such as LSTMs and Transformers, require a large number of nonlinear operations, which become a bottleneck to achieving high system efficiency.
In this talk, I will present an approach in which in-memory analog-to-digital converters (ADCs) are designed to efficiently approximate scalar and vector nonlinear functions using both volatile (SRAM) and non-volatile (RRAM) memories. In the first work, we use an extra column of memristors to produce an appropriately pre-distorted ramp voltage, such that the comparator output directly approximates the desired scalar nonlinear neural activation, yielding ~9.9X and ~4.5X better energy and area efficiencies for LSTM networks. In the second work, we embed a winner-take-all network in the in-memory ADC array, such that digitization also selects the k largest dot products (top-k). Using this approach in a Transformer block, only the k largest activations are sent to the softmax calculation block, which greatly reduces the computational cost of softmax and yields ~15X faster operation than conventional softmax.
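
As a software analogy for the second work, the sketch below (an illustrative assumption, not the authors' circuit or code) shows a softmax restricted to the k largest inputs, mirroring the winner-take-all selection performed during digitization:

import numpy as np

def topk_softmax(logits, k):
    # Keep only the k largest logits, as a winner-take-all digitization would,
    # and normalize over that subset; all other entries get zero probability.
    idx = np.argpartition(logits, -k)[-k:]        # indices of the k largest logits
    z = np.exp(logits[idx] - logits[idx].max())   # numerically stable exponentiation
    p = np.zeros_like(logits)
    p[idx] = z / z.sum()
    return p

scores = np.array([3.1, -0.4, 2.7, 0.2, 1.9])
print(topk_softmax(scores, k=2))  # probability mass only on the two largest entries

Because softmax is dominated by its largest inputs, discarding the rest changes the output little while removing most of the exponentiation and normalization work.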

 

[Photo: Prof. Arindam BASU (City University of Hong Kong)]

"Artificial General Intelligence Processor Design Using Processing-In-Memory Technologies"

Prof. Bonan YAN (PKU)


Abstract

Artificial General Intelligence (AGI) places unprecedented demands on computing performance and efficiency. This talk focuses on the design challenges and cutting-edge explorations involved in transforming the cognitive architecture of AGI, that is, the cyclic system of perception, understanding, decision-making, and actuation, into chip microarchitecture. In view of the "memory wall" bottleneck that traditional CPUs and GPUs encounter on diverse AI tasks, the talk first explores the design of basic circuit modules that integrate memory and computing, as well as hardware architectures built on these new functional circuit modules, in order to more efficiently support complex AGI-related computing tasks such as perceptual computing, graph computing, combinatorial optimization, and probabilistic computing. The talk then looks ahead to the future of specialized hardware design for AGI, discussing possible technical paths, the challenges faced, and potential solutions, with the aim of inspiring thinking about next-generation intelligent computing platforms.

 

[Photo: Prof. Bonan YAN (PKU)]
