RP1 - Computing-in-Memory Architecture for Large-scale AI Models
RP1 tackles the compute, memory, and memory-bandwidth scaling problems, focusing in particular on advancing CIM-based chip architecture to enable ultra-energy-efficient and scalable deployment of large-scale AI models.
RP1-1:
Hybrid ReRAM-SRAM-based Computing-in-Memory AI Processor for Large Language Models
Figure RP1-1 Hybrid CIM AI processor. (a) Conventional ReRAM memory. (b) Hybrid ReRAM-SRAM CIM cell. (c) Hybrid CIM-based processor architecture.
This project aims to design a hybrid CIM AI processor that leverages the advantages of both ReRAM and SRAM technologies. ReRAM offers non-volatility and a small footprint but faces challenges such as costly analog-to-digital conversions and resistance variability. SRAM, while mature and reliable, has a large cell area and requires significant data movement. By integrating ReRAM and SRAM into a single compute-memory cell, this project aims to create a high-performance, high-efficiency processor for large-scale AI model inference. The hybrid approach will involve transistor-level cell design, hierarchical segmented arrays, and a fully on-chip memory system, ensuring reliable and accurate MAC operations while reducing latency and energy consumption.
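To make the analog MAC operation concrete, the following minimal Python sketch models one ReRAM bit line computing a dot product as a sum of conductance-weighted currents, with Gaussian device variability perturbing the result; the conductance range, variability model, and function names are illustrative assumptions, not the project's actual design.

```python
import numpy as np

rng = np.random.default_rng(0)

def reram_mac(weights, inputs, g_min=1e-6, g_max=1e-4, sigma=0.05):
    """Dot product as a bit-line current I = sum(G_i * V_i) (illustrative)."""
    # Map weights in [0, 1] to conductances in [g_min, g_max].
    g = g_min + weights * (g_max - g_min)
    # Device-to-device resistance variability (assumed 5% Gaussian).
    g_noisy = g * rng.normal(1.0, sigma, size=g.shape)
    current = np.sum(g_noisy * inputs)          # analog accumulation on the bit line
    # Ideal back-mapping of the sensed current to the digital MAC result.
    return (current - g_min * inputs.sum()) / (g_max - g_min)

w = rng.random(64)    # 64 weights stored on one bit line
x = rng.random(64)    # 64 input activations applied as word-line voltages

print("exact MAC:", w @ x)
print("CIM MAC  :", reram_mac(w, x))
```

The gap between the two printed values is the kind of variability-induced error the hybrid cell is meant to bound; how the SRAM side is used to do so (e.g., holding some weight bits digitally) is a design question for the project, not something this sketch prescribes.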
RP1-2:
Computing-in-Memory Enabled On-Device Large Language Model Personalization
Figure RP1-2 On-device learning of large-scale models with algorithm-CIM hardware co-design.
This project addresses the privacy and trust concerns associated with centralized cloud-based LLMs and LVLMs by enabling on-device personalization with CIM. Deploying these models on edge devices allows for personalized responses while keeping private data local. The project will explore techniques such as model compression, quantization, and data selection to adapt large-scale models for edge devices. Additionally, it will focus on accelerating and optimizing these models to reduce latency and training costs. By developing data refinement methods and self-learning frameworks, the aim is to enhance the personalization capabilities of edge-based models, creating a guidebook for future development.
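As one concrete instance of the compression and quantization techniques mentioned above, the sketch below shows generic 4-bit symmetric post-training weight quantization in Python; the scheme and parameters are illustrative assumptions, not the project's chosen method.

```python
import numpy as np

# Shrinking weights to low-bit integers reduces the model footprint so a
# large model can fit within an edge device's memory for on-device adaptation.

def quantize(w, bits=4):
    qmax = 2 ** (bits - 1) - 1                  # e.g. 7 for 4-bit signed
    scale = np.abs(w).max() / qmax              # per-tensor scale factor
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.randn(1024).astype(np.float32)
q, s = quantize(w, bits=4)
w_hat = dequantize(q, s)
print("mean abs quantization error:", np.abs(w - w_hat).mean())
```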
RP1-3:
Reconfigurable Computing-in-Memory Architecture on 3D Non-Volatile Memory Technology
Figure RP1-3 Reconfigurable CIM based on 3D NVM. (a) 3D NAND FeFET memory and 3D FeFET crossbar array. (b) Reconfigurable CIM architecture.
This project aims to overcome the limitations of current CIM chips by developing reconfigurable CIM architectures using 3D non-volatile memory (NVM) technologies such as ferroelectric memory. These technologies offer higher memory bandwidth, lower write voltages, faster write speeds, and improved endurance, enhancing CIM performance and efficiency. The project will focus on creating a reconfigurable CIM architecture that can handle multiple data operations, such as parallel search and matrix multiplication, by re-imagining the use of NVM kernels and dynamic interconnections. This approach aims to accelerate a diverse array of AI computational tasks, unlocking the full potential of CIM technology.
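The reconfigurability idea, one array serving both parallel search and matrix multiplication, can be illustrated with a toy binary array in Python; the encodings and function names below are assumptions for illustration, not the project's architecture.

```python
import numpy as np

# One binary NVM array used in two modes (illustrative only).
array = np.random.randint(0, 2, size=(8, 16))   # 8 stored words, 16 bits each

def parallel_search(mem, query):
    """CAM mode: match the query against every stored row in one step."""
    return np.where((mem == query).all(axis=1))[0]

def matvec(mem, x):
    """MAC mode: the same cells act as a binary weight matrix."""
    return mem @ x

query = array[3].copy()
print("matching rows :", parallel_search(array, query))   # row 3 (plus any duplicates)
print("matrix-vector :", matvec(array, np.ones(16, dtype=int)))
```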
RP1-4:
Memory-Centric Heterogeneous Processor for Large-scale AI Models
Figure RP1-4 Memory-centric heterogeneous architecture with DRAM-based PnM and PuM.
This project will alleviate the data movement bottleneck that limits the efficiency and performance of modern computing systems. By shifting from a processor-centric design to a data-centric design, where processing elements are placed closer to the data, the project will explore Near-Data Processing (NDP) and Processing-in-Memory (PIM) architectures. These architectures will combine Processing-near-Memory (PnM) and Processing-using-Memory (PuM) approaches to maximize benefits. The goal is to significantly accelerate large-scale AI models and other data-intensive workloads, improving performance, energy efficiency, and scalability in a way that programmers and system designers can readily adopt.
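A back-of-envelope model illustrates why moving computation toward memory pays off; the per-byte energy figures below are rough assumed values for illustration only, not measured numbers from this project.

```python
# Rough energy model of touching 1 GiB of operand data under three designs.
# All per-byte energies are assumed, order-of-magnitude figures.

BYTES = 1 << 30                       # 1 GiB of operand data
E_DRAM_TO_CPU_PER_BYTE = 100e-12      # assumed: off-chip transfer to the CPU
E_PNM_PER_BYTE         = 10e-12      # assumed: logic placed near the DRAM banks
E_PUM_PER_BYTE         = 1e-12      # assumed: operations inside the DRAM arrays

for name, e in [("processor-centric", E_DRAM_TO_CPU_PER_BYTE),
                ("PnM (near-memory)", E_PNM_PER_BYTE),
                ("PuM (using-memory)", E_PUM_PER_BYTE)]:
    print(f"{name:18s}: {BYTES * e:7.3f} J to touch 1 GiB")
```

Under these assumed figures, each step toward the memory cuts data-movement energy by roughly an order of magnitude, which is the motivation for combining PnM and PuM in one heterogeneous design.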