Research Programmes

Enabling Technology for Emerging Computing Systems

RP1

addresses memory and data bandwidth problems to alleviate the bottlenecks of AI hardware by exploring the integration of silicon-compatible emerging technologies with scaled silicon chips.

Architecture and Heterogeneous System Integration

RP2

explores new architectural and system integration solutions for efficient neuromorphic computing on platforms ranging from the cloud to smart Internet of Things (IoT) devices.

AI-Assisted EDA (Electronic Design Automation) for AI Hardware

RP3

aims to develop new design methodologies and design automation tools for AI chips.

Hardware-Accelerated AI Applications

RP4

emphasises selected emerging applications for hardware acceleration, exploring system architectures and new design tools specific to the target applications to achieve breakthroughs in the speed and energy efficiency of AI hardware.

Computing-in-Memory Architecture for Large-scale AI Models

RP1

Tackling the compute, memory, and memory-bandwidth scaling problems, with a particular focus on advancing computing-in-memory (CIM) chip architectures to enable ultra-energy-efficient and scalable deployment of large-scale AI models.

Hardware-Software Co-Design for Energy-Efficient and High-Performance Edge AI

RP2

Advancing the field of algorithm-hardware co-design, addressing new challenges arising from the increasing complexity, scalability, energy efficiency, and latency requirements of embodied large-scale AI applications.

EDA (Electronic Design Automation) for Large-Scale AI Hardware

RP3

Exploring new frontiers in EDA for AI chip and system design, with a primary focus on supporting chiplet- and 3DIC-based designs, overcoming the limitations of existing EDA tools, and applying AI foundation models to various design tasks to enable more agile chip design.

Cross-layer Optimization and Demonstration for Targeted Applications

RP4

Leveraging the technologies developed in RP1 to RP3 to perform cross-layer optimization and develop hardware-accelerated working systems for domain-specific, large-scale AI applications with revolutionary performance.


RP4 - Hardware-Accelerated AI Applications

RP4 emphasises selected emerging applications for hardware acceleration, exploring system architectures and new design tools specific to the target applications to achieve breakthroughs in the speed and energy efficiency of AI hardware.

RP4-1:
Intelligent Pixels Array for 3-D Imaging Applications

Figure RP4-1: Technology of integrating sensors and data-analytic chip
Project RP4-1 proposes to integrate an intelligent 3-D feature extraction algorithm with new sensors designed for 3-D object detection and motion prediction. The system can be used for many new applications, including autonomous vehicles, gesture control, and augmented reality. A new type of sensor will be designed to achieve high-efficiency, low-power data collection and will be tightly integrated with the processing and identification circuits. To improve the effectiveness of the intelligent sensing system, Back-Side Illumination (BSI) technology may be used to bond a pre-fabricated image sensor array onto a grid-based ASIC chip.

RP4-2:
Design Secure Machine Learning Accelerator

Figure RP4-2: Compute-in-memory NN accelerator
Project RP4-2 assesses the security challenges of ML accelerators built on non-volatile memories (NVMs) and proposes corresponding countermeasures. NVM-based compute-in-memory (CIM) is a promising way to address the design challenges of embedded memory and bandwidth. However, the use of NVMs may introduce hardware security vulnerabilities, because the weights are stored in the NVMs themselves. This project addresses that issue, combining the PI’s expertise in machine learning accelerators and hardware security.
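As a rough illustration of why storing weights in NVM cells raises a security concern, the following Python sketch models a CIM crossbar in software. It is not the project's design: the function names (`program_crossbar`, `crossbar_mvm`, `read_out_weights`), the uniform quantization scheme, and the parameter values are all assumptions made for this example. The point it shows is that the cell contents are the model, so anyone able to read the array can recover the weights.

```python
# Hypothetical software model of an NVM compute-in-memory crossbar.
# All names and the quantization scheme are illustrative assumptions.

def program_crossbar(weights, levels=16, w_max=1.0):
    """Quantize weights to integer levels, standing in for NVM cell states."""
    step = 2.0 * w_max / (levels - 1)
    cells = []
    for row in weights:
        clipped = [min(max(w, -w_max), w_max) for w in row]
        cells.append([round((w + w_max) / step) for w in clipped])
    return cells, step

def crossbar_mvm(cells, step, inputs, w_max=1.0):
    """Compute one output per column by summing input * stored weight.

    This mimics the analog current summation along each column line.
    """
    outputs = []
    for c in range(len(cells[0])):
        total = 0.0
        for r, x in enumerate(inputs):
            total += x * (cells[r][c] * step - w_max)
        outputs.append(total)
    return outputs

def read_out_weights(cells, step, w_max=1.0):
    """Recover the stored weights directly from the cell states.

    An attacker with read access to the array could do the same.
    """
    return [[level * step - w_max for level in row] for row in cells]

if __name__ == "__main__":
    cells, step = program_crossbar([[1.0, -1.0], [0.0, 1.0]], levels=3)
    print(crossbar_mvm(cells, step, [2.0, 3.0]))
    print(read_out_weights(cells, step))
```

Because the weights persist in the array after power-off, a countermeasure has to protect the stored cell states themselves, not only the data moving over a bus.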

RP4-3:
Artificial Intelligence Hardware in Ultra-Fast Medical Diagnostics

Figure RP4-3: A low-latency FPGA board-level prototype for real-time cell labeling
The goal of this project is to exploit the unique capabilities of modern field-programmable gate arrays (FPGAs) and/or Internet of Things (IoT) boards, using them both as a hybrid high-level computational platform and as a low-level integrated system, to achieve the lowest possible processing latency for real-time image analytics applications. The system will be further optimized for speed and power consumption using technologies developed in RP1-1 and RP1-2 to achieve chip-level integration of AI functions.

RP4-4:
Hardware-Accelerated Federated Learning System

Figure RP4-4: An illustration of a secure and efficient federated learning system
Project RP4-4 addresses the challenges of data privacy, ownership, and locality in today’s federated learning systems through both algorithm design and hardware acceleration. This includes designing efficient algorithms that minimize communication and hide data transfer delay; building FPGA-based hardware acceleration; offloading computation-intensive packet processing and encryption/decryption to a data plane implemented on the FPGA; and employing effective network protocols that facilitate hardware acceleration of packet processing and encryption.
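The source does not describe the project's own algorithms, so the sketch below uses plain federated averaging (FedAvg) as a well-known stand-in. It shows the two ideas named above in their simplest form: raw data never leaves a client (only a model parameter is exchanged), and running several local steps per round reduces the number of communication rounds. The function names, the one-parameter model `y = w * x`, and the hyperparameters are assumptions made for this illustration.

```python
# Minimal federated averaging (FedAvg) sketch in pure Python.
# Model, names, and hyperparameters are illustrative assumptions.

def local_update(w, data, lr=0.1, steps=5):
    """Run a few least-squares gradient steps on one client's private data."""
    for _ in range(steps):
        grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def federated_average(clients, rounds=20, lr=0.1, steps=5):
    """Train a shared weight; only weights travel, never the (x, y) pairs."""
    w = 0.0
    total = sum(len(data) for data in clients)
    for _ in range(rounds):
        # Each client starts from the current global weight.
        updates = [local_update(w, data, lr, steps) for data in clients]
        # The server averages the updates, weighted by client data size.
        w = sum(len(d) * u for d, u in zip(clients, updates)) / total
    return w

if __name__ == "__main__":
    # Two clients holding private samples of the same relation y = 3x.
    clients = [
        [(1.0, 3.0), (2.0, 6.0)],
        [(0.5, 1.5), (1.0, 3.0), (1.5, 4.5)],
    ]
    print(federated_average(clients))
```

In a deployed system the per-round exchange of updates is where encryption and packet processing costs arise, which is the part the project proposes to offload to an FPGA data plane.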
