RP4 - Hardware-Accelerated AI Applications

RP4 focuses on selected emerging applications for hardware acceleration, exploring system architectures and new design tools specific to the target applications in order to achieve breakthroughs in the speed and energy efficiency of AI hardware.

 

RP4-1: Intelligent Pixels Array for 3-D Imaging Applications

Figure RP4-1: Technology of integrating sensors and data-analytic chip
Project RP4-1 proposes to integrate an intelligent 3-D feature-extraction algorithm with new sensors designed for 3-D object detection and motion prediction. The system can be used in many new applications, including autonomous vehicles, gesture control and augmented reality. A new type of sensor will be designed to achieve high-efficiency, low-power data collection, and it will be tightly integrated with the processing and identification circuits. To improve the effectiveness of the intelligent sensing system, Back-Side Illumination (BSI) technology may be used to bond a pre-fabricated image sensor array onto a grid-based ASIC chip.

RP4-2: Design Secure Machine Learning Accelerator

Figure RP4-2: Compute-in-memory NN accelerator
Project RP4-2 assesses the security challenges in machine learning (ML) accelerators built on non-volatile memories (NVMs) and proposes corresponding countermeasures. NVM-based compute-in-memory (CIM) is a promising solution to the design challenges of embedded memory and bandwidth. However, using NVMs may introduce hardware security vulnerabilities, because the weights are stored in the NVMs. This project addresses that issue by combining the PI’s expertise in machine learning accelerators and hardware security.

RP4-3: Artificial Intelligence Hardware in Ultra-Fast Medical Diagnostics

Figure RP4-3: A low-latency FPGA board-level prototype for real-time cell labeling
The goal of this project is to exploit the unique capabilities of modern field-programmable gate arrays (FPGAs) and/or Internet of Things (IoT) boards, which can serve both as a hybrid high-level computational platform and as a low-level integrated system, to achieve the lowest possible processing latency for real-time image-analytics applications. The system will be further optimized for speed and power consumption, based on technologies developed in RP1-1 and RP1-2, to achieve chip-level integration of AI functions.

RP4-4: Hardware-Accelerated Federated Learning System

Figure RP4-4: An illustration of a secure and efficient federated learning system
Project RP4-4 addresses the challenges of data privacy, ownership, and locality in today’s federated learning systems through both algorithm design and hardware acceleration. This includes designing efficient algorithms that minimize communication and hide data-transfer delay; building hardware acceleration with FPGAs; offloading computation-intensive packet processing and encryption/decryption to a data plane implemented on an FPGA; and employing effective network protocols that facilitate hardware acceleration of packet processing and encryption.