Hardware implementation for secure computation on encrypted data

Bibliographic Details
Main Author: Ding, Dao Xian
Other Authors: Chang Chip Hong
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Online Access: https://hdl.handle.net/10356/177065
Institution: Nanyang Technological University
Description
Summary: With the increasing adoption of “as a service” technologies, data privacy and confidentiality are of increasing concern on cloud computing platforms. Fully Homomorphic Encryption (FHE) schemes are a key tool for privacy-preserving computing because they perform homomorphic operations over the ciphertext ring, eliminating the need to decrypt the data first. Despite their potential, FHE cryptosystems often incur a huge computation cost and perform slowly on general-purpose platforms, which limits their deployment. In this project, we explore hardware accelerator designs to accelerate a levelled Brakerski-Gentry-Vaikuntanathan (BGV) cryptosystem. We first perform a basic profiling of the BGV cryptosystem on a CPU platform to identify computation bottlenecks. Based on the profiling results, an FPGA-based accelerator is proposed to accelerate the ring multiplication operation. This accelerator can then be used as a co-processor in a larger hardware/software co-design solution to accelerate the server-side operation. For this project, emphasis is placed on the hardware implementation of a 4096-point Residue Number System (RNS) based Number Theoretic Transform (NTT) unit. Within the NTT unit, either one or two butterfly units can be used for computation, and a memory controller facilitates communication between the butterfly units and the BRAMs. The proposed architecture is then pipelined to increase clock frequency and throughput, and tested on a Zynq UltraScale+ MPSoC evaluation kit to estimate area, latency, and throughput.
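The ring multiplication that the summary targets can be illustrated in software. The following is a minimal Python sketch, not the project's hardware design: it multiplies two polynomials in Z_q[x]/(x^n + 1) using a radix-2 butterfly and an iterative NTT. The toy parameters (n = 8, q = 17, psi = 3) and all function names are assumptions chosen for readability; the project itself describes a 4096-point unit, and its actual moduli and dataflow are not given in this record.

```python
def butterfly(a, b, w, q):
    """Radix-2 Cooley-Tukey butterfly: (a + w*b, a - w*b) mod q."""
    t = (w * b) % q
    return (a + t) % q, (a - t) % q


def bit_reverse(values):
    """Reorder a power-of-two-length list into bit-reversed index order."""
    n = len(values)
    bits = n.bit_length() - 1
    out = [0] * n
    for i, v in enumerate(values):
        out[int(format(i, f"0{bits}b")[::-1], 2)] = v
    return out


def ntt(values, omega, q):
    """Iterative cyclic NTT; omega must be a primitive n-th root of unity mod q."""
    a = bit_reverse(values)
    n = len(a)
    length = 2
    while length <= n:
        w_len = pow(omega, n // length, q)  # primitive length-th root of unity
        for start in range(0, n, length):
            w = 1
            for k in range(length // 2):
                i, j = start + k, start + k + length // 2
                a[i], a[j] = butterfly(a[i], a[j], w, q)
                w = (w * w_len) % q
        length *= 2
    return a


def ring_multiply(f, g, psi, q):
    """Multiply f and g in Z_q[x]/(x^n + 1); psi is a primitive 2n-th root of unity."""
    n = len(f)
    omega = (psi * psi) % q
    # Pre-scale by powers of psi so the cyclic NTT yields a negacyclic product.
    fw = [(c * pow(psi, i, q)) % q for i, c in enumerate(f)]
    gw = [(c * pow(psi, i, q)) % q for i, c in enumerate(g)]
    prod = [(x * y) % q for x, y in zip(ntt(fw, omega, q), ntt(gw, omega, q))]
    h = ntt(prod, pow(omega, -1, q), q)  # inverse transform, still unscaled
    n_inv, psi_inv = pow(n, -1, q), pow(psi, -1, q)
    return [(c * n_inv * pow(psi_inv, i, q)) % q for i, c in enumerate(h)]


def schoolbook_multiply(f, g, q):
    """Reference O(n^2) negacyclic multiplication, used only as a check."""
    n = len(f)
    out = [0] * n
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            k = i + j
            if k < n:
                out[k] = (out[k] + fi * gj) % q
            else:
                out[k - n] = (out[k - n] - fi * gj) % q  # x^n = -1
    return out


# Toy parameters (assumed): 17 = 1 mod 16, and 3 has order 16 modulo 17.
Q, PSI = 17, 3
f = [1, 2, 3, 4, 5, 6, 7, 8]
g = [8, 7, 6, 5, 4, 3, 2, 1]
assert ring_multiply(f, g, PSI, Q) == schoolbook_multiply(f, g, Q)
```

In an RNS setting, a routine like `ring_multiply` would be run once per small prime modulus and the residues kept separate, which is one reason the transform maps well onto parallel butterfly units; how the project partitions that work between its one or two butterfly units and the BRAMs is not specified here.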