DyCuckoo: Dynamic hash tables on GPUs

The hash table is a fundamental structure that has been implemented on graphics processing units (GPUs) to accelerate a wide range of analytics workloads. Most existing works have focused on static scenarios and occupy large GPU memory to maximize the insertion efficiency. In many cases, data stored...

Full description

Saved in:
Bibliographic Details
Main Authors: LI, Yuchen, ZHU, Qiwei, LYU, Zheng, HUANG, Zhongdong, SUN, Jianling
Format: text
Language:English
Published: Institutional Knowledge at Singapore Management University 2021
Subjects:
Online Access:https://ink.library.smu.edu.sg/sis_research/6206
https://ink.library.smu.edu.sg/context/sis_research/article/7209/viewcontent/dycuckoo.pdf
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Singapore Management University
Language: English
Description
Summary:The hash table is a fundamental structure that has been implemented on graphics processing units (GPUs) to accelerate a wide range of analytics workloads. Most existing works have focused on static scenarios and occupy large GPU memory to maximize the insertion efficiency. In many cases, data stored in hash tables get updated dynamically, and existing approaches use unnecessarily large memory resources. One naïve solution is to rebuild a hash table (known as rehashing) whenever it is either filled or mostly empty. However, this approach renders significant overheads for rehashing. In this paper, we propose a novel dynamic cuckoo hash table technique on GPUs, known as DyCuckoo. We devise a resizing strategy for dynamic scenarios without rehashing the entire table that ensures a guaranteed filled factor. The strategy trades search performance with resizing efficiency, and this tradeoff can be configured by users. To further improve efficiency, we propose a 2-in-d cuckoo hashing scheme that ensures a maximum of two lookups for find and delete operations, while retaining similar performance for insertions as a general cuckoo hash. Extensive experiments have validated the proposed design's effectiveness over several state-of-the-art hash table implementations on GPUs. DyCuckoo achieves superior efficiency while enables fine-grained memory control, which is not available in existing GPU hash table approaches.