DyCuckoo: Dynamic hash tables on GPUs

The hash table is a fundamental structure that has been implemented on graphics processing units (GPUs) to accelerate a wide range of analytics workloads. Most existing works have focused on static scenarios and occupy large GPU memory to maximize the insertion efficiency. In many cases, data stored...

Full description

Saved in:
Bibliographic Details
Main Authors: LI, Yuchen, ZHU, Qiwei, LYU, Zheng, HUANG, Zhongdong, SUN, Jianling
Format: text
Language:English
Published: Institutional Knowledge at Singapore Management University 2021
Subjects:
Online Access:https://ink.library.smu.edu.sg/sis_research/6206
https://ink.library.smu.edu.sg/context/sis_research/article/7209/viewcontent/dycuckoo.pdf
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Singapore Management University
Language: English
id sg-smu-ink.sis_research-7209
record_format dspace
spelling sg-smu-ink.sis_research-72092021-10-14T06:58:37Z DyCuckoo: Dynamic hash tables on GPUs LI, Yuchen ZHU, Qiwei LYU, Zheng HUANG, Zhongdong SUN, Jianling The hash table is a fundamental structure that has been implemented on graphics processing units (GPUs) to accelerate a wide range of analytics workloads. Most existing works have focused on static scenarios and occupy large GPU memory to maximize the insertion efficiency. In many cases, data stored in hash tables get updated dynamically, and existing approaches use unnecessarily large memory resources. One naïve solution is to rebuild a hash table (known as rehashing) whenever it is either filled or mostly empty. However, this approach renders significant overheads for rehashing. In this paper, we propose a novel dynamic cuckoo hash table technique on GPUs, known as DyCuckoo. We devise a resizing strategy for dynamic scenarios without rehashing the entire table that ensures a guaranteed filled factor. The strategy trades search performance with resizing efficiency, and this tradeoff can be configured by users. To further improve efficiency, we propose a 2-in-d cuckoo hashing scheme that ensures a maximum of two lookups for find and delete operations, while retaining similar performance for insertions as a general cuckoo hash. Extensive experiments have validated the proposed design's effectiveness over several state-of-the-art hash table implementations on GPUs. DyCuckoo achieves superior efficiency while enables fine-grained memory control, which is not available in existing GPU hash table approaches. 2021-04-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/6206 info:doi/10.1109/ICDE51399.2021.00070 https://ink.library.smu.edu.sg/context/sis_research/article/7209/viewcontent/dycuckoo.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Databases and Information Systems Data Storage Systems
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Databases and Information Systems
Data Storage Systems
spellingShingle Databases and Information Systems
Data Storage Systems
LI, Yuchen
ZHU, Qiwei
LYU, Zheng
HUANG, Zhongdong
SUN, Jianling
DyCuckoo: Dynamic hash tables on GPUs
description The hash table is a fundamental structure that has been implemented on graphics processing units (GPUs) to accelerate a wide range of analytics workloads. Most existing works have focused on static scenarios and occupy large GPU memory to maximize the insertion efficiency. In many cases, data stored in hash tables get updated dynamically, and existing approaches use unnecessarily large memory resources. One naïve solution is to rebuild a hash table (known as rehashing) whenever it is either filled or mostly empty. However, this approach renders significant overheads for rehashing. In this paper, we propose a novel dynamic cuckoo hash table technique on GPUs, known as DyCuckoo. We devise a resizing strategy for dynamic scenarios without rehashing the entire table that ensures a guaranteed filled factor. The strategy trades search performance with resizing efficiency, and this tradeoff can be configured by users. To further improve efficiency, we propose a 2-in-d cuckoo hashing scheme that ensures a maximum of two lookups for find and delete operations, while retaining similar performance for insertions as a general cuckoo hash. Extensive experiments have validated the proposed design's effectiveness over several state-of-the-art hash table implementations on GPUs. DyCuckoo achieves superior efficiency while enables fine-grained memory control, which is not available in existing GPU hash table approaches.
format text
author LI, Yuchen
ZHU, Qiwei
LYU, Zheng
HUANG, Zhongdong
SUN, Jianling
author_facet LI, Yuchen
ZHU, Qiwei
LYU, Zheng
HUANG, Zhongdong
SUN, Jianling
author_sort LI, Yuchen
title DyCuckoo: Dynamic hash tables on GPUs
title_short DyCuckoo: Dynamic hash tables on GPUs
title_full DyCuckoo: Dynamic hash tables on GPUs
title_fullStr DyCuckoo: Dynamic hash tables on GPUs
title_full_unstemmed DyCuckoo: Dynamic hash tables on GPUs
title_sort dycuckoo: dynamic hash tables on gpus
publisher Institutional Knowledge at Singapore Management University
publishDate 2021
url https://ink.library.smu.edu.sg/sis_research/6206
https://ink.library.smu.edu.sg/context/sis_research/article/7209/viewcontent/dycuckoo.pdf
_version_ 1770575891200475136