MANDO-HGT: Heterogeneous graph transformers for smart contract vulnerability detection

Smart contracts in blockchains have been increasingly used for high-value business applications. It is essential to check smart contracts' reliability before and after deployment. Although various program analysis and deep learning techniques have been proposed to detect vulnerabilities in eith...

Bibliographic Details
Main Authors: NGUYEN, Huu Hoang, NGUYEN, Nhat Minh, XIE, Chunyao, AHMADI, Zahra, KUDENDO, Daniel, DOAN, Thanh-Nam, JIANG, Lingxiao
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2023
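Subjects: Bytecode; Graph transformer; Heterogeneous graph learning; Smart contracts; Source code; Vulnerability detection; Databases and Information Systems; Graphics and Human Computer Interfaces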
Online Access: https://ink.library.smu.edu.sg/sis_research/8343
https://ink.library.smu.edu.sg/context/sis_research/article/9346/viewcontent/MSR2023MANDO.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-9346
record_format dspace
spelling sg-smu-ink.sis_research-9346
2023-12-13T03:39:07Z
MANDO-HGT: Heterogeneous graph transformers for smart contract vulnerability detection
NGUYEN, Huu Hoang
NGUYEN, Nhat Minh
XIE, Chunyao
AHMADI, Zahra
KUDENDO, Daniel
DOAN, Thanh-Nam
JIANG, Lingxiao
Smart contracts in blockchains have been increasingly used for high-value business applications. It is essential to check smart contracts' reliability before and after deployment. Although various program analysis and deep learning techniques have been proposed to detect vulnerabilities in either Ethereum smart contract source code or bytecode, their detection accuracy and scalability are still limited. This paper presents a novel framework named MANDO-HGT for detecting smart contract vulnerabilities. Given Ethereum smart contracts, either in source code or bytecode form, and vulnerable or clean, MANDO-HGT custom-builds heterogeneous contract graphs (HCGs) to represent control-flow and/or function-call information of the code. It then adapts heterogeneous graph transformers (HGTs) with customized meta relations for graph nodes and edges to learn their embeddings and train classifiers for detecting various vulnerability types in the nodes and graphs of the contracts more accurately. We have collected more than 55K Ethereum smart contracts from various data sources and verified the labels for 423 buggy and 2,742 clean contracts to evaluate MANDO-HGT. Our empirical results show that MANDO-HGT can significantly improve the detection accuracy of other state-of-the-art vulnerability detection techniques that are based on either machine learning or conventional analysis techniques. The accuracy improvements in terms of F1-score range from 0.7% to more than 76% at either the coarse-grained contract level or the fine-grained line level for various vulnerability types in either source code or bytecode. Our method is general and can be retrained easily for different vulnerability types without the need for manually defined vulnerability patterns.
2023-05-01T07:00:00Z
text
application/pdf
https://ink.library.smu.edu.sg/sis_research/8343
info:doi/10.1109/MSR59073.2023.00052
https://ink.library.smu.edu.sg/context/sis_research/article/9346/viewcontent/MSR2023MANDO.pdf
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
eng
Institutional Knowledge at Singapore Management University
Bytecode
Graph transformer
Heterogeneous graph learning
Smart contracts
Source code
Vulnerability detection
Databases and Information Systems
Graphics and Human Computer Interfaces
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Bytecode
Graph transformer
Heterogeneous graph learning
Smart contracts
Source code
Vulnerability detection
Databases and Information Systems
Graphics and Human Computer Interfaces
format text
author NGUYEN, Huu Hoang
NGUYEN, Nhat Minh
XIE, Chunyao
AHMADI, Zahra
KUDENDO, Daniel
DOAN, Thanh-Nam
JIANG, Lingxiao
title MANDO-HGT: Heterogeneous graph transformers for smart contract vulnerability detection
publisher Institutional Knowledge at Singapore Management University
publishDate 2023
url https://ink.library.smu.edu.sg/sis_research/8343
https://ink.library.smu.edu.sg/context/sis_research/article/9346/viewcontent/MSR2023MANDO.pdf