DAppSCAN: Building large-scale datasets for smart contract weaknesses in DApp Projects
The Smart Contract Weakness Classification Registry (SWC Registry) is a widely recognized list of smart contract weaknesses specific to the Ethereum platform. Despite the SWC Registry not being updated with new entries since 2020, the sustained development of smart contract analysis tools for detect...
Main Authors: | ZHENG, Zibin; SU, Jianzhong; CHEN, Jiachi; LO, David; ZHONG, Zhijie; YE, Mingxi |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2024 |
Subjects: | |
Online Access: | https://ink.library.smu.edu.sg/sis_research/8971 https://ink.library.smu.edu.sg/context/sis_research/article/9974/viewcontent/DAppSCAN_sv.pdf |
Institution: | Singapore Management University |
id |
sg-smu-ink.sis_research-9974 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-9974 2024-07-17T06:50:01Z DAppSCAN: Building large-scale datasets for smart contract weaknesses in DApp Projects ZHENG, Zibin SU, Jianzhong CHEN, Jiachi LO, David ZHONG, Zhijie YE, Mingxi The Smart Contract Weakness Classification Registry (SWC Registry) is a widely recognized list of smart contract weaknesses specific to the Ethereum platform. Despite the SWC Registry not being updated with new entries since 2020, the sustained development of smart contract analysis tools for detecting SWC-listed weaknesses highlights their ongoing significance in the field. However, evaluating these tools has proven challenging due to the absence of a large, unbiased, real-world dataset. To address this problem, we aim to build a large-scale SWC weakness dataset from real-world DApp projects. We recruited 22 participants and spent 44 person-months analyzing 1,199 open-source audit reports from 29 security teams. In total, we identified 9,154 weaknesses and developed two distinct datasets, i.e., DAPPSCAN-SOURCE and DAPPSCAN-BYTECODE. The DAPPSCAN-SOURCE dataset comprises 39,904 Solidity files, featuring 1,618 SWC weaknesses sourced from 682 real-world DApp projects. However, the Solidity files in this dataset may not be directly compilable for further analysis. To facilitate automated analysis, we developed a tool capable of automatically identifying dependency relationships within DApp projects and completing missing public libraries. Using this tool, we created the DAPPSCAN-BYTECODE dataset, which consists of 6,665 compiled smart contracts with 888 SWC weaknesses. Based on DAPPSCAN-BYTECODE, we conducted an empirical study to evaluate the performance of state-of-the-art smart contract weakness detection tools. The evaluation results revealed sub-par performance for these tools in terms of both effectiveness and success detection rate, indicating that future development should prioritize real-world datasets over simplistic toy contracts.
2024-06-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/8971 info:doi/10.1109/TSE.2024.3383422 https://ink.library.smu.edu.sg/context/sis_research/article/9974/viewcontent/DAppSCAN_sv.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Empirical study Smart contracts SWC weakness dataset ethereum Finance and Financial Management Software Engineering |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
Empirical study Smart contracts SWC weakness dataset ethereum Finance and Financial Management Software Engineering |
description |
The Smart Contract Weakness Classification Registry (SWC Registry) is a widely recognized list of smart contract weaknesses specific to the Ethereum platform. Despite the SWC Registry not being updated with new entries since 2020, the sustained development of smart contract analysis tools for detecting SWC-listed weaknesses highlights their ongoing significance in the field. However, evaluating these tools has proven challenging due to the absence of a large, unbiased, real-world dataset. To address this problem, we aim to build a large-scale SWC weakness dataset from real-world DApp projects. We recruited 22 participants and spent 44 person-months analyzing 1,199 open-source audit reports from 29 security teams. In total, we identified 9,154 weaknesses and developed two distinct datasets, i.e., DAPPSCAN-SOURCE and DAPPSCAN-BYTECODE. The DAPPSCAN-SOURCE dataset comprises 39,904 Solidity files, featuring 1,618 SWC weaknesses sourced from 682 real-world DApp projects. However, the Solidity files in this dataset may not be directly compilable for further analysis. To facilitate automated analysis, we developed a tool capable of automatically identifying dependency relationships within DApp projects and completing missing public libraries. Using this tool, we created the DAPPSCAN-BYTECODE dataset, which consists of 6,665 compiled smart contracts with 888 SWC weaknesses. Based on DAPPSCAN-BYTECODE, we conducted an empirical study to evaluate the performance of state-of-the-art smart contract weakness detection tools. The evaluation results revealed sub-par performance for these tools in terms of both effectiveness and success detection rate, indicating that future development should prioritize real-world datasets over simplistic toy contracts. |
format |
text |
author |
ZHENG, Zibin SU, Jianzhong CHEN, Jiachi LO, David ZHONG, Zhijie YE, Mingxi |
author_sort |
ZHENG, Zibin |
title |
DAppSCAN: Building large-scale datasets for smart contract weaknesses in DApp Projects |
publisher |
Institutional Knowledge at Singapore Management University |
publishDate |
2024 |
url |
https://ink.library.smu.edu.sg/sis_research/8971 https://ink.library.smu.edu.sg/context/sis_research/article/9974/viewcontent/DAppSCAN_sv.pdf |
_version_ |
1814047697502470144 |