Analyzing and revivifying function signature inference using deep learning

Function signature plays an important role in binary analysis and security enhancement, with typical examples in bug finding and control-flow integrity enforcement. However, recovery of function signatures by static binary analysis is challenging since crucial information vital for such recovery is...

Bibliographic Details
Main Authors:	LIN, Yan, SINGHAL, Trisha, GAO, Debin, LO, David
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2024
Subjects:	Function signature recurrent neural network compiler optimization control-flow integrity Software Engineering
Online Access:	https://ink.library.smu.edu.sg/sis_research/9621 https://ink.library.smu.edu.sg/context/sis_research/article/10621/viewcontent/emse24_FunctionSignature_av.pdf
Institution:	Singapore Management University

id	sg-smu-ink.sis_research-10621
record_format	dspace
spelling	sg-smu-ink.sis_research-10621 2024-11-23T15:37:30Z 2024-05-01T07:00:00Z text application/pdf info:doi/10.1007/s10664-024-10453-9 http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	Function signature recurrent neural network compiler optimization control-flow integrity Software Engineering
description	Function signature plays an important role in binary analysis and security enhancement, with typical examples in bug finding and control-flow integrity enforcement. However, recovery of function signatures by static binary analysis is challenging since crucial information vital for such recovery is stripped off during compilation. Although function signature recovery using deep learning (DL) is proposed in an effort to handle such challenges, the reported accuracy is low for binaries compiled with optimizations. In this paper, we first perform a systematic study to quantify the extent to which compiler optimizations (negatively) impact the accuracy of existing DL techniques based on Recurrent Neural Network (RNN) for function signature recovery. Our experiments show that the state-of-the-art DL technique has its accuracy dropped from 98.7% to 87.7% when training and testing optimized binaries. We further investigate the type of instructions that existing RNN model deems most important in inferring function signatures with the help of saliency map. The results show that existing RNN model mistakenly considers non-argument-accessing instructions to infer the number of arguments, especially when dealing with optimized binaries. Finally, we identify specific weaknesses in such existing approaches and propose an enhanced DL approach named ReSIL to incorporate compiler-optimization-specific domain knowledge into the learning process. Our experimental results show that ReSIL significantly improves the accuracy and F1 score in inferring function signatures, e.g., with accuracy in inferring the number of arguments for callees compiled with optimization flag O1 from 84.83% to 92.68%. Meanwhile, ReSIL correctly considers the argument-accessing instructions as the most important ones to perform the inferencing. We also demonstrate security implications of ReSIL in Control-Flow Integrity enforcement in stopping potential Counterfeit Object-Oriented Programming (COOP) attacks.
format	text
author	LIN, Yan SINGHAL, Trisha GAO, Debin LO, David
author_sort	LIN, Yan
title	Analyzing and revivifying function signature inference using deep learning
publisher	Institutional Knowledge at Singapore Management University
publishDate	2024
url	https://ink.library.smu.edu.sg/sis_research/9621 https://ink.library.smu.edu.sg/context/sis_research/article/10621/viewcontent/emse24_FunctionSignature_av.pdf
