S-Looper: Automatic summarization for multipath string loops

Loops are important yet most challenging program constructs to analyze for various program analysis tasks. Existing loop analysis techniques mainly handle well loops that contain only integer variables with a single path in the loop body. The key challenge in summarizing a multiple-path loop is that...

Full description

Saved in:
Bibliographic Details
Main Authors: XIE, Xiaofei, LIU, Yang, LE, Wei, LI, Xiaohong, CHEN, Hongxu
Format: text
Language:English
Published: Institutional Knowledge at Singapore Management University 2015
Subjects:
Online Access:https://ink.library.smu.edu.sg/sis_research/7103
https://ink.library.smu.edu.sg/context/sis_research/article/8106/viewcontent/2771783.2771815.pdf
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Singapore Management University
Language: English
Description
Summary:Loops are important yet most challenging program constructs to analyze for various program analysis tasks. Existing loop analysis techniques mainly handle well loops that contain only integer variables with a single path in the loop body. The key challenge in summarizing a multiple-path loop is that a loop traversal can yield a large number of possibilities due to the different execution orders of these paths located in the loop; when a loop contains a conditional branch related to string content, we potentially need to track every character in the string for loop summarization, which is expensive. In this paper, we propose an approach, named S-Looper, to automatically summarize a type of loops related to a string traversal. This type of loops can contain multiple paths, and the branch conditions in the loop can be related to string content. Our approach is to identify patterns of the string based on the branch conditions along each path in the loop. Based on such patterns, we then generate a loop summary that describes the path conditions of a loop traversal as well as the symbolic values of each variable at the exit of a loop. Combined with vulnerability conditions, we are thus able to generate test inputs that traverse a loop in a specific way and lead to exploitation. Our experiments show that handling such string loops can largely improve the buffer overflow detection capabilities of the existing symbolic analysis tool. We also compared our techniques with KLEE and PEX, and show that we can generate test inputs more effectively and efficiently.