Using finite-state models for log differencing

Much work has been published on extracting various kinds of models from logs that document the execution of running systems. In many cases, however, for example in the context of evolution, testing, or malware analysis, engineers are interested not only in a single log but in a set of several logs,...

Bibliographic Details
Main Authors: AMAR, Hen, BAO, Lingfeng, BUSANY, Nimrod, LO, David, MAOZ, Shahar
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2018
Subjects: Log analysis; Model inference; Software Engineering; Theory and Algorithms
Online Access: https://ink.library.smu.edu.sg/sis_research/4302
https://ink.library.smu.edu.sg/context/sis_research/article/5305/viewcontent/log_diff_fse18.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-5305
record_format dspace
spelling sg-smu-ink.sis_research-5305 2020-03-27T01:43:28Z
2018-11-01T07:00:00Z text application/pdf info:doi/10.1145/3236024.3236069 http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Log analysis
Model inference
Software Engineering
Theory and Algorithms
description Much work has been published on extracting various kinds of models from logs that document the execution of running systems. In many cases, however, for example in the context of evolution, testing, or malware analysis, engineers are interested not only in a single log but in a set of several logs, each of which originated from a different set of runs of the system at hand. Then, the difference between the logs is the main target of interest. In this work we investigate the use of finite-state models for log differencing. Rather than comparing the logs directly, we generate concise models to describe and highlight their differences. Specifically, we present two algorithms based on the classic k-Tails algorithm: 2KDiff, which computes and highlights simple traces containing sequences of k events that belong to one log but not the other, and nKDiff, which extends k-Tails from one to many logs, and distinguishes the sequences of length k that are common to all logs from the ones found in only some of them, all on top of a single, rich model. Both algorithms are sound and complete modulo the abstraction defined by the use of k-Tails. We implemented both algorithms and evaluated their performance on mutated logs that we generated based on models from the literature. We conducted a user study including 60 participants demonstrating the effectiveness of the approach in log differencing tasks. We have further performed a case study to examine the use of our approach in malware analysis. Finally, we have made our work available in a prototype web-application, for experiments.
format text
author AMAR, Hen
BAO, Lingfeng
BUSANY, Nimrod
LO, David
MAOZ, Shahar
title Using finite-state models for log differencing
publisher Institutional Knowledge at Singapore Management University
publishDate 2018
url https://ink.library.smu.edu.sg/sis_research/4302
https://ink.library.smu.edu.sg/context/sis_research/article/5305/viewcontent/log_diff_fse18.pdf
_version_ 1770574604516982784
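
Illustrative sketch (not part of the catalog record). The description above says that 2KDiff reports sequences of k events that belong to one log but not the other, and that nKDiff separates the length-k sequences common to all logs from those found in only some. The Python below is a minimal sketch of that underlying comparison only, assuming a log is a list of traces and a trace is a list of event names. It is not the authors' 2KDiff or nKDiff: it builds no finite-state model and omits the details of k-Tails. The function names and the example logs are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Trace = List[str]          # one run of the system, as a list of event names
Log = List[Trace]          # a log is a collection of traces
KSeq = Tuple[str, ...]     # a run of k consecutive events


def k_sequences(log: Log, k: int) -> Set[KSeq]:
    """Collect every run of k consecutive events found in any trace of the log."""
    seqs: Set[KSeq] = set()
    for trace in log:
        for i in range(len(trace) - k + 1):
            seqs.add(tuple(trace[i:i + k]))
    return seqs


def two_log_diff(log_a: Log, log_b: Log, k: int) -> Tuple[Set[KSeq], Set[KSeq]]:
    """Return the k-sequences seen only in log_a and those seen only in log_b."""
    a, b = k_sequences(log_a, k), k_sequences(log_b, k)
    return a - b, b - a


def many_log_split(logs: Dict[str, Log], k: int) -> Tuple[Set[KSeq], Dict[KSeq, Set[str]]]:
    """Split k-sequences into those common to all logs and those in only some.

    For each non-common sequence, record the names of the logs that contain it.
    """
    per_log = {name: k_sequences(log, k) for name, log in logs.items()}
    common = set.intersection(*per_log.values()) if per_log else set()
    partial: Dict[KSeq, Set[str]] = {}
    for name, seqs in per_log.items():
        for seq in seqs - common:
            partial.setdefault(seq, set()).add(name)
    return common, partial


# Hypothetical logs, invented purely for illustration.
old_log = [["login", "browse", "logout"], ["login", "buy", "logout"]]
new_log = [["login", "browse", "buy", "logout"]]

only_old, only_new = two_log_diff(old_log, new_log, k=2)
common, partial = many_log_split({"old": old_log, "new": new_log}, k=2)
print(sorted(only_old))
print(sorted(only_new))
print(sorted(common))
```

Comparing sets of k-sequences rather than raw log lines is what keeps the result independent of trace order and repetition. The cost is that anything not visible in a window of k events is abstracted away, which matches the description's qualification that the algorithms are sound and complete modulo the abstraction defined by k-Tails.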