Leveraging large language models and BERT for log parsing and anomaly detection
Computer systems and applications generate large amounts of logs to measure and record information, which is vital for protecting the systems from malicious attacks and useful for repairing faults, especially with the rapid development of distributed computing. Among various logs, the anomaly log is beneficial for operations and maintenance (O&M) personnel to locate faults and improve efficiency.
Main Authors: Zhou, Yihan; Chen, Yan; Rao, Xuanming; Zhou, Yukang; Li, Yuxin; Hu, Chao
Other Authors: School of Computer Science and Engineering
Format: Article
Language: English
Published: 2024
Subjects: Computer and Information Science; Anomaly log detection; Large language models
Online Access: https://hdl.handle.net/10356/181426
Institution: Nanyang Technological University
id
sg-ntu-dr.10356-181426
record_format
dspace
spelling
sg-ntu-dr.10356-1814262024-12-06T15:38:31Z Leveraging large language models and BERT for log parsing and anomaly detection. Zhou, Yihan; Chen, Yan; Rao, Xuanming; Zhou, Yukang; Li, Yuxin; Hu, Chao. School of Computer Science and Engineering. Computer and Information Science; Anomaly log detection; Large language models. Computer systems and applications generate large amounts of logs to measure and record information, which is vital for protecting the systems from malicious attacks and useful for repairing faults, especially with the rapid development of distributed computing. Among various logs, the anomaly log is beneficial for operations and maintenance (O&M) personnel to locate faults and improve efficiency. In this paper, we utilize a large language model, ChatGPT, for the log parsing task. We choose the BERT model, a self-supervised framework, for log anomaly detection. BERT, a transformer-encoder-based model with a self-attention mechanism, can better handle context-dependent tasks such as anomaly log detection. Meanwhile, it relies on the masked language model and next sentence prediction tasks during pretraining to capture the normal log sequence pattern. The experimental results on two log datasets show that the BERT model combined with an LLM performed better than classical models such as DeepLog and LogAnomaly. Published version. This research was sponsored in part by the National Natural Science Foundation of China (No. 62177046 and 62477046), Hunan 14th Five-Year Plan Educational Science Research Project (No. XJK23AJD022 and XJK23AJD021), Hunan Social Science Foundation (No. 22YBA012), Hunan Provincial Key Research and Development Project (No. 2021SK2022), and High Performance Computing Center of Central South University. 2024-12-02T04:47:20Z 2024-12-02T04:47:20Z 2024 Journal Article Zhou, Y., Chen, Y., Rao, X., Zhou, Y., Li, Y. & Hu, C. (2024). Leveraging large language models and BERT for log parsing and anomaly detection. Mathematics, 12(17), 12172758-. https://dx.doi.org/10.3390/math12172758 2227-7390 https://hdl.handle.net/10356/181426 10.3390/math12172758 2-s2.0-85203646702 17 12 12172758 en Mathematics © 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). application/pdf
institution
Nanyang Technological University
building
NTU Library
continent
Asia
country
Singapore
content_provider
NTU Library
collection
DR-NTU
language
English
topic
Computer and Information Science; Anomaly log detection; Large language models
spellingShingle
Computer and Information Science; Anomaly log detection; Large language models; Zhou, Yihan; Chen, Yan; Rao, Xuanming; Zhou, Yukang; Li, Yuxin; Hu, Chao; Leveraging large language models and BERT for log parsing and anomaly detection
description
Computer systems and applications generate large amounts of logs to measure and record information, which is vital for protecting the systems from malicious attacks and useful for repairing faults, especially with the rapid development of distributed computing. Among various logs, the anomaly log is beneficial for operations and maintenance (O&M) personnel to locate faults and improve efficiency. In this paper, we utilize a large language model, ChatGPT, for the log parsing task. We choose the BERT model, a self-supervised framework, for log anomaly detection. BERT, a transformer-encoder-based model with a self-attention mechanism, can better handle context-dependent tasks such as anomaly log detection. Meanwhile, it relies on the masked language model and next sentence prediction tasks during pretraining to capture the normal log sequence pattern. The experimental results on two log datasets show that the BERT model combined with an LLM performed better than classical models such as DeepLog and LogAnomaly.
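The abstract above describes flagging anomalies by checking whether each log event fits the normal sequence pattern learned through masked-token prediction. As a rough illustration only (not the paper's implementation, which uses a pretrained BERT encoder), the sketch below scores sequences of hypothetical log-template tokens with a simple count-based stand-in for the masked language model objective:

```python
from collections import defaultdict, Counter

class MaskedLogScorer:
    """Toy stand-in for BERT's masked-language-model objective:
    predict a masked log template from its immediate neighbours,
    using co-occurrence counts learned from normal sequences."""

    def __init__(self):
        # (previous template, next template) -> counts of the middle template
        self.context_counts = defaultdict(Counter)

    def fit(self, normal_sequences):
        for seq in normal_sequences:
            for i in range(1, len(seq) - 1):
                self.context_counts[(seq[i - 1], seq[i + 1])][seq[i]] += 1

    def anomaly_score(self, seq, top_k=2):
        """Fraction of masked positions whose true template falls
        outside the top-k predictions; higher = more anomalous."""
        misses, total = 0, 0
        for i in range(1, len(seq) - 1):
            total += 1
            preds = self.context_counts.get((seq[i - 1], seq[i + 1]))
            if preds is None:
                misses += 1  # context never seen in normal data
                continue
            top = [t for t, _ in preds.most_common(top_k)]
            if seq[i] not in top:
                misses += 1
        return misses / total if total else 0.0

scorer = MaskedLogScorer()
scorer.fit([["open", "read", "close"]] * 10)
print(scorer.anomaly_score(["open", "read", "close"]))   # 0.0 (normal)
print(scorer.anomaly_score(["open", "error", "close"]))  # 1.0 (anomalous)
```

The template names and the scorer itself are hypothetical; the actual model replaces these counts with contextual BERT embeddings and a learned prediction head over the full sequence.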
author2
School of Computer Science and Engineering
author_facet
School of Computer Science and Engineering; Zhou, Yihan; Chen, Yan; Rao, Xuanming; Zhou, Yukang; Li, Yuxin; Hu, Chao
format
Article
author
Zhou, Yihan; Chen, Yan; Rao, Xuanming; Zhou, Yukang; Li, Yuxin; Hu, Chao
author_sort
Zhou, Yihan
title
Leveraging large language models and BERT for log parsing and anomaly detection
title_short
Leveraging large language models and BERT for log parsing and anomaly detection
title_full
Leveraging large language models and BERT for log parsing and anomaly detection
title_fullStr
Leveraging large language models and BERT for log parsing and anomaly detection
title_full_unstemmed
Leveraging large language models and BERT for log parsing and anomaly detection
title_sort
leveraging large language models and bert for log parsing and anomaly detection
publishDate
2024
url
https://hdl.handle.net/10356/181426
_version_
1819113083643101184