Leveraging large language models and BERT for log parsing and anomaly detection

Computer systems and applications generate large volumes of logs to measure and record runtime information, which is vital for protecting systems from malicious attacks and useful for diagnosing faults, especially with the rapid development of distributed computing. Among the various kinds of logs, anomaly logs help operations and maintenance (O&M) personnel locate faults and improve efficiency. In this paper, we utilize a large language model, ChatGPT, for the log parsing task, and we choose BERT, a self-supervised framework, for log anomaly detection. BERT, a transformer-encoder model with a self-attention mechanism, is well suited to context-dependent tasks such as anomaly log detection. During pretraining it relies on the masked language model and next sentence prediction tasks to capture the pattern of normal log sequences. Experimental results on two log datasets show that the BERT model combined with an LLM outperforms classical models such as DeepLog and LogAnomaly.
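
The record does not reproduce the paper's parsing prompts, but LLM-based log parsing generally means asking the model to abstract each raw message into a constant template with the variable fields masked out. A minimal sketch, assuming the OpenAI chat-completions client and a hypothetical prompt (neither is specified in this record):

```python
# Minimal sketch of LLM-based log parsing. Assumptions: the OpenAI
# chat-completions API stands in for "ChatGPT", and the prompt wording
# is hypothetical -- the paper's actual prompts are not in this record.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = (
    "Abstract the following raw log message into a template: replace the "
    "variable parts (IDs, IPs, paths, numbers) with <*> and return only "
    "the template string.\n\nLog: {log}"
)

def parse_log(raw_log: str) -> str:
    """Ask the LLM to turn one raw log line into a constant template."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # placeholder; the record only says "ChatGPT"
        messages=[{"role": "user", "content": PROMPT.format(log=raw_log)}],
        temperature=0,  # deterministic output, so repeated logs map to one template
    )
    return response.choices[0].message.content.strip()

# e.g. "Received block blk_-562925280853087685 of size 67108864 from /10.251.91.84"
# should come back as "Received block <*> of size <*> from <*>".
```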

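Similarly, one common way to use a masked-language-model objective for anomaly detection (in the spirit of LogBERT; the paper's exact scoring rule is not given in this record) is to mask each position of a parsed log sequence and flag the sequence when the model fails to rank the observed token among its top-k predictions, since pretraining on normal logs makes normal sequences predictable. A rough sketch with Hugging Face Transformers, using a generic checkpoint purely for illustration:

```python
# Sketch of masked-LM anomaly scoring over a parsed log sequence. This is
# an assumed scoring scheme (LogBERT-style), not the paper's verified method.
import torch
from transformers import BertForMaskedLM, BertTokenizer

# In practice the model would be pretrained on normal log sequences;
# the generic English checkpoint here is only a stand-in.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased").eval()

def is_anomalous(log_sequence: str, top_k: int = 10) -> bool:
    """Mask each token in turn; flag the sequence if any observed token
    falls outside the model's top-k predictions for that position."""
    ids = tokenizer(log_sequence, return_tensors="pt")["input_ids"][0]
    for pos in range(1, len(ids) - 1):           # skip [CLS] and [SEP]
        masked = ids.clone()
        original = masked[pos].item()
        masked[pos] = tokenizer.mask_token_id    # hide the observed token
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits[0, pos]
        if original not in logits.topk(top_k).indices:
            return True                          # unpredictable token -> anomaly
    return False

# e.g. is_anomalous("received block <*> of size <*> from <*>") scores one
# parsed sequence; a window of log keys can be scored the same way.
```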

Bibliographic Details
Main Authors: Zhou, Yihan, Chen, Yan, Rao, Xuanming, Zhou, Yukang, Li, Yuxin, Hu, Chao
Other Authors: School of Computer Science and Engineering
Format: Article
Language: English
Published: 2024
Subjects: Computer and Information Science; Anomaly log detection; Large language models
Online Access: https://hdl.handle.net/10356/181426
Institution: Nanyang Technological University
Citation: Zhou, Y., Chen, Y., Rao, X., Zhou, Y., Li, Y. & Hu, C. (2024). Leveraging large language models and BERT for log parsing and anomaly detection. Mathematics, 12(17), 2758. https://dx.doi.org/10.3390/math12172758
ISSN: 2227-7390
DOI: 10.3390/math12172758
Scopus ID: 2-s2.0-85203646702
Funding: This research was sponsored in part by the National Natural Science Foundation of China (No. 62177046 and 62477046), the Hunan 14th Five-Year Plan Educational Science Research Project (No. XJK23AJD022 and XJK23AJD021), the Hunan Social Science Foundation (No. 22YBA012), the Hunan Provincial Key Research and Development Project (No. 2021SK2022), and the High Performance Computing Center of Central South University.
License: © 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).