Exploratory Prompting of Large Language Models to Act as Co-Pilots for Augmenting Business Process Work in Document Classification

Bibliographic Details
Main Authors: Ilagan, Jose Ramon; Ilagan, Joseph Benjamin R; Basallo, Claire Louisse; Alabastro, Zachary Matthew
Format: text
Published: Archīum Ateneo 2024
Subjects:
AI
GPT
LLM
NLP
RPA
Online Access: https://archium.ateneo.edu/qmit-faculty-pubs/28
https://archium.ateneo.edu/context/qmit-faculty-pubs/article/1027/viewcontent/1_s2.0_S1877050924011396_main.pdf
Institution: Ateneo De Manila University
Description
Summary: Businesses deal with many types of documents containing unstructured data. The data in these documents must be converted into digital forms, the only forms that other automated systems can process. One generic use case is document classification, which usually involves manual transformation because the process requires human understanding. These documents go beyond those generated through regular business transactions and operations and also include web-based content such as online news, blogs, e-mails, and various digital libraries. Recent developments in robotic process automation (RPA) and artificial intelligence (AI) aim to automate these otherwise expensive, time-consuming, and repetitive manual steps. Through more powerful natural language processing (NLP) and natural language understanding (NLU) capabilities, large language models (LLMs) may give a significant boost to applying AI in RPA initiatives. This study proposes a general approach to making LLMs useful as document classifier co-pilots for knowledge workers in charge of classifying documents. The manner of prompt engineering and refinement, using labeled health insurance documents, to achieve better results is discussed and evaluated through early, iterative classification attempts. However, early tests with a complex sample use case show unsatisfactory results. The study ends with recommendations for future work to improve precision and recall performance.
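
The summary names precision and recall as the measures that future work should improve. As a rough illustration of how those two measures could be computed for a document classifier, the sketch below scores predicted labels against known labels, one class at a time. It is not the study's evaluation code: the function name `precision_recall` and the labels `claim`, `policy`, and `invoice` are hypothetical, chosen only for the example.

```python
from collections import Counter


def precision_recall(true_labels, predicted_labels):
    """Return {label: (precision, recall)} for every label that appears.

    Hypothetical helper for illustration; not taken from the study.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")

    true_pos = Counter()   # predicted label matches the true label
    false_pos = Counter()  # label was predicted but the truth was different
    false_neg = Counter()  # label was the truth but something else was predicted

    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == predicted:
            true_pos[truth] += 1
        else:
            false_pos[predicted] += 1
            false_neg[truth] += 1

    scores = {}
    for label in set(true_labels) | set(predicted_labels):
        predicted_total = true_pos[label] + false_pos[label]
        actual_total = true_pos[label] + false_neg[label]
        # Report 0.0 when a denominator is zero instead of raising an error.
        precision = true_pos[label] / predicted_total if predicted_total else 0.0
        recall = true_pos[label] / actual_total if actual_total else 0.0
        scores[label] = (precision, recall)
    return scores


# Made-up labels: one "claim" document is misclassified as "policy".
truth = ["claim", "claim", "policy", "invoice"]
predicted = ["claim", "policy", "policy", "invoice"]
for label, (precision, recall) in sorted(precision_recall(truth, predicted).items()):
    print(f"{label}: precision={precision:.2f} recall={recall:.2f}")
```

In this made-up run, the single misclassification lowers recall for `claim` (one of its two documents was missed) and lowers precision for `policy` (one of its two predictions was wrong), which is why the two measures are usually reported together.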