Detection, recognition and understanding document layout

Bibliographic Details
Main Author: Loh, Yi Ze
Other Authors: Loke Yuan Ren
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Subjects:
Online Access: https://hdl.handle.net/10356/175020
Description
Summary: Effective management of personal finances is essential for financial stability. Traditional methods of expense tracking require manually entering data into budgeting applications, which is cumbersome and error-prone. To encourage individuals to manage their personal finances, this project seeks to leverage advancements in DocumentAI to automate the extraction of key information from receipts. Experiments were carried out with the LayoutLMv3 and Donut models to determine a suitable approach to this problem. Donut was chosen as the solution because of its end-to-end approach and entity linking capabilities. The first fine-tuned Donut model achieved an F1 score of 54% and a Tree Edit Distance accuracy of 49%. To improve the model's performance, data augmentation techniques were employed to increase the size of the training dataset. The second fine-tuned Donut model achieved an F1 score of 95% and a Tree Edit Distance accuracy of 87%. To enable users to upload receipts and extract information for expense tracking, a Receipt Extraction bot was developed using the Telegram API and MongoDB Atlas. The scope of this project includes a comprehensive literature review of DocumentAI models, experiments on publicly available datasets, model fine-tuning, and software development.
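
The summary reports F1 scores without stating how they were computed. As an illustration only, the sketch below shows one common way a field-level F1 score can be calculated for structured receipt output: each prediction and reference is flattened into (field path, value) pairs, and exact matches count as true positives. This is an assumption about the metric, not the project's evaluation code; the function names `flatten` and `field_f1` and the receipt fields used in the usage note are hypothetical.

```python
from collections import Counter


def flatten(record, prefix=""):
    """Flatten a nested dict/list into a list of (field_path, value) pairs."""
    items = []
    if isinstance(record, dict):
        for key, value in record.items():
            items.extend(flatten(value, f"{prefix}{key}."))
    elif isinstance(record, list):
        # List elements share the parent path, so their order is ignored.
        for value in record:
            items.extend(flatten(value, prefix))
    else:
        items.append((prefix.rstrip("."), str(record).strip()))
    return items


def field_f1(predictions, references):
    """Micro-averaged F1 over exactly matching (field_path, value) pairs."""
    tp = fp = fn = 0
    for pred, ref in zip(predictions, references):
        pred_fields = Counter(flatten(pred))
        ref_fields = Counter(flatten(ref))
        # Multiset intersection counts each matching pair at most once.
        matched = sum((pred_fields & ref_fields).values())
        tp += matched
        fp += sum(pred_fields.values()) - matched
        fn += sum(ref_fields.values()) - matched
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For example, if a hypothetical reference receipt has four fields (store, total, and one item with a name and a price) and the prediction gets every field right except the total, then precision and recall are both 3/4 and `field_f1` returns 0.75. Tree Edit Distance accuracy, the other metric reported above, additionally accounts for the structure of the output tree and is not covered by this sketch.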