Do-GOOD: Towards distribution shift evaluation for pre-trained visual document understanding models
Numerous pre-training techniques for visual document understanding (VDU) have recently shown substantial improvements in performance across a wide range of document tasks. However, these pre-trained VDU models cannot guarantee continued success when the distribution of test data differs from the distribution of training data. In this paper, to investigate how robust existing pre-trained VDU models are to various distribution shifts, we first develop an out-of-distribution (OOD) benchmark termed Do-GOOD for the fine-Grained analysis of Document image-related tasks specifically. The Do-GOOD benchmark defines the underlying mechanisms that result in different distribution shifts and contains 9 OOD datasets covering 3 VDU-related tasks, i.e., document information extraction, classification, and question answering. We then evaluate the robustness of, and perform a fine-grained analysis on, 5 of the latest pre-trained VDU models and 2 typical OOD generalization algorithms on these OOD datasets. Results from the experiments demonstrate that there is a significant performance gap between the in-distribution (ID) and OOD settings for document images, and that fine-grained analysis of distribution shifts can reveal the brittle nature of existing pre-trained VDU models and OOD generalization algorithms. The code and datasets for our Do-GOOD benchmark can be found at https://github.com/MAEHCM/Do-GOOD.
Main Authors: | HE, Jiabang, HU, Yi, WANG, Lei, XU, Xing, LIU, Ning, LIU, Hui |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2023 |
Subjects: | out-of-distribution; pre-trained models; visual document understanding; document information extraction; Databases and Information Systems; Numerical Analysis and Scientific Computing |
Online Access: | https://ink.library.smu.edu.sg/sis_research/8145 https://ink.library.smu.edu.sg/context/sis_research/article/9148/viewcontent/3539618.3591670_pvoa.pdf |
DOI: | 10.1145/3539618.3591670 |
License: | http://creativecommons.org/licenses/by-nc-nd/4.0/ |
Collection: | Research Collection School Of Computing and Information Systems |
Institution: | Singapore Management University |