PromptChart: prompting InstructGPT for zero & few-shot chart question answering and summarization
Charts are commonly used in data analysis to summarize key insights and answer complex queries. However, comprehending charts through tasks such as chart question answering (CQA) and chart summarization (CS) can be cognitively intensive. Existing state-of-the-art models rely on fine-tuning strategie...
Main Author: | Do, Xuan Long |
---|---|
Other Authors: | Joty Shafiq Rayhan |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University 2023 |
Subjects: | Science::Mathematics; Engineering::Computer science and engineering |
Online Access: | https://hdl.handle.net/10356/166497 |
Institution: | Nanyang Technological University |
id | sg-ntu-dr.10356-166497 |
---|---|
record_format | dspace |
spelling | sg-ntu-dr.10356-166497; 2023-05-08T15:39:20Z; PromptChart: prompting InstructGPT for zero & few-shot chart question answering and summarization; Do, Xuan Long; Joty Shafiq Rayhan; Patrick Pun Chi Seng; School of Physical and Mathematical Sciences; cspun@ntu.edu.sg, srjoty@ntu.edu.sg; Science::Mathematics; Engineering::Computer science and engineering; Charts are commonly used in data analysis to summarize key insights and answer complex queries. However, comprehending charts through tasks such as chart question answering (CQA) and chart summarization (CS) can be cognitively intensive. Existing state-of-the-art models rely on fine-tuning strategies and struggle with tasks that involve logical and arithmetic reasoning. Meanwhile, large language models (LLMs) such as InstructGPT have demonstrated strong capabilities to solve target tasks by conditioning on few-shot prompts. Nonetheless, exploring how chart comprehension tasks could benefit from LLMs has received limited attention, especially in the few-shot setting. To bridge this gap, we present PromptChart, the first multimodal framework for few-shot CQA and CS. It consists of three modules: (i) InstructGPT for generating the textual responses, (ii) Visual Data Table Generator (VDTG) aiming to output our proposed visual data tables which integrate data tables and visual colors from charts, and (iii) Prompt Constructor (PC) focusing on creating few-shot prompts. To construct the prompts, we analyze the chart benchmarks carefully, and define a set of attributes for each task to guide the prompt constructions. Compared with existing CQA and CS baselines, PromptChart achieves state-of-the-art performance with significant improvements in automatic metrics and human preferences.; Bachelor of Science in Mathematical and Computer Sciences; 2023-05-02T02:24:01Z; 2023-05-02T02:24:01Z; 2023; Final Year Project (FYP); Do, X. L. (2023). PromptChart: prompting InstructGPT for zero & few-shot chart question answering and summarization. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/166497; https://hdl.handle.net/10356/166497; en; application/pdf; Nanyang Technological University |
institution | Nanyang Technological University |
building | NTU Library |
continent | Asia |
country | Singapore Singapore |
content_provider | NTU Library |
collection | DR-NTU |
language | English |
topic | Science::Mathematics; Engineering::Computer science and engineering |
spellingShingle | Science::Mathematics; Engineering::Computer science and engineering; Do, Xuan Long; PromptChart: prompting InstructGPT for zero & few-shot chart question answering and summarization |
description | Charts are commonly used in data analysis to summarize key insights and answer complex queries. However, comprehending charts through tasks such as chart question answering (CQA) and chart summarization (CS) can be cognitively intensive. Existing state-of-the-art models rely on fine-tuning strategies and struggle with tasks that involve logical and arithmetic reasoning. Meanwhile, large language models (LLMs) such as InstructGPT have demonstrated strong capabilities to solve target tasks by conditioning on few-shot prompts. Nonetheless, exploring how chart comprehension tasks could benefit from LLMs has received limited attention, especially in the few-shot setting. To bridge this gap, we present PromptChart, the first multimodal framework for few-shot CQA and CS. It consists of three modules: (i) InstructGPT for generating the textual responses, (ii) Visual Data Table Generator (VDTG) aiming to output our proposed visual data tables which integrate data tables and visual colors from charts, and (iii) Prompt Constructor (PC) focusing on creating few-shot prompts. To construct the prompts, we analyze the chart benchmarks carefully, and define a set of attributes for each task to guide the prompt constructions. Compared with existing CQA and CS baselines, PromptChart achieves state-of-the-art performance with significant improvements in automatic metrics and human preferences. |
author2 | Joty Shafiq Rayhan |
author_facet | Joty Shafiq Rayhan; Do, Xuan Long |
format | Final Year Project |
author | Do, Xuan Long |
author_sort | Do, Xuan Long |
title | PromptChart: prompting InstructGPT for zero & few-shot chart question answering and summarization |
title_short | PromptChart: prompting InstructGPT for zero & few-shot chart question answering and summarization |
title_full | PromptChart: prompting InstructGPT for zero & few-shot chart question answering and summarization |
title_fullStr | PromptChart: prompting InstructGPT for zero & few-shot chart question answering and summarization |
title_full_unstemmed | PromptChart: prompting InstructGPT for zero & few-shot chart question answering and summarization |
title_sort | promptchart: prompting instructgpt for zero & few-shot chart question answering and summarization |
publisher | Nanyang Technological University |
publishDate | 2023 |
url | https://hdl.handle.net/10356/166497 |
_version_ | 1770567298863595520 |
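
Illustrative note: the abstract describes a pipeline in which a visual data table (chart values plus their colors) is placed into a zero- or few-shot prompt for the language model. The sketch below shows one way such a prompt could be assembled. It is not taken from the project: the record does not specify the table serialization or the prompt layout, so `Series`, `Example`, `visual_data_table`, `build_prompt`, and the `Table:/Question:/Answer:` template are all hypothetical names and formats chosen for illustration. The call to the language model itself is left out.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Series:
    """One chart series: legend label, plotted color, and one value per x label."""
    label: str
    color: str
    values: Sequence[float]


@dataclass
class Example:
    """One solved demonstration used as a few-shot example."""
    table: str
    question: str
    answer: str


def visual_data_table(x_labels: Sequence[str], series: List[Series]) -> str:
    """Linearize a chart as text, tagging each column with its color.

    Assumed format only; the source record does not define the serialization.
    """
    header = " | ".join(["x"] + [f"{s.label} ({s.color})" for s in series])
    rows = [
        " | ".join([x] + [str(s.values[i]) for s in series])
        for i, x in enumerate(x_labels)
    ]
    return "\n".join([header] + rows)


def build_prompt(instruction: str, shots: List[Example], table: str, question: str) -> str:
    """Join instruction, demonstrations, and the query; empty `shots` gives a zero-shot prompt."""
    blocks = [instruction]
    for shot in shots:
        blocks.append(
            f"Table:\n{shot.table}\nQuestion: {shot.question}\nAnswer: {shot.answer}"
        )
    # The final block ends with an open "Answer:" for the model to complete.
    blocks.append(f"Table:\n{table}\nQuestion: {question}\nAnswer:")
    return "\n\n".join(blocks)
```

For example, `build_prompt("Answer the question using the table.", [], visual_data_table(["2020", "2021"], [Series("Sales", "blue", [10, 12])]), "Which year has the taller blue bar?")` yields a zero-shot prompt; passing one or more `Example` objects turns it into a few-shot prompt.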