PromptChart: prompting InstructGPT for zero & few-shot chart question answering and summarization

Charts are commonly used in data analysis to summarize key insights and answer complex queries. However, comprehending charts through tasks such as chart question answering (CQA) and chart summarization (CS) can be cognitively intensive. Existing state-of-the-art models rely on fine-tuning strategie...

Bibliographic Details
Main Author: Do, Xuan Long
Other Authors: Joty Shafiq Rayhan
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2023
Subjects: Science::Mathematics; Engineering::Computer science and engineering
Online Access: https://hdl.handle.net/10356/166497
Institution: Nanyang Technological University
id sg-ntu-dr.10356-166497
record_format dspace
spelling sg-ntu-dr.10356-1664972023-05-08T15:39:20Z PromptChart: prompting InstructGPT for zero & few-shot chart question answering and summarization Do, Xuan Long Joty Shafiq Rayhan Patrick Pun Chi Seng School of Physical and Mathematical Sciences cspun@ntu.edu.sg, srjoty@ntu.edu.sg Science::Mathematics Engineering::Computer science and engineering Charts are commonly used in data analysis to summarize key insights and answer complex queries. However, comprehending charts through tasks such as chart question answering (CQA) and chart summarization (CS) can be cognitively intensive. Existing state-of-the-art models rely on fine-tuning strategies and struggle with tasks that involve logical and arithmetic reasoning. Meanwhile, large language models (LLMs) such as InstructGPT have demonstrated strong capabilities to solve target tasks by conditioning on few-shot prompts. Nonetheless, exploring how chart comprehension tasks could benefit from LLMs has received limited attention, especially in the few-shot setting. To bridge this gap, we present PromptChart, the first multimodal framework for few-shot CQA and CS. It consists of three modules: (i) InstructGPT for generating the textual responses, (ii) Visual Data Table Generator (VDTG) aiming to output our proposed visual data tables which integrate data tables and visual colors from charts, and (iii) Prompt Constructor (PC) focusing on creating few-shot prompts. To construct the prompts, we analyze the chart benchmarks carefully, and define a set of attributes for each task to guide the prompt constructions. Compared with existing CQA and CS baselines, PromptChart achieves state-of-the-art performance with significant improvements in automatic metrics and human preferences. Bachelor of Science in Mathematical and Computer Sciences 2023-05-02T02:24:01Z 2023-05-02T02:24:01Z 2023 Final Year Project (FYP) Do, X. L. (2023). PromptChart: prompting InstructGPT for zero & few-shot chart question answering and summarization. 
Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/166497 https://hdl.handle.net/10356/166497 en application/pdf Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Science::Mathematics
Engineering::Computer science and engineering
description Charts are commonly used in data analysis to summarize key insights and answer complex queries. However, comprehending charts through tasks such as chart question answering (CQA) and chart summarization (CS) can be cognitively intensive. Existing state-of-the-art models rely on fine-tuning strategies and struggle with tasks that involve logical and arithmetic reasoning. Meanwhile, large language models (LLMs) such as InstructGPT have demonstrated strong capabilities to solve target tasks by conditioning on few-shot prompts. Nonetheless, how chart comprehension tasks could benefit from LLMs has received limited attention, especially in the few-shot setting. To bridge this gap, we present PromptChart, the first multimodal framework for few-shot CQA and CS. It consists of three modules: (i) InstructGPT, which generates the textual responses; (ii) a Visual Data Table Generator (VDTG), which outputs our proposed visual data tables, integrating data tables with the visual colors from charts; and (iii) a Prompt Constructor (PC), which creates the few-shot prompts. To construct the prompts, we analyze the chart benchmarks carefully and define a set of attributes for each task to guide prompt construction. Compared with existing CQA and CS baselines, PromptChart achieves state-of-the-art performance with significant improvements in automatic metrics and human preferences.
author2 Joty Shafiq Rayhan
author_facet Joty Shafiq Rayhan
Do, Xuan Long
format Final Year Project
author Do, Xuan Long
author_sort Do, Xuan Long
title PromptChart: prompting InstructGPT for zero & few-shot chart question answering and summarization
publisher Nanyang Technological University
publishDate 2023
url https://hdl.handle.net/10356/166497