T-SciQ: Teaching multimodal Chain-of-Thought reasoning via large language model signals for science question answering
Large Language Models (LLMs) have recently demonstrated exceptional performance in various Natural Language Processing (NLP) tasks. They have also shown the ability to perform chain-of-thought (CoT) reasoning to solve complex problems. Recent studies have explored CoT reasoning in complex multimodal...
Main Authors: | WANG, Lei; HU, Yi; HE, Jiabang; XU, Xing; LIU, Ning; LIU, Hui; SHEN, Heng Tao |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2024 |
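Subjects: | Complex problems; High quality; Language model; Language processing; Model signals; Multi-modal; Multimodal chains; Natural languages; Artificial Intelligence and Robotics; Databases and Information Systems; Numerical Analysis and Scientific Computing |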
Online Access: | https://ink.library.smu.edu.sg/sis_research/8756 https://ink.library.smu.edu.sg/context/sis_research/article/9759/viewcontent/29884_Article_Text_33938_1_2_20240324_pvoa.pdf |
Institution: | Singapore Management University |
id |
sg-smu-ink.sis_research-9759 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-9759 2024-05-03T06:48:31Z T-SciQ: Teaching multimodal Chain-of-Thought reasoning via large language model signals for science question answering WANG, Lei; HU, Yi; HE, Jiabang; XU, Xing; LIU, Ning; LIU, Hui; SHEN, Heng Tao Large Language Models (LLMs) have recently demonstrated exceptional performance in various Natural Language Processing (NLP) tasks. They have also shown the ability to perform chain-of-thought (CoT) reasoning to solve complex problems. Recent studies have explored CoT reasoning in complex multimodal scenarios, such as the science question answering task, by fine-tuning multimodal models with high-quality human-annotated CoT rationales. However, collecting high-quality CoT rationales is usually time-consuming and costly. In addition, the annotated rationales are often inaccurate because essential external information is missing. To address these issues, we propose a novel method termed T-SciQ that aims at teaching science question answering with LLM signals. The T-SciQ approach generates high-quality CoT rationales as teaching signals, which are then used to train much smaller models to perform CoT reasoning in complex modalities. Additionally, we introduce a novel data mixing strategy to produce more effective teaching data samples for simple and complex science question answering problems. Extensive experimental results show that our T-SciQ method achieves new state-of-the-art performance on the ScienceQA benchmark, with an accuracy of 96.18%. Moreover, our approach outperforms the most powerful fine-tuned baseline by 4.5%. The code is publicly available at https://github.com/T-SciQ/T-SciQ.
2024-03-01T08:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/8756 info:doi/10.1609/aaai.v38i17.29884 https://ink.library.smu.edu.sg/context/sis_research/article/9759/viewcontent/29884_Article_Text_33938_1_2_20240324_pvoa.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Complex problems High quality Language model Language processing Model signals Multi-modal Multimodal chains Natural languages Artificial Intelligence and Robotics Databases and Information Systems Numerical Analysis and Scientific Computing |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
Complex problems; High quality; Language model; Language processing; Model signals; Multi-modal; Multimodal chains; Natural languages; Artificial Intelligence and Robotics; Databases and Information Systems; Numerical Analysis and Scientific Computing |
spellingShingle |
Complex problems; High quality; Language model; Language processing; Model signals; Multi-modal; Multimodal chains; Natural languages; Artificial Intelligence and Robotics; Databases and Information Systems; Numerical Analysis and Scientific Computing; WANG, Lei; HU, Yi; HE, Jiabang; XU, Xing; LIU, Ning; LIU, Hui; SHEN, Heng Tao; T-SciQ: Teaching multimodal Chain-of-Thought reasoning via large language model signals for science question answering |
description |
Large Language Models (LLMs) have recently demonstrated exceptional performance in various Natural Language Processing (NLP) tasks. They have also shown the ability to perform chain-of-thought (CoT) reasoning to solve complex problems. Recent studies have explored CoT reasoning in complex multimodal scenarios, such as the science question answering task, by fine-tuning multimodal models with high-quality human-annotated CoT rationales. However, collecting high-quality CoT rationales is usually time-consuming and costly. In addition, the annotated rationales are often inaccurate because essential external information is missing. To address these issues, we propose a novel method termed T-SciQ that aims at teaching science question answering with LLM signals. The T-SciQ approach generates high-quality CoT rationales as teaching signals, which are then used to train much smaller models to perform CoT reasoning in complex modalities. Additionally, we introduce a novel data mixing strategy to produce more effective teaching data samples for simple and complex science question answering problems. Extensive experimental results show that our T-SciQ method achieves new state-of-the-art performance on the ScienceQA benchmark, with an accuracy of 96.18%. Moreover, our approach outperforms the most powerful fine-tuned baseline by 4.5%. The code is publicly available at https://github.com/T-SciQ/T-SciQ. |
format |
text |
author |
WANG, Lei; HU, Yi; HE, Jiabang; XU, Xing; LIU, Ning; LIU, Hui; SHEN, Heng Tao |
author_facet |
WANG, Lei; HU, Yi; HE, Jiabang; XU, Xing; LIU, Ning; LIU, Hui; SHEN, Heng Tao |
author_sort |
WANG, Lei |
title |
T-SciQ: Teaching multimodal Chain-of-Thought reasoning via large language model signals for science question answering |
title_short |
T-SciQ: Teaching multimodal Chain-of-Thought reasoning via large language model signals for science question answering |
title_full |
T-SciQ: Teaching multimodal Chain-of-Thought reasoning via large language model signals for science question answering |
title_fullStr |
T-SciQ: Teaching multimodal Chain-of-Thought reasoning via large language model signals for science question answering |
title_full_unstemmed |
T-SciQ: Teaching multimodal Chain-of-Thought reasoning via large language model signals for science question answering |
title_sort |
t-sciq: teaching multimodal chain-of-thought reasoning via large language model signals for science question answering |
publisher |
Institutional Knowledge at Singapore Management University |
publishDate |
2024 |
url |
https://ink.library.smu.edu.sg/sis_research/8756 https://ink.library.smu.edu.sg/context/sis_research/article/9759/viewcontent/29884_Article_Text_33938_1_2_20240324_pvoa.pdf |
_version_ |
1814047520471384064 |