T-SciQ: Teaching multimodal Chain-of-Thought reasoning via large language model signals for science question answering

Large Language Models (LLMs) have recently demonstrated exceptional performance in various Natural Language Processing (NLP) tasks. They have also shown the ability to perform chain-of-thought (CoT) reasoning to solve complex problems. Recent studies have explored CoT reasoning in complex multimodal...

Bibliographic Details
Main Authors: WANG, Lei, HU, Yi, HE, Jiabang, XU, Xing, LIU, Ning, LIU, Hui, SHEN, Heng Tao
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2024
Subjects:
Online Access: https://ink.library.smu.edu.sg/sis_research/8756
https://ink.library.smu.edu.sg/context/sis_research/article/9759/viewcontent/29884_Article_Text_33938_1_2_20240324_pvoa.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-9759
record_format dspace
spelling sg-smu-ink.sis_research-9759 2024-05-03T06:48:31Z
2024-03-01T08:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/8756 info:doi/10.1609/aaai.v38i17.29884 https://ink.library.smu.edu.sg/context/sis_research/article/9759/viewcontent/29884_Article_Text_33938_1_2_20240324_pvoa.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Complex problems High quality Language model Language processing Model signals Multi-modal Multimodal chains Natural languages Artificial Intelligence and Robotics Databases and Information Systems Numerical Analysis and Scientific Computing
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Complex problems
High quality
Language model
Language processing
Model signals
Multi-modal
Multimodal chains
Natural languages
Artificial Intelligence and Robotics
Databases and Information Systems
Numerical Analysis and Scientific Computing
description Large Language Models (LLMs) have recently demonstrated exceptional performance in various Natural Language Processing (NLP) tasks. They have also shown the ability to perform chain-of-thought (CoT) reasoning to solve complex problems. Recent studies have explored CoT reasoning in complex multimodal scenarios, such as the science question answering task, by fine-tuning multimodal models with high-quality human-annotated CoT rationales. However, collecting high-quality CoT rationales is usually time-consuming and costly. Moreover, the annotated rationales are often inaccurate because essential external information is missing. To address these issues, we propose a novel method termed T-SciQ that aims at teaching science question answering with LLM signals. The T-SciQ approach generates high-quality CoT rationales as teaching signals and uses them to train much smaller models to perform CoT reasoning in complex modalities. Additionally, we introduce a novel data mixing strategy to produce more effective teaching data samples for simple and complex science question answering problems. Extensive experimental results show that our T-SciQ method achieves new state-of-the-art performance on the ScienceQA benchmark, with an accuracy of 96.18%. Moreover, our approach outperforms the most powerful fine-tuned baseline by 4.5%. The code is publicly available at https://github.com/T-SciQ/T-SciQ.
format text
author WANG, Lei
HU, Yi
HE, Jiabang
XU, Xing
LIU, Ning
LIU, Hui
SHEN, Heng Tao
author_sort WANG, Lei
title T-SciQ: Teaching multimodal Chain-of-Thought reasoning via large language model signals for science question answering
publisher Institutional Knowledge at Singapore Management University
publishDate 2024
url https://ink.library.smu.edu.sg/sis_research/8756
https://ink.library.smu.edu.sg/context/sis_research/article/9759/viewcontent/29884_Article_Text_33938_1_2_20240324_pvoa.pdf
_version_ 1814047520471384064