Teacher-student networks with multiple decoders for solving math word problem

Math word problem (MWP) is challenging due to the limitation in training data where only one “standard” solution is available. MWP models often simply fit this solution rather than truly understand or solve the problem. The generalization of models (to diverse word scenarios) is thus limited. To add...

Bibliographic Details
Main Authors: ZHANG, Jipeng, LEE, Roy Ka-Wei, LIM, Ee-peng, QIN, Wei, WANG, Lei, SHAO, Jie, SUN, Qianru
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2020
Online Access: https://ink.library.smu.edu.sg/sis_research/5320
https://ink.library.smu.edu.sg/context/sis_research/article/6324/viewcontent/15._Teacher_Student_Networks_with_Multiple_Decoders_for_Solving_Math_Word_Problem__IJCAI2020_.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-6324
record_format dspace
spelling sg-smu-ink.sis_research-6324
2021-02-19T03:22:37Z
Teacher-student networks with multiple decoders for solving math word problem
ZHANG, Jipeng
LEE, Roy Ka-Wei
LIM, Ee-peng
QIN, Wei
WANG, Lei
SHAO, Jie
SUN, Qianru
2020-01-01T08:00:00Z
text
application/pdf
https://ink.library.smu.edu.sg/sis_research/5320
info:doi/10.24963/ijcai.2020/555
https://ink.library.smu.edu.sg/context/sis_research/article/6324/viewcontent/15._Teacher_Student_Networks_with_Multiple_Decoders_for_Solving_Math_Word_Problem__IJCAI2020_.pdf
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
eng
Institutional Knowledge at Singapore Management University
Artificial Intelligence and Robotics
Databases and Information Systems
Mathematics
Numerical Analysis and Scientific Computing
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Artificial Intelligence and Robotics
Databases and Information Systems
Mathematics
Numerical Analysis and Scientific Computing
description Math word problem (MWP) is challenging due to the limitation in training data where only one “standard” solution is available. MWP models often simply fit this solution rather than truly understand or solve the problem. The generalization of models (to diverse word scenarios) is thus limited. To address this problem, this paper proposes a novel approach, TSN-MD, by leveraging the teacher network to integrate the knowledge of equivalent solution expressions and then to regularize the learning behavior of the student network. In addition, we introduce the multiple-decoder student network to generate multiple candidate solution expressions by which the final answer is voted. In experiments, we conduct extensive comparisons and ablative studies on two large-scale MWP benchmarks, and show that using TSN-MD can surpass the state-of-the-art works by a large margin. More intriguingly, the visualization results demonstrate that TSN-MD not only produces correct final answers but also generates diverse equivalent expressions of the solution.
format text
author ZHANG, Jipeng
LEE, Roy Ka-Wei
LIM, Ee-peng
QIN, Wei
WANG, Lei
SHAO, Jie
SUN, Qianru
author_sort ZHANG, Jipeng
title Teacher-student networks with multiple decoders for solving math word problem
publisher Institutional Knowledge at Singapore Management University
publishDate 2020
url https://ink.library.smu.edu.sg/sis_research/5320
https://ink.library.smu.edu.sg/context/sis_research/article/6324/viewcontent/15._Teacher_Student_Networks_with_Multiple_Decoders_for_Solving_Math_Word_Problem__IJCAI2020_.pdf
_version_ 1770575402130997248
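
Illustrative note: the description above says the multiple-decoder student network generates several candidate solution expressions and that the final answer is chosen by voting. This record does not give the voting rule, so the following is only a minimal sketch that assumes plain majority voting over the numeric values of the candidates. The function names `evaluate` and `vote`, the rounding tolerance, and the sample expressions are hypothetical and are not taken from the paper.

```python
import ast
import operator
from collections import Counter

# Arithmetic operators allowed in a candidate expression (an assumption).
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(expr):
    """Return the numeric value of an arithmetic expression, or None if invalid."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError("unsupported expression")

    try:
        return float(walk(ast.parse(expr, mode="eval")))
    except (ValueError, SyntaxError, ZeroDivisionError):
        return None


def vote(candidates, ndigits=6):
    """Majority vote over the values of the candidate expressions.

    Invalid candidates are skipped. Ties go to the value seen first.
    Returns None when no candidate can be evaluated.
    """
    values = (evaluate(c) for c in candidates)
    answers = [round(v, ndigits) for v in values if v is not None]
    if not answers:
        return None
    return Counter(answers).most_common(1)[0][0]


# Hypothetical outputs of three decoders for one problem: two equivalent
# expressions agree, and one decoder produces a different expression.
candidates = ["(3 + 5) * 2", "3 * 2 + 5 * 2", "3 + 5 * 2"]
final_answer = vote(candidates)
```

In this sketch the two equivalent expressions outvote the third candidate, which is the effect the description attributes to answer voting. A weighted vote, for example by decoder confidence, would be a different design and is not covered here.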