Teacher-student networks with multiple decoders for solving math word problem
Math word problem (MWP) is challenging due to the limitation in training data where only one “standard” solution is available. MWP models often simply fit this solution rather than truly understand or solve the problem. The generalization of models (to diverse word scenarios) is thus limited. To add...
Main Authors: | ZHANG, Jipeng; LEE, Roy Ka-Wei; LIM, Ee-peng; QIN, Wei; WANG, Lei; SHAO, Jie; SUN, Qianru |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2020 |
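Subjects: | Artificial Intelligence and Robotics; Databases and Information Systems; Mathematics; Numerical Analysis and Scientific Computing |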
Online Access: | https://ink.library.smu.edu.sg/sis_research/5320 https://ink.library.smu.edu.sg/context/sis_research/article/6324/viewcontent/15._Teacher_Student_Networks_with_Multiple_Decoders_for_Solving_Math_Word_Problem__IJCAI2020_.pdf |
Institution: | Singapore Management University |
id |
sg-smu-ink.sis_research-6324 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-6324; 2021-02-19T03:22:37Z; Teacher-student networks with multiple decoders for solving math word problem; ZHANG, Jipeng; LEE, Roy Ka-Wei; LIM, Ee-peng; QIN, Wei; WANG, Lei; SHAO, Jie; SUN, Qianru; Math word problem (MWP) is challenging due to the limitation in training data where only one “standard” solution is available. MWP models often simply fit this solution rather than truly understand or solve the problem. The generalization of models (to diverse word scenarios) is thus limited. To address this problem, this paper proposes a novel approach, TSN-MD, by leveraging the teacher network to integrate the knowledge of equivalent solution expressions and then to regularize the learning behavior of the student network. In addition, we introduce the multiple-decoder student network to generate multiple candidate solution expressions by which the final answer is voted. In experiments, we conduct extensive comparisons and ablative studies on two large-scale MWP benchmarks, and show that using TSN-MD can surpass the state-of-the-art works by a large margin. More intriguingly, the visualization results demonstrate that TSN-MD not only produces correct final answers but also generates diverse equivalent expressions of the solution.; 2020-01-01T08:00:00Z; text; application/pdf; https://ink.library.smu.edu.sg/sis_research/5320; info:doi/10.24963/ijcai.2020/555; https://ink.library.smu.edu.sg/context/sis_research/article/6324/viewcontent/15._Teacher_Student_Networks_with_Multiple_Decoders_for_Solving_Math_Word_Problem__IJCAI2020_.pdf; http://creativecommons.org/licenses/by-nc-nd/4.0/; Research Collection School Of Computing and Information Systems; eng; Institutional Knowledge at Singapore Management University; Artificial Intelligence and Robotics; Databases and Information Systems; Mathematics; Numerical Analysis and Scientific Computing |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
Artificial Intelligence and Robotics; Databases and Information Systems; Mathematics; Numerical Analysis and Scientific Computing |
description |
Math word problem (MWP) is challenging due to the limitation in training data where only one “standard” solution is available. MWP models often simply fit this solution rather than truly understand or solve the problem. The generalization of models (to diverse word scenarios) is thus limited. To address this problem, this paper proposes a novel approach, TSN-MD, by leveraging the teacher network to integrate the knowledge of equivalent solution expressions and then to regularize the learning behavior of the student network. In addition, we introduce the multiple-decoder student network to generate multiple candidate solution expressions by which the final answer is voted. In experiments, we conduct extensive comparisons and ablative studies on two large-scale MWP benchmarks, and show that using TSN-MD can surpass the state-of-the-art works by a large margin. More intriguingly, the visualization results demonstrate that TSN-MD not only produces correct final answers but also generates diverse equivalent expressions of the solution. |
format |
text |
author |
ZHANG, Jipeng; LEE, Roy Ka-Wei; LIM, Ee-peng; QIN, Wei; WANG, Lei; SHAO, Jie; SUN, Qianru |
author_sort |
ZHANG, Jipeng |
title |
Teacher-student networks with multiple decoders for solving math word problem |
publisher |
Institutional Knowledge at Singapore Management University |
publishDate |
2020 |
url |
https://ink.library.smu.edu.sg/sis_research/5320 https://ink.library.smu.edu.sg/context/sis_research/article/6324/viewcontent/15._Teacher_Student_Networks_with_Multiple_Decoders_for_Solving_Math_Word_Problem__IJCAI2020_.pdf |