Learning and evaluating Chinese idiom embeddings

We study the task of learning and evaluating Chinese idiom embeddings. We first construct a new evaluation dataset that contains idiom synonyms and antonyms. Observing that existing Chinese word embedding methods may not be suitable for learning idiom embeddings, we further present a BERT-based meth...

Bibliographic Details
Main Authors: TAN, Minghuan, JIANG, Jing
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2021
Subjects: Artificial Intelligence and Robotics; Programming Languages and Compilers
Online Access: https://ink.library.smu.edu.sg/sis_research/6723
https://ink.library.smu.edu.sg/context/sis_research/article/7726/viewcontent/2021.ranlp_main.155.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-7726
record_format dspace
spelling sg-smu-ink.sis_research-7726
2022-01-27T11:12:43Z
Learning and evaluating Chinese idiom embeddings
TAN, Minghuan
JIANG, Jing
2021-09-01T07:00:00Z
text
application/pdf
https://ink.library.smu.edu.sg/sis_research/6723
info:doi/10.26615/978-954-452-072-4_155
https://ink.library.smu.edu.sg/context/sis_research/article/7726/viewcontent/2021.ranlp_main.155.pdf
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
eng
Institutional Knowledge at Singapore Management University
Artificial Intelligence and Robotics
Programming Languages and Compilers
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Artificial Intelligence and Robotics
Programming Languages and Compilers
description We study the task of learning and evaluating Chinese idiom embeddings. We first construct a new evaluation dataset that contains idiom synonyms and antonyms. Observing that existing Chinese word embedding methods may not be suitable for learning idiom embeddings, we further present a BERT-based method that directly learns embedding vectors for individual idioms. We empirically compare representative existing methods and our method. We find that our method substantially outperforms existing methods on the evaluation dataset we have constructed.
format text
author TAN, Minghuan
JIANG, Jing
title Learning and evaluating Chinese idiom embeddings
publisher Institutional Knowledge at Singapore Management University
publishDate 2021
url https://ink.library.smu.edu.sg/sis_research/6723
https://ink.library.smu.edu.sg/context/sis_research/article/7726/viewcontent/2021.ranlp_main.155.pdf
_version_ 1770576054406086656