On modeling sense relatedness in multi-prototype word embedding

To enhance the expressive power of distributional word representation learning models, many researchers induce word senses through clustering and learn multiple embedding vectors for each word, an approach known as the multi-prototype word embedding model. However, most related work ignores the relatedness...

Bibliographic Details
Main Authors:	CAO, Yixin; LI, Juanzi; SHI, Jiaxin; LIU, Zhiyuan; LI, Chengjiang
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2017
Subjects:	Databases and Information Systems; Graphics and Human Computer Interfaces
Online Access:	https://ink.library.smu.edu.sg/sis_research/7469 https://ink.library.smu.edu.sg/context/sis_research/article/8472/viewcontent/I17_1024.pdf

id	sg-smu-ink.sis_research-8472
record_format	dspace
spelling	sg-smu-ink.sis_research-8472 2022-10-20T07:07:10Z 2017-12-01T08:00:00Z text application/pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore
content_provider	SMU Libraries
collection	InK@SMU
description	To enhance the expressive power of distributional word representation learning models, many researchers induce word senses through clustering and learn multiple embedding vectors for each word, an approach known as the multi-prototype word embedding model. However, most related work ignores the relatedness among word senses, which plays an important role. In this paper, we propose a novel approach to capturing word sense relatedness in the multi-prototype word embedding model. In particular, we differentiate the original sense and the extended senses of a word by introducing their global occurrence information, and we model their relatedness through local textual context information. Based on the idea of fuzzy clustering, we introduce a random process to integrate these two types of senses and design two non-parametric methods for word sense induction. To make our model more scalable and efficient, we use an online joint learning framework extended from the Skip-gram model. The experimental results demonstrate that our model outperforms both conventional single-prototype embedding models and other multi-prototype embedding models, and achieves more stable performance when trained on smaller data.
