Phân cụm từ Tiếng Việt và nhận diện từ trái nghĩa (Vietnamese Word Clustering and Antonym Identification)

Automatically constructing word clusters from word similarity has many important applications in Natural Language Processing (NLP) tasks, such as dictionary construction, statistical machine translation, named-entity recognition, functional labeling, and word segmentation. In recent years, it is...

Bibliographic Details
Main Author:	Nguyễn, Kim Anh
Format:	Theses and Dissertations
Language:	other
Published:	Đại học Quốc gia Hà Nội 2016
Subjects:	Computer science; Natural language processing; Word clusters; Antonyms
Online Access:	http://repository.vnu.edu.vn/handle/VNU_123/8258
Institution:	Vietnam National University, Hanoi

id	oai:112.137.131.14:VNU_123-8258
record_format	dspace
spelling	oai:112.137.131.14:VNU_123-8258 2016-04-13T20:02:04Z 2016-04-13T07:19:56Z 2016-04-13T07:19:56Z 2013 Thesis 5 tr. http://repository.vnu.edu.vn/handle/VNU_123/8258 other application/pdf Đại học Quốc gia Hà Nội
institution	Vietnam National University, Hanoi
building	VNU Library & Information Center
country	Vietnam
collection	VNU Digital Repository
language	other
topic	Computer science; Natural language processing; Word clusters; Antonyms
description	Automatically constructing word clusters from word similarity has many important applications in Natural Language Processing (NLP) tasks, such as dictionary construction, statistical machine translation, named-entity recognition, functional labeling, and word segmentation. In recent years, word clustering has been widely studied for languages such as English, German, and Chinese; for Vietnamese, however, the task is more recent. In this thesis, I use a large unlabeled Vietnamese corpus of about 15 million words, equivalent to approximately 700 thousand sentences. This unlabeled data was extracted from the newspapers Lao dong, PC World, and Tuoi tre and then part-of-speech tagged. I investigated several approaches to constructing word clusters in Vietnamese, focusing mainly on two methods, by Brown and by Dekang Lin. I use the same Vietnamese corpus and the same evaluation tool for both methods so that I can compare and evaluate their effects on certain NLP tasks. In addition, I use a statistical method to propose 20 antonym frames that can be used to identify antonym classes within clusters.
format	Theses and Dissertations
author	Nguyễn, Kim Anh
title	Phân cụm từ Tiếng Việt và nhận diện từ trái nghĩa (Vietnamese Word Clustering and Antonym Identification)
publisher	Đại học Quốc gia Hà Nội
publishDate	2016
url	http://repository.vnu.edu.vn/handle/VNU_123/8258
_version_	1680964432382918656
