Extraction of Vietnamese collocation from text corpora

Collocations are widely used in linguistics, in dictionary compilation, and in natural language processing. Extracting the collocations of each language is therefore necessary to improve the accuracy and naturalness of natural language...

Bibliographic Details
Main Author: Đỗ, Thị Ngọc Quỳnh
Format: Theses and Dissertations
Language: other
Published: Đại học Quốc gia Hà Nội 2016
Subjects: Language processing; Data processing; Natural language; Artificial intelligence
Online Access: http://repository.vnu.edu.vn/handle/VNU_123/8281
Institution: Vietnam National University, Hanoi
id oai:112.137.131.14:VNU_123-8281
record_format dspace
spelling oai:112.137.131.14:VNU_123-8281 2016-04-13T20:02:47Z 2016-04-13T07:47:50Z 2016-04-13T07:47:50Z 2011 Thesis 5 pages http://repository.vnu.edu.vn/handle/VNU_123/8281 other application/pdf Đại học Quốc gia Hà Nội
institution Vietnam National University, Hanoi
building VNU Library & Information Center
country Vietnam
collection VNU Digital Repository
language other
topic Language processing (Xử lý ngôn ngữ)
Data processing (Xử lý dữ liệu)
Natural language (Ngôn ngữ tự nhiên)
Artificial intelligence (Trí tuệ nhân tạo)
description Collocations are widely used in linguistics, in dictionary compilation, and in natural language processing. Extracting the collocations of each language is therefore necessary, both to improve the accuracy and naturalness of natural language processing applications and to make learning a new language easier. In Vietnam, however, the study of collocations is still quite a new field. This paper investigates several collocation extraction methods in order to find an efficient model for extracting Vietnamese collocations. The methods are based on commonly used classic statistical measures such as frequency, the t-test, chi-square, and mutual information. We also propose some general methods that use linguistic measures to increase extraction accuracy. The input data consists of POS-tagged data and parsed data. By running the program with individual methods and with combinations of several methods, and comparing their accuracy, we identify an efficient method for extracting Vietnamese collocations from text corpora.
format Theses and Dissertations
author Đỗ, Thị Ngọc Quỳnh
title Extraction of Vietnamese collocation from text corpora
publisher Đại học Quốc gia Hà Nội
publishDate 2016
url http://repository.vnu.edu.vn/handle/VNU_123/8281
_version_ 1680965032269053952
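
The statistical measures named in the description (frequency, t-test, chi-square, mutual information) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `bigram_scores` and the toy token list are assumptions made for illustration, and the sketch scores only adjacent word pairs in an already word-segmented token list, whereas the thesis works on POS-tagged and parsed data.

```python
import math
from collections import Counter


def bigram_scores(tokens):
    """Score each adjacent word pair with four classic association measures.

    Hypothetical helper for illustration; `tokens` is assumed to be a list
    of already word-segmented tokens with at least two entries.
    """
    n = len(tokens) - 1                       # total number of adjacent bigrams
    left = Counter(tokens[:-1])               # how often a word starts a bigram
    right = Counter(tokens[1:])               # how often a word ends a bigram
    pairs = Counter(zip(tokens, tokens[1:]))  # observed bigram frequencies

    scores = {}
    for (w1, w2), o11 in pairs.items():
        c1, c2 = left[w1], right[w2]
        expected = c1 * c2 / n                # expected count under independence

        # t-test, using the usual approximation variance ~ observed mean
        t = (o11 - expected) / math.sqrt(o11)

        # pointwise mutual information, in bits
        pmi = math.log2(o11 / expected)

        # Pearson chi-square on the 2x2 contingency table
        o12 = c1 - o11                        # w1 followed by another word
        o21 = c2 - o11                        # another word followed by w2
        o22 = n - c1 - c2 + o11               # neither w1 first nor w2 second
        denom = c1 * c2 * (n - c1) * (n - c2)
        chi2 = n * (o11 * o22 - o12 * o21) ** 2 / denom if denom else 0.0

        scores[(w1, w2)] = {"freq": o11, "t": t, "chi2": chi2, "pmi": pmi}
    return scores


if __name__ == "__main__":
    # Toy token list, made up for the example
    toy = ["a", "b", "a", "b", "c", "d"]
    ranked = sorted(bigram_scores(toy).items(),
                    key=lambda item: item[1]["t"], reverse=True)
    for pair, measures in ranked:
        print(pair, measures)
```

Ranking by a single measure, as in the usage above, is the simplest option; the measures behave differently on rare pairs (mutual information tends to favour them, the t-test tends to favour frequent ones), which is one reason for comparing and combining them.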