Extraction of Vietnamese collocation from text corpora
Collocations are widely used in linguistics, dictionary compilation, and natural language processing. Extracting collocations for each language is therefore necessary to improve the accuracy and naturalness of natural langu...
Main Author: | Đỗ, Thị Ngọc Quỳnh |
---|---|
Format: | Theses and Dissertations |
Language: | other |
Published: | Đại học Quốc gia Hà Nội, 2016 |
Subjects: | Language processing; Data processing; Natural language; Artificial intelligence |
Online Access: | http://repository.vnu.edu.vn/handle/VNU_123/8281 |
Institution: | Vietnam National University, Hanoi |

id | oai:112.137.131.14:VNU_123-8281 |
---|---|
record_format | dspace |
spelling | oai:112.137.131.14:VNU_123-8281; 2016-04-13T20:02:47Z; Extraction of Vietnamese collocation from text corpora; Đỗ, Thị Ngọc Quỳnh; Language processing; Data processing; Natural language; Artificial intelligence; 2016-04-13T07:47:50Z; 2016-04-13T07:47:50Z; 2011; Thesis; 5 pages; http://repository.vnu.edu.vn/handle/VNU_123/8281; other; application/pdf; Đại học Quốc gia Hà Nội |
institution | Vietnam National University, Hanoi |
building | VNU Library & Information Center |
country | Vietnam |
collection | VNU Digital Repository |
language | other |
topic | Language processing; Data processing; Natural language; Artificial intelligence |
spellingShingle | Language processing; Data processing; Natural language; Artificial intelligence; Đỗ, Thị Ngọc Quỳnh; Extraction of Vietnamese collocation from text corpora |
description | Collocations are widely used in linguistics, dictionary compilation, and natural language processing. Extracting collocations for each language is therefore necessary, both to improve the accuracy and naturalness of natural language processing applications and to make learning a new language easier. In Vietnam, however, the study of collocations is still a fairly new field. This paper investigates several collocation extraction methods in order to find an efficient model for extracting Vietnamese collocations. The methods are based on classic, commonly used statistical measures such as frequency, the t-test, chi-square, and mutual information. We also propose some general methods that use linguistic measures to increase extraction accuracy. The input data consists of POS-tagged text and parsed text. By running the program with individual methods and with combinations of several methods, and comparing their accuracy, we identify an efficient method for extracting Vietnamese collocations from text corpora. |
format | Theses and Dissertations |
author | Đỗ, Thị Ngọc Quỳnh |
author_facet | Đỗ, Thị Ngọc Quỳnh |
author_sort | Đỗ, Thị Ngọc Quỳnh |
title | Extraction of Vietnamese collocation from text corpora |
title_short | Extraction of Vietnamese collocation from text corpora |
title_full | Extraction of Vietnamese collocation from text corpora |
title_fullStr | Extraction of Vietnamese collocation from text corpora |
title_full_unstemmed | Extraction of Vietnamese collocation from text corpora |
title_sort | extraction of vietnamese collocation from text corpora |
publisher | Đại học Quốc gia Hà Nội |
publishDate | 2016 |
url | http://repository.vnu.edu.vn/handle/VNU_123/8281 |
_version_ | 1680965032269053952 |
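
The abstract names frequency, the t-test, chi-square, and mutual information as the statistical measures compared. As a rough illustration of how such measures can be computed for adjacent word pairs, here is a minimal Python sketch. It is not the thesis's implementation: the function name `bigram_scores`, the `min_freq` parameter, and the toy tokens are hypothetical, the input is assumed to be already word-segmented, and the linguistic filtering on POS tags and parses that the abstract mentions is not shown.

```python
import math
from collections import Counter


def bigram_scores(tokens, min_freq=1):
    """Score adjacent word pairs with four classic association measures.

    Hypothetical helper for illustration only. `tokens` is assumed to be
    a list of already word-segmented tokens.
    """
    n = len(tokens) - 1                      # number of adjacent pairs
    first = Counter(tokens[:-1])             # counts of w1 in first position
    second = Counter(tokens[1:])             # counts of w2 in second position
    pairs = Counter(zip(tokens, tokens[1:]))

    scores = {}
    for (w1, w2), o11 in pairs.items():
        if o11 < min_freq:                   # simple frequency filter
            continue
        c1, c2 = first[w1], second[w2]
        expected = c1 * c2 / n               # expected count under independence

        # t-score approximation: (observed - expected) / sqrt(observed)
        t = (o11 - expected) / math.sqrt(o11)

        # Pearson chi-square on the 2x2 contingency table
        o12 = c1 - o11                       # w1 followed by something else
        o21 = c2 - o11                       # something else followed by w2
        o22 = n - c1 - c2 + o11              # neither w1 nor w2
        denom = c1 * c2 * (n - c1) * (n - c2)
        chi2 = n * (o11 * o22 - o12 * o21) ** 2 / denom if denom else 0.0

        # pointwise mutual information, in bits
        pmi = math.log2(o11 / expected)

        scores[(w1, w2)] = {"freq": o11, "t": t, "chi2": chi2, "pmi": pmi}
    return scores


if __name__ == "__main__":
    # Toy, made-up token sequence; underscores mark multi-syllable words.
    toy = ["xử_lý", "ngôn_ngữ", "tự_nhiên", "và", "xử_lý", "ngôn_ngữ", "tự_nhiên"]
    ranked = sorted(bigram_scores(toy).items(), key=lambda kv: -kv[1]["t"])
    for pair, s in ranked:
        print(pair, s)
```

Ranking the same candidate pairs by each measure in turn, or by a combination of them, is one simple way to compare methods of the kind the abstract describes. Low-count pairs tend to inflate chi-square and mutual information, which is why a frequency threshold such as `min_freq` is commonly applied first.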