The study of word embedding representations in different domains
Word embedding has been a popular research topic since 2013, when Mikolov and his colleagues proposed several new algorithms. These algorithms, which were adapted from existing machine learning architectures, allow a machine to learn the meaning behind words in an unsupervised manner. These propo...
Main Author: | Seng, Jeremy Jie Min |
---|---|
Other Authors: | Chng Eng Siong |
Format: | Final Year Project |
Language: | English |
Published: | 2016 |
Subjects: | DRNTU::Engineering |
Online Access: | http://hdl.handle.net/10356/69145 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-69145 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-69145 2023-03-03T20:57:47Z The study of word embedding representations in different domains Seng, Jeremy Jie Min Chng Eng Siong School of Computer Engineering DRNTU::Engineering Word embedding has been a popular research topic since 2013, when Mikolov and his colleagues proposed several new algorithms. These algorithms, which were adapted from existing machine learning architectures, allow a machine to learn the meaning behind words in an unsupervised manner. The proposed algorithms can determine how close two words are in a vector space by measuring the cosine similarity between their vectors. However, more work is needed to determine whether these methods can be extended to capture the context of a sentence or a paragraph using these cosine distances. Because the proposed algorithms require a large collection of text, referred to in this report as a corpus, the author investigates whether a corpus built from Wikipedia articles can show the closeness of two words in different contexts. Bachelor of Engineering (Computer Science) 2016-11-11T06:40:25Z 2016-11-11T06:40:25Z 2016 Final Year Project (FYP) http://hdl.handle.net/10356/69145 en Nanyang Technological University 49 p. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
DRNTU::Engineering |
description |
Word embedding has been a popular research topic since 2013, when Mikolov and his colleagues proposed several new algorithms. These algorithms, which were adapted from existing machine learning architectures, allow a machine to learn the meaning behind words in an unsupervised manner.
The proposed algorithms can determine how close two words are in a vector space by measuring the cosine similarity between their vectors. However, more work is needed to determine whether these methods can be extended to capture the context of a sentence or a paragraph using these cosine distances.
Because the proposed algorithms require a large collection of text, referred to in this report as a corpus, the author investigates whether a corpus built from Wikipedia articles can show the closeness of two words in different contexts. |
author2 |
Chng Eng Siong |
author_facet |
Chng Eng Siong Seng, Jeremy Jie Min |
format |
Final Year Project |
author |
Seng, Jeremy Jie Min |
title |
The study of word embedding representations in different domains |
publishDate |
2016 |
url |
http://hdl.handle.net/10356/69145 |
_version_ |
1759854267264401408 |