The study of word embedding representations in different domains

Word embedding has been a popular research topic since 2003 when Mikolov and his colleagues proposed a few new algorithms. These algorithms which were modified from the existing Machine Learning architectures. It allows machine to learn meaning behind words using an unsupervised manner. These propo...

وصف كامل

محفوظ في:
التفاصيل البيبلوغرافية
المؤلف الرئيسي: Seng, Jeremy Jie Min
مؤلفون آخرون: Chng Eng Siong
التنسيق: Final Year Project
اللغة:English
منشور في: 2016
الموضوعات:
الوصول للمادة أونلاين:http://hdl.handle.net/10356/69145
الوسوم: إضافة وسم
لا توجد وسوم, كن أول من يضع وسما على هذه التسجيلة!
المؤسسة: Nanyang Technological University
اللغة: English
الوصف
الملخص:Word embedding has been a popular research topic since 2003 when Mikolov and his colleagues proposed a few new algorithms. These algorithms which were modified from the existing Machine Learning architectures. It allows machine to learn meaning behind words using an unsupervised manner. These proposed algorithms were able to determine how close two words are in a vector by measuring the cosine similarity distance. However, much work can be done to determine if these proposed methods can further to determine the context of a sentence or a paragraphs using these cosine distances. As the proposed algorithms requires a large dictionary of words or commonly referred to a corpus in this report, the author wishes to find out if the corpus supplied with articles found in Wikipedia are able to show the closeness of two words in different context.