Automatically generating a sentiment lexicon for the Malay language

This paper aims to propose an automated sentiment lexicon generation model specifically designed for the Malay language. Lexicon-based Sentiment Analysis (SA) models make use of a sentiment lexicon for SA tasks, which is a linguistic resource that comprises a priori information about the sentiment...

Bibliographic Details
Main Authors: Mohammad Darwich, Shahrul Azman Mohd Noah, Nazlia Omar
Format: Article
Language: English
Published: Penerbit Universiti Kebangsaan Malaysia 2016
Online Access: http://journalarticle.ukm.my/10056/1/11736-37831-1-PB.pdf
http://journalarticle.ukm.my/10056/
http://ejournals.ukm.my/apjitm/issue/view/709
Institution: Universiti Kebangsaan Malaysia
id my-ukm.journal.10056
record_format eprints
spelling my-ukm.journal.10056 2017-01-25T09:53:35Z http://journalarticle.ukm.my/10056/ Automatically generating a sentiment lexicon for the Malay language. Mohammad Darwich, Shahrul Azman Mohd Noah, Nazlia Omar. Penerbit Universiti Kebangsaan Malaysia 2016-06 Article PeerReviewed application/pdf en http://journalarticle.ukm.my/10056/1/11736-37831-1-PB.pdf Mohammad Darwich, Shahrul Azman Mohd Noah and Nazlia Omar (2016) Automatically generating a sentiment lexicon for the Malay language. Asia-Pacific Journal of Information Technology and Multimedia, 5 (1). pp. 49-59. ISSN 2289-2192 http://ejournals.ukm.my/apjitm/issue/view/709
building Perpustakaan Tun Sri Lanang Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Kebangsaan Malaysia
content_source UKM Journal Article Repository
url_provider http://journalarticle.ukm.my/
description This paper proposes an automated sentiment lexicon generation model designed specifically for the Malay language. Lexicon-based Sentiment Analysis (SA) models rely on a sentiment lexicon, a linguistic resource that holds a priori information about the sentiment properties of words. Such a lexicon is indispensable for SA tasks, as is evident from the large volume of research on sentiment lexicon generation algorithms. Low-resource languages such as Malay have seen little research in this area, which motivates the sentiment lexicon generation algorithm proposed here. WordNet Bahasa was first mapped onto the English WordNet to construct a multilingual word network. A seed set of prototypical positive and negative terms was then automatically expanded by recursively adding terms linked via WordNet’s synonymy and antonymy semantic relations. The underlying intuition is that the sentiment properties of terms added via these relations are preserved. A supervised classifier was then employed for the word-polarity tagging task, using textual representations of the expanded seed set as features. Evaluation against the General Inquirer lexicon as a benchmark shows that the model performs with reasonable accuracy. The paper aims to provide a foundation for further research on Malay in this area.
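The seed-expansion step named in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation. The toy `SYNONYMS` and `ANTONYMS` tables, the function name `expand_seeds`, and the `max_depth` cut-off are hypothetical and stand in for relations that would be read from the WordNet Bahasa to English WordNet mapping. The sketch also assumes that a synonym inherits a term's polarity while an antonym receives the opposite one, a common convention that the record itself does not spell out.

```python
from collections import deque

# Hypothetical toy relations for illustration only; they are not taken
# from the paper or from WordNet Bahasa.
SYNONYMS = {
    "baik": {"bagus", "elok"},
    "bagus": {"baik", "hebat"},
    "buruk": {"jahat", "teruk"},
}
ANTONYMS = {
    "baik": {"buruk"},
    "hebat": {"teruk"},
}


def _symmetric(relation):
    """Return a copy of the relation with every edge stored in both directions."""
    out = {}
    for word, others in relation.items():
        for other in others:
            out.setdefault(word, set()).add(other)
            out.setdefault(other, set()).add(word)
    return out


def expand_seeds(positive, negative, synonyms, antonyms, max_depth=3):
    """Breadth-first expansion of a seed set over synonym and antonym links.

    Returns a dict mapping each reached term to +1 (positive) or -1 (negative).
    Assumption: synonyms keep the polarity, antonyms flip it.
    """
    syn = _symmetric(synonyms)
    ant = _symmetric(antonyms)

    polarity = {word: +1 for word in positive}
    polarity.update({word: -1 for word in negative})

    queue = deque((word, 0) for word in polarity)
    while queue:
        word, depth = queue.popleft()
        if depth >= max_depth:
            continue  # do not expand beyond the chosen depth
        label = polarity[word]
        neighbours = [(other, label) for other in syn.get(word, ())]
        neighbours += [(other, -label) for other in ant.get(word, ())]
        for other, new_label in neighbours:
            if other not in polarity:  # the first label assigned is kept
                polarity[other] = new_label
                queue.append((other, depth + 1))
    return polarity


lexicon = expand_seeds({"baik"}, {"buruk"}, SYNONYMS, ANTONYMS)
for term in sorted(lexicon):
    print(term, lexicon[term])
```

In the pipeline the description outlines, the expanded set would then supply training material for the supervised word-polarity classifier; that stage is not sketched here because the record gives no details beyond "textual representations" as features.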