Annotated corpus of Mesopotamian-Iraqi dialect for sentiment analysis in social media

Research on sentiment analysis of social media in the Mesopotamian-Iraqi Dialect (MID) of Arabic is rare: there is no reliable dataset in MID, nor an annotated corpus for sentiment analysis of social media in this dialect. Therefore, this gap was the main stumbli...

Bibliographic Details
Main Authors:	Askar, A. K. A. J., Sjarif, N. N. A.
Format:	Article
Language:	English
Published:	Science and Information Organization 2021
Subjects:	T Technology (General)
Online Access:	http://eprints.utm.my/id/eprint/94324/1/NilamNurAmirSjarif2021_AnnotatedCorpusofMesopotamianIraqiDialect.pdf http://eprints.utm.my/id/eprint/94324/ http://dx.doi.org/10.14569/IJACSA.2021.0120413
Institution:	Universiti Teknologi Malaysia

id	my.utm.94324
record_format	eprints
spelling	my.utm.94324 2022-03-31T14:45:14Z http://eprints.utm.my/id/eprint/94324/ Annotated corpus of Mesopotamian-Iraqi dialect for sentiment analysis in social media Askar, A. K. A. J. Sjarif, N. N. A. T Technology (General) Science and Information Organization 2021 Article PeerReviewed application/pdf en http://eprints.utm.my/id/eprint/94324/1/NilamNurAmirSjarif2021_AnnotatedCorpusofMesopotamianIraqiDialect.pdf Askar, A. K. A. J. and Sjarif, N. N. A. (2021) Annotated corpus of Mesopotamian-Iraqi dialect for sentiment analysis in social media. International Journal of Advanced Computer Science and Applications, 12 (4). ISSN 2158-107X http://dx.doi.org/10.14569/IJACSA.2021.0120413 DOI: 10.14569/IJACSA.2021.0120413
institution	Universiti Teknologi Malaysia
building	UTM Library
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	Universiti Teknologi Malaysia
content_source	UTM Institutional Repository
url_provider	http://eprints.utm.my/
language	English
topic	T Technology (General)
description	Research on sentiment analysis of social media in the Mesopotamian-Iraqi Dialect (MID) of Arabic is rare: there is no reliable dataset in MID, nor an annotated corpus for sentiment analysis of social media in this dialect. This gap has been the main stumbling block for researchers of sentiment analysis in MID. This paper therefore introduces an annotated corpus of the Mesopotamian-Iraqi Dialect for sentiment analysis in social media, named ACMID (Annotated Corpus of Mesopotamian-Iraqi Dialect), to support future studies. To the best of our knowledge, this is the first annotated corpus in MID that covers both polarity and emotion classification. Facebook, the most popular social platform among Iraqis, was used as the data source: 5000 comments were extracted from popular Iraqi pages and classified by polarity (Positive, Negative, Neutral, Spam) by two Iraqi annotators. The same annotators also classified each comment according to Ekman's seven universal emotions (Anger, Fear, Disgust, Happiness, Sadness, Surprise, Contempt) or as having no emotion. Cohen's kappa coefficient was then used to compare the two annotators' results and assess their reliability. Agreement between the two annotators reached 0.82 for polarity classification and 0.65 for emotion classification.
format	Article
author	Askar, A. K. A. J. Sjarif, N. N. A.
author_sort	Askar, A. K. A. J.
title	Annotated corpus of Mesopotamian-Iraqi dialect for sentiment analysis in social media
title_sort	annotated corpus of mesopotamian-iraqi dialect for sentiment analysis in social media
publisher	Science and Information Organization
publishDate	2021
url	http://eprints.utm.my/id/eprint/94324/1/NilamNurAmirSjarif2021_AnnotatedCorpusofMesopotamianIraqiDialect.pdf http://eprints.utm.my/id/eprint/94324/ http://dx.doi.org/10.14569/IJACSA.2021.0120413
_version_	1729703156034568192
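
The description reports inter-annotator agreement as Cohen's kappa, which corrects the observed agreement p_o for the agreement p_e expected by chance: kappa = (p_o - p_e) / (1 - p_e). The following is a minimal sketch of that calculation for two annotators, not the authors' code. The function name `cohen_kappa` and the toy labels are hypothetical and are not taken from ACMID.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: sum over labels of the product of the two
    # annotators' marginal label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical toy labels for illustration only (not ACMID data).
annotator_1 = ["Positive", "Negative", "Neutral", "Spam", "Positive", "Negative"]
annotator_2 = ["Positive", "Negative", "Neutral", "Spam", "Negative", "Negative"]
kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

In this toy example the annotators agree on 5 of 6 items (p_o = 5/6) and chance agreement is p_e = 10/36, so kappa = 10/13, about 0.769. The same function applies unchanged to the emotion labels, since it only compares label strings.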


Similar Items