Searching for the X-Factor: Exploring corpus subjectivity for word embeddings

We explore the notion of subjectivity, and hypothesize that word embeddings learnt from input corpora of varying levels of subjectivity behave differently on natural language processing tasks such as classifying a sentence by sentiment, subjectivity, or topic. Through systematic comparative analyses...

Full description

Saved in:
Bibliographic Details
Main Authors: TKACHENKO, Maksim, CHIA, Chong Cher, LAUW, Hady W.
Format: text
Language:English
Published: Institutional Knowledge at Singapore Management University 2018
Subjects:
Online Access:https://ink.library.smu.edu.sg/sis_research/4229
https://ink.library.smu.edu.sg/context/sis_research/article/5232/viewcontent/acl2018.pdf
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Singapore Management University
Language: English
Description
Summary:We explore the notion of subjectivity, and hypothesize that word embeddings learnt from input corpora of varying levels of subjectivity behave differently on natural language processing tasks such as classifying a sentence by sentiment, subjectivity, or topic. Through systematic comparative analyses, we establish this to be the case indeed. Moreover, based on the discovery of the outsized role that sentiment words play on subjectivity-sensitive tasks such as sentiment classification, we develop a novel word embedding SentiVec which is infused with sentiment information from a lexical resource, and is shown to outperform baselines on such tasks.