VistaNet: Visual Aspect Attention Network for multimodal sentiment analysis

Detecting the sentiment expressed by a document is a key task for many applications, e.g., modeling user preferences, monitoring consumer behaviors, assessing product quality. Traditionally, the sentiment analysis task primarily relies on textual content. Fueled by the rise of mobile phones that are...

Bibliographic Details
Main Authors: TRUONG, Quoc Tuan, LAUW, Hady Wirawan
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2019
Subjects:
Online Access:https://ink.library.smu.edu.sg/sis_research/4700
https://ink.library.smu.edu.sg/context/sis_research/article/5703/viewcontent/aaai19a.pdf
Institution: Singapore Management University
Language: English
id sg-smu-ink.sis_research-5703
record_format dspace
spelling sg-smu-ink.sis_research-5703 2020-01-09T07:12:46Z
VistaNet: Visual Aspect Attention Network for multimodal sentiment analysis
TRUONG, Quoc Tuan; LAUW, Hady Wirawan
Detecting the sentiment expressed by a document is a key task for many applications, e.g., modeling user preferences, monitoring consumer behaviors, assessing product quality. Traditionally, the sentiment analysis task primarily relies on textual content. Fueled by the rise of mobile phones that are often the only cameras on hand, documents on the Web (e.g., reviews, blog posts, tweets) are increasingly multimodal in nature, with photos in addition to textual content. A question arises whether the visual component could be useful for sentiment analysis as well. In this work, we propose Visual Aspect Attention Network or VistaNet, leveraging both textual and visual components. We observe that in many cases, with respect to sentiment detection, images play a supporting role to text, highlighting the salient aspects of an entity, rather than expressing sentiments independently of the text. Therefore, instead of using visual information as features, VistaNet relies on visual information as alignment for pointing out the important sentences of a document using attention. Experiments on restaurant reviews showcase the effectiveness of visual aspect attention, vis-a-vis visual features or textual attention.
2019-02-01T08:00:00Z
text
application/pdf
https://ink.library.smu.edu.sg/sis_research/4700
info:doi/10.1609/aaai.v33i01.3301305
https://ink.library.smu.edu.sg/context/sis_research/article/5703/viewcontent/aaai19a.pdf
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
eng
Institutional Knowledge at Singapore Management University
sentiment analysis; multimodal; attention network; Databases and Information Systems; Numerical Analysis and Scientific Computing
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic sentiment analysis
multimodal
attention network
Databases and Information Systems
Numerical Analysis and Scientific Computing
spellingShingle sentiment analysis
multimodal
attention network
Databases and Information Systems
Numerical Analysis and Scientific Computing
TRUONG, Quoc Tuan
LAUW, Hady Wirawan
VistaNet: Visual Aspect Attention Network for multimodal sentiment analysis
description Detecting the sentiment expressed by a document is a key task for many applications, e.g., modeling user preferences, monitoring consumer behaviors, assessing product quality. Traditionally, the sentiment analysis task primarily relies on textual content. Fueled by the rise of mobile phones that are often the only cameras on hand, documents on the Web (e.g., reviews, blog posts, tweets) are increasingly multimodal in nature, with photos in addition to textual content. A question arises whether the visual component could be useful for sentiment analysis as well. In this work, we propose Visual Aspect Attention Network or VistaNet, leveraging both textual and visual components. We observe that in many cases, with respect to sentiment detection, images play a supporting role to text, highlighting the salient aspects of an entity, rather than expressing sentiments independently of the text. Therefore, instead of using visual information as features, VistaNet relies on visual information as alignment for pointing out the important sentences of a document using attention. Experiments on restaurant reviews showcase the effectiveness of visual aspect attention, vis-a-vis visual features or textual attention.
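The description above characterizes the core idea: images serve as attention queries that point to the salient sentences of a review, rather than being concatenated as extra input features. The following is a minimal illustrative sketch of that idea, not the authors' code; the function names, dimensions, and the additive soft-attention form are assumptions made for illustration only.

# Sketch of visual-aspect attention: image features weight the sentences of a
# review, and the weighted sentence vectors form the document representation.
# All names and shapes below are illustrative assumptions, not the paper's code.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def visual_aspect_attention(sentence_vecs, image_vecs, W_s, W_v, v):
    """Weight sentences by their relevance to each image, then pool.

    sentence_vecs: (n, d_s) sentence representations (e.g., from a text encoder)
    image_vecs:    (m, d_v) image representations (e.g., CNN features)
    W_s, W_v, v:   parameters of the additive attention scorer
    Returns a single document vector of size d_s.
    """
    # Score every (image, sentence) pair with an additive-attention form.
    proj = np.tanh(sentence_vecs @ W_s.T + (image_vecs @ W_v.T)[:, None, :])  # (m, n, d_a)
    scores = proj @ v                       # (m, n)
    alpha = softmax(scores, axis=1)         # attention over sentences, per image
    per_image_docs = alpha @ sentence_vecs  # (m, d_s)
    # Average the image-specific document vectors (uniform pooling for simplicity).
    return per_image_docs.mean(axis=0)

# Toy usage: 5 sentences of size 64, 3 images of size 128, attention size 32.
rng = np.random.default_rng(0)
S = rng.standard_normal((5, 64))
V = rng.standard_normal((3, 128))
doc = visual_aspect_attention(S, V,
                              W_s=rng.standard_normal((32, 64)),
                              W_v=rng.standard_normal((32, 128)),
                              v=rng.standard_normal(32))
print(doc.shape)  # (64,) -- in the full model this feeds a sentiment classifier

In this sketch the images never contribute features to the final representation directly; they only determine how much each sentence contributes, which mirrors the alignment role of visual information described in the abstract.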
format text
author TRUONG, Quoc Tuan
LAUW, Hady Wirawan
author_facet TRUONG, Quoc Tuan
LAUW, Hady Wirawan
author_sort TRUONG, Quoc Tuan
title VistaNet: Visual Aspect Attention Network for multimodal sentiment analysis
title_short VistaNet: Visual Aspect Attention Network for multimodal sentiment analysis
title_full VistaNet: Visual Aspect Attention Network for multimodal sentiment analysis
title_fullStr VistaNet: Visual Aspect Attention Network for multimodal sentiment analysis
title_full_unstemmed VistaNet: Visual Aspect Attention Network for multimodal sentiment analysis
title_sort vistanet: visual aspect attention network for multimodal sentiment analysis
publisher Institutional Knowledge at Singapore Management University
publishDate 2019
url https://ink.library.smu.edu.sg/sis_research/4700
https://ink.library.smu.edu.sg/context/sis_research/article/5703/viewcontent/aaai19a.pdf
_version_ 1770574983139950592