EXTRACTION, CATEGORIZATION, AND ASPECT BASED OPINION SUMMARIZATION

Bibliographic Details
Main Author: Maharani, Warih
Format: Dissertations
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/37145
Institution: Institut Teknologi Bandung
Description
Summary: Customer reviews are increasingly available online for a wide range of products. However, it is almost impossible for consumers or potential customers to read all of the reviews. Opinion summarization therefore plays an important role in analyzing reviews and condensing them into a summary of opinions. This dissertation develops an aspect-based opinion summarization system that can provide an informative summary. The system performs four tasks: 1) extraction of aspect expressions and sentiments; 2) sentiment classification; 3) aspect categorization; and 4) summary generation. The dissertation makes three major contributions. The first is the Clue Propagation method for extracting aspect expressions and sentiments. The second is semantic-based aspect categorization using distributed semantic models. The third is a set of aspect-based opinion summarization methods, which include: a) an aspect-based rating method; b) an aspect-based rating visualization; and c) criteria for rating systems, offered as an alternative reference for assessing a rating system.
Clue Propagation is a modification of Double Propagation, a propagation-based aspect extraction method that obtains sentiments and aspect expressions using dependency relation rules. Although the Double Propagation method is effective, it only extracts noun-and-adjective relations. Propagation that starts only from adjective seeds can be interrupted, so that several aspect expressions cannot be extracted. The number of aspect expressions that fail to be extracted can be reduced by using multiple clues at the propagation stage: adjective clues, verb expression clues, entity expression clues, and adverb clues.
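To illustrate why propagation seeded only with adjectives can stall, the following minimal sketch runs a toy propagation loop over hand-written dependency triples. It is not the dissertation's rule set: the function name `propagate`, the three rules, the relation labels (`amod`, `nsubj`, `conj`, borrowed from common dependency-parser conventions), and the example triples are all assumptions made for illustration.

```python
def propagate(triples, seed_clues):
    """Grow the aspect and clue sets until no toy rule adds anything new.

    triples: iterable of (head, relation, dependent) tuples (hypothetical data).
    seed_clues: words known in advance to signal an opinion.
    """
    clues, aspects = set(seed_clues), set()
    while True:
        before = (len(clues), len(aspects))
        for head, rel, dep in triples:
            if rel == "amod":        # adjective (dep) modifies noun (head)
                if dep in clues:
                    aspects.add(head)
                if head in aspects:
                    clues.add(dep)
            elif rel == "nsubj":     # noun (dep) is the subject of verb (head)
                if head in clues:
                    aspects.add(dep)
                if dep in aspects:
                    clues.add(head)
            elif rel == "conj":      # coordination spreads set membership
                if head in clues:
                    clues.add(dep)
                if dep in clues:
                    clues.add(head)
                if head in aspects:
                    aspects.add(dep)
                if dep in aspects:
                    aspects.add(head)
        if (len(clues), len(aspects)) == before:
            return aspects, clues


# Hypothetical parse of: "great screen", "screen and keyboard", "battery drains"
TRIPLES = [
    ("screen", "amod", "great"),
    ("screen", "conj", "keyboard"),
    ("drains", "nsubj", "battery"),
]

adjective_only, _ = propagate(TRIPLES, {"great"})
with_verb_clue, _ = propagate(TRIPLES, {"great", "drains"})
```

With the adjective seed alone, the toy loop never reaches "battery", because nothing links it to a known adjective. Adding a verb clue to the seeds lets the same loop extract it through the noun-and-verb relation.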
Dependency relations between aspects and sentiments allow aspects to be identified from adjective clues, verb expression clues, and adverb clues, while sentiments can be identified from entity expression clues. The experiments show that, with these four clues and the dependency relation rules, the Clue Propagation method can extract not only noun-and-adjective relations but also noun-and-verb relations. This reduces the potential for broken propagation found in the Double Propagation method.
In the second contribution, this dissertation develops a method of aspect categorization based on semantic similarity. Semantic similarity derived from a distributed semantic model (word2vec) relies on word co-occurrence in a corpus, and so suffers from out-of-vocabulary (OOV) problems. This research uses WordNet semantic similarity to overcome these problems, linearly combining the distributed semantic model approach and the WordNet-based approach for aspect categorization.
In the third contribution, this dissertation develops an aspect-based opinion summarization system. Users sometimes want to know the opinions about a particular combination of aspects of a product. This dissertation develops aspect-based product rating calculation methods, so that users can obtain an opinion summary for the desired combination of aspects, along with an informative rating visualization. In addition, this research proposes evaluation criteria that can be used to assess a rating system. The consistency and reliability tests show that the rating method and its visualization have a high correlation value compared to the baseline method. The survey also indicated that the proposed rating visualization can be used as an alternative for aspect-based rating visualization.
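The linear combination of distributional and WordNet-based similarity mentioned in the summary could be sketched roughly as follows. The record does not give the weighting or the handling of OOV words, so the weight `alpha`, the fallback to the lexical score when a word has no vector, and the names `combined_similarity` and `categorize` are all assumptions. The `lexical_sim` callable is a stand-in for any WordNet-based measure, and the vectors are supplied as a plain dictionary rather than a trained word2vec model.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def combined_similarity(w1, w2, vectors, lexical_sim, alpha=0.5):
    """alpha * distributional similarity + (1 - alpha) * lexical similarity.

    Assumption: if either word is out of vocabulary in `vectors`,
    the lexical score is used on its own.
    """
    lex = lexical_sim(w1, w2)
    if w1 not in vectors or w2 not in vectors:
        return lex
    return alpha * cosine(vectors[w1], vectors[w2]) + (1 - alpha) * lex


def categorize(aspect, categories, vectors, lexical_sim, alpha=0.5):
    """Assign the aspect to the category word it is most similar to."""
    return max(
        categories,
        key=lambda c: combined_similarity(aspect, c, vectors, lexical_sim, alpha),
    )


# Hypothetical toy data: "lcd" has no vector, so only the lexical score applies.
VECTORS = {"screen": [1.0, 0.0], "display": [1.0, 0.0], "price": [0.0, 1.0]}
LEXICAL = {("screen", "display"): 0.8, ("lcd", "display"): 0.9}


def toy_lexical_sim(a, b):
    """Look up a symmetric toy score; unknown pairs score 0.0."""
    return LEXICAL.get((a, b), LEXICAL.get((b, a), 0.0))
```

For example, `categorize("lcd", ["display", "price"], VECTORS, toy_lexical_sim)` picks "display" even though "lcd" has no vector, which is the kind of gap the WordNet component is meant to cover.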