Hidden Markov Model for Sentence Compression

Bibliographic Details
Main Author: WIBISONO (NIM 23505023), YUDI
Format: Theses
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/9701
Institution: Institut Teknologi Bandung
Description
Summary: The main problem with handheld devices is their small screen size, which makes it difficult for users to find and read textual information. Document summarization can address this problem, but the summary is sometimes still too long. Sentence compression can be applied to the summary to reduce sentence length without removing important information. The sentence compression in this thesis uses a Hidden Markov Model (HMM) adapted from the statistical translation model and HMM-Hedge. The Viterbi algorithm is used to find the optimal word sequence. The experiments have two goals: first, to explore the influence of three parameters on the quality of sentence compression; second, to compare the system's performance with that of the Knight-Marcu noisy-channel model. The experiments used the Ziff-Davis corpus, which consists of 1067 pairs of sentences. The best HMM was constructed by adding numerical tags in preprocessing, using Jelinek-Mercer smoothing with a parameter value of 0.1, and using a probability weight of 0.1. This system still performs worse than the Knight-Marcu noisy-channel model.
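
To make the two named techniques concrete, here is a minimal Python sketch of Jelinek-Mercer smoothing feeding a Viterbi search. It is not the thesis's model: the two states (KEEP, DROP), the toy counts, the start and transition probabilities, and the function names `emit`, `viterbi`, and `compress` are all hypothetical. The summary gives only the value 0.1, so the convention that this weight multiplies the background term is also an assumption.

```python
import math
from collections import Counter

LAMBDA = 0.1  # assumed to weight the background term; the convention is not in the summary

# Hypothetical toy data: (word, state) pairs, NOT taken from the Ziff-Davis corpus.
TAGGED = ([("printer", "KEEP")] * 2 + [("works", "KEEP")] * 2 + [("new", "KEEP")]
          + [("the", "DROP")] * 3 + [("new", "DROP")] * 2)
STATES = ["KEEP", "DROP"]
START = {"KEEP": 0.5, "DROP": 0.5}                     # hypothetical
TRANS = {"KEEP": {"KEEP": 0.7, "DROP": 0.3},           # hypothetical
         "DROP": {"KEEP": 0.4, "DROP": 0.6}}

pair_counts = Counter(TAGGED)
state_counts = Counter(state for _, state in TAGGED)
word_counts = Counter(word for word, _ in TAGGED)
total_words = len(TAGGED)


def emit(state, word):
    """Jelinek-Mercer estimate of P(word | state).

    Interpolates the state-specific relative frequency with the
    state-independent (background) relative frequency.
    """
    p_state = pair_counts[(word, state)] / state_counts[state]
    p_background = word_counts[word] / total_words
    return (1 - LAMBDA) * p_state + LAMBDA * p_background


def viterbi(words):
    """Return (best state path, its log-probability) for a non-empty word list.

    Every word must occur in the toy vocabulary; otherwise emit() is 0
    and math.log raises ValueError.
    """
    best = [{s: math.log(START[s]) + math.log(emit(s, words[0])) for s in STATES}]
    back = []
    for word in words[1:]:
        scores, pointers = {}, {}
        for s in STATES:
            # Best predecessor state for arriving in state s.
            prev = max(STATES, key=lambda r: best[-1][r] + math.log(TRANS[r][s]))
            scores[s] = best[-1][prev] + math.log(TRANS[prev][s]) + math.log(emit(s, word))
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    last = max(STATES, key=lambda s: best[-1][s])
    path = [last]
    for pointers in reversed(back):  # follow back-pointers from the end
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[-1][last]


def compress(words):
    """Keep only the words whose most likely state is KEEP."""
    path, _ = viterbi(words)
    return [w for w, s in zip(words, path) if s == "KEEP"]
```

With these toy numbers, `compress("the new printer works".split())` returns `['printer', 'works']`. Smoothing matters here because "the" never occurs with KEEP in the toy counts: without the background term its emission probability would be zero and the log-space search would fail.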