Hidden Markov Model for Sentence Compression

The main problem with using a handheld device is its small screen, which makes it difficult for users to find and read textual information. Document summarization can address this problem, but the summary is sometimes still too long. Sentence compression can be applied to the summary to reduce sentence le...

Bibliographic Details
Main Author: WIBISONO (NIM 23505023), YUDI
Format: Theses
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/9701
Institution: Institut Teknologi Bandung
id id-itb.:9701
institution Institut Teknologi Bandung
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
description The main problem with using a handheld device is its small screen, which makes it difficult for users to find and read textual information. Document summarization can address this problem, but the summary is sometimes still too long. Sentence compression can be applied to the summary to reduce sentence length without removing important information. The sentence compression in this thesis uses a Hidden Markov Model (HMM) adapted from the statistical translation model and HMM-Hedge. The Viterbi algorithm is used to find the optimal word sequence. The experiments have two goals: first, to explore the influence of three parameters on the quality of sentence compression; second, to compare the system's performance with that of the Knight-Marcu noisy-channel model. The experiments used the Ziff-Davis corpus, which consists of 1067 sentence pairs. The best HMM is obtained by adding numerical tags in preprocessing and using Jelinek-Mercer smoothing with a smoothing parameter of 0.1 and a probability weight of 0.1. This system still performs worse than the Knight-Marcu noisy-channel model.
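The Jelinek-Mercer smoothing mentioned in the description can be illustrated with a minimal sketch. The record does not give the thesis's exact formulation, so everything here is an assumption made for illustration: a bigram model interpolated with a unigram model, the parameter (called `lam` here) weighting the unigram term, and the hypothetical function name `jelinek_mercer_bigram` with its toy token list.

```python
from collections import Counter


def jelinek_mercer_bigram(tokens, lam=0.1):
    """Build p(word | prev) by linear interpolation of bigram and unigram estimates.

    Assumption: lam weights the lower-order (unigram) model. The thesis may
    define its parameter the other way round.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    # Count each token only where it has a successor, so the bigram
    # estimates for a given context sum to one.
    contexts = Counter(tokens[:-1])
    total = len(tokens)

    def prob(prev, word):
        p_uni = unigrams[word] / total
        p_bi = bigrams[(prev, word)] / contexts[prev] if contexts[prev] else 0.0
        return (1.0 - lam) * p_bi + lam * p_uni

    return prob


# Toy data, not taken from the Ziff-Davis corpus.
tokens = "the cat sat on the mat".split()
prob = jelinek_mercer_bigram(tokens, lam=0.1)
print(prob("the", "cat"))  # seen bigram
print(prob("the", "sat"))  # unseen bigram, still non-zero thanks to the unigram term
```

The point of the interpolation is that a bigram never seen in training still receives some probability mass as long as its second word was seen, which keeps a Viterbi search over word sequences from being blocked by zero-probability transitions.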