Hidden Markov Model for Sentence Compression
The main problem in using handheld devices is the small screen size, which makes it difficult for users to find and obtain textual information. Document summarization can solve this problem, but sometimes the summary is still too long. Sentence compression can be applied to that summary to reduce sentence le...
Main Author: | WIBISONO (NIM 23505023), YUDI |
---|---|
Format: | Theses |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/9701 |
Institution: | Institut Teknologi Bandung |
id |
id-itb.:9701 |
---|---|
spelling |
id-itb.:9701 2017-09-27T15:37:07Z Hidden Markov Model for Sentence Compression WIBISONO (NIM 23505023), YUDI Indonesia Theses INSTITUT TEKNOLOGI BANDUNG https://digilib.itb.ac.id/gdl/view/9701 text |
institution |
Institut Teknologi Bandung |
building |
Institut Teknologi Bandung Library |
continent |
Asia |
country |
Indonesia |
content_provider |
Institut Teknologi Bandung |
collection |
Digital ITB |
language |
Indonesia |
description |
The main problem in using handheld devices is the small screen size, which makes it difficult for users to find and obtain textual information. Document summarization can solve this problem, but sometimes the summary is still too long. Sentence compression can be applied to that summary to reduce sentence length without removing important information. Sentence compression in this thesis uses a Hidden Markov Model (HMM) adapted from the statistical translation model and HMM-Hedge. The Viterbi algorithm is used to find the optimal word sequence. The experiments have two goals: first, to explore the influence of three parameters on the quality of sentence compression; second, to compare the system's performance with that of the Knight-Marcu noisy-channel model. The experiments used the Ziff-Davis corpus, which consists of 1,067 sentence pairs. The best HMM is constructed by adding numerical tags in preprocessing, using Jelinek-Mercer smoothing with a smoothing parameter of 0.1, and using a probability weight of 0.1. This system still performs worse than the Knight-Marcu noisy-channel model. |
format |
Theses |
author |
WIBISONO (NIM 23505023), YUDI |
title |
Hidden Markov Model for Sentence Compression |
url |
https://digilib.itb.ac.id/gdl/view/9701 |
_version_ |
1820664771581575168 |
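
The abstract names two standard components, Jelinek-Mercer smoothing and Viterbi decoding, without giving formulas. The sketch below is a minimal Python illustration of both, not the model built in the thesis. The two-state keep/drop HMM, the word classes `content` and `filler`, and every probability are made-up toy values, and applying the smoothing parameter `lam` to the unigram term is an assumed convention.

```python
import math
from collections import Counter


def jelinek_mercer(tokens, lam=0.1):
    """Return P(word | prev) as a mix of bigram and unigram ML estimates.

    Assumed convention: lam weights the unigram (back-off) term.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    histories = Counter(tokens[:-1])  # how often each word has a successor
    total = len(tokens)

    def prob(prev, word):
        p_uni = unigrams[word] / total
        if histories[prev] == 0:
            return p_uni  # unseen history: fall back to the unigram estimate
        p_bi = bigrams[(prev, word)] / histories[prev]
        return (1 - lam) * p_bi + lam * p_uni

    return prob


def viterbi(observations, states, start, trans, emit):
    """Return the most probable state sequence, computed in log space."""
    first = observations[0]
    best = [{s: math.log(start[s]) + math.log(emit[s][first]) for s in states}]
    back = []
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            # Best predecessor for state s at this position.
            prev = max(states, key=lambda p: best[-1][p] + math.log(trans[p][s]))
            scores[s] = best[-1][prev] + math.log(trans[prev][s]) + math.log(emit[s][obs])
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final state.
    state = max(states, key=lambda s: best[-1][s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Hypothetical toy model: each word is tagged "keep" or "drop".
STATES = ("keep", "drop")
START = {"keep": 0.8, "drop": 0.2}
TRANS = {"keep": {"keep": 0.7, "drop": 0.3}, "drop": {"keep": 0.6, "drop": 0.4}}
EMIT = {"keep": {"content": 0.9, "filler": 0.1}, "drop": {"content": 0.2, "filler": 0.8}}

words = ["the", "printer", "really", "works"]
kinds = ["filler", "content", "filler", "content"]  # hand-assigned word classes

path = viterbi(kinds, STATES, START, TRANS, EMIT)
compressed = " ".join(w for w, s in zip(words, path) if s == "keep")
print(path, compressed)  # ['drop', 'keep', 'drop', 'keep'] printer works
```

With `lam=0.1`, as reported for the best configuration, the smoothed estimate stays close to the bigram estimate while still giving non-zero probability to word pairs that never occur in the training data.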