Hierarchical document representation for summarization

Extractive summarization models typically employ a hierarchical encoder for document summarization. However, these models use only document-level information to classify and select sentences, which may not be the most effective approach. In addition, most state-of-the-art (SOTA) models rely on a huge number of parameters to learn from large amounts of data, making their computational costs very high. In this project, Hierarchical Weight Sharing Transformers for Summarization (HIWESTSUM) is proposed for document summarization. HIWESTSUM is lightweight, with a parameter size more than 10 times smaller than existing models that fine-tune BERT for summarization, and it is faster than SOTA models, with shorter training and inference times. It learns effectively from both sentence-level and document-level representations through weight-sharing mechanisms. By adopting weight-sharing and hierarchical learning strategies, this project shows that HIWESTSUM can reduce the computational resources needed for summarization and achieve results comparable to SOTA models when trained on smaller datasets.
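
The central mechanism the abstract describes, reusing one transformer's weights for both the sentence-level and the document-level encoder, can be illustrated with a short sketch. This is a minimal illustration assuming a PyTorch implementation; the class name `SharedHierarchicalEncoder`, the mean-pooled sentence vectors, and all dimensions are hypothetical stand-ins, not the thesis's actual design.

```python
# Minimal sketch of hierarchical encoding with weight sharing (illustrative
# only; names, pooling choice, and sizes are assumptions, not the thesis's
# actual implementation).
import torch
import torch.nn as nn

class SharedHierarchicalEncoder(nn.Module):
    def __init__(self, vocab_size=30522, d_model=256, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        # One transformer stack is reused at BOTH levels, so the parameter
        # count stays close to that of a single-level encoder.
        self.shared_encoder = nn.TransformerEncoder(layer, n_layers)
        self.scorer = nn.Linear(d_model, 1)  # extractive selection score

    def forward(self, token_ids):
        # token_ids: (batch, n_sents, n_tokens); padding masks omitted for brevity.
        b, s, t = token_ids.shape
        # Sentence level: encode the tokens of each sentence, then mean-pool
        # them into one vector per sentence.
        tok = self.embed(token_ids.view(b * s, t))                # (b*s, t, d)
        sent_vecs = self.shared_encoder(tok).mean(dim=1)          # (b*s, d)
        # Document level: run the sequence of sentence vectors through the
        # SAME encoder, so both levels share weights.
        doc_ctx = self.shared_encoder(sent_vecs.view(b, s, -1))   # (b, s, d)
        # Score each sentence for inclusion in the extractive summary.
        return self.scorer(doc_ctx).squeeze(-1)                   # (b, s)

if __name__ == "__main__":
    model = SharedHierarchicalEncoder()
    ids = torch.randint(0, 30522, (2, 8, 32))  # 2 docs, 8 sentences, 32 tokens
    print(model(ids).shape)                    # torch.Size([2, 8])
```

Because the document-level pass consumes sentence vectors of the same width as token embeddings, the two levels can share one stack; that reuse is what keeps the parameter budget small relative to fine-tuning a full BERT encoder per level.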


Bibliographic Details
Main Author: Tey, Rui Jie
Other Authors: Lihui Chen
Format: Final Year Project
Language: English
Published: Nanyang Technological University, 2022
Subjects: Engineering::Computer science and engineering::Computing methodologies::Document and text processing; Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Online Access:https://hdl.handle.net/10356/157571
Institution: Nanyang Technological University

Record Details
Record ID: sg-ntu-dr.10356-157571
School: School of Electrical and Electronic Engineering
Contact: ELHCHEN@ntu.edu.sg
Degree: Bachelor of Engineering (Information Engineering and Media)
Project Code: A3043-211
Date Deposited: 2022-05-20
Citation: Tey, R. J. (2022). Hierarchical document representation for summarization. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/157571
Content Provider: NTU Library
Collection: DR-NTU