Advancing neural text generation
Current sequence-to-sequence models with attention, despite their success, are inherently limited in their ability to encode the inductive biases most appropriate for generation tasks, which has given rise to a variety of modifications to the framework to better model each task. In particular, content selection is an important aspect of summarization, where one salient problem is the models' tendency to generate the same tokens or sequences over and over.
Main Author: | Han, Simeng |
---|---|
Other Authors: | Joty Shafiq Rayhan |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2021 |
Subjects: | Engineering::Computer science and engineering |
Online Access: | https://hdl.handle.net/10356/147963 |
Institution: | Nanyang Technological University |
Language: | English |
id | sg-ntu-dr.10356-147963 |
record_format | dspace |
spelling |
sg-ntu-dr.10356-147963 2021-04-20T08:14:39Z
Advancing neural text generation
Han, Simeng; Joty Shafiq Rayhan
School of Computer Science and Engineering, srjoty@ntu.edu.sg
Engineering::Computer science and engineering
Bachelor of Engineering (Computer Science)
2021-04-20T08:14:39Z 2021-04-20T08:14:39Z 2021
Final Year Project (FYP)
Han, S. (2021). Advancing neural text generation. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/147963
en SCSE20-0102 application/pdf
Nanyang Technological University |
institution | Nanyang Technological University |
building | NTU Library |
continent | Asia |
country | Singapore |
content_provider | NTU Library |
collection | DR-NTU |
language | English |
topic | Engineering::Computer science and engineering |
spellingShingle | Engineering::Computer science and engineering Han, Simeng Advancing neural text generation |
description |
Current sequence-to-sequence models with attention, despite their success, are inherently limited in their ability to encode the inductive biases most appropriate for generation tasks, which has given rise to a variety of modifications to the framework to better model each task.
In particular, content selection is an important aspect of summarization, where one salient problem is the models' tendency to generate the same tokens or sequences over and over. Submodularity is desirable for a variety of content-selection objectives on which the current neural encoder-decoder framework falls short, yet it has so far not been explored in neural encoder-decoder systems for text generation. The greedy algorithm that approximates the solution to the submodular maximization problem is not suited to optimizing attention scores in auto-regressive generation. Therefore, instead of following the way submodular functions have typically been used, we propose a simplified yet principled solution. The resulting attention module offers an architecturally simple and empirically effective way to improve the coverage of neural text generation. We run experiments on three directed text generation tasks with different recovery rates, across two modalities, three neural model architectures, and two training-strategy variations. The results and analyses demonstrate that our method generalizes well across these settings, produces text of good quality, and outperforms state-of-the-art baselines.
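To make the notion of submodular content selection concrete, the following minimal Python sketch (an illustration only, not the attention module proposed in this project) shows a word-coverage objective with diminishing returns and the classic greedy algorithm for maximizing it under a cardinality budget; the coverage function and the toy data are assumptions made for the example.

```python
# Minimal sketch: monotone submodular coverage objective + greedy selection
# for extractive content selection. Illustrative only; NOT the attention
# module proposed in this project. Data and functions are hypothetical.

def coverage(selected_sentences, document_words):
    """f(S) = number of distinct document words covered by the selected set.
    Adding a sentence to a larger set helps at most as much as adding it to a
    smaller one (diminishing returns), so f is monotone submodular."""
    covered = set()
    for sent in selected_sentences:
        covered |= set(sent.split()) & document_words
    return len(covered)

def greedy_select(sentences, document_words, budget):
    """Repeatedly add the sentence with the largest marginal coverage gain.
    For monotone submodular f with a cardinality budget, this greedy rule
    achieves at least (1 - 1/e) of the optimal value (Nemhauser et al., 1978)."""
    selected = []
    remaining = list(sentences)
    while remaining and len(selected) < budget:
        best = max(
            remaining,
            key=lambda s: coverage(selected + [s], document_words)
                          - coverage(selected, document_words),
        )
        selected.append(best)
        remaining.remove(best)
    return selected

if __name__ == "__main__":
    doc = "the cat sat on the mat while the dog slept on the rug"
    sents = ["the cat sat on the mat",
             "the dog slept on the rug",
             "the cat sat"]
    print(greedy_select(sents, set(doc.split()), budget=2))
```

Because decoding is auto-regressive and attention weights are continuous rather than a discrete selection, such a greedy procedure cannot be applied directly to attention scores, which is the gap the simplified solution described above is meant to close.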
In this project, we also explore low-resource text generation, specifically zero-shot and few-shot text summarization. Models pretrained with self-supervised objectives on large text corpora achieve state-of-the-art performance on text summarization tasks. However, these models are typically fine-tuned on hundreds of thousands of data points, an infeasible requirement when applying summarization to new, niche domains. In this work, we introduce a general method, called WikiTransfer, for fine-tuning pretrained models for summarization in an unsupervised, dataset-specific manner that makes use of characteristics of the target dataset, such as the length and abstractiveness of the desired summaries. We achieve state-of-the-art zero-shot abstractive summarization performance on the CNN-DailyMail dataset and demonstrate the effectiveness of our approach on three additional, diverse datasets. Models fine-tuned in this unsupervised manner are more robust to noisy data and also achieve better few-shot performance with 10 and 100 training examples. We perform ablation studies on the effect of the components of our unsupervised fine-tuning data and analyze the performance of these models in few-shot scenarios, together with data augmentation techniques, using both automatic and human evaluation.
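The record does not spell out how the unsupervised fine-tuning data are constructed. As a rough sketch of the general idea described above, the hypothetical helpers below build pseudo document-summary pairs from generic articles (for example, Wikipedia), matching only the two target-dataset characteristics named in the abstract: desired summary length and abstractiveness. The function names, the leading-sentence heuristic, and the overlap filter are assumptions for illustration, not the actual WikiTransfer procedure.

```python
# Hedged sketch: building unsupervised, dataset-specific fine-tuning pairs
# from generic articles, controlled by target summary length and a crude
# abstractiveness filter. Illustrative assumptions, not the project's method.

def make_pseudo_pair(article_sentences, target_summary_len):
    """Use the first `target_summary_len` sentences as a pseudo-summary and
    the remaining sentences as the pseudo-source document."""
    pseudo_summary = article_sentences[:target_summary_len]
    pseudo_source = article_sentences[target_summary_len:]
    return " ".join(pseudo_source), " ".join(pseudo_summary)

def extractive_overlap(source, summary):
    """Crude abstractiveness proxy: fraction of summary words copied from the
    source (higher means more extractive, lower means more abstractive)."""
    src_words = set(source.split())
    sum_words = summary.split()
    return sum(w in src_words for w in sum_words) / max(len(sum_words), 1)

def build_finetuning_data(articles, target_summary_len, max_overlap):
    """Keep only pairs whose extractive overlap stays below the target
    dataset's level, so the pseudo data mimics the desired abstractiveness."""
    pairs = []
    for sents in articles:
        if len(sents) <= target_summary_len:
            continue
        src, summ = make_pseudo_pair(sents, target_summary_len)
        if extractive_overlap(src, summ) <= max_overlap:
            pairs.append({"document": src, "summary": summ})
    return pairs

if __name__ == "__main__":
    articles = [["First sentence acts as a summary.",
                 "The rest of the article becomes the source document.",
                 "It continues with more detail and paraphrased content."]]
    print(build_finetuning_data(articles, target_summary_len=1, max_overlap=0.6))
```

A pretrained sequence-to-sequence summarizer could then be fine-tuned on such pairs before zero-shot evaluation, or further tuned on the 10 or 100 labeled examples mentioned above for the few-shot setting.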
The work on zero-shot and few-shot text summarization has been accepted at the 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL). The investigation of text generation with submodularity will be submitted to the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP). |
author2 | Joty Shafiq Rayhan |
author_facet | Joty Shafiq Rayhan Han, Simeng |
format | Final Year Project |
author | Han, Simeng |
author_sort | Han, Simeng |
title | Advancing neural text generation |
title_short | Advancing neural text generation |
title_full | Advancing neural text generation |
title_fullStr | Advancing neural text generation |
title_full_unstemmed | Advancing neural text generation |
title_sort | advancing neural text generation |
publisher | Nanyang Technological University |
publishDate | 2021 |
url | https://hdl.handle.net/10356/147963 |
_version_ | 1698713738409934848 |