Abstractive summarization framework based on pre-training and contrastive learning
Format: | Thesis-Master by Coursework |
---|---|
Language: | English |
Published: | Nanyang Technological University, 2023 |
Online Access: | https://hdl.handle.net/10356/165534 |
Institution: | Nanyang Technological University |
Summary: | Abstractive summarization aims to generate sentences that cover the key information of a document. In this dissertation, we verify the effectiveness of a generation-evaluation model trained with contrastive learning, which first generates a set of candidate summaries and then evaluates the candidates to select the best one. Conventional methods adopt pre-trained models by default as the backbone of the summary evaluation model. However, which pre-training task improves the performance of pre-trained models on the downstream summary evaluation task remains an open question. We conduct a study on the Inverse Cloze Task (ICT) to answer this question. For the backbone of the evaluation model, we compare the results of different pre-trained models. We further adopt ICT as an additional pre-training task, pre-train the model with it, and use the resulting model as the backbone of the evaluation model. We also verify and analyze how the masking rate in ICT affects the downstream evaluation task. Experiments on XSum and CNN/Daily Mail show that the model with additional ICT pre-training outperforms other pre-training baselines. |
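
The record does not spell out the training objective of the evaluation model, so the following is only a minimal sketch of one common contrastive formulation for candidate-summary ranking: candidates are ordered by a reference metric such as ROUGE, and the evaluator is trained with a margin ranking loss so that better candidates score higher. The function names (`contrastive_ranking_loss`, `select_best`) and the use of cosine similarity over document and candidate embeddings are illustrative assumptions, not details taken from the dissertation.

```python
import torch
import torch.nn.functional as F

def contrastive_ranking_loss(doc_emb, cand_embs, margin=0.01):
    """Margin ranking loss over candidate summaries.

    doc_emb:   (hidden,)            embedding of the source document
    cand_embs: (num_cands, hidden)  embeddings of candidate summaries,
                                    pre-sorted from best to worst by a
                                    reference metric such as ROUGE
    A higher-ranked candidate should receive a higher cosine
    similarity to the document than any lower-ranked one.
    """
    scores = F.cosine_similarity(doc_emb.unsqueeze(0), cand_embs, dim=-1)
    loss = torch.zeros((), device=scores.device)
    n = scores.size(0)
    for i in range(n):
        for j in range(i + 1, n):
            # candidate i is ranked above candidate j, so its score
            # should exceed candidate j's by a rank-dependent margin
            loss = loss + F.relu(scores[j] - scores[i] + (j - i) * margin)
    return loss

def select_best(doc_emb, cand_embs):
    # At inference time the evaluator picks the candidate whose
    # embedding is most similar to the document embedding.
    scores = F.cosine_similarity(doc_emb.unsqueeze(0), cand_embs, dim=-1)
    return int(scores.argmax())
```

Selecting the highest-scoring candidate in this way matches the generate-then-evaluate pipeline described in the summary, regardless of the exact loss used during training.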
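The Inverse Cloze Task pre-trains an encoder by treating a sentence as a pseudo-query and the rest of its passage as the matching context. How the dissertation defines the masking rate is not stated in this record; the sketch below assumes it is the probability that the selected sentence is removed from its passage, and the `make_ict_example` helper and toy passage are hypothetical.

```python
import random

def make_ict_example(sentences, mask_rate=0.9, rng=random):
    """Build one Inverse Cloze Task (ICT) training pair.

    sentences: list of sentences forming one passage
    mask_rate: assumed probability that the selected sentence is
               removed ("masked") from the passage; keeping it
               occasionally also exposes simple lexical overlap.

    Returns (query, context): the encoder is trained to match the
    query sentence to its original context among in-batch negatives.
    """
    idx = rng.randrange(len(sentences))
    query = sentences[idx]
    if rng.random() < mask_rate:
        context = sentences[:idx] + sentences[idx + 1:]
    else:
        context = list(sentences)
    return query, " ".join(context)

# Example usage on a toy passage.
passage = [
    "The committee approved the new budget.",
    "Spending on public transport will rise by ten percent.",
    "Opposition members criticised the plan as too costly.",
]
query, context = make_ict_example(passage, mask_rate=0.9)
```

Under this reading, a higher masking rate forces the encoder to rely on semantic rather than lexical matching, which is one plausible reason the rate would affect the downstream evaluation task studied in the dissertation.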