Extracting event knowledge from pretrained language models
Format: Final Year Project
Language: English
Published: Nanyang Technological University, 2023
Online Access: https://hdl.handle.net/10356/166081
Institution: Nanyang Technological University
Summary: The advent of large-scale Pretrained Language Models (PLMs) in the field of Natural Language Processing (NLP) has allowed the field to reach new frontiers in language generation. This paper explores script knowledge probing in three PLMs: FLAN-T5, OPT, and GPT-3 (specifically, davinci). We prompt the three models with prompts generated from the WikiHow dataset and measure each model's accuracy on three sub-tasks, namely inclusive sub-event selection, sub-event temporal ordering, and starting sub-event selection, which, when combined, can generate a complete script. We conclude that FLAN-T5 and GPT-3 perform better on all three sub-tasks.
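The record does not reproduce the thesis's actual prompt wording, so the following is a minimal, hypothetical sketch of how a two-shot prompt for the sub-event temporal ordering sub-task could be assembled from WikiHow-style goals and steps. The template, field names, and example scripts are illustrative assumptions, not the authors' format.

```python
# Hypothetical two-shot prompt construction for sub-event temporal ordering.
# The wording of the template is an assumption for illustration only.

def make_example(goal: str, step_a: str, step_b: str, answer: str) -> str:
    """Format one ordering question: which of two sub-events comes first?"""
    return (
        f"Goal: {goal}\n"
        f"Which step comes first?\n"
        f"A) {step_a}\n"
        f"B) {step_b}\n"
        f"Answer: {answer}"
    )

def make_two_shot_prompt(demos: list[tuple[str, str, str, str]],
                         question: tuple[str, str, str]) -> str:
    """Concatenate two labelled demonstrations with an unlabelled question."""
    shots = [make_example(*demo) for demo in demos]
    goal, step_a, step_b = question
    # Leave the answer blank so the model completes it after "Answer:".
    query = make_example(goal, step_a, step_b, answer="").rstrip()
    return "\n\n".join(shots + [query])

prompt = make_two_shot_prompt(
    demos=[
        ("Bake a cake", "Preheat the oven", "Frost the cake", "A"),
        ("Plant a tree", "Water the sapling", "Dig a hole", "B"),
    ],
    question=("Wash a car", "Rinse off the soap", "Apply soap to the body"),
)
print(prompt)  # this string would then be sent to FLAN-T5, OPT, or GPT-3
```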
We also investigate the linguistic features of the demonstrations and the question used in two-shot prompts, to determine whether certain features contribute to higher accuracy. We find that word-level differences (measured with GloVe embeddings) and sentence-level similarities (measured with the Universal Sentence Encoder) between the demonstrations and the question can help the models predict the correct label more often.
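As a rough illustration of the two feature families analysed, here is a minimal sketch, assuming the word-level feature is computed from averaged GloVe word vectors and the sentence-level feature from Universal Sentence Encoder embeddings. The model names are real, but the exact feature definitions in the thesis may differ, and the example sentences are hypothetical.

```python
# Sketch of demonstration-vs-question similarity features.
# Assumes gensim (for GloVe) and tensorflow_hub (for USE) are installed.
import numpy as np
import gensim.downloader
import tensorflow_hub as hub

glove = gensim.downloader.load("glove-wiki-gigaword-100")  # pretrained GloVe
use = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def glove_similarity(sent_a: str, sent_b: str) -> float:
    """Word-level feature: cosine similarity of mean GloVe word vectors."""
    def mean_vec(sent: str) -> np.ndarray:
        vecs = [glove[w] for w in sent.lower().split() if w in glove]
        # Fall back to a zero vector if no token is in the GloVe vocabulary.
        return np.mean(vecs, axis=0) if vecs else np.zeros(glove.vector_size)
    return cosine(mean_vec(sent_a), mean_vec(sent_b))

def use_similarity(sent_a: str, sent_b: str) -> float:
    """Sentence-level feature: cosine similarity of USE sentence embeddings."""
    emb = use([sent_a, sent_b]).numpy()
    return cosine(emb[0], emb[1])

demo = "Goal: Bake a cake. Which comes first: preheat the oven or frost the cake?"
question = "Goal: Wash a car. Which comes first: rinse the soap or apply the soap?"
print(glove_similarity(demo, question), use_similarity(demo, question))
```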