Extracting event knowledge from pretrained language models
Format: Final Year Project
Language: English
Published: Nanyang Technological University, 2023
Online Access: https://hdl.handle.net/10356/166081
Institution: Nanyang Technological University
Summary: The advent of large-scale Pretrained Language Models (PLMs) in the field of Natural Language Processing (NLP) has allowed the field to reach new frontiers in language generation. This paper explores script knowledge probing in three PLMs: FLAN-T5, OPT, and GPT-3 (specifically, davinci). We prompt the three models with prompts generated from the WikiHow dataset and measure each model's accuracy on three sub-tasks, namely inclusive sub-event selection, sub-event temporal ordering, and starting sub-event selection, which, when combined, can generate a complete script. We conclude that FLAN-T5 and GPT-3 perform better on all three sub-tasks.
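The record does not reproduce the thesis's actual prompt wording, so the following is a minimal, hypothetical sketch of how a two-shot prompt for the sub-event temporal ordering sub-task could be assembled from WikiHow-style goals and steps. The template, field names, and example scripts are illustrative assumptions, not the authors' format.

```python
# Hypothetical two-shot prompt construction for sub-event temporal ordering.
# The wording of the template is an assumption for illustration only.

def make_example(goal: str, step_a: str, step_b: str, answer: str) -> str:
    """Format one ordering question: which of two sub-events comes first?"""
    return (
        f"Goal: {goal}\n"
        f"Which step comes first?\n"
        f"A) {step_a}\n"
        f"B) {step_b}\n"
        f"Answer: {answer}"
    )

def make_two_shot_prompt(demos: list[tuple[str, str, str, str]],
                         question: tuple[str, str, str]) -> str:
    """Concatenate two labelled demonstrations with an unlabelled question."""
    shots = [make_example(*demo) for demo in demos]
    goal, step_a, step_b = question
    # Leave the answer blank so the model completes it after "Answer:".
    query = make_example(goal, step_a, step_b, answer="").rstrip()
    return "\n\n".join(shots + [query])

prompt = make_two_shot_prompt(
    demos=[
        ("Bake a cake", "Preheat the oven", "Frost the cake", "A"),
        ("Plant a tree", "Water the sapling", "Dig a hole", "B"),
    ],
    question=("Wash a car", "Rinse off the soap", "Apply soap to the body"),
)
print(prompt)  # this string would then be sent to FLAN-T5, OPT, or GPT-3
```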
We also investigate the linguistic features of the demonstrations and the question used in two-shot prompts, to determine whether certain features contribute to higher accuracy. We find that word-level differences (measured with GloVe embeddings) and sentence-level similarities (measured with the Universal Sentence Encoder) between the demonstrations and the question can help the models predict the correct label more often.
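As a rough illustration of the two feature families analysed, here is a minimal sketch, assuming the word-level feature is computed from averaged GloVe word vectors and the sentence-level feature from Universal Sentence Encoder embeddings. The model names are real, but the exact feature definitions in the thesis may differ, and the example sentences are hypothetical.

```python
# Sketch of demonstration-vs-question similarity features.
# Assumes gensim (for GloVe) and tensorflow_hub (for USE) are installed.
import numpy as np
import gensim.downloader
import tensorflow_hub as hub

glove = gensim.downloader.load("glove-wiki-gigaword-100")  # pretrained GloVe
use = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def glove_similarity(sent_a: str, sent_b: str) -> float:
    """Word-level feature: cosine similarity of mean GloVe word vectors."""
    def mean_vec(sent: str) -> np.ndarray:
        vecs = [glove[w] for w in sent.lower().split() if w in glove]
        # Fall back to a zero vector if no token is in the GloVe vocabulary.
        return np.mean(vecs, axis=0) if vecs else np.zeros(glove.vector_size)
    return cosine(mean_vec(sent_a), mean_vec(sent_b))

def use_similarity(sent_a: str, sent_b: str) -> float:
    """Sentence-level feature: cosine similarity of USE sentence embeddings."""
    emb = use([sent_a, sent_b]).numpy()
    return cosine(emb[0], emb[1])

demo = "Goal: Bake a cake. Which comes first: preheat the oven or frost the cake?"
question = "Goal: Wash a car. Which comes first: rinse the soap or apply the soap?"
print(glove_similarity(demo, question), use_similarity(demo, question))
```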