Extracting event knowledge from pretrained language models
The advent of large-scale Pretrained Language Models (PLMs) in the field of Natural Language Processing (NLP) has allowed the domain to reach new frontiers in language generation. This paper explores script knowledge probing in three PLMs: FLAN-T5, OPT, and GPT-3 (specifically, davinci)...
Saved in:
Main Author: | Ong, Claudia Beth |
---|---|
Other Authors: | Li Boyang |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2023 |
Subjects: | Engineering::Computer science and engineering |
Online Access: | https://hdl.handle.net/10356/166081 |
Institution: | Nanyang Technological University |
Language: | English |
id |
sg-ntu-dr.10356-166081 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-166081 (indexed 2023-04-21T15:39:38Z). Ong, Claudia Beth; supervisor: Li Boyang (boyang.li@ntu.edu.sg), School of Computer Science and Engineering. Subject: Engineering::Computer science and engineering. Degree: Bachelor of Science in Data Science and Artificial Intelligence. Deposited 2023-04-19T08:56:39Z; issued 2023. Final Year Project (FYP). Citation: Ong, C. B. (2023). Extracting event knowledge from pretrained language models. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/166081 en SCSE22-0368 application/pdf Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Engineering::Computer science and engineering |
description |
The advent of large-scale Pretrained Language Models (PLMs) in the field of Natural Language Processing (NLP) has allowed the domain to reach new frontiers in language generation. This paper explores script knowledge probing in three PLMs: FLAN-T5, OPT, and GPT-3 (specifically, davinci). We run experiments with prompts generated from the WikiHow dataset on the three aforementioned PLMs, measuring each model's accuracy on each sub-task, namely inclusive sub-event selection, sub-event temporal ordering, and starting sub-event selection, which, when combined, can generate a complete script. We conclude that FLAN-T5 and GPT-3 perform better on all three sub-tasks. We also investigate the linguistic features of the demonstrations and question used in two-shot prompts, to determine whether certain features contribute to higher accuracy. We find that word-level differences (measured with GloVe embeddings) and sentence-level similarities (measured with the Universal Sentence Encoder) between the demonstrations and the question can help the models predict the correct label more often. |
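The abstract attributes accuracy gains to embedding similarity between the two-shot demonstrations and the question. As a rough illustration only (not code from the project), the sketch below ranks candidate demonstrations by cosine similarity to the question and assembles a two-shot prompt; the `embed` function is a deterministic toy stand-in for real GloVe or Universal Sentence Encoder embeddings, and the WikiHow-style demonstration texts are hypothetical:

```python
import zlib
import numpy as np

def embed(text: str, dim: int = 16) -> np.ndarray:
    """Toy deterministic stand-in for GloVe / Universal Sentence Encoder
    embeddings, so the sketch runs without downloading any model."""
    rng = np.random.default_rng(zlib.crc32(text.encode("utf-8")))
    v = rng.normal(size=dim)
    return v / np.linalg.norm(v)

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity; the vectors from embed() are unit-normalised."""
    return float(np.dot(a, b))

# Hypothetical WikiHow-style demonstrations for the temporal-ordering sub-task.
demonstrations = [
    ("Task: order the sub-events of 'bake a cookie'.\n"
     "Answer: preheat the oven, mix the dough, bake, cool."),
    ("Task: order the sub-events of 'plant a tree'.\n"
     "Answer: dig a hole, place the sapling, fill the hole, water."),
]
question = "Task: order the sub-events of 'make a cup of tea'."

# Rank demonstrations by similarity to the question, reflecting the
# abstract's finding that similar demonstrations help the model.
ranked = sorted(demonstrations,
                key=lambda d: cosine_similarity(embed(d), embed(question)),
                reverse=True)

# Assemble a two-shot prompt: the two best demonstrations, then the question.
prompt = "\n\n".join(ranked[:2] + [question])
print(prompt)
```

In a real probe the stand-in embedding would be replaced by averaged GloVe word vectors (word level) or Universal Sentence Encoder outputs (sentence level), and the prompt would be sent to FLAN-T5, OPT, or GPT-3.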