Composition distillation for semantic sentence embeddings
The increasing demand for Natural Language Processing (NLP) solutions is driven by the exponential growth of digital content and communication platforms and the need for sophisticated language understanding. This surge in demand also reflects the critical role of NLP in enabling machines to comprehend, interpret, and generate human-like text, making it a crucial technology in modern AI applications.

Semantics, the study of meaning in language, plays a pivotal role in NLP, encompassing the understanding of context, relationships, and nuances within textual data. In recent years, there has been remarkable progress in utilizing pre-trained language models like BERT (Bidirectional Encoder Representations from Transformers) and GPT-3/4 (Generative Pre-trained Transformer) for semantic embeddings in NLP tasks. This project identifies and addresses a commonly overlooked challenge in NLP: the intricate composition of semantics within sentences is often lost during model training, reducing the depth and precision with which the input language is understood and leading to potential misinterpretations of textual data.

This gap is addressed by enhancing existing methods for distilling semantic information from text into smaller, more efficient models. By building on the foundation laid by previous models, the project aims to improve the performance and accuracy of NLP systems by enhancing the quality and depth of semantic embeddings.
Main Author: | Vaanavan, Sezhiyan |
---|---|
Other Authors: | Lihui Chen |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2024 |
Subjects: | Engineering; NLP; LLM; Sentence embedding |
Online Access: | https://hdl.handle.net/10356/177524 |
Institution: | Nanyang Technological University |
Citation: | Vaanavan, S. (2024). Composition distillation for semantic sentence embeddings. Final Year Project (FYP), Nanyang Technological University, Singapore. |
Degree: | Bachelor's degree |
School: | School of Electrical and Electronic Engineering |
Supervisor: | Lihui Chen (ELHCHEN@ntu.edu.sg) |
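The abstract describes distilling semantic information from texts into smaller, more efficient models. The record does not include the project's actual method, but as a rough illustration of the general idea, one common distillation objective trains a small student encoder to reproduce a larger teacher's sentence embeddings, for example by minimising one minus the cosine similarity between paired vectors. The function names below are hypothetical, and this sketch is not the thesis's implementation:

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def embedding_distillation_loss(student_embeddings, teacher_embeddings):
    """Mean (1 - cosine similarity) over paired sentence embeddings.

    Minimising this pushes the student's sentence vectors toward the
    directions of the teacher's vectors, one standard way to distil a
    sentence encoder into a smaller model.
    """
    pairs = list(zip(student_embeddings, teacher_embeddings))
    return sum(1.0 - cosine_similarity(s, t) for s, t in pairs) / len(pairs)

# A student whose embeddings point in the same directions as the
# teacher's incurs zero loss, regardless of scale.
teacher = [[1.0, 0.0], [0.0, 1.0]]
aligned_student = [[2.0, 0.0], [0.0, 0.5]]
print(embedding_distillation_loss(aligned_student, teacher))  # → 0.0
```

Because cosine similarity ignores vector magnitude, this loss only aligns embedding directions; distillation setups that also need matched magnitudes typically use a mean-squared-error term instead or in addition.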