Composition distillation for semantic sentence embeddings

The increasing demand for Natural Language Processing (NLP) solutions is driven by exponential growth in digital content, communication platforms, and the need for sophisticated language understanding. This surge also reflects the critical role of NLP in enabling machines to comprehend, interpret, and generate human-like text, making it a crucial technology in modern AI applications. Semantics, the study of meaning in language, plays a pivotal role in NLP, encompassing the understanding of context, relationships, and nuances within textual data. In recent years, there has been remarkable progress in utilizing pre-trained language models such as BERT (Bidirectional Encoder Representations from Transformers) and GPT-3/4 (Generative Pre-trained Transformer) for semantic embeddings in NLP tasks. This project identifies and addresses a critical but commonly overlooked challenge within NLP: the intricate composition of semantics within sentences often gets lost during model training, resulting in a lack of depth and precision in understanding the input language and leading to potential misinterpretations of textual data. This gap is addressed by enhancing existing methods to distil semantic information from texts into smaller, more efficient models. By building on the foundation laid by previous models, this project aims to improve the performance and accuracy of NLP systems by enhancing the quality and depth of semantic embeddings.
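The abstract describes distilling semantic information from larger models into smaller, more efficient ones. As an illustrative sketch only (not the project's actual method), embedding distillation is commonly framed as regressing a small student encoder onto a frozen teacher's sentence embeddings. The toy example below uses a hypothetical linear "teacher" and "student" over random features, trained with a mean-squared-error objective; all names and dimensions are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: 100 "sentences" represented as 32-d feature vectors, and a
# frozen "teacher" mapping them to 8-d embeddings (stand-in for a large model).
X = rng.normal(size=(100, 32))
W_teacher = rng.normal(size=(32, 8))
teacher_emb = X @ W_teacher  # targets the student must reproduce

# Student: a smaller linear map, fitted by gradient descent on the
# mean-squared error between student and teacher embeddings.
W_student = np.zeros((32, 8))
lr = 0.05
for _ in range(500):
    student_emb = X @ W_student
    grad = 2.0 * X.T @ (student_emb - teacher_emb) / len(X)
    W_student -= lr * grad

mse = np.mean((X @ W_student - teacher_emb) ** 2)
print(f"distillation MSE after training: {mse:.6f}")
```

In practice the student would be a compact neural encoder and the teacher a large pre-trained model, but the objective — matching the teacher's embedding space — takes the same shape.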

Bibliographic Details
Main Author: Vaanavan, Sezhiyan
Other Authors: Lihui Chen
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Subjects:
Engineering
NLP
LLM
Sentence embedding
Online Access: https://hdl.handle.net/10356/177524
Institution: Nanyang Technological University
School: School of Electrical and Electronic Engineering
Supervisor: Lihui Chen (ELHCHEN@ntu.edu.sg)
Degree: Bachelor's degree
Collection: DR-NTU, NTU Library
Citation: Vaanavan, S. (2024). Composition distillation for semantic sentence embeddings. Final Year Project (FYP), Nanyang Technological University, Singapore.