Effective and efficient semantic representations and their applications
The proliferation of affordable and compact digital storage has also led to the creation of enormous databases of information, and much attention has been focused on the problem of processing unorganized and unstructured information into some form from which additional value can be extracted. Contem...
Main Author: | CHIA, Chong Cher |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2023 |
Subjects: | |
Online Access: | https://ink.library.smu.edu.sg/etd_coll/536 https://ink.library.smu.edu.sg/context/etd_coll/article/1534/viewcontent/GPIS_AY2018_PhD_Chia_Chong_Cher.pdf |
Institution: | Singapore Management University |
id | sg-smu-ink.etd_coll-1534 |
---|---|
record_format | dspace |
spelling |
sg-smu-ink.etd_coll-1534 2024-02-14T06:37:54Z Effective and efficient semantic representations and their applications CHIA, Chong Cher The proliferation of affordable and compact digital storage has also led to the creation of enormous databases of information, and much attention has been focused on the problem of processing unorganized and unstructured information into some form from which additional value can be extracted. Contemporary approaches to this problem virtually necessitate the use of complex models running on computational systems due to the sheer volume of information to be processed. While it is possible for the model to be fed the actual data as input, typically a representation of the data is used instead. These representations are therefore of interest, as they act as intermediaries through which the database information is processed and therefore impact the resulting performance of the trained model. This dissertation is split into two parts: we first discuss in detail the effectiveness and efficiency of semantic data representations. Effective semantic representations focus on aspects generally related to the capabilities of these representations, such as task performance and interpretability. Efficient semantic representations encompass aspects which generally relate to the utilization of these representations, such as their storage size as well as generalizability across multiple tasks. Next, we explore an application of semantic representations in downstream tasks, before elaborating on multiple directions relating to such applications for future work. We present two works for discussion in the first part of the dissertation, where each work is focused on a specific form of semantic data. For textual data representations, we introduce a novel approach that improves efficiency through discarding representations, while limiting the impact on downstream task effectiveness. 
For knowledge base representations, we explore a novel measure of node importance in knowledge graphs, and present a heuristic approach for selecting such nodes in large knowledge graphs. In the second part of the dissertation, we discuss the application of semantic representations in two downstream Natural Language Processing (NLP) tasks. We first describe the use of semantic representations generated by Large Language Models (LLMs) in an Information Retrieval (IR) system, and overcome the "cold-start" problem in the Legal NLP domain by introducing a novel heuristic for labelling "key" legal passages. We then propose a future research direction for generating summaries from long legal documents, which raises research questions regarding the input representation of such documents as well as the evaluation of such summarization models. 2023-11-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/etd_coll/536 https://ink.library.smu.edu.sg/context/etd_coll/article/1534/viewcontent/GPIS_AY2018_PhD_Chia_Chong_Cher.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Dissertations and Theses Collection (Open Access) eng Institutional Knowledge at Singapore Management University Data-driven Optimization Urban Logistics Dynamic Pickup and Delivery Problem Programming Languages and Compilers Software Engineering |
institution | Singapore Management University |
building | SMU Libraries |
continent | Asia |
country | Singapore Singapore |
content_provider | SMU Libraries |
collection | InK@SMU |
language | English |
topic | Data-driven Optimization; Urban Logistics; Dynamic Pickup and Delivery Problem; Programming Languages and Compilers; Software Engineering |
spellingShingle | Data-driven Optimization Urban Logistics Dynamic Pickup and Delivery Problem Programming Languages and Compilers Software Engineering CHIA, Chong Cher Effective and efficient semantic representations and their applications |
description |
The proliferation of affordable and compact digital storage has also led to the creation of enormous databases of information, and much attention has been focused on the problem of processing unorganized and unstructured information into some form from which additional value can be extracted. Contemporary approaches to this problem virtually necessitate the use of complex models running on computational systems due to the sheer volume of information to be processed. While it is possible for the model to be fed the actual data as input, typically a representation of the data is used instead. These representations are therefore of interest, as they act as intermediaries through which the database information is processed and therefore impact the resulting performance of the trained model.
This dissertation is split into two parts: we first discuss in detail the effectiveness and efficiency of semantic data representations. Effective semantic representations focus on aspects generally related to the capabilities of these representations, such as task performance and interpretability. Efficient semantic representations encompass aspects which generally relate to the utilization of these representations, such as their storage size as well as generalizability across multiple tasks. Next, we explore an application of semantic representations in downstream tasks, before elaborating on multiple directions relating to such applications for future work.
We present two works for discussion in the first part of the dissertation, where each work is focused on a specific form of semantic data. For textual data representations, we introduce a novel approach that improves efficiency through discarding representations, while limiting the impact on downstream task effectiveness. For knowledge base representations, we explore a novel measure of node importance in knowledge graphs, and present a heuristic approach for selecting such nodes in large knowledge graphs.
In the second part of the dissertation, we discuss the application of semantic representations in two downstream Natural Language Processing (NLP) tasks. We first describe the use of semantic representations generated by Large Language Models (LLMs) in an Information Retrieval (IR) system, and overcome the "cold-start" problem in the Legal NLP domain by introducing a novel heuristic for labelling "key" legal passages. We then propose a future research direction for generating summaries from long legal documents, which raises research questions regarding the input representation of such documents as well as the evaluation of such summarization models. |
format | text |
author | CHIA, Chong Cher |
author_facet | CHIA, Chong Cher |
author_sort | CHIA, Chong Cher |
title | Effective and efficient semantic representations and their applications |
title_short | Effective and efficient semantic representations and their applications |
title_full | Effective and efficient semantic representations and their applications |
title_fullStr | Effective and efficient semantic representations and their applications |
title_full_unstemmed | Effective and efficient semantic representations and their applications |
title_sort | effective and efficient semantic representations and their applications |
publisher | Institutional Knowledge at Singapore Management University |
publishDate | 2023 |
url | https://ink.library.smu.edu.sg/etd_coll/536 https://ink.library.smu.edu.sg/context/etd_coll/article/1534/viewcontent/GPIS_AY2018_PhD_Chia_Chong_Cher.pdf |
_version_ | 1794549506530869248 |