Effective and efficient semantic representations and their applications

The proliferation of affordable and compact digital storage has led to the creation of enormous databases of information, and much attention has been focused on the problem of processing unorganized and unstructured information into some form from which additional value can be extracted. Contem...

Bibliographic Details
Main Author: CHIA, Chong Cher
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2023
Subjects:
Online Access: https://ink.library.smu.edu.sg/etd_coll/536
https://ink.library.smu.edu.sg/context/etd_coll/article/1534/viewcontent/GPIS_AY2018_PhD_Chia_Chong_Cher.pdf
Institution: Singapore Management University
id sg-smu-ink.etd_coll-1534
record_format dspace
spelling sg-smu-ink.etd_coll-1534 2024-02-14T06:37:54Z 2023-11-01T07:00:00Z text application/pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Dissertations and Theses Collection (Open Access) eng
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Data-driven Optimization
Urban Logistics
Dynamic Pickup and Delivery Problem
Programming Languages and Compilers
Software Engineering
description The proliferation of affordable and compact digital storage has led to the creation of enormous databases of information, and much attention has been focused on the problem of processing unorganized and unstructured information into some form from which additional value can be extracted. Contemporary approaches to this problem virtually necessitate the use of complex models running on computational systems due to the sheer volume of information to be processed. While it is possible for the model to be fed the actual data as input, typically a representation of the data is used instead. These representations are therefore of interest, as they act as intermediaries through which the database information is processed and therefore impact the resulting performance of the trained model. This dissertation is split into two parts. We first discuss in detail the effectiveness and efficiency of semantic data representations. Effective semantic representations focus on aspects generally related to the capabilities of these representations, such as task performance and interpretability. Efficient semantic representations encompass aspects which generally relate to the utilization of these representations, such as their storage size and their generalizability across multiple tasks. Next, we explore an application of semantic representations in downstream tasks, before elaborating on multiple directions relating to such applications for future work. We present two works for discussion in the first part of the dissertation, where each work is focused on a specific form of semantic data. For textual data representations, we introduce a novel approach that improves efficiency through discarding representations, while limiting the impact on downstream task effectiveness. For knowledge base representations, we explore a novel measure of node importance in knowledge graphs, and present a heuristic approach for selecting such nodes in large knowledge graphs.
In the second part of the dissertation, we discuss the application of semantic representations in two downstream Natural Language Processing (NLP) tasks. We first describe the use of semantic representations generated by Large Language Models (LLMs) in an Information Retrieval (IR) system, and overcome the "cold-start" problem in the Legal NLP domain by introducing a novel heuristic for labelling "key" legal passages. We then propose a future research direction for generating summaries from long legal documents, which raises research questions regarding the input representation of such documents as well as the evaluation of such summarization models.
format text
author CHIA, Chong Cher
title Effective and efficient semantic representations and their applications
publisher Institutional Knowledge at Singapore Management University
publishDate 2023
url https://ink.library.smu.edu.sg/etd_coll/536
https://ink.library.smu.edu.sg/context/etd_coll/article/1534/viewcontent/GPIS_AY2018_PhD_Chia_Chong_Cher.pdf
_version_ 1794549506530869248