Exploiting long context using joint distance and occurrence information for language modeling


Bibliographic Details
Main Author: Chong, Tze Yuang
Other Authors: Chng Eng Siong
Format: Theses and Dissertations
Language: English
Published: 2018
Subjects: DRNTU::Engineering::Computer science and engineering
Online Access: http://hdl.handle.net/10356/75876
Institution: Nanyang Technological University
id sg-ntu-dr.10356-75876
record_format dspace
spelling sg-ntu-dr.10356-758762023-03-04T00:47:22Z Exploiting long context using joint distance and occurrence information for language modeling Chong, Tze Yuang Chng Eng Siong School of Computer Science and Engineering DRNTU::Engineering::Computer science and engineering Doctor of Philosophy (SCE) 2018-07-04T12:01:48Z 2018-07-04T12:01:48Z 2018 Thesis Chong, T. Y. (2018). Exploiting long context using joint distance and occurrence information for language modeling. Doctoral thesis, Nanyang Technological University, Singapore. http://hdl.handle.net/10356/75876 10.32657/10356/75876 en 121 p. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering
description This thesis investigates an approach to exploiting long context based on distance and occurrence information. By modeling the joint event of distance and occurrence, the approach incorporates their inter-dependencies into the model, so that information captured from the long context can be used more effectively. This addresses a shortcoming of conventional language modeling approaches, which tend to neglect these inter-dependencies. Based on the proposed approach, a novel language model, referred to as the term-distance term-occurrence (TDTO) model, is formulated. The TDTO model estimates probabilities from term-distance (TD) and term-occurrence (TO) events, which correspond to the distances and occurrences of words in the context. By expressing the TDTO model within a log-linear interpolation framework, the contributions of the TD and TO components to the final estimate can be tuned. Specifically, since TD events, i.e. positions, within a long context are likely to be rare or unseen, the weight of the TD component can be tuned down to alleviate the data scarcity problem. Through a series of experiments, the TDTO model has been shown to exploit the long context to reduce language model perplexity. On the BLLIP Wall Street Journal (WSJ) and Switchboard-1 (SWB) corpora, perplexity reductions of up to 11.2% and 6.5% were obtained with context lengths of seven and eight, respectively. In addition, the TDTO model outperformed other conventional models used to exploit the long context, such as the distant-bigram, trigger and bag-of-words (BOW) models, consistently showing lower perplexities. The applicability of the TDTO model has been examined on several tasks: speech recognition, text classification and word prediction. The TDTO model improved the baseline performance on all the considered tasks.
Furthermore, this thesis proposes a neural network implementation of the TDTO model, aimed at providing a better smoothing mechanism for TDTO modeling. The resulting model, referred to as the neural-network-based TDTO (NN-TDTO) model, has been empirically shown to outperform the baseline TDTO model in both perplexity and speech recognition accuracy. On the WSJ corpus, the NN-TDTO model yielded up to 9.2% lower perplexity than the TDTO model. On the Aurora-4 speech recognition task, the NN-TDTO model obtained up to 12.9% relative reduction in word error rate.
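The log-linear interpolation framework the abstract refers to can be sketched as follows: each component model contributes a probability raised to a tunable weight, and the product is renormalized. The TD/TO probability tables, vocabulary and weights below are illustrative placeholders only, not values or code from the thesis.

```python
import math

def log_linear_interpolate(component_probs, weights):
    """Combine per-word probabilities from several component models
    via log-linear interpolation: P(w) is proportional to
    prod_i P_i(w) ** lambda_i, renormalized over the vocabulary.
    component_probs: list of dicts mapping word -> probability.
    weights: one interpolation weight (lambda_i) per component.
    """
    vocab = set().union(*component_probs)
    scores = {
        w: math.exp(sum(lam * math.log(p.get(w, 1e-10))
                        for p, lam in zip(component_probs, weights)))
        for w in vocab
    }
    z = sum(scores.values())  # normalization constant
    return {w: s / z for w, s in scores.items()}

# Hypothetical term-distance (TD) and term-occurrence (TO) component
# distributions over a toy three-word vocabulary; numbers are made up.
td = {"stock": 0.5, "market": 0.3, "fell": 0.2}
to = {"stock": 0.2, "market": 0.5, "fell": 0.3}

# Down-weighting the TD component (lambda_TD < lambda_TO) mirrors the
# thesis's remedy for sparse position events in long contexts.
combined = log_linear_interpolate([td, to], weights=[0.3, 0.7])
```

With the TD weight tuned down, the combined distribution leans toward the TO component, which is exactly the lever the abstract describes for alleviating data scarcity in long contexts.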
author2 Chng Eng Siong
format Theses and Dissertations
author Chong, Tze Yuang
author_sort Chong, Tze Yuang
title Exploiting long context using joint distance and occurrence information for language modeling
title_sort exploiting long context using joint distance and occurrence information for language modeling
publishDate 2018
url http://hdl.handle.net/10356/75876
_version_ 1759854485380792320