Word-streams for representing context in word maps

Bibliographic Details
Main Authors: Azcarraga, Arnulfo P., Gopez, Alfred Kenneth S., Yap, Teddy, Jr.
Format: text
Published: Animo Repository 2007
Subjects: Context (Linguistics); Self-organizing maps; Computer Sciences
Online Access: https://animorepository.dlsu.edu.ph/faculty_research/11960
Institution: De La Salle University
id oai:animorepository.dlsu.edu.ph:faculty_research-14111
record_format eprints
institution De La Salle University
building De La Salle University Library
continent Asia
country Philippines
content_provider De La Salle University Library
collection DLSU Institutional Repository
topic Context (Linguistics)
Self-organizing maps
Computer Sciences
description The most prominent use of Self-Organizing Maps (SOMs) in text archiving and retrieval is WEBSOM. In WEBSOM, a map is first used to reduce the dimensionality of the huge term-frequency table by training a so-called word-category map. This word-category map is then used to convert the individual documents into their respective document signatures (i.e., histograms of words), which form the basis for training a document map. This document map is the final text archive. WEBSOM has been shown to be a powerful and versatile text archiving system. However, it spends (wastes) enormous computing resources on computing the left and right context of every word that appears in any document of the text corpus. This paper presents an alternative scheme for incorporating context in the encoding of the words, one that takes full advantage of the computation of the probabilistic centroid, which is inherent in the SOM training algorithm. Several experiments are conducted to compare this new scheme with WEBSOM’s context averaging scheme.
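To make the cost mentioned in the description concrete, here is a minimal Python sketch of a context averaging encoding of the kind generally described for WEBSOM word-category maps. It is an illustration under stated assumptions, not the authors' code and not the word-stream scheme of the paper, which this record does not describe. The function names, the code dimension `dim`, the weight `eps`, and the toy sentence are all hypothetical.

```python
import random
from collections import defaultdict


def random_codes(vocab, dim, seed=0):
    """Assign every vocabulary word a fixed random code vector (assumed encoding)."""
    rng = random.Random(seed)
    return {w: [rng.gauss(0.0, 1.0) for _ in range(dim)] for w in vocab}


def mean(vectors, dim):
    """Component-wise mean of a list of vectors; a zero vector if the list is empty."""
    if not vectors:
        return [0.0] * dim
    n = len(vectors)
    return [sum(v[k] for v in vectors) / n for k in range(dim)]


def context_averaged_vectors(tokens, dim=8, eps=0.2, seed=0):
    """Build one input vector per distinct word:
    [mean code of left neighbours, eps * own code, mean code of right neighbours].
    """
    codes = random_codes(sorted(set(tokens)), dim, seed)
    left = defaultdict(list)
    right = defaultdict(list)
    # This pass visits every word occurrence in the corpus; it is the
    # per-occurrence context computation that the description calls costly.
    for i, w in enumerate(tokens):
        if i > 0:
            left[w].append(codes[tokens[i - 1]])
        if i + 1 < len(tokens):
            right[w].append(codes[tokens[i + 1]])
    return {
        w: mean(left[w], dim) + [eps * c for c in codes[w]] + mean(right[w], dim)
        for w in codes
    }


# Hypothetical toy corpus, for illustration only.
tokens = "the cat sat on the mat".split()
vectors = context_averaged_vectors(tokens)
```

Each resulting vector has length `3 * dim` and would serve as one training input for the word-category map. The averages are computed explicitly before training, which is the step the paper proposes to replace by relying on the centroid computation already performed during SOM training.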