Word-streams for representing context in word maps
The most prominent use of Self-Organizing Maps (SOMs) in text archiving and retrieval is the WEBSOM. In WEBSOM, a map is first used to reduce the dimensionality of the huge term frequency table by training a so-called word-category map. This word-category map is then used to convert the individual do...
Main Authors: | Azcarraga, Arnulfo P.; Gopez, Alfred Kenneth S.; Yap, Teddy, Jr. |
---|---|
Format: | text |
Published: | Animo Repository 2007 |
Subjects: | Context (Linguistics); Self-organizing maps; Computer Sciences |
Online Access: | https://animorepository.dlsu.edu.ph/faculty_research/11960 |
Institution: | De La Salle University |
id |
oai:animorepository.dlsu.edu.ph:faculty_research-14111 |
---|---|
record_format |
eprints |
spelling |
oai:animorepository.dlsu.edu.ph:faculty_research-14111 2024-03-23T00:32:25Z Word-streams for representing context in word maps Azcarraga, Arnulfo P. Gopez, Alfred Kenneth S. Yap, Teddy, Jr. The most prominent use of Self-Organizing Maps (SOMs) in text archiving and retrieval is the WEBSOM. In WEBSOM, a map is first used to reduce the dimensionality of the huge term frequency table by training a so-called word-category map. This word-category map is then used to convert the individual documents into their respective document signatures (i.e. histogram of words) which form the basis for training a document map. This document map is the final text archive. WEBSOM has been shown to be a powerful and versatile text archiving system. However, it spends (wastes) enormous computer resources in the computation of the left and right context of each and every word that appears in any of the documents in the text corpus. This paper presents an alternative scheme for incorporating context in the encoding of the words in such a way that the computation of the probabilistic centroid, which is inherent in the SOM training algorithm, is taken full advantage of. Several experiments are conducted to compare this new scheme with WEBSOM’s context averaging scheme. 2007-01-01T08:00:00Z text https://animorepository.dlsu.edu.ph/faculty_research/11960 Faculty Research Work Animo Repository Context (Linguistics) Self-organizing maps Computer Sciences |
institution |
De La Salle University |
building |
De La Salle University Library |
continent |
Asia |
country |
Philippines |
content_provider |
De La Salle University Library |
collection |
DLSU Institutional Repository |
topic |
Context (Linguistics) Self-organizing maps Computer Sciences |
description |
The most prominent use of Self-Organizing Maps (SOMs) in text archiving and retrieval is WEBSOM. In WEBSOM, a map is first used to reduce the dimensionality of the huge term-frequency table by training a so-called word-category map. This word-category map is then used to convert the individual documents into their respective document signatures (i.e., histograms of words), which form the basis for training a document map. This document map is the final text archive. WEBSOM has been shown to be a powerful and versatile text archiving system. However, it spends (wastes) enormous computing resources on computing the left and right context of every word that appears in any document in the text corpus. This paper presents an alternative scheme for incorporating context into the encoding of words, one that takes full advantage of the computation of the probabilistic centroid inherent in the SOM training algorithm. Several experiments are conducted to compare this new scheme with WEBSOM’s context averaging scheme. |
format |
text |
author |
Azcarraga, Arnulfo P. Gopez, Alfred Kenneth S. Yap, Teddy, Jr. |
author_sort |
Azcarraga, Arnulfo P. |
title |
Word-streams for representing context in word maps |
title_sort |
word-streams for representing context in word maps |
publisher |
Animo Repository |
publishDate |
2007 |
url |
https://animorepository.dlsu.edu.ph/faculty_research/11960 |
_version_ |
1800918898085724160 |
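
The abstract refers to WEBSOM's computation of the left and right context of every word. As a rough illustration only, the sketch below shows one common reading of context averaging: each word gets a fixed random code vector, and its encoding is the mean code of its left neighbors, its own code, and the mean code of its right neighbors, concatenated. The function names (`random_codes`, `context_average`), the vector dimension, the Gaussian codes, and the toy corpus are all assumptions made for this example; the record does not specify them, and the paper's own word-stream scheme is not shown here.

```python
import random
from collections import defaultdict

def random_codes(vocab, dim, seed=0):
    """Assign each word a fixed random code vector (illustrative choice)."""
    rng = random.Random(seed)
    return {w: [rng.gauss(0.0, 1.0) for _ in range(dim)] for w in vocab}

def mean(vectors, dim):
    """Component-wise mean; a zero vector if there are no vectors."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[k] for v in vectors) / len(vectors) for k in range(dim)]

def context_average(docs, dim=8, seed=0):
    """Encode each word as [mean left code, own code, mean right code]."""
    vocab = sorted({w for doc in docs for w in doc})
    codes = random_codes(vocab, dim, seed)
    left, right = defaultdict(list), defaultdict(list)
    # One pass over every word occurrence in every document: this is the
    # corpus-wide context computation the abstract describes as costly.
    for doc in docs:
        for i, w in enumerate(doc):
            if i > 0:
                left[w].append(codes[doc[i - 1]])
            if i + 1 < len(doc):
                right[w].append(codes[doc[i + 1]])
    return {w: mean(left[w], dim) + codes[w] + mean(right[w], dim) for w in vocab}

# Hypothetical two-document corpus, already tokenized.
docs = [["maps", "organize", "words"], ["maps", "organize", "documents"]]
encoded = context_average(docs, dim=4)
print(len(encoded["organize"]))  # 12
```

Under these assumptions, the encoded vectors would serve as the training input for the word-category map; words that occur in similar contexts receive similar left and right averages.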