Evaluating keyword selection methods for WEBSOM text archives
The WEBSOM methodology, proven effective for building very large text archives, includes a method that extracts labels for each document cluster assigned to nodes in the map. However, the WEBSOM method needs to retrieve all the words of all the documents associated to each node. Since maps may have...
Main Authors: | Azcarraga, Arnulfo P.; Yap, Teddy N.; Tan, Jonathan O.; Chua, Tat Seng |
---|---|
Format: | text |
Published: | Animo Repository 2004 |
Subjects: | Text processing (Computer science); Automatic indexing; Computer Sciences |
Online Access: | https://animorepository.dlsu.edu.ph/faculty_research/553 https://animorepository.dlsu.edu.ph/context/faculty_research/article/1552/type/native/viewcontent |
Institution: | De La Salle University |
id |
oai:animorepository.dlsu.edu.ph:faculty_research-1552 |
---|---|
record_format |
eprints |
spelling |
oai:animorepository.dlsu.edu.ph:faculty_research-1552 2022-07-29T06:48:32Z Evaluating keyword selection methods for WEBSOM text archives Azcarraga, Arnulfo P. Yap, Teddy N. Tan, Jonathan O. Chua, Tat Seng The WEBSOM methodology, proven effective for building very large text archives, includes a method that extracts labels for each document cluster assigned to nodes in the map. However, the WEBSOM method needs to retrieve all the words of all the documents associated to each node. Since maps may have more than 100,000 nodes and since the archive may contain up to seven million documents, the WEBSOM methodology needs a faster alternative method for keyword selection. Presented here is such an alternative method that is able to quickly deduce meaningful labels per node in the map. It does this just by analyzing the relative weight distribution of the SOM weight vectors and by taking advantage of some characteristics of the random projection method used in dimensionality reduction. The effectiveness of this technique is demonstrated on news document collections. 2004-03-01T08:00:00Z text text/html https://animorepository.dlsu.edu.ph/faculty_research/553 https://animorepository.dlsu.edu.ph/context/faculty_research/article/1552/type/native/viewcontent Faculty Research Work Animo Repository Text processing (Computer science) Automatic indexing Computer Sciences |
institution |
De La Salle University |
building |
De La Salle University Library |
continent |
Asia |
country |
Philippines |
content_provider |
De La Salle University Library |
collection |
DLSU Institutional Repository |
topic |
Text processing (Computer science) Automatic indexing Computer Sciences |
description |
The WEBSOM methodology, proven effective for building very large text archives, includes a method that extracts labels for each document cluster assigned to nodes in the map. However, the WEBSOM method needs to retrieve all the words of all the documents associated to each node. Since maps may have more than 100,000 nodes and since the archive may contain up to seven million documents, the WEBSOM methodology needs a faster alternative method for keyword selection. Presented here is such an alternative method that is able to quickly deduce meaningful labels per node in the map. It does this just by analyzing the relative weight distribution of the SOM weight vectors and by taking advantage of some characteristics of the random projection method used in dimensionality reduction. The effectiveness of this technique is demonstrated on news document collections. |
format |
text |
author |
Azcarraga, Arnulfo P. Yap, Teddy N. Tan, Jonathan O. Chua, Tat Seng |
author_sort |
Azcarraga, Arnulfo P. |
title |
Evaluating keyword selection methods for WEBSOM text archives |
title_sort |
evaluating keyword selection methods for websom text archives |
publisher |
Animo Repository |
publishDate |
2004 |
url |
https://animorepository.dlsu.edu.ph/faculty_research/553 https://animorepository.dlsu.edu.ph/context/faculty_research/article/1552/type/native/viewcontent |
_version_ |
1740844688701652992 |