Evaluating keyword selection methods for WEBSOM text archives

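The WEBSOM methodology, proven effective for building very large text archives, includes a method that extracts labels for each document cluster assigned to nodes in the map. However, the WEBSOM method needs to retrieve all the words of all the documents associated with each node. Since maps may have more than 100,000 nodes and since the archive may contain up to seven million documents, the WEBSOM methodology needs a faster alternative method for keyword selection. Presented here is such an alternative method that is able to quickly deduce meaningful labels per node in the map. It does this just by analyzing the relative weight distribution of the SOM weight vectors and by taking advantage of some characteristics of the random projection method used in dimensionality reduction. The effectiveness of this technique is demonstrated on news document collections.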
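
The record describes keyword selection that reads labels directly from SOM weight vectors, using properties of the random projection, instead of re-reading every document at a node. The sketch below is only an illustration of that general idea and is not the authors' algorithm. It assumes a sparse projection in which each vocabulary word is mapped to a few reduced dimensions, and it scores a word by how far the node's weights at those dimensions sit above the node's average weight. All function names, parameters, and the sample vocabulary are hypothetical.

```python
import random


def make_projection(vocab_size, dim, slots_per_word=3, seed=0):
    """Assumed sparse random projection: each word id maps to a few
    distinct reduced dimensions, chosen at random."""
    rng = random.Random(seed)
    return [rng.sample(range(dim), slots_per_word) for _ in range(vocab_size)]


def project(doc_counts, projection, dim):
    """Project a bag of words {word_id: count} into the reduced space."""
    vec = [0.0] * dim
    for word_id, count in doc_counts.items():
        for d in projection[word_id]:
            vec[d] += count
    return vec


def score_words(node_weights, projection):
    """Score each word by the mean node weight at its projected dimensions,
    relative to the node's overall mean weight."""
    baseline = sum(node_weights) / len(node_weights)
    return [
        sum(node_weights[d] for d in dims) / len(dims) - baseline
        for dims in projection
    ]


def top_keywords(node_weights, projection, vocab, k=3):
    """Return the k highest-scoring words for one map node."""
    scores = score_words(node_weights, projection)
    ranked = sorted(range(len(vocab)), key=lambda i: scores[i], reverse=True)
    return [vocab[i] for i in ranked[:k]]


# Hypothetical usage: a node whose weight vector resembles a finance document.
vocab = ["market", "election", "football", "stocks"]
projection = make_projection(len(vocab), dim=12, seed=0)
node_weights = project({0: 3, 3: 2}, projection, dim=12)
print(top_keywords(node_weights, projection, vocab, k=2))
```

Because the scores come from the weight vector alone, the cost per node depends on the vocabulary size and the number of projected dimensions per word, not on how many documents are assigned to the node. Words that happen to share projected dimensions can receive inflated scores, so a real system would need some way to handle such collisions.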

Bibliographic Details
Main Authors: Azcarraga, Arnulfo P., Yap, Teddy N., Tan, Jonathan O., Chua, Tat Seng
Format: text
Published: Animo Repository 2004
Subjects: Text processing (Computer science); Automatic indexing; Computer Sciences
Online Access: https://animorepository.dlsu.edu.ph/faculty_research/553
https://animorepository.dlsu.edu.ph/context/faculty_research/article/1552/type/native/viewcontent
Institution: De La Salle University
id oai:animorepository.dlsu.edu.ph:faculty_research-1552
record_format eprints
spelling oai:animorepository.dlsu.edu.ph:faculty_research-1552 2022-07-29T06:48:32Z Evaluating keyword selection methods for WEBSOM text archives Azcarraga, Arnulfo P. Yap, Teddy N. Tan, Jonathan O. Chua, Tat Seng The WEBSOM methodology, proven effective for building very large text archives, includes a method that extracts labels for each document cluster assigned to nodes in the map. However, the WEBSOM method needs to retrieve all the words of all the documents associated with each node. Since maps may have more than 100,000 nodes and since the archive may contain up to seven million documents, the WEBSOM methodology needs a faster alternative method for keyword selection. Presented here is such an alternative method that is able to quickly deduce meaningful labels per node in the map. It does this just by analyzing the relative weight distribution of the SOM weight vectors and by taking advantage of some characteristics of the random projection method used in dimensionality reduction. The effectiveness of this technique is demonstrated on news document collections. 2004-03-01T08:00:00Z text text/html https://animorepository.dlsu.edu.ph/faculty_research/553 https://animorepository.dlsu.edu.ph/context/faculty_research/article/1552/type/native/viewcontent Faculty Research Work Animo Repository Text processing (Computer science) Automatic indexing Computer Sciences
institution De La Salle University
building De La Salle University Library
continent Asia
country Philippines
content_provider De La Salle University Library
collection DLSU Institutional Repository
topic Text processing (Computer science)
Automatic indexing
Computer Sciences
description The WEBSOM methodology, proven effective for building very large text archives, includes a method that extracts labels for each document cluster assigned to nodes in the map. However, the WEBSOM method needs to retrieve all the words of all the documents associated with each node. Since maps may have more than 100,000 nodes and since the archive may contain up to seven million documents, the WEBSOM methodology needs a faster alternative method for keyword selection. Presented here is such an alternative method that is able to quickly deduce meaningful labels per node in the map. It does this just by analyzing the relative weight distribution of the SOM weight vectors and by taking advantage of some characteristics of the random projection method used in dimensionality reduction. The effectiveness of this technique is demonstrated on news document collections.
format text
author Azcarraga, Arnulfo P.
Yap, Teddy N.
Tan, Jonathan O.
Chua, Tat Seng
author_facet Azcarraga, Arnulfo P.
Yap, Teddy N.
Tan, Jonathan O.
Chua, Tat Seng
author_sort Azcarraga, Arnulfo P.
title Evaluating keyword selection methods for WEBSOM text archives
title_short Evaluating keyword selection methods for WEBSOM text archives
title_full Evaluating keyword selection methods for WEBSOM text archives
title_fullStr Evaluating keyword selection methods for WEBSOM text archives
title_full_unstemmed Evaluating keyword selection methods for WEBSOM text archives
title_sort evaluating keyword selection methods for websom text archives
publisher Animo Repository
publishDate 2004
url https://animorepository.dlsu.edu.ph/faculty_research/553
https://animorepository.dlsu.edu.ph/context/faculty_research/article/1552/type/native/viewcontent
_version_ 1740844688701652992