Blocking reduction strategies in hierarchical text classification
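One common approach in hierarchical text classification involves associating classifiers with nodes in the category tree and classifying text documents in a top-down manner. Classification methods using this top-down approach can scale well and cope with changes to the category trees. However, all these methods suffer from blocking, which refers to documents that are wrongly rejected by the classifiers at higher levels and therefore cannot be passed to the classifiers at lower levels. We propose a classifier-centric performance measure known as the blocking factor to determine the extent of the blocking. Three methods are proposed to address the blocking problem, namely, threshold reduction, restricted voting, and extended multiplicative. Our experiments using support vector machine (SVM) classifiers on the Reuters collection have shown that all three methods could reduce blocking and improve the classification accuracy. Our experiments have also shown that the restricted voting method delivered the best performance.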
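As a rough illustration of the top-down approach summarized above, the following Python sketch shows how a document can be blocked, that is, rejected by a higher-level classifier so that it never reaches a lower-level classifier that would have accepted it. Everything in it is an assumption made for illustration: the `Node` class, the toy `keyword_score` scorer (a stand-in for trained classifiers such as SVMs), the two-node category tree, and the `relax` parameter, which only loosely mimics the idea of lowering thresholds at higher levels. It is not the paper's implementation, and it does not compute the paper's blocking factor.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class Node:
    """Hypothetical category-tree node with its own classifier and threshold."""
    name: str
    score: Callable[[str], float]  # stand-in for a trained classifier
    threshold: float = 0.5
    children: List["Node"] = field(default_factory=list)


def keyword_score(keywords: List[str]) -> Callable[[str], float]:
    """Toy scorer: fraction of the given keywords that occur in the document."""
    def score(doc: str) -> float:
        words = set(doc.lower().split())
        return sum(k in words for k in keywords) / len(keywords)
    return score


def classify_top_down(node: Node, doc: str, relax: float = 0.0) -> Set[str]:
    """Return the names of all nodes that accept doc.

    Children are visited only if their parent accepts the document.
    `relax` lowers the threshold of internal (non-leaf) nodes only,
    a simplified illustration of reducing higher-level thresholds.
    """
    threshold = node.threshold - relax if node.children else node.threshold
    if node.score(doc) < threshold:
        return set()  # rejected here: descendants never see the document
    accepted = {node.name}
    for child in node.children:
        accepted |= classify_top_down(child, doc, relax)
    return accepted


# Assumed two-level tree: "grain" is the parent of "wheat".
wheat = Node("wheat", keyword_score(["wheat", "flour"]))
grain = Node("grain", keyword_score(["grain", "crop", "harvest", "export"]),
             children=[wheat])

doc = "wheat flour export rises"

strict = classify_top_down(grain, doc)               # parent rejects: blocked
relaxed = classify_top_down(grain, doc, relax=0.25)  # parent threshold lowered
print(strict, relaxed)
```

In this toy setup the leaf scorer for `wheat` would accept the document on its own, but the parent `grain` scores it below its threshold, so the document is blocked until the parent's threshold is relaxed.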
Main Authors: | LIM, Ee Peng; SUN, Aixin; NG, Wee-Keong; SRIVASTAVA, Jaideep |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2004 |
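Subjects: | Data mining; text mining; classification; Databases and Information Systems; Numerical Analysis and Scientific Computing |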
Online Access: | https://ink.library.smu.edu.sg/sis_research/124 https://ink.library.smu.edu.sg/context/sis_research/article/1123/viewcontent/Blocking_reduction_strategies_in_hierarchical_text_classification.pdf |
Institution: | Singapore Management University |
id |
sg-smu-ink.sis_research-1123 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-1123; 2018-06-29T02:09:14Z; Blocking reduction strategies in hierarchical text classification; LIM, Ee Peng; SUN, Aixin; NG, Wee-Keong; SRIVASTAVA, Jaideep; 2004-10-01T07:00:00Z; text; application/pdf; https://ink.library.smu.edu.sg/sis_research/124; info:doi/10.1109/TKDE.2004.50; https://ink.library.smu.edu.sg/context/sis_research/article/1123/viewcontent/Blocking_reduction_strategies_in_hierarchical_text_classification.pdf; http://creativecommons.org/licenses/by-nc-nd/4.0/; Research Collection School Of Computing and Information Systems; eng; Institutional Knowledge at Singapore Management University; Data mining; text mining; classification; Databases and Information Systems; Numerical Analysis and Scientific Computing |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
Data mining; text mining; classification; Databases and Information Systems; Numerical Analysis and Scientific Computing |
description |
One common approach in hierarchical text classification involves associating classifiers with nodes in the category tree and classifying text documents in a top-down manner. Classification methods using this top-down approach can scale well and cope with changes to the category trees. However, all these methods suffer from blocking, which refers to documents that are wrongly rejected by the classifiers at higher levels and therefore cannot be passed to the classifiers at lower levels. We propose a classifier-centric performance measure known as the blocking factor to determine the extent of the blocking. Three methods are proposed to address the blocking problem, namely, threshold reduction, restricted voting, and extended multiplicative. Our experiments using support vector machine (SVM) classifiers on the Reuters collection have shown that all three methods could reduce blocking and improve the classification accuracy. Our experiments have also shown that the restricted voting method delivered the best performance. |
format |
text |
author |
LIM, Ee Peng; SUN, Aixin; NG, Wee-Keong; SRIVASTAVA, Jaideep |
author_sort |
LIM, Ee Peng |
title |
Blocking reduction strategies in hierarchical text classification |
publisher |
Institutional Knowledge at Singapore Management University |
publishDate |
2004 |
url |
https://ink.library.smu.edu.sg/sis_research/124 https://ink.library.smu.edu.sg/context/sis_research/article/1123/viewcontent/Blocking_reduction_strategies_in_hierarchical_text_classification.pdf |
_version_ |
1770568879436726272 |