Category hierarchy maintenance : a data-driven approach

Category hierarchies often evolve at a much slower pace than the documents that reside in them. As newly available documents keep being added to a hierarchy, new topics emerge and documents within the same category become less topically cohesive. In this paper, we propose a novel automatic approach to modifyin...

Bibliographic details
Main Authors: Yuan, Quan, Cong, Gao, Sun, Aixin, Lin, Chin-Yew, Magnenat-Thalmann, Nadia
Other Authors: School of Computer Engineering
Format: Conference or Workshop Item
Language: English
Published: 2013
Subjects: DRNTU::Engineering::Computer science and engineering
Online Access: https://hdl.handle.net/10356/97576
http://hdl.handle.net/10220/12082
id sg-ntu-dr.10356-97576
record_format dspace
spelling sg-ntu-dr.10356-97576
2020-05-28T07:17:21Z
Category hierarchy maintenance : a data-driven approach
Yuan, Quan
Cong, Gao
Sun, Aixin
Lin, Chin-Yew
Magnenat-Thalmann, Nadia
School of Computer Engineering
International conference on Research and development in information retrieval (35th : 2012)
DRNTU::Engineering::Computer science and engineering
2013-07-23T09:02:26Z
2019-12-06T19:44:14Z
2012
Conference Paper
Yuan, Q., Cong, G., Sun, A., Lin, C.-Y., & Magnenat-Thalmann, N. (2012). Category hierarchy maintenance: a data-driven approach. Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval - SIGIR '12.
https://hdl.handle.net/10356/97576
http://hdl.handle.net/10220/12082
10.1145/2348283.2348389
en
© 2012 ACM.
institution Nanyang Technological University
building NTU Library
country Singapore
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering
description Category hierarchies often evolve at a much slower pace than the documents that reside in them. As newly available documents keep being added to a hierarchy, new topics emerge and documents within the same category become less topically cohesive. In this paper, we propose a novel automatic approach to modifying a given category hierarchy by redistributing its documents into more topically cohesive categories. The modification is achieved with three operations (namely, sprout, merge, and assign) with reference to an auxiliary hierarchy that supplies additional semantic information; the auxiliary hierarchy covers a set of topics similar to that of the hierarchy to be modified. Our user study shows that the modified category hierarchy is semantically meaningful. As an extrinsic evaluation, we conduct document classification experiments using real data from the Yahoo! Answers and AnswerBag hierarchies, and compare the classification accuracies obtained on the original and the modified hierarchies. Our experiments show that the proposed method achieves a much larger improvement in classification accuracy than several baseline methods for hierarchy modification.
author2 School of Computer Engineering
format Conference or Workshop Item
author Yuan, Quan
Cong, Gao
Sun, Aixin
Lin, Chin-Yew
Magnenat-Thalmann, Nadia
author_sort Yuan, Quan
title Category hierarchy maintenance : a data-driven approach
publishDate 2013
url https://hdl.handle.net/10356/97576
http://hdl.handle.net/10220/12082