Mining generalized associations of semantic relations from textual web content
Traditional text mining techniques transform free text into flat bags of words representation, which does not preserve sufficient semantics for the purpose of knowledge discovery. In this paper, we present a two-step procedure to mine generalized associations of semantic relations conveyed by the te...
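The second step of the procedure summarized above, mining associations between relations that may be generalized through a term taxonomy, can be illustrated with a minimal sketch. This is not the GP-Close algorithm and performs no generalization-closure pruning. It is a naive support count, assuming relations are (subject, predicate, object) triples and that a toy child-to-parent taxonomy stands in for the WordNet-derived sense hierarchy. All names and data below (`TAXONOMY`, `DOCS`, `frequent_pairs`) are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy taxonomy (child -> parent). The record describes a sense
# hierarchy inferred from WordNet; this dict only stands in for it.
TAXONOMY = {
    "bomb": "weapon",
    "gun": "weapon",
    "embassy": "building",
    "hotel": "building",
}

# Hypothetical documents, each a set of (subject, predicate, object) relations.
DOCS = [
    {("group_a", "use", "bomb"), ("group_a", "attack", "embassy")},
    {("group_a", "use", "gun"), ("group_a", "attack", "hotel")},
]


def ancestors(term):
    """Return the term followed by all of its ancestors in TAXONOMY."""
    chain = [term]
    while chain[-1] in TAXONOMY:
        chain.append(TAXONOMY[chain[-1]])
    return chain


def generalizations(triple):
    """Return the triple plus every version with a generalized subject or object."""
    s, p, o = triple
    return {(gs, p, go) for gs in ancestors(s) for go in ancestors(o)}


def extended(doc):
    """Extend a document's relation set with all generalized relations."""
    out = set()
    for triple in doc:
        out |= generalizations(triple)
    return out


def frequent_pairs(docs, min_support):
    """Count relation pairs (at any generalization level) per document."""
    counts = Counter()
    for doc in docs:
        for a, b in combinations(sorted(extended(doc)), 2):
            # Skip pairs where one relation merely generalizes the other:
            # they co-occur trivially and carry no association.
            if a in generalizations(b) or b in generalizations(a):
                continue
            counts[(a, b)] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


print(frequent_pairs(DOCS, min_support=2))
```

In this toy data no pair of specific relations occurs in both documents, so only the pair of generalized relations (use weapon, attack building) reaches a support of 2. A naive enumeration like this also produces many redundant overgeneralized patterns on deeper taxonomies, which is the problem the abstract says GP-Close addresses.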
Main Authors: | JIANG, Tao; TAN, Ah-hwee; WANG, We |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2007 |
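Subjects: | RDF mining; association rule mining; relation association; text mining; Computer Engineering; Databases and Information Systems |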
Online Access: | https://ink.library.smu.edu.sg/sis_research/5228 https://ink.library.smu.edu.sg/context/sis_research/article/6231/viewcontent/Mining_TKDE07.pdf |
Institution: | Singapore Management University |
id |
sg-smu-ink.sis_research-6231 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-6231 2020-07-23T18:30:40Z 2007-02-01T08:00:00Z text application/pdf info:doi/10.1109/TKDE.2007.36 http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
RDF mining association rule mining relation association text mining Computer Engineering Databases and Information Systems |
description |
Traditional text mining techniques transform free text into flat bags of words representation, which does not preserve sufficient semantics for the purpose of knowledge discovery. In this paper, we present a two-step procedure to mine generalized associations of semantic relations conveyed by the textual content of Web documents. First, RDF (resource description framework) metadata representing semantic relations are extracted from raw text using a myriad of natural language processing techniques. The relation extraction process also creates a term taxonomy in the form of a sense hierarchy inferred from WordNet. Then, a novel generalized association pattern mining algorithm (GP-Close) is applied to discover the underlying relation association patterns on RDF metadata. For pruning the large number of redundant overgeneralized patterns in relation pattern search space, the GP-Close algorithm adopts the notion of generalization closure for systematic overgeneralization reduction. The efficacy of our approach is demonstrated through empirical experiments conducted on an online database of terrorist activities. |
format |
text |
author |
JIANG, Tao TAN, Ah-hwee WANG, We |
title |
Mining generalized associations of semantic relations from textual web content |
publisher |
Institutional Knowledge at Singapore Management University |
publishDate |
2007 |
url |
https://ink.library.smu.edu.sg/sis_research/5228 https://ink.library.smu.edu.sg/context/sis_research/article/6231/viewcontent/Mining_TKDE07.pdf |
_version_ |
1770575341042008064 |