Mining generalized associations of semantic relations from textual web content

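Traditional text mining techniques transform free text into flat bags of words representation, which does not preserve sufficient semantics for the purpose of knowledge discovery. In this paper, we present a two-step procedure to mine generalized associations of semantic relations conveyed by the textual content of Web documents. First, RDF (resource description framework) metadata representing semantic relations are extracted from raw text using a myriad of natural language processing techniques. The relation extraction process also creates a term taxonomy in the form of a sense hierarchy inferred from WordNet. Then, a novel generalized association pattern mining algorithm (GP-Close) is applied to discover the underlying relation association patterns on RDF metadata. For pruning the large number of redundant overgeneralized patterns in relation pattern search space, the GP-Close algorithm adopts the notion of generalization closure for systematic overgeneralization reduction. The efficacy of our approach is demonstrated through empirical experiments conducted on an online database of terrorist activities.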

Bibliographic Details
Main Authors: JIANG, Tao, TAN, Ah-hwee, WANG, We
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2007
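Subjects: RDF mining; association rule mining; relation association; text mining; Computer Engineering; Databases and Information Systems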
Online Access: https://ink.library.smu.edu.sg/sis_research/5228
https://ink.library.smu.edu.sg/context/sis_research/article/6231/viewcontent/Mining_TKDE07.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-6231
record_format dspace
spelling sg-smu-ink.sis_research-6231
2020-07-23T18:30:40Z
Mining generalized associations of semantic relations from textual web content
JIANG, Tao
TAN, Ah-hwee
WANG, We
Traditional text mining techniques transform free text into flat bags of words representation, which does not preserve sufficient semantics for the purpose of knowledge discovery. In this paper, we present a two-step procedure to mine generalized associations of semantic relations conveyed by the textual content of Web documents. First, RDF (resource description framework) metadata representing semantic relations are extracted from raw text using a myriad of natural language processing techniques. The relation extraction process also creates a term taxonomy in the form of a sense hierarchy inferred from WordNet. Then, a novel generalized association pattern mining algorithm (GP-Close) is applied to discover the underlying relation association patterns on RDF metadata. For pruning the large number of redundant overgeneralized patterns in relation pattern search space, the GP-Close algorithm adopts the notion of generalization closure for systematic overgeneralization reduction. The efficacy of our approach is demonstrated through empirical experiments conducted on an online database of terrorist activities.
2007-02-01T08:00:00Z
text
application/pdf
https://ink.library.smu.edu.sg/sis_research/5228
info:doi/10.1109/TKDE.2007.36
https://ink.library.smu.edu.sg/context/sis_research/article/6231/viewcontent/Mining_TKDE07.pdf
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
eng
Institutional Knowledge at Singapore Management University
RDF mining
association rule mining
relation association
text mining
Computer Engineering
Databases and Information Systems
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic RDF mining
association rule mining
relation association
text mining
Computer Engineering
Databases and Information Systems
description Traditional text mining techniques transform free text into flat bags of words representation, which does not preserve sufficient semantics for the purpose of knowledge discovery. In this paper, we present a two-step procedure to mine generalized associations of semantic relations conveyed by the textual content of Web documents. First, RDF (resource description framework) metadata representing semantic relations are extracted from raw text using a myriad of natural language processing techniques. The relation extraction process also creates a term taxonomy in the form of a sense hierarchy inferred from WordNet. Then, a novel generalized association pattern mining algorithm (GP-Close) is applied to discover the underlying relation association patterns on RDF metadata. For pruning the large number of redundant overgeneralized patterns in relation pattern search space, the GP-Close algorithm adopts the notion of generalization closure for systematic overgeneralization reduction. The efficacy of our approach is demonstrated through empirical experiments conducted on an online database of terrorist activities.
format text
author JIANG, Tao
TAN, Ah-hwee
WANG, We
author_sort JIANG, Tao
title Mining generalized associations of semantic relations from textual web content
publisher Institutional Knowledge at Singapore Management University
publishDate 2007
url https://ink.library.smu.edu.sg/sis_research/5228
https://ink.library.smu.edu.sg/context/sis_research/article/6231/viewcontent/Mining_TKDE07.pdf
_version_ 1770575341042008064