Rule Identification from Web Pages by the XRML Approach

In the world of Web pages, there are oceans of documents in natural language texts and tables. To extract rules from Web pages and maintain consistency between them, we have developed the framework of XRML (eXtensible Rule Markup Language). XRML allows the identification of rules on Web pages and ge...

Bibliographic Details
Main Authors:	KANG, Juyoung; LEE, Jae Kyu
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2005
Subjects:	Rule identification; Rule acquisition; Knowledge engineering; Knowledge acquisition; XRML; RuleML; XML; Computer Sciences; Management Information Systems
Online Access:	https://ink.library.smu.edu.sg/sis_research/1181 http://dx.doi.org/10.1016/j.dss.2005.01.004
Institution:	Singapore Management University

id	sg-smu-ink.sis_research-2180
record_format	dspace
spelling	sg-smu-ink.sis_research-2180 2010-12-22T08:24:06Z Rule Identification from Web Pages by the XRML Approach KANG, Juyoung LEE, Jae Kyu 2005-01-01T08:00:00Z text https://ink.library.smu.edu.sg/sis_research/1181 info:doi/10.1016/j.dss.2005.01.004 http://dx.doi.org/10.1016/j.dss.2005.01.004 Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Rule identification Rule acquisition Knowledge engineering Knowledge acquisition XRML RuleML XML Computer Sciences Management Information Systems
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	Rule identification Rule acquisition Knowledge engineering Knowledge acquisition XRML RuleML XML Computer Sciences Management Information Systems
description	In the world of Web pages, there are oceans of documents in natural language texts and tables. To extract rules from Web pages and maintain consistency between them, we have developed the framework of XRML (eXtensible Rule Markup Language). XRML allows the identification of rules on Web pages and generates the identified rules automatically. For this purpose, we have designed the Rule Identification Markup Language (RIML), which is similar to the formal Rule Structure Markup Language (RSML), both as parts of XRML. RIML 2.0 is designed to identify rules not only from texts, but also from tables on Web pages, and to transform to the formal rules in RSML syntax automatically. While designing RIML 2.0, we considered the features of sharing variables and values, omitted terms, and synonyms. We have conducted an experiment to evaluate the potential benefit of the XRML approach with real world Web pages of Amazon.com, BarnesandNoble.com, and Powells.com. We found that 100.0% of the rules and 99.7% of the rule components could be identified and automatically generated if we do not count the statements for linkages, which generically do not exist on the Web pages. Since the linkage components occupy 11.2% of all components in the rule base, the overall limitation of automatic rule generation is 88.8%. In this setting, 88.5% of the overall rule components could be generated from the identified rules from the Web pages. The result provides solid proof that XRML can facilitate the extraction and maintenance of rules from Web pages while building expert systems in the Semantic Web environment.
format	text
author	KANG, Juyoung LEE, Jae Kyu
author_sort	KANG, Juyoung
title	Rule Identification from Web Pages by the XRML Approach
title_sort	rule identification from web pages by the xrml approach
publisher	Institutional Knowledge at Singapore Management University
publishDate	2005
url	https://ink.library.smu.edu.sg/sis_research/1181 http://dx.doi.org/10.1016/j.dss.2005.01.004
_version_	1770570871369367552
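
Note on the figures in the description: the abstract quotes three percentages (11.2%, 88.8%, 88.5%) without showing how they relate. The short check below is only a sketch under one assumption, namely that the overall rate is the 99.7% identification rate applied to the non-linkage share of components. The record does not state the authors' exact calculation, and the variable names are illustrative.

```python
# Figures quoted in the abstract, as percentages of rule components.
linkage_share = 11.2            # components that are linkage statements
identified_non_linkage = 99.7   # non-linkage components identified and generated

# Upper bound on automatic generation: everything except linkage statements,
# since those do not appear on the Web pages.
upper_bound = 100.0 - linkage_share

# Assumption (not stated in the record): overall rate = identified share
# applied to that upper bound.
overall = upper_bound * identified_non_linkage / 100.0

print(round(upper_bound, 1))  # 88.8
print(round(overall, 1))      # 88.5
```

Under that assumption the numbers agree with the abstract: 88.8% is the ceiling for automatic generation, and 88.5% is what was reached against it.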
