Re-engineering structures from web documents

To realize a wide range of applications (including digital libraries) on the Web, a more structured way of accessing the Web is required, and this requirement can be met by using the XML standard. In this paper, we propose a general framework for reverse engineering (or re-engineering) the u...

Bibliographic Details
Main Authors: HUE, Moh Chuang; LIM, Ee Peng; NG, Wee-Keong
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2000
Subjects: Databases and Information Systems; Numerical Analysis and Scientific Computing
Online Access: https://ink.library.smu.edu.sg/sis_research/966
https://ink.library.smu.edu.sg/context/sis_research/article/1965/viewcontent/p67_moh.pdf
Institution: Singapore Management University
Language: English
id sg-smu-ink.sis_research-1965
record_format dspace
spelling sg-smu-ink.sis_research-1965 2018-06-20T03:13:39Z Re-engineering structures from web documents HUE, Moh Chuang LIM, Ee Peng NG, Wee-Keong To realize a wide range of applications (including digital libraries) on the Web, a more structured way of accessing the Web is required, and this requirement can be met by using the XML standard. In this paper, we propose a general framework for reverse engineering (or re-engineering) the underlying structures, i.e., the DTD, from a collection of similarly structured XML documents that share some common but unknown DTDs. The essential data structures and algorithms for DTD generation have been developed, and experiments on real Web collections have been conducted to demonstrate their feasibility. In addition, we propose a method of imposing a constraint on the repetitiveness of elements in a DTD rule to further simplify the generated DTD without compromising its correctness. 2000-06-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/966 info:doi/10.1145/336597.336638 https://ink.library.smu.edu.sg/context/sis_research/article/1965/viewcontent/p67_moh.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Databases and Information Systems Numerical Analysis and Scientific Computing
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Databases and Information Systems
Numerical Analysis and Scientific Computing
spellingShingle Databases and Information Systems
Numerical Analysis and Scientific Computing
HUE, Moh Chuang
LIM, Ee Peng
NG, Wee-Keong
Re-engineering structures from web documents
description To realize a wide range of applications (including digital libraries) on the Web, a more structured way of accessing the Web is required, and this requirement can be met by using the XML standard. In this paper, we propose a general framework for reverse engineering (or re-engineering) the underlying structures, i.e., the DTD, from a collection of similarly structured XML documents that share some common but unknown DTDs. The essential data structures and algorithms for DTD generation have been developed, and experiments on real Web collections have been conducted to demonstrate their feasibility. In addition, we propose a method of imposing a constraint on the repetitiveness of elements in a DTD rule to further simplify the generated DTD without compromising its correctness.
format text
author HUE, Moh Chuang
LIM, Ee Peng
NG, Wee-Keong
author_facet HUE, Moh Chuang
LIM, Ee Peng
NG, Wee-Keong
author_sort HUE, Moh Chuang
title Re-engineering structures from web documents
title_short Re-engineering structures from web documents
title_full Re-engineering structures from web documents
title_fullStr Re-engineering structures from web documents
title_full_unstemmed Re-engineering structures from web documents
title_sort re-engineering structures from web documents
publisher Institutional Knowledge at Singapore Management University
publishDate 2000
url https://ink.library.smu.edu.sg/sis_research/966
https://ink.library.smu.edu.sg/context/sis_research/article/1965/viewcontent/p67_moh.pdf
_version_ 1770570797294813184