A knowledge based system for automatic classification of web pages

Bibliographic Details
Main Author: Fathy, Sherif Kassem
Format: Conference or Workshop Item
Language: English
Published: 2006
Online Access: http://repo.uum.edu.my/11537/1/564.pdf
http://repo.uum.edu.my/11537/
http://www.kmice.cms.net.my/
Description
Summary: The paper describes the design and implementation of a new knowledge-based system, the Automatic Information Retrieval DataBase (AIRDB). AIRDB helps the end user cluster and classify web pages on the basis of information filtering combined with an Artificial Neural Network (ANN). The classification depends mainly on keyword indexes. A large sample set of 11,043 web pages in several formats was collected automatically and randomly from various sources. The AIRDB feature selection algorithm is summarized. Feature selection depends on the stemmed words of a web page. Each stem is generated with a local profile, which records the weight of the stem for each possibly related class of web pages. A statistical analysis process that reduces noisy stems is illustrated. The various components of AIRDB are described. The knowledge-based system was tested with various web pages that disseminate their content in English. The average discrimination performance of AIRDB reaches 84%.
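
The stem-profile idea in the summary can be illustrated with a minimal Python sketch. It is not the AIRDB implementation: the record does not give the stemmer, the weighting formula, or the statistical noise test, so everything here is an assumption made for illustration. The suffix-stripping `stem` function stands in for a real stemmer, the weight is taken to be the share of a stem's occurrences that fall in each class, and `drop_noise` with its `min_weight` threshold is a hypothetical stand-in for the statistical analysis step.

```python
from collections import Counter, defaultdict

# Assumption: a crude suffix stripper stands in for whatever stemmer AIRDB uses.
SUFFIXES = ("ing", "ed", "es", "s")


def stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_profiles(pages):
    """Build a local profile {class: weight} for every stem.

    pages is an iterable of (text, class_label) pairs. The weight is assumed
    to be the fraction of the stem's occurrences observed in that class.
    """
    counts = defaultdict(Counter)
    for text, label in pages:
        for token in text.lower().split():
            if token.isalpha():
                counts[stem(token)][label] += 1
    profiles = {}
    for s, per_class in counts.items():
        total = sum(per_class.values())
        profiles[s] = {c: n / total for c, n in per_class.items()}
    return profiles


def drop_noise(profiles, min_weight=0.6):
    """Hypothetical noise filter: keep stems dominated by a single class."""
    return {s: p for s, p in profiles.items() if max(p.values()) >= min_weight}


if __name__ == "__main__":
    sample = [
        ("football players scored goals", "sports"),
        ("players traded stocks", "finance"),
        ("stocks markets", "finance"),
    ]
    profiles = build_profiles(sample)
    for s, profile in sorted(drop_noise(profiles).items()):
        print(s, profile)
```

In this toy sample the stem "player" is split evenly between the two classes, so the filter discards it, while "stock" appears only in finance pages and is kept. The surviving profiles could then serve as input features for a classifier such as the ANN the summary mentions.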