Text content analysis for illicit web pages by using neural networks

Illicit web content such as pornography, violence, and gambling has greatly polluted the minds of web users, especially children and teenagers. Due to the ineffectiveness of some popular web filtering techniques like Uniform Resource Locator (URL) blocking and Platform for Internet Content Selection...

Bibliographic Details
Main Authors: Lee, Zhi Sam; Maarof, Mohd. Aizaini; Selamat, Ali; Shamsuddin, Siti Mariyam
Format: Article
Language: English
Published: Penerbit UTM Press 2009
Subjects: Q Science (General); QA75 Electronic computers. Computer science
Online Access: http://eprints.utm.my/id/eprint/21030/1/AliSelamat2009_TextContentAnalysisforIllicit.pdf
http://eprints.utm.my/id/eprint/21030/2/jurnalteknologi/article/view/168
http://eprints.utm.my/id/eprint/21030/
Institution: Universiti Teknologi Malaysia
id my.utm.21030
record_format eprints
spelling my.utm.21030 2017-11-01T04:17:20Z
Lee, Zhi Sam and Maarof, Mohd. Aizaini and Selamat, Ali and Shamsuddin, Siti Mariyam (2009) Text content analysis for illicit web pages by using neural networks. Jurnal Teknologi, 50 (D). pp. 73-91. ISSN 2180-3722 DOI:10.11113/jt.v50.168
Penerbit UTM Press 2009-06 Article PeerReviewed
institution Universiti Teknologi Malaysia
building UTM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Malaysia
content_source UTM Institutional Repository
url_provider http://eprints.utm.my/
language English
topic Q Science (General)
QA75 Electronic computers. Computer science
description Illicit web content such as pornography, violence, and gambling has greatly polluted the minds of web users, especially children and teenagers. Because some popular web filtering techniques, such as Uniform Resource Locator (URL) blocking and Platform for Internet Content Selection (PICS) checking, are ineffective against today's dynamic web content, content-based analysis techniques with an effective model are highly desired. In this paper, we propose a textual content analysis model that uses an entropy term weighting scheme to classify pornography and sex education web pages. We compare the entropy scheme with two other common term weighting schemes, TFIDF and Glasgow. These techniques were tested with an artificial neural network using a small class dataset. We found that the proposed model achieved better performance in terms of accuracy, convergence speed, and stability than the other techniques.
format Article
author Lee, Zhi Sam
Maarof, Mohd. Aizaini
Selamat, Ali
Shamsuddin, Siti Mariyam
title Text content analysis for illicit web pages by using neural networks
publisher Penerbit UTM Press
publishDate 2009
url http://eprints.utm.my/id/eprint/21030/1/AliSelamat2009_TextContentAnalysisforIllicit.pdf
http://eprints.utm.my/id/eprint/21030/2/jurnalteknologi/article/view/168
http://eprints.utm.my/id/eprint/21030/
_version_ 1643647249754882048
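
The description above compares an entropy term weighting scheme with TFIDF. The record does not give the paper's formulas, so the following is only a minimal sketch under stated assumptions: it uses the textbook TF-IDF weight and the common log-entropy weight, which may differ from the variants the authors evaluated. The function names and the whitespace tokenizer are hypothetical, and the Glasgow scheme is omitted because its definition is not stated here.

```python
import math
from collections import Counter


def term_counts(docs):
    """Count terms per document (assumed tokenizer: lowercase + whitespace split)."""
    return [Counter(doc.lower().split()) for doc in docs]


def tfidf_weights(counts):
    """Textbook TF-IDF: raw term frequency times log(N / document frequency)."""
    n_docs = len(counts)
    df = Counter(term for c in counts for term in c)
    return [
        {t: tf * math.log(n_docs / df[t]) for t, tf in c.items()}
        for c in counts
    ]


def log_entropy_weights(counts):
    """Common log-entropy weight (an assumption, not confirmed by the record).

    local weight  = log(1 + tf)
    global weight = 1 + sum_j(p_j * log p_j) / log N, with p_j = tf_j / total tf
    """
    n_docs = len(counts)
    totals = Counter()
    for c in counts:
        totals.update(c)

    global_w = {}
    for term, total in totals.items():
        # Sum of p * log(p) over the documents that contain the term.
        h = sum(
            (c[term] / total) * math.log(c[term] / total)
            for c in counts
            if term in c
        )
        # With a single document the entropy term is undefined; fall back to 1.
        global_w[term] = (1.0 + h / math.log(n_docs)) if n_docs > 1 else 1.0

    return [
        {t: math.log(1 + tf) * global_w[t] for t, tf in c.items()}
        for c in counts
    ]


if __name__ == "__main__":
    counts = term_counts(["a b", "a c", "a d d"])
    print(tfidf_weights(counts)[2])
    print(log_entropy_weights(counts)[2])
```

In both schemes a term spread evenly over every document (here "a") receives a weight near zero, while a term concentrated in one document (here "d") keeps its full local weight. Vectors like these would then be the inputs to a classifier such as the neural network mentioned in the description.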