Confidentiality based file attributes and data classification using TsF-KNN

Bibliographic Details
Main Authors: Ali, M., Jung, L.T.
Format: Conference or Workshop Item
Published: Institute of Electrical and Electronics Engineers Inc. 2015
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-84961718321&doi=10.1109%2fICITCS.2015.7292963&partnerID=40&md5=116cec69b61b578ad955d27cec18a780
http://eprints.utp.edu.my/31637/
Institution: Universiti Teknologi Petronas
Description
Summary: Machine Learning (ML) plays an important role in electronic data management. Managing data manually, without ML or with ML that uses metadata, is costly and difficult. Many ML algorithms have been proposed to solve different data management issues, but predicting which data in a file are confidential and which are non-confidential remains an open research problem. A file cannot be assigned to a single category/class because the data in a single file may fall into different categories/classes. The main objective of this study is to predict the confidential and non-confidential data of a file using the K-NN algorithm. We also propose a method called the Training dataset Filtration Key Nearest Neighbour (TsF-KNN) classifier, which classifies the data of a file based on the confidentiality level of the file's schema (file attributes). The proposed algorithm, TsF-KNN, is efficient in terms of time and has higher accuracy than the traditional K-NN algorithm. © 2015 IEEE.
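
The summary does not say how TsF-KNN filters the training dataset, so the following Python sketch is only one possible reading of the idea, not the authors' algorithm. It assumes each training row carries the confidentiality level of the file attribute it came from (the hypothetical `level` field), filters the training set to rows whose level matches that of the query's attribute, and then runs plain k-NN on the reduced set. The function names, field names, fallback rule, and toy data are all illustrative assumptions.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Plain k-NN: majority label among the k nearest training rows."""
    nearest = sorted(train, key=lambda row: math.dist(row["features"], query))[:k]
    votes = Counter(row["label"] for row in nearest)
    return votes.most_common(1)[0][0]


def filtered_knn_predict(train, query, level, k=3):
    """Assumed filtration step followed by plain k-NN.

    Keeps only training rows whose attribute confidentiality level equals
    `level`. This filter is a guess at what "training dataset filtration"
    could mean; the source does not define it.
    """
    filtered = [row for row in train if row["level"] == level]
    # Hypothetical fallback: use the full set if the filter leaves too few rows.
    if len(filtered) < k:
        filtered = train
    return knn_predict(filtered, query, k)


# Toy data (invented for illustration): two numeric features per data item.
train = [
    {"level": "high", "features": (1.0, 1.0), "label": "confidential"},
    {"level": "high", "features": (1.2, 0.9), "label": "confidential"},
    {"level": "high", "features": (5.0, 5.0), "label": "non-confidential"},
    {"level": "low", "features": (1.1, 1.0), "label": "non-confidential"},
    {"level": "low", "features": (0.9, 1.1), "label": "non-confidential"},
    {"level": "low", "features": (1.0, 0.9), "label": "non-confidential"},
]
query = (1.0, 1.0)

plain_result = knn_predict(train, query, k=3)
filtered_result = filtered_knn_predict(train, query, level="high", k=3)
```

In this toy case the unfiltered vote is dominated by nearby rows from a low-confidentiality attribute, while the filtered vote only considers rows from the matching attribute level. Filtering also reduces the number of distance computations per query, which is one plausible source of the time saving the summary mentions.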