Content-Based Feature Extraction and Extreme Learning Machine for Optimizing File Cluster Types Identification
Format: | Conference Paper |
---|---|
Published: | Springer Science and Business Media Deutschland GmbH, 2023 |
Institution: | Universiti Tenaga Nasional |
Summary: | Recent research in digital forensics attempts to classify image clusters as JPEG or non-JPEG before recovering JPEG image files. This classification step can improve JPEG recovery accuracy and reduce processing time. In this work, three content-based feature extraction methods are used. The Rate of Change (RoC) is used for tracking relevant bytes in the appropriate groups according to their order. Entropy and Byte Frequency Distribution (BFD) are used to produce a histogram of each image cluster based on byte values. Subsequently, the Extreme Learning Machine (ELM) classifier is deployed to evaluate these three features: given the generated feature vector, the ELM identifies whether a cluster belongs to a JPEG file or a non-JPEG file type. The proposed method is implemented in MATLAB 2017a and evaluated on the DFRWS dataset. The test results show that the ELM achieves high classification accuracy in identifying the file type. The difference in accuracy between the combinations of the tested features is relatively small. The worst accuracy, 72.62%, is obtained when the entropy method is used, and the best accuracy, 93.46%, is obtained with a combination of the three features. © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG. |
---|
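
The pipeline described in the summary (per-cluster features fed to an ELM) can be illustrated with a minimal Python sketch. It is not the authors' MATLAB implementation. Several details are assumptions made only for illustration: RoC is taken as the mean absolute difference between consecutive byte values (a common definition in the file-type identification literature, which may differ from the paper's), the feature scaling is arbitrary, and the helper names `byte_frequency_distribution`, `shannon_entropy`, `rate_of_change`, `extract_features` and the class `ELM` are hypothetical. The ELM shown is the basic variant with a fixed random hidden layer and output weights solved by pseudoinverse.

```python
import numpy as np


def byte_frequency_distribution(cluster: bytes) -> np.ndarray:
    """Normalised 256-bin histogram of byte values in one cluster (BFD)."""
    values = np.frombuffer(cluster, dtype=np.uint8)
    counts = np.bincount(values, minlength=256)
    return counts / max(len(cluster), 1)


def shannon_entropy(cluster: bytes) -> float:
    """Shannon entropy in bits per byte (0 = constant data, 8 = uniform)."""
    p = byte_frequency_distribution(cluster)
    p = p[p > 0]  # skip empty bins so log2 is defined
    return float(-(p * np.log2(p)).sum())


def rate_of_change(cluster: bytes) -> float:
    """Assumed RoC: mean absolute difference between consecutive bytes."""
    values = np.frombuffer(cluster, dtype=np.uint8).astype(np.int16)
    if values.size < 2:
        return 0.0
    return float(np.abs(np.diff(values)).mean())


def extract_features(cluster: bytes) -> np.ndarray:
    """Concatenate scaled entropy, scaled RoC and BFD into a 258-dim vector."""
    return np.concatenate((
        [shannon_entropy(cluster) / 8.0, rate_of_change(cluster) / 255.0],
        byte_frequency_distribution(cluster),
    ))


class ELM:
    """Basic Extreme Learning Machine for binary labels (0 or 1)."""

    def __init__(self, n_hidden: int = 64, seed: int = 0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X: np.ndarray) -> np.ndarray:
        # Sigmoid activation of the fixed random projection.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X: np.ndarray, y: np.ndarray) -> "ELM":
        # Hidden weights are drawn once at random and never trained.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        # Output weights: least-squares solution via Moore-Penrose pseudoinverse.
        self.beta = np.linalg.pinv(self._hidden(X)) @ y
        return self

    def predict(self, X: np.ndarray) -> np.ndarray:
        # Threshold the linear output at 0.5 to obtain a 0/1 label.
        return (self._hidden(X) @ self.beta > 0.5).astype(int)
```

In use, each cluster read from a disk image would be passed through `extract_features`, the resulting rows stacked into a matrix, and `ELM().fit(X, y)` trained with label 1 for JPEG clusters and 0 otherwise. Dropping entries from the feature vector gives the single-feature and pairwise combinations that the summary compares.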