SOFTWARE DEVELOPMENT OF STREAMING ENSEMBLE ALGORITHM (SEA): ALGORITHM FOR LARGE-SCALE CLASSIFICATION

Bibliographic Details
Main Author: DHANESWARA (NIM 23505012), GIRI
Format: Theses
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/7978
Institution: Institut Teknologi Bandung
Description
Summary: Data mining was not originally designed to handle streaming data (data that grows continuously over time, at an ever-increasing rate), but the need to process such data has now emerged. This has led many researchers to study how data mining can handle data streams, a field known as mining data streams. Data streams also suffer from the concept drift problem (a change in the concept captured by a model that has already been built). In 2001, W. Nick Street and YongSeog Kim developed the Streaming Ensemble Algorithm (SEA) to handle data streams and concept drift in data mining for classification tasks. SEA is based on the ensemble method. To handle data streams, SEA learns from the newest data blocks. To handle concept drift, SEA replaces a base classifier that no longer fits the data with a more appropriate one. The SEA software was built using the Unified Process development methodology and tested using the black-box method. Experiments were conducted on census data. The objective of the experiments was to determine how SEA's parameters affect accuracy, the occurrence of concept drift, training time, and memory usage. The experimental results show an accuracy of 83.09% on the training data and 82.46% on the test data.
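The block-by-block learning and classifier replacement described in the summary can be illustrated with a minimal Python sketch. It is not the thesis software or the published SEA procedure: the names `Stump`, `BlockEnsemble`, `learn_block`, and `max_size` are hypothetical, the base learner is a toy one-feature threshold rule on binary labels, and plain accuracy on the newest block stands in for SEA's own quality measure. For brevity the candidate is also scored on the block it was trained on, which is optimistic.

```python
from collections import Counter


class Stump:
    """Hypothetical base learner: a threshold rule on a single numeric feature."""

    def fit(self, xs, ys):
        best = None
        # Try every observed value as a threshold, with both label orientations.
        for t in sorted(set(xs)):
            for left, right in ((0, 1), (1, 0)):
                hits = sum((left if x <= t else right) == y for x, y in zip(xs, ys))
                if best is None or hits > best[0]:
                    best = (hits, t, left, right)
        _, self.t, self.left, self.right = best
        return self

    def predict(self, x):
        return self.left if x <= self.t else self.right


def accuracy(model, xs, ys):
    """Fraction of examples in the block that the model labels correctly."""
    return sum(model.predict(x) == y for x, y in zip(xs, ys)) / len(ys)


class BlockEnsemble:
    """Fixed-size ensemble that is updated one data block at a time."""

    def __init__(self, max_size=3):
        self.max_size = max_size  # assumed parameter: maximum number of members
        self.members = []

    def predict(self, x):
        # Unweighted majority vote over all current members.
        votes = Counter(m.predict(x) for m in self.members)
        return votes.most_common(1)[0][0]

    def learn_block(self, xs, ys):
        # Learn only from the newest block.
        candidate = Stump().fit(xs, ys)
        if len(self.members) < self.max_size:
            self.members.append(candidate)
            return
        # Score existing members on the newest block and replace the weakest
        # one if the candidate does better (simplified stand-in for SEA's rule).
        scores = [accuracy(m, xs, ys) for m in self.members]
        worst = min(range(len(scores)), key=scores.__getitem__)
        if accuracy(candidate, xs, ys) > scores[worst]:
            self.members[worst] = candidate
```

Under these assumptions, feeding the ensemble several blocks drawn from one concept and then blocks drawn from a flipped concept causes the stale members to be replaced one per block, so the majority vote follows the new concept once more than half of the members have been swapped out.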