#TITLE_ALTERNATIVE#
Main Author: | |
---|---|
Format: | Theses |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/11389 |
Institution: | Institut Teknologi Bandung |
Summary: | Data mining is the core step of KDD (Knowledge Discovery in Databases), which extracts knowledge from large databases. One data mining method is classification, the search for a model that can distinguish the class labels of objects. Bayesian networks are one technique that can be used to build a classification model. A Bayesian network consists of two components: a DAG structure that describes the causal relationships among data attributes, and a table of conditional probabilities for each attribute given its preceding attributes. Many algorithms have been developed to construct Bayesian network structures, either from complete databases or from incomplete databases (those containing missing values).
The CB* algorithm, which combines two approaches, dependency analysis and search-and-scoring, is one algorithm for constructing a Bayesian network structure from incomplete databases. It consists of two phases: phase one produces a node ordering, and phase two constructs the DAG structure of the Bayesian network.
The objective of this thesis is to evaluate the CB* algorithm from a functional point of view, namely whether it can generate a node ordering that yields a structure Markov equivalent to the original structure, whether it can construct a Bayesian network from incomplete databases, whether the amount of missing values leaves the resulting structure unaffected, and whether it can construct a Bayesian network structure without prior information.
Based on experiments with software and the evaluation carried out, the CB* algorithm can by itself generate a node ordering in its first phase, and this ordering can be used as input to the second phase. The CB* algorithm is able to construct a Bayesian network structure from incomplete databases because phase two can handle missing values. The amount of missing values has no significant influence on the resulting structure: whether a database contains many or few missing values, the structure produced may be the same as or different from the structure learned from complete data. This is due to the number of data samples and the combinations or patterns of data in the database. The point emphasized in this thesis is that, no matter how many missing values occur in the database, the CB* algorithm can still produce a Bayesian network structure. The CB* algorithm is also able to work without prior information for handling missing values; that is, it does not fill in the missing values but constructs the Bayesian network structure directly from the available data. |
---|
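
The summary evaluates whether a learned structure is Markov equivalent to the original one. As a minimal sketch of what such a comparison could look like, the Python below applies the standard criterion that two DAGs over the same nodes are Markov equivalent when they share the same skeleton and the same v-structures. It is not taken from the thesis and is not part of the CB* algorithm itself; the function names, the edge-list representation, and the three example graphs are assumptions made for illustration, and the inputs are assumed to be valid DAGs.

```python
from itertools import combinations


def skeleton(edges):
    """Return the undirected skeleton of a DAG as a set of frozenset pairs."""
    return {frozenset(edge) for edge in edges}


def v_structures(edges):
    """Return colliders a -> c <- b in which a and b are not adjacent."""
    skel = skeleton(edges)
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
    found = set()
    for child, pa in parents.items():
        for a, b in combinations(sorted(pa), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found


def markov_equivalent(edges1, edges2):
    """Compare two DAGs (hypothetical helper): same skeleton and same v-structures."""
    return (skeleton(edges1) == skeleton(edges2)
            and v_structures(edges1) == v_structures(edges2))


# Illustrative graphs only; they do not come from the thesis experiments.
chain = [("A", "B"), ("B", "C")]           # A -> B -> C
reversed_chain = [("C", "B"), ("B", "A")]  # C -> B -> A
collider = [("A", "B"), ("C", "B")]        # A -> B <- C

print(markov_equivalent(chain, reversed_chain))  # True
print(markov_equivalent(chain, collider))        # False
```

The two chains encode the same conditional independencies even though every edge is reversed, while the collider does not, which is why a node ordering can differ from the original and still lead to an equivalent structure.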