Problem restructuring in integer programming for reduct searching
Main Author: | |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2003 |
Subjects: | |
Online Access: | http://psasir.upm.edu.my/id/eprint/8702/1/FSKTM_2003_2%20IR.pdf http://psasir.upm.edu.my/id/eprint/8702/ |
Institution: | Universiti Putra Malaysia |
Summary: | Standard Integer Programming / Decision Related Integer Programming (SIP/DRIP) is a
reduct searching system that finds the reducts in an information system. These reducts
are the minimal sets of attributes of the information system that are useful in classification tasks.
They can describe the whole information system with respect to discernibility. In
effect, they are very useful for generating rules when solving the classification problem
that is inherent in data mining.
The thesis focuses mainly on improving the performance of the original SIP/DRIP
algorithm. By using problem restructuring, search time and memory use
are reduced. At the same time, the approach adheres to an essential criterion of the
original method: to improve performance without sacrificing the quality of the
reduct.
Problem restructuring changes the input of SIP/DRIP without losing any of the input's
essential properties. In other words, no loss of knowledge occurs with problem
restructuring. Only the structural order changes, with the contents kept intact. In
principle, this ensures that no adverse distortion is introduced within SIP/DRIP.
Restructuring is done by speculating a promising structure for the input to SIP/DRIP
based on the potential of the attributes in the information system. It uses an inexpensive
approach to predict this potential: simply the total covering of each
attribute within the information system. Although this measurement is just an
approximation, it can be shown to work.
For the experiments, five data sets were taken from the
UCI Machine Learning Repository. Results are measured by comparing the
performance of SIP/DRIP with and without problem restructuring. In addition, the lengths
of the reducts generated by both approaches are compared to ensure that no quality
deterioration occurred along the way.
Experimental results show that problem restructuring generally improves
SIP/DRIP for all the data sets. This means that, on average, it enhances the
performance of SIP/DRIP. Time and space consumption were reduced quite
significantly. Furthermore, the quality of the solutions was successfully maintained:
there was no increase in reduct length when using it.
The concept offered by the approach is an additional component to SIP/DRIP. It
complements the search process. By giving more consideration to the initial
problem space and not just to the search for the solution, the performance of SIP/DRIP
has been modestly improved. |
---|
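
The restructuring idea described in the summary can be illustrated with a minimal Python sketch. It is not the thesis's implementation. It assumes that the "total covering" of an attribute is the number of discernibility entries (pairs of objects with different decisions) in which that attribute differs, and that attributes are reordered from highest to lowest covering. Both the definition and the ordering direction are assumptions, and all function names are hypothetical.

```python
from itertools import combinations


def discernibility_rows(objects, decisions):
    """Return, for each pair of objects with different decisions,
    the set of attribute indices on which the two objects differ."""
    rows = []
    for (i, a), (j, b) in combinations(enumerate(objects), 2):
        if decisions[i] != decisions[j]:
            diff = frozenset(k for k, (x, y) in enumerate(zip(a, b)) if x != y)
            if diff:
                rows.append(diff)
    return rows


def total_covering(rows, n_attributes):
    """Count how many discernibility rows each attribute appears in
    (assumed meaning of 'total covering')."""
    counts = [0] * n_attributes
    for row in rows:
        for k in row:
            counts[k] += 1
    return counts


def restructure(objects, decisions):
    """Reorder the attribute columns by descending covering.
    Only the column order changes; the values are kept intact."""
    n_attributes = len(objects[0])
    rows = discernibility_rows(objects, decisions)
    counts = total_covering(rows, n_attributes)
    # Stable sort: attributes with equal covering keep their original order.
    order = sorted(range(n_attributes), key=lambda k: -counts[k])
    reordered = [tuple(obj[k] for k in order) for obj in objects]
    return order, counts, reordered


# Toy information system: three condition attributes, one decision.
objects = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0)]
decisions = [0, 0, 1, 1]

order, counts, reordered = restructure(objects, decisions)
print(order)   # attribute indices in their new order
print(counts)  # covering of each original attribute
```

In this toy table, attributes 0 and 2 each discern all four decision-relevant pairs while attribute 1 discerns only two, so attribute 1 moves to the end. The reordered table would then be passed to the reduct search in place of the original; every object keeps the same values, only in a different column order.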