Development of data mining and knowledge discovery system
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | 2014 |
Subjects: | |
Online Access: | http://hdl.handle.net/10356/59125 |
Institution: | Nanyang Technological University |
Summary: | This project implemented a data mining web application powered by open-source data mining packages such as Weka and Mahout. Traditionally, users who want to use these packages must download them and set them up on their own machines. This approach has two drawbacks: first, the setup process can be quite involved, and second, the user cannot process large data sets on a single machine. This calls for a data mining web application that is ready to use, provides a friendly user interface, and leverages the wide range of capabilities and the scalability of open-source packages. Some websites with the same purpose exist, such as BigML; however, these websites offer a very limited number of data mining algorithms. This project has implemented a data mining solution comprising 1) Weka, a data mining package developed by the University of Waikato, 2) Apache Mahout, a project of the Apache Software Foundation to produce free implementations of distributed or otherwise scalable machine learning algorithms, and 3) a user-friendly web interface. Through this project, we found that combining the capabilities of these packages in a web application is very promising. The system can be further improved by 1) allowing the user to interact with the results of the data mining process, 2) migrating the CSS framework to the latest version, 3) adding a notification system, and 4) including more types of plots for visualization. |