KEYWORD SEARCH SYSTEM BASED ON KLUSTREE METHOD
Main Author: | |
---|---|
Format: | Final Project |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/39881 |
Institution: | Institut Teknologi Bandung |
Summary: | KlusTree is a method that focuses on making keyword search results easier for users to interpret. It clusters keyword search results using a language model. The method works best on graph data, so data that lacks a graph structure must first be modelled as a graph. This study searches for the best method of modelling non-graph data as a graph. The experiment analyzes three candidate graph modelling methods, and we implement relational semantic modelling to solve the defined problem. Beyond the graph limitation, KlusTree returns a tree as the clustering result of a keyword search over a graph. We propose an additional representation of a cluster, in the form of sentences, to improve user understanding. The sentence representation is generated by adapting TextRank and the Depth First Search (DFS) technique.
Relational semantic modelling has the advantage of transforming a relational model into a graph without losing its semantics. The experiment treats a CSV file as a relational table to meet the requirements of the semantic modelling. TextRank uses a graph-based ranking method to extract keywords from a document as a summary. To perform graph-based ranking, TextRank views a document as a structured graph. A cluster that contains keyword search results in tree form can be seen as a representation of a document. We use TextRank's graph-based ranking method to find the center of a cluster. To build the sentence representation of a cluster, we use DFS to trace paths from the center of the cluster. The DFS method returns sentences until it reaches nodes that belong to relationship queries.
The experiment uses DBLP data, which contains information on computer science journals. We use both filtered DBLP data and the whole DBLP data set to test the relational semantic mapping method. We evaluate the graph structure produced by applying semantic mapping to those data. The second test evaluates the sentence representation returned by the TextRank and DFS method. Relationship queries are given to the KlusTree system, and we compare the original results in tree form with our sentence-form approach.
The proposed solution can model CSV files as graph data and generate a sentence representation of a tree cluster from keyword search results. Although the annotation of the CSV files affects the quality of the graph annotation, and thus the sentence output, the sentences remain understandable to users and give another view of the keyword search results. |
---|
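The summary describes finding the center of a tree cluster with a graph-based ranking and then tracing DFS paths from that center to build sentences. The following is a minimal sketch of that idea, not the thesis implementation. It assumes a PageRank-style power iteration as the ranking (the family of ranking TextRank builds on), a hand-made toy cluster, and one reading of the stopping rule: a path ends at the first node matched by the query. The node names, edge phrases, and the helpers `build_adjacency`, `pagerank`, and `sentences_from_center` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical tree cluster: (source node, edge phrase, target node).
EDGES = [
    ("Paper A", "was written by", "Alice"),
    ("Paper A", "was written by", "Bob"),
    ("Paper A", "was published in", "Journal X"),
    ("Paper B", "was published in", "Journal X"),
    ("Paper B", "was written by", "Carol"),
]
# Assumption: nodes that matched the keywords of a relationship query.
QUERY_NODES = {"Alice", "Carol"}


def build_adjacency(edges):
    """Undirected adjacency map; each edge carries a ready-made clause."""
    adj = defaultdict(list)
    for u, label, v in edges:
        clause = f"{u} {label} {v}"
        adj[u].append((v, clause))
        adj[v].append((u, clause))
    return adj


def pagerank(adj, damping=0.85, iterations=50):
    """Power-iteration PageRank on an undirected graph without isolated nodes."""
    nodes = list(adj)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        rank = {
            n: (1 - damping) / len(nodes)
            + damping * sum(rank[m] / len(adj[m]) for m, _clause in adj[n])
            for n in nodes
        }
    return rank


def sentences_from_center(adj, center, query_nodes):
    """DFS from the center; emit one sentence per path that reaches a query node."""
    sentences = []

    def dfs(node, parent, clauses):
        if node in query_nodes and clauses:
            sentences.append("; ".join(clauses) + ".")
            return  # assumed stopping rule: do not descend past a query node
        for neighbour, clause in adj[node]:
            if neighbour != parent:  # sufficient cycle guard for a tree
                dfs(neighbour, node, clauses + [clause])

    dfs(center, None, [])
    return sentences


adj = build_adjacency(EDGES)
ranks = pagerank(adj)
center = max(ranks, key=ranks.get)  # highest-ranked node is taken as the center
sentences = sentences_from_center(adj, center, QUERY_NODES)

for sentence in sentences:
    print(sentence)
```

In this toy cluster the best-connected node is chosen as the center, and each printed sentence corresponds to one root-to-query-node path of the tree. A real system would need proper edge annotations to produce natural phrasing, which matches the summary's remark that annotation quality affects the sentence output.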