THE DATA STREAM CLUSTERING APPLICATION FRAMEWORK FOR TEXT ANALYSIS

Social media is now widely used, so the data its users generate holds great potential. One use is grouping data that carry similar information. Data stream techniques suit this task: they process data in small pieces while reacting immediately to changes in d...

Bibliographic Details
Main Author: Daffa Dinaya, Muhammad
Format: Final Project
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/72011
Institution: Institut Teknologi Bandung
id id-itb.:72011
keywords Data stream, text data, CluStream, application framework
timestamp 2023-03-01T15:13:18Z
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
description Social media is now widely used, so the data its users generate holds great potential. One use is grouping data that carry similar information. Data stream techniques suit this task: they process data in small pieces while reacting immediately to changes in the data. Previous research used a data stream processing engine, Apache Flink, which was fairly difficult to work with because it required submitting compiled programs to a distributed system and writing them in Java. This makes text processing hard, since text-processing tooling is more developed in Python. To address this problem, a framework was developed to ease the development of data stream clustering applications on Apache Flink Statefun and FastAPI. The framework reads a user configuration and then runs the processes built from it to process the data stream. It also supports customization: users can implement a process of their own and have the framework run it like a built-in one. The framework speeds up development by cutting down the source code that users have to write themselves. This speed-up applies whenever users employ processes, whether provided by the framework or customized.
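The configuration-driven workflow described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation and uses neither Apache Flink Statefun nor FastAPI. The registry, the `register` decorator, the process names, and the configuration layout are all hypothetical, chosen only to show how a user configuration could select built-in and custom processes that are then applied to each record of a stream.

```python
from typing import Callable, Dict, Iterable, Iterator, List

# Hypothetical interface: a "process" maps one record (a dict) to a new record.
Process = Callable[[dict], dict]

# Hypothetical registry that lets a configuration refer to processes by name.
REGISTRY: Dict[str, Process] = {}


def register(name: str) -> Callable[[Process], Process]:
    """Return a decorator that stores a process in the registry under `name`."""
    def decorator(fn: Process) -> Process:
        REGISTRY[name] = fn
        return fn
    return decorator


# Two processes standing in for ones a framework might provide.
@register("lowercase")
def lowercase(record: dict) -> dict:
    return {**record, "text": record["text"].lower()}


@register("tokenize")
def tokenize(record: dict) -> dict:
    return {**record, "tokens": record["text"].split()}


# A user-defined process, registered the same way as the built-in ones.
@register("drop_short_tokens")
def drop_short_tokens(record: dict) -> dict:
    return {**record, "tokens": [t for t in record["tokens"] if len(t) > 2]}


def build_pipeline(config: dict) -> List[Process]:
    """Look up each process named in the configuration, preserving order."""
    return [REGISTRY[name] for name in config["processes"]]


def run(pipeline: List[Process], stream: Iterable[dict]) -> Iterator[dict]:
    """Apply every process to each record as it arrives, one record at a time."""
    for record in stream:
        for process in pipeline:
            record = process(record)
        yield record


# Assumed configuration layout: an ordered list of process names.
config = {"processes": ["lowercase", "tokenize", "drop_short_tokens"]}
stream = [{"text": "Flink is a Stream engine"}, {"text": "Text data in Python"}]
results = list(run(build_pipeline(config), stream))
```

In this sketch the user writes only the configuration and any custom process. The pipeline construction and the per-record loop are shared code, which is one way a framework of this kind could reduce what users have to write.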