DETERMINING HOME-BASED WORK TRIP BASED ON TWITTER DATA USING MACHINE LEARNING


Bibliographic Details
Main Author: Sora Rayat, Rempu
Format: Dissertations
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/87177
Institution: Institut Teknologi Bandung
Description
Summary: This research aims to predict the number of Home-Based Work (HBW) trips at the zonal level using Twitter data and Machine Learning (ML) approaches. It concludes that Twitter data alone is not effective, and that integrating Twitter data with Home-Interview (HI) survey data yields better model performance, increasing the accuracy with which the model predicts worker trip rates per zone. The ML approach is used to predict the values of the explanatory variables in the prediction model, whose target variable is the worker trip rate per zone. Ordinary Least Squares (OLS) is used to predict HBW trip production per zone, where the amount of data is unbalanced across urban zones. Twitter data from 2018 to 2021 was used to obtain information on residence location, workplace location, employment status and occupation type, education level, income level, ownership of two-wheeled vehicles (motorbikes) and four-wheeled vehicles (cars), distance from residence to workplace, and number of daily HBW trips. As support, 2018 HI data is used to provide more comprehensive socio-economic information and trip patterns. The data integration process involved matching, across the Twitter data and the HI survey data, individual origin zones, employment status, occupation type, education level, income level, vehicle ownership (two-wheelers: motorbikes; four-wheelers: cars), the distance between home and work locations, and the number of daily HBW trips. The prediction model based on Twitter data integrated with 2018 HI data shows superior performance compared to the model using only Twitter data. The OLS method provides a coefficient for each explanatory variable in the model, which cannot be obtained with the Artificial Neural Network (ANN) method. This model was then used to estimate HBW trip production for each zone in the research area based on Twitter data for 2018, 2019, 2020, and 2021.
The case study was conducted in Serang City, Indonesia.
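
As an illustration of the OLS step described in the summary, the following minimal sketch fits a worker trip rate per zone against a few zonal explanatory variables and reads off one coefficient per variable. It is not the dissertation's model: the feature columns, the synthetic values, the function names fit_ols and predict_trip_rate, and the final step of multiplying the trip rate by the number of workers to obtain trip production are all assumptions made for this example.

```python
import numpy as np

def fit_ols(X, y):
    """Fit y = b0 + X @ b by ordinary least squares; return (intercept, coefs)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first fitted parameter is the intercept.
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta[0], beta[1:]

def predict_trip_rate(intercept, coefs, X):
    """Predicted worker trip rate for each row (zone) of X."""
    return intercept + np.asarray(X, dtype=float) @ coefs

# Hypothetical zonal explanatory variables (one row per zone):
# [share of employed users, mean home-to-work distance in km, motorbikes per person]
X = np.array([
    [0.50, 4.0, 0.30],
    [0.60, 6.0, 0.70],
    [0.45, 3.0, 0.55],
    [0.70, 8.0, 0.40],
    [0.55, 5.0, 0.80],
    [0.65, 2.0, 0.60],
])

# Synthetic, noise-free target built from made-up coefficients, used only
# to show that OLS recovers an interpretable coefficient per variable.
true_intercept = 0.2
true_coefs = np.array([1.5, 0.03, 0.4])
y = true_intercept + X @ true_coefs

intercept, coefs = fit_ols(X, y)

# Assumption: zonal HBW trip production = trip rate x number of workers.
workers_per_zone = np.array([1200, 900, 1500, 700, 1100, 1000])  # hypothetical
hbw_production = predict_trip_rate(intercept, coefs, X) * workers_per_zone

for name, value in zip(["employed_share", "distance_km", "motorbikes"], coefs):
    print(f"{name}: {value:.3f}")
print(hbw_production.round(1))
```

The fitted coefficients can be inspected directly, which is the interpretability advantage the summary attributes to OLS over the ANN method.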