A fair evaluation of the potential of machine learning in maritime transportation

Machine learning (ML) techniques are extensively applied to practical maritime transportation issues. Due to the difficulty and high cost of collecting large volumes of data in the maritime industry, in many maritime studies, ML models are trained with small training datasets. The relative predictiv...

Full description

Saved in:
Bibliographic Details
Main Authors: Luo, Xi, Yan, Ran, Wang, Shuaian, Zhen, Lu
Other Authors: School of Civil and Environmental Engineering
Format: Article
Language:English
Published: 2024
Subjects:
Online Access:https://hdl.handle.net/10356/173568
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Nanyang Technological University
Language: English
Description
Summary:Machine learning (ML) techniques are extensively applied to practical maritime transportation issues. Due to the difficulty and high cost of collecting large volumes of data in the maritime industry, in many maritime studies, ML models are trained with small training datasets. The relative predictive performances of these trained ML models are then compared with each other and with the conventional model using the same test set. The ML model that performs the best out of the ML models and better than the conventional model on the test set is regarded as the most effective in terms of this prediction task. However, in scenarios with small datasets, this common process may lead to an unfair comparison between the ML and the conventional model. Therefore, we propose a novel process to fairly compare multiple ML models and the conventional model. We first select the best ML model in terms of predictive performance for the validation set. Then, we combine the training and the validation sets to retrain the best ML model and compare it with the conventional model on the same test set. Based on historical port state control (PSC) inspection data, we examine both the common process and the novel process in terms of their ability to fairly compare ML models and the conventional model. The results show that the novel process is more effective at fairly comparing the ML models with the conventional model on different test sets. Therefore, the novel process enables a fair assessment of ML models’ ability to predict key performance indicators in the context of limited data availability in the maritime industry, such as predicting the ship fuel consumption and port traffic volume, thereby enhancing their reliability for real-world applications.