Deep reinforcement learning for optimal resource allocation (II)

As vending machines become increasingly intelligent, enabling real-time updates of stock levels, the manual task of restocking remains a logistical challenge. This paper addresses the efficient restocking of vending machines via the Capacitated Vehicle Routing Problem (CVRP), focusing on optimizing routes for limited-capacity vehicles to meet demand without exceeding capacity, while minimizing costs. Traditional heuristics such as LKH-3 have shown robust performance on CVRP but face limitations in scalability and adaptability. This study compares two learning-based approaches, L2D (deep reinforcement learning) and NCO (a light-encoder, heavy-decoder architecture), against the LKH-3 algorithm. Through detailed experimentation, we evaluate their scalability, computational efficiency, and solution quality. Our findings reveal that while L2D and NCO exhibit superior generalization and promising scalability to large problem instances, nuances in performance and efficiency metrics highlight their respective strengths and areas for improvement. The comparative analysis underscores the potential of learning-based models to overcome the limitations of traditional heuristics and outlines a path for future research that combines the computational intelligence of machine learning with the problem-solving strengths of heuristic algorithms. This synthesis aims to pave the way for new solutions to CVRP and other combinatorial optimization challenges, and toward broader use of artificial intelligence in operational research.
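For readers unfamiliar with the problem setting, the short sketch below illustrates how a candidate CVRP solution is typically scored: the total route distance is the cost to minimize, and each vehicle's load must stay within capacity while every customer is served exactly once. This is a generic illustration, not code from the thesis; the function evaluate_cvrp_solution and its inputs (distance_matrix, demands, capacity, routes) are assumed names for the example, with index 0 denoting the depot.

# Illustrative sketch only (assumed names, not from the thesis).
def evaluate_cvrp_solution(distance_matrix, demands, capacity, routes):
    """Return (total_distance, feasible) for a set of vehicle routes."""
    total_distance = 0.0
    served = []
    for route in routes:
        # Capacity constraint: a vehicle may not carry more than `capacity`.
        if sum(demands[c] for c in route) > capacity:
            return total_distance, False
        # Each vehicle starts and ends at the depot (index 0).
        stops = [0] + list(route) + [0]
        total_distance += sum(distance_matrix[a][b] for a, b in zip(stops, stops[1:]))
        served.extend(route)
    # Every customer 1..n-1 must be visited exactly once.
    feasible = sorted(served) == list(range(1, len(demands)))
    return total_distance, feasible

# Example: 4 customers around a depot, vehicles of capacity 10, two routes.
dist = [[0, 2, 3, 4, 3],
        [2, 0, 1, 3, 4],
        [3, 1, 0, 2, 3],
        [4, 3, 2, 0, 1],
        [3, 4, 3, 1, 0]]
demands = [0, 4, 5, 3, 6]
cost, ok = evaluate_cvrp_solution(dist, demands, capacity=10, routes=[[1, 2], [3, 4]])
print(cost, ok)  # 14.0 True

Heuristics such as LKH-3 and the learning-based methods compared in the thesis all search over such route sets; they differ in how candidate routes are constructed and improved, not in how feasibility and cost are defined.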


Bibliographic Details
Main Author: Uday, Nihal Arya
Other Authors: Zhang Jie
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Subjects: Computer and Information Science
Online Access: https://hdl.handle.net/10356/174967
Institution: Nanyang Technological University
School: School of Computer Science and Engineering
Supervisor (Other Author): Zhang Jie (ZhangJ@ntu.edu.sg)
Degree: Bachelor's degree
Date Issued: 2024-04-17
Citation: Uday, N. A. (2024). Deep reinforcement learning for optimal resource allocation (II). Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/174967