Deep reinforcement learning for optimal resource allocation (II)

As vending machines become increasingly intelligent, enabling real-time updates of stock levels, the manual task of restocking remains a logistical challenge. This paper addresses the efficient restocking of vending machines via the Capacitated Vehicle Routing Problem (CVRP), focusing on optimizing routes for limited-capacity vehicles to meet demand without exceeding capacity, while minimizing costs. Traditional heuristics such as LKH-3 have shown robust performance on CVRP but face limitations in scalability and adaptability. This study compares two learning-based approaches, L2D (deep reinforcement learning) and NCO (a light-encoder, heavy-decoder architecture), against the LKH-3 algorithm. Through detailed experimentation, we evaluate their scalability, computational efficiency, and solution quality. Our findings reveal that while L2D and NCO exhibit superior generalization and promising scalability to large problem instances, nuances in performance and efficiency metrics highlight their respective strengths and areas for improvement. The comparative analysis underscores the potential of learning-based models to overcome the limitations of traditional heuristics and outlines a path for future research that combines the computational intelligence of machine learning with the problem-solving strengths of heuristic algorithms. This synthesis aims to pave the way for new solutions to CVRP and other combinatorial optimization challenges, and toward broader use of artificial intelligence in operational research.
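For readers unfamiliar with the problem setting, the short sketch below illustrates how a candidate CVRP solution is typically scored: the total route distance is the cost to minimize, and each vehicle's load must stay within capacity while every customer is served exactly once. This is a generic illustration, not code from the thesis; the function evaluate_cvrp_solution and its inputs (distance_matrix, demands, capacity, routes) are assumed names for the example, with index 0 denoting the depot.

# Illustrative sketch only (assumed names, not from the thesis).
def evaluate_cvrp_solution(distance_matrix, demands, capacity, routes):
    """Return (total_distance, feasible) for a set of vehicle routes."""
    total_distance = 0.0
    served = []
    for route in routes:
        # Capacity constraint: a vehicle may not carry more than `capacity`.
        if sum(demands[c] for c in route) > capacity:
            return total_distance, False
        # Each vehicle starts and ends at the depot (index 0).
        stops = [0] + list(route) + [0]
        total_distance += sum(distance_matrix[a][b] for a, b in zip(stops, stops[1:]))
        served.extend(route)
    # Every customer 1..n-1 must be visited exactly once.
    feasible = sorted(served) == list(range(1, len(demands)))
    return total_distance, feasible

# Example: 4 customers around a depot, vehicles of capacity 10, two routes.
dist = [[0, 2, 3, 4, 3],
        [2, 0, 1, 3, 4],
        [3, 1, 0, 2, 3],
        [4, 3, 2, 0, 1],
        [3, 4, 3, 1, 0]]
demands = [0, 4, 5, 3, 6]
cost, ok = evaluate_cvrp_solution(dist, demands, capacity=10, routes=[[1, 2], [3, 4]])
print(cost, ok)  # 14.0 True

Heuristics such as LKH-3 and the learning-based methods compared in the thesis all search over such route sets; they differ in how candidate routes are constructed and improved, not in how feasibility and cost are defined.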


Bibliographic Details
Main Author: Uday, Nihal Arya
Other Authors: Zhang Jie
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Subjects: Computer and Information Science
Online Access: https://hdl.handle.net/10356/174967
Institution: Nanyang Technological University
School: School of Computer Science and Engineering
Supervisor (Other Author): Zhang Jie (ZhangJ@ntu.edu.sg)
Degree: Bachelor's degree
Date Issued: 2024-04-17
Citation: Uday, N. A. (2024). Deep reinforcement learning for optimal resource allocation (II). Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/174967