Lean evolutionary machine learning by multitasking simpler and hard tasks

Many decisions in a machine learning (ML) pipeline involve non-differentiable and discontinuous objectives and search spaces. Examples include feature selection, model selection, and reinforcement learning, where candidate solutions must be evaluated via a learning subsystem or through interactions with complex environments. Evolutionary algorithms (EAs), being simple and largely problem-independent, are prominent gradient-free methods for such tasks. However, iteratively evaluating populations of candidate solutions incurs a steep computational cost, especially when the learning subsystem is itself expensive to run. With data volumes growing year on year, the resources needed to train ML models are rising steadily, and this tractability issue remains a barrier to the widespread adoption of evolutionary computation (EC) for ML, a subject that has attracted significant research interest in recent years.

To this end, an emerging computational paradigm known as evolutionary multitasking (EMT) is making inroads towards resolving the tractability issue by handling multiple optimization tasks simultaneously in a single evolutionary run. Exploiting shared information across tasks often accelerates global convergence, since performant solutions found for similar tasks can serve as informative priors that rapidly and favourably bias the search on the main task. In contrast to prior works, which often fall back on parallel computing hardware to cope with the expense of EAs, we aim for a software-centric solution: we first propose a unified probabilistic formulation for multitasking in evolutionary machine learning (EML) and then design algorithmic realizations in different contexts to achieve accelerated convergence.

The basic premise of this thesis is to achieve lean EML by concurrently evolving tasks and transferring useful information from simpler tasks to harder ones. The EMT framework generates a number of artificial (computationally cheaper) auxiliary tasks and solves them alongside the main (target) task of interest, with the objective of boosting the evolutionary search on the target. The unified probabilistic formulation enables effective exploitation of transferable information while guarding against negative transfer. In addition, a novel strategy allocates computational resources (i.e., the size of each evolving population) across tasks in a principled manner, minimizing wasted computation on tasks that are not useful to the target. Algorithmic realizations of this idea are designed and implemented for two domains of ML, namely evolutionary feature selection and evolutionary reinforcement learning. Extensive experimental studies verify the efficacy of the proposed framework and algorithms: the methods achieve performance competitive with existing algorithms while requiring substantially fewer sample evaluations, thereby approaching the goal of lean EML.
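To make the mechanism concrete, below is a minimal, self-contained sketch (not the thesis's actual algorithm) of the multitasking idea the abstract describes: a computationally cheaper auxiliary task is evolved alongside the expensive target task, its champion is periodically migrated into the target population as an informative prior, and the evaluation budget is reallocated between the tasks depending on whether those transfers keep helping. All function names, the toy objective, and the parameter settings are illustrative assumptions.

    # Illustrative sketch of evolutionary multitasking (EMT) with one cheap
    # auxiliary task; names and settings are assumptions, not the thesis's API.
    import random

    DIM = 30                                             # length of a binary solution (e.g. a feature mask)
    HIDDEN = [random.randint(0, 1) for _ in range(DIM)]  # unknown optimum, for demonstration only

    def target_fitness(x):
        # Stand-in for the expensive evaluation (e.g. training a model
        # on the full dataset with the selected features).
        return sum(a == b for a, b in zip(x, HIDDEN))

    def aux_fitness(x):
        # Simpler auxiliary task: the same objective restricted to a cheap
        # subproblem (here, only the first third of the positions).
        k = DIM // 3
        return sum(a == b for a, b in zip(x[:k], HIDDEN[:k]))

    def step(pop, fitness, size):
        # One generation of a plain GA: tournament selection, bit-flip
        # mutation, then truncation of parents + children to `size`.
        children = []
        for _ in range(size):
            parent = max(random.sample(pop, min(3, len(pop))), key=fitness)
            children.append([b ^ (random.random() < 2.0 / DIM) for b in parent])
        return sorted(pop + children, key=fitness, reverse=True)[:size]

    BUDGET = 40        # candidate evaluations per generation, shared by both tasks
    aux_share = 0.5    # fraction of the budget allocated to the auxiliary task
    target = [[random.randint(0, 1) for _ in range(DIM)] for _ in range(BUDGET)]
    aux = [row[:] for row in target]

    for gen in range(30):
        n_aux = max(2, int(BUDGET * aux_share))
        aux = step(aux, aux_fitness, n_aux)
        target = step(target, target_fitness, BUDGET - n_aux)
        migrant = aux[0][:]                      # transfer the auxiliary champion
        helped = target_fitness(migrant) > target_fitness(target[-1])
        if helped:
            target[-1] = migrant                 # the prior biases the target search
        # Crude resource allocation: grow the auxiliary task's budget share
        # only while its transfers keep improving the target population.
        aux_share = min(0.5, aux_share + 0.05) if helped else max(0.1, aux_share - 0.05)

    print("best target fitness:", target_fitness(target[0]), "out of", DIM)

In the thesis itself, the transfer and allocation decisions are governed by the unified probabilistic formulation rather than the fixed heuristic above, which is what allows negative transfer to be curbed in a principled way.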


Bibliographic Details
Main Author: Zhang, Nick Shihui
Other Authors: Ong Yew Soon (School of Computer Science and Engineering, ASYSOng@ntu.edu.sg)
Format: Thesis (Doctor of Philosophy)
Language: English
Published: Nanyang Technological University, 2023
Subjects: Engineering::Computer science and engineering
DOI: 10.32657/10356/164300
Online Access: https://hdl.handle.net/10356/164300
License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
Citation: Zhang, N. S. (2023). Lean evolutionary machine learning by multitasking simpler and hard tasks. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/164300