THE DEVELOPMENT OF HYBRID QUANTUM ANNEALING ALGORITHM FOR OPTIMIZING ENSEMBLE LEARNING

Bibliographic Details
Main Author: Putri Yulianti, Lenny
Format: Dissertations
Language: Indonesia
Online Access:https://digilib.itb.ac.id/gdl/view/81790
Institution: Institut Teknologi Bandung
Description
Summary: Quantum annealing (QA) is a quantum computing approach widely used to address optimization problems and probabilistic sampling. Despite being relatively new, it has been extensively applied to optimize machine learning problems such as clustering, support vector machines, and others. Most studies implementing QA in the machine learning domain indicate that QA delivers better predictive performance than classical state-of-the-art methods. However, QA optimization in machine learning typically focuses on problems involving a single learner; QA also holds promising potential for machine learning problems with multiple learners, namely ensemble learning.

The fundamental concept behind ensemble model creation is the "perturb and combine" strategy: a good ensemble model must carefully balance the trade-off between the accuracy and the diversity of its trained learners. One widely used state-of-the-art method for enhancing the diversity of trained learners is the clustering balancing method with over-sampling. The existing clustering balancing method, however, has drawbacks: 1) the clusters it produces are not always strong and balanced; 2) the percentage of similar clusters is higher; and 3) the correlation among trained learners is higher, because the minority class is augmented by duplicating existing samples, which affects the training process.
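The abstract attributes the third drawback to duplicate minority-class samples. As a rough illustration of that baseline technique (not the dissertation's own code), random over-sampling by duplication can be sketched with scikit-learn's resample; the function name and defaults below are illustrative assumptions.

import numpy as np
from sklearn.utils import resample

def random_oversample(X, y, minority_label, random_state=0):
    """Naive over-sampling: duplicate minority samples until classes balance.

    Because the added rows are exact copies, learners trained on different
    balanced clusters see overlapping data, which raises their pairwise
    correlation -- the third drawback noted above.
    """
    X_min, y_min = X[y == minority_label], y[y == minority_label]
    X_maj, y_maj = X[y != minority_label], y[y != minority_label]
    X_up, y_up = resample(X_min, y_min,
                          replace=True,          # with replacement => duplicates
                          n_samples=len(y_maj),  # match the majority-class size
                          random_state=random_state)
    return np.vstack([X_maj, X_up]), np.concatenate([y_maj, y_up])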
Furthermore, not all trained learners generated for the ensemble contribute positively to accuracy, so selecting an optimal subset of trained learners is crucial for ensemble performance. This is an opportunity for QA, which can potentially offer better accuracy and efficiency on such optimization problems than classical state-of-the-art methods. QA implementations, however, have their own drawbacks: 1) the possibility of getting stuck in local minima, 2) potential overfitting in initial solutions, and 3) sensitivity to parameters, so the quality of the QA implementation itself also needs improvement. Based on these challenges and opportunities, this research proposes a hybrid QA algorithm that addresses three ensemble learning problems: 1) creating strong and balanced clusters with a hybrid algorithm that combines clustering balancing and QA; 2) selecting optimal clusters with a QA algorithm; and 3) selecting optimal trained learners with a QA algorithm. These three proposed methods form a unified process that produces an optimal ensemble model. Additionally, during cluster and trained-learner selection, a re-sampling process is applied within the proposed QA algorithm to address the three weaknesses of QA implementation and improve ensemble quality.
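The abstract does not spell out the objective function it anneals. One common way to cast trained-learner selection as a QA problem is a QUBO that rewards individual accuracy on the linear terms and penalizes pairwise correlation on the quadratic terms. The sketch below uses D-Wave's open-source dimod and neal packages with a classical simulated annealer standing in for quantum hardware; the weight lam and the correlation matrix are illustrative assumptions, not the dissertation's formulation.

import dimod
import neal

def prune_ensemble(acc, corr, lam=0.5, num_reads=200):
    """Select a subset of trained learners via a QUBO.

    acc[i]     : validation accuracy of learner i (reward, linear term)
    corr[i][j] : pairwise error correlation (penalty, quadratic term)
    x_i in {0,1} marks whether learner i is kept; the sampler minimizes
    -sum_i acc_i * x_i + lam * sum_{i<j} corr_ij * x_i * x_j.
    """
    n = len(acc)
    linear = {i: -acc[i] for i in range(n)}
    quadratic = {(i, j): lam * corr[i][j]
                 for i in range(n) for j in range(i + 1, n)}
    bqm = dimod.BinaryQuadraticModel(linear, quadratic, 0.0, dimod.BINARY)
    # neal's simulated annealer stands in for a quantum annealer here.
    best = neal.SimulatedAnnealingSampler().sample(bqm, num_reads=num_reads).first
    return [i for i, keep in best.sample.items() if keep]

On actual D-Wave hardware the same BQM would instead be submitted through a QPU sampler such as EmbeddingComposite(DWaveSampler()) from the dwave-system package.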
The proposed algorithm was evaluated on four datasets from the UCI repository, one dataset from the Airbus – BMW Group, and one real-world dataset, focusing on four main aspects: ensemble size, accuracy, diversity, and computation time. It was compared with several benchmark ensemble methods, including bagging, AdaBoost, clustering, clustering balancing, and ensemble methods based on particle swarm optimization, and the experiments were run with six single learners as base classifiers: artificial neural network, support vector machine, linear discriminant analysis, decision tree, k-nearest neighbors, and Naïve Bayes. The proposed algorithm achieved the highest average accuracy, 72.40%, at a 95% confidence level. The study also analyzed three factors that influence, and are influenced by, this accuracy improvement: ensemble size, diversity, and computation time. The proposed algorithm reduced the initial ensemble size, although by a smaller percentage than the particle swarm optimization benchmark. It achieved the highest average diversity of all benchmark methods, and high diversity accompanied by reduced bias can lead to increased accuracy. It also ran faster than the benchmark methods that use classical particle swarm optimization for pruning.
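The abstract reports a "diversity value" without naming the metric. A common pairwise choice in the ensemble literature is the disagreement measure, sketched here under that assumption; the dissertation may use a different statistic.

import numpy as np
from itertools import combinations

def disagreement_diversity(preds):
    """Average pairwise disagreement over an ensemble.

    preds: array of shape (n_learners, n_samples) with class predictions.
    Returns the mean fraction of samples on which each pair of learners
    disagrees; higher values indicate a more diverse ensemble.
    """
    pairs = combinations(range(len(preds)), 2)
    return float(np.mean([np.mean(preds[i] != preds[j]) for i, j in pairs]))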