Advancements in green AI: a pathway to sustainability

In this paper, a survey of model compression and optimization techniques is evaluated on energy efficiency, memory footprint and accuracy for phishing detection, a task key to online safety. The three primary categories of compression explored are (1) Quantization, (2) Distillation and (3) Pruni...
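
To make the quantization category more concrete, the following is a minimal sketch of asymmetric (affine) 8-bit quantization, one of the techniques the full description names. It is not taken from the project itself: the function names, the unsigned 8-bit range and the per-tensor min/max calibration are assumptions chosen for illustration.

```python
def quantize_asymmetric(values, num_bits=8):
    """Map floats to unsigned integers using a scale and a zero point.

    Hypothetical helper for illustration; per-tensor min/max calibration
    is an assumption, not the project's documented method.
    """
    qmin, qmax = 0, 2 ** num_bits - 1
    # Widen the range to include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    # Fall back to a scale of 1.0 when every input is zero.
    scale = (hi - lo) / (qmax - qmin) or 1.0
    # The zero point is the integer that the real value 0.0 maps to.
    zero_point = max(qmin, min(qmax, round(qmin - lo / scale)))
    quantized = [
        max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values
    ]
    return quantized, scale, zero_point


def dequantize(quantized, scale, zero_point):
    """Recover approximate floats from the integer codes."""
    return [(q - zero_point) * scale for q in quantized]


if __name__ == "__main__":
    weights = [-1.0, 0.0, 0.5, 2.0]
    q, scale, zp = quantize_asymmetric(weights)
    print("codes:", q, "scale:", scale, "zero point:", zp)
    print("restored:", dequantize(q, scale, zp))
```

Because the range is not forced to be symmetric around zero, the integer codes cover only the interval the values actually occupy, and the zero point shifts that interval so that 0.0 survives the round trip exactly.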

Bibliographic Details
Main Author: Palanca Sebastian Gonzalo Miguel IV Puyat
Other Authors: Dusit Niyato
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Subjects: Computer and Information Science; Artificial intelligence; Model compression
Online Access: https://hdl.handle.net/10356/181771
Institution: Nanyang Technological University
id sg-ntu-dr.10356-181771
record_format dspace
spelling sg-ntu-dr.10356-181771 2024-12-18T11:47:29Z Advancements in green AI: a pathway to sustainability Palanca Sebastian Gonzalo Miguel IV Puyat Dusit Niyato College of Computing and Data Science Ong Yoon Sew DNIYATO@ntu.edu.sg Computer and Information Science Artificial intelligence Model compression Bachelor's degree 2024-12-18T11:47:29Z 2024-12-18T11:47:29Z 2024 Final Year Project (FYP) Palanca Sebastian Gonzalo Miguel IV Puyat (2024). Advancements in green AI: a pathway to sustainability. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/181771 en SCSE23-0820 application/pdf Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Computer and Information Science
Artificial intelligence
Model compression
description In this paper, a survey of model compression and optimization techniques is evaluated on energy efficiency, memory footprint and accuracy for phishing detection, a task key to online safety. Three primary categories of compression are explored: (1) Quantization, (2) Distillation and (3) Pruning. The quantization techniques are QLoRA and LLM.int8(), both designed for compressing LLMs, as well as Quantization Aware Training with asymmetric quantization at inference. The distillation techniques are (1) Knowledge Distillation, (2) Hint Distillation for FitNets and (3) Relational Knowledge Distillation, each used to train transformer architectures smaller than the base BERT transformer. For pruning, L1 and L2 Magnitude Pruning and Head Pruning are evaluated. The results show that the largest gains in both carbon footprint and memory footprint come from QLoRA with FP4 quantization and an FP16 compute type, with near-zero accuracy degradation. That model reached an accuracy of 98.60%, a carbon footprint of 0.0016 kg of CO2 for 20,000 samples, and a time per inference of 0.0059 seconds, against a baseline of 98.58%, 0.0095 kg of CO2 for 20,000 samples, and 0.016 seconds per inference. The abstract describes the best model as 10 times faster with nearly 6 times lower carbon emissions over 20,000 samples; the per-inference times quoted here give a speedup closer to 2.7 times.
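
The comparison in the description can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in the abstract; the dictionary keys and variable names are hypothetical, and the unit of the baseline time per inference is assumed to be seconds.

```python
# Figures as quoted in the abstract (20,000 samples for the CO2 values).
baseline = {"accuracy_pct": 98.58, "co2_kg": 0.0095, "seconds_per_inference": 0.016}
qlora_fp4 = {"accuracy_pct": 98.60, "co2_kg": 0.0016, "seconds_per_inference": 0.0059}

# Ratios greater than 1 mean the compressed model is better than the baseline.
speedup = baseline["seconds_per_inference"] / qlora_fp4["seconds_per_inference"]
emission_reduction = baseline["co2_kg"] / qlora_fp4["co2_kg"]
accuracy_delta = qlora_fp4["accuracy_pct"] - baseline["accuracy_pct"]

print(f"speedup: {speedup:.2f}x")                        # speedup: 2.71x
print(f"emission reduction: {emission_reduction:.2f}x")  # emission reduction: 5.94x
print(f"accuracy change: {accuracy_delta:+.2f} points")  # accuracy change: +0.02 points
```

The emission ratio agrees with the "nearly 6 times" statement. The quoted timings give about 2.7 times rather than 10 times, so the 10 times figure may rest on measurements that the abstract does not reproduce.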
author2 Dusit Niyato
author_facet Dusit Niyato
Palanca Sebastian Gonzalo Miguel IV Puyat
format Final Year Project
author Palanca Sebastian Gonzalo Miguel IV Puyat
author_sort Palanca Sebastian Gonzalo Miguel IV Puyat
title Advancements in green AI: a pathway to sustainability
title_sort advancements in green ai: a pathway to sustainability
publisher Nanyang Technological University
publishDate 2024
url https://hdl.handle.net/10356/181771
_version_ 1819113078636150784