APPLICATION OF GRADIENT-BASED DATA VALUATION ON LLM USING MODEL COMPRESSION AND GRADIENT COMPRESSION
Main Author: | Ananta, Moses |
---|---|
Format: | Theses |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/87694 |
Institution: | Institut Teknologi Bandung |
Subjects: | data valuation, LLM, quantization, model compression, gradient compression |
Description:
Data valuation measures the contribution of data to training a model to perform a specific task or benchmark. The higher the model’s performance on the task, the greater the value of the data used to train it. However, the data valuation process itself is not cost-effective, especially for gradient-based methods, which compute and store gradients of both the training and test data and then measure their similarity: the higher the similarity between these gradients, the higher the value of the training data. Since the amount of gradient data that must be computed and stored scales with both the model’s parameter count and the data volume, this approach is highly inefficient for large models such as Large Language Models (LLMs).
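As a rough illustration of this gradient-similarity idea (a minimal sketch only, not the implementation used in the thesis; `model`, `loss_fn`, and the data iterables are placeholders), each training example can be scored by the cosine similarity between its loss gradient and the average gradient over the test data:

```python
import torch
import torch.nn.functional as F

def flat_grad(model, loss):
    """Flatten the loss gradient w.r.t. all trainable parameters into one vector."""
    params = [p for p in model.parameters() if p.requires_grad]
    grads = torch.autograd.grad(loss, params)
    return torch.cat([g.reshape(-1) for g in grads])

def valuation_scores(model, loss_fn, train_examples, test_examples):
    """Score each training example by the cosine similarity between its gradient
    and the average gradient over the test examples (higher = more valuable)."""
    # Average gradient of the test data.
    test_grad = None
    for x, y in test_examples:
        g = flat_grad(model, loss_fn(model(x), y))
        test_grad = g if test_grad is None else test_grad + g
    test_grad = test_grad / len(test_examples)

    # Per-example training gradients compared against the test gradient.
    scores = []
    for x, y in train_examples:
        g = flat_grad(model, loss_fn(model(x), y))
        scores.append(F.cosine_similarity(g, test_grad, dim=0).item())
    return scores
```

Storing a full per-example gradient vector at LLM scale is exactly the cost described above, which is why compressing the model and the stored gradients becomes necessary.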
Two common approaches to reducing these costs are model compression and gradient compression, which shrink the model and the stored gradients, respectively. This work builds upon recent large-scale gradient-based data valuation research on LLMs, specifically LESS (Low-rank gradiEnt Similarity Search), which employs a model compression method known as LoRA (Low-Rank Adaptation). The present research aims to apply alternative model compression techniques, as well as gradient compression, to make data valuation more efficient. Specifically, this study applies quantization to compress both the LLM and its gradients.
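A minimal sketch of what gradient quantization can look like (an illustration under the assumption of simple sign-based quantization; the thesis’ exact scheme may differ): each gradient vector is reduced to 1 bit per dimension, keeping only its sign, before similarities are computed.

```python
import torch
import torch.nn.functional as F

def quantize_1bit(g: torch.Tensor) -> torch.Tensor:
    """1-bit (sign) quantization: keep only the sign of each entry, rescaled by
    the mean magnitude so the quantized vector retains a comparable norm."""
    scale = g.abs().mean()
    return torch.sign(g) * scale

def quantized_score(train_grad: torch.Tensor, test_grad: torch.Tensor) -> float:
    """Cosine similarity between the quantized gradients."""
    return F.cosine_similarity(quantize_1bit(train_grad),
                               quantize_1bit(test_grad), dim=0).item()
```

For actual storage savings the signs would be bit-packed (e.g. eight signs per byte); the dequantized float form above is only for clarity, and the single mean-magnitude scale factor is one simple choice among many.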
Experiments were conducted by assessing the value of each data point in the Flan v2, COT, Dolly, and OASST1 training datasets against the TyDiQA, MMLU, and BBH test datasets. Training was then performed using the highest-valued data. The experimental results demonstrate that using quantization in data valuation and selection improves model performance compared to models trained on randomly selected data. Data selection remains effective even under extreme gradient quantization, such as 1-bit, providing promising opportunities for cost-effective and efficient data valuation for large-scale models.
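The selection step then amounts to ranking training examples by their valuation scores and keeping the top fraction, roughly as below (a sketch; the 5% budget is a placeholder, not necessarily the fraction used in the experiments):

```python
def select_top_fraction(scores, fraction=0.05):
    """Return indices of the highest-valued training examples."""
    k = max(1, int(len(scores) * fraction))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]
```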