APPLICATION OF GRADIENT-BASED DATA VALUATION ON LLM USING MODEL COMPRESSION AND GRADIENT COMPRESSION

model to perform a specific task or benchmark. The higher the model’s performance on the task, the greater the value of the data used to train it. However, the data valuation process itself is not cost-effective, especially for gradient-based data valuation methods. This approach calculates and stores gradients of both training and testing data...


Bibliographic Details
Main Author: Ananta, Moses
Format: Theses
Language: Indonesia
Online Access:https://digilib.itb.ac.id/gdl/view/87694
Institution: Institut Teknologi Bandung
Language: Indonesia
id id-itb.:87694
spelling id-itb.:876942025-02-01T16:42:02ZAPPLICATION OF GRADIENT-BASED DATA VALUATION ON LLM USING MODEL COMPRESSION AND GRADIENT COMPRESSION Ananta, Moses Indonesia Theses data valuation, LLM, quantization, model compression, gradient compression. INSTITUT TEKNOLOGI BANDUNG https://digilib.itb.ac.id/gdl/view/87694 model to perform a specific task or benchmark. The higher the model’s performance on the task, the greater the value of the data used to train it. However, the data valuation process itself is not cost-effective, especially for gradient-based data valuation methods. This approach calculates and stores gradients of both training and testing data and then measures their similarity: the higher the similarity between these gradients, the higher the value of the training data. Since the amount of gradient data that must be computed and stored scales with both the model’s parameter count and the data volume, this method is highly inefficient for large models such as Large Language Models (LLMs). Two common approaches to reducing model and gradient sizes are model compression and gradient compression, respectively. Building on LESS (Low-rank gradiEnt Similarity Search), a recent large-scale gradient-based data valuation method for LLMs that employs the model compression technique LoRA (Low-Rank Adaptation), this research applies alternative model compression techniques as well as gradient compression to make data valuation more efficient. Specifically, this study applies quantization methods to compress both the LLM and its gradients. Experiments were conducted by assessing the value of each data point in the Flan v2, COT, Dolly, and OASST1 training datasets against the TyDiQA, MMLU, and BBH test datasets. Training was then performed on the highest-valued data.
The experimental results demonstrate that using quantization in data valuation and selection improves model performance compared to models trained on randomly selected data. Data selection remains effective even under extreme gradient quantization, such as 1-bit, opening promising opportunities for cost-effective and efficient data valuation for large-scale models. text
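The valuation scheme the abstract describes, scoring each training example by the similarity between its gradient and the test-set gradients, even after extreme quantization, can be sketched as follows. This is a minimal NumPy illustration under assumed simplifications (per-example gradients are given as flat vectors, 1-bit quantization keeps only the sign rescaled by the mean absolute value, and value is cosine similarity to the mean test gradient); all function names are illustrative and not taken from the thesis or the LESS codebase.

```python
import numpy as np

def sign_quantize(g):
    """1-bit gradient quantization: keep only the sign of each component,
    rescaled by the mean absolute value so magnitude is roughly preserved."""
    return np.sign(g) * np.abs(g).mean()

def cosine(u, v):
    """Cosine similarity between two flat gradient vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def value_training_data(train_grads, test_grads, quantize=True):
    """Score each training example by the cosine similarity between its
    (optionally 1-bit quantized) gradient and the mean test gradient."""
    q = sign_quantize if quantize else (lambda g: g)
    test_direction = np.mean([q(g) for g in test_grads], axis=0)
    return [cosine(q(g), test_direction) for g in train_grads]

# Toy example: a training gradient aligned with the test gradients should
# receive a higher value than an unrelated one, even at 1 bit.
rng = np.random.default_rng(0)
test = [rng.normal(size=64) + 2.0 for _ in range(4)]   # shared direction
aligned = test[0] + 0.1 * rng.normal(size=64)           # similar gradient
unrelated = rng.normal(size=64)                         # noise gradient
scores = value_training_data([aligned, unrelated], test)
```

In this sketch the aligned example outscores the unrelated one; selecting the top-scoring examples for training is then the "data selection" step the abstract evaluates.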
institution Institut Teknologi Bandung
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
language Indonesia
description model to perform a specific task or benchmark. The higher the model’s performance on the task, the greater the value of the data used to train it. However, the data valuation process itself is not cost-effective, especially for gradient-based data valuation methods. This approach calculates and stores gradients of both training and testing data and then measures their similarity: the higher the similarity between these gradients, the higher the value of the training data. Since the amount of gradient data that must be computed and stored scales with both the model’s parameter count and the data volume, this method is highly inefficient for large models such as Large Language Models (LLMs). Two common approaches to reducing model and gradient sizes are model compression and gradient compression, respectively. Building on LESS (Low-rank gradiEnt Similarity Search), a recent large-scale gradient-based data valuation method for LLMs that employs the model compression technique LoRA (Low-Rank Adaptation), this research applies alternative model compression techniques as well as gradient compression to make data valuation more efficient. Specifically, this study applies quantization methods to compress both the LLM and its gradients. Experiments were conducted by assessing the value of each data point in the Flan v2, COT, Dolly, and OASST1 training datasets against the TyDiQA, MMLU, and BBH test datasets. Training was then performed on the highest-valued data. The experimental results demonstrate that using quantization in data valuation and selection improves model performance compared to models trained on randomly selected data. Data selection remains effective even under extreme gradient quantization, such as 1-bit, opening promising opportunities for cost-effective and efficient data valuation for large-scale models.
format Theses
author Ananta, Moses
spellingShingle Ananta, Moses
APPLICATION OF GRADIENT-BASED DATA VALUATION ON LLM USING MODEL COMPRESSION AND GRADIENT COMPRESSION
author_facet Ananta, Moses
author_sort Ananta, Moses
title APPLICATION OF GRADIENT-BASED DATA VALUATION ON LLM USING MODEL COMPRESSION AND GRADIENT COMPRESSION
title_short APPLICATION OF GRADIENT-BASED DATA VALUATION ON LLM USING MODEL COMPRESSION AND GRADIENT COMPRESSION
title_full APPLICATION OF GRADIENT-BASED DATA VALUATION ON LLM USING MODEL COMPRESSION AND GRADIENT COMPRESSION
title_fullStr APPLICATION OF GRADIENT-BASED DATA VALUATION ON LLM USING MODEL COMPRESSION AND GRADIENT COMPRESSION
title_full_unstemmed APPLICATION OF GRADIENT-BASED DATA VALUATION ON LLM USING MODEL COMPRESSION AND GRADIENT COMPRESSION
title_sort application of gradient-based data valuation on llm using model compression and gradient compression
url https://digilib.itb.ac.id/gdl/view/87694
_version_ 1823000155753807872