VISUAL STIMULUS RECONSTRUCTION BASED ON EEG SIGNALS USING STABLE DIFFUSION MODEL WITH CONTRASTIVE LEARNING APPROACH
Main Author: | |
---|---|
Format: | Final Project |
Language: | Indonesian |
Online Access: | https://digilib.itb.ac.id/gdl/view/87995 |
Institution: | Institut Teknologi Bandung |
Summary: | In recent years, models have been developed to reconstruct visual stimulus images
from EEG signals. Contrastive learning is currently a popular approach because it
enables training with unlabeled data. However, several datasets used in existing
studies were collected with a block design, which compromises the validity of the
reported results. Models trained with contrastive learning therefore need to be
developed on datasets collected under an improved experimental design.
To this end, a model was developed on the Alljoined1 dataset that combines
contrastive learning with a Stable Diffusion model to generate visual stimulus
images. First, an encoder was trained with contrastive learning to map EEG signals
into the latent space of image embeddings produced by the pretrained CLIP model.
The EEG embeddings were then aligned using a diffusion prior. Finally, the aligned
EEG embeddings were used as input to the pretrained Stable Diffusion model to
generate images.
The model successfully generated reconstructions of the visual stimuli, which were
assessed with semantic evaluation metrics. It achieved two-way identification
scores of 0.5094 with the CLIP model, 0.4892 with the Inception model, 0.4570 with
the AlexNet(2) model, and 0.5239 with the AlexNet(5) model, along with a SwAV
distance of 0.6812. Additionally, analyses were conducted to examine how
subject-specific EEG variability and frequency bands affect model performance. |
---|
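
The contrastive training step described in the summary can be illustrated with a minimal sketch. The record does not state the exact loss used, so the code below assumes a CLIP-style symmetric InfoNCE loss, in which each EEG embedding should be most similar to the image embedding of its own stimulus within a batch. The function names, the temperature of 0.07, and the toy two-dimensional embeddings are all illustrative assumptions, and plain Python is used instead of a deep learning framework to keep the example self-contained.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(eeg_embeddings, image_embeddings, temperature=0.07):
    """Assumed CLIP-style loss: average of EEG-to-image and image-to-EEG terms.

    Row i of each input is assumed to come from the same stimulus presentation.
    """
    eeg = [l2_normalize(v) for v in eeg_embeddings]
    img = [l2_normalize(v) for v in image_embeddings]
    # Pairwise cosine similarities, sharpened by the temperature.
    logits = [
        [sum(a * b for a, b in zip(e, i)) / temperature for i in img]
        for e in eeg
    ]
    logits_transposed = [list(column) for column in zip(*logits)]
    return 0.5 * (
        cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_transposed)
    )


# Toy batch: two stimuli with hypothetical 2-D embeddings.
image_batch = [[1.0, 0.0], [0.0, 1.0]]
matched_eeg = [[0.9, 0.1], [0.2, 0.8]]      # each EEG vector is close to its own image
mismatched_eeg = [[0.2, 0.8], [0.9, 0.1]]   # same vectors paired with the wrong image

loss_matched = symmetric_contrastive_loss(matched_eeg, image_batch)
loss_mismatched = symmetric_contrastive_loss(mismatched_eeg, image_batch)
print(loss_matched < loss_mismatched)
```

In this sketch the loss is low when each EEG embedding lies nearest to its own image embedding and high when the pairing is wrong, which is the signal that would drive an EEG encoder toward the CLIP image embedding space. In an actual training setup the embeddings would come from the EEG encoder and the pretrained CLIP image encoder rather than from hand-written lists.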