GARBAGE COLLECTION SCHEDULING TO MINIMIZE COLLISION IN RAID SSD
Main Author: | |
---|---|
Format: | Final Project |
Language: | Indonesian |
Online Access: | https://digilib.itb.ac.id/gdl/view/39580 |
Institution: | Institut Teknologi Bandung |
Summary: | Solid State Drives (SSDs) are now commonplace. It is predicted that by 2021 SSD shipments will exceed those of Hard Disk Drives (HDDs). During operation, an SSD must run garbage collection (GC) to free space occupied by invalid data that is no longer used. However, GC increases latency because I/O requests must wait for the GC to finish before they can be processed. The impact of GC is worse in a RAID array because more than one SSD is involved. This impact can be eliminated by reconstructing the data from the parity that is already commonly used in RAID. Data reconstruction is possible when only one SSD is performing GC, in other words, when there is no GC collision in the SSD RAID. This study proposes three GC scheduling strategies: GCSync, GCSync+, and GCLock. GCSync assigns each SSD a different time window in which its GC process may start. GCSync+ extends GCSync by adding buffer time between two consecutive time windows to eliminate all collisions. GCLock schedules the GC process in the SSD RAID using a lock stored in the RAID controller. An evaluation using a modified SSD simulator, SSDSim, showed good results: GCSync reduced GC collisions by around 60-71%, while GCSync+ and GCLock eliminated all collisions. The proposed GC scheduling strategies also change latency in various ways, depending on the workload used. |
---|
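
The three strategies named in the summary can be illustrated with a small sketch. This is not the thesis's implementation or the SSDSim code: the class names, the round-robin window layout, and the parameters `window_ms` and `buffer_ms` are assumptions made for illustration only. The sketch shows per-SSD start windows (GCSync), the same windows separated by an idle buffer (GCSync+), a single controller-side lock (GCLock), and XOR-based parity reconstruction as used in parity RAID levels such as RAID 5.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class WindowSchedule:
    """Hypothetical round-robin GC start windows for the SSDs in one array."""
    num_ssds: int
    window_ms: int       # length of each SSD's start window (assumed parameter)
    buffer_ms: int = 0   # 0 models GCSync; a positive value models GCSync+

    def may_start_gc(self, ssd: int, now_ms: int) -> bool:
        slot = self.window_ms + self.buffer_ms   # window plus trailing buffer
        offset = now_ms % (slot * self.num_ssds)  # position in the full cycle
        owner = offset // slot                    # SSD that owns this slot
        in_window = (offset - owner * slot) < self.window_ms
        return owner == ssd and in_window


class GCLock:
    """Hypothetical controller-side lock: at most one SSD runs GC at a time."""

    def __init__(self) -> None:
        self.holder: Optional[int] = None

    def try_acquire(self, ssd: int) -> bool:
        if self.holder is None:
            self.holder = ssd
            return True
        return self.holder == ssd

    def release(self, ssd: int) -> None:
        if self.holder == ssd:
            self.holder = None


def reconstruct_block(surviving: List[bytes]) -> bytes:
    """XOR the surviving data and parity blocks to rebuild the missing block."""
    out = bytearray(len(surviving[0]))
    for block in surviving:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


if __name__ == "__main__":
    gcsync = WindowSchedule(num_ssds=4, window_ms=10)
    gcsync_plus = WindowSchedule(num_ssds=4, window_ms=10, buffer_ms=5)
    print(gcsync.may_start_gc(ssd=1, now_ms=15))       # inside SSD 1's window
    print(gcsync_plus.may_start_gc(ssd=0, now_ms=12))  # inside the buffer gap
```

In this sketch a window only limits when GC may start, so a GC that begins late in its window could still overlap the next SSD's window; that is one plausible reading of why the summary reports that GCSync reduces collisions while GCSync+ and GCLock eliminate them.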