Performance enhancements in large scale storage systems

Data center storage systems of the future, at petabyte and exabyte scale, require very high performance (sub-millisecond latencies) and large capacities (hundreds of petabytes). The evolution in both scale (capacity) and performance (throughput and I/O operations per second) is driven by the ever-increasing I/O d...


Bibliographic Details
Main Author: Rajesh Vellore Arumugam
Other Authors: Dusit Niyato
Format: Theses and Dissertations
Language: English
Published: 2015
Online Access: https://hdl.handle.net/10356/65630
Institution: Nanyang Technological University
id sg-ntu-dr.10356-65630
record_format dspace
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering::Data::Data storage representations
DRNTU::Engineering::Computer science and engineering::Computer systems organization::Performance of systems
DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval
spellingShingle DRNTU::Engineering::Computer science and engineering::Data::Data storage representations
DRNTU::Engineering::Computer science and engineering::Computer systems organization::Performance of systems
DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval
Rajesh Vellore Arumugam
Performance enhancements in large scale storage systems
description Data center storage systems of the future, at petabyte and exabyte scale, require very high performance (sub-millisecond latencies) and large capacities (hundreds of petabytes). The evolution in both scale (capacity) and performance (throughput and I/O operations per second) is driven by the ever-increasing I/O demands of current and Internet-scale applications. These large-scale storage systems are distributed systems with two primary components, or clusters. The first is the storage server cluster, which handles the primary (data) I/O for the applications. The second is the meta-data server (MDS) cluster, which manages a single global namespace and serves the meta-data I/O. In this thesis, we look into the performance deficiencies and scalability of these two components in a multi-tenant environment with mixed (sequential and random) I/O workloads. To overcome the limitations of the conventional storage system architecture, the thesis proposes a 3-tier hybrid architecture that uses next-generation non-volatile memory (NVM) such as phase change memory (PCM), hybrid drives, and conventional drives. NVM is used to absorb writes destined for the NAND Flash based SSD, which improves both the performance and the lifetime of the SSD. Hybrid drives are used as a low-cost alternative to high-speed Serial Attached SCSI (SAS) drives for higher performance. This is achieved through a lightweight caching algorithm on the Flash inside the drive. On the storage server, we consider the problems of cache partitioning of next-generation NVM, data migration optimization with placement across storage tiers, data placement optimization for the hybrid drive’s internal cache, and workload interference among multiple applications. On the meta-data server, we consider the problem of load balancing and distributing file system meta-data across the meta-data server cluster in a way that preserves namespace locality.
The major contributions of this thesis toward primary I/O and meta-data I/O performance scalability in large-scale storage systems are as follows. A heuristic caching mechanism that adapts to the I/O workload was developed for a hybrid device consisting of next-generation NVM (such as phase change memory) and SSD. This method, called HCache, can achieve up to 46% improvement in I/O latencies compared to popular control-theory-based algorithms in the literature. A distributed caching mechanism called VirtCache was developed to reduce I/O interference among workloads sharing the storage system. VirtCache can reduce the 90th percentile latency variation of an application by 50% to 83% in a virtualized shared storage environment compared to the state of the art. An optimized scheme for migrating and placing data objects across multiple storage tiers was developed that can achieve up to 17% improvement in performance compared to conventional data migration techniques. We propose new data placement and eviction algorithms for the hybrid drive's internal cache based on I/O workload characteristics. These algorithms reduce the I/O monitoring meta-data overhead by up to 64% compared to state-of-the-art methods and can classify hot/cold data 48% faster than existing methods. While these solutions address performance scalability on the storage server, for meta-data server scalability we developed the DROP meta-data distribution mechanism. Based on consistent hashing, DROP preserves locality while keeping the distribution near uniform for load balancing. The hashing and distribution mechanism can achieve up to 40% improvement in namespace locality compared to traditional methods.
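The abstract says that DROP builds on consistent hashing, but the record gives no algorithmic detail. As a rough illustration of the underlying idea only, the Python sketch below implements a plain consistent-hash ring with virtual nodes. The names `HashRing` and `mds_for_path`, the server labels, and the choice to hash a file's parent directory (so that siblings land on the same server) are all assumptions made for this example; they are not taken from the thesis and should not be read as the DROP method.

```python
import bisect
import hashlib
import posixpath


def _hash(key: str) -> int:
    """Map a string to a 64-bit integer position on the ring."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Plain consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, servers, vnodes=64):
        if not servers:
            raise ValueError("at least one server is required")
        # Each server owns `vnodes` points on the ring, which smooths the load.
        self._ring = sorted(
            (_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def lookup(self, key: str) -> str:
        """Return the server owning the first ring point after hash(key)."""
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._ring[idx][1]


def mds_for_path(ring: HashRing, path: str) -> str:
    """Pick a meta-data server for a file path.

    Assumption for illustration: hashing the parent directory keeps files of
    one directory on one server. This is a crude stand-in for namespace
    locality, not the locality scheme described in the thesis.
    """
    return ring.lookup(posixpath.dirname(path))


if __name__ == "__main__":
    ring = HashRing(["mds0", "mds1", "mds2"])  # hypothetical server names
    for p in ("/home/a/x.txt", "/home/a/y.txt", "/var/log/syslog"):
        print(p, "->", mds_for_path(ring, p))
```

The property that makes this family of schemes attractive for a meta-data cluster is that adding a server only moves keys onto the new server; keys never shuffle between the existing ones.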
author2 Dusit Niyato
author_facet Dusit Niyato
Rajesh Vellore Arumugam
format Theses and Dissertations
author Rajesh Vellore Arumugam
author_sort Rajesh Vellore Arumugam
title Performance enhancements in large scale storage systems
title_short Performance enhancements in large scale storage systems
title_full Performance enhancements in large scale storage systems
title_fullStr Performance enhancements in large scale storage systems
title_full_unstemmed Performance enhancements in large scale storage systems
title_sort performance enhancements in large scale storage systems
publishDate 2015
url https://hdl.handle.net/10356/65630
_version_ 1759857456286007296
spelling sg-ntu-dr.10356-65630 2023-03-04T00:46:51Z Performance enhancements in large scale storage systems Rajesh Vellore Arumugam Dusit Niyato Foh Chuan Heng Wen Yonggang School of Computer Engineering DRNTU::Engineering::Computer science and engineering::Data::Data storage representations DRNTU::Engineering::Computer science and engineering::Computer systems organization::Performance of systems DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval DOCTOR OF PHILOSOPHY (SCE) 2015-11-26T01:44:11Z 2015-11-26T01:44:11Z 2015 2015 Thesis Rajesh Vellore Arumugam. (2015). Performance enhancements in large scale storage systems. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/65630 10.32657/10356/65630 en 201 p. application/pdf