SUPPORTING EFFICIENT LARGE-SCALE KEY-VALUE SYSTEMS WITH AN OPTIMIZED STORAGE HIERARCHY
MetadataShow full item record
Driven by the growing demands from big-data applications, the focus of their data management has been largely shifted from traditional SQL databases to NoSQL (Not-only- SQL) databases, such as key-value (KV) stores, which provides essential functionalities and much higher performance for storing and retrieving data. Correspondingly new hardware technologies have been developed to support the fast data accesses, such as NVMe SSDs and Infiniband network. However, existing designs of NoSQL databases usually see sub- optimal performance on fast hardware. Traditionally the computing overhead of a database system is overshadowed by the slow storage and network. With the adoption of the new hardware technologies the inefficiency at the software side is now the major source of bottlenecks in today’s systems. In this dissertation we propose solutions to overcome barriers on the adoption of new technologies, such as large DRAMs, fast SSDs, and low-latency Infiniband network, into existing stacks of software systems. Accordingly introduce new designs, including Search Lookaside Buffer (SLB), zExapnder, LSM-trie, and NVMcached, to improve the systems’ efficiency on accessing SSD, DRAM, non-volatile main memory, and CPU cache. We identify false temporal locality and false spatial locality in index search and propose SLB to effectively improve index search by removing the false localities. In zExpander we use compression to increase the effective capacity in KV caches without adding DRAM, which can substantially reduce misses in the KV cache. In LSM-trie we introduce a trie-based data structure to reduce the overhead of internal data reorganization by an order of magnitude for KV stores on SSDs. In NVMcached we remove expansive FLUSH operations for persistent and crash-consistent KV caches on byte-addressable NVM.