Concurrency, Disaggregation, and Compression in Data Structures and DBMSs

Recent work on data structures and database management systems (DBMSs) is concentrated on multi-core processors and cloud-based systems. One notable shift is towards lock-free data structures, which improve concurrency and performance in multi-threaded environments. Lock-free tries, for example, are being optimized for large key universes and low-concurrency scenarios, where they have shown better performance than traditional theoretical models. Disaggregated DBMSs are also gaining momentum: they use cloud infrastructure to separate processing from storage, so that each component can scale elastically and independently. This makes it possible to dynamically assemble the most efficient and cost-effective hardware resources for a given workload.

In key-value stores, adaptive eviction policies are a key innovation, with algorithms such as CAMP demonstrating better performance and adaptability than traditional LRU policies. These policies weigh both the size and the cost of each key-value pair, so that memory is used to greatest effect across diverse applications.
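To make the idea of weighing size and cost concrete, here is a minimal Python sketch of a generic cost- and size-aware eviction policy in the GreedyDual-Size style. It is not CAMP's actual algorithm; the class name `CostSizeCache`, its methods, and the explicit `size` and `cost` arguments are all assumptions made for illustration.

```python
import heapq


class CostSizeCache:
    """Toy cost/size-aware cache (GreedyDual-Size style), for illustration only."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.inflation = 0.0   # rises to the priority of the last evicted entry
        self.entries = {}      # key -> (value, size, cost, seq)
        self.heap = []         # (priority, seq, key); may contain stale records
        self.seq = 0

    def _push(self, key, value, size, cost):
        # Priority grows with cost and shrinks with size; the inflation term
        # ages out entries that have not been touched recently.
        self.seq += 1
        priority = self.inflation + cost / size
        self.entries[key] = (value, size, cost, self.seq)
        heapq.heappush(self.heap, (priority, self.seq, key))

    def _evict_one(self):
        while self.heap:
            priority, seq, key = heapq.heappop(self.heap)
            entry = self.entries.get(key)
            if entry is not None and entry[3] == seq:  # skip stale heap records
                self.inflation = priority
                self.used -= entry[1]
                del self.entries[key]
                return key
        raise RuntimeError("nothing left to evict")

    def get(self, key):
        entry = self.entries.get(key)
        if entry is None:
            return None
        value, size, cost, _ = entry
        self._push(key, value, size, cost)  # refresh priority on a hit
        return value

    def put(self, key, value, size, cost):
        # Assumes size > 0; entries larger than the whole cache are rejected.
        if size > self.capacity:
            return False
        if key in self.entries:
            self.used -= self.entries.pop(key)[1]
        while self.used + size > self.capacity:
            self._evict_one()
        self._push(key, value, size, cost)
        self.used += size
        return True


cache = CostSizeCache(capacity_bytes=100)
cache.put("costly", "A", size=50, cost=100.0)  # expensive to recompute
cache.put("cheap", "B", size=50, cost=1.0)     # cheap to recompute
cache.put("new", "C", size=50, cost=10.0)      # forces one eviction
```

In this usage, a plain LRU policy would evict `costly` because it is the oldest entry, whereas the sketch evicts `cheap` because it has the lowest cost-to-size ratio.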

NUMA optimization is another focal point, with machine learning being used to improve memory access locality in large-scale systems. Solutions such as MAO are being deployed to monitor and optimize NUMA node binding, yielding significant latency improvements and resource savings.

Hardware-conscious scheduling frameworks such as P-MOSS are also showing promising results in optimizing query execution and data placement on NUMA servers, using low-level hardware statistics to drive scheduling decisions.

Lastly, transparent compression within cache systems is being explored as a way to expand cache capacity. Designs such as ZipCache integrate DRAM and SSD with built-in compression, achieving notable improvements in throughput and latency while reducing write amplification.
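The general shape of a tiered cache with transparent compression can be sketched in a few lines of Python. This is a toy model under stated assumptions, not ZipCache's design: the class `TwoTierCompressedCache` is hypothetical, the "SSD" tier is simulated with an in-memory dictionary, capacities are counted in items rather than bytes, and `zlib` stands in for whatever compression a real system would use.

```python
import zlib
from collections import OrderedDict


class TwoTierCompressedCache:
    """Toy hot/cold cache: the cold tier stores values compressed."""

    def __init__(self, hot_capacity, cold_capacity):
        self.hot_capacity = hot_capacity    # max items in the DRAM-like tier
        self.cold_capacity = cold_capacity  # max items in the simulated SSD tier
        self.hot = OrderedDict()            # key -> raw bytes, in LRU order
        self.cold = OrderedDict()           # key -> zlib-compressed bytes

    def put(self, key, value):
        self.cold.pop(key, None)            # keep at most one copy per key
        self.hot[key] = value
        self.hot.move_to_end(key)
        self._demote_overflow()

    def get(self, key):
        if key in self.hot:
            self.hot.move_to_end(key)
            return self.hot[key]
        blob = self.cold.pop(key, None)
        if blob is None:
            return None
        value = zlib.decompress(blob)       # callers never see compressed data
        self.put(key, value)                # promote back to the hot tier
        return value

    def _demote_overflow(self):
        while len(self.hot) > self.hot_capacity:
            key, value = self.hot.popitem(last=False)   # least recently used
            self.cold[key] = zlib.compress(value)
            while len(self.cold) > self.cold_capacity:
                self.cold.popitem(last=False)           # drop oldest cold item


tiered = TwoTierCompressedCache(hot_capacity=1, cold_capacity=2)
tiered.put("a", b"x" * 1000)
tiered.put("b", b"y" * 1000)   # "a" is demoted and stored compressed
```

Compression is transparent here in the sense that `get` and `put` only ever exchange raw bytes; whether a value currently sits in the compressed tier is invisible to the caller.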

Noteworthy papers include an implementation of Ko's Lock-Free Binary Trie, which outperforms existing models in specific scenarios, and the introduction of P-MOSS, which demonstrates up to a 6x improvement in query throughput through learned scheduling.

