The rapid growth of data-intensive use cases such as simulations, streaming applications (IoT and sensor feeds, for example), and unstructured data has raised the importance of fast database reads and writes, especially as those applications start to scale. Almost any component of a system can become a bottleneck, from the network and storage layers to the application GUI to the CPU.
As we discussed in “Optimizing Metadata Performance for Web-Scale Applications”, one of the main causes of data bottlenecks is the way the data engine, also called the storage engine, handles data operations. The data engine is the deepest part of the software stack, the layer that sorts and indexes data. Data engines were originally created to store metadata, the critical “data about data” that companies use to recommend movies to watch or products to buy. Metadata also records when the data was created, exactly where it was stored, and much more.
Inefficiencies with metadata often arise in the form of random read patterns, slow query performance, inconsistent query behavior, I/O hangs, and write stalls. As these issues worsen, problems originating in this layer ripple up the stack and reach the end user in the form of slow reads, slow writes, write amplification, space amplification, an inability to scale, and more.
New architectures remove bottlenecks
Next-generation data engines have emerged in response to the demands of low-latency, data-intensive workloads that require significant scalability and performance. They allow finer-grained performance tuning of the three types of amplification, or extra data reading and rewriting, that the engines incur: write amplification, read amplification, and space amplification. They also go further with additional adjustments to how the engine finds and stores data.
Speedb, our company, designed one such data engine as a drop-in replacement for the de facto industry standard, RocksDB. We open-sourced Speedb to the developer community, based on technology delivered in an enterprise edition over the past two years.
Many developers are familiar with RocksDB, a popular and ubiquitous data engine that is optimized to exploit many CPUs for I/O-bound workloads. Its use of a log-structured merge (LSM) tree-based data structure, as detailed in the previous article, is great for handling write-intensive use cases efficiently. However, LSM read performance can be poor when data is accessed in small, random chunks, and the problem worsens as applications scale, particularly in applications with large volumes of small files, such as with metadata.
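To see where the read penalty comes from, consider a toy model of an LSM tree. Writes land in an in-memory memtable, which is periodically flushed as an immutable sorted run; a point read may then have to probe every run, newest to oldest. This is an illustrative sketch only, not RocksDB's implementation; all names here are hypothetical.

```python
# Toy LSM tree: fast, sequential writes via a memtable, but point
# reads may probe every sorted run, which is the source of LSM read
# amplification on small, random accesses.

class ToyLSM:
    def __init__(self, memtable_limit=2):
        self.memtable = {}
        self.runs = []                    # immutable sorted runs, newest first
        self.memtable_limit = memtable_limit
        self.probes = 0                   # counts run probes (read amplification)

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            # flush: the memtable becomes an immutable sorted run
            self.runs.insert(0, dict(sorted(self.memtable.items())))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in self.runs:             # newest version wins
            self.probes += 1
            if key in run:
                return run[key]
        return None
```

The older a key, the more runs a read must touch before finding it, which is why read cost grows with scale until compaction merges the runs back down.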
Speedb optimizations
Speedb has developed three techniques for optimizing data and metadata scalability, techniques that advance the state of the art since RocksDB and other data engines were developed a decade ago.
Compaction
Like other LSM tree-based engines, RocksDB uses compaction to reclaim disk space and remove stale versions of data from the logs. The stale data consumes storage resources and slows down metadata processing, and to mitigate this, data engines perform compaction. However, the two main compaction methods, level and universal, limit the ability of these engines to handle data-intensive workloads effectively.
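The core of compaction can be sketched in a few lines: merge the sorted runs, keep only the newest version of each key, and drop deleted entries. A minimal sketch, with `None` standing in as a hypothetical tombstone marker:

```python
# Compaction in miniature: merge sorted runs (newest first), keeping
# only the newest version of each key and discarding tombstones.

def compact(runs):
    merged = {}
    for run in runs:                  # newest run first, so its values win
        for key, value in run.items():
            merged.setdefault(key, value)
    # drop tombstones (None marks a deletion in this sketch)
    return {k: v for k, v in sorted(merged.items()) if v is not None}
```

The reclaimed space comes from the stale versions and tombstones that the merge discards; the cost is that every surviving byte gets rewritten.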
A brief description of each method illustrates the challenge. Level compaction incurs a very small disk space overhead (the default is around 11%). However, for large databases, it comes with a large I/O amplification penalty. Level compaction uses a “merge with” operation: each level is merged with the next level, which is usually much larger. As a result, each level adds read and write amplification proportional to the ratio between the sizes of the two levels.
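A back-of-the-envelope model makes the penalty concrete. If each level is `fanout` times larger than the one above it, every byte is rewritten roughly `fanout` times for each level it descends through. This is a rough illustrative model, not a RocksDB measurement:

```python
# Rough write-amplification model for level compaction: one initial
# write plus ~fanout rewrites at each level a byte descends through.

def level_write_amplification(fanout, levels):
    return 1 + fanout * levels

# With the common fanout of 10 and a 7-level tree, each byte is
# rewritten about 71 times on its way to the bottom level.
wa = level_write_amplification(fanout=10, levels=7)
```

This is why the amplification penalty grows with database size: larger databases need more levels, and each extra level multiplies the rewrite cost.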
Universal compaction has lower write amplification, but the database eventually needs a full compaction. That full compaction requires free space equal to or greater than the total size of the database and may stall the processing of new updates. Universal compaction therefore cannot be used in most real-time, high-performance applications.
Speedb’s architecture features hybrid compaction, which reduces write amplification for very large databases without blocking updates and with only a small additional space overhead. The hybrid compaction method works like universal compaction at all the higher levels, where the data is small relative to the size of the entire database, and works like level compaction only at the lowest level, where a significant portion of the data is kept.
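The idea, as described above, amounts to choosing a compaction strategy per level. A minimal sketch of that decision, with hypothetical names that are not Speedb's actual code:

```python
# Hybrid compaction sketch: universal-style compaction on the upper
# (small) levels for low write amplification, level-style compaction
# only at the bottom level to bound the space overhead where most of
# the data lives.

def compaction_plan(num_levels):
    plan = {}
    for level in range(num_levels):
        if level < num_levels - 1:
            plan[level] = "universal"   # data here is small; rewrites are cheap
        else:
            plan[level] = "level"       # bulk of the data; cap space overhead
    return plan
```

Because the expensive full compactions of the universal style are confined to levels holding only a small fraction of the data, the worst-case space and stall costs stay bounded.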
Memtable tests (Figure 1 below) show a 17% gain on overwrites and a 13% gain on mixed read and write workloads (90% reads, 10% writes). Results from separate tests of the Bloom filter show a 130% improvement in read misses in a random read workload (Figure 2) and a 26% reduction in memory usage (Figure 3).
Testing by Redis shows increased performance when Speedb replaced RocksDB in the Redis on Flash implementation. The results with Speedb were also independent of the application’s read/write ratio, indicating that performance is predictable across many different applications, including applications where the access pattern varies over time.
Figure 1. Memtable test with Speedb.
Figure 2. Bloom filter test using a random read workload with Speedb.
Figure 3. Bloom filter test showing a reduction in memory usage with Speedb.
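For context on the Bloom filter results above: a Bloom filter lets the engine answer "this run definitely does not contain the key" without touching disk, which is why a better filter speeds up read misses. A minimal sketch of the data structure, not Speedb's or RocksDB's implementation:

```python
# Minimal Bloom filter: k hash positions per key are set in a bit
# array. might_contain() can return false positives but never false
# negatives, so a "no" answer safely skips a disk probe.
import hashlib

class BloomFilter:
    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0                     # bit array packed into an int

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all(self.bits >> pos & 1 for pos in self._positions(key))
```

Sizing is the trade-off the figures describe: more bits and hash functions lower the false-positive rate (fewer wasted disk reads) at the cost of memory, which is why a 26% memory reduction at equal accuracy matters.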
Memory management
Memory management of built-in libraries plays a crucial role in application performance. Today’s solutions are complex and have too many interlocking parameters, making it difficult for users to optimize for their needs. The challenge increases as the environment or workload changes.
Speedb took a holistic approach by redesigning memory management to simplify usage and improve resource utilization.
A dirty data manager enables an improved flush scheduler, which takes a proactive approach and improves overall memory efficiency and system utilization, without requiring user intervention.
Working from the ground up, Speedb is building additional self-tuning features to achieve performance, scale, and ease of use for a variety of use cases.
Flow control
Speedb redesigned RocksDB’s flow control mechanism to eliminate spikes in user latency. The new flow control mechanism adjusts the write rate in a way that is much smoother and more precisely attuned to the state of the system than the previous mechanism. It slows writes down when necessary and speeds them up when it can. As a result, stalls are eliminated and write performance is stable.
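The contrast with hard write stalls can be sketched as a continuous rate function: instead of stopping writes abruptly when compaction falls behind, scale the allowed write rate smoothly with the backlog. The function, thresholds, and names below are an assumption for illustration, not Speedb's actual mechanism:

```python
# Smooth flow control sketch: the permitted write rate degrades
# linearly with the compaction backlog ("debt") instead of dropping
# to zero at a hard stall threshold.

def allowed_write_rate(max_rate, debt_bytes, debt_limit):
    backlog = min(debt_bytes, debt_limit) / debt_limit   # 0.0 .. 1.0
    return max_rate * (1.0 - backlog)
```

With no debt the engine writes at full speed; as debt grows, writers are throttled gradually, so user latency degrades smoothly rather than spiking when a stall threshold is crossed.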
When the root cause of data engine inefficiencies is deeply embedded in the system, finding it can be challenging. At the same time, the deeper the root cause, the greater the impact on the system. As the old saying goes, a chain is only as strong as its weakest link.
Next-generation data engine architectures such as Speedb can increase metadata performance, reduce latency, speed up seek time, and optimize CPU consumption. As teams scale their hyperscale applications, new data engine technology will be a critical element in enabling modern architectures that are agile, scalable, and high-performance.
Hilik Yochai is Chief Scientific Officer and Co-Founder of Speedb, the company behind the Speedb data engine, a drop-in replacement for RocksDB, and Hive, Speedb’s open source community where developers can interact, improve, and share knowledge and best practices on Speedb and RocksDB. Speedb’s technology helps developers build their hyperscale data operations with unlimited scale and performance without compromising functionality, while constantly striving to improve usability and ease of use.
—
New Tech Forum offers a place to explore and discuss emerging business technology in unprecedented depth and breadth. Selection is subjective, based on our choice of technologies that we believe are important and of most interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Please send all inquiries to newtechforum@infoworld.com.
Copyright © 2023 IDG Communications, Inc.