What the research is:
How it works:
Current flash caches fall into two main categories: set-associative caches and log-structured caches. Set-associative caches write many more bytes than necessary because they have to write a 4 KB flash page — the minimum write granularity — for every object. With tiny objects that are roughly 100 bytes or less in size, this leads to significant write amplification (i.e., set-associative caches write ~40x more bytes than strictly necessary).
Log-structured caches have a large DRAM overhead because they have to keep an index entry for every on-flash object. Thus, neither set-associative nor log-structured flash caches are well designed for tiny object flash caches.
Kangaroo combines log-structured and set-associative caches to reduce both DRAM and flash-write overheads. Kangaroo has two main parts: KLog, a small log-structured flash cache, and KSet, a large set-associative flash cache.
One of Kangaroo’s innovations is in how it moves objects from KLog to KSet. Kangaroo uses KLog to find multiple objects mapping to the same set in KSet, such as the pink and yellow objects shown above. Whenever an object is evicted from KLog, Kangaroo proactively evicts objects from KLog to minimize write amplification in KSet.
Since writing a set always requires writing 4 KB, regardless of the number of objects inserted, writing two objects instead of just one cuts the write overhead per object in half. Thus, Kangaroo can amortize writes to KSet over multiple objects, decreasing the overall number of bytes written to flash. KLog accomplishes this goal with a small capacity (~5 percent of flash), so Kangaroo needs only a small amount of DRAM to index KLog’s entire capacity. Thus, Kangaroo addresses both the DRAM and flash-write overheads of caching tiny objects on flash.
On top of this basic design, Kangaroo introduces additional techniques to realize its hierarchical design and increase its effectiveness. In particular, since Kangaroo is a cache and not a key-value store, it can evict objects to minimize writes. Kangaroo exploits this opportunity by adding a threshold admission policy that evicts objects instead of admitting them to KSet if there are fewer than n objects to insert from KLog. This admission policy allows Kangaroo to guarantee that the write overhead for moving objects to KSet will be much lower than a set-associative cache. Kangaroo provides further optimizations to reduce DRAM overhead and reduce misses, as explained in our SOSP 2021 paper.
Together, these optimizations allow Kangaroo to overcome the limitations of log-structured caches and set-associative caches, creating a flash cache that delivers on the goal of efficient caching for tiny objects.
Why it matters:
Caching plays an important role in large-scale systems by allowing for quicker access to data and lowering the load on databases and other back-end systems. For large caches, it is far more efficient to use flash than to use DRAM. However, flash caches fall short when it comes to caching tiny objects. Flash caches require either too much DRAM, which loses the efficiency benefits of flash, or too many writes, which wears out the flash device too quickly. In either case, flash caching fails to live up to its potential as an efficient, large cache for tiny objects, ultimately resulting in smaller caches and more load on back-end systems.
Many social media and IoT services have large numbers of tiny objects, each a few hundred bytes or less. For example, edges in Facebook’s social graph, which are needed to connect friends, posts, and images, among other content, average under 100 bytes. Caching these objects efficiently on flash allows them to be quickly retrieved at scale and minimizes the load on back-end storage services. Kangaroo enables large-scale flash caching of tiny objects by having low DRAM and write overheads, leading to a lower miss ratio in large flash caches.