Publications

Real Life Is Uncertain. Consensus Should Be Too!

Published in Workshop on Hot Topics in Operating Systems (HotOS '25), 2025

Modern distributed systems rely on consensus protocols to build a fault-tolerant core upon which they can build applications. Consensus protocols are correct under a specific failure model, where up to $f$ machines can fail. We argue that this $f$-threshold failure model oversimplifies the real world and limits potential opportunities to optimize for cost or performance. We argue instead for a probabilistic failure…

Paper Slides Talk Citation

SkyStore: Cost-Optimized Object Storage Across Regions and Clouds

Published in Proceedings of the VLDB Endowment (pVLDB), 2025

Modern applications span multiple clouds to reduce costs, avoid vendor lock-in, and leverage low-availability resources in another cloud. However, standard object stores operate within a single cloud, forcing users to manually manage data placement across clouds. This is often a complex choice: users must either pay to store objects in a remote cloud, or pay to transfer them over the network. To address this, we present SkyStore, a unified object store that addresses cost-optimal data management across regions and clouds. SkyStore introduces…

Paper Slides Talk Citation

DINOMO: an elastic, scalable, high-performance key-value store for DPM

Published in Proceedings of the VLDB Endowment (pVLDB), 2022

This paper presents Dinomo, a novel key-value store for disaggregated persistent memory (DPM). Dinomo is the first key-value store for DPM that simultaneously achieves high common-case performance, scalability, and lightweight online reconfiguration. Dinomo uses a novel combination of techniques such as ownership partitioning, disaggregated adaptive caching, and selective replication…

Paper Slides Talk Citation

WineFS: a hugepage-aware file system for persistent memory that ages gracefully

Published in ACM SIGOPS 28th Symposium on Operating Systems Principles, 2021

Modern persistent-memory (PM) file systems degrade in performance with usage due to their inability to use hugepages. This paper introduces WineFS, a novel hugepage-aware PM file system that eliminates this effect. WineFS combines a new alignment-aware allocator with fragmentation-avoiding approaches to consistency and concurrency to preserve hugepages. Experiments show that WineFS…

Paper Slides Talk Citation

RainBlock: Faster Transaction Processing for Public Blockchains

Published in USENIX Annual Technical Conference (ATC), 2021

This paper presents RainBlock, a public blockchain that achieves high transaction throughput. The number of transactions in each block is limited by I/O bottlenecks. By removing these I/O bottlenecks, RainBlock allows miners to process more transactions in the same amount of time. The RainBlock architecture removes I/O from the critical path, and the distributed, sharded Merkle tree, the DSM-Tree data structure…

Paper Slides Talk Citation

Software-defined data protection: Low overhead policy compliance is within reach!

Published in Proceedings of the VLDB Endowment (pVLDB), 2021

This paper presents our novel approach, “Software-Defined Data Protection” (SDP). Its simple yet powerful premise is to decouple often-changing policies from request-level enforcement, allowing distributed smart storage nodes to implement the latter at line rate. Existing and future data protection frameworks can be translated to the same hardware interface, which allows storage nodes to offload enforcement efficiently…

Paper Slides Talk Citation

CrashMonkey and ACE: Systematically testing file-system crash consistency

Published in ACM Transactions on Storage (TOS), 2019

This paper presents CrashMonkey and Ace, a set of tools to systematically find crash-consistency bugs in Linux file systems. CrashMonkey is a record-and-replay framework that simulates power-loss crashes while executing a given workload, and checks whether the file system recovers to a consistent state after each crash. Ace automatically generates workloads to be run on the target file system. CrashMonkey and Ace are based on a new approach to testing file-system crash consistency: bounded black-box crash testing (B3), which alleviates the consequences of having an infinite set of possible workloads to test. CrashMonkey and Ace are able to find 24 out of the 26 crash-consistency bugs reported in the last 5 years. These tools also revealed 10 new crash-consistency bugs in widely used, mature Linux file systems, 7 of which existed in the kernel since 2014. They also found a crash-consistency bug in a verified file system, FSCQ.

Paper Slides Talk Citation

Finding Crash-Consistency Bugs with Bounded Black-Box Crash Testing

Published in 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2018

This paper presents bounded black-box crash testing (B3), a new approach to testing file-system crash consistency. B3 tests the file system in a black-box manner using workloads of file-system operations. Since the space of possible workloads is infinite, B3 bounds this space based on insights from studying recent crash-consistency bugs reported in Linux file systems. We build CrashMonkey and Ace to demonstrate the effectiveness of the B3 approach. These tools find 24 out of the 26 recent crash-consistency bugs…

Paper Slides Talk Citation

mLSM: Making Authenticated Storage Faster in Ethereum

Published in USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage), 2018

This paper presents a novel data-authenticating structure, Merkelized LSM (mLSM). In authenticated storage, each read returns a value and a proof that allows the client to verify that the returned value is correct. Such authentication leads to high read and write amplification (64x in the worst case). mLSM …

Paper Slides Talk Citation

Soujanya Ponnapalli
