A Performance Study of Epoch-based Commit Protocols in Distributed OLTP Databases

Distributed OLTP systems execute the high- overhead, two-phase commit (2PC) protocol at the end of every distributed transaction. Epoch-based commit proposes that 2PC be executed only once for all transactions processed within a time interval called an epoch. Increasing epoch duration allows more transactions to be processed before the common 2PC. It thus reduces 2PC overhead per transaction, increases throughput but also increases average transaction latency. Therefore, required is the ability to choose the right epoch size that offers the desired trade-off between throughput and latency. To this end, we develop two analytical models to estimate throughput and average latency in terms of epoch size taking into account load and failure conditions. Simulations affirm their accuracy and effectiveness. We then present epoch-based multi-commit which, unlike epoch- based commit, seeks to avoid all transactions being aborted when failures occur, and also performs identically when failures do not occur. Our performance study identifies workload factors that make it more effective in preventing transaction aborts and concludes that the analytical models can be equally useful in predicting its performance as well.