Relational Debugging - Pinpointing Root Causes of Performance Problems

14 August 2023, 12:00, Zoom/CSIT Level 2 - Systems Area
Zoom Meeting ID: 816 8790 7001 / Password: 644329
Speaker: Xiang (Jenny) Ren (University of Toronto)

Abstract

Performance debugging is notoriously elusive: real-world performance problems are rarely clear-cut failures, but manifest through the accumulation of fine-grained symptoms. It is often challenging even to identify performance anomalies, because absolute measures are unreliable when system performance is inherently relative to workloads. Existing techniques focus on identifying absolute predicates that deviate between executions, which limits their applicability to performance problems.

This work introduces relational debugging, a new technique that automatically pinpoints the root causes of performance problems. The core idea is to capture and reason about relations between fine-grained runtime events. We show that relations are highly useful for explaining performance anomalies and locating root causes. Relational debugging is highly effective with as few as two executions (a good and a bad run), eliminating the pain point of producing and labeling the many different executions required by traditional techniques.

We realize relational debugging by developing a practical tool named Perspect. Perspect operates directly on x86 binaries to accommodate real-world diagnosis scenarios. We evaluate Perspect on twelve challenging performance issues with various symptoms in the Go runtime, MongoDB, Redis, and Coreutils. Perspect accurately located (or excluded) the root causes of these issues. In particular, we used Perspect to diagnose two open bugs whose root causes developers had failed to find; the root causes reported by Perspect were confirmed by the developers. A controlled user study shows that Perspect can speed up debugging by at least 10.87 times.

Speaker Bio

Xiang (Jenny) Ren is a sixth-year PhD student at the University of Toronto. She is working with Professor Ding Yuan on understanding system performance and improving the diagnosability of performance issues.