Final PhD Presentation: Reproducibility Debt in Scientific Software

08 September 2025, 12:00, CSIT Level 2 - Systems Area
Speaker: Zara Hassan (ANU)

Abstract

Reproducibility in scientific computation is essential for validating research but remains elusive due to the complexity and continuous evolution of software systems. These challenges, prevalent across computational sciences, have led to the accumulation of what we conceptualise as Reproducibility Debt (RpD): sub-optimal practices adopted for short-term gains that ultimately compromise the ability to reproduce results. This PhD research offers the first domain-agnostic definition and assessment of RpD in scientific software, systematically identifying its causes, effects, and mitigation strategies. A mixed-methods study was conducted, comprising a systematic literature review of 214 papers, interviews with 23 practitioners, and a global survey (InsightRpD) with 59 participants. Across these studies, 75 causes and 110 effects of RpD were identified, and a probabilistic cause-effect model was developed to illustrate their relationships. The triangulated findings yield a theoretical framework that supports both understanding and proactive management of RpD. The framework consolidates mitigation strategies and provides a shared vocabulary for practitioners. This work delivers conceptual clarity, empirical evidence, and practical tools, laying the groundwork for further research into reproducibility challenges in scientific software development.
