Speaker
Description
OrthoTargetDB helps researchers explore protein homologs across species by aggregating annotations from over a dozen databases and presenting them as filterable data alongside a phylogenetic tree in an interactive web interface.
Although the tool’s codebase initially ran end-to-end and produced plausible results, its components shared mutable global state, it had no tests, and its pipeline steps could not be run independently. This made its behaviour difficult to inspect and verify. Refactoring the system revealed hidden errors with the potential to bias downstream analyses.
This lightning talk presents how the tool was restructured to make its processes and outputs more transparent and verifiable. Key changes included isolating pipeline stages, introducing validation, and restructuring data handling. These changes not only made hidden bugs visible but also improved transparency and reproducibility and sped up setup.
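As a rough illustration of what isolating stages and adding validation can look like, here is a minimal Python sketch. It is not taken from the OrthoTargetDB codebase: the `Homolog` record, the field names, the stage functions, and the 30% identity threshold are all hypothetical, chosen only to show stages that pass immutable data explicitly instead of reading shared global state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Homolog:
    """Hypothetical immutable record; frozen so no stage can mutate it."""
    protein_id: str
    species: str
    identity: float  # percent identity, expected in [0, 100]


def validate(records: tuple[Homolog, ...]) -> tuple[Homolog, ...]:
    """Fail loudly on records that could silently bias later steps."""
    seen = set()
    for r in records:
        if not r.protein_id:
            raise ValueError("empty protein_id")
        if not 0.0 <= r.identity <= 100.0:
            raise ValueError(f"identity out of range for {r.protein_id}: {r.identity}")
        if r.protein_id in seen:
            raise ValueError(f"duplicate protein_id: {r.protein_id}")
        seen.add(r.protein_id)
    return records


def parse_rows(rows: list[dict]) -> tuple[Homolog, ...]:
    """Stage 1: turn raw rows into typed records (no global state touched)."""
    return tuple(
        Homolog(row["id"], row["species"], float(row["identity"])) for row in rows
    )


def filter_by_identity(records: tuple[Homolog, ...], minimum: float) -> tuple[Homolog, ...]:
    """Stage 2: keep records at or above the (assumed) identity threshold."""
    return tuple(r for r in records if r.identity >= minimum)


def run_pipeline(rows: list[dict], minimum: float = 30.0) -> tuple[Homolog, ...]:
    """Compose the stages, validating the data handed between them."""
    parsed = validate(parse_rows(rows))
    return validate(filter_by_identity(parsed, minimum))
```

Because each stage takes its input as an argument and returns a new value, any stage can be run and tested on its own, and a malformed record raises an error at the stage boundary rather than propagating into downstream analyses.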