9–10 Jun 2026
UiT - The Arctic University of Norway in Tromsø
Europe/Oslo timezone

Reproducible data science with Nix and DVC

9 Jun 2026, 13:10
20m
Auditorium Cerebrum (UiT - The Arctic University of Norway in Tromsø)

UiT - The Arctic University of Norway Universitetsvegen 61 9019 Tromsø Norway
Talk (20 min)

Speaker

Jarl Gunnar T. Flaten (SINTEF Nord)

Description

Reproducibility in data science is hard to achieve because it depends on reproducible software, on reproducible data, and on ad-hoc parameters and variables. Nevertheless, tools are emerging that make it practical. Nix is a package manager that lets users conveniently define, create, and work with system-level virtual environments in which to run data science workloads such as fitting or training a model. The data versioning tool DVC keeps track of dataset versions and can associate, for example, a training result with the training and validation data used to produce it. In this talk, I will describe how we use Nix together with DVC (and Git, upon which DVC is built) to achieve pragmatic levels of reproducibility in some of our data science projects, and I will touch on some of the shortcomings of these tools.
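
To make the idea of associating a training result with its inputs concrete, here is a minimal Python sketch. It is not DVC's API or the speaker's setup; it only illustrates the general principle of identifying data by a content hash and recording it next to the parameters and the code revision. The function names (`content_hash`, `make_run_record`, `record_id`), the record layout, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib
import json


def content_hash(data: bytes) -> str:
    """Return a hex digest that identifies the data by its content."""
    return hashlib.sha256(data).hexdigest()


def make_run_record(train: bytes, validation: bytes, params: dict, code_rev: str) -> dict:
    """Tie a training run to the exact data, parameters, and code revision used.

    Hypothetical record layout, chosen for this sketch only.
    """
    return {
        "code_rev": code_rev,
        "params": params,
        "data": {
            "train": content_hash(train),
            "validation": content_hash(validation),
        },
    }


def record_id(record: dict) -> str:
    """Hash the record itself; sorted keys keep the ID independent of key order."""
    canonical = json.dumps(record, sort_keys=True).encode("utf-8")
    return content_hash(canonical)
```

Two runs with identical data, parameters, and code revision yield the same record ID, while changing a single byte of the training data yields a different one. In a real project the datasets would be files tracked by a data versioning tool and the code revision would come from version control, rather than being passed in by hand.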
