"Reproducibility" in ML — beyond random seeds, what actually matters?


Reddit r/learnmachinelearning·u/Beneficial_String411·about 1 month ago


Beginner-to-intermediate question. "Set the random seed" is the textbook answer, but in practice that only fixes one source of variation.

What actually breaks reproducibility in your experience?

- Different CUDA versions (already a known issue)

- Non-deterministic library kernels (e.g. cuDNN, unless the determinism flags are set)

- Data version drift (dataset got updated, you didn't notice)

- Threshold/metric definition shift (someone redefined "accuracy" in code)

- Non-determinism in eval harness itself
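
To make the list concrete, here is a rough sketch of how a few of these could be pinned down per run. It assumes PyTorch as the framework (NumPy and PyTorch are only touched if they are installed), and the names `seed_everything`, `fingerprint_file`, and `run_manifest` are hypothetical helpers for illustration, not anything from a library. It does not cover metric-definition drift or eval-harness non-determinism.

```python
import hashlib
import os
import platform
import random
import sys


def seed_everything(seed: int) -> None:
    """Seed the RNGs that are available and request deterministic GPU kernels."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
    except ImportError:
        return
    # cuBLAS reads this before the first CUDA call; needed for deterministic matmuls.
    os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")
    torch.manual_seed(seed)
    torch.backends.cudnn.benchmark = False      # no autotuned kernel selection
    torch.backends.cudnn.deterministic = True   # deterministic cuDNN algorithms
    # Makes ops without a deterministic implementation raise when called,
    # instead of silently producing run-to-run differences.
    torch.use_deterministic_algorithms(True)


def fingerprint_file(path: str, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file's bytes, so a silently updated dataset is detectable."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def run_manifest(seed: int, data_path: str) -> dict:
    """Collect the facts worth saving next to every result."""
    manifest = {
        "seed": seed,
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "data_sha256": fingerprint_file(data_path),
    }
    try:
        import torch
        manifest["torch"] = torch.__version__
        manifest["cuda"] = torch.version.cuda  # None on CPU-only builds
    except ImportError:
        pass
    return manifest
```

The idea would be to call `seed_everything(seed)` at the start of a run and dump `run_manifest(seed, data_path)` as JSON alongside the metrics, so two runs that disagree can at least be diffed on seed, library/CUDA versions, and dataset hash.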

I'm trying to build a mental model of which of these matters most for which kind of work.
