Software reproducibility wishlist - lmmx/devnotes GitHub Wiki

  • I wish pip-chill worked with conda
    • is there an equivalent? failing that, can you at least identify what was installed through conda, then propagate its deps so they can be excluded from the pip-chill call?
  • I wish miniconda could read anaconda configs
    • unclear why they aren’t readable (e.g. the e4e env); could probably set this up in an LXC container to test
  • I wish there was a convention to specify the full CONDA setup in a single markdown block
    • make CONDA_SETUP.md a thing... or at least just keep making PR/issues on projects which could use them and maybe it’ll catch on
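
As a rough sketch of the first wish (not an existing pip-chill feature): `conda list --json` reports a `channel` for each package, and packages installed through pip inside a conda env appear to be reported with the channel `pypi`. Assuming that holds, the conda-installed names can be separated out and dropped from pip-chill's `name==version` lines. The function names below are hypothetical, and the name matching is naive: conda and PyPI names sometimes differ (e.g. `pytorch` vs `torch`), so treat the result as a starting point.

```python
import json
import subprocess


def list_env_packages(env_name=None):
    """Run `conda list --json` (optionally for a named env) and parse the output."""
    cmd = ["conda", "list", "--json"]
    if env_name is not None:
        cmd += ["--name", env_name]
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return json.loads(result.stdout)


def split_by_installer(packages):
    """Split `conda list --json` records into (conda_names, pip_names)."""
    conda_names, pip_names = [], []
    for record in packages:
        # Assumption: pip-installed packages are reported with channel "pypi".
        target = pip_names if record.get("channel") == "pypi" else conda_names
        target.append(record["name"])
    return sorted(conda_names), sorted(pip_names)


def drop_conda_packages(requirement_lines, conda_names):
    """Keep only `name==version` lines whose name was not installed by conda."""
    # Naive normalisation: lowercase and treat "_" and "-" as the same character.
    skip = {name.lower().replace("_", "-") for name in conda_names}
    kept = []
    for line in requirement_lines:
        name = line.split("==")[0].strip().lower().replace("_", "-")
        if name and name not in skip:
            kept.append(line)
    return kept
```

Usage would be something like `conda_names, _ = split_by_installer(list_env_packages())`, followed by `drop_conda_packages(chill_lines, conda_names)`, where `chill_lines` is the captured output of `pip-chill` split into lines.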

If the repos released alongside ML research were multi-group projects rather than the work of individual groups, a more serious effort to formalise the requirements would be inevitable.

Instead we get a Colab link and a demo.py:

  • The Colab notebook has !-prefixed shell commands that involve a lot of Colab-specific junk
  • it’s rarely clear how strict the requirement to pin a particular version of Python/TensorFlow/etc. is
    • or what in particular would rule out earlier/later versions
  • Since running this code at all is GPU-specific, it’s not clear what kinds of guarantees CI testing could actually provide. Still, it’s an interesting idea to at least try calling the -h flag as a “test pipeline” in a GitHub Action (after doing it the first time, could just fork and copy it across from the prior example...)
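
A minimal sketch of what that -h “test pipeline” could run, assuming the script uses something like argparse so that `-h` prints help and exits with status 0. The function name `help_smoke_test` is hypothetical. Any top-level imports in the script run before argument parsing, so this would at least catch missing or broken dependencies without needing a GPU; it says nothing about whether the model actually runs.

```python
import subprocess
import sys


def help_smoke_test(script_path, timeout=60):
    """Return True if `python script_path -h` exits with status 0."""
    result = subprocess.run(
        [sys.executable, script_path, "-h"],
        capture_output=True,  # keep help text and tracebacks out of the CI log
        text=True,
        timeout=timeout,  # raises subprocess.TimeoutExpired if the script hangs
    )
    return result.returncode == 0
```

In a GitHub Action this could be a single step that calls `help_smoke_test("demo.py")` after installing the pinned requirements, and fails the job if it returns False.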