Lockfiles #14

Open
jmarshrossney opened this issue Jul 22, 2024 · 8 comments
Labels
wontfix This will not be worked on

Comments

@jmarshrossney
Collaborator

Lockfiles contain a list of all the dependencies, both direct and indirect, of a package, pinned to exact versions. They are necessary though not sufficient for fully reproducible environments.
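As a rough illustration of what "pinned to exact versions" means, here is a minimal sketch that snapshots the Python distributions installed in the current environment as `name==version` lines, much like `pip freeze`. The function name `pinned_requirements` is made up for this example. It is not a real lockfile: it records no hashes, no platform information and no non-Python (conda) packages, which is part of why pins alone are not sufficient for reproducibility.

```python
from importlib.metadata import distributions


def pinned_requirements():
    """Return sorted 'name==version' pins for every installed distribution.

    Hypothetical helper for illustration only; a real lockfile would also
    record hashes, platforms and where each package came from.
    """
    pins = set()
    for dist in distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions with broken or missing metadata
            pins.add(f"{name}=={dist.version}")
    return sorted(pins, key=str.lower)


if __name__ == "__main__":
    # Direct and indirect dependencies are listed together, which is the point.
    print("\n".join(pinned_requirements()))
```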

Another advantage of creating an environment from a lockfile is that you skip the often slow solving step, which is particularly annoying when you have to do it multiple times because one of the packages has introduced a bug (e.g. in a recent update) which you haven't yet spotted.

Obviously, if you're developing a package that you want to support over a wide range of package versions then you might not be so interested in avoiding these kinds of problems, but I think we are more interested in experimenting with the science right now, so I don't immediately see a downside to locking our dependencies.

I would have liked to introduce conda-lock in #13 but unfortunately this does not seem possible while we depend on plankton-cefas-scivision.

Another option is abandoning conda and using pip to install everything, but I don't expect that to be popular, nor am I really pushing it.

One of the more likely ways out of this is that we no longer depend on plankton-cefas-scivision, e.g. if we were to train our own model or if Turing come out with a new offering.

@jmarshrossney
Collaborator Author

Fun fact I just discovered: In principle we can specify all dependencies in pyproject.toml and from this create both conda and venv virtual environments with help from conda-lock.

@metazool
Collaborator

This is a useful fun fact that would have immediate wider value! The current minimal approach to pyproject.toml originally came from this post

Even if the model is likely to be deprecated, there are bound to be similar scenarios in future. The other project where there is a choice of conda / venv, along with environment management issues, and where a clean recommendation would be immediately relevant, is building on open-cd...

@mattjbr123

> The current minimal approach to pyproject.toml originally came from this post

Very useful little blog this! Thanks for sharing as always :)

@mattjbr123

mattjbr123 commented Aug 15, 2024

If I've understood what conda-lock is doing correctly, I used to do something similar with
conda list --explicit > envfile.yml
which would create an environment file (essentially a list of urls) which you could pass to other people to replicate your environment, and I think it just installed everything in the file without using the solver.
conda create --name envname --file envfile.yml (or something like that)

@jmarshrossney
Collaborator Author

Hey @mattjbr123 that sounds entirely reasonable and probably works fine in 99% of cases, but I don't think it's entirely bulletproof.

I could be wrong, but I wouldn't expect it to skip the solve step, cos there's still a chance the file was created from a broken environment, and how would it know without solving the environment first to check?

Apart from that, the issue with conda list --explicit is that it doesn't include pip installed dependencies.
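To make the point about pip concrete, here is a small sketch that parses the kind of file `conda list --explicit` writes: a few comment lines, an `@EXPLICIT` marker, then one conda package URL per line. The sample URLs and build strings below are illustrative only, and `parse_explicit` is a made-up helper. Every entry is a conda package archive, so anything installed with pip has nowhere to appear.

```python
from urllib.parse import urlparse

# Illustrative sample; the package builds shown here are assumptions.
SAMPLE = """\
# This file may be used to create an environment using:
# $ conda create --name <env> --file <this file>
# platform: linux-64
@EXPLICIT
https://conda.anaconda.org/conda-forge/linux-64/python-3.11.4-hab00c5b_0_cpython.conda
https://conda.anaconda.org/conda-forge/noarch/pip-23.2.1-pyhd8ed1ab_0.conda
"""


def parse_explicit(text):
    """Return (name, version, build) for each package URL in an explicit file."""
    packages = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments and the @EXPLICIT marker.
        if not line or line.startswith("#") or line == "@EXPLICIT":
            continue
        url = line.split("#", 1)[0]  # drop an optional "#<md5>" suffix
        filename = urlparse(url).path.rsplit("/", 1)[-1]
        for ext in (".conda", ".tar.bz2"):
            if filename.endswith(ext):
                filename = filename[: -len(ext)]
                break
        # Conda archives are named <name>-<version>-<build>.
        name, version, build = filename.rsplit("-", 2)
        packages.append((name, version, build))
    return packages


if __name__ == "__main__":
    for name, version, build in parse_explicit(SAMPLE):
        print(name, version, build)
```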

I'm definitely a fan of what conda is trying to do with standardising environments but comments like this one make me want to disengage!

@jmarshrossney
Collaborator Author

Reading this comment by one of the (main?) conda-lock maintainers.

The main things relevant to us are:

  • conda-lock seems like it might be struggling under its maintenance burden, and the fact that its maintainers are discussing endorsing another project seems like something to pay attention to.
  • The project they are considering endorsing is called pixi, which looks and feels very similar to poetry, including native lockfile support, but uses conda environments under the hood so can deal with non-Python dependencies.
  • pixi is a substitute for the entire conda workflow (conda activate, conda install etc), kind of like how poetry replaces the venv/pip workflow while using both tools under the hood.

So I'm planning to keep an eye on this and will report back!

@jmarshrossney
Collaborator Author

I tried to incorporate conda-lock into this project, see #26.

I found it OK in the distant past, when I only used conda and almost never pip, but in this case it has been quite frustrating and ultimately the single-sourcing idea failed.

I'm inclined to keep using simple python-specific tools/lockfiles and just accept that environments are non-reproducible at the level of cuda etc.

@metazool
Collaborator

The extended self-dialogue in the comments on #26 offers an interesting learning experience that others won't have to go through! I suggest we close this as wontfix.
