Single-source pip and conda dependencies #26

Closed
wants to merge 17 commits into from


Collaborator

@jmarshrossney jmarshrossney commented Aug 25, 2024

As mentioned in #14, it 'should' be possible to specify all of our dependencies in pyproject.toml, from which various lockfiles and environment files for either venv or conda environments can be derived.

If you only care about managing/locking python dependencies there are several good options: pip-tools, poetry, hatch, pdm, or even just pip freeze. That bit is well-trodden!

On the more experimental side, conda-lock claims to be able to produce lockfiles for conda environments with mixed conda/pip dependencies from a pyproject.toml.

This PR tracks some experimentation with doing this for our project.

It's not really crucial for us - we don't have many deps so managing both pyproject.toml and environment.yml isn't going to be a huge burden. But I'm still curious.

[needs rebasing on #24]
[please squash before merging, if merging at all!]

pyproject.toml Outdated
@@ -18,13 +18,13 @@ dependencies = [
 "scikit-image", # secretly required by intake-xarray as default reader
 "torch",
 "xarray",
-"resnet50-cefas@git+https://github.com/jmarshrossney/resnet50-cefas",
+#"resnet50-cefas@git+https://github.com/jmarshrossney/resnet50-cefas",

conda-lock fails here. According to the documentation it can handle this kind of dependency when the poetry dependency specification is used, but the documentation doesn't say how to handle it for PEP 621 style pyproject.toml files.

This might be to do with conda-lock using poetry under the hood..
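To see which entries are affected, here is a minimal sketch that separates ordinary dependency strings from direct references of the form name@url. The helper name split_direct_references is hypothetical, and the parsing is deliberately naive (it ignores extras and environment markers), so treat it as an illustration rather than a replacement for a proper requirement parser.

```python
def split_direct_references(dependencies):
    """Partition dependency strings into (plain, direct) lists.

    A direct reference has the form ``name@url``. This naive check looks
    for an '@' followed by something that contains '://'.
    """
    plain, direct = [], []
    for dep in dependencies:
        name, sep, url = dep.partition("@")
        if sep and "://" in url:
            direct.append((name.strip(), url.strip()))
        else:
            plain.append(dep.strip())
    return plain, direct


# Dependency strings taken from the diff above.
deps = [
    "scikit-image",
    "torch",
    "xarray",
    "resnet50-cefas@git+https://github.com/jmarshrossney/resnet50-cefas",
]
plain, direct = split_direct_references(deps)
print("plain:", plain)
print("direct:", direct)
```

Only the entries in direct would need special handling; the plain ones are the kind of dependency that did not trigger the failure described above.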

pyproject.toml Outdated
-dev = ["pytest", "black", "flake8", "isort"]
-all = ["cyto_ml[jupyter,dev]"]
+dev = ["pytest", "black", "flake8", "isort", "pip-tools", "conda-lock"]
+#all = ["cyto_ml[jupyter,dev]"]

This pattern for collecting dependencies together also doesn't work with conda-lock as it tries to look up cyto_ml and doesn't find it.
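One possible workaround, sketched below under some assumptions, is to expand the self-referencing extra into a flat list before handing it to a tool that can't resolve cyto_ml. The function flatten_extras is hypothetical, and the contents of the jupyter group are a placeholder because they are not shown in this diff.

```python
import re

# Matches a self-reference such as "cyto_ml[jupyter,dev]".
_SELF_REF = re.compile(r"^\s*([A-Za-z0-9._-]+)\s*\[([^\]]+)\]\s*$")


def _normalise(name):
    """Compare project names case-insensitively, treating -, _ and . alike."""
    return re.sub(r"[-_.]+", "-", name).lower()


def flatten_extras(project_name, extras):
    """Return a copy of `extras` with self-references replaced by their contents."""

    def expand(group, seen):
        if group in seen:
            raise ValueError(f"circular extra: {group}")
        result = []
        for dep in extras[group]:
            match = _SELF_REF.match(dep)
            if match and _normalise(match.group(1)) == _normalise(project_name):
                # Replace "project[a,b]" with the contents of groups a and b.
                for sub in match.group(2).split(","):
                    result.extend(expand(sub.strip(), seen | {group}))
            else:
                result.append(dep)
        return result

    return {group: expand(group, frozenset()) for group in extras}


extras = {
    "jupyter": ["jupyterlab"],  # placeholder: real contents not shown in the diff
    "dev": ["pytest", "black", "flake8", "isort"],
    "all": ["cyto_ml[jupyter,dev]"],
}
flat = flatten_extras("cyto_ml", extras)
print(flat["all"])
```

The flattened all group no longer mentions cyto_ml, at the cost of duplicating the group contents that the self-reference was meant to avoid.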

@jmarshrossney

The next thing I tried was to explicitly split into pip and conda dependencies by adding default-non-conda-source = "pip" to the [tool.conda-lock] table as instructed here, and adding a new table [tool.conda-lock.dependencies] for the conda deps.

I don't actually want to have to do this at all, because I don't want to remove anything from [project.dependencies] since that will break the pip installation.

Anyway, I tried to test this approach by demanding that only python is installed through conda and the rest is pip installed...

# in pyproject.toml
[tool.conda-lock]
channels = [
    "pytorch", "conda-forge"
]
platforms = [
    "linux-64", 
    "win-64"
]
default-non-conda-source = "pip"

[tool.conda-lock.dependencies]
python = ">=3.12"

... but it failed with

File "/home/joe/Projects/Plankton/cyto_ml/venv/lib/python3.12/site-packages/conda_lock/lockfile/__init__.py", line 38, in _seperator_munge_get
    return d[key.replace("_", "-")]
           ~^^^^^^^^^^^^^^^^^^^^^^^
KeyError: 'pip'

I don't understand what's going on here. My tentative conclusion is that single-sourcing dependencies in pyproject.toml, and using conda-lock to generate lockfiles and environment files for conda environments from it, is only well tested and robust when using poetry.
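For comparison, the split I was aiming for (python from conda, everything else from pip) can be written out by hand as a plain environment.yml. The sketch below does not use conda-lock at all and produces an unlocked environment file, not a lockfile; the function name render_environment_yml and the environment name are assumptions for illustration.

```python
def render_environment_yml(name, channels, conda_deps, pip_deps):
    """Render a minimal conda environment file as text.

    conda_deps are installed by conda; pip_deps go under the nested pip section.
    """
    lines = [f"name: {name}", "channels:"]
    lines += [f"  - {channel}" for channel in channels]
    lines.append("dependencies:")
    lines += [f"  - {dep}" for dep in conda_deps]
    if pip_deps:
        # pip itself must be a conda dependency for the pip section to work.
        lines.append("  - pip")
        lines.append("  - pip:")
        lines += [f"      - {dep}" for dep in pip_deps]
    return "\n".join(lines) + "\n"


text = render_environment_yml(
    name="cyto_ml",  # assumed environment name
    channels=["pytorch", "conda-forge"],
    conda_deps=["python>=3.12"],
    pip_deps=["torch", "xarray"],
)
print(text)
```

This still leaves two places to maintain unless pip_deps is read from pyproject.toml, and it pins nothing, so it is not a substitute for a working conda-lock setup.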

@jmarshrossney

So I failed at single-sourcing and I don't think there's much point in investing more in this for the time being.

I have added some documentation to the README.md to explain how to create reproducible environments using either pip + requirements.txt or conda-lock + conda-lock.yml, and how to update the lockfiles.

I created the lockfiles as follows:

pip-compile --extra jupyter --extra dev --output-file requirements.txt pyproject.toml  # requirements.txt
conda-lock lock --mamba -f environment.yml  # conda-lock.yml

In both cases you need to run python -m pip install --no-deps . to install cyto_ml itself.
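The order of steps matters, so here is a rough sketch that just spells out the commands for each route without running them. The function name install_commands and the environment name are assumptions; the conda route also assumes the new environment is activated before the final pip step.

```python
import shlex


def install_commands(manager, env_name="cyto_ml"):
    """Return the ordered commands to recreate an environment from a lockfile."""
    if manager == "pip":
        steps = [["python", "-m", "pip", "install", "-r", "requirements.txt"]]
    elif manager == "conda":
        steps = [["conda-lock", "install", "--name", env_name, "conda-lock.yml"]]
    else:
        raise ValueError(f"unknown manager: {manager}")
    # The lockfiles cover dependencies only, so the package itself is
    # installed last, without letting pip touch the locked dependencies.
    steps.append(["python", "-m", "pip", "install", "--no-deps", "."])
    return steps


for manager in ("pip", "conda"):
    print(f"# {manager}")
    for step in install_commands(manager):
        print(shlex.join(step))
```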

Does this seem like overkill? Yes, I think it does. This is where something like poetry (and potentially pixi?) is nice: it makes lockfiles part of the standard workflow rather than an extra effort where you have to think carefully about what you're doing, in what order, and so on.

@jmarshrossney

On a final note, both lockfiles have all of the dependencies, including the optional jupyter and dev groups.

It is not difficult to maintain multiple lockfiles for different dependency groups with either pip-compile or conda-lock, but I don't see the point right now.

conda-lock.yml Outdated
@@ -31,7 +31,7 @@ package:
manager: conda
platform: win-64
dependencies: {}
url: https://conda.anaconda.org/conda-forge/win-64/_libavif_api-1.1.1-h57928b3_0.conda

I don't know why this has happened. I ran conda-lock --update matplotlib to test updating a lockfile..

requirements.txt Outdated
@@ -2,7 +2,7 @@
# This file is autogenerated by pip-compile with Python 3.12
# by the following command:
#
-# pip-compile --extra=dev --extra=jupyter --output-file=requirements.txt pyproject.toml
+# pip-compile

Here I ran pip-compile --upgrade and it dropped all of the optional dependencies :/
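A likely explanation, which I have not verified, is that pip-compile does not read its options back from the header of an existing requirements.txt, so the extras have to be passed again on every run. A small sketch of a helper that always builds the full command (the name pip_compile_command is hypothetical; the flags are the ones used above plus --upgrade):

```python
def pip_compile_command(
    extras,
    upgrade=False,
    output_file="requirements.txt",
    source="pyproject.toml",
):
    """Build a pip-compile command that always repeats the extras."""
    cmd = ["pip-compile"]
    if upgrade:
        cmd.append("--upgrade")
    for extra in extras:
        cmd += ["--extra", extra]
    cmd += ["--output-file", output_file, source]
    return cmd


upgrade_cmd = pip_compile_command(["jupyter", "dev"], upgrade=True)
print(" ".join(upgrade_cmd))
```

Running the printed command instead of a bare pip-compile --upgrade should keep the jupyter and dev groups in the regenerated file, assuming the explanation above is right.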

@jmarshrossney jmarshrossney mentioned this pull request Aug 25, 2024
@jmarshrossney

Sorry - I didn't realise that chaining PRs from a fork would be quite so messy, though I should have. The relevant commits start with a1e7acf - conda-lock working example. Everything before that is from #24 and earlier. I thought you could change base branches after opening but of course you can't with the fork-and-PR workflow.

In future I will use branches..

@jmarshrossney

I don't think these are worthwhile additions, though it was an interesting exercise! Closing :)
