Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modify environment.yml #13

Merged
merged 2 commits into from
Jul 29, 2024
Merged

Conversation

jmarshrossney
Copy link
Collaborator

Overview

This PR makes a small change to environment.yml so that

  1. Only direct dependencies are included
  2. Everything that can be installed using conda install is installed using conda install, instead of pip install
  3. chromadb is pinned to 0.5.3

I also modified the supported python versions in pyproject.toml to reflect the fact that we aren't supporting <=3.8 as far as I can tell.

Finally, I've slightly tweaked the installation instructions in README.md.

Dependencies

Issues like the one we've just encountered with chromadb 0.5.4 are going to occur, and it's in these situations where I always seem to end up rediscovering how brittle conda environments with mixed conda and pip dependencies are - see e.g. this article. Concretely, it took me about 5 attempts to create an environment that passed the tests by manually installing the packages in the existing environment.yml.

Acknowledging that there is a reasonable case for using conda here since there are several non-Python dependencies that we might one day want to exert control over (but like... will we?), I've changed environment.yml so as to use conda install instead of pip wherever possible.

I think the direct dependencies we need are

  • pytorch 1.10.0
    • I have to downgrade MKL mkl=2024.0 to fix this bug
  • pandas
  • python-dotenv
  • s3fs
  • pytest
  • intake-xarray
    • This depends on scikit-image although it does not list it as a dependency...hmm
  • chromadb (pinned to 0.5.3 for now)
  • scivision
  • resnet50_cefas (plankton-cefas-scivision)

The rest are I think indirect.

Of the direct dependencies, only scivision and resnet50_cefas cannot be installed using conda install.

So to me a sensible environment.yml looks something like this

name: cyto_39
channels:
  - pytorch
  - conda-forge
  - defaults
dependencies:
  - python=3.9
  - pytorch=1.10.0
  - mkl=2024.0  # see https://github.com/pytorch/pytorch/issues/123097
  - chromadb=0.5.3
  - intake-xarray
  - scikit-image  # intake-xarray dependency
  - pandas
  - pytest
  - python-dotenv
  - s3fs
  - pip
  - pip:
    - scivision
    - git+https://github.com/alan-turing-institute/plankton-cefas-scivision@main

If I then run the following in the root of the repository,

$ conda env create -f environment.yml
$ conda activate cyto_39
$ python -m pip install .
$ python -m pytest

I get 1 failed, 4 passed, 9 warnings in 4.74s where the failed test is just the device mismatch mentioned in #5, and easily fixable.

Notes

I also have channel_priority set to strict in my .condarc, which is now recommended.

Some of the failed attempts resulted in IOError: [Errno 9] Bad file descriptor for reasons I failed to understand.

This was referenced Jul 22, 2024
@metazool metazool self-assigned this Jul 23, 2024
@metazool
Copy link
Collaborator

The changes look good and run clean for me locally.

I'm puzzled by the coverage workflow error. You can see the coverage report generates fine. "Resource not accessible by integration" is ostensibly often a permissions issue, but our project settings show the correct level of read-write

I'm guessing this is because you appear as an outside contributor (though with permissions inherited from group membership) the workflow is failing at the point of posting the coverage report as an issue comment. Can try adding you as a direct contributor, also welcome to make branches and PRs directly in this project rather than via a fork. Happy to approve anyway

Copy link
Collaborator

@metazool metazool left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll have a quick look at the action comment permissions before logging off, but glad of contribution!

@metazool
Copy link
Collaborator

metazool commented Jul 29, 2024

Going to merge this on the grounds that the failing check looks like an artefact of unexpected behaviour of proprietary CI, can always back out if that's somehow not the case

@metazool metazool merged commit 773f7a3 into NERC-CEH:main Jul 29, 2024
2 of 3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ImportError undefined symbol: iJIT_NotifyEvent encountered when MKL 2024.1 is installed.
2 participants