Should we endorse Pixi? #615

Open
maresb opened this issue Mar 4, 2024 · 13 comments

Comments

@maresb
Contributor

maresb commented Mar 4, 2024

While I have not yet had the chance to transition to Pixi in any of my projects, I'm convinced that this is the future of lockfiles in the Conda ecosystem. Moreover, they have a full-time team moving things forwards.

I think that conda-lock will still play a role with supporting a large number of existing projects since the effort of retooling can be hard to justify. But I'd hope that new projects would look primarily at Pixi.

Dually, I'd be very curious to hear from people who have tried both: which things does conda-lock do that are not yet adequately supported by Pixi?

@synapticarbors

I've just started experimenting with pixi and there are definitely a lot of things to like. The biggest issue that I've seen so far is just the difference in workflow/philosophy from conda. Pixi is directory-based, whereas conda/mamba has a global registry of environments. With conda I use the same environment for several projects, and will also do stuff like work in a scratch directory to fire up some jupyter-lab instance to do some quick analysis, or just open up a terminal, run mamba activate <env> and then work from there.

Pixi requires you to cd into a known location and run pixi shell or pixi run, or make use of the output of pixi shell-hook, which again has to live somewhere. So this isn't necessarily about features related to conda-lock, but is more a question about whether pixi itself is a full drop-in replacement for all conda workflows, and about adoption. There's been some discussion of pixi supporting a global registry of environments (prefix-dev/pixi#188), but this difference in workflows is an opinionated decision on the part of the pixi developers and isn't likely to change at least in the short term.

I think one thing to ask is if there is a world where you could use pixi to generate lock files and those lock files would be usable directly from conda/mamba to build environments. Then pixi could be used in place of conda-lock, without the requirement of adopting the full pixi workflow.

@maresb
Contributor Author

maresb commented Mar 4, 2024

Thanks @synapticarbors for the feedback, it's really valuable, especially since I haven't found the time to investigate these details myself!

Regarding your question, someone asked something very similar here and received an answer.

In order to get the x-link, the answer directs here: prefix-dev/pixi#800

@maresb
Contributor Author

maresb commented Jul 21, 2024

I'm trying to develop some ideas for how to proceed with this, while being honest about my lack of time for maintenance.

README and migration guide

I propose adding a notice to the README, something like:

The conda-lock maintainers recommend using pixi for new projects. Pixi is evolving at a much faster pace and provides better support, so we also recommend considering migrating existing projects when feasible.

We might also consider some sort of migration guide separate from the README and make roughly the following points:

  • pixi is based on projects rather than environments
  • pixi is an all-in-one project management tool analogous to Poetry
  • Groups of dependency specifications are called "features". Features are combined into "environments".
  • Pixi "environments" are stored within the project directory (like PDM) and are normal Conda environments.
  • In Python projects, even when using pyproject.toml in place of pixi.toml, the pixi dependencies are independent of the project.dependencies from the pyproject.toml and thus currently require repeating the dependencies from project.dependencies in the pixi.toml's dependencies or pypi-dependencies sections.
  • (And maybe some others?)

Missing features

Then there's the question of what functionality is missing in the gap between pixi and conda-lock?

Some things that come to mind:

  • There's no way to conda-lock install a pixi.lock to a global environment outside the project directory, though there are many proposals floating around for such functionality (discussion).
  • Conda-lock currently parses sources from pyproject.toml, meta.yaml, and environment.yaml files, while pixi doesn't (discussion).
  • (Any other functionalities that people are missing?)

What should we do?

It's plausible that one or both of these features will arrive soon in pixi. On the other hand, given what we currently have in conda-lock, I wonder if we could quickly provide solutions ourselves.

  • Given all the code we have already written, maybe it would be easy for us to parse pixi.lock files and render them to the explicit format that can be installed using your Conda command of choice. I think this would make most sense as a separate project cannibalized from conda-lock code. This idea originates from @bollwyvl. (As for accessing the lockfiles, after a brief attempt at searching for a formal schema, I asked.)

  • As for the dependency source files, I was thinking it would be helpful to have a pre-commit style dependency updater. This could be a Python CLI that takes in a bunch of -f dependency-source arguments like conda-lock currently does, and then modifies a pixi.toml to ensure the dependencies are correctly reflected in the pixi.toml. We already have the function make_lock_spec for turning files into dependency specifications, and the specifications are even in the form of a LockSpecification Pydantic model.
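
As a rough illustration of the pre-commit style updater idea, here is a minimal sketch. It works on plain dictionaries: `wanted` stands in for whatever specifications would be extracted from the source files (the real `LockSpecification` model is richer and is not used here), and `sync_dependencies` is a hypothetical helper name. It only adds or updates entries and never removes dependencies that are absent from the sources.

```python
def sync_dependencies(manifest: dict, wanted: dict[str, str]) -> list[str]:
    """Update manifest["dependencies"] in place so it reflects `wanted`.

    Returns a list of human-readable changes. An empty list means the
    manifest was already up to date, so a pre-commit hook could exit 0.
    """
    deps = manifest.setdefault("dependencies", {})
    changes = []
    for name, spec in sorted(wanted.items()):
        current = deps.get(name)
        if current is None:
            changes.append(f"add {name} {spec}")
        elif current != spec:
            changes.append(f"update {name} {current} -> {spec}")
        else:
            continue  # already in sync, nothing to write
        deps[name] = spec
    return changes


# Hypothetical data: `wanted` stands in for specs parsed from source files.
manifest = {"dependencies": {"python": "3.11.*", "numpy": ">=1.26"}}
wanted = {"python": "3.12.*", "numpy": ">=1.26", "pandas": ">=2"}

changes = sync_dependencies(manifest, wanted)
print(changes)
```

Returning the list of changes instead of printing inside the function keeps the hook logic simple: a non-empty list means the file was modified and the commit should be retried.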

I'm very curious what people think, and if anyone's willing to tackle any of this?

@synapticarbors

Just as a note, I recently wrote a prototype tool to render a pixi.lock to an explicit conda environment spec file that can be installed by conda/mamba:
https://github.com/synapticarbors/pixi2ces

I've been talking to some of the pixi devs about developing it into the start of the pixi export command. I just submitted a PR to add a to_spec_string to the ExplicitEnvironmentSpec struct in rattler, which will then be used downstream in pixi.
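
For readers unfamiliar with the explicit format, here is a minimal sketch of what rendering to it involves. This is not the pixi2ces or rattler implementation: `LockedPackage` and `render_explicit` are hypothetical names, the URL and hash are placeholders, and the sketch assumes the conda package URLs and md5 hashes have already been extracted from the lockfile in install order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LockedPackage:
    """Hypothetical minimal record; real lockfiles carry more fields."""

    url: str
    md5: str


def render_explicit(packages: list[LockedPackage], platform: str) -> str:
    """Render packages as an explicit spec file: one URL#md5 per line."""
    lines = [f"# platform: {platform}", "@EXPLICIT"]
    lines += [f"{pkg.url}#{pkg.md5}" for pkg in packages]
    return "\n".join(lines) + "\n"


# Placeholder URL and hash, not real artifacts.
packages = [
    LockedPackage(
        url="https://example.invalid/linux-64/demo-1.0-0.conda",
        md5="0" * 32,
    ),
]
print(render_explicit(packages, "linux-64"), end="")
```

The resulting file is what tools like `conda create --file` consume, which is why it works as a common denominator between lockfile producers.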

@bollwyvl
Contributor

endorse

I think in general, having a call-out for Alternatives would be sufficient, probably with ❌/✅ feature table. For example, conda-lock will bootstrap on most platforms given only pip but the same cannot be said for a number of other tools (including pixi).

As to integrating directly with pixi... as a solver, as long as it doesn't provide @EXPLICIT or the same output as --dry-run --json, it's not going to be very useful. If it had those, being able to say conda-lock --pixi would be grand, but would be an entirely separate code path, and would almost certainly entail building more tmp folders, and need a bunch more switches.

I don't know if re-implementing the logic for its feature system makes any sense to maintain long term.

pixi2ces

This looks great as inspiration, and pixi/rattler providing this as a first-party, rather than adding another link in the toolchain, will be welcome! What we don't need today (or in the near future) are more file formats in more versions parseable only/partially by some tools at some versions. @EXPLICIT is terrible (md5 gah), but works everywhere.

Otherwise: pixi is trying to do an awful lot. A pixi project can make for good process but doesn't yet offer a compelling product for the knowledge trapped in the manifest file. This is what truly kills me about heavy process tools such as mvn: they focus on the build as the product, which creates true nightmares downstream. I don't know if a pixi-constructor will have some plan for that, e.g. actually embedding a pixi runtime in the installer, but it would probably feel pretty bad as an end user that didn't sign up for the complexity and just wants, likely, one, well-documented CLI command/desktop shortcut to work really well.

I have found the wheel-related stuff to be underwhelming (and the magic it does vs pyproject.toml), so end up using pixi.toml and pip install (usually -e, but sometimes worse things) inside an environment. But then I never used the pip features in conda-lock, or its YAML lockfile format, either. pixi has gotten me onboard to checking in their YAML (or will it be TOML?), but I treat it as entirely opaque, and it would be best to rely on the documented CLI behavior. Sadly the CLI output does not yet have a JSON schema, which increases integration risk.

cannibalized

Before adding anything here, or copy pasting any code into new packages, perhaps these morsels could be fully consumed. conda-lock is still a very useful tool (not a mindset), and a lot of product pipelines depend on it. However, I doubt any of them use all of the optional features, but they are always paying for them. I'd see splitting most of them into much more focused, standalone packages (e.g. conda-lock-git, and not even bothering with [extras]), and tying them together with entry_points, leaving conda-lock-core as a python-only library with absolutely minimal dependencies.

@maresb
Contributor Author

maresb commented Jul 23, 2024

Thanks so much @bollwyvl for your input!!! I'm struggling a bit to unpack everything. I'd really appreciate it if you could help me understand your viewpoint.

bootstrap on most platforms given only pip
...
leaving conda-lock-core as a python-only library

Why do you prefer Python-only code? Is this a security/auditability thing?

being able to say conda-lock --pixi would be grand, but would be an entirely separate code path, and would almost certainly entail building more tmp folders, and need a bunch more switches

In my head I've actually been envisioning conda-lock-pixi as a separate package, or maybe even a rust app, that basically manages the tmp folders, lets pixi do its thing, and transforms the result into something useful. I think pixi would do most of the heavy lifting, so as long as pixi is behaving properly I'd hope the conda-lock-pixi side would be fairly straightforward. Do you see this as problematic?

I'm confused by the "Otherwise: ..." paragraph about conda-constructor functionality. It seems like you wish there were a CLI interface for extracting info, but I don't understand how this relates to constructor.

pixi has gotten me onboard to checking in their YAML (or will it be TOML?), but I treat it as entirely opaque, and it would be best to rely on the documented CLI behavior. Sadly the CLI output does not yet have a JSON schema, which increases integration risk.

I've been thinking that py-rattler may be a good way to interface with the underlying data structures from Python.

perhaps these morsels could be fully consumed

Which morsels, and what specifically do you mean by "consumed"?

I'd see splitting most of them into much more focused, standalone packages (e.g. conda-lock-git, and not even bothering with [extras]), and tying them together with entry_points, leaving conda-lock-core as a python-only library with absolutely minimal dependencies.

I really love this idea, and I wish I had the time to do it myself. Here I get what you're saying, so I can explain it for readers who are missing context. This is basically a complaint that conda-lock has some heavyweight dependency requirements that support features only used by some. For example, we have gitpython, which some time ago had a security advisory that caused lots of grief. Also, the Poetry dependency constantly causes a lot of grief because Poetry is designed to be a standalone tool that operates in its own environment, but we're entangled with its internals, which makes it a pretty big effort to maintain (#637).

@bollwyvl
Contributor

prefer Python-only

The doorhandle counts. The recommended install option (curl | bash) is a nasty "modern" smell. Starting from a "dumb" box that only has a system python, pip install conda-lock can bootstrap up to something that works.

Otherwise

conda-lock is a tool that takes in files, runs some subprocess, and puts out files. It doesn't claim to do much more than that (though, as lauded above, it can actually install environments as a byproduct of having a $CONDA_EXE).

pixi (and related) are an ethos. As with other heavy workflow tools, a team's product can end up being pixi run get-value, as opposed to pixi run build-shippable-thing-what-provides-value. For a lot of toolchains, smaller Unix-ish tools that do one thing, but over agreed-upon disk artifacts, will remain attractive.

If pixi grew an installer builder (after growing a .conda package builder), one could imagine it actually shipping white-labelled pixi run get-value aliased as my-cool-thing. Which would be slick. But they would probably not have the properties of a well-understood conda environment.

"consumed"?

Move everything other than bog-standard, no-magic-comments, multiple environment.yml parsing into contrib/ packages. Initially, at least, release them all with == pins on conda_lock. If they later graduate into their own repos and version lines, with claimed and tested compatibility, all the better.

Within this repo, have a folder structure like...
pyproject.toml
src/
  conda_lock/
contrib/
  conda-lock-cli/
    pyproject.toml
  conda-lock-git/
    pyproject.toml
  conda-lock-poetry/
    pyproject.toml
  conda-lock-pixi/
    pyproject.toml
  conda-lock-foobarbaz/
    pyproject.toml

The top-level conda-lock package can have [extras] that depend on the other ones (I guess), but for the most part, they are much more loosely coupled.

@jayqi

jayqi commented Jul 29, 2024

As someone who just spent some time today experimenting with a migration from conda-lock to Pixi for a project, I just wanted to give some feedback that my biggest motivation was that Pixi makes minimal updates to lockfiles. conda-lock not supporting this (#370) is one of the biggest pain points I have in using it.

@maresb
Contributor Author

maresb commented Jul 30, 2024

Yes @jayqi, I agree that the minimal updates are very nice.

Can you provide any feedback regarding the actual migration process? For example, did you encounter any annoyances when transitioning from an environment-based workflow to a project-based workflow? It'd be great to be able to provide a migration guide.

@zmbc

zmbc commented Aug 2, 2024

The recommended install option (curl | bash) is a nasty "modern" smell.

It's worth noting that pixi is just a binary. The script at https://pixi.sh/install.sh downloads it and puts it on the path. That's it. You can easily do that manually from their releases page.

It's probably a matter of opinion but I am totally fine with pixi or similar being the "root" that installs Python+pip instead of the other way around.

Also, the concern about controlling where environments are stored that @synapticarbors raised is, I think, completely resolved by pixi's detached-environments feature.

Personally I would be in favor of a notice like @maresb proposed.

@maresb
Contributor Author

maresb commented Aug 2, 2024

Thanks a lot @zmbc for the input!

I was looking at py-rattler and it seems to me like most if not all the things we'd need to reimplement conda-lock are already there and wouldn't even require a pixi binary.

@bollwyvl
Contributor

bollwyvl commented Aug 9, 2024

I'll preface all of my comments with: any new-start project I've started recently is based on pixi. It's not yet suitable for larger projects I work on due to some of the limitations pointed out above.

totally fine with pixi or similar being the "root" that installs Python+pip

Kind of the whole point of using a tool like conda-lock is to be able to capture the full provenance chain of a solution, which, indeed, even ensureconda will break if used, but self-update tools make this almost intractable.

For some more context raised on their repo: prefix-dev/pixi#1765

detached-envs

... still creates an unpredictable environment directory prefix, which, clearly, can't then be moved.

wouldn't even require a pixi binary

But then would conda-lock grow a hard dependency on py-rattler? Again, it seems like if this isn't made more modular, the situation will just get worse for a self-proclaimed "lightweight" tool:

> mamba create --dry-run -n foo python | wc
     53     177    2885
> mamba create --dry-run -n foo "conda-lock=0.*" | wc
     79     332    5029
> mamba create --dry-run -n foo "conda-lock=1.*" | wc
    119     572    8947
> mamba create --dry-run -n foo conda-lock | wc
    120     584    9057

@maresb
Contributor Author

maresb commented Aug 29, 2024

I just generated a pixi.toml for a large-ish conda-lock-based project. The code I used can be found in #664.

This could be considered a first step towards enabling pixi as a backend.


5 participants