Rez comparison #25
Hola! Was just tipped about this project, looks interesting!
Would it be possible to write a few lines about whether someone familiar with Rez should consider Wiz, and how it differs? I noticed reference to part of it in #19.
Yes, we need to add a comparison section in the docs indeed! In short, some of the main differences are:
There is also a lot in common:
I'll be looking into a more thorough comparison and will update you :) Are there any specific features of Rez you would like to see a comparison of? |
This is great, thank you. Let me have a think while I use Rez to see which parts I'm most interested in.

Most pressing is the authoring of new packages, which is such a pain with Rez. Partly because of the separation between what you build and what you publish; two hierarchies I need to keep track of. I understand why, and don't really see a way around it though. And partly because the syntax is Python but so alien (everything being hidden globals) that I keep forgetting how to use it, so JSON should help there. Preferably I'd be able to quickly author small projects from the command line.

Speaking of JSON, I know Rez started with JSON as well, but eventually transitioned to Python. I can't pinpoint exactly why, but that could be something interesting to look into, to see whether the problems solved by that decision are problems you are having or will be having as well.

Sorry it's a bit rambly, too many things going on. xD But great summary! |
Can you elaborate on that? Is that because you need to duplicate information about the package (e.g. requirements, description, etc.) between the two? I actually don't have a lot of experience with Rez besides the investigation we conducted a few years ago, so this is great information!
My gut feeling is that using a Python file is handy for applying complex logic when installing a package:

```python
def commands():
    if something:
        env.PYTHONPATH.append("foo")
    elif something_else:
        env.PYTHONPATH.append("bim")
    ...
```

This could backfire, as people would tend to write unnecessarily complex logic which can break the deterministic nature of the tool. But we might be missing some crucial point indeed, it would be interesting to know more about this.
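By contrast, a declarative definition keeps the environment static and easy to reason about. A rough sketch of what that could look like as a Wiz definition (package name and path are made up; see the definition docs for exact keywords):

```json
{
    "identifier": "foo",
    "version": "0.1.0",
    "environ": {
        "PYTHONPATH": "/path/to/foo:${PYTHONPATH}"
    }
}
```
|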
Sure, let me preface by saying that my use of Rez is not like most people's; I use it for development environments and project management. That is, managing the environment for projects (like shot length), in addition to what software goes into each project (like Maya). Most only use it for the latter. Here's what I'm talking about.
Here's what they look like on disk.
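Something like this, simplified (`~/packages` being Rez's default local package path; names made up):

```
~/dev/mypackage/                # source hierarchy, under version control
    package.py
~/packages/mypackage/1.0.0/     # release hierarchy, inside Rez's registry
    package.py
```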
The latter being "built" from the former. Sometimes, to "build" means to compile, which means the former is source code and the latter binary, in which case this separation makes sense. I keep the former under source control, and the latter inside of Rez's registry. But a lot of the time, my source and build hierarchies are identical, like in the case above where the package consists solely of environment variables (e.g. putting ...).

Half of my use of Rez is for just that; environment management. Maybe more. And for that, this separation is a major PITA. I've been toying with some ideas to simplify that, like... And... But at the end of the day, hand-written shell scripts (e.g. PowerShell) are quicker to write and easier to remember. Which is a bummer, because they are also a pain. Out of interest, should Wiz handle cases like these? And if so, what would that look like?

I actually have a lot more on this from a recent venture. That ultimately led to a GUI on top of Rez. For which I later encountered another contender to Rez, called Spack, which has some interesting (and likely familiar) ideas if you haven't had a look already. Not to mention Nix. That should keep you busy for a bit. :) And that also reminds me, apart from Rez, are there any other inspirations for Wiz?
That actually brings me to another question; why Wiz? What made you start a new project, over using one that already exists? It's not exactly a crowded space, so alternatives are good. And from experience, the vast majority of studios also implement their own solution to this problem, despite having access to a free Rez. The difference here is you chose to open source it, which is great. |
Yeah, this is one of the reasons we didn't want to deal with the building process: it forces you to handle every possible building strategy. In your case you might find it easier to install VS manually and create a definition to use it within a custom environment. Or if you are deploying it for multiple users, it might be worth considering a data management system like Puppet or Conda and writing a tool which automatically creates the definition when you install it, or links the definition with an existing VS install (Gitlab-CI, build script, etc.). For instance, the Python installer we wrote is a lightweight extension of Pip which installs each package to a special folder structure instead of targeting a common site-packages directory.
We do have our development setup use Wiz as well. As a very simple example, this is what a development registry can look like:
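For instance, a definition like this in a personal registry (hypothetical identifier and path):

```json
{
    "identifier": "foo-dev",
    "description": "Development version of foo.",
    "environ": {
        "PYTHONPATH": "${HOME}/dev/foo/source:${PYTHONPATH}"
    }
}
```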
We would then run:
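Something along these lines, with the development registry included in the registry search paths (illustrative; exact options are in `wiz --help`):

```
wiz use foo-dev
```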
(We can give you a more specific example if you want more details.) So yes, we are using Wiz for job setups and development, for reasons similar to the ones you seem to be using Rez for.
Yes, ION and Anaconda, but we didn't like the idea of using containers as we wanted to avoid duplicating data.
While we absolutely needed a way to manage environments, we had some good processes in place for building packages, and migrating everything under Rez would have been painful. We wanted the benefit of working with Rez (no containers) while being able to leverage robust data management systems (pip, conda, puppet, etc.) and have the flexibility of using different data management systems depending on the context (development and job setups).
This is exactly where Wiz comes from :) But since we have it running in production, we figured that there might be interest outside of the Mill, which could help us improve it. I think there is a real benefit in seeing the environment management step as a separate layer that can be built upon; hopefully sharing the code can help in finding holistic solutions around these issues :) Thanks a lot for all the links! |
Just to clarify, rez doesn't enforce management of the package data. A rez package definition can refer to a package payload in any location, or not at all. The rez-build tool does install into the package repo alongside the package definition, but that is a matter of convenience - you can build something else on top of the rez API if you want to do something different. E.g. ...
One current limitation is the inability to define per-variant attributes beyond the requirements list (e.g. an explicit install path). That's something I'd like to address. Some other points:

- rez is deterministic unless a package author deliberately does something non-deterministic in their commands function. In practice I've never seen this (beyond obvious cases such as appending some user-specific plugin path, for e.g.)
- rez package definition is serializable; it has to be, because definitions are stored into its memcached server
At its core, rez is a bunch of package definitions in various repos, and a dependency resolver. Once resolved, each package then configures the resulting env, typically by setting/appending env vars. Is this not the same as Wiz? Everything else rez has (rez-build/release and associated tooling for the most part) is an optional extra.

I'm interested in the resolver side of things also. Does Wiz give the same guarantees that rez does? Specifically, you're guaranteed to get the latest possible version of each package in the request, in requested order priority. I tried other resolver implementations in the past (specifically using boolean satisfiability - e.g. https://github.com/niklasso/minisat). Whilst good at finding all possible solves (of which there can be millions), it was not very good at finding the one you want - i.e., the one with the highest package versions, in some deterministic way (and rez provides this determinism as mentioned - priority based on order of package request).

Thx |
I'm curious about this too. I very much underestimated the importance and complexity of this before taking Rez for a spin. On the other hand, although technically deterministic, it still surprises you on occasion because of how deep the dependencies can get. Sometimes you know that a 4th-level dependency incompatibility won't become a problem in a specific circumstance, and then you're left with unrolling the spaghetti.

For less-automated, hands-on, ad-hoc setups (like perhaps tens/hundreds of ad projects), I can imagine a less complex solver being suitable so long as it's predictable. E.g. let the user ask for both Maya 2019 and 2020 and merge the results, or let indirect dependencies be incompatible with each other and transfer some of the burden to the developer/user in exchange for a more predictable resolver and iterative workflow (e.g. "I'll fix that warning later"). |
Thanks for your input, Allan!
It does. We will document the algorithm in more detail (#33), but here is how it works:

1. A graph is created from the initial package requests with all dependencies, including all versions and variants of each package.
2. A graph "combination" is generated with only one variant of each package.
3. ...
You can find some examples of steps 1-2-3 in the benchmark tests; a toy sketch of the idea follows.
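To give a rough idea of what step 2 means (made-up data, not Wiz's actual implementation), generating the "combinations" boils down to a cartesian product over each package's variants:

```python
import itertools

# Variants present in the graph, per package identifier (made-up data).
variants = {
    "foo": ["V2", "V1"],  # ordered by priority, best variant first
    "bah": ["V1"],
}

# Each combination keeps exactly one variant per package; combinations
# are tried in order until one of them resolves without conflicts.
for combination in itertools.product(*variants.values()):
    print(dict(zip(variants, combination)))
```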
Thanks for the link, there are a few package resolvers out there which use SAT solvers, so this is definitely worth exploring further:
I also started to have a look at Pip's new dependency resolver, as the problem they are trying to solve is very similar to ours.
This is a very hard problem indeed! :) |
Regarding your other points:

> rez is deterministic unless a package author deliberately does something non-deterministic in their commands function. In practice I've never seen this (beyond obvious cases such as appending some user-specific plugin path, for e.g.)

This is precisely what we wanted to prevent though. It is sometimes tempting to solve a problem with a simple command, but this could end up being a nightmare to debug (e.g. [this package](https://github.com/predat/rez-packages/blob/ae72b8619b519ff7ae026397ab68e9892a92f441/softs/houdini/16.5.439/package.py)).

> rez package definition is serializable; it has to be, because definitions are stored into its memcached server

Ok, I see what you mean. Is that the serialization logic? https://github.com/nerdvegas/rez/blob/663efc277924bdb353c85869585132f4191b703e/src/rez/serialise.py#L284

It seems quite a lot harder to do than using a [data serialization format](https://en.wikipedia.org/wiki/Comparison_of_data-serialization_formats) though. What was the blocker with YAML? The commands, I suppose? |
Hey Jeremy,

Thanks for the info!

RE serialization: That code link you gave is something a little different - this is for processing packages that haven't been built yet; there are constructs (such as the 'early' decorator) that don't exist in installed packages, because they only make sense pre-build. Basically there is no serialise format per se - the package.py contents _is_ the format, and the API is used to get a package definition to/from Python source. Wrt YAML, it actually is still supported, but some newer features aren't; I've opted to deprecate it as there's no pressing need and it's just more code to maintain - everyone uses package.py.

RE solver: That sounds really interesting. The biggest issue I ran into with SAT was the inability to apply weights to the solutions, hence needing to search the entire solution space to find the right solve. One question though, specifically about:

> A graph is created from initial package requests with all dependencies, including all [versions](https://wiz.readthedocs.io/en/stable/definition.html#version) and [variants](https://wiz.readthedocs.io/en/stable/definition.html#variants) of each package

We (Method) have enough packages that this could easily pull in 1000s or 10,000s of packages, which would mean constructing a pretty hefty initial graph. Have you run into issues with how long it takes to process this phase?

Cheers
Allan
|
We did run into performance issues during the definition discovery phase that were mostly solved in [v3.1.0](https://wiz.readthedocs.io/en/stable/release/release_notes.html#release-3.1.0). The benchmark gives satisfying results for up to 4500 definitions:

```
> pytest ./test/benchmark/test_definitions_discover.py
------------------------------------------------------- benchmark: 5 tests -------------------------------------------------------
Name (time in ms)                             Min             Max            Mean          StdDev          Median
-----------------------------------------------------------------------------------------------------------------------------------
test_discover_1500_definitions               73.5521 (1.0)    91.4437 (1.0)   77.4378 (1.0)   6.3268 (1.0)   74.2542 (1.0)
test_discover_3000_definitions              151.4398 (2.06)  188.0195 (2.06) 158.2435 (2.04) 14.5964 (2.31) 152.4630 (2.05)
test_discover_4500_definitions_linux_only   228.8429 (3.11)  283.2860 (3.10) 242.3035 (3.13) 23.0151 (3.64) 232.2764 (3.13)
test_discover_4500_definitions              230.3487 (3.13)  253.9806 (2.78) 236.5502 (3.05)  9.8483 (1.56) 233.2503 (3.14)
test_discover_4500_definitions_windows_only 239.0745 (3.25)  370.6112 (4.05) 292.9962 (3.78) 48.4912 (7.66) 281.2205 (3.79)
-----------------------------------------------------------------------------------------------------------------------------------
```

Tweaking the test to load 10,000 definitions is still under a second:

```
------------------------------------------------------- benchmark: 5 tests -------------------------------------------------------
Name (time in ms)                              Min             Max            Mean          StdDev          Median
-----------------------------------------------------------------------------------------------------------------------------------
test_discover_3500_definitions               157.8479 (1.0)   177.3744 (1.0)  163.5049 (1.0)   7.1500 (1.0)  161.2196 (1.0)
test_discover_7000_definitions               331.6141 (2.10)  369.6768 (2.08) 344.8822 (2.11) 17.0816 (2.39) 334.5079 (2.07)
test_discover_10500_definitions_windows_only 494.6193 (3.13)  543.4300 (3.06) 520.6159 (3.18) 19.7367 (2.76) 519.6463 (3.22)
test_discover_10500_definitions              497.8384 (3.15)  571.8083 (3.22) 533.4982 (3.26) 26.8244 (3.75) 537.1576 (3.33)
test_discover_10500_definitions_linux_only   498.1399 (3.16)  540.1189 (3.05) 524.9916 (3.21) 17.2282 (2.41) 532.1038 (3.30)
-----------------------------------------------------------------------------------------------------------------------------------
```

We didn't implement any serious caching logic yet, so there is still room for improvement! |
Hey Jeremy,

I think I'll have to wait for that algorithm description as there are definitely concepts I don't follow yet. For example, you've said you build a graph connecting all packages/variants with their requirements, but even for a modest resolve, that would be a million+ edges. I.e., I'm assuming that if foo-1.2.3 requires bah>1.2, that might be 20 outgoing edges (if there are 20 bah versions > 1.2). Also, to your point in (2) (a graph "combination" is generated with only one variant of each package): clearly any combination of variants from packages could be the correct resolve, so I don't yet know what happens if these latest variants conflict.

In any case, I look forward to finding out more. If the functionality here is equivalent to what rez is doing, it could make sense to port it. Have you considered separating the solver out into its own project? General dependency resolvers in Python aren't much of a thing, and I'm sure there would be applications for it outside of package management.

Thanks
A
|
Sorry, I misunderstood your question: loading 10,000 definitions from the registries takes around half a second, but that doesn't mean all of these nodes get included in the graph! The graph is only built from the initial package requests' dependency tree. Actually, I'm not sure we ever had any request which ended up building a graph with more than 500 nodes, but this would be an interesting metric to track. I pushed a little benchmark on the dev branch to see how it would behave with 100, 1,000, 5,000 and 10,000 nodes. These are my results:
In this scenario, only the latest version of "bah" will be added to the graph (and all its variants if necessary). But if another package requests another version of bah (e.g. "bah==3.0.0"), this version will also be added to the graph, and it will be considered a conflict that needs to be solved. That's what I meant when I said that all versions are added to the graph, sorry for the confusion.
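As a toy illustration of that kind of conflict (made-up package names and versions, nothing Wiz-specific), two requesters pulling different versions of the same package into the graph is what the resolver then has to sort out:

```python
from collections import defaultdict

# (requester, package, exact version) edges pulled into the graph.
edges = [("foo", "bah", "2.1.0"), ("bim", "bah", "3.0.0")]

versions = defaultdict(set)
for _requester, package, version in edges:
    versions[package].add(version)

# Packages present with more than one version are conflicts to resolve.
conflicts = {p: sorted(v) for p, v in versions.items() if len(v) > 1}
print(conflicts)  # {'bah': ['2.1.0', '3.0.0']}
```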
That would be an interesting idea. The main blocker at the moment would be time and resources, but once we're done with open sourcing this framework, we can probably give it a try! |
@mottosso We just made our Python installer, Qip, public. You might find it useful to kickstart small Python projects without too much overhead. This part might particularly interest you: https://qip.readthedocs.io/en/stable/development.html. Let us know what you think about it! |
Thanks, a wrapper for pip is a good idea and something I'm familiar with. :) It does the same job, calling ... The main hurdles I found were:
Other than that, this is limited to Python packages, which isn't necessarily an issue. I found that you could easily extend the concept to additional package managers, like rez-scoopz for Windows packages, and had some ideas for wrapping things like ... So overall, I think this is the right track! |
Yeah, it's pretty much the same strategy we adopted :)
We actually don't use the automatically generated scripts; instead we create aliases from entry points using the `python -m` command. For instance, the definition for pyblish contains:

```json
"command": {
    "pyblish": "python -m pyblish.cli"
}
```

So you can simply run it with the `pyblish` alias. For compiled packages, it will try to compile them during install like it does with Pip, but we have set a Devpi index over PyPI to make sure that we have wheels instead of tar files.
Wiz is, but Qip isn't. For consistency, all definitions created by Qip use lowercase identifiers, so we don't really run into this issue:
Yeah, we haven't really solved that one either; we download the package and then extract the dependencies using pkg_resources. Might be worth submitting an issue to pip to provide this feature at some point.
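For illustration, the extraction looks roughly like this (a sketch, not Qip's actual code; the path is made up):

```python
import pkg_resources

def extract_requirements(path):
    """Return requirement strings for distributions found under *path*.

    *path* is expected to contain the unpacked package metadata
    (e.g. a *.dist-info or *.egg-info directory).
    """
    requirements = []
    for distribution in pkg_resources.find_distributions(path):
        for requirement in distribution.requires():
            requirements.append(str(requirement))
    return requirements

print(extract_requirements("/tmp/qip/downloads/foo-0.1.0"))
```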
I didn't know about scoop, very interesting! We are currently working on a wrapper around Conda which will provide non-Python libraries and the ability to set up our own channel. |
Ah, yes. You've got the luxury of not supporting Windows. :) |
Not sure I understand, you mean ...? |
No, it does. Re-reading your message, I thought you were referring to making an ... But what you actually meant was...
Which is interesting! It would solve that issue, on every platform. It does make executables longer to type, though. And is that something you call from within an environment, or before you activate an environment?
We can call aliases from the `wiz run` command:
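For example (illustrative, using the pyblish definition above):

```
wiz run pyblish
```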
And also from a spawned environment as we create a temporary RC file to define aliases:
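The temporary rc file would contain something along these lines (a sketch; the exact file Wiz generates may differ):

```sh
# Aliases defined for the spawned sub-shell (illustrative).
alias pyblish="python -m pyblish.cli"
```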
This strategy works for most cases, but not all of them. Funnily enough, I got reminded of that when trying to demonstrate it with ...
More on this issue here: themill/qip#7. Also, Wiz doesn't work on Windows yet, but we should have this covered soon-ish (#14). |