Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add yarn-berry3 and yarn-berry4 to fetch and use version 2 yarn.lock files #355053

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

gador
Copy link
Member

@gador gador commented Nov 10, 2024

Problem

Currently there is no good way to use newer yarn.lock files because they rely on yarn-berry and cannot be used together with pre-existing tooling like fetch-yarn-deps.
There is an (abandoned? ) project which converts v1 files to v2 files with some caveats. This does require an internet connection and cannot be used in an isolated builder.

So one needs to convert the yarn.lock file and commit it to nixpkgs to be used for an offline cache.

There is an issue open (#254369) which discusses some ways around it. The most sophisticated way was by @szlend and this PR builds on top of it to extend to be included in our new yarnConfigHook from #318015

Implementation of a solution

A new argument has been added to fetch-yarn-deps: yarnVersion which will default to 1 which uses the original yarn.

For newer yarn.lock files, one can set the yarnVersion to either 3 or 4 which will use yarn-berry3 or yarn-berry4 respectively. We need to separate both versions, because they have some small changes in the way the lock file is created.

This also adds the corresponding tests.

How to use it

For an implementation see the included pgadmin4 derivation, which was also my main motivation for this. I had to build a rather ugly and complicated update script before and do a lot of custom work to build the frontend. This is now simplified.

In short, add yarnBerry3ConfigHook to nativeBuildInputs and

yarnOfflineCache = fetchYarnDeps {
    src = "...";
    yarnVersion = 3;
    hash = "..."
  };

to your derivation.

fixes #254369

Once we agree on the implantation, I'll add the documentation.

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 24.11 Release Notes (or backporting 23.11 and 24.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.
  • Add documentation

Add a 👍 reaction to pull requests you find important.

pkgs/build-support/node/fetch-yarn-deps/default.nix Outdated Show resolved Hide resolved
pkgs/build-support/node/fetch-yarn-deps/default.nix Outdated Show resolved Hide resolved
pkgs/development/tools/yarn-berry/default.nix Outdated Show resolved Hide resolved
pkgs/development/tools/yarn-berry/default.nix Outdated Show resolved Hide resolved
pkgs/development/tools/yarn-berry/default.nix Show resolved Hide resolved
@doronbehar
Copy link
Contributor

Note also the CI failure.

@gador
Copy link
Member Author

gador commented Nov 11, 2024

Note also the CI failure.

This is caused by not adding the yarn-berry3 and yarn-berry4 to pkgs/by-name which doesn't make sense here

Attribute pkgs.yarn-berry3 is a new top-level package using pkgs.callPackage ./pkgs/development/tools/yarn-berry { /* ... */ }.
Please define it in pkgs/by-name/ya/yarn-berry3/package.nix instead.
See pkgs/by-name/README.md for more details.
Since the second callPackage argument is not { }, the manual callPackage in pkgs/top-level/all-packages.nix is still needed.

  • Attribute pkgs.yarn-berry4 is a new top-level package using pkgs.callPackage ./pkgs/development/tools/yarn-berry { /* ... */ }.
    Please define it in pkgs/by-name/ya/yarn-berry4/package.nix instead.
    See pkgs/by-name/README.md for more details.
    Since the second callPackage argument is { }, no manual callPackage in pkgs/top-level/all-packages.nix is needed anymore.

This PR introduces additional instances of discouraged patterns as listed above. Merging is discouraged but would not break the base branch.

@doronbehar
Copy link
Contributor

Note also the CI failure.

This is caused by not adding the yarn-berry3 and yarn-berry4 to pkgs/by-name which doesn't make sense here

Hmm seems like a false negative to me. Let's ask the advice of @NixOS/nixpkgs-vet .

@willbush
Copy link
Member

Think this recommendation applies here: https://github.com/NixOS/nixpkgs/tree/master/pkgs/by-name#recommendation-for-new-packages-with-multiple-versions

@gador
Copy link
Member Author

gador commented Nov 12, 2024

Think this recommendation applies here: https://github.com/NixOS/nixpkgs/tree/master/pkgs/by-name#recommendation-for-new-packages-with-multiple-versions

Awesome, thanks for the pointer!

Copy link
Contributor

@doronbehar doronbehar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good, though I wouldn't feel comfortable to merge this because I don't know the yarn config settings in yarnBerryConfigHook but they looks pretty reasonable.

pkgs/tools/admin/pgadmin/default.nix Outdated Show resolved Hide resolved
pkgs/tools/admin/pgadmin/update.sh Outdated Show resolved Hide resolved
pkgs/top-level/all-packages.nix Outdated Show resolved Hide resolved
@ofborg ofborg bot added the 11.by: package-maintainer This PR was created by the maintainer of the package it changes label Nov 13, 2024
Copy link
Member

@winterqt winterqt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a no-go in its current state.

mkdir -p $out
''
+ lib.optionalString (yarnVersion > 1) ''
yarn install --immutable --mode skip-build
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will break the FODs almost instantly if Yarn changes anything in how they pull deps, and also is probably not reproducible between platforms when platform-specific dependencies are used.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valid point. This is partly why there are two yarn versions.

Also this could be said for all external package managers, not just yarn-berry.

I'm uncertain about your point about platform specific dependencies. I'll have to look into it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Platform specific dependencies should be made reproducible by explicitly setting supportedArchitectures. Your point about Yarn breaking how they fetch deps may be valid depending on the upstream stability guarantees. Explicitly picking the version of yarn somewhat mitigates that issue, but it's no guarantee yes. This is the same path taken in pnpm.fetchDeps #290715

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@winterqt what is your reply on the arguments laid out above? Your "request for changes" is blocking the merge of the PR, and this is the only review comment left really open.

gador added a commit to gador/nixpkgs that referenced this pull request Nov 20, 2024
$TMP should be univsal for all derivations and should
work on darwin as well, if it ever switches to another
build-dir (e.g. /nix/)

see NixOS#355053 (comment)

Signed-off-by: Florian Brandes <[email protected]>
@doronbehar
Copy link
Contributor

OK most issues are addressed. Let's give @winterqt and @szlend a day or 2 to reply (@winterqt to this thread).

Comment on lines 160 to 175
DIR=$(dirname $yarnLock)
cp $DIR/yarn.lock $DIR/package.json $HOME/

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't work when yarnLock is provided via a path, e.g. yarnLock = src/yarn.lock;. In this case, it will be copied to the Nix store, so it will have a path like /nix/store/fv6mn0y55ig62k43021j2vaii3qf5fpy-yarn.lock. In turn this will cause DIR=/nix/store and the following cp to fail.

Copy link
Member Author

@gador gador Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I know.
See #355053 (comment)
But if you read the follow up, this shouldn't be an issue, because we ususally use the src of the parent derivation and have the dir

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. For context I tried using this pull request to package a Yarn project outside nixpkgs, and ran into this problem.

I guess what you're saying is that we should always be using

yarnLock = "${src}/yarn.lock";

instead of

yarnLock = src/yarn.lock;

Which is understandable but seems like an easy mistake to make so I think mentioning it in the documentation or having a check would be nice.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #355053 (comment)

Sorry, I'm not sure what's going on with GitHub (issues redesign I guess) but this link isn't working for me, neither by clicking nor opening in a new tab.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mmh weird, I'm on mobile. Maybe because of that..

You can have at the tests at how I solved that problem with the store path.

Thanks for testing it!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@the-sun-will-rise-tomorrow is correct that the current state in which yarnLock is the argument of the function is slightly misleading. Perhaps after all we should change the function signature of every fetchYarnDeps variant now? I'm sorry that I don't have time to put an effort in that myself, I'd have loved to help.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, it should just accept src if it uses yarn-berry. Feels like this should be a different api entirely.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's easy enough to just throw a warning if you pass it just the lock file though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm with you on the last comment. Lets issue a warning for now, so it catches if only the lock file from the nix store is passed and keep the API the same. In another refactor PR, we can then

  1. Move it under the package (e.g. yarn.fetchDeps see Add yarn-berry3 and yarn-berry4 to fetch and use version 2 yarn.lock files  #355053 (comment))
  2. Use src for yarn and yarn-berry which will illiminate this current issue

Comment on lines 161 to 165
if [ $DIR = "/nix/store" ]; then
echo "Only a yarn.lock file has been passed (e.g. yarnLock = ./yarn.lock;)"
echo "Please change this so a directory can be deduced (e.g. yarnLock = finalAttrs.src + "/yarn.lock";)"
exit 1
fi
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Copy link
Member

@winterqt winterqt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, I am very against adding these kinds of fetchers. (I was too burnt out to comment much on the pnpm situation, and I wish I had the spoons to object more there.)

Upstream makes no guarantees as to the stability of this, and as we've seen with Rust, this is a ticking time bomb.

As an alternative, we could try to e.g. create our own PnP-based fetcher or something? It takes a lot of time to do things properly as various edge cases are discovered, though, and time isn't something I have a lot of at the moment.

If this is to be merged in its current state, please address the issues I've commented on in this review. However, I would strongly recommend against doing so.

Comment on lines 12 to 18
version_4 = "4.5.1";
version_3 = "3.8.6";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a comment noting to be very careful when this changes, preferably testing FODs?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines +144 to +145
"3" = yarn-berry_3;
"4" = yarn-berry_4;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe the comment should go here. (Or we can just explicitly pin versions right here.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added it at the package level, because this is where we would do our update

pkgs/build-support/node/fetch-yarn-deps/default.nix Outdated Show resolved Hide resolved
@@ -0,0 +1,16 @@
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add a comment somewhere so that people avoid touching this file/are very careful when doing it. Ideally we'd be able to use JSON with comments...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file means that we can never expand our platform support without mass FOD breaks across the tree? That seems very bad. There are already platforms you can boot NixOS on that would be ruled out by this file. At the very least we should try to list every value already used in npm here so that it’s relatively safe to expand in future?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since JSON and comments don't agree (and we also read the file directly), I added a Readme.md to the folder

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file means that we can never expand our platform support without mass FOD breaks across the tree?

yes

That seems very bad. There are already platforms you can boot NixOS on that would be ruled out by this file. At the very least we should try to list every value already used in npm here so that it’s relatively safe to expand in future?

We did add all the relevant platforms, cpu's and libc's (see https://yarnpkg.com/configuration/yarnrc#supportedArchitectures) that are available. The only one missing is win32 and I don't think we need to add this

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it’s everything Yarn supports then I guess it’s not so bad – assuming we will be able to add additional platforms if Yarn does support more in future without invalidating every FOD?

I do think we should include Windows. You can use Nixpkgs to cross‐build a lot of stuff for Windows today, and people are working on native bring‐up. What’s the harm?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the idea is that packages can’t include any platforms other than the ones Yarn supports, so no existing FOD should change as long as we keep up with Yarn? Or can packages on npm reference other platforms and Yarn just can’t support fetching them?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.
On a sidenode: I did neither have to change the test hashes, nor the hash for the (seriously complex and long) yarn pgadmin offlineCache. (and yes, I obviously invalidated them, re-downloaded the packages and compared the hashes)
So maybe this won't be such a big deal after all

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the idea is that packages can’t include any platforms other than the ones Yarn supports, so no existing FOD should change as long as we keep up with Yarn?

Yes, I think your right. My guess is that adding another platform or compiler will a huge undertaking anyway, so maybe they will publish yet another big version upgrade, which would make it easier for us (e.g. yarn-berry_5)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, I think I was correct here:

Or can packages on npm reference other platforms and Yarn just can’t support fetching them?

The npm docs imply that the fields are quite freeform; e.g. the cpu example uses mips as an example. That means that this is just a limitation of Yarn’s configuration format rather than a constraint on the actual packages in existence, and there are packages that could change in the future if we add new platforms like RISC‐V.

The correct solution here would be to fetch everything independent of platform – theoretically Yarn could add all values here to opt in to fetching everything, but writing our own deterministic fetcher would make it easy and also obviate the other reproducibility concerns. If neither of those is an option, perhaps we can find a way to check that nothing in the dependency tree references any platform we haven’t included in this file, and otherwise bail out? If we can’t do even that, then it’d be good if someone could write a script to scan the npm repository and collect every value that is used in practice so that we can include them all from the start and reduce the scope of potential changes.

Because we’re adding fundamental packaging infrastructure here that is part of our public interface and will surely be relied upon by out‐of‐tree packages, and changing FOD hashes are very difficult to spot and debug, I don’t think we’ll have the option of just breaking it in the future, unless we can show that the backwards incompatibility would be entirely theoretical in the existing public corpus.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, I think I was correct here:

Or can packages on npm reference other platforms and Yarn just can’t support fetching them?

The npm docs imply that the fields are quite freeform; e.g. the cpu example uses mips as an example. That means that this is just a limitation of Yarn’s configuration format rather than a constraint on the actual packages in existence, and there are packages that could change in the future if we add new platforms like RISC‐V.

Actually, it is not a limitation of the configuration, as it can be adapted to include everything node supports. https://yarnpkg.com/configuration/yarnrc#supportedArchitectures says:

List of CPU architectures to cover.
See https://nodejs.org/docs/latest/api/process.html#processarch for the architectures supported by Node.js

which states:

The operating system CPU architecture for which the Node.js binary was compiled. Possible values are: 'arm', 'arm64', 'ia32', 'loong64', 'mips', 'mipsel', 'ppc', 'ppc64', 'riscv64', 's390', 's390x', and 'x64'.

So we can include all of those for our supported-archs. Adding this, btw, didn't change the cache for pgadmin, either.
I looked into how yarn-berry does its comparisons (since these values are for the local .yarnrc.toml config file) and it compares it to upstreams package.json file. This file has the same fields (https://yarnpkg.com/configuration/manifest#cpu) and, more importantly, only the libc and os fields we already have.

The correct solution here would be to fetch everything independent of platform – theoretically Yarn could add all values here to opt in to fetching everything, but writing our own deterministic fetcher would make it easy and also obviate the other reproducibility concerns.

Yes, writing our own fetcher could be a solution. This will be quite a task to parse the yarn.lock file and generate the cache. I really feel uncomftable reinventing the wheel, though. We do have a package manager, which already parses this file and also controlls the specs about our architectures. I don't want to write yet another solution of yarn2nix

If neither of those is an option, perhaps we can find a way to check that nothing in the dependency tree references any platform we haven’t included in this file, and otherwise bail out?

There seems to be an option (yarn info --all --manifest --json) which prints all the manifests, but not one had a cpu, libc or os field. There is another option (yarn npm info _packageName_ --json) which prints the upstream manifest and from the quick peek I took, didn't include any of those, either.

If we can’t do even that, then it’d be good if someone could write a script to scan the npm repository and collect every value that is used in practice so that we can include them all from the start and reduce the scope of potential changes.
Because we’re adding fundamental packaging infrastructure here that is part of our public interface and will surely be relied upon by out‐of‐tree packages, and changing FOD hashes are very difficult to spot and debug, I don’t think we’ll have the option of just breaking it in the future, unless we can show that the backwards incompatibility would be entirely theoretical in the existing public corpus.

Totally agree. This is why we added versions here. Not only because upstream has two supported versions (3 and 4), but also because their yarn.lock files differ.
If there is a situation where either the file format changes, or the supported architectures, I would suggest adding another version (e.g. yarn-berry_5 or yarn-berry_39). This will keep all the hashes intakt, be reproducible and still be able to adapt to future changes.

gador added a commit to gador/nixpkgs that referenced this pull request Nov 22, 2024
$TMP should be univsal for all derivations and should
work on darwin as well, if it ever switches to another
build-dir (e.g. /nix/)

see NixOS#355053 (comment)

Signed-off-by: Florian Brandes <[email protected]>
@gador gador force-pushed the yarn-berry-fetch branch 3 times, most recently from a94f6c4 to d0dc8fa Compare November 22, 2024 22:24
gador added a commit to gador/nixpkgs that referenced this pull request Nov 22, 2024
$TMP should be univsal for all derivations and should
work on darwin as well, if it ever switches to another
build-dir (e.g. /nix/)

see NixOS#355053 (comment)

Signed-off-by: Florian Brandes <[email protected]>
@gador gador force-pushed the yarn-berry-fetch branch 2 times, most recently from 9312a29 to 7a126ea Compare November 22, 2024 22:40
@ofborg ofborg bot added the 2.status: merge conflict This PR has merge conflicts with the target branch label Nov 23, 2024
gador and others added 8 commits November 24, 2024 12:11
Signed-off-by: Florian Brandes <[email protected]>
Signed-off-by: Florian Brandes <[email protected]>
$TMP should be univsal for all derivations and should
work on darwin as well, if it ever switches to another
build-dir (e.g. /nix/)

see NixOS#355053 (comment)

Signed-off-by: Florian Brandes <[email protected]>
A new argument has been added to fetch-yarn-deps:
`yarnVersion` which will default to `1` which uses
the original `yarn`.

For newer `yarn-berry` `yarn.lock`, one can set the
`yarnVersion` to either `3` or `4` which will use
yarn-berry3 or yarn-berry4 respectively.

The difference is the `yarn.lock` file, which follows
a different format, depending on the version.

This also adds the corresponding tests.

The added support for yarn.lock files > version 1 has been
inspired by @szlend from
NixOS#254369 (comment)

fixes NixOS#254369

Signed-off-by: Florian Brandes <[email protected]>
Co-authored-by: Doron Behar <[email protected]>
@ofborg ofborg bot removed the 2.status: merge conflict This PR has merge conflicts with the target branch label Nov 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
6.topic: nodejs 8.has: documentation 8.has: package (new) This PR adds a new package 10.rebuild-darwin: 11-100 10.rebuild-linux: 11-100 11.by: package-maintainer This PR was created by the maintainer of the package it changes ofborg-internal-error Ofborg encountered an error
Projects
None yet
Development

Successfully merging this pull request may close these issues.

fetchYarnDeps doesn't support v2 (YAML) lockfiles
8 participants