Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: use runfiles strategy to get runfiles root #2115

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

mattem
Copy link
Collaborator

@mattem mattem commented Aug 9, 2024

When attempting to calculate the current repository, the runfiles helper used a strategy of walking up the current files directory four times, however this fails to correctly resolve the runfiles root when the runfiles helper is consumed via the PyPi package and linked into a virtual environment (the root will be determined to be the venv lib directory).

This change makes it so the runfiles root is fetched from the helper strategy, which is either created with the $RUNFILES_ROOT environment variable (directory based), or calculates the root via the runfiles manifest (manifest based).

With these changes, the runfiles helper correctly identifies the runfiles root when linked into a Python virtual environment.

@alexeagle
Copy link
Collaborator

@fmeum FYI as you're the RUNFILES resolution expert :)

@@ -321,19 +324,6 @@ def Create(env: Optional[Dict[str, str]] = None) -> Optional["Runfiles"]:
# Support legacy imports by defining a private symbol.
_Runfiles = Runfiles


def _FindPythonRunfilesRoot() -> str:
"""Finds the root of the Python runfiles tree."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember that I introduced this function instead of using RUNFILES_DIR since I thought that the Python runfiles tree may be different from the actual runfiles directory (the one with non-Python data dependencies) when using --build_python_zip. Do you know whether that's true? If this isn't true, I at least don't understand why the runfiles library would have to deal with RUNFILES_MANFIEST_FILE at all as the Python runfiles are always arranged in a directory.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that sounds vaguely familiar.

It should be easy to test these cases now. Over in tests/base_rules/BUILD.bazel, there's some existing tests for setting these zip flags.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, some of the failures seem to show this.

@mattem Could you share an example of the venv layout? We can try to find logic that works in both cases.

Copy link
Collaborator Author

@mattem mattem Aug 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fmeum The venv created by rules_py when running a binary or test is rooted at the root of the targets runfiles tree, and takes the standard Python venv layout, with dependencies then being symlinked in. For example, a target named foo in the //py/tests package depending on the PyPi version of the runfiles helper will have the following package in the site-packages directory.

bazel-bin/py/tests/external-deps/foo.runfiles/.foo.venv/lib/python3.9/site-packages/runfiles

The runfiles last segment of the path is the Python package name.

There is also the case where the user plants the venv directory in an arbitrary place in the source tree for IDE support, at that point .foo.venv could technically be anywhere, but likely at the source tree root or in the same package as the target. Users then likely "run" the binary via the IDE for debugging etc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would the path under the venv of some Python package from an external repository look like?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For each Python package coming from a Bazel module (not an external pure-Python dep), it would need to write that package name and the corresponding canonical repo name (Label.workspace_name) to a file.

Then we could look for that file here, find the package name in the path of the current source file and obtain the repo name from it. Feel free to mention me on the PR (draft) that sets up the file if you need help.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How are we able to tell that a dependency comes from a bazel module vs an external Python dependency? They will have that Label.workspace_name set?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I guess your external deps can come from everywhere? In that case you could also just store the mapping between package and Label.workspace_name for every package. That would match the logic here if it weren't broken due to the different runfiles structure.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, dependencies can come from both external bazel dependencies or fetched from PyPi (etc) by rules_python. In fact in rules_py, there is very little difference between 1p and 3p dependencies.

That would match the logic here if it weren't broken due to the different runfiles structure.

To be clear, the runfiles helper only fails when it is consumed via the PyPi dependency, it works as expected when used via @bazel_tools//tools/python/runfiles, as I guess it doesn't attempt to fetch the repo name? Secondly, the "normal" runfiles structure is there, it's just up a few more directory segments.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear, the runfiles helper only fails when it is consumed via the PyPi dependency, it works as expected when used via @bazel_tools//tools/python/runfiles, as I guess it doesn't attempt to fetch the repo name?
The runfiles helper in @bazel_tools//tools/python/runfile is deprecated and not fully functional with Bzlmod.

Secondly, the "normal" runfiles structure is there, it's just up a few more directory segments.
The subtle part is that this failure isn't about not finding the runfiles strategy, it's about identifying the point of reference for lookups in that structure. For example, when you request rules_python/foo/bar via Rlocation, you need to know which Bazel repo is asking for that to look up the actual repo (and thus its canonical name) seen under that name by the caller.

If we had a separate file that maps Python package names to Bazel repo names, we could get the Python package name from the path and thus the Bazel repo name relative to which runfiles paths would be interpreted.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants