-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Path mapping #70
Path mapping #70
Conversation
…apping Path mapping requires a custom Java toolchain where javac_supports_worker_multiplex_sandboxing is set to True. There was an issue where rules_jvm_external came before rules_java in the WORKSPACE and it called the default rules_java toolchain setup functions, preventing our custom toolchain from being used correctly. This in turn broke path mapping.
…ferent configurations Currently we compare the absolute path in the deps checker. This causes us problems when there are configuration specific parts of one input and not the other. This happens when there we've written a configuration specific part of a path to disk and are comparing it to a path mapped argument. I imagine this is not the only place we'll run into issues with this.
# rules_license | ||
rules_license_tag = "1.0.0" | ||
|
||
http_archive( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are we using this for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A dependency depended upon this and without setting the version explicitly things were failing. I'm very much looking forward to moving to bzlmod, so we don't have to play this game as much.
@@ -16,7 +16,7 @@ def phase_coverage_jacoco(ctx, g): | |||
return | |||
|
|||
toolchain = ctx.toolchains["//rules/scala:toolchain_type"] | |||
worker_inputs, _, worker_input_manifests = ctx.resolve_command( | |||
worker_inputs, _ = ctx.resolve_tools( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity, can you explain what this does?
How is it different from passing toolchain.code_coverage_configuration.instrumentation_worker
in the tools
argument of the action below?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
resolve_command
returns a 3 element tuple and we never used the last element. The last element is also empty. This method also requires bash to be installed on Windows.
The Bazel doc also says to consider using resolve_tools
if that meets your needs, so I moved to that instead. https://bazel.build/rules/lib/builtins/ctx#resolve_command
In contrast to ctx.resolve_command, this method does not require that Bash be installed on the machine, so it's suitable for rules built on Windows.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you explain why ctx.resolve_tools
exists at all? The Bazel documentation didn't make that very clear. In other words, how is this:
worker_inputs, _ = ctx.resolve_tools(
tools = [toolchain.code_coverage_configuration.instrumentation_worker],
)
ctx.actions.run(
...,
inputs = ... + worker_inputs.to_list(),
)
different from this:
ctx.actions.run(
...,
tools = [toolchain.code_coverage_configuration.instrumentation_worker.files_to_run],
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, doesn't Bazel automatically add the runfiles of the executable
to the list of files needed by the action? Why is it necessary to call ctx.resolve_tools
and add the files it outputs to the inputs of the action?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you have a point. FWIW here's a message from a Bazel Slack thread:
Based on a quick reading of the code, I think that the tools attribute of ctx.actions.run does pretty much the same thing as manually calling ctx.resolve_tools and passing the result to inputs and input_manifests: In both cases, the FilesToRunProvider of the individual targets is checked and its runfiles are added.
It only matters if you accept e.g. a list of labels considered tools (for example when implementing a genrule-like rule).
Here's the design doc as well: https://docs.google.com/document/d/1xPsvTY-vWav9zX--7ieXjUilcl7M46m-88_oNV0QhEU/edit?tab=t.0#heading=h.5mcn15i0e1ch
Link to the Slack thread: https://bazelbuild.slack.com/archives/CA31HN1T3/p1684317657372349
outputs = [in_out_pair[1] for in_out_pair in in_out_pairs], | ||
executable = toolchain.code_coverage_configuration.instrumentation_worker.files_to_run.executable, | ||
input_manifests = worker_input_manifests, | ||
executable = toolchain.code_coverage_configuration.instrumentation_worker.files_to_run, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If toolchain.code_coverage_configuration.instrumentation_worker.files_to_run.executable
were provided, would we need to add worker_inputs
to the inputs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ctx.actions.run
can accept a FileToRun
provider, so there was no need for us to be passing in the .executable
. This code is equivalent, but just uses less code. https://bazel.build/rules/lib/builtins/actions#run
We were passing in a File
with the .executable
, but the FilesToRun
provider works as well.
I figure if it works with less code, then great. No downsides as far as I know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently we compare the absolute path in the deps checker. This causes
us problems when there are configuration specific parts of one input and
not the other. This happens when there we've written a configuration
specific part of a path to disk and are comparing it to a path mapped
argument.
I wonder if this is why the dependency checking tests have been flaky, and why those tests are now succeeding on this branch. By the way, you may want to re-enable tests/dependencies/indirect/test
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nevermind; you did that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I originally thought the same thing, but the dependency checking tests were flaky because of the multiplex sandbox bug. I think there's a Bazel bug and have seen some reports on the Bazel GitHub that seem suspiciously similar to what we're running into. At this point, I'm planning to try things out with Bazel 8 and see if the bug persists. I hope it just goes away with Bazel 8.
rules/scala.bzl
Outdated
@@ -408,7 +408,7 @@ _scala_repl_private_attributes = _dicts.add( | |||
_runtime_private_attributes, | |||
{ | |||
"_runner": attr.label( | |||
cfg = "host", | |||
cfg = "exec", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be host
because this is run when the target is run, not when it's built?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think cfg = "host"
is going away. https://bazel.build/reference/command-line-reference#flag--incompatible_disable_starlark_host_transitions
Which is why I made these changes.
If set to true, rule attributes cannot set 'cfg = "host"'. Rules should set 'cfg = "exec"' instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad. Shouldn't it be cfg = "target"
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep. I think you're right. I'll change this.
if g.classpaths.srcs: | ||
args.add_joined("--srcs", g.classpaths.srcs, join_with = " ") | ||
else: | ||
fail("Empty srcs list passed to bootstrap compiler") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Until now, our rules have supported an empty source list. Shouldn't it continue to be valid? Or does the Scala compiler not support this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll remove the fail
here. I think whether it will work depends on what the other arguments are.
shell_args = ctx.actions.args() | ||
shell_args.add(ctx.executable._zipper) | ||
shell_args.add_all([gendir], expand_directories = False) | ||
shell_args.add(gendir.short_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be shell_args.add([gendir], map_each = _short_path)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I understand things, short_path
doesn't include any configuration specific parts of the path, so it doesn't need to be done the same way.
From the state of path mapping thread:
Generally speaking, all functions that return a path string that may contain a configuration prefix such as bazel-out/darwin-amd64-fastbuild/bin must only be called in map_each callbacks, where they are automatically path mapped by Bazel.
and https://bazel.build/rules/lib/builtins/File#short_path
The path of this file relative to its root. This excludes the aforementioned root, i.e. configuration-specific fragments of the path. This is also the path under which the file is mapped if it's in the runfiles of a binary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense. We discussed this over our call and I understand better what short_path
does now.
args = ctx.actions.args() | ||
args.add(ctx.file._runner) | ||
args.add(ctx.workspace_name) | ||
args.add(manifest.short_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be args.add([manifest], map_each = _short_path)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment response as above.
args = ctx.actions.args() | ||
args.add(ctx.file._testrunner) | ||
args.add(ctx.workspace_name) | ||
args.add(manifest.short_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be args.add([manifest], map_each = _short_path)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same response as above.
@@ -4,6 +4,6 @@ | |||
echo "class A$RANDOM" > Example.gen.scala | |||
|
|||
rm -fr "$(bazel info execution_root)/.bazel-zinc" | |||
bazel build --noworker_sandboxing --worker_extra_flag=ScalaCompile=--persistence_dir=.bazel-zinc :lib | |||
bazel build --experimental_output_paths=off --noworker_sandboxing --worker_extra_flag=ScalaCompile=--persistence_dir=.bazel-zinc :lib |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why'd we turn this setting off?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not remember and will poke at this. I think this test fails with path mapping on, but I could be wrong about that.
In general the incremental Zinc compilation tests just disable everything like path mapping, sandboxing, etc. in order to function.
That said, now that I know everything else is working I'll go try the test with path mapping on and see what happens.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confirmed that path mapping needs to be disabled for the incremental compilation stuff to work at all as it reaches outside Bazel's sandbox and thus things like path mapping trip it up.
I have some code in phase_zinc_compile that should disable path mapping if using a toolchain that has incremental compilation enabled. Problem is, we're just using the default toolchain in this test and using command line options to disable sandboxing and path mapping.
I don't think I care enough about the incremental compilation stuff to improve this as we're likely ripping that feature out soon.
We replace the string ${workDir} with the workDir path. Problem is, sometimes the workDir path is " " (the current directory). If that's the case, then we replace ${workDir} with a blank string. Using an absolute path prevents that blank string from happening.
Because we're using strings, names like zinc_3 would likely conflict in other repos.
For some reason multiplex sandboxing seems to have issues with files going missing. It's been happening for a while, but intermittently. It got much worse recently. Not sure if this is a rule set bug or a Bazel bug. Regardless, disabling multiplex sandboxing fixes it.
It runs on the target architecture rather than the exec architecture, so this makes more sense.
c33607b
to
ee6bbb7
Compare
This gets us path mapping support and a handful of other niceities