Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Bzlmod and add rules_scala to bazel-central-registry #1482

Open
sfc-gh-pbennes opened this issue Mar 16, 2023 · 33 comments · Fixed by #1619
Open

Support Bzlmod and add rules_scala to bazel-central-registry #1482

sfc-gh-pbennes opened this issue Mar 16, 2023 · 33 comments · Fixed by #1619

Comments

@sfc-gh-pbennes
Copy link

sfc-gh-pbennes commented Mar 16, 2023

Greetings!

It looks like rules_scala isn't part of the Bzlmod effort or added to bazel-central-registry yet. I've opened an issue both here and in https://github.com/bazelbuild/bazel-central-registry to request it: bazelbuild/bazel-central-registry#522

@chrislovecnm
Copy link

Just a note to anyone that starts the work native.register_toolchains is not supported when using bzlmod.

@sluongng
Copy link

@chrislovecnm
Copy link

chrislovecnm commented Jul 11, 2023

@sluongng
Copy link

Yeah you would only want to prepare the toolchain (as a new repository) inside the extension I think. Registering it must be called from the MODULE.bazel file.

So native.register_toolchains call from the extension's starlark does not work, but having the extension create @pythons_hub//:all and then MODULE.bazel calling register_toolchains("@pythons_hub//:all") worked. 👍

@mateuszkuta256
Copy link
Contributor

I prepared a minimal bzlmod support here and gonna split it into a few PRs
TODOs after merging it:

  • write documentation on how to use rules_scala with bzlmod (some properties like SCALA_VERSION will probably be configurable via repo-env)
  • prepare release notes
  • contribute repo to the Bazel Central Registry
  • add tests that the project builds using both WORKSPACE and MODULE.bazel
  • migrate test/external workspaces to blzmod too

@pcj
Copy link
Member

pcj commented Jun 30, 2024

@mateuszkuta256 thanks for getting this started! Any progress updates or blockers?

@mateuszkuta256
Copy link
Contributor

@mateuszkuta256 thanks for getting this started! Any progress updates or blockers?

hi, unfortunately I'm working on other things now and won't continue with this PR in the foreseeable future
I remember this migration was on hold because interfered with 'cross compilation support'
It looks like great progress was done it this area, so maybe someone can already take over the PR

@mbland
Copy link
Contributor

mbland commented Oct 2, 2024

I'd like to take on the task of Bzlmodifying this repo through a series of pull requests.

I've already created a (mostly) working branch in my own fork. Though I saw a couple draft pull requests here, I ended up taking a different approach and got it mostly working. In fact, I did exactly what @sluongng suggested in #1482 (comment). (I didn't notice this comment before just now—I might've read it, but not understood it at the time—but I did study rules_python and rules_go, and ended up doing exactly that.)

There are still issues I need help to address (recorded in some of the commit messages), I didn't strictly maintain WORKSPACE compatibility (fixable), and I only tested against Scala 2.13. But I can start teasing chunks out of it in a methodical fashion, to ensure that I maintain WORKSPACE and cross-version compatibility.

For an example of what it looks like from a client perspective, here's what the MODULE.bazel stanza from my local EngFlow/example repo looks like (as opposed to the current stanza, explanatory comment notwithstanding):

bazel_dep(name = "rules_scala", repo_name = "io_bazel_rules_scala")
local_path_override(
    module_name = "rules_scala",
    path = "../../bazelbuild/rules_scala"
)   

scala_dev_deps = use_extension(
    "@io_bazel_rules_scala//scala/extensions:deps.bzl",
    "scala_deps",
    dev_dependency = True,
)   
scala_dev_deps.toolchains(
    scalatest = True,
)

So if folks are game for me to do this, I'll start carving off pieces as separate pull requests, and we can resolve any outstanding problems in the process.

cc: @BillyAutrey @jayconrod @benjaminp @TheGrizzlyDev

mbland added a commit to mbland/rules_scala that referenced this issue Oct 3, 2024
This begins the Bzlmod compatibility migration by updating Bazel to
version 7.3.2 and adding initial `MODULE.bazel` and `WORKSPACE.bzlmod`
files.

Part of: bazelbuild#1482

Though Bzlmod remains disabled, updating to Bazel 7.3.2 requred updating
or adding the following packages to maintain `WORKSPACE` compatibility.

In `rules_scala_setup()`:

- bazel_skylib: 1.4.1 => 1.7.1
- rules_cc: 0.0.6 => 0.0.10
- rules_java: 5.4.1 => 7.9.0
- rules_proto: 5.3.0-21.7 => 6.0.2

Dev dependencies in `WORKSPACE`:

- com_google_protobuf: 28.2
- rules_pkg: 1.0.1
- rules_jvm_external: 6.4
- com_google_absl: abseil-cpp-20240722.0
- zlib: 1.3.1

Of all of the new, explicit dev dependencies, only `com_google_protobuf`
will be necessary to include in `MODULE.bazel`. The Bzlmod mechanism
will discover these other transitive dev dependencies automatically.

Also removed the `rules_java_extra` repo from `WORKSPACE`, which
appeared unused.

---

Though the current `rules_java` version is 7.12.1, and largely works
with this repo, it requires a few temporary workarounds. Rather than
commit the workarounds, upgrading only to 7.9.0 now seems less crufty.

What follows is a very detailed explanation of what happens with 7.12.1
with Bazel 7.3.2, just to have it on the record.

---

The workaround is to change a few toolchain and macro file targets from
`@bazel_tools//tools/jdk:` to `@rules_java//toolchains:`. This isn't a
terribly bad or invasive workaround, but `@bazel_tools//tools/jdk:` is
clearly the canonical path. Best to keep it that way, lest we build up
technical debt.

Without the workaround, these targets would fail:

- //test/src/main/resources/java_sources:CompiledWithJava11
- //test/src/main/resources/java_sources:CompiledWithJava8
- //test/toolchains:java21_toolchain
- //test:JunitRuntimePlatform
- //test:JunitRuntimePlatform_test_runner
- //test:scala_binary_jdk_11

with this error:

```txt
ERROR: .../external/rules_java_builtin/toolchains/BUILD:254:14:

While resolving toolchains for target
@@rules_java_builtin//toolchains:platformclasspath (096dcc8):

No matching toolchains found for types
@@bazel_tools//tools/jdk:bootstrap_runtime_toolchain_type.
```

This appears to be a consequence of both upgrading the Bazel version
from 6.3.0 to 7.3.2 and updating `rules_java` to 7.12.1. The
`rules_java_builtin` repo is part of the `WORKSPACE` prefix that adds
implicit dependencies:

- https://bazel.build/external/migration#builtin-default-deps

This repo was added to 7.0.0-pre.20231011.2 in the following change,
mapped to `@rules_java` within the scope of the `@bazel_tools` repo:

- bazelbuild/bazel: Add rules_java_builtin to the users of Java modules
  bazelbuild/bazel@ff1abb2

This change tried to ensure `rules_java` remained compatible with
earlier Bazel versions. However, it changed all instances of
`@bazel_tools//tools/jdk:bootstrap_runtime_toolchain_type` to
`//toolchains:bootstrap_runtime_toolchain_type`:

- bazelbuild/rules_java: Make rules_java backwards compatible with Bazel
  6.3.0
  bazelbuild/rules_java@30ecf3f

Bazel has bumped `rules_java` in its `workspace_deps.bzl` from 7.9.0 to
7.11.0, but it's only available as of 8.0.0-pre.20240911.1.

- bazelbuild/bazel: Update rules_java 7.11.1 / java_tools 13.8
  bazelbuild/bazel#23571
  bazelbuild/bazel@f92124a

---

What I believe is happening is, under Bazel 7.3.2 and `rules_java`
7.12.1:

- Bazel creates `rules_java` 7.9.0 as `@rules_java_builtin` in the
  `WORKSPACE` prefix.

- `@bazel_tools` has `@rules_java` mapped to `@rules_java_builtin` when
  initialized during the `WORKSPACE` prefix, during which
  `@bazel_tools//tools/jdk` registers `alias()` targets to
  `@rules_java` toolchain targets. These aliased toolchains specify
  `@bazel_tools//tools/jdk:bootstrap_runtime_toolchain_type` in their
  `toolchains` attribute.

- `WORKSPACE` loads `@rules_java` 7.12.1 and registers all its
  toolchains with type
  `@rules_java//toolchains:bootstrap_runtime_toolchain_type`.

- Some `@rules_java` rules explicitly specifying toolchains from
  `@bazel_tools//tools/jdk` can't find them, because the
  `@bazel_tools//tools/jdk` toolchain aliases expect toolchains of type
  `@bazel_tools//tools/jdk:bootstrap_runtime_toolchain_type`.

This has broken other projects in the same way:

- bazelbuild/bazel: [Bazel CI] Downstream project broken by rules_java
  upgrade #23619
  bazelbuild/bazel#23619

These problems don't appear under Bzlmod, and `@rules_java_builtin` was
never required. This is because `WORKSPACE` executes its statements
sequentially, while Bzlmod builds the module dependency graph _before_
instantiating repositories (within module extensions).

It seems a fix is on the way that removes `@rules_java_builtin` from the
`WORKSPACE` prefix, and adds `@rules_java` to the suffix. At this
moment, though, it's not even in a prerelase:

- bazelbuild/bazel: Remove rules_java_builtin in WORKSPACE prefix
  bazelbuild/bazel@7506690

---

Note that the error message matches that from the following resolved
issue, but that issue was for non-Bzlmod child repos when `WORKSPACE`
was disabled.

- bazelbuild/bazel: Undefined @@rules_java_builtin repository with
  --noenable_workspace option
  bazelbuild/bazel#22754
mbland added a commit to mbland/rules_scala that referenced this issue Oct 5, 2024
Related to bazelbuild#1482, bazelbuild#1618, and bazelbuild#1619. Results from the investigation
documented at:

- bazelbuild#1619 (comment)

Updates `_import_paths()` in `scala_proto_aspect.bzl` to handle
differences `ProtoInfo.proto_source_root` and `ProtoInfo.direct_sources`
values between Bazel 6 and Bazel 7.

Without this change, `_import_paths()` emits incorrect values under
Bazel 7, causing targets containing generated `.proto` inputs to fail,
e.g.  `//test/proto3:test_generated_proto`.

See also:

- Fix paths for sibling repository setup and generated .proto files
  bazelbuild/bazel@6c6c196

- The docstring for `ProtoInfo.proto_source_root` in the Bazel sources:
  https://github.com/bazelbuild/bazel/blob/7.3.2/src/main/starlark/builtins_bzl/common/proto/proto_info.bzl#L155-L172

- Remove incompatible_generated_protos_in_virtual_imports
  bazelbuild/bazel@3efaa32

- Comment from: Generated Protos are no longer considered as
  virtual_imports in Bazel 7
  bazelbuild/bazel#21075 (comment)

---

I cherrypicked this commit into bazelbuild#1618. While it fixed the
`//test/proto3` build failure, it does _not_ fix the hanging
scalapb_workers from the ProtoScalaPBRule aspect.

I'll have to investiate further whether than hang is related to Bazel,
rules_proto, com_google_protobuf, or some mixture thereof. Still,
progress!
mbland added a commit to mbland/rules_scala that referenced this issue Oct 5, 2024
Related to bazelbuild#1482, bazelbuild#1618, and bazelbuild#1619. Results from the investigation
documented at:

- bazelbuild#1619 (comment)

Updates `_import_paths()` in `scala_proto_aspect.bzl` to handle
differences `ProtoInfo.proto_source_root` and `ProtoInfo.direct_sources`
values between Bazel 6 and Bazel 7.

Without this change, `_import_paths()` emits incorrect values under
Bazel 7, causing targets containing generated `.proto` inputs to fail,
e.g.  `//test/proto3:test_generated_proto`.

See also:

- Fix paths for sibling repository setup and generated .proto files
  bazelbuild/bazel@6c6c196

- The docstring for `ProtoInfo.proto_source_root` in the Bazel sources:
  https://github.com/bazelbuild/bazel/blob/7.3.2/src/main/starlark/builtins_bzl/common/proto/proto_info.bzl#L155-L172

- Remove incompatible_generated_protos_in_virtual_imports
  bazelbuild/bazel@3efaa32

- Comment from: Generated Protos are no longer considered as
  virtual_imports in Bazel 7
  bazelbuild/bazel#21075 (comment)

---

I cherrypicked this commit into bazelbuild#1618. While it fixed the
`//test/proto3` build failure, it does _not_ fix the hanging
scalapb_workers from the ProtoScalaPBRule aspect.

I'll have to investiate further whether than hang is related to Bazel,
rules_proto, com_google_protobuf, or some mixture thereof. Still,
progress!
mbland added a commit to mbland/rules_scala that referenced this issue Oct 6, 2024
Related to bazelbuild#1482, bazelbuild#1618, and bazelbuild#1619. Results from the investigation
documented at:

- bazelbuild#1619 (comment)

Updates `_import_paths()` in `scala_proto_aspect.bzl` to handle
differences `ProtoInfo.proto_source_root` and `ProtoInfo.direct_sources`
values between Bazel 6 and Bazel 7.

Without this change, `_import_paths()` emits incorrect values under
Bazel 7, causing targets containing generated `.proto` inputs to fail,
e.g.  `//test/proto3:test_generated_proto`.

See also:

- Fix paths for sibling repository setup and generated .proto files
  bazelbuild/bazel@6c6c196

- The docstring for `ProtoInfo.proto_source_root` in the Bazel sources:
  https://github.com/bazelbuild/bazel/blob/7.3.2/src/main/starlark/builtins_bzl/common/proto/proto_info.bzl#L155-L172

- Remove incompatible_generated_protos_in_virtual_imports
  bazelbuild/bazel@3efaa32

- Comment from: Generated Protos are no longer considered as
  virtual_imports in Bazel 7
  bazelbuild/bazel#21075 (comment)

---

I cherrypicked this commit into bazelbuild#1618. While it fixed the
`//test/proto3` build failure, it does _not_ fix the hanging
scalapb_workers from the ProtoScalaPBRule aspect.

I'll have to investiate further whether than hang is related to Bazel,
rules_proto, com_google_protobuf, or some mixture thereof. Still,
progress!
mbland added a commit to mbland/rules_scala that referenced this issue Oct 7, 2024
Related to bazelbuild#1482, bazelbuild#1618, and bazelbuild#1619. Results from the investigation
documented at:

- bazelbuild#1619 (comment)

Updates `_import_paths()` in `scala_proto_aspect.bzl` to handle
differences `ProtoInfo.proto_source_root` and `ProtoInfo.direct_sources`
values between Bazel 6 and Bazel 7.

Without this change, `_import_paths()` emits incorrect values under
Bazel 7, causing targets containing generated `.proto` inputs to fail,
e.g.  `//test/proto3:test_generated_proto`.

See also:

- Fix paths for sibling repository setup and generated .proto files
  bazelbuild/bazel@6c6c196

- The docstring for `ProtoInfo.proto_source_root` in the Bazel sources:
  https://github.com/bazelbuild/bazel/blob/7.3.2/src/main/starlark/builtins_bzl/common/proto/proto_info.bzl#L155-L172

- Remove incompatible_generated_protos_in_virtual_imports
  bazelbuild/bazel@3efaa32

- Comment from: Generated Protos are no longer considered as
  virtual_imports in Bazel 7
  bazelbuild/bazel#21075 (comment)

---

I cherrypicked this commit into bazelbuild#1618. While it fixed the
`//test/proto3` build failure, it does _not_ fix the hanging
scalapb_workers from the ProtoScalaPBRule aspect.

I'll have to investiate further whether than hang is related to Bazel,
rules_proto, com_google_protobuf, or some mixture thereof. Still,
progress!
mbland added a commit to mbland/rules_scala that referenced this issue Oct 7, 2024
Part of bazelbuild#1482.

Splits the last component off of canonical repo names to produce the
expected repo name.

Without Bzlmod, it returns the original name. With Bzlmod enabled, it
avoids generating output like:

    scala_import(
        name = "_main~scala_deps~io_bazel_rules_scala_scala_compiler",
        jars = ["scala-compiler-2.12.18.jar"],
    )

resulting in errors like:

```
ERROR: .../_main~_repo_rules~io_bazel_rules_scala/scala/BUILD:
no such target '@@_main~scala_deps~io_bazel_rules_scala_scala_compiler//:io_bazel_rules_scala_scala_compiler':
target 'io_bazel_rules_scala_scala_compiler' not declared in package ''
defined by .../_main~scala_deps~io_bazel_rules_scala_scala_compiler/BUILD
and referenced by '@@_main~_repo_rules~io_bazel_rules_scala//scala:default_toolchain_scala_compile_classpath_provider'
```

Also fixes the following error when attaching resources from custom repos to
targets under Bzlmod:

```txt
$ bazel test //test/src/main/scala/scalarules/test/resources:all

1) Scala library depending on resources from external resource-only
  jar::allow to load resources(scalarules.test.resources.ScalaLibResourcesFromExternalDepTest)
  java.lang.NullPointerException
    at scalarules.test.resources.ScalaLibResourcesFromExternalDepTest.get(ScalaLibResourcesFromExternalDepTest.scala:17)
    at scalarules.test.resources.ScalaLibResourcesFromExternalDepTest.$anonfun$new$3(ScalaLibResourcesFromExternalDepTest.scala:11)
    at scalarules.test.resources.ScalaLibResourcesFromExternalDepTest.$anonfun$new$2(ScalaLibResourcesFromExternalDepTest.scala:11)
```

Can be replaced with a future bazel-skylib implementation, if accepted
into that repo.

---

We can't rely on the specific canonical repository name format:

> Repos generated by extensions have canonical names in the form of
> `module_repo_canonical_name+extension_name+repo_name`. For extensions
> hosted in the root module, the `module_repo_canonical_name` part is
> replaced with the string `_main`. Note that the canonical name format is
> not an API you should depend on — it's subject to change at any time.
>
> - https://bazel.build/external/extension#repository_names_and_visibility

The change to no longer encode module versions in canonical repo names in
Bazel 7.1.0 is a recent example of Bazel maintainers altering the format:

- bazelbuild/bazel#21316

And the maintainers recently replaced the `~` delimiter with `+` in the
upcoming Bazel 8 release due to build performance issues on Windows:

- bazelbuild/bazel#22865

This function assumes the only valid `repo_name` characters are letters,
numbers, '_', '-', and '.'. It finds the last character not in this set, and
returns the contents of `name` following this character. This is valid so
long as this condition holds:

- https://github.com/bazelbuild/bazel/blob/7.3.2/src/main/java/com/google/devtools/build/lib/cmdline/RepositoryName.java#L159-L162
mbland added a commit to mbland/rules_scala that referenced this issue Oct 7, 2024
Part of bazelbuild#1482.

Splits the last component off of canonical repo names to produce the
expected repo name.

Without Bzlmod, it returns the original name. With Bzlmod enabled, it
avoids generating output like:

    scala_import(
        name = "_main~scala_deps~io_bazel_rules_scala_scala_compiler",
        jars = ["scala-compiler-2.12.18.jar"],
    )

resulting in errors like:

```
ERROR: .../_main~_repo_rules~io_bazel_rules_scala/scala/BUILD:
no such target '@@_main~scala_deps~io_bazel_rules_scala_scala_compiler//:io_bazel_rules_scala_scala_compiler':
target 'io_bazel_rules_scala_scala_compiler' not declared in package ''
defined by .../_main~scala_deps~io_bazel_rules_scala_scala_compiler/BUILD
and referenced by '@@_main~_repo_rules~io_bazel_rules_scala//scala:default_toolchain_scala_compile_classpath_provider'
```

Also fixes the following error when attaching resources from custom repos to
targets under Bzlmod:

```txt
$ bazel test //test/src/main/scala/scalarules/test/resources:all

1) Scala library depending on resources from external resource-only
  jar::allow to load resources(scalarules.test.resources.ScalaLibResourcesFromExternalDepTest)
  java.lang.NullPointerException
    at scalarules.test.resources.ScalaLibResourcesFromExternalDepTest.get(ScalaLibResourcesFromExternalDepTest.scala:17)
    at scalarules.test.resources.ScalaLibResourcesFromExternalDepTest.$anonfun$new$3(ScalaLibResourcesFromExternalDepTest.scala:11)
    at scalarules.test.resources.ScalaLibResourcesFromExternalDepTest.$anonfun$new$2(ScalaLibResourcesFromExternalDepTest.scala:11)
```

Can be replaced with a future bazel-skylib implementation, if accepted
into that repo.

---

We can't rely on the specific canonical repository name format:

> Repos generated by extensions have canonical names in the form of
> `module_repo_canonical_name+extension_name+repo_name`. For extensions
> hosted in the root module, the `module_repo_canonical_name` part is
> replaced with the string `_main`. Note that the canonical name format is
> not an API you should depend on — it's subject to change at any time.
>
> - https://bazel.build/external/extension#repository_names_and_visibility

The change to no longer encode module versions in canonical repo names in
Bazel 7.1.0 is a recent example of Bazel maintainers altering the format:

- bazelbuild/bazel#21316

And the maintainers recently replaced the `~` delimiter with `+` in the
upcoming Bazel 8 release due to build performance issues on Windows:

- bazelbuild/bazel#22865

This function assumes the only valid `repo_name` characters are letters,
numbers, '_', '-', and '.'. It finds the last character not in this set, and
returns the contents of `name` following this character. This is valid so
long as this condition holds:

- https://github.com/bazelbuild/bazel/blob/7.3.2/src/main/java/com/google/devtools/build/lib/cmdline/RepositoryName.java#L159-L162
liucijus pushed a commit that referenced this issue Oct 8, 2024
Related to #1482, #1618, and #1619. Results from the investigation
documented at:

- #1619 (comment)

Updates `_import_paths()` in `scala_proto_aspect.bzl` to handle
differences `ProtoInfo.proto_source_root` and `ProtoInfo.direct_sources`
values between Bazel 6 and Bazel 7.

Without this change, `_import_paths()` emits incorrect values under
Bazel 7, causing targets containing generated `.proto` inputs to fail,
e.g.  `//test/proto3:test_generated_proto`.

See also:

- Fix paths for sibling repository setup and generated .proto files
  bazelbuild/bazel@6c6c196

- The docstring for `ProtoInfo.proto_source_root` in the Bazel sources:
  https://github.com/bazelbuild/bazel/blob/7.3.2/src/main/starlark/builtins_bzl/common/proto/proto_info.bzl#L155-L172

- Remove incompatible_generated_protos_in_virtual_imports
  bazelbuild/bazel@3efaa32

- Comment from: Generated Protos are no longer considered as
  virtual_imports in Bazel 7
  bazelbuild/bazel#21075 (comment)

---

I cherrypicked this commit into #1618. While it fixed the
`//test/proto3` build failure, it does _not_ fix the hanging
scalapb_workers from the ProtoScalaPBRule aspect.

I'll have to investiate further whether than hang is related to Bazel,
rules_proto, com_google_protobuf, or some mixture thereof. Still,
progress!
@simuons simuons reopened this Oct 8, 2024
mbland added a commit to mbland/rules_scala that referenced this issue Oct 8, 2024
Part of bazelbuild#1482.

Splits the last component off of canonical repo names to produce the
expected repo name.

Without Bzlmod, it returns the original name. With Bzlmod enabled, it
avoids generating output like:

    scala_import(
        name = "_main~scala_deps~io_bazel_rules_scala_scala_compiler",
        jars = ["scala-compiler-2.12.18.jar"],
    )

resulting in errors like:

```
ERROR: .../_main~_repo_rules~io_bazel_rules_scala/scala/BUILD:
no such target '@@_main~scala_deps~io_bazel_rules_scala_scala_compiler//:io_bazel_rules_scala_scala_compiler':
target 'io_bazel_rules_scala_scala_compiler' not declared in package ''
defined by .../_main~scala_deps~io_bazel_rules_scala_scala_compiler/BUILD
and referenced by '@@_main~_repo_rules~io_bazel_rules_scala//scala:default_toolchain_scala_compile_classpath_provider'
```

Also fixes the following error when attaching resources from custom repos to
targets under Bzlmod:

```txt
$ bazel test //test/src/main/scala/scalarules/test/resources:all

1) Scala library depending on resources from external resource-only
  jar::allow to load resources(scalarules.test.resources.ScalaLibResourcesFromExternalDepTest)
  java.lang.NullPointerException
    at scalarules.test.resources.ScalaLibResourcesFromExternalDepTest.get(ScalaLibResourcesFromExternalDepTest.scala:17)
    at scalarules.test.resources.ScalaLibResourcesFromExternalDepTest.$anonfun$new$3(ScalaLibResourcesFromExternalDepTest.scala:11)
    at scalarules.test.resources.ScalaLibResourcesFromExternalDepTest.$anonfun$new$2(ScalaLibResourcesFromExternalDepTest.scala:11)
```

Can be replaced with a future bazel-skylib implementation, if accepted
into that repo.

---

We can't rely on the specific canonical repository name format:

> Repos generated by extensions have canonical names in the form of
> `module_repo_canonical_name+extension_name+repo_name`. For extensions
> hosted in the root module, the `module_repo_canonical_name` part is
> replaced with the string `_main`. Note that the canonical name format is
> not an API you should depend on — it's subject to change at any time.
>
> - https://bazel.build/external/extension#repository_names_and_visibility

The change to no longer encode module versions in canonical repo names in
Bazel 7.1.0 is a recent example of Bazel maintainers altering the format:

- bazelbuild/bazel#21316

And the maintainers recently replaced the `~` delimiter with `+` in the
upcoming Bazel 8 release due to build performance issues on Windows:

- bazelbuild/bazel#22865

This function assumes the only valid `repo_name` characters are letters,
numbers, '_', '-', and '.'. It finds the last character not in this set, and
returns the contents of `name` following this character. This is valid so
long as this condition holds:

- https://github.com/bazelbuild/bazel/blob/7.3.2/src/main/java/com/google/devtools/build/lib/cmdline/RepositoryName.java#L159-L162
@mbland
Copy link
Contributor

mbland commented Oct 11, 2024

A quick update for visibility: I'm very close to having the Bzlmodified rules_scala passing 100% of the tests, down to the last couple of failures:

  • I need to replace the bind() calls from twitter_scrooge with alias() or some such to get test_version.sh to pass.
  • I need to fix the Scala 3.1.2 and Scala 3.3.3 failures for the Scalafmt targets from test/shell/test_cross_build.sh (corresponding to the test3 and library3 test cases in test_scalafmt()).

I also need to get test_lint.sh working again, and update the dt_patches repos for Bzlmod. But the following all pass:

  • test_rules_scala.sh
  • test_reproducibility.sh
  • test_examples.sh
  • test_coverage.sh

Hopefully I can get these fixed up today, and I'll start peeling off the next pull request or two. And thanks to @simuons and @liucijus for reviewing and merging #1619 and #1620 already.

@mbland
Copy link
Contributor

mbland commented Nov 3, 2024

Again, I deeply appreciate the appreciation, but these have been helpful contributions, not heroism. 🙂 I'm compelled to push back politely on such characterizations, however honored I am to receive them.

To answer your question, though, I'm hoping we can land the remaining Bzlmodification changes quite soon.


That said, it's up to the time and bandwidth @simuons and @liucijus have to review the pull requests, whether they have any concerns and/or suggestions, and when they're comfortable cutting a release after that. They've been incredibly helpful and responsive to date, and I'm not trying to rush them. Several of the changes have benefitted substantially from review feedback, including several receiving serendipitous feedback from @WojciechMazur. (Plus, I indirectly roped @fmeum into effectively reviewing #1621 via bazelbuild/bazel-skylib#548. I'm so glad I did, because the end result was so much better.)

As I've been fond of telling people recently, this is Open Source at its best, working as intended. 🙂

Once #1633 gets reviewed, possibly updated, approved, and merged, that will clear the way for the rest of the essential Bzlmodification/toolchainization changes. My comments in #1633 explain what I'm shooting for, and link to the working prototype in my Bzlmod working branch—which now also maintains WORKSPACE compatibility. So hopefully the complete Bzlmodification itself will land quickly.


While I'm at it, let me recap the other activity related to this effort.

Most of the changes so far have been in preparation for landing the essential Bzlmodification/toolchainification changes, the first of which being #1633. Some have been essential prerequisites, fixing things that outright break under Bzlmod, such as #1621 or #1626. But with those fixed, now we're talking about providing a proper Bzlmod API that doesn't put the burden on users to import dozens of repos required by rules_scala toolchains (such as I had to do in the current MODULE.bazel from EngFlow/example). Technically, #1635 falls in this API design category as well.

Other pull requests have addressed small compatibility issues I've encountered along the way, like Bazel 7 compatibility in #1619 and #1620; Bzlmod compatible testing in #1622; and Scala version compatibility in #1628, #1632, and #1634. I've included a couple of helpful utilities I've written as well in #1629 and #1638. All of these should make maintaining the implementation and switching to Bzlmod even smoother, by resolving existing problems before anyone else encounters them.

I've also been identifying and making changes to ensure users aren't version locked on any dependencies thanks to rules_scala, such as #1631 (which was borderline essential, because Scalafmt 3.0.0 threw java.util.NoSuchElementException: last of empty IndexedSeq under Scala 2.13). I'm now focusing on trying to ensure more recent Protobuf module versions work with the scala_proto toolchain and with aspects implemented using protobuf/gRPC/ScalaPB/protoc-bridge (including the Scalafmt aspect). This is where changes like #1623, #1624, #1630, and #1637 come from. Much of that effort involves coordinating version bumps in the jar dictionaries from the third_party/repository/scala_*.bzl files and adding API adapters for specific Scala and/or dependency versions as needed. (Updating those dictionaries manually is why I'm also investing in updating scripts/create_repository.py in #1639, because doing all that work by hand in #1631 was the most time consuming part of this effort to date.)


Anywho, there's a few more changes to come, for Bzlmodification itself, for Protobuf upgradability, and for internal test and tooling updates. But as I've told friends and colleagues, I'm beyond seeing the light at the end of the tunnel. I'm out of the tunnel, off of the highway, and in the neighborhood, around the corner from the destination, turning the radio down so I can focus on where to park. 😛

@sundbry
Copy link

sundbry commented Nov 3, 2024

For my $0.02, I've checked out your mbland/bzlmod scala branch, and used it to build some scala programs - it works great! Thank you very much for pushing this code upstream.

The only issue I had, and I don't think this was related to your changes in particular, was figuring out how to set it up to use the scala3 toolchain- I ended up using --repo_env=SCALA_VERSION=3.4.3 in the build command to set it, but couldn't figure out how to set it in the MODULE.bazel or BUILD.bazel with the new configuration.

@mbland
Copy link
Contributor

mbland commented Nov 3, 2024

For my $0.02, I've checked out your mbland/bzlmod scala branch, and used it to build some scala programs - it works great! Thank you very much for pushing this code upstream.

My pleasure!

The only issue I had, and I don't think this was related to your changes in particular, was figuring out how to set it up to use the scala3 toolchain- I ended up using --repo_env=SCALA_VERSION=3.4.3 in the build command to set it, but couldn't figure out how to set it in the MODULE.bazel or BUILD.bazel with the new configuration.

Ah—my MODULE.bazel from EngFlow/example has an example of what this will look like. The only difference is the extension path will be @rules_scala//scala/extensions:config.bzl:

# This constant matches the default Scala version from rules_scala for now.
SCALA_VERSION = "2.13.12"
SCALA_VERSIONS = [SCALA_VERSION]

scala_config = use_extension("//scala/extensions:config.bzl", "scala_config")
scala_config.settings(
    scala_version = SCALA_VERSION,
    scala_versions = SCALA_VERSIONS,
)

Using named constants is optional, of course. Also, in this case, the SCALA_VERSIONS definition isn't necessary, because rules_scala will do the right thing without it. But if you do use SCALA_VERSIONS, it must contain SCALA_VERSION. Check out the relevant stanza from scala_config.bzl.

There's a similar, but slightly more detailed example in the MODULE.bazel file from my bzlmod working branch. The key difference there is the dev_dependency = True argument to use_extension. That's there because that configuration is only for developing rules_scala, not for its consumers.

liucijus pushed a commit that referenced this issue Nov 4, 2024
* First create_repository.py refactoring

Preserves the behavior of the previous version in preparation for
further improvements and eventual behavior changes. Part of #1482.

Tested by running the original script, calling `git stash
third_party/...`, then running the updated script. Confirmed that the
updated script produced no new changes in the working tree compared to
the indexed changes.

* Second create_repository.py refactoring

Preserves the behavior of the previous version in preparation for further
improvements and eventual behavior changes. Part of #1482.

This change eliminates a lot of duplication, and replaces the previous
functions to add trailing commas to the output with a `re.sub()` call.

Tested by running the original script, calling `git stash third_party/...`,
then running the updated script. Confirmed that the updated script produced no
new changes in the working tree compared to the indexed changes.

Also:

- Added `#!/usr/bin/env python3` and set file mode 0755 to enable
  running the script directly instead of as an argument to `python3`.

- Used `pathlib.Path` on `__file__` to enable running the script from
  any directory, not just from inside `scripts/`.

- Updated `scripts/README.md` to reflect some of the script changes, and
  to improve the Markdown formatting in general.

- Added the `DOWNLOADED_ARTIFACTS_FILE` variable, added its value to
  `.gitignore`, and added a `Path.unlink()` at the end of the script to
  clean it up.

* Third create_repository.py refactoring

Preserves the behavior of the previous version in preparation for
further improvements and eventual behavior changes. Part of #1482.

This vastly improves performance by _not_ downloading and hashing
artifacts that are already present in the current configuration.

Added `PROTOBUF_JAVA_VERSION = "4.28.2"` as a core artifact to avoid
downgrades, and to signal the importance of this particular artifact.

Tested by running the original script, calling `git stash
third_party/...`, then running the updated script. Confirmed that the
updated script produced no new changes in the working tree compared to
the indexed changes, modulo the previous `protobuf-java` downgrade.

* Small create_repository.py format updates, fixes

Added missing traliling comma in `COORDINATE_GROUPS[0]`.

Updated `get_label` to check for groups starting with `org.scala-lang.`

Sorted `deps` in `to_rules_scala_compatible_dict` and always sets `deps`
in the return value, even when empty.

* Ensure Scala 2 libs get "_2" repo names in Scala 3

Without this change, it was possible for Scala 3 core library versions
to become overwritten with Scala 2 versions specified as dependencies of
other jars. Specifically, all the `@io_bazel_rules_scala_scala_*_2` deps
explicitly added to the `scala_3_{1,2,3,4,5}.bzl` files in #1631 for
Scalafmt get stripped of the `_2` suffix before this change.

Also computed `is_scala_3` in `create_file` and passed it through where
needed. At some point it might be worth refactoring the script into a
proper object instead.

* Bump Scalafmt to 3.8.3 in create_repository.py

This matches the version set in #1631.

* Prevent create_repository.py from downgrading jars

The script will now only update an existing artifact if the newly
resolved version is greater than that already in the
third_party/repositories file. Part of #1482.

Tested by running on `master` both before and after #1631. The script
produced no new changes, and is even faster now, since it also won't try
to downgrade existing artifact versions. More specifically, this change
prevents the following downgrades:

- scalap: from latest Scala version to an earlier version, including
  2.13.14 to 2.13.11
- com.lihaoyi.pprint: from pprint_3:0.9.0 to pprint_2.13:0.6.4
- compiler-interface: 1.10.1 to 1.3.5 or 1.9.6
- com.lihaoyi.geny: from geny_3:1.1.1 to geny_2.13:0.6.5

* Ensure all artifact deps written to repo files

Updates the main loop in `create_file` to iterate over the newly
generated artifact dictionary, not the original dictionary parsed from
the existing file. This ensures that all dependencies of the updated
artifacts are written to the file.

This also improves performance when rerunning, since the script will no
longer keep trying to fetch the previously missing artifacts.

Tested by running once, `git add`ing the results, and running again to
ensure the second run was a no-op.

* Update get_label, fix scala3 label bug

Fixes a bug in the previous implementation whereby the `scala3_`
components of org.scala-lang artifact labels listed in an artifact's
`deps` weren't transformed to `scala_`. Improved the handling of
org.scala-lang artifacts more generally via the new
`adjust_scala_lang_label` function.

Refactored the `COORDINATE_GROUPS` array into a series of separate `set`
variables named after their `get_label` transformations. Makes the
`get_label` implementation read a little more easily.

* Replace `split_artifact_and_version`

I finally noticed that `get_maven_coordinates` already did the same
parsing, essentially. I migrated the comment and implementation from
`split_artifact_and_version` into it.

* Check against current ResolvedArtifact instances

Updates `create_artifact_version_map` to
`create_current_resolved_artifacts_map` to set up future improvements.

Specifically, an existing artifact's dependency coordinates aren't
updated unless the resolved artifact version is newer. A small change
after this will ensure all existing dependency coordinates get updated.
This could be a one-off change, or we could decide to keep it; either
way, it would be a very small one to make.

Also:

- Added `MavenCoordinates.artifact_name` to replace
  `artifact_name_from_maven_coordinates`.

- Renamed `is_newer_than_current_version` to `is_newer_version`.

- Added logic in `create_artifact_metadata_map` to ensure values with
  `testonly` set to `True` aren't part of the automatic resolution
  process.

* Handle scalap, org.thesamet.scalapb special cases

Preserves the existing labels for these artifacts.

Tested by running in a new branch that updates the direct dependencies
of core artifacts. Prior to this change, the `scalap` repo and some of
the `org.thesamet.scalapb` repos were duplicated with a new label. After
this change, that duplication disappears.

* Add --version, emit command, stderr only on error

These are changes suggested by @WojciechMazur in #1639.

* Raise and catch SubprocessError, return exit code

Further embellishment of the suggestions from @WojciechMazur in #1639.

This way we can see exactly which version updates encountered an error,
and how many, while continuing to attempt to update other versions.

* Hoist output_dir from create_or_update_repo_file

Makes the logic slightly easier to follow.

* Hoist OUTPUT_DIR as top level constant

I still like passing it as an argument to
`create_or_update_repository_file`.

* Add --output_dir flag, add defaults to --help

Making the output directory configurable from the command line was the
next logical extension. This makes it possible to see what the script
will generate from scratch without having to erase the existing repo
files.

Figured it would be nice to see the default values in the `--help`
output.

* Create a new empty file if no previous file exists

Decided it might be better to literally start from scratch in
`--output_dir` than trying to copy a file from `OUTPUT_DIR`. One could
always copy files from `OUTPUT_DIR` before running the script if so
desired.

Also changed the `__main__` logic to exit immediately on
`CreateRepositoryError` rather than attempt to keep going.

* Remove redundant `file.exists()` check

Eliminated the check in `create_or_update_repository_file` in favor of
that in `copy_previous_version_or_create_new_file_if_missing`.

* Improve `create_repository.py --help` docstrings

* Refactor `get_label` in `create_repository.py`

`get_label` now uses the `SPECIAL_CASE_GROUP_LABELS` dict instead of the
`LAST_GROUP_COMPONENT_GROUP` and `NEXT_TO_LAST_GROUP_COMPONENT_GROUP`
sets.

Also added `com.google.guava` to `ARTIFACT_LABEL_ONLY_GROUPS` to
generate the correct `io_bazel_rules_scala_guava` label. Added
`com.google.api.grpc` and `dev.dirs.directories` to
`SCALA_PROTO_RULES_GROUPS`.

Updated indentation of `ARTIFACT_LABEL_ONLY_GROUPS` and
`GROUP_AND_ARTIFACT_LABEL_GROUPS` elements.

* Emit correct label for scala-collection-compat

Specifically, `org_scala_lang_modules_scala_collection_compat`.
simuons pushed a commit that referenced this issue Nov 4, 2024
This catches errors that happen at any point in
`ExtraProtobufGenerator.run`, not just within
`handleCodeGeneratorRequest`. This, in turn, prevents more cases of
`ProtoScalaPBRule` aspect workers hanging. Part of #1482.

As I started experimenting with updating gRPC and ScalaPB libs, I bumped
the ScalaPB libs to 1.0.0-alpha.1 to try to resolve a problem. The
`ProtoScalaPBRule` workers then started hanging again.

After pulling the `catch ... Throwable` block from #1630 up into
`ExtraProtobufGenerator.run`, I got the following stack trace:

```txt
$ bazel build //test/proto/...

ERROR: .../external/com_google_protobuf/BUILD.bazel:334:14:
  ProtoScalaPBRule
  external/com_google_protobuf/wrappers_proto_jvm_extra_protobuf_generator_scalapb.srcjar
  failed: (Exit 1): scalapb_worker failed: error executing command
  (from target @com_google_protobuf//:wrappers_proto)

bazel-out/.../bin/src/scala/scripts/scalapb_worker
  ... (remaining 2 arguments skipped)

--jvm_extra_protobuf_generator_out: java.lang.IllegalAccessError:
  class scalapb.options.Scalapb$ScalaPbOptions tried to access method
  'com.google.protobuf.LazyStringArrayList
    com.google.protobuf.LazyStringArrayList.emptyList()'
  (scalapb.options.Scalapb$ScalaPbOptions and
   com.google.protobuf.LazyStringArrayList are in unnamed module
   of loader 'app')
      at scalapb.options.Scalapb$ScalaPbOptions.<init>(Scalapb.java:5021)
      at scalapb.options.Scalapb$ScalaPbOptions.<clinit>(Scalapb.java:11165)
      at scalapb.options.Scalapb.<clinit>(Scalapb.java:24184)
      at scalapb.options.compiler.Scalapb$.registerAllExtensions(Scalapb.scala:8)
      at scalarules.test.extra_protobuf_generator.ExtraProtobufGenerator$.run(ExtraProtobufGenerator.scala:48)
      at protocbridge.frontend.PluginFrontend$.$anonfun$runWithBytes$1(PluginFrontend.scala:51)
      at scala.util.Try$.apply(Try.scala:213)
      at protocbridge.frontend.PluginFrontend$.runWithBytes(PluginFrontend.scala:51)
      at protocbridge.frontend.PluginFrontend$.runWithInputStream(PluginFrontend.scala:121)
      at protocbridge.frontend.PosixPluginFrontend$.$anonfun$prepare$2(PosixPluginFrontend.scala:40)
      at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
      at scala.concurrent.impl.ExecutionContextImpl$DefaultThreadFactory$$anon$1$$anon$2.block(ExecutionContextImpl.scala:75)
      at java.base/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3118)
      at scala.concurrent.impl.ExecutionContextImpl$DefaultThreadFactory$$anon$1.blockOn(ExecutionContextImpl.scala:87)
      at scala.concurrent.package$.blocking(package.scala:146)
      at protocbridge.frontend.PosixPluginFrontend$.$anonfun$prepare$1(PosixPluginFrontend.scala:38)
      at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
      at scala.concurrent.Future$.$anonfun$apply$1(Future.scala:659)
      at scala.util.Success.$anonfun$map$1(Try.scala:255)
      at scala.util.Success.map(Try.scala:213)
      at scala.concurrent.Future.$anonfun$map$1(Future.scala:292)
      at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:42)
      at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:74)
      at java.base/java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1426)
      at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
      at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020)
      at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656)
      at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594)
      at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183)

java.lang.RuntimeException: Exit with code 1
     at scala.sys.package$.error(package.scala:30)
     at scripts.ScalaPBWorker$.work(ScalaPBWorker.scala:44)
     at io.bazel.rulesscala.worker.Worker.persistentWorkerMain(Worker.java:96)
     at io.bazel.rulesscala.worker.Worker.workerMain(Worker.java:49)
     at scripts.ScalaPBWorker$.main(ScalaPBWorker.scala:39)
     at scripts.ScalaPBWorker.main(ScalaPBWorker.scala)

ERROR: .../external/com_google_protobuf/BUILD.bazel:334:14
  scala @com_google_protobuf//:wrappers_proto failed:
  (Exit 1): scalapb_worker failed: error executing command
  (from target @com_google_protobuf//:wrappers_proto)

bazel-out/.../bin/src/scala/scripts/scalapb_worker
  ... (remaining 2 arguments skipped)
```

Bumping to protobuf v28.2 resolved the issue, at the expense of
sacrificing Bazel 6 support, as v21.7 was the last to support Bazel 6.
But this issue is orthogonal to the original gRPC test failure issue,
which I'll address in a future change.
mbland added a commit to mbland/rules_scala that referenced this issue Nov 4, 2024
Moves the `toolchain` targets for `//scala:toolchain_type` to a new
`@io_bazel_rules_scala_toolchains` repository as a step towards
Bzlmodification. Part of bazelbuild#1482.

Instantiating toolchains in their own repository enables module
extensions to define the the repositories required by those toolchains
within the extension's own namespace. Bzlmod users can then register the
toolchains from this repository without having to import all the
toolchains' dependencies into their own namespace via `use_repo()`.

---

The `scala_toolchains_repo()` macro wraps the underlying repository rule
and assigns it the standard name `io_bazel_rules_scala_toolchains`.
Right now it's only instantiating the main Scala toolchain via the
default `scala = True` parameter. Future changes will expand this
macro and repository rule with more boolean parameters to instantiate
other toolchains, specifically:

- `scalatest`
- `junit`
- `specs2`
- `twitter_scrooge`
- `jmh`
- `scala_proto` and `scala_proto_enable_all_options`
- `testing` (includes all of `scalatest`, `junit`, and `specs2`)
- `scalafmt`

---

`WORKSPACE` users will now have to import and call the
`scala_toolchains_repo()` macro to instantiate
`@io_bazel_rules_scala_toolchains`.

```py
load("@io_bazel_rules_scala//:scala_config.bzl", "scala_config")

scala_config()

load(
    "//scala:scala.bzl",
    "rules_scala_setup",
    "rules_scala_toolchain_deps_repositories",
    "scala_toolchains_repo",
)

rules_scala_setup()

rules_scala_toolchain_deps_repositories()

scala_toolchains_repo()

register_toolchains("@io_bazel_rules_scala_toolchains//...:all")
```

This is what the corresponding `MODULE.bazel` setup would look like:

```py
module(name = "rules_scala", version = "7.0.0")

scala_config = use_extension(
    "//scala/extensions:config.bzl", "scala_config"
)
scala_config.settings(scala_version = "2.13.14")

scala_deps = use_extension("//scala/extensions:deps.bzl", "scala_deps")
scala_deps.toolchains()
```

The `register_toolchains()` call in `WORKSPACE` isn't strictly required
at this point, but is recommended. However, all the `WORKSPACE` files in
this repo already register their required toolchains using existing
macros, which have been updated in this change.

In fact, calling `register_toolchains()` right after
`scala_toolchains_repo()` as shown above breaks two tests that depend on
the existing `WORKSPACE` toolchain registration:

- `test_compilation_fails_with_plus_one_deps_undefined` from
  `test/shell/test_compilation.sh` depends on
  `scala_register_unused_deps_toolchains()` setting up its toolchain to
  resolve first. `//scala:unused_dependency_checker_error_toolchain`
  sets the `scala_toolchain()` parameters `dependency_tracking_method =
  "ast-plus"` and `unused_dependency_checker_mode = "error"`, and the
  `@io_bazel_rules_scala_toolchains//scala` toolchains don't.

- `test_scala_binary_allows_opt_in_to_use_of_argument_file_in_runner_for_improved_performance`
  from `test/shell/test_scala_binary.sh` depends on the
  `use_argument_file_in_runner` parameter of `scala_toolchain` being
  `False`. This is the default, but the
  `@io_bazel_rules_scala_toolchains//scala` toolchains explicitly set
  this to `True` instead.

In the Bzlmod case, the `register_toolchains()` call isn't necessary at
all. This is because `@io_bazel_rules_scala_toolchains` includes one
package per set of toolchains, and the rules_scala `MODULE.bazel` calls
`register_toolchains("@io_bazel_rules_scala_toolchains//...:all")`. This
will automatically register all configured rules_scala toolchains, while
allowing users to override any of them using `register_toolchains()` in
their own `MODULE.bazel` files.

Technically, the `scala_deps.toolchains()` call isn't required when only
using the default `scala = True` parameter; the rules_scala
`MODULE.bazel` will instantiate this automatically, too.
mbland added a commit to mbland/rules_scala that referenced this issue Nov 4, 2024
Extracted a single `scala_toolchains` macro to share between `WORKSPACE`
and the `deps.bzl` module extension. This will make it easier to ensure
`WORKSPACE` compatibility, and to add a Bazel module extension as a thin
layer on top. Part of bazelbuild#1482.

This change includes updates to `rules_scala_setup` and
`scala_repositories` to support this, while preserving compatibility
with existing `WORKSPACE` calls. The next commit will replace many
existing `WORKSPACE` calls with `scala_toolchains`.
mbland added a commit to mbland/rules_scala that referenced this issue Nov 4, 2024
Updates the transitive dependency lists of all resolved top level
artifacts, resulting in more complete, sorted lists. Part of bazelbuild#1482.

Technically, this change shouldn't strictly be necessary. It's possible
we may want to remove this extra clause after all the
`create_repository.py` updates are done and the
`third_party/repositories/scala_*.bzl` files have settled. However, it
will prove useful to keep it as we add more top level artifacts to the
script, which will drive more existing artifact updates.

Also removes the unused `proc` variable from
`map_to_resolved_artifacts`.
@zcking
Copy link

zcking commented Nov 5, 2024

@mbland Totally get it! My comment was mostly intended to show support for this issue and vested interest outside of the normal contributors. This work is really important if Scala is to join many other languages already up-to-speed in the Bazel registry.

Also, although I'm not familiar with development for bazel rules such as this, I'd be happy to help contribute in my spare time if any additional help is needed!

mbland added a commit to mbland/bazel-skylib that referenced this issue Nov 6, 2024
Adds the following macros to work with apparent repo names when running
under Bzlmod.

- `adjust_main_repo_prefix`
- `apparent_repo_label_string`
- `apparent_repo_name`

Originally developed while updating rules_scala to support Bzlmod as
part of bazelbuild/rules_scala#1482.

For examples of their use, see bazelbuild/rules_scala#1621.
mbland added a commit to mbland/rules_scala that referenced this issue Nov 11, 2024
Moves the `toolchain` targets for `//scala:toolchain_type` to a new
`@io_bazel_rules_scala_toolchains` repository as a step towards
Bzlmodification. Part of bazelbuild#1482.

Instantiating toolchains in their own repository enables module
extensions to define the the repositories required by those toolchains
within the extension's own namespace. Bzlmod users can then register the
toolchains from this repository without having to import all the
toolchains' dependencies into their own namespace via `use_repo()`.

---

The `scala_toolchains_repo()` macro wraps the underlying repository rule
and assigns it the standard name `io_bazel_rules_scala_toolchains`.
Right now it's only instantiating the main Scala toolchain via the
default `scala = True` parameter. Future changes will expand this
macro and repository rule with more boolean parameters to instantiate
other toolchains, specifically:

- `scalatest`
- `junit`
- `specs2`
- `twitter_scrooge`
- `jmh`
- `scala_proto` and `scala_proto_enable_all_options`
- `testing` (includes all of `scalatest`, `junit`, and `specs2`)
- `scalafmt`

---

`WORKSPACE` users will now have to import and call the
`scala_toolchains_repo()` macro to instantiate
`@io_bazel_rules_scala_toolchains`.

```py
load("@io_bazel_rules_scala//:scala_config.bzl", "scala_config")

scala_config()

load(
    "//scala:scala.bzl",
    "rules_scala_setup",
    "rules_scala_toolchain_deps_repositories",
    "scala_toolchains_repo",
)

rules_scala_setup()

rules_scala_toolchain_deps_repositories()

scala_toolchains_repo()

register_toolchains("@io_bazel_rules_scala_toolchains//...:all")
```

This is what the corresponding `MODULE.bazel` setup would look like:

```py
module(name = "rules_scala", version = "7.0.0")

scala_config = use_extension(
    "//scala/extensions:config.bzl", "scala_config"
)
scala_config.settings(scala_version = "2.13.14")

scala_deps = use_extension("//scala/extensions:deps.bzl", "scala_deps")
scala_deps.toolchains()
```

The `register_toolchains()` call in `WORKSPACE` isn't strictly required
at this point, but is recommended. However, all the `WORKSPACE` files in
this repo already register their required toolchains using existing
macros, which have been updated in this change.

In fact, calling `register_toolchains()` right after
`scala_toolchains_repo()` as shown above breaks two tests that depend on
the existing `WORKSPACE` toolchain registration:

- `test_compilation_fails_with_plus_one_deps_undefined` from
  `test/shell/test_compilation.sh` depends on
  `scala_register_unused_deps_toolchains()` setting up its toolchain to
  resolve first. `//scala:unused_dependency_checker_error_toolchain`
  sets the `scala_toolchain()` parameters `dependency_tracking_method =
  "ast-plus"` and `unused_dependency_checker_mode = "error"`, and the
  `@io_bazel_rules_scala_toolchains//scala` toolchains don't.

- `test_scala_binary_allows_opt_in_to_use_of_argument_file_in_runner_for_improved_performance`
  from `test/shell/test_scala_binary.sh` depends on the
  `use_argument_file_in_runner` parameter of `scala_toolchain` being
  `False`. This is the default, but the
  `@io_bazel_rules_scala_toolchains//scala` toolchains explicitly set
  this to `True` instead.

In the Bzlmod case, the `register_toolchains()` call isn't necessary at
all. This is because `@io_bazel_rules_scala_toolchains` includes one
package per set of toolchains, and the rules_scala `MODULE.bazel` calls
`register_toolchains("@io_bazel_rules_scala_toolchains//...:all")`. This
will automatically register all configured rules_scala toolchains, while
allowing users to override any of them using `register_toolchains()` in
their own `MODULE.bazel` files.

Technically, the `scala_deps.toolchains()` call isn't required when only
using the default `scala = True` parameter; the rules_scala
`MODULE.bazel` will instantiate this automatically, too.
mbland added a commit to mbland/rules_scala that referenced this issue Nov 11, 2024
Extracted a single `scala_toolchains` macro to share between `WORKSPACE`
and the `deps.bzl` module extension. This will make it easier to ensure
`WORKSPACE` compatibility, and to add a Bazel module extension as a thin
layer on top. Part of bazelbuild#1482.

This change includes updates to `rules_scala_setup` and
`scala_repositories` to support this, while preserving compatibility
with existing `WORKSPACE` calls. The next commit will replace many
existing `WORKSPACE` calls with `scala_toolchains`.
@mbland
Copy link
Contributor

mbland commented Nov 11, 2024

Another quick update...

@zcking I appreciate the offer for help. I feel like the actual code changes are essentially complete, modulo adjustments based on review feedback. Feel free to look at the outstanding and upcoming PRs and offer any feedback or questions if you have any.

@simuons provided some feedback on #1633 that prompted me to make some adjustments and to ask for confirmation on whether to make certain other changes. This one will establish the direction for pulling other changes from my Bzlmod working branch. Though it will remain compatible with WORKSPACE, there will be small yet breaking changes from the existing WORKSPACE API, so it's worth taking the time to get it right.

I've reached some conclusions with regard to my Protobuf compatibility and upgradeability side quest, which I've written up in #1647. I found that rules_scala can either support up to Protobuf v25.5 when building with both Bazel 6 and 7 (with C++ compiler flags set for Bazel 6), or can leap to supporting only Protobuf v28.2 and above. My suggestion is to support v25.5, since the current v21.7 seems too old, but I'd fear switching straight to v28.2 might break too many users right now.

I've opened some new PRs to make a few improvements to the build and test scripts along the way:

The first and last of these, in particular, turbocharged my Protobuf investigation:

  • The create_repository.py script is now super fast, does more to ensure the entries in the generated third_party/repository/scala_*.bzl files are accurate, and updates more core artifacts (including ScalaPB in my bzlmod-update-protocbridge-and-grpc and bzlmod-update-abseil-cpp-and-protobuf working branches).
  • The new RULES_SCALA_TEST_ONLY and RULES_SCALA_TEST_VERBOSE env vars made it easier to iterate on specific test cases or files from test_all.sh, test_version.sh, etc., and to see what was happening. (The default is to only emit output on failure, after the command finishes, which proves especially problematic if a test hangs.)

I also figured out how to put a nail in the coffin of the generators-hanging-on-Protobuf-upgrade issue by wrapping ScalaPbCodeGenerator to catch and report previously uncaught exceptions.

At this point, it's just a matter of taking the time to land things properly. No need to rush, and now that I've got all my tools and investigations and proofs of concept working, I plan to take it easy for a while. 🙂

mbland added a commit to mbland/rules_scala that referenced this issue Nov 12, 2024
Moves the `toolchain` targets for `//scala:toolchain_type` to a new
`@io_bazel_rules_scala_toolchains` repository as a step towards
Bzlmodification. Part of bazelbuild#1482.

Instantiating toolchains in their own repository enables module
extensions to define the the repositories required by those toolchains
within the extension's own namespace. Bzlmod users can then register the
toolchains from this repository without having to import all the
toolchains' dependencies into their own namespace via `use_repo()`.

---

The `scala_toolchains_repo()` macro wraps the underlying repository rule
and assigns it the standard name `io_bazel_rules_scala_toolchains`.
Right now it's only instantiating the main Scala toolchain via the
default `scala = True` parameter. Future changes will expand this
macro and repository rule with more boolean parameters to instantiate
other toolchains, specifically:

- `scalatest`
- `junit`
- `specs2`
- `twitter_scrooge`
- `jmh`
- `scala_proto` and `scala_proto_enable_all_options`
- `testing` (includes all of `scalatest`, `junit`, and `specs2`)
- `scalafmt`

---

`WORKSPACE` users will now have to import and call the
`scala_toolchains_repo()` macro to instantiate
`@io_bazel_rules_scala_toolchains`.

```py
load("@io_bazel_rules_scala//:scala_config.bzl", "scala_config")

scala_config()

load(
    "//scala:scala.bzl",
    "rules_scala_setup",
    "rules_scala_toolchain_deps_repositories",
    "scala_toolchains_repo",
)

rules_scala_setup()

rules_scala_toolchain_deps_repositories()

scala_toolchains_repo()

register_toolchains("@io_bazel_rules_scala_toolchains//...:all")
```

This is what the corresponding `MODULE.bazel` setup would look like:

```py
module(name = "rules_scala", version = "7.0.0")

scala_config = use_extension(
    "//scala/extensions:config.bzl", "scala_config"
)
scala_config.settings(scala_version = "2.13.14")

scala_deps = use_extension("//scala/extensions:deps.bzl", "scala_deps")
scala_deps.toolchains()
```

The `register_toolchains()` call in `WORKSPACE` isn't strictly required
at this point, but is recommended. However, all the `WORKSPACE` files in
this repo already register their required toolchains using existing
macros, which have been updated in this change.

In fact, calling `register_toolchains()` right after
`scala_toolchains_repo()` as shown above breaks two tests that depend on
the existing `WORKSPACE` toolchain registration:

- `test_compilation_fails_with_plus_one_deps_undefined` from
  `test/shell/test_compilation.sh` depends on
  `scala_register_unused_deps_toolchains()` setting up its toolchain to
  resolve first. `//scala:unused_dependency_checker_error_toolchain`
  sets the `scala_toolchain()` parameters `dependency_tracking_method =
  "ast-plus"` and `unused_dependency_checker_mode = "error"`, and the
  `@io_bazel_rules_scala_toolchains//scala` toolchains don't.

- `test_scala_binary_allows_opt_in_to_use_of_argument_file_in_runner_for_improved_performance`
  from `test/shell/test_scala_binary.sh` depends on the
  `use_argument_file_in_runner` parameter of `scala_toolchain` being
  `False`. This is the default, but the
  `@io_bazel_rules_scala_toolchains//scala` toolchains explicitly set
  this to `True` instead.

In the Bzlmod case, the `register_toolchains()` call isn't necessary at
all. This is because `@io_bazel_rules_scala_toolchains` includes one
package per set of toolchains, and the rules_scala `MODULE.bazel` calls
`register_toolchains("@io_bazel_rules_scala_toolchains//...:all")`. This
will automatically register all configured rules_scala toolchains, while
allowing users to override any of them using `register_toolchains()` in
their own `MODULE.bazel` files.

Technically, the `scala_deps.toolchains()` call isn't required when only
using the default `scala = True` parameter; the rules_scala
`MODULE.bazel` will instantiate this automatically, too.
mbland added a commit to mbland/rules_scala that referenced this issue Nov 12, 2024
Extracted a single `scala_toolchains` macro to share between `WORKSPACE`
and the `deps.bzl` module extension. This will make it easier to ensure
`WORKSPACE` compatibility, and to add a Bazel module extension as a thin
layer on top. Part of bazelbuild#1482.

This change includes updates to `rules_scala_setup` and
`scala_repositories` to support this, while preserving compatibility
with existing `WORKSPACE` calls. The next commit will replace many
existing `WORKSPACE` calls with `scala_toolchains`.
mbland added a commit to mbland/rules_scala that referenced this issue Nov 13, 2024
This is part of the Bzlmod prep work from bazelbuild#1482, but also reduces
duplication between the `dt_patches/test_dt_patches*/WORKSPACE` files.

The structure of the `dt_patches/compiler_sources/extensions.bzl`
declarations accommodates the fact that, unlike `WORKSPACE,
`MODULE.bazel` files cannot import constants or contain conditional
statements. That is to say, there's no way to port the `if
SCALA_VERSION.startswith("2.")` expressions from the previous
`WORKSPACE` files to Bzlmod.

The new `import_compiler_source_repos` macro, however, works for both
`WORKSPACE` and Bzlmod builds, as it's trivial to wrap
`import_compiler_source_repos` in a module extension.
mbland added a commit to mbland/rules_scala that referenced this issue Nov 15, 2024
Moves the Scalatest, JUnit, and Specs2 toolchains into
`@io_bazel_rules_scala_toolchains//testing`. Part of bazelbuild#1482.

Updates all `WORKSPACE` files to set the appropriate `scala_toolchains`
parameters and to remove the unnecessary repository import and toolchain
registration macros.

Adds a `fetch_sources_by_id` parameter to `repositories` from
`third_party/repositories/repositories.bzl`. This enables
`scala_toolchains` to build the `artifact_ids_to_fetch_sources` mapping
from artifact ID lists returned by new macros extracted from `WORKSPACE`
macros. The values assigned to each id match the original
`fetch_sources` settings in the corresponding original `WORKSPACE`
macros.

Updates `scala/scala_maven_import_external.bzl` to generate a `load`
line for `//scala:scala_import.bzl` based on the repo's canonical name,
not `@io_bazel_rules_scala`.

As usual, includes several other opportunistic removals of the
`@io_bazel_rules_scala` repo name prefix to avoid an internal dependency
on that name. This means Bzlmod users won't necessarily have to set the
`repo_name` parameter of `bazel_dep` when using `rules_scala`.

---

Introduces more macros to return a framework's Maven artifact
dependencies, rather than inlining them in a `repositories` call. These
inlined lists are replaced by macro invocations, and now the
`scala_toolchains` macro can invoke these macros to collect artifact IDs
to pass to `repositories`. This also allows for future changes to
introduce a `scala_version` parameter if necessary, similar to how
`scalafmt_artifact_ids` already works.

This is important to avoid collisions when creating repositories for
artifacts upon which more than one framework depends under Bzlmod.
`WORKSPACE` doesn't seem affected by these collisions, but Bzlmod will
produce errors like the following, where both `scala_proto` and
`twitter_scrooge` depend upon `io_bazel_rules_scala_guava`:

```txt
$ bazel build //src/...

ERROR: .../scala/scala_maven_import_external.bzl:299:24:
  Traceback (most recent call last):
    File ".../scala/extensions/deps.bzl",
      line 140, column 21, in _scala_deps_impl
        scala_toolchains(
    File ".../scala/private/macros/toolchains.bzl",
      line 140, column 17, in scala_toolchains
        _scrooge(
    File ".../twitter_scrooge/twitter_scrooge.bzl",
      line 96, column 17, in twitter_scrooge
        repositories(
    File ".../third_party/repositories/repositories.bzl",
      line 113, column 37, in repositories
        _scala_maven_import_external(
    File ".../scala/scala_maven_import_external.bzl",
      line 263, column 30, in scala_maven_import_external
        jvm_maven_import_external(
    File ".../scala/scala_maven_import_external.bzl",
      line 299, column 24, in jvm_maven_import_external
        jvm_import_external(jar_urls = jar_urls, srcjar_urls = srcjar_urls, coordinates = artifact, **kwargs)

Error in repository_rule: A repo named
  io_bazel_rules_scala_guava_2_13_15
  is already generated by this module extension at
  .../scala/scala_maven_import_external.bzl:299:24

ERROR: Analysis of target
  '//src/java/io/bazel/rulesscala/worker:worker_test'
  failed; build aborted:
  error evaluating module extension scala_deps in
  //scala/extensions:deps.bzl
```

Recent updates to `scripts/create_repository.py` (bazelbuild#1639, bazelbuild#1642) make it
easy to emit full direct dependency lists for artifacts included in
`third_party/repositories/scala_*.bzl`. This increases the likelihood of
collisions, since this expanded metadata forces the macros that
instantiate artifact repos to instantiate even more repos.

By fetching list of artifact IDs from these macros, `scala_toolchains`
can now consolidate them into dictionary keys. Then it passes these
unique keys to `repositories` directly, avoiding the problem of
instantiating the same repo multiple times in the same module extension.

This, in turn, also avoids the need to add parameters to the original
`WORKSPACE` macros that instantiate dependencies to avoid collisions
under Bzlmod. The `scala_toolchains` macro never needs to call these
original macros, under either `WORKSPACE` or Bzlmod.

Finally, it also reduces duplication between these artifact ID lists and
the `_*_DEPS` symbols originally from `testing/BUILD` (and now in
`testing/deps.bzl`). The dependency labels are now generated
programatically.

(Aside: As I mentioned, we may eventually need to pass a Scala version
argument to these macros. It will be possible to cross that bridge
without too much trouble if and when that day comes. Or I can try to
future proof it in a follow up pull request.)
mbland added a commit to mbland/rules_scala that referenced this issue Nov 15, 2024
This presents the same API, remains compatible with both `WORKSPACE` and
Bzlmod, and avoids relying on the canonical repository name format at
all. Part of bazelbuild#1482, and supersedes the addition of `apparent_repo_name`
from bazelbuild#1621.

Also adds the `_SCALA_IMPORT_RULE_LOAD` constant in
`scala/scala_maven_import_external.bzl`, which also removes
`@io_bazel_rules_scala` from the `load`ed file path. It now generates
the correct path with a `Label` instead.

---

With bazelbuild/bazel-skylib#548 going nowhere, I eventually realized a
macro wrapper could generate a default target name for a repository_rule
by duplicating the `name` attr. This isn't as easy as adding
`apparent_repo_name(repository_ctx.name)` to a repo rule's
implementation, but it's also not that much more work.

One small downside is that the macro can't be imported into a
`MODULE.bazel` file directly via `use_repo_rule`, if one wanted to do
that. If you did, you'd have to add a module extension to call it, or
use `modules.as_extension` from the `bazel_skylib` module. I suppose
that's a small price to pay for squashing a canonical repo name format
dependency forever.

The other small downside may be that the documentation from the original
`repository_rule` doesn't automatically convey to the repo wrapper. In
this case, `_alias_repository` is already internal, and the original
`jvm_import_external` didn't have docstrings to begin with.
mbland added a commit to mbland/rules_scala that referenced this issue Nov 15, 2024
This presents the same API, remains compatible with both `WORKSPACE` and
Bzlmod, and avoids relying on the canonical repository name format at
all. Part of bazelbuild#1482, and supersedes the addition of `apparent_repo_name`
from bazelbuild#1621.

Also adds the `_SCALA_IMPORT_RULE_LOAD` constant in
`scala/scala_maven_import_external.bzl`, which also removes
`@io_bazel_rules_scala` from the `load`ed file path. It now generates
the correct path with a `Label` instead.

---

The goal is to maintain the ability to generate default target names
from a repository name under Bzlmod. While not documented on
https://bazel.build, it is documented in the Bazel source itself, even
under the current 9.0.0-pre.20241105.2 prerelease:

- https://github.com/bazelbuild/bazel/blob/9.0.0-pre.20241105.2/src/main/java/com/google/devtools/build/lib/cmdline/LabelParser.java#L99

Under `WORKSPACE`, the `repository_ctx.name` value served this purpose
directly, as it would reflect the `name` value as provided by the
caller. Under Bzlmod, this value now contains the canonical repo name,
which is different than that provided by the `name` parameter. The
`apparent_repo_name` utility was my first attempt to resolve this, by
parsing the original `name` parameter from the canonicalized
`repository_ctx.name`.

I also created bazelbuild/bazel-skylib#548 to contribute
`apparent_repo_name` to `bazel_skylib`, along with other proposed
utilities. I quickly found better solutions to obviate the other
utilities, but hung onto `apparent_repo_name`.

After that pull request stagnated, I eventually realized a macro wrapper
could generate a default target name for a repository_rule by
duplicating the `name` attr. This isn't as easy as adding
`apparent_repo_name(repository_ctx.name)` to a repo rule's
implementation, but it's also not that much more work. See also this
thread from the #bzlmod channel in the Bazel Slack workspace about
generating default repo names:

- https://bazelbuild.slack.com/archives/C014RARENH0/p1730909824656159

---

One small downside of this technique is that the macro can't be imported
into a `MODULE.bazel` file directly via `use_repo_rule`. If you did want
to use it in `MODULE.bazel`, you'd have to write a module extension to
call it, or use `modules.as_extension` from `bazel_skylib`. I suppose
that's a small price to pay for squashing a canonical repo name format
dependency forever.

The other small downside may be that the documentation from the original
`repository_rule` doesn't automatically convey to the repo wrapper. In
this case, `_alias_repository` is already internal, and the original
`jvm_import_external` didn't have docstrings to begin with.
mbland added a commit to mbland/rules_scala that referenced this issue Nov 15, 2024
This presents the same API, remains compatible with both `WORKSPACE` and
Bzlmod, and avoids relying on the canonical repository name format at
all. Part of bazelbuild#1482, and supersedes the addition of `apparent_repo_name`
from bazelbuild#1621.

Also adds the `_SCALA_IMPORT_RULE_LOAD` constant in
`scala/scala_maven_import_external.bzl`, which also removes
`@io_bazel_rules_scala` from the `load`ed file path. It now generates
the correct path with a `Label` instead.

---

The goal is to maintain the ability to generate default target names
from a repository name under Bzlmod. While not documented on
https://bazel.build, it is documented in the Bazel source itself, even
under the current 9.0.0-pre.20241105.2 prerelease:

- https://github.com/bazelbuild/bazel/blob/9.0.0-pre.20241105.2/src/main/java/com/google/devtools/build/lib/cmdline/LabelParser.java#L99

Under `WORKSPACE`, the `repository_ctx.name` value served this purpose
directly, as it would reflect the `name` value as provided by the
caller. Under Bzlmod, this value now contains the canonical repo name,
which is different than that provided by the `name` parameter. The
`apparent_repo_name` utility was my first attempt to resolve this, by
parsing the original `name` parameter from the canonicalized
`repository_ctx.name`.

I also created bazelbuild/bazel-skylib#548 to contribute
`apparent_repo_name` to `bazel_skylib`, along with other proposed
utilities. I quickly found better solutions to obviate the other
utilities, but hung onto `apparent_repo_name`.

After that pull request stagnated, I eventually realized a macro wrapper
could generate a default target name for a repository_rule by
duplicating the `name` attr. This isn't as easy as adding
`apparent_repo_name(repository_ctx.name)` to a repo rule's
implementation, but it's also not that much more work. See also this
thread from the #bzlmod channel in the Bazel Slack workspace about
generating default repo names:

- https://bazelbuild.slack.com/archives/C014RARENH0/p1730909824656159

---

One small downside of this technique is that the macro can't be imported
into a `MODULE.bazel` file directly via `use_repo_rule`. If you did want
to use it in `MODULE.bazel`, you'd have to write a module extension to
call it, or use `modules.as_extension` from `bazel_skylib`. I suppose
that's a small price to pay for squashing a canonical repo name format
dependency forever.

The other small downside may be that the documentation from the original
`repository_rule` doesn't automatically convey to the repo wrapper. In
this case, `_alias_repository` is already internal, and the original
`jvm_import_external` didn't have docstrings to begin with.
mbland added a commit to mbland/rules_scala that referenced this issue Nov 15, 2024
This presents the same API, remains compatible with both `WORKSPACE` and
Bzlmod, and avoids relying on the canonical repository name format at
all. Part of bazelbuild#1482, and supersedes the addition of `apparent_repo_name`
from bazelbuild#1621.

Also adds the `_SCALA_IMPORT_RULE_LOAD` constant in
`scala/scala_maven_import_external.bzl`, which also removes
`@io_bazel_rules_scala` from the `load`ed file path. It now generates
the correct path with a `Label` instead.

---

The goal is to maintain the ability to generate default target names
from a repository name under Bzlmod. While not documented on
https://bazel.build, it is documented in the Bazel source itself, even
under the current 9.0.0-pre.20241105.2 prerelease:

- https://github.com/bazelbuild/bazel/blob/9.0.0-pre.20241105.2/src/main/java/com/google/devtools/build/lib/cmdline/LabelParser.java#L99

Under `WORKSPACE`, the `repository_ctx.name` value served this purpose
directly, as it would reflect the `name` value as provided by the
caller. Under Bzlmod, this value now contains the canonical repo name,
which is different than that provided by the `name` parameter. The
`apparent_repo_name` utility was my first attempt to resolve this, by
parsing the original `name` parameter from the canonicalized
`repository_ctx.name`.

I also created bazelbuild/bazel-skylib#548 to contribute
`apparent_repo_name` to `bazel_skylib`, along with other proposed
utilities. I quickly found better solutions to obviate the other
utilities, but hung onto `apparent_repo_name`.

After that pull request stagnated, I eventually realized a macro wrapper
could generate a default target name for a repository_rule by
duplicating the `name` attr. This isn't as easy as adding
`apparent_repo_name(repository_ctx.name)` to a repo rule's
implementation, but it's also not that much more work. See also this
thread from the #bzlmod channel in the Bazel Slack workspace about
generating default repo names:

- https://bazelbuild.slack.com/archives/C014RARENH0/p1730909824656159

---

One small downside of this technique is that the macro can't be imported
into a `MODULE.bazel` file directly via `use_repo_rule`. If you did want
to use it in `MODULE.bazel`, you'd have to write a module extension to
call it, or use `modules.as_extension` from `bazel_skylib`. I suppose
that's a small price to pay for squashing a canonical repo name format
dependency forever.

The other small downside may be that the documentation from the original
`repository_rule` doesn't automatically convey to the repo wrapper. In
this case, `_alias_repository` is already internal, and the original
`jvm_import_external` didn't have docstrings to begin with.
@mbland
Copy link
Contributor

mbland commented Nov 19, 2024

OK, here's my latest weekly update. 🙂 The executive summary is that we're in the home stretch here.

#1633 is super close to going in. And now that I think about it, I think there aren't any WORKSPACE API-breaking changes after all. All the old macros should continue to work with existing builds with no changes (though scala_toolchains is nicer and can shrink existing WORKSPACE files substantially). So technically, the only "breaking" change so far (or bugfix, depending how you look at it) comes from removing external/$REPO)_NAME path prefixes from embedded resources, from #1621.

Speaking of #1621, I've superseded the apparent_repo_name utility from that pull request in the new #1650. Not critical, but it's the right thing to do; I've also closed bazelbuild/bazel-skylib#548 as a result. And an update as of this morning: The Bazel maintainers may be considering a new repository_ctx field to render this workaround unnecessary (eventually).

Integrating the changes from #1648 broke my bzlmod-rebuilding working branch, because instantiating scala_proto toolchain repos collided with instantiating ScalaPB repos for scalafmt and Guava repos for twitter_scrooge under Bzlmod. The WORKSPACE implementation doesn't care, but it's an error to instantiate the same repo multiple times from a module extension:

$ bazel build //src/...

ERROR: .../scala/scala_maven_import_external.bzl:299:24:
  Traceback (most recent call last):
    File ".../scala/extensions/deps.bzl",
      line 140, column 21, in _scala_deps_impl
        scala_toolchains(
    File ".../scala/private/macros/toolchains.bzl",
      line 140, column 17, in scala_toolchains
        _scrooge(
    File ".../twitter_scrooge/twitter_scrooge.bzl",
      line 96, column 17, in twitter_scrooge
        repositories(
    File ".../third_party/repositories/repositories.bzl",
      line 113, column 37, in repositories
        _scala_maven_import_external(
    File ".../scala/scala_maven_import_external.bzl",
      line 263, column 30, in scala_maven_import_external
        jvm_maven_import_external(
    File ".../scala/scala_maven_import_external.bzl",
      line 299, column 24, in jvm_maven_import_external
        jvm_import_external(jar_urls = jar_urls, srcjar_urls = srcjar_urls, coordinates = artifact, **kwargs)

Error in repository_rule: A repo named
  io_bazel_rules_scala_guava_2_13_15
  is already generated by this module extension at
  .../scala/scala_maven_import_external.bzl:299:24

ERROR: Analysis of target
  '//src/java/io/bazel/rulesscala/worker:worker_test'
  failed; build aborted:
  error evaluating module extension scala_deps in
  //scala/extensions:deps.bzl

This led me to develop a schema whereby the toolchains declare their Maven artifact dependencies in a separate macro from the one that calls repositories to instantiate them. (The Scalafmt toolchain had defined a similar macro before, to accommodate different artifact sets for different Scala versions, which I've since updated.) This allows the new scala_toolchains macro from #1633 to collect all the artifact IDs, de-dupe them, and call repositories itself. You can see the final result in scala/private/macros/toolchains.bzl from my bzlmod working branch.

I've prepared commits for the testing toolchains and the Scalafmt toolchain already, to land in that order after #1633. I've started one for the scala_proto toolchain, but want to wait for #1648 to land before going further with it, since that PR introduces scala_proto_artifact_ids, which I'd used to accommodate differences between Scala 2.11 and other versions.

After that, there'll be:

  • either one or two pull requests to add the jmh and twitter_scrooge toolchains; I do have twitter_scrooge building under Bzlmod already, with all native.bind calls removed and all tests passing
  • a pull request to prepare the dt_patches repos for Bzlmodification. as well as remove duplication
  • maybe one pull request to remove all remaining uses of @io_bazel_rules_scala, so that Bzlmod users can usebazel_dep(name = "rules_scala", version = "7.0.0") without a repo_name = "io_bazel_rules_scala" parameter
  • at least one more pull request to drop in MODULE.bazel files, module extensions, and remove --noenable_bzlmod from .bazelrc; this could be more than one, depending on the maintainers' appetite to migrate all internal repos at once, or one at a time, or in batches
  • either in the above pull request or in a separate one, turn the Bazel last green head build back on using Bzlmod to close Bazel green build CI job now fails persistently due to rules_java versioning #1625

We could also bump the default .bazelversion to 7.4.1 (or whatever the latest is by that point) and/or bump Protobuf to v25.5, per #1647, but that's not strictly critical. (It would technically break Bazel 6 users, because they'd have to add C++ compiler flags to their .bazelrc to build Protobuf v22 and later.)

At some point I'll look into adding the Publish to the Bazel Central Registry tool, probably after the official 7.0.0 release, unless the maintainers would prefer to do it themselves. After that, and when https://registry.bazel.build/modules/rules_scala goes live, I think we could close this issue!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet