-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds multiplex sandbox support and improves argument parsing #52
Conversation
0c9b83d
to
e3e76db
Compare
import protocbridge.{ProtocBridge, ProtocRunner} | ||
import scala.jdk.CollectionConverters._ | ||
import scalapb.ScalaPbCodeGenerator | ||
|
||
object ScalaProtoWorker extends WorkerMain[Unit] { | ||
|
||
private class ScalaProtoRequest private ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Should this be a case class
(this applies to the other configuration classes created in this commit)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't believe so because you get this error message related to Scala 3 migration:
src/main/scala/higherkindness/rules_scala/workers/common/CommonArguments.scala:15: error: access modifiers for `copy` method are copied from the case class constructor under Scala 3 (or with -Xsource-features:case-apply-copy-access)
Scala 3 migration messages are issued as errors under -Xsource:3. Use -Wconf or @nowarn to demote them to warnings or suppress.
Applicable -Wconf / @nowarn filters for this fatal warning: msg=<part of the message>, cat=scala3-migration, site=higherkindness.rules_scala.workers.common.CommonArguments
case class CommonArguments private (
^
1 error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I think you need to remove the private
constructor modifier too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. I want the constructor to be private, though, which is why this is a class and not a case class.
I only want people to be able to construct these objects by passing in a namespace and a working directory, so someone doesn't accidentally construct one incorrectly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense. I think this is fine as-is, then.
def getSandboxFile(workDir: Path, file: File): File = { | ||
workDir.resolve(file.toPath).toFile | ||
} | ||
def getSandboxPaths(workDir: Path, paths: JList[Path]): List[Path] = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Do we need these getSandboxPaths
methods? Writing a method for every collection seems a bit unnecessary when the caller could just as easily call getSandboxPath
for every element.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I originally was doing that, but found it to be enough boiler plate that I wrote these as convenience methods.
@@ -14,20 +18,38 @@ import scala.io.Codec | |||
|
|||
object ScalafmtRunner extends WorkerMain[Unit] { | |||
|
|||
protected[this] def init(args: Option[Array[String]]): Unit = {} | |||
private[this] class ScalafmtRequest private ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I think we should avoid using private[this]
because it doesn't offer many guarantees over private
and will be removed in Scala 3 (this applies to the other uses of private[this]
and protected[this]
in this commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's a performance advantage, however small, in Scala 2 to using private[this]
vs private
https://users.scala-lang.org/t/what-does-passing-this-after-access-modifier-mean/3838/5
Something that might have a performance impact (though I think the JIT should be pretty good at eliminating this anyway), is that a private[this] val gets compiled to a field only. While private val gets compiled to a field and an accessor method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, good to know. This was just a nit, so I'm fine with keeping it either way.
src/main/scala/BUILD
Outdated
@@ -3,6 +3,7 @@ load( | |||
"configure_bootstrap_scala", | |||
"configure_zinc_scala", | |||
"scala_library", | |||
# "bar_toolchain", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we remove this comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep. Nice catch.
src/main/scala/BUILD
Outdated
@@ -67,17 +68,15 @@ configure_zinc_scala( | |||
|
|||
# Scala 3 | |||
|
|||
# Adding this, so we make sure to have a Scala library in the | |||
# IntelliJ libraries, so we can get a Scala SDK on sync. | |||
# Adding this, so we make sure to have a Scala library in the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this comment was duplicated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep. Fixing.
def pathsForLabel(depLabel: String): Seq[String] = { | ||
val label = workRequest.label | ||
val directDepLabels = workRequest.directDepLabels | ||
val groupLabelToJarPaths = workRequest.groups.map { group => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Can we inline label
and directDepLabels
? Also, should groupLabelToJarPaths
be a lazy val
on DepsRunnerRequest
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've inlined those two. I did not not change that val to be lazy due to the performance penalty of lazy and the locking that comes with it.
.readAllLines(workRequest.usedDepsFile) | ||
.asScala | ||
.view | ||
// Use the read mapper on the used classpath entries in order to keep |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity, could you explain what the read mapper does?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Read/write mapper is what we use to serialize and deserialize paths in a machine independent way.
If you didn't do mapping you'd end up with something like /home/foo/project/src/scala/bar/Qux.scala
, but that's machine dependent. We only care about the src/scala/bar/Qux.scala
bit.
The write mappers write things to disk in a machine independent way.
The read mappers know how to read those things from disk and translate them to actual paths on your machine.
Most of this is defined in src/main/scala/higherkindness/rules_scala/workers/common/AnnexMapper.scala
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That clarifies things. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you fill me in on why this changes was necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Normalize does this:
Returns a path that is this path with redundant name elements eliminated.
I realized we weren't normalizing paths, so it would in theory be possible to get a path that points to the same spot, but the string doesn't match up. I wanted to eliminate that potential bit of non-determinism by normalizing the absolute paths.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha. I wasn't sure if it was necessary as a consequence of introducing multiplexed, sandboxed workers, or if it was just good practice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. This is unrelated to multiplex sandboxing. Just something I encountered while working on the sandboxing and applied everywhere. It really should have been its own commit. My bad.
val parser = ArgumentParsers.newFor("zinc-worker").addHelp(true).build | ||
parser.addArgument("--persistence_dir", /* deprecated */ "--persistenceDir").metavar("path") | ||
parser.addArgument("--use_persistence").`type`(Arg.booleanType) | ||
parser.addArgument("--extracted_file_cache").metavar("path") | ||
// deprecated | ||
parser.addArgument("--max_errors") | ||
parser.parseArgsOrFail(args.getOrElse(Array.empty)) | ||
val namespace = parser.parseArgsOrFail(args.getOrElse(Array.empty)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we use ArgsUtil.parseArgsOrFailSafe
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not here, no. This is called with the worker is being initialized. It's ok for the worker to die here. If the worker dies here it just fails to start. It's also preferable for it to die as I'm not sure what else to do here.
If that worker dies on a work request, that's bad. Recovery is just telling Bazel that work request failed and then waiting for the next request to come in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, that makes sense.
@@ -13,11 +14,14 @@ trait WorkerMain[S] { | |||
|
|||
protected[this] def init(args: Option[Array[String]]): S |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method seems rarely used in subclasses of WorkerMain
because argument parsing (which calls ArgsUtil.parseArgsOrFailSafe
) now requires access to the output stream, which this method doesn't provide. Should we refactor it to provide that and make more places use init
in a meaningful way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is still useful for when you need to initialize the worker.
init
happens once when the worker starts. work
happens on every work request.
Agree that it is called in frequently and a good tradeoff may be providing a default implementation of init, but I'm not sure off the top of my head how to accomplish that because of the generics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was mainly concerned with us re-parsing arguments in work
in a lot of places. I wonder if removing the args
parameter in work
would help to enforce that we're doing argument parsing in init
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Argument parsing happens in work, though. In work
, we parse the arguments for each work request.
In init
we parse args for the worker as a whole. I think the only place we have a valid use case for this is the Zinc incremental compilation stuff, which I'd like to remove. Beyond that, I don't see a good reason to keep init
around.
In another PR I can remove the incremental compilation piece and we should be able to remove init.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I was conflating command-line arguments vs. the arguments supplied with each worker task. Carry on.
…traction These could/should probably be two separate commits, but the improvement to argument parsing was inspired by the requirements of multiplex sandboxing. The work was so intertwined that I'm leaving it as one PR. Good news is that if we need to move to another parsing framework in the future, it should be much easier because parsing is no longer spread throughout the work functions.
This is useful for when workers fail to run and have an error message to help with debugging. Otherwise you just have empty output.
e3e76db
to
80843d2
Compare
Primary reviewer here is @jadenPete. I added @gregghz in case they're interested in taking a look.