Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix AnnexScalaInstance jar cache misses due to ordering and path comparison problems #64

Merged
merged 1 commit into from
Nov 12, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -7,12 +7,12 @@ import java.net.URLClassLoader
import java.nio.file.{Files, Path, Paths}
import java.util.Properties
import java.util.concurrent.ConcurrentHashMap
import scala.collection.immutable.Map
import scala.collection.immutable.TreeMap

object AnnexScalaInstance {
// See the comment on getAnnexScalaInstance as to why this is necessary
private val instanceCache: ConcurrentHashMap[String, AnnexScalaInstance] =
new ConcurrentHashMap[String, AnnexScalaInstance]()
private val instanceCache: ConcurrentHashMap[Set[Path], AnnexScalaInstance] =
new ConcurrentHashMap[Set[Path], AnnexScalaInstance]()

/**
* We only need to care about minimizing the number of AnnexScalaInstances we create if things are being run as a
Expand Down Expand Up @@ -48,7 +48,7 @@ object AnnexScalaInstance {
* from a prior compilation in combination with a new ScalaInstance for the current compilation. Or there's some issue
* with Scala's classpath caching.
*
* If all three caches are enabled (this cache + Zinc claspath + Scala compiler), then the non-determinism seems go
* If all three caches are enabled (this cache + Zinc claspath + Scala compiler), then the non-determinism seems to go
* away.
*
* If this cache is not used, but the Zinc classpath + Scala compiler caches are used, then non-determinsm shows up in
Expand All @@ -67,11 +67,22 @@ object AnnexScalaInstance {
* any different from leaving Zinc's cache enabled.
*/
private def getAnnexScalaInstance(allJars: Array[File], workDir: Path): AnnexScalaInstance = {
// We need to remove the sandbox prefix from the paths in order to compare them.
// We want to compare short paths to avoid the Bazel sandbox prefix and arch/config specific parts of the path
// We use a set because we don't care about ordering for comparison purposes
val mapBuilder = Map.newBuilder[Path, Path]
val keyBuilder = Set.newBuilder[Path]
val absoluteWorkDir = workDir.toAbsolutePath().normalize()
allJars.foreach { jar =>
val comparableJarPath = jar.toPath().toAbsolutePath().normalize()
mapBuilder.addOne(jar.toPath -> workDir.toAbsolutePath().normalize().relativize(comparableJarPath))
val absoluteJarPath = jar.toPath().toAbsolutePath().normalize()

// Remove the arch/config specific parts of the path.
val comparablePath = FileUtil.bazelShortPath(
// Remove the sandbox prefix from the path
absoluteWorkDir.relativize(absoluteJarPath),
replaceExternal = false,
)
mapBuilder.addOne(jar.toPath -> comparablePath)
keyBuilder.addOne(comparablePath)
}
val workRequestJarToWorkerJar = mapBuilder.result()

Expand All @@ -84,10 +95,10 @@ object AnnexScalaInstance {
// but that would require hashing the all the classpath jars on every compilation request. I
// imagine that would cause a performance hit.
//
// I also imagine it is extremeley rare to be mutating the contents of compiler classpath jars
// I also imagine it is extremely rare to be mutating the contents of compiler classpath jars
// while keeping the names the same, e.g., generating new scala library jar for scala 2.13.14.
// As a result I'm leaving this string based for now.
val key = workRequestJarToWorkerJar.values.mkString(":")
val key = keyBuilder.result()

Option(instanceCache.get(key)).getOrElse {
// Copy all the jars to the worker's directory because in a sandboxed world the
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@ package workers.common
import scala.annotation.tailrec
import java.io.{File, IOException}
import java.nio.channels.FileChannel
import java.nio.file.{FileAlreadyExistsException, FileVisitResult, Files, Path, SimpleFileVisitor, StandardCopyOption, StandardOpenOption}
import java.nio.file.{FileAlreadyExistsException, FileVisitResult, Files, Path, Paths, SimpleFileVisitor, StandardCopyOption, StandardOpenOption}
import java.nio.file.attribute.BasicFileAttributes
import java.util.zip.{ZipEntry, ZipInputStream, ZipOutputStream}

Expand Down Expand Up @@ -59,6 +59,28 @@ object FileUtil {
path.getFileName().toString().stripPrefix("header_").stripPrefix("processed_")
}

/**
* Given a Bazel path to the bazel output directory, return a path excluding the bazel configuration specific parts of
* the path. This is analogous to Bazel's File.short_path function. Fair warning: this function is super good enough,
jjudd marked this conversation as resolved.
Show resolved Hide resolved
* but is likely not perfect.
*/
def bazelShortPath(path: Path, replaceExternal: Boolean = true): Path = {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put this in a function because I use this code in the path mapping PR for the deps checker and want to be able to reuse it in that PR

val nameCount = path.getNameCount()

val shortPath = if (path.startsWith("bazel-out") && nameCount >= 4) {
path.subpath(3, nameCount)
} else {
path
}

// Handle difference between Bazel's external directory being referred to as .. in the short_path
if (replaceExternal && shortPath.subpath(0, 1) == Paths.get("external")) {
Paths.get("..").resolve(shortPath.subpath(1, shortPath.getNameCount() - 1))
} else {
shortPath
}
}

def copy(source: Path, target: Path) = Files.walkFileTree(source, new CopyFileVisitor(source, target))

def delete(path: Path) = Files.walkFileTree(path, new DeleteFileVisitor)
Expand Down
Loading