Proposal: Non-GC object allocations using annotation hints #2

128 changes: 128 additions & 0 deletions proposals/allocation-hints.md

---
layout: doc-page
title: "Object allocation hints"
author: Wojciech Mazur
---

**By: Wojciech Mazur**

## History

| Date | Version |
|---------------|--------------------|
| Jan 06th 2022 | Initial Draft |

## Summary
Allow class constructor calls to be annotated with a hint describing how the object should be allocated: on the heap using the GC, on the stack, or via dedicated arena allocators (`Zone`s).

## Motivation
Due to the object-oriented nature of Scala, one of the most frequent operations performed by the runtime is the allocation of new objects. Currently, all objects are by default allocated and managed by the GC. The only exception is the experimental allocation of class instances using `scala.scalanative.memory.SafeZone`, where an arena allocator, specified explicitly or implicitly, is used to allocate and free the required memory. It is powered by the Dotty Capture Calculus (project Caprese), which provides type safety over allocated memory; in particular, it allows illegal memory states such as use-after-free to be detected at compile time. Unfortunately, it will take an undetermined amount of time until this feature becomes stable and publicly available in stable Scala versions.

## Proposed solution
Until type-safe memory management for allocated objects is available, users might be tempted to allocate and manage objects manually. Unfortunately, there is currently no syntax for allocating class instances on the stack or in an unsafe memory zone. One of the reasons is the lack of a good representation of handles to class constructors, as they are strongly entangled with the `new` operator. One workaround would be to use a dedicated `apply` method, but it would require a large amount of boilerplate and would not work with third-party types, e.g. the Java standard library.

To solve this issue, we propose annotation-based hints for the compiler, which would make the Scala Native backend emit custom allocators for objects. Because the hints are annotations, we can provide stubs for other platforms (JVM, Scala.js), allowing for a zero-cost abstraction and easy cross-compilation.
We propose to initially provide 3 allocation hints, informing the compiler whether a given object should be allocated on the stack, on the heap using the GC, or in a memory zone provided by the user.

The annotations can only be applied to local `val`/`var` statements or as the type annotation of an expression. It would be illegal to annotate a member of a class: its allocation is determined by the caller of the class constructor.

Custom allocations not involving the GC are possible only because Scala Native has never introduced object finalizers. Since we don't need to track which objects would be collected by the GC and might require finalization code to run, we can safely allocate them on the stack or in an explicit zone, allowing for the most efficient freeing of resources.

I don't know enough about this, but based on the trickery used in https://github.com/scala-native/scala-native/blob/86551250b939bab687281bea0af81572d2638ab8/javalib/src/main/scala/java/nio/MappedByteBufferData.scala#L18-L35, isn't there some finalization logic regarding WeakReferences? Won't those require some special handling?

I assume WeakReferences allocated on the stack or zone won't have anything special, but what about the WeakReferenceRegistry?

Contributor Author

Right, these are special. WeakReferences have special handling which allows a weak reference finalizer to be added internally. Luckily, these finalizers currently cannot be defined by users (they're private to the javalib, which does not produce the JVM bytecode required by the Scalac compiler).
This might be one of the cases in which we ignore the user's allocation hint and give a warning/error instead - this can easily be done at compile time.


## API
The user-facing part of the interface involves only a set of annotations triggering special handling in the Scala Native compiler plugin.

```scala
import scala.annotation.meta
import scala.scalanative.unsafe.Zone

@meta.field
sealed abstract class allocationHint extends scala.annotation.StaticAnnotation

object allocationHint {
  // Would allocate annotated objects on the stack
  final class stack extends allocationHint

  // Would allocate annotated object using GC (default allocation type), can override allocations in blocks
  final class gc extends allocationHint

  // Would allocate annotated object using memory zone
  final class zone(implicit zone: Zone) extends allocationHint
}

```
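As an illustration of the cross-platform stubs mentioned in the proposed solution, the sketch below shows how inert versions of these annotations might look on the JVM or Scala.js. It is only an assumption about how such stubs could be written: the local `Zone` trait is a hypothetical stand-in for `scala.scalanative.unsafe.Zone`, the `@meta.field` meta-annotation is left out for brevity, and `StubDemo` is an invented name.

```scala
import scala.annotation.StaticAnnotation

// Hypothetical stand-in for scala.scalanative.unsafe.Zone on non-native platforms
trait Zone

sealed abstract class allocationHint extends StaticAnnotation

object allocationHint {
  // On platforms without the Scala Native backend these carry no behavior
  final class stack extends allocationHint
  final class gc extends allocationHint
  final class zone(implicit zone: Zone) extends allocationHint
}

object StubDemo {
  import allocationHint._

  def build(): String = {
    // The hint is ignored here, so the object is allocated as usual
    @stack val sb = new StringBuilder("foo")
    sb.append("bar").toString
  }
}
```

With stubs like these, code annotated for Scala Native would still compile and behave as ordinary GC-managed code elsewhere.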

## Example of usage
```scala
import scala.scalanative.annotation.allocationHint._
import scala.scalanative.unsafe.Zone

object Examples {
  def localValues() = {
    // would allocate x on stack
    @stack val x = new String("foo")

    // would allocate y using gc
    @gc val y = new {}

    // would allocate z using implicit or explicit unsafe.Zone (or memory.SafeZone under Dotty with Capture Calculus enabled)
    given Zone = ???
    @zone val z1 = new {}
    @zone(using summon[Zone]) val z2 = new {}

    // Allocate every instance in rhs on the stack with exception of `x2` which would be allocated using GC
    @stack val block = {
      val x1 = new {}
      @gc val x2 = new {}
      (x1, x2)
    }
  }

  def anonymousInstance() = {
    // We cannot add annotations to expressions, use type annotation instead e.g. `<expr>: @stack`
    // Allocate object on stack and pass it to function
    println(new {}: @stack)

    locally {
      val x1 = new {}
      @gc val x2 = new {}
      (x1, x2)
    }: @stack
  }
}

object InvalidUsage {
  // class members cannot have defined allocation hint
  @stack val forbidden = new String("forbidden") // compile-error

  locally {
    // Multiple hints create ambiguity
    @stack @gc val ambiguous = new {} // compile-error
  }
}

```
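The scoping rule suggested by the examples (a direct hint wins, otherwise the nearest enclosing hint applies, otherwise the GC is used) can be sketched as a small pure function. This is a toy model for illustration only, not the compiler plugin's logic; `Hint`, `HintResolution` and `effective` are hypothetical names.

```scala
// Toy model of the three proposed hints
sealed trait Hint
object Hint {
  case object Stack extends Hint
  case object Gc extends Hint
  case object InZone extends Hint
}

object HintResolution {
  // `direct` is the hint written on the definition itself (if any);
  // `enclosing` lists the hints of the surrounding blocks, innermost first.
  def effective(direct: Option[Hint], enclosing: List[Option[Hint]]): Hint =
    direct
      .orElse(enclosing.flatten.headOption) // nearest enclosing hint
      .getOrElse(Hint.Gc)                   // assumed default: GC allocation
}
```

For instance, `x1` in the `@stack val block = { ... }` example corresponds to `effective(None, List(Some(Hint.Stack)))`, while `x2` corresponds to `effective(Some(Hint.Gc), List(Some(Hint.Stack)))`.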

The draft of the solution for annotation hints is available in https://github.com/WojciechMazur/scala-native/tree/feature/class-allocation-hints

## Interaction between custom allocators and GC
One of the most important aspects of safely executing programs that use custom allocation hints is runtime safety. We need to ensure that objects allocated on the stack or in a zone, whose fields may refer to objects managed by the GC, remain reachable by the GC while scanning.

In the case of allocation on the stack, Scala Native needs little to no modification to make it safe when working with the GC. Currently, all GC implementations already scan the stack of each thread. If an object were allocated on the stack, we would always reach its inner fields when scanning.

However, custom allocation zones are not scanned by default, since these allocators can use any arbitrary chunk of memory internally. To make their usage safe, we would first need to inform the GC about custom roots for scanning. This strategy has already been applied successfully to `scala.scalanative.memory.SafeZone` and would not require many changes for other `scala.scalanative.unsafe.Zone` implementations.
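The reachability concern can be illustrated with a toy mark traversal. This is a simplified model under stated assumptions, not how any Scala Native GC is implemented: `Obj`, `ToyMark` and the string-based `region` tag are hypothetical, and real collectors scan raw memory rather than object graphs like this one.

```scala
import scala.collection.mutable

// Toy object: `region` is "gc", "stack" or "zone"; `fields` are outgoing references
final class Obj(val name: String, val region: String) {
  val fields: mutable.ListBuffer[Obj] = mutable.ListBuffer.empty
}

object ToyMark {
  // Traverses the graph from `roots` and returns the names of the
  // GC-heap objects that were reached (and would therefore be kept alive)
  def liveHeapObjects(roots: List[Obj]): Set[String] = {
    val seen = mutable.Set.empty[Obj]
    var work = roots
    while (work.nonEmpty) {
      val o = work.head
      work = work.tail
      if (seen.add(o)) work = o.fields.toList ::: work
    }
    seen.iterator.filter(_.region == "gc").map(_.name).toSet
  }
}

object ToyMarkDemo {
  val a = new Obj("a", "gc")
  val b = new Obj("b", "gc")
  val onStack = new Obj("s", "stack"); onStack.fields += a
  val inZone = new Obj("z", "zone"); inZone.fields += b

  // Only the stack is scanned: `b` is missed although the zone object still refers to it
  val stackOnly: Set[String] = ToyMark.liveHeapObjects(List(onStack))

  // The zone is registered as an additional root: both heap objects are found
  val withZoneRoot: Set[String] = ToyMark.liveHeapObjects(List(onStack, inZone))
}
```

In this model the stack-allocated object needs no extra work because its frame is already a root, whereas the zone-allocated object keeps `b` alive only once the zone is reported to the collector.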

## Indirect object allocations
In Scala code we often use the `apply` methods of companion objects to allocate new instances. It's especially useful when we need to execute some logic before calling an object constructor. We don't want to introduce any special handling for this kind of operation. Instead, we would fall back on the Scala 3 inlining mechanism. After the Scala 3 compiler's `inlining` phase, the inlined right-hand side of the annotated expression would be indistinguishable from a direct constructor call for the Scala Native backend, allowing it to apply the annotation hints of the caller expression.
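A minimal Scala 3 sketch of such an indirect allocation is shown below. The `stack` annotation here is a hypothetical no-op stand-in for the proposed `allocationHint.stack`, and `Point` and `InlineDemo` are invented for illustration; whether the backend would actually honor the hint after inlining is what the proposal describes, not something this snippet demonstrates.

```scala
import scala.annotation.StaticAnnotation

// Hypothetical no-op stand-in for the proposed `allocationHint.stack`
final class stack extends StaticAnnotation

final class Point(val x: Int, val y: Int)

object Point {
  // Validation runs before the constructor; `inline` (Scala 3) makes the
  // compiler copy this body into every call site
  inline def apply(x: Int, y: Int): Point = {
    require(x >= 0 && y >= 0, "coordinates must be non-negative")
    new Point(x, y)
  }
}

object InlineDemo {
  def sum(): Int = {
    // After inlining, the right-hand side contains the `new Point(...)` call
    // itself, so a backend could see the allocation next to the hint
    @stack val p = Point(3, 4)
    p.x + p.y
  }
}
```

A non-inline `apply` would hide the `new` behind a method call, which is why Scala 2, lacking this inlining phase, would be more limited here.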

## Opportunities
By providing an alternative, user-defined memory management for allocated objects, we might create an opportunity to reduce GC bottlenecks and the time spent on garbage collection. It might be beneficial for memory-intensive functions or areas of code that could be manually tuned for best performance.

## Risks
One of the biggest risks of the new feature is that it allows illegal memory states to be introduced in user programs, especially use-after-free errors when storing stack/zone-allocated memory in GC-allocated objects or returning it to the caller of the function. Some of these runtime errors could be detected at compile time, either in the Scala compiler or after the Scala Native optimizer has run (especially after inlining), by checking whether memory allocated on the stack is returned to the function caller, which might lead to undefined behavior at runtime. Without the compile-time capability checks of the Dotty Capture Calculus (project Caprese), we would never be able to detect all memory management issues.
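The two escape patterns named above can be sketched as follows. The `stack` annotation is a hypothetical no-op stand-in, so this snippet is harmless when compiled for the JVM; under the proposal, on Scala Native, both methods would leave a reference to memory that is released when the method returns. `Buffer`, `Holder` and `EscapeExamples` are invented names.

```scala
import scala.annotation.StaticAnnotation

// Hypothetical no-op stand-in for the proposed `allocationHint.stack`
final class stack extends StaticAnnotation

final class Buffer(val size: Int)

// A longer-lived, GC-managed object
final class Holder { var buf: Buffer = null }

object EscapeExamples {
  // Pattern 1: the stack-hinted object is returned to the caller
  def returned(): Buffer = {
    @stack val buf = new Buffer(4)
    buf // would dangle once this frame is popped
  }

  // Pattern 2: the stack-hinted object is stored in a GC-allocated object
  def stored(holder: Holder): Unit = {
    @stack val buf = new Buffer(4)
    holder.buf = buf // would outlive the frame that owns the memory
  }
}
```

The first pattern is the kind of escape that a check after inlining might catch; the second depends on where `holder` ends up and is harder to rule out without capture checking.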

## Compatibility
The changes would require adding information about the allocation hints to the NIR format. This can be done in a backward-compatible way: in the attached initial implementation we introduce a "normal" `nir.Op.{Class,Array}Alloc` instruction, which doesn't store any information about hints in NIR, and an alternative op-code for the less likely hinted allocations. The binary and source compatibility of the Scala/JVM `nir` definitions would be broken, as they're defined using case classes. We don't plan to add complexity to the toolchain by introducing alternative `nir.Op` instructions.
Comment on lines +124 to +125
@kubukoz kubukoz Jan 7, 2024

question: how would this affect Scala 2? Would the annotations be no-ops or simply not available at all?

Would @gc be the default (as it is now)? Or would the annotations only have an effect if GC is disabled at build time?

Contributor Author

> question: how would this affect Scala 2? Would the annotations be no-ops or simply not available at all?

While prototyping I've observed that we should be able to support it in Scala 2 as well; the only limitation would be the lack of inlining there. All the annotations used in the examples can also be applied in Scala 2.12 and 2.13. If for some reason we were not able to apply some of the annotations, they would become no-ops.

> Would @gc be the default (as it is now)? Or would the annotations only have an effect if GC is disabled at build time?

Yes, `@gc` would still be the default. The lack of an annotation hint (directly or in the enclosing scope) would always result in allocating using the GC. The main purpose of this annotation would be to override the hint defined for the enclosing scope.

locally {
  val x = new {} // No direct allocation hint, use enclosing block hint (@stack)
  @gc val y = new {} // Use explicit GC allocation instead of enclosing allocation
  locally {
    val z = new {} // No direct allocation hint, use enclosing block hint (@stack)
  }
}: @stack // Allocate everything in block on stack

That's nice, thanks!


## Related work
- [Safe Zone: Memory-safe zone ensured by capture checking and GC](https://github.com/scala-native/scala-native/pull/3120)