Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SIP-52 - Binary APIs #58

Merged
merged 17 commits into from
Oct 23, 2023
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
276 changes: 276 additions & 0 deletions content/binary-api.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,276 @@
---
layout: sip
permalink: /sips/:title.html
stage: implementation
status: waiting-for-implementation
title: SIP-52 - Binary APIs
---

**By: Author Nicolas Stucki**

## History

| Date | Version |
|---------------|------------------------|
| Feb 27 2022 | Initial Draft |
| Aug 16 2022 | Single Annotation |
| Aug 24 2022 | Change Annotation Name |

## Summary

The purpose of binary APIs is to have publicly accessible definitions in generated bytecode for definitions that are package private or protected.
This proposal introduces the `@publicInBinary` annotation on term definitions and the `-WunstableInlineAccessors` linting flag.


## Motivation

### Provide a sound way to refer to private members in inline definitions

Currently, the compiler automatically generates accessors for references to private members in inline definitions. This scheme interacts poorly with binary compatibility. It causes the following three unsoundness in the system:
* Changing any definition from private to public is a binary incompatible change
* Changing the implementation of an inline definition can be a binary incompatible change
* Removing final from a class is a binary incompatible change

You can find more details in [https://github.com/lampepfl/dotty/issues/16983](https://github.com/lampepfl/dotty/issues/16983)

### Avoid duplication of inline accessors

Ideally, private definitions should have a maximum of one inline accessor, which is not the case now.
When an inline method accesses a private/protected definition that is defined outside of its class, we generate an inline in the class of the inline method. This implies that accessors might be duplicated if a private/protected definition is accessed from different classes.

### Removing deprecated APIs

There is no precise mechanism to remove a deprecated method from a library without causing binary incompatibilities. We should have a straightforward way to indicate that a method is no longer publicly available but still available in the generated code for binary compatibility.
Copy link

@JD557 JD557 May 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this an actual problem?

Based on the Semantic Versioning spec and a discussion with clarifications, my understanding is that the semver way would require a MAJOR version bump to remove a deprecated method.

Binary incompatibility of different major versions seems like a reasonable expectation.

There are obviously other version schemes other than semver, but I imagine the handling of deprecated APIs is similar (e.g. I think PVP is even more extreme, where both deprecating and removing the API would require a major bump)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is an actual problem because the Scala ecosystem, and therefore use of SemVer, is rooted in binary and TASTy compatibility. You want to be able to make source breaking changes while retaining binary and TASTy compatibility, and that should stay a MINOR version bump. Perhaps you are mitigating the source breakage with extension methods or new overloads instead.

(If breaking source compat required a MAJOR version bump in Scala, every new version of a library with a new public class or method would be a MAJOR version, and we don't want that for obvious reasons.)


```diff
- @deprecated(...) def myOldAPI: T = ...
+ private[C] def myOldAPI: T = ...
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be private[pkg] ? The behavior of private[C] is to generate a name-mangled definition C$$myOldAPI to avoid clashes in case a subclass defines its own myOldAPI.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure in which cases private[C] generates a mangled name. I tried the following and non of the names get mangled.

class C:
  private[C] def f = 1
object C:
  private[C] def f = 1
trait B:
  private[B] def f = 1
class A extends B

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do get the mangling for private[this] val in traits.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried the following and non of the names get mangled.

Ah, it looks like the mangling only happens in Scala 2, not Scala 3. So that's still an issue for cross-compiling codebases.

```
Comment on lines +41 to +48

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this ties-in well with the discussion in lightbend-labs/mima#724. To summarize:

  1. MiMa currently ignores changes to private[package] APIs, because the assumption is that these are internal to the library. This improves the signal-to-noise when making such internal changes.
  2. As noted here, package[private] is often used for removing a public API without breaking bincompat. But, due to (1), MiMa will no longer check these now-package-private APIs for binary-compatibility, which is dangerous. The workaround is to check MiMa against every previous artifact, but this has its own costs/risks.
  3. By adding an explicit annotation to denote public APIs, MiMa would be able to distinguish between internal APIs and once-public-now-removed APIs that must otherwise maintain binary-compatibility.

So, I hope that integration with MiMa will be considered as part of this proposal :)

Big 👍

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this proposal was partially designed to address that MiMa problem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the link lightbend-labs/mima#724 to the SIP document.


Related to discussion in [https://github.com/lightbend/mima/discussions/724](https://github.com/lightbend/mima/discussions/724).

### No way to inline reference to private constructors

It is currently impossible to refer to private constructors in inline methods.
```scala
class C private()
object C:
inline def newC: C = new C() // Implementation restriction: cannot use private constructors in inline methods
```
If users want to access one of those, they must write an accessor explicitly. This extra indirection is undesirable.
```scala
class C private()
object C:
private def newCInternal: C = new C()
inline def newC: C = newCInternal
```

## Proposed solution

### High-level overview

This proposal introduces the `@publicInBinary` annotation, and adds a migration path to inline methods in libraries (requiring binary compatibility).

#### `@publicInBinary` annotation

A binary API is a definition that is annotated with `@publicInBinary` or overrides a definition annotated with `@publicInBinary`.
This annotation can be placed on `def`, `val`, `lazy val`, `var`, `object`, and `given` definitions.
A binary API will be publicly available in the bytecode.

This annotation cannot be used on `private`/`private[this]` definitions. With the exception of class constructors.

Removing this annotation from a non-public definition is a binary incompatible change.

Example:

~~~ scala
class C {
@publicInBinary private[C] def packagePrivateAPI: Int = ...
@publicInBinary protected def protectedAPI: Int = ...
@publicInBinary def publicAPI: Int = ... // warn: `@publicInBinary` has no effect on public definitions
}
~~~
will generate the following bytecode signatures
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we're talking about bytecode it would be good to actually specify what we're generating at the bytecode level exactly. I assume that public in the code example below maps to ACC_PUBLIC in https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-4.html, but we might also want to use ACC_SYNTHETIC so that these definitions stay hidden from javac and java IDEs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if adding ACC_SYNTHETIC is always a good idea.

  • @binaryAPI def publicAPI: Int = ... should not be marked as ACC_SYNTHETIC because it is truly public.
  • Would @binaryAPI protected def publicAPI: Int = ... overridable in JAVA if it has ACC_SYNTHETIC?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would @binaryapi protected def publicAPI: Int = ... overridable in JAVA if it has ACC_SYNTHETIC?

Yeah if you want the method to be overridable then it can't be ACC_SYNTHETIC.

~~~ java
public class C {
public C();
public int packagePrivateAPI();
public int protectedAPI();
public int publicAPI();
}
~~~

In the bytecode, `@publicInBinary` definitions will have the [ACC_PUBLIC](https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-4.html#jvms-4.1-200-E.1) flag.
<!-- We can also set the [ACC_SYNTHETIC](https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-4.html#jvms-4.1-200-E.1) to hide these definitions from javac and java IDEs. -->
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has this been studied further? Is there a conclusion?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found that we currently need to do that for other accessors. I opened a discussion in scala/scala3#17461. @binaryAPI should have ACC_SYNTHETIC if and only if other accessors.


#### Binary API and inlining

A non-public reference in an inline method is handled as follows:
- if the reference is a `@publicInBinary` the reference is used;
- otherwise, an accessor is automatically generated and used.

Example:
~~~ scala
import scala.annotation.publicInBinary
class C {
@publicInBinary private[C] def a: Int = ...
private[C] def b: Int = ...
@publicInBinary protected def c: Int = ...
protected def d: Int = ...
inline def foo: Int = a + b + c + d
}
~~~
before inlining the compiler will generate the accessors for inlined definitions
~~~ scala
class C {
@publicInBinary private[C] def a: Int = ...
private[C] def b: Int = ...
@publicInBinary protected def c: Int = ...
protected def d: Int = ...
final def C$$inline$b: Int = ...
final def C$$inline$d: Int = ...
inline def foo: Int = a + C$$inline$b + c + C$$inline$d
}
~~~

##### `-WunstableInlineAccessors`

In addition we introduce the `-WunstableInlineAccessors` flag to allow libraries to detect when the compiler generates unstable accessors.
The previous code would show a linter warning that looks like this:

~~~
-- [E...] Compatibility Warning: C.scala -----------------------------
| inline def foo: Int = a + b + c + d
| ^
| Unstable inline accessor C$$inline$b was generated in class C.
|
| longer explanation available when compiling with `-explain`
-- [E...] Compatibility Warning: C.scala -----------------------------
| inline def foo: Int = a + b + c + d
| ^
| Unstable inline accessor C$$inline$d was generated in class C.
|
| longer explanation available when compiling with `-explain`
~~~

When an accessor is detected we can tell the user how to fix the issue. For example we could use the `-explain` flag to add the following details to the message.

<details>
<summary>With `-WunstableInlineAccessors -explain`</summary>

~~~
-- [E...] Compatibility Warning: C.scala -----------------------------
| inline def foo: Int = a + b + c + d
| ^
| Unstable inline accessor C$$inline$b was generated in class C.
|-----------------------------------------------------------------------------
| Explanation (enabled by `-explain`)
|- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| Access to non-public method b causes the automatic generation of an accessor.
| This accessor is not stable, its name may change or it may disappear
| if not needed in a future version.
|
| To make sure that the inlined code is binary compatible you must make sure that
| method b is public in the binary API.
| * Option 1: Annotate method b with @publicInBinary
| * Option 2: Make method b public
|
| This change may break binary compatibility if a previous version of this
| library was compiled with generated accessors. Binary compatibility should
| be checked using MiMa. If binary compatibility is broken, you should add the
| old accessor explicitly in the source code. The following code should be
| added to class C:
| @publicInBinary private[C] def C$$inline$b: Int = this.b
-----------------------------------------------------------------------------
-- [E...] Compatibility Warning: C.scala -----------------------------
| inline def foo: Int = a + b + c + d
| ^
| Unstable inline accessor C$$inline$d was generated in class C.
|-----------------------------------------------------------------------------
| Explanation (enabled by `-explain`)
|- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| Access to non-public method d causes the automatic generation of an accessor.
| This accessor is not stable, its name may change or it may disappear
| if not needed in a future version.
|
| To make sure that the inlined code is binary compatible you must make sure that
| method d is public in the binary API.
| * Option 1: Annotate method d with @publicInBinary
| * Option 2: Make method d public
|
| This change may break binary compatibility if a previous version of this
| library was compiled with generated accessors. Binary compatibility should
| be checked using MiMa. If binary compatibility is broken, you should add the
| old accessor explicitly in the source code. The following code should be
| added to class C:
| @publicInBinary private[C] def C$$inline$d: Int = this.d
-----------------------------------------------------------------------------
~~~

</details>

### Specification

We must add `publicInBinary` to the standard library.

```scala
package scala.annotation

final class publicInBinary extends scala.annotation.StaticAnnotation
```

#### `@publicInBinary` annotation

* Only valid on `def`, `val`, `lazy val`, `var`, `object`, and `given`.
* TASTy will contain references to non-public definitions that are out of scope but `@publicInBinary`. TASTy already allows those references.
* The annotated definitions will be public in the generated bytecode. Definitions should be made public as early as possible in the compiler phases, as this can remove the need to create other accessors. It should be done after we check the accessibility of references.

#### Inline

* Inlining will not require the generation of an inline accessor for binary APIs.
* The user will be warned if a new inline accessor is automatically generated under `-WunstableInlineAccessors`.
The message will suggest `@publicInBinary` and how to fix potential incompatibilities.

### Compatibility

The introduction of the `@publicInBinary` do not introduce any binary incompatibility.

Using references to `@publicInBinary` in inline code can cause binary incompatibilities. These incompatibilities are equivalent to the ones that can occur due to the unsoundness we want to fix. When migrating to binary APIs, the compiler will show the implementation of accessors that the users need to add to keep binary compatibility with pre-publicInBinary code.

### Other concerns

* Tools that analyze inlined TASTy code will need to know about `@publicInBinary`. For example [MiMa](https://github.com/lightbend/mima/) and [TASTy MiMa](https://github.com/scalacenter/tasty-mima).

## Alternatives

### Add a `@binaryAccessor`
This annotation would generate an stable accessor. This annotation could be used on `private` definition. It would also mitigate [migration costs](https://gist.github.com/nicolasstucki/003f7293941836b08a0d53dbcb913e3c) for library authors that have published unstable accessors.

* Implementation https://github.com/lampepfl/dotty/pull/16992


### Make all `private[C]` part of the binary API

Currently, we already make `private[C]` public in the binary API but do not have the same guarantees regarding binary compatibility.
For example, the following change is binary compatible but would remove the existence of the `private[C]` definition in the bytecode.
```diff
class C:
- private[C] def f: T = ...
```
We could change the rules to make all `private[C]` part of binary compatible to flag such a change as binary incompatible. This would imply that all these
methods can be accessed directly from inline methods without generating an accessor.

The drawback of this approach is that that we would need to force users to keep their `private[C]` methods even if they never used inline methods.


## Related work

* Initial discussions: [https://github.com/lampepfl/dotty/issues/16983](https://github.com/lampepfl/dotty/issues/16983)
* Initial proof of concept (outdated): [https://github.com/lampepfl/dotty/pull/16992](https://github.com/lampepfl/dotty/pull/16992)
* Single annotation proof of concept: [https://github.com/lampepfl/dotty/pull/18402](https://github.com/lampepfl/dotty/pull/18402)
* Community migration analysis: [Gist](https://gist.github.com/nicolasstucki/003f7293941836b08a0d53dbcb913e3c)
* Kotlin: [PublishedApi](https://kotlinlang.org/api/latest/jvm/stdlib/kotlin/-published-api/) plays the same role as `@publicInBinary`
but its interaction with (inline definitions)[https://kotlinlang.org/docs/inline-functions.html#restrictions-for-public-api-inline-functions]
is stricter as they do not support automatic accessor generation.

<!-- ## FAQ -->