Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WFCORE-6750] Analysis document for the Unstable API Annotation scanning #569

Merged
merged 3 commits into from
Jul 2, 2024
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
248 changes: 248 additions & 0 deletions server/WFCORE-6750-unstable-api-annotation-scanner.adoc
Original file line number Diff line number Diff line change
@@ -0,0 +1,248 @@
---
categories:
- wf-galleon
- core

# Add any category for this proposal
# if missing, add it to _data/widfly-categories and use its id
---
= [Preview] Scan for usage of API elements annotated with annotations indicating unstable API
:author: Kabir Khan
:email: [email protected]
:toc: left
:icons: font
:idprefix:
:idseparator: -

:indexModule: org.wildfly._internal.unstable-api-annotation-index

== Overview

Libraries such as SmallRye and Hibernate have annotations to indicate that parts of their API are unstable, and may change at any time.

As an example, in the case of SmallRye the annotation is `io.smallrye.common.annotation.Experimental`, while Hibernate has `org.hibernate.Incubating`. Other libraries have similar annotations.

The aim is to provide a mechanism to tell the user that they are using parts of APIs that are marked as unstable with an annotation.

In a nutshell, this will be done via a two stage process:

* *Server build time*: Each feature pack will scan the full classpath looking for usage of the specified annotations. This results in an index file, which the feature pack makes sure will appear in a known location when the server is provisioned.
* *Server run time*: When an application is deployed into the server, the application classes will be scanned, looking for usage of API elements that have been recorded in the indices we created in the above step.

The 'server run time' part is all in WildFly Core, while the initial work on the 'server build time' stage is currently in the full WildFly codebase tracked by https://issues.redhat.com/browse/WFLY-19152[WFLY-19152]. Since these efforts are so closely related, it seems best to deal with them as a whole in this proposal.

=== Background/Evaluation of other approaches
My initial thought was to use jandex to do this, and this is the first thing mentioned by 'outsiders'. However, Jandex solves a different problem.

Jandex is great for finding classes, methods, fields etc. in a deploymemt that have been annotated with a known annotation.

In our case we want to inspect the user bytecode and find usage of members in libraries provided by the server, when those members have been annotated with one of the mentioned annotations.

While it _could_ be possible to do this at runtime, it would mean needing to inspect the bytecode to find all class and member references and either:

* load the classes referenced to find if the referenced member has any of the annotations indicating unstable API
* use the JBoss Modules visibility rules to locate the jars for each reference, and inspect the referenced classes/members for the annotations indicating unstable API

Neither of these approaches were seen as viable. While the inspection of the bytecode part would be the same, the checking of the referenced classes/members would be a lot more costly than the work done to reference the index approach we settled for. While implementing the index, we found that converting the `CONSTANT_Utf8_Info` entries in the constant pool from byte arrays to strings added huge overhead. In the current implementation we avoid doing this conversion until needed.

Both the two alternative ideas above would need this conversion to strings, since these `CONSTANT_Utf8_Info` entries contain the names of the classes and members.

== Issue Metadata

=== Issue

* https://issues.redhat.com/browse/WFCORE-6750[WFCORE-6750 - Scan for annotations indicating unstable API on deployment]

=== Related Issues

* https://issues.redhat.com/browse/WFLY-19152[WFLY-19152 - Add indices of APIs annotated with annotations indicating unstable API]

=== Stability Level
// Choose the planned stability level for the proposed functionality
* [ ] Experimental

* [x] Preview

* [ ] Community

* [ ] default

=== Dev Contacts

* mailto:{email}[{author}]
* mailto:[email protected][Paul Ferraro] (SME)
* mailto:[email protected][Emmanuel Hugonnet] (Outside Perspective)

=== QE Contacts

=== Testing By
// Put an x in the relevant field to indicate if testing will be done by Engineering or QE.
// Discuss with QE during the Kickoff state to decide this
* [x] Engineering

* [ ] QE

=== Affected Projects or Components

* WildFly
** the 'wildfly' and 'wildfly-ee' feature packs will scan their classpaths and provide the indices of API elements annotated with `io.smallrye.common.annotation.Experimental` and `org.hibernate.Incubating`.
* WildFly Core
** scans the applications when they are deployed, looking for usage of API elements that exist in the index.
* https://github.com/wildfly/unstable-api-annotation-utils[wildfly/unstable-api-annotation-utils] - contains a library providing:
** A Maven plugin which can be used by the feature packs to do the scanning
** A highly optimised bytecode scanner and lookup mechanism to find usage in user applications of API elements that appear in the indices created by the feature packs


=== Other Interested Projects

Feature packs external to WildFly, as mentioned in the 'Future Work' section.

=== Relevant Installation Types
// Remove the x next to the relevant field if the feature in question is not relevant
// to that kind of WildFly installation
* [x] Traditional standalone server (unzipped or provisioned by Galleon)

* [x] Managed domain

* [x] OpenShift s2i

* [x] Bootable jar

== Requirements

=== Hard Requirements
* Note this is at preview level only!
* There is a maven plugin that can be used in the feature pack builds.
** It will scan for annotations indicating unstable API, and add the following to the `content/` directory of the `{indexModule}` module (the `content/` directory is defined as a resource root:
*** a file containing the indices for annotations concerning the feature pack.
**** The file may be either a .txt or a .zip
** The maven plugin scans the full classpath of the feature pack looking for dependencies to index.
** For this iteration we will scan
*** the 'wildfly-ee' feature pack for the `org.hibernate.Incubating` annotation.
*** the 'wildfly' feature pack for the `io.smallrye.common.annotation.Experimental` annotation.
*** It also contains a simple configuration mechanism to narrow down which classpath entries to index (e.g when scanning for `io.smallrye.common.annotation.Experimental` we only check the SmallRye libraries on the classpath)
*** To reduce size as much as possible we will add the index files in zip format.
* In WildFly Core
** deployment unit processors will be added to:
*** Iterate all resources from the `{indexModule}` module. These resources are the .zip or .txt files containing the indices for each feature pack
*** Use the wildfly/unstable-api-annotation-utils library to scan the bytecode of every class in the user's deployed application and cross-reference against the loaded indices.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be provided by a module that is an optional dependency of whatever uses this. That module has property jboss-stability-level=preview.

That likely means the index files should be in a different module from this lib, one that's at default stability, so the plugin execution that adds them doesn't have to know whether the FP build is configured to allow preview stability modules in the FP.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

** The deployment unit processors will use the WildFly classloading rules to make sure that in the case of nested archives, that we only scan each class once. E.g. if a .war contains a `WEB-INF/lib/my-library.jar`, classes in `WEB-INF/lib/my-library.jar` will only be s scanned as part of the `WEB-INF/lib/my-library.jar` resource root scan, and not for the parent .war.
** Once scanned, usage of annotated API elements may be reported depending on configuration in the `core-management` subsystem:
*** A new child resource (currently called `service=unstable-api-annotations`) is added with a `level` attribute to configure the reporting
**** Since this is a resource/feature with a stability level of preview, if the resource is not present, no scanning or reporting is done.
****** The resource is added to the `preview` flavour of the current version of the model. At the time of writing it I was told there is no need to bump the version of management model constructs coming in lower than `community` level.
**** If the attribute is set to `log` (this is the default value for the attribute), the usages will be logged to the server.log. In this case the user application will be deployed.
**** If the attribute is set to `error`, the deployment will fail with an exception containing the usages.



=== Nice-to-Have Requirements
// Requirements in this section do not have to be met to merge the proposed functionality.
// Note: Nice-to-have requirements that don't end up being implemented as part of
// the work covered by this proposal should be moved to the 'Future Work' section.


=== Non-Requirements
// Use this section to explicitly discuss things that readers might think are required
// but which are not required.
* Full indexing of all annotations indicating unstable API
** See the 'Future Work' section for plans to get full coverage
* Inspection of user classes not in an application deployment, such as:
** User provided modules
** Classes provided by the user via the management model with the typical `module`/`class` attribute pairs.
* No attempt is made to not add the indices if the server is provisioned a a higher stability level than 'preview'. In other words, if the feature packs contain the indices, they will always end up in the server.
* Since this is a `preview` feature, and the model version is the same as the original with the added `preview` qualifier, model transformation is not needed.
Copy link
Contributor

@bstansberry bstansberry Jun 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be a layer for this. Not having it now can be a 'Non-Requirement' + 'Future Work'.

I think the layer is a requirement for moving to Community stability.

When the layer is created we have to decided if this requires opt-in or opt-out when provisioning with layers. Going to opt-in would be a breaking change vs this initial "always there" approach, but OTOH this is the kind of "I lived without this" functionality that is better handled as an opt-in.


=== Future Work
// Use this section to discuss requirements that are not addressed by this proposal
// but which may be addressed in later proposals.
* Promote this feature to Community or higher. Once we analyse what a community level feature would contain, the below are some ideas which should be taken into consideration when defining the set of enhancements.
* Promoting to Community or higher would allow us to enable the scanning and reporting out of the box by default, which (in my opinion) would make it more useful.
* Investigate the use of layers to provision the feature or not
** Decide whether to opt in or out of provisioning the scanner utility
** At the moment, the current feeling is that the zipped index files are small enough to not matter if they are provisioned or not. So the candidate would be the module containing the wildfly/unstable-api-annotation-utils jar.
* Identify all annotations indicating unstable API used by WildFly itself with the help of component leads.
** Scan for these annotations during feature pack builds, and make them available in a provisioned server.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would be the goal of this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To index similar annotations from other libraries than the current SmallRye and Hibernate families. E.g. I think JGroups has something. Apache might have something etc. i.e. get better coverage

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My question here is more about why we make them available in the provisioned server. But probably I misinterpreted the parent point.

If your meaning in the parent point is just to include more annotations from other components and check for their use in deployment code, I understand. If you're talking about scanning WF code for our own use of things, I understand why we'd do that, I'm just not clear why we'd want to write the result in the provisioned server.

In any case I'm not asking for any change in the document; this is just a discussion of what you had in mind.

* Engage with owners of feature packs external to WildFly, and identify all annotations indicating unstable API in those feature packs.
** Provide instructions on how to do the build time scan, and to make the indices available at runtime once their feature pack has been provisioned.
** Scan for these annotations during feature pack build, and make them available in a provisioned server.
** I think an appropriate set of feature packs would be the ones WildFly Glow is aware of.
* Investigate the feasibility of scanning user provided classes from other locations than deployment archives, such as:
** in user modules
** in classes referenced via the `module`/`class` attribute pairs in the management model
* Investigate adding a deployment marker to force enable/disable the runtime scanning independent of the subsystem setting
** This could possibly be enhanced with a filter so a user could say to ignore usage of constructs annotated with `org.hibernate.Incubating` (or perhaps an expression to ignore classes matching the expression), while reporting usage of everything else (e.g. constructs annotated with `io.smallrye.common.annotation.Experimental`).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're going to allow something like this, which involves the user identifying annotations of interest, then having that config do a mapping to a stability level seems good. Then how we treat the presence of the thing depends on the server's stability level.

If we found something that wasn't configured, that would mean 'not allowed' (so log it or error). IOW, not configured != experimental. Someone running a --stability=experimental server can still reject APIs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I consider this more brainstorming, than a "must" have. I've updated the start of the future work section to say we will decide the scope when going to community level, and that these points are ideas

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great; that was my intent. :)


== Backwards Compatibility

// Does this enhancement affect backwards compatibility with previously released
// versions of WildFly?
// Can the identified incompatibility be avoided?

=== Default Configuration

Since this is a preview stability feature, this will not be added to the default configuration.

=== Importing Existing Configuration

Importing existing configuration will be fine, as this is a new feature.

=== Deployments

If the feature is enabled, deployments will be scanned for usage of API elements existing in the mentioned indices.

=== Interoperability

N/A

== Security Considerations

////
Identification if any security implications that may need to be considered with this feature
or a confirmation that there are no security implications to consider.
////
I don't believe there are any security issues.

== Test Plan

Tests will be added in both WildFLy and WildFly Core

WildFly Core:

* The WildFly core feature pack is only used internally for testing, so we will not scan for annotations indicating unstable API with the plugin here. Also, it should be noted that the usage of such annotations in third party components is outside of our control, and likely to change (e.g. an annotated member might 'mature' to no longer have the annotation, or the member might be removed in a future release). Meanwhile, the tests for the scanning mechanism should be in WildFly Core since that is where the deployment unit processors live.
** Thus, a test feature pack will be created offering an API (including its own annotations for indicating unstable API) and a subsystem providing this API on the classpath.
** The feature pack will be indexed and the indices will be added to the `{indexModule}` module.
** The test will check that
*** the index file is added to the `${indexModule}` module, as outlined in the requirements
*** when deploying an application using annotated API elements from the test feature pack API, usage of those is reported in line with the configuration in the `subsystem=core-management/service=unstable-api-annotations` resource (as outlined in the requirements). We will test both with
**** `level=log` - and inspect that the log messages pick up all the usages of annotated API elements.
**** `level=error` - and inspect that the application fails to deploy, and that the error message picks up all the usages of annotated API elements.
** The test will run at the `preview` stability level

WildFly:

* Since we have done the main testing of the mechanism in WildFly Core, we will not test that here
* The verifier plugin in each feature pack will verify that the `{indexModule}` module in each feature pack contains the expected index files.
** The 'wildfly-ee' feature pack expects `wildfly-ee-feature-pack.zip`
** The 'wildfly' feature pack expects `wildfly-ee-feature-pack.zip` and `wildfly-galleon-pack.zip`
* A test will be added to `testsuite/integration/microprofile` which runs in an execution provisioning both the 'wildfly-ee' and 'wildfly' feature packs. It will test a number of differently packaged deployments, to make sure that each class only gets scanned once, in order to test that the annotated class/member usage follows the WildFly classloading rules. The scanner outputs the class count when an undocumented system property is set to `true`.
** This test will run at the `preview` stability level

== Community Documentation
////
Generally a feature should have documentation as part of the PR to wildfly master, or as a follow up PR if the feature is in wildfly-core. In some cases though the documentation belongs more in a component, or does not need any documentation. Indicate which of these will happen.
////

Community documentations will be added as part of the PR to WildFly full. This PR will also contain the parts of the tests which live in WildFly, as mentioned in the previous section
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to beef this bit up a lot. I think not implementing everything by the 33 feature freeze is ok, but we should think about things like user journeys.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I think what you did is ok for Preview. A blog would be good though, if not now then as part of moving to Community.


== Release Note Content
The core-management subsystem now allows you to enable scanning of your deployments for usage of classes/methods in the SmallRye and Hibernate libraries annotated with `org.hibernate.Incubating` and `io.smallrye.common.annotation.Experimental`. These annotations indicate that those API elements are likely to change at any time
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/usage/the use/g

////
Draft verbiage for up to a few sentences on the feature for inclusion in the
Release Note blog article for the release that first includes this feature.
Example article: http://wildfly.org/news/2018/08/30/WildFly14-Final-Released/.
This content will be edited, so there is no need to make it perfect or discuss
what release it appears in. "See Overview" is acceptable if the overview is
suitable. For simple features best covered as an item in a bullet-point list
of features containing a few words on each, use "Bullet point: <The few words>"
////