Come up with guidance on how to add experimental APIs to stable signals #4257

Closed
tigrannajaryan opened this issue Oct 9, 2024 · 12 comments · Fixed by #4270
Labels
triage:deciding:community-feedback (Open to community discussion. If the community can provide sufficient reasoning, it may be accepted)
triage:followup (Needs follow up during triage)

Comments

@tigrannajaryan
Member

As existing stable signals in OTel mature, we are likely to see the need to evolve them so that new APIs are first added in unstable form and then stabilized. We currently have no guidance on how this should be achieved. Here is, for example, a case where an attempt to add an experimental implementation became difficult: language maintainers would like a stable spec before accepting an implementation, but we do not allow the spec to stabilize before there are implementations in several languages, so we have a chicken-and-egg problem.

I believe we need to come up with a process that explains how these experimental additions to existing stable APIs are done in the spec and in the language implementations. One possible way is to leverage the recently added (but not widely used in the spec) more granular maturity levels. Language implementations will likely also need to find a way to bring unstable additions to existing signals (some implementations already have this ability, e.g. Java).

This issue is a request for comments: language maintainers and spec sponsors, please tell us what you think about the need for such guidance, and if you have ideas, please make proposals.

cc @open-telemetry/technical-committee @open-telemetry/spec-sponsors

@mx-psi
Member

mx-psi commented Oct 9, 2024

I think this is going to differ significantly depending on the language and the tools available there. For example, Rust features are typically used for this kind of experimental API, while Go build tags (the closest equivalent concept in Go) are not typically used in this fashion in the Go ecosystem and may feel unfamiliar to end users.

For Go in particular, the Collector SIG may be able to provide ideas here: the recent work on adding profiling, as well as part of the 1.0 work, has involved a lot of "figuring out how to isolate experimental bits from other parts already marked as 1.0/scheduled to be marked as 1.0". Some tools we have used are optional interfaces that may be implemented by a struct that is exposed through a more generic interface (e.g. component.Host is the generic interface while componentstatus.Reporter is the concrete, experimental one), or re-exporting the API of internal packages into two different modules, one stable and one experimental (a simple example here is the constants in pipeline vs. the ones in componentprofiles). We have collaborated with @dmathieu on some of this work, so I am sure the Go SIG is aware of it, but maybe we can put this into writing :)
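
For readers who have not seen the optional-interface technique mentioned above, here is a minimal sketch of the general idea. All names in it (Host, StatusReporter, TryReport and the two host types) are hypothetical stand-ins chosen for illustration; they are not the Collector's actual component.Host or componentstatus.Reporter definitions.

```go
package main

import "fmt"

// Host is a hypothetical stable, generic interface.
type Host interface {
	Name() string
}

// StatusReporter is a hypothetical experimental interface. It is kept
// separate from Host so it can change without breaking Host implementers.
type StatusReporter interface {
	ReportStatus(status string)
}

// basicHost implements only the stable interface.
type basicHost struct{}

func (basicHost) Name() string { return "basic" }

// reportingHost implements the stable interface and opts in to the
// experimental one.
type reportingHost struct {
	statuses []string
}

func (h *reportingHost) Name() string { return "reporting" }

func (h *reportingHost) ReportStatus(status string) {
	h.statuses = append(h.statuses, status)
}

// TryReport receives the generic Host and uses the experimental capability
// only when the concrete value also implements StatusReporter.
func TryReport(h Host, status string) bool {
	r, ok := h.(StatusReporter)
	if !ok {
		return false
	}
	r.ReportStatus(status)
	return true
}

func main() {
	fmt.Println(TryReport(basicHost{}, "ok"))      // prints false
	fmt.Println(TryReport(&reportingHost{}, "ok")) // prints true
}
```

In this sketch the stable Host interface never changes; callers discover the experimental capability with a type assertion, so existing implementations keep compiling.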

@jack-berg
Member

@tigrannajaryan does this belong in opentelemetry-specification?

@tigrannajaryan
Member Author

> @tigrannajaryan does this belong in opentelemetry-specification?

I thought this was not just a spec issue but one that also impacts language implementations. We can move it to the spec if you feel that's a better place for this discussion.

@jack-berg
Member

No strong preference - just wasn't sure if it was intentional.

@dmathieu
Member

dmathieu commented Oct 10, 2024

I agree with @mx-psi that the behavior really depends on the language and what it offers.
Go is rather strict there. An interface can't add a new method; that would be a breaking change for any implementation of that interface (the Go API solved that with embedded/trace/noop interfaces, but that can't be done for the SDK).
So we have to implement new interfaces that define the new behavior (see the WIP OnEnding implementation).

In Ruby, for example, adding new interface methods is a trivial change thanks to respond_to? (see their implementation of OnEnding).

While we can definitely provide some guidance or ideas, this is so dependent on the language that it seems difficult to provide a general way things should be done.

@svrnm
Member

svrnm commented Oct 14, 2024

Moved this to the spec, since the spec drives the implementation. We can also discuss this on the Maintainers Call or in #otel-maintainers for broad feedback.

@svrnm added the triage:deciding:community-feedback label Oct 14, 2024
@jack-berg
Member

I don't think anyone is going to argue that there aren't language-specific idiosyncrasies to work through.

The "guidance" that this issue mentions might be better described as: a strong recommendation, and maybe even a requirement, for languages to develop tooling that allows them to prototype new features in both the API and SDK.

Reasoning:

  • We absolutely need the ability to continue to evolve the API and SDK.
  • When we make changes, the stakes are high, so we absolutely need prototypes that confirm the changes make sense.
  • As we all know, language implementations have their own idiosyncrasies, so we need prototypes across a variety of languages to improve confidence of correctness.

Right now, some but not all languages have invested in developing the tooling for prototyping in the API / SDK. Is it acceptable to rely on all the prototyping coming from a subset of implementations? Do we get a strong enough signal of correctness with prototypes missing in key languages?

I think we do need prototyping capabilities in all languages, and this should be mandated by the spec. In the same way that the spec mandates the separation of API and SDK artifacts, the ability for language implementations to prototype new features seems to be a key component.

@pellared
Member

pellared commented Oct 16, 2024

@jack-berg, just to clarify: I think you want not only "prototyping capabilities" but "capabilities of releasing experimental APIs". A prototype can be simply a draft PR, a fork, etc. A prototype is something that can be used just for a showcase or demo and then thrown away. Here, I think you are asking for a means to publish experimental APIs.

Adding experimental APIs can be more problematic in certain scenarios. For instance, adding a new method to an interface in the API is a breaking change in the Go ecosystem.

The issue is that in Go, for instance, the "source code" is the thing that is released, in the form of a git tag. A user-friendly way of publishing experimental APIs in Go is to develop them in long-lived branch(es) and create dedicated releases for them, e.g. by tagging them as v1.2.0-exp.

The other popular way is keeping experimental APIs in separate repositories or packages. This approach means that duplicate packages will need to be maintained. It also means users will need to rewrite their code to switch between stable and experimental (as the import path is different). You can think of it as separate OpenTelemetry Clients.

Whichever approach is taken, the main problem is maintainability (and maybe also a lack of active contributors). For instance, in Go we are already behind the specification, so we want to avoid such situations. So the issue is not "we cannot technically do it", but more "we may not have capacity to handle it". I think that in such scenarios it may be more pragmatic to only require "prototypes" and not "published experimental APIs".

@dashpole is working hard on documenting possible strategies for publishing experimental features depending on use cases: open-telemetry/opentelemetry-go#5882

@jack-berg
Member

Yes. For the purposes of obtaining a signal that a new proposed feature is correct and ready for stabilization, prototypes need to have a higher bar than an unmerged PR. Users need to be able to call the thing.

@MrAlias
Contributor

MrAlias commented Oct 17, 2024

> Users need to be able to call the thing.

Why can't users call the thing from the branch/fork that sourced the PR?

@jack-berg
Member

> Why can't users call the thing from the branch/fork that sourced the PR?

It's definitely a grey area because with enough effort, anything is possible. The bar for a user needs to be low enough that they actually use the thing, allowing us to get the signal we need.

@MrAlias
Contributor

MrAlias commented Oct 17, 2024

> > Why can't users call the thing from the branch/fork that sourced the PR?

> It's definitely a grey area because with enough effort, anything is possible. The bar for a user needs to be low enough that they actually use the thing, allowing us to get the signal we need.

The bar also needs to be set high enough that users do not just start using experimental features with the expectation of stability.

I do not think it is pragmatic to completely rule out the possibility that things are showcased in PRs/branches/forks, especially for languages where this makes the most sense.

tigrannajaryan added a commit to tigrannajaryan/opentelemetry-specification that referenced this issue Oct 22, 2024
Resolves open-telemetry#4257

This issue was discussed in the spec SIG meeting on 22 Oct 2024, and
the decision was made that we want this to be a requirement for
language implementations.

This is a new requirement for implementations, which we believe
becomes more and more important now that we have Stable signals that
we would like to continue evolving.
@github-actions bot added the triage:followup label Oct 29, 2024
carlosalberto pushed a commit to carlosalberto/opentelemetry-specification that referenced this issue Oct 31, 2024
7 participants