Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add error handling prototype #2244

Closed
wants to merge 13 commits into from
Closed
Show file tree
Hide file tree
Changes from 10 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions safety_prototype/.gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
__pycache__
*.egg
*.egg-info
64 changes: 64 additions & 0 deletions safety_prototype/README.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,64 @@
README
======

Overview
--------

This is a prototype to implement `error handling`_ in Python.

The user only calls API functions or methods, never SDK functions or methods.

Every SDK must implement every public function or method defined in the API
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if they don't? Will we crash on initialization or will missing/bad methods just always return the "default" value?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question, I am ok with crashing on initialization.

except for the SDK setting function.

A safety mechanism is applied to every API function or method (except for the
SDK setting function or method) and only to every API function or method.

This safety mechanism has 3 main components:

1. A predefined value to be returned in case of an exception being raised. This
value is predefined and independently set for every function or method.
2. A `try` / `except` block that catches any exception raised when executing
the function or method code.
3. A Python `warning`_ that is "raised" if an exception is raised in the code
protected by the safety mechanism.

The API provides a function to set a specific SDK. This function is
intentionally not protected by the safety mechanism mentioned above because the
specification `mentions`_ this:

The API or SDK may fail fast and cause the application to fail on
initialization...

*Initialization* is understood as the process of setting the SDK.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think arguably some would consider creating Tracers/Instruments/Exporters/Views part of the initialization process. Could we make the fail-fast behavior configurable for users who want to make sure long-lived objects like these are created successfully?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, what do you mean with fail fast behavior?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Failing on initialization

Copy link
Contributor Author

@ocelotl ocelotl Nov 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, but how would this work? Imagine this situation:

# application.py

set_sdk("sdk")  # This can fail fast

...  # Here the user does a lot of stuff including using the OpenTelemetry API

meter = create_meter(...)  # This can also fail fast?

That would mean that after a lot of code has been executed something suddenly fails because creating a meter is considered initialization. This behaves like something that breaks the application code. Now, someone may argue that meters, tracers, etc are always created before doing anything else but there is no guarantee that would be the case. We could make it configurable that the creating of these objects fail fast but I see almost no value in this, the moment one of these objects is created and something goes wrong a warning will be raised and the user will know and if they want the application to crash they can run it in error mode. I don't see why we need to handle the creation of these objects in a special way.


When an API function or method is called without an SDK being set, a warning
will be raised and the predefined value of `None` will be returned.

After an SDK is set, calling an API function or method will call its
corresponding SDJ function or method. Any exception raised by the SDK function
ocelotl marked this conversation as resolved.
Show resolved Hide resolved
or method will be caught by the safety mechanism and the predefined value
returned instead.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the API needs to provide the safety mechanism for the SDK. This would only work if the user does all instrumentation through the API which is not currently the case; instead we have the SDK as an implementation of the API. For example this should work fine:

## server.py
from opentelemetry.trace import TracerProvider

class Server:
  def __init__(self, tracer_provider: TracerProvider):
    self._tracer = tracer_provider.get_tracer(type(self)

  def index(request: Request) -> Response:
    with self._tracer.start_span("do-something"):
      # ...

## main.py
from opentelemetry.sdk.trace import TracerProvider
import server

tracer_provider = TracerProvider(...)

server.Server(tracer_provider).serve()

As I understand it, the error handling mechanism would not work here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it won't work here, but one of the central requirements of this prototype is that the user only uses API objects. I am aware of the fact that the user can currently use the SDK directly, I understand the argument, I just don't think it applies to this prototype.


The Python warning that is "raised" when an exception is raised in the SDK
function or method can be transformed into a full exception by running the
Python interpreter with the `-W error` option. This Python feature is used to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Im not too familiar with the warnings module. Is there a way to automatically turn this on when running tests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is an option that can be passed to Pytest, if I remember correctly, I have done that before.

satisfy this `specification requirement`_.

How to run
----------

0. Create a virtual environment and activate it
1. Run ``pip install -e opentelemetry-api``
2. Run ``pip install -e opentelemetry-sdk``
3. Run ``python application.py``
4. Run ``python -W error application.py``

Noice how even failed operations (divisions by zero) don't crash the
application in step 3, but they do in step 4.


.. _error handling: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/error-handling.md
.. _warning: https://docs.python.org/3/library/warnings.html
.. _specification requirement: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/error-handling.md#configuring-error-handlers
.. _mentions: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/error-handling.md#basic-error-handling-principles
66 changes: 66 additions & 0 deletions safety_prototype/application.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,66 @@
# Only API objects are imported because the user only calls API functions or
# methods.
from opentelemetry.configuration import set_sdk
from opentelemetry.trace import function, Class0, Class1, Class2, Class3

print(function(4, 2))
# This is the function that sets the SDK. After this is set, any call to an API
# function or method will end up calling its corresponding SDK function or
# method.
set_sdk("sdk")

# function returns the result of dividing its first argument by its second
# argument.

# This does not raise an exception, the resulting value of 2.0 is returned.
print(function(4, 2))

# This is a division by zero, it raises an exception, the safety mechanism
# catches it and returns the predefined value of 0.0.
print(function(1, 0))


# The class argument is stored in the SDK instance and method uses it to
# multiply the result of the division of it first argument by the second before
# returning the resulting value.
class0 = Class0(2)

# Class0.method returns the result of dividing its first argument by its second
# argument.

# This does not raise an exception, the resulting value of 4.0 is returned.
print(class0.method_0(4, 2))

# This is a division by zero, it raises an exception, the safety mechanism
# catches it and returns the predefined value of 0.0.
print(class0.method_0(1, 0))

class1 = Class1()

# This call returns an instance of a the safe Class0 class. This is necessary
# because the user must only handle safe objects.
class0 = class1.method_0(7)

print(class0.method_0(1, 2))
print(class0.method_0(1, 0))

class2 = Class2()

# This call returns an instance of a the safe Class0 class. This is necessary
# because the user must only handle safe objects.
with class2.method_0(7) as class0:
print(class0.method_0(2, 2))
print(class0.method_0(2, 0))


# This call returns a Class3 object even when the required arguments are
# missing.
class3 = Class3()

print(class3.c)

# This call returns a Class3 object even when the required arguments will raise
# an exception.
class3 = Class3(1, 0)

print(class3.c)
45 changes: 45 additions & 0 deletions safety_prototype/opentelemetry-api/setup.cfg
Original file line number Diff line number Diff line change
@@ -0,0 +1,45 @@
# Copyright The OpenTelemetry Authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
[metadata]
name = opentelemetry-api
description = OpenTelemetry Python API
author = OpenTelemetry Authors
author_email = [email protected]
url = https://github.com/open-telemetry/opentelemetry-python/tree/main/opentelemetry-api
platforms = any
license = Apache-2.0
classifiers =
Development Status :: 5 - Production/Stable
Intended Audience :: Developers
License :: OSI Approved :: Apache Software License
Programming Language :: Python
Programming Language :: Python :: 3
Programming Language :: Python :: 3.6
Programming Language :: Python :: 3.7
Programming Language :: Python :: 3.8
Programming Language :: Python :: 3.9
Programming Language :: Python :: 3.10
Typing :: Typed

[options]
python_requires = >=3.6
package_dir=
=src
packages=find_namespace:
zip_safe = False
include_package_data = True

[options.packages.find]
where = src
27 changes: 27 additions & 0 deletions safety_prototype/opentelemetry-api/setup.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
# Copyright The OpenTelemetry Authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import os

import setuptools

BASE_DIR = os.path.dirname(__file__)
VERSION_FILENAME = os.path.join(BASE_DIR, "src", "opentelemetry", "version.py")
PACKAGE_INFO = {}
with open(VERSION_FILENAME, encoding="utf-8") as f:
exec(f.read(), PACKAGE_INFO)

setuptools.setup(
version=PACKAGE_INFO["__version__"],
)
49 changes: 49 additions & 0 deletions safety_prototype/opentelemetry-api/src/opentelemetry/_safety.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,49 @@
# Copyright The OpenTelemetry Authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from functools import wraps
from warnings import warn, resetwarnings
from traceback import format_exception
from sys import exc_info


def _safe_function(predefined_return_value):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might there be cases where a predefined value won't work and we need a factory function instead? E.g. best effort propagation with NonRecordingSpan(span_context)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is a very valid point. In a previous attempt to implement this I tried to make it possible for the safety mechanism to return a value that would be the result of a certain SDK function that would receive the arguments passed to the API function so that different values could be returned depending on the actual call the end user made.

Nevertheless, there is a paradox, the code that makes this new value can also fail, and would also need a predefined value to return if that happened.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevertheless, there is a paradox, the code that makes this new value can also fail, and would also need a predefined value to return if that happened.

That code would also be wrapped or we could explicitly/manually safe-guard it so should be doable I guess?

"""
This is the safety mechanism mentioned in the README file.

This is used as a decorator on every function or method in the API.
"""

def internal(function):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A thought I had: If we decide to lose the predefined return value (and instead return None), this could be implemented as a class that each class (or base class for that matter) in the SDK inherits from. Something like:

class safetyClass(object):
    def __getattribute__(self, name):
        returned_attribute = object.__getattribute__(self, name)
        if callable(returned_attribute):
            return _safety(returned_attribute)
        return returned_attribute

This way, it's way less of a breaking change that needs to happen all at once

Copy link
Contributor Author

@ocelotl ocelotl Nov 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, but we can't always return None. 🤷 The safety mechanism must return the same kind of object as the corresponding function or method is expected to return.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you're right, but we should change the calling functions to understand that an exception occurred and the returned value is a predefined one... For example, if in your snippet someone calls the safe function(1,0) the return value would be 0.0, which isn't the true result of 1/0- it is a result that there was an exception, and the calling function needs to check the return value.
And if we are choosing to do so, we might as well create some object that represents that an exception occurred, and use the option I offered with that value

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The calling functions do not need to check the return value, they will receive a fake, predefined value when an exception is raised. This is intentional and the way that the user can know that the value is not real is for them to run the application with the -W error option so that warnings are turned into exceptions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, makes sense

@wraps(function)
def wrapper(*args, **kwargs):

# This is the try / except block mentioned in the README file.
try:
exception = None
return function(*args, **kwargs)
except Exception: # pylint: disable=broad-except
exception = "".join(format_exception(*exc_info()))

if exception is not None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, I think that an expected behaviour of such a feature would be that user made exceptions would be raised, no? OpenTelemetry should be seamless for the user's application, and if it is exception-driven, this will alter the behaviour (unless -W error is used and then OpenTelemetry originated errors will alter the behaviour)

Copy link
Contributor Author

@ocelotl ocelotl Nov 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I think there may be a misunderstanding here. The mechanism proposed here is intended to guard against exceptions raised in the SDK. Any exception raised in the application code directly that is caused by the calling of a function or method outside of the OpenTelemetry API is not protected and will be raised normally.

So, you mention "user made exceptions". If you mean by that exceptions raised by code not called by functions or methods in the OpenTelemetry API, then this mechanism will not cause any issues, these "user made exceptions" will be raised normally.

If with "user made exceptions" you mean exceptions in any part of the SDK that were "intentionally coded", for example something like this:

@_safe_function(X)
def some_function(a, b):
    ...
    if some_condition:
        raise Exception("Some exception")
    ...

then the mechanism will also catch these exceptions. This is intentional. The spec says that any exception of this kind must be handled by the safety mechanism, regardless of how "intentional" it is or not.

Copy link
Member

@oxeye-nikolay oxeye-nikolay Nov 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, Maybe I am missing something, but that's not what I meant. What I meant is something where the OpenTelemetry wrapper is "safe", and the original function which it wraps raises an exception (For example, raise Http404() in Django). In this case, the original function will raise the exception, which will raise to the OpenTelemetry Django wrapper, which will raise to the safety wrapper. Perhaps Django isn't the best example as it uses a middleware object and doesn't directly call wrapt.wrap_function_wrapper, but the SDK does wrap user-defined functions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, sorry but I don't understand what's wrong here. It is intentional to guard against any kind of exception raised in the scope of the safety mechanism, regardless of who raised it or whatever intention was behind raising it. This is a specification requirement, because if we let an exception to be raised it will crash the application which must not happen. So, if your concern is that this mechanism may end up "swallowing" an user made exception, then yes, this mechanism will do that and it is intended to do so.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright. I just wanted to bring this up because it alters the behavior, so it will a known side effect

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is handled by the Django Framework so the developer can decide how the 404 response looks. If we do choose to address this issue because it might alter the behavior of the program, it can probably be done by parsing the trace stack, but I'm not sure it's something that we want to do because this logic might get complicated.
What do you think about it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can revisit this when we go over all instrumentation to ensure we don't raise errors there. We shouldn't do anything special to handle any known exceptions differently.

We just need to ensure that our context managers (start_as_current_span, use_span) don't swallow exceptions raised from within the context. May be this means they shouldn't be decorated and error protection should be coded manually into their bodies or may be they can catch all exceptions but annotate the ones coming from user code so the decorator can re-raise them but that sounds unnecessarily complicated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but annotate the ones coming from user code so the decorator can re-raise them...

Hm, but does this mean that the re-raised exceptions will be able to crash the application?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so, yes. If the application would crash without instrumentation then it should crash with instrumentation as well, right? We should ensure that:

  1. SDK/instrumentation code never raises exceptions.
  2. For instrumented code paths, we record exceptions as events if the user wants to and then re-raise.

This is how use_span works today: https://github.com/open-telemetry/opentelemetry-python/blob/main/opentelemetry-api/src/opentelemetry/trace/__init__.py#L530-L559

As an example, if I have a function that looks like:

def my_func():
    raise Exception()

and I instrument it as:

def my_func():
    with tracer.start_as_current_span("span1"):
        raise Exception()

then the only change that should happen is that now my program should export a span called span1 with an event which has information about the exception I'm raising. It should not swallow the exception. However, if an exception is raised inside start_as_current_span() function or any other function it calls directly, that exception should be swallowed and never reach my code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see.

Ok, I think the current design works as you want it to work, because it will only catch exceptions raised in the execution of start_as_current_span (not in its with-managed context) in the example above.

# This is the warning mentioned in the README file.
warn(f"OpenTelemetry handled an exception:\n\n{exception}")
exception = None
resetwarnings()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

out of curiosity, why do we need this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I understand the behavior or warn, only one warning of a certain kind is "raised" unless this method is called. I think the intention is not to swamp the output with similar warnings, but for development purposes I prefer to have them all being displayed, we can reconsider that later.


return predefined_return_value

return wrapper

return internal
Original file line number Diff line number Diff line change
@@ -0,0 +1,44 @@
# Copyright The OpenTelemetry Authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from pkg_resources import iter_entry_points
from importlib import import_module

_sdk = None


def set_sdk(sdk_name: str) -> None:
"""
This is the SDK setting function mentioned in the README file.

It is intentionally not protected to make it possible for the application
to fail fast if it was not possible to set the SDK. In an actual
implementation of this prototype, the SDK setting mechanism may take
different forms to take into consideration that this SDK setting process
may happen also in auto instrumentation. This means that SDK setting may
also happen by using environment variables or other form of configuration.
"""
global _sdk

if _sdk is None:
_sdk = next(
iter_entry_points("opentelemetry_sdk", name=sdk_name)
).load().__package__


def _get_sdk_module(path: str) -> object:
if _sdk is None:
raise Exception("SDK has not been set")

return import_module(".".join([_sdk, path]))
Original file line number Diff line number Diff line change
@@ -0,0 +1,76 @@
# Copyright The OpenTelemetry Authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""
This an implementation of the API where every function or method is safe.

Any call to a function or method defined here passes its arguments to a
corresponding underlying SDK function or method. In this way, objects defined
in this module act as proxies and (with the SDK setting function) are the only
objects the user has contact with.
"""
from contextlib import contextmanager

from opentelemetry._safety import _safe_function
from opentelemetry.trace.api import Class0, Class1, Class2
from opentelemetry.configuration import _get_sdk_module


@_safe_function(0.0)
def function(a: int, b: int) -> float:
return _get_sdk_module("trace").function(a, b)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am weary of this dynamic behavior–it looks very difficult to test or have static checks protecting the behavior. That combined with the warning mechanism, I am worried it would be easy to have a silent breakage.

As you said in the README, for this to work every SDK must keep the exact same import paths and fully-qualified symbols. This is a big design requirement and easy to accidentally mess up for SDK implementors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am weary of this dynamic behavior–it looks very difficult to test or have static checks protecting the behavior.

Sorry what can be difficult to test?
What do you mean with static checks?

That combined with the warning mechanism, I am worried it would be easy to have a silent breakage.

I think that is a valid objection but to the spec itself. It is possible to have a silent breakage by not immediately raising an exception that crashes the application, but that is what the spec requires if I understand it correctly.

As you said in the README, for this to work every SDK must keep the exact same import paths and fully-qualified symbols. This is a big design requirement and easy to accidentally mess up for SDK implementors.

Well, what we need is a mechanism that can tell the API where is the corresponding SDK object located. We are using the same qualified path of the API object to find its matching SDK object but that can be changed with a mechanism that uses an arbitrary SDK-defined mapping to make it possible for the API to find the SDK objects.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry what can be difficult to test?

That the SDK is correctly reproducing the exact names and import paths as the API.

What do you mean with static checks?

pylint/mypy/IDEs can check things like that your import paths are correct or that you implement an ABC correctly. In this case, it's impossible to tell these tools "this package must mirror all of the interfaces of OpenTelemetry API".

Well, what we need is a mechanism that can tell the API where is the corresponding SDK object located. We are using the same qualified path of the API object to find its matching SDK object but that can be changed with a mechanism that uses an arbitrary SDK-defined mapping to make it possible for the API to find the SDK objects.

I think a better way to do this is with a class or protocol. You could also expose the SDKs functionality as an interface which SDKs would implement and the API knows how to use:

# could also be typing.Protocol
class SDK(ABC):
  @abstractmethod
  def class0_factory() -> Type[Class0]:
    pass

  def function(a, b):
    pass

def Class0() -> Class0:
  _sdk.class0_factory()

def function(a, b):
  _sdk.function()

Copy link
Contributor Author

@ocelotl ocelotl Nov 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pylint/mypy/IDEs can check things like that your import paths are correct or that you implement an ABC correctly. In this case, it's impossible to tell these tools "this package must mirror all of the interfaces of OpenTelemetry API".

Ok, but I never expected or intended for the static checkers to be the ones responsible of checking that the SDKs are compliant with the API.

This is a very OpenTelemetry-specific requirement, I don't think static checkers or the typing module to be designed with this in mind, so this approach looks like forcing our design to work with a tool that was never intended to check this kind of things.

The approach here is for the API to have functions and abstract classes that the SDK has to implement. We can provide a list of the functions and classes so that the SDK implementations know what they have to implement and they can check that their SDKs implement them fully.

To summarize, I do agree with you @aabmass that it is very important for SDKs to be able to have a test case that pretty much says "the API is being fully implemented" or not. I just think that this kind of testing is very OpenTelemetry-specific and it would be necessary to force the design of this mechanism and the design of static type checkers to make this checking happen in a static type checker. It is better to provide clear documentation for SDK implementations and maybe additional functionality in the API (like a list of all the stuff that any SDK has to implement) so that every SDK can easily implement their own test case for API compliance.



class Class0(Class0):

@_safe_function(0.0)
def method_0(self, a: int, b: int) -> float:
return self._sdk_instance.method_0(a, b)


class Class1(Class1):

@_safe_function(Class0(0))
def method_0(self, a: int) -> Class0:
safe_instance = Class0(0)
safe_instance._sdk_instance = self._sdk_instance.method_0(a)
return safe_instance


class Class2(Class2):

@contextmanager
@_safe_function(Class0(0))
def method_0(self, a: int) -> Class0:
safe_instance = Class0(0)
with self._sdk_instance.method_0(a) as sdk_instance:
safe_instance._sdk_instance = sdk_instance
yield safe_instance


# This is a class that is implemented completely in the API and not in any SDK.
class Class3:

@_safe_function(None)
def __init__(self, a: int, b: int):
self._c = a / b

@property
@_safe_function(0)
def c(self):
return self._c

@_safe_function(0)
def method_0(self, a: int) -> float:
return a / self._c
Loading