-
Notifications
You must be signed in to change notification settings - Fork 63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Error and warning improvements #122
Error and warning improvements #122
Conversation
deepgram/__init__.py
Outdated
if 'api_key' not in options: | ||
raise DeepgramSetupError("API key is required") | ||
|
||
if "api_url" in options and options.get("api_url", None) is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Consistently use single or double quotes, not both
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I completely agree. The existing code has a mix of single quotes and double quotes. Python's standard is double quotes, so that's what I use. I did not want to add a ton of formatting changes to this PR, so I kept them minimal. I can change this one in particular because you raised it, but I'd like to go through and reformat the entire codebase in a later PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, I hadn't noticed that before! Yeah agreed that it's better served by doing a cleanup sweep in a separate PR.
raise ValueError("DG: API key is required") | ||
self.options = t_options | ||
if not isinstance(options, (str, dict)): | ||
raise DeepgramSetupError("`options` must be a dictionary or an API key string") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about using the name DeepgramAuthError
rather than DeepgramSetupError
? Is the setup only for API key / authentication?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The line you highlighted is not an Auth error and instead implies that the user passed the incorrect type to the constructor. We could also throw a TypeError
here, but DeepgramSetupError
keeps the error codes consistent. Also DeepgramSetupError
inherits from ValueError
which is applicable when the user passes the wrong value/type to the constructor.
suppress_warnings: bool | ||
raise_warnings_as_errors: bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should these have default values?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TypedDict
s do not have default values, and none of the other classes in this file have default values. Technically these types should have Optional
around them, but very few of the other optional fields in this file use Optional
so I left it off here as well. This is something we can clean up in the future.
We could also convert the TypedDicts to dataclasses (I prefer dataclasses to TypedDicts). Alternatively, we could use TypedDict's Required
and NotRequired
types defined in this pep: https://peps.python.org/pep-0655/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a strong preference - do you think TypedDicts are marginally more performant, but dataclasses are more flexible? We do have an awful lot of TypedDicts already, so it makes sense to defer to that standard for now.
My interest in an explicit default value is mainly to make clear what the behavior will be when the user doesn't set anything. But I guess in this case, it's logical for warnings to be warnings by default... rather than being suppressed or errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, TypedDicts are more performant than dataclasses. I forgot the SDKs are written with high performance as a top priority.
I completely agree with you, it would be really nice to have default values.
if options.get("raise_warnings_as_errors") and "metadata" in body and "warnings" in body["metadata"]: | ||
raise DeepgramApiError({"warnings": body["metadata"]["warnings"]}, http_library_error=None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another alternative for when nested dictionary keys might not be present is chaining get
s:
if options.get("raise_warnings_as_errors") and body.get("metadata", {}).get("warnings", {}):
The streaming test suite uses that style and I've been a fan of it.
This applies in a few cases throughout the PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly I personally prefer the in
operator rather than nested .get
s. For complex "contains" checks, I have to really look a the .get
s and the parentheses to understand how the code branches and which entries are default values vs. conditionals. For example, metadata
and warnings
are conditionals, but both {}
s are default values. If I do not read a comma or parenthesis correctly, I may misunderstand which is which. This is particularly true when strings are used as default values.
On the other hand, I can typically read if statements with with in
operator left-to-right (as if it were a sentence) and understand the logic. (Which I hope is true for the if statement I wrote.)
Python does not have a style recommendation for this, but if Deepgram wants to define it then that works for me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The downside I see with the in
syntax is that it gets verbose and repetitive for nested dictionaries. But you're only going 2 levels deep so it's not too bad. I'm fine with it as-is and I don't know of a specific style guide to be followed here. (Seems like the absence of a DG Python style guide is becoming a theme in this PR discussion!)
urllib.error.HTTPError, | ||
urllib.error.URLError, | ||
websockets.exceptions.InvalidHandshake, | ||
aiohttp.ClientResponseError, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean we are presenting all of these errors as a type of Deepgram error? Would we maybe want other non-DG errors to bubble up rather than wrapping them as our own?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the previous iteration of the code, the Python SDK was catching all non-DG errors and returning them as Exception
, sometimes without using the from
operator (!!!). That means the non-DG errors - and the information available in them - were inaccessible.
I agree that we could allow these errors to bubble up on their own rather than catch them and wrap them in a Deepgram error. It would allow users to be more explicit about how to handle errors, but it would also make error handling more difficult for the user because they would need to handle multiple exception types.
The two approaches are targeted at different audiences: The implementation where errors are wrapped up into a single DG exception makes it easy to do simply error handling and therefore makes it easy for newer programmers to handle failed API requests. On the other hand, the implementation where errors bubble up requires the user to be explicit about how they handle the different use cases, which is likely more useful for power users.
The code I wrote grabs the underlying non-DG error and adds it as a field to the DG error. I hope this satisfies both user groups because: the programmers with less experience can simply catch one error, and the power users can grab the underlying HTTP error and perform more advanced error handling.
Aside: I said, "the implementation where errors bubble up requires the user to be explicit about how they handle the different use cases". That isn't completely true because users could always write except Exception:
to catch all errors, but I try to highly discourage the use of except Exception
. It is almost always too broad. For example, if a user passes an incorrect type that results in a TypeError
, they wouldn't want their retry logic to trigger and keep making API calls with the same TypeError
.
After that long explanation, I hope this code serves both new users and power users by making it easy to handle failed API requests in both simple and complex ways.
deepgram/errors.py
Outdated
if isinstance(args[0], dict) and "err_msg" in args[0]: | ||
self.error = args[0]["err_msg"] | ||
if "metadata" in args[0] and "warnings" in args[0]["metadata"]: | ||
self.warnings = args[0]["metadata"]["warnings"] | ||
elif "warnings" in args[0]: # Occurs when `raise_warnings_as_errors` is enabled | ||
self.warnings = args[0]["warnings"] | ||
if "metadata" in args[0] and "request_id" in args[0]["metadata"]: # Occurs when Deepgram returns a success response (for warnings) | ||
self.request_id = uuid.UUID(args[0]["request_id"]) | ||
elif "request_id" in args[0]: # Occurs when Deepgram returns a failed response | ||
self.request_id = uuid.UUID(args[0]["request_id"]) | ||
elif isinstance(args[0], str): | ||
self.error = args[0] | ||
else: | ||
self.error = str(args[0]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're referring to args[0]
a lot here, it would be helpful for readability to give that a named variable so it's clearer what that arg is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, fixed. I created a new variable: error_or_warning_data = args[0]
Approved! But best to wait until signoff from Luke/DX before merging 🙏 |
Oh one more question - can you add some tests in |
I added two tests that ensure the appropriate errors are raised. |
Summary:
This PR creates custom
Exception
objects when errors are thrown from the Deepgram SDK and provides support for the newwarnings
object in API responses.Details:
Prior to this PR, Deepgram errors were distinguished by prefixing the error message with a
"DG:"
string. This PR removes the string prefix from all error messages and the exceptions were replaced with customException
objects. This allows users to usetry/except
blocks to handle errors from the Deepgram SDK, which is standard practice for Python libraries.Two new exceptions will be thrown:
DeepgramSetupError
will be thrown if an error occurs while instantiating the Deepgram SDK (such as a missing API key), andDeepgramApiError
will be thrown when errors are encountered during API requests. Both errors inherit from aDeepgramError
base class exception, which allows users to catch all Deepgram exceptions in oneexcept
block.The
DeepgramApiError
has a number of fields the user may find useful when handling this exception, such as an HTTP status code if one exists.The new support for warnings defaults to printing a warning to
stdout
using Python's built-inwarnings
module. This is standard practice for handling warnings.Two options were added to allow users to customize how warnings are handled. The
Options
class that is passed to the Deepgram SDK duration initialization has two new fields:suppress_warnings
which prevents the warnings from being printed to the console, andraise_warnings_as_errors
which raises warnings as errors if a warning is found in the API response.Testing:
The
pytest
tests run successfully on Python 3.11 and Python 3.9. I did not test on Python 3.10 but there are no changes that should impact it.The new support for warnings can be tested by hitting the Deepgram API with
summarization=true&detect_language=true
and using a non-English language. Here is the German audio file I used for testing.suppress_warnings
andraise_warnings_with_errors
with this API call and they work correctly.To test the new error handling, I used
tier=nova&language=de
(with any audio file/URL). It works well. There are errors I wasn't able to reproduce, such as 503 errors from Deepgram.I also tested timeout errors, which raise the appropriate exception (no change).