✨ Add error handling docs
kammce committed Mar 29, 2024
1 parent 497e4a9 commit 5537b56
Showing 1 changed file with 193 additions and 1 deletion.
194 changes: 193 additions & 1 deletion mkdocs/user_guide/error_handling.md
@@ -1,3 +1,195 @@
# 🪤 Error Handling in libhal

TBD
libhal utilizes C++ exception handling for transmitting errors. C++ exceptions
were chosen over other error handling mechanisms because they:

1. Improve performance by separating error handling code from normal code,
    which reduces the cost of calling functions that could fail.
2. Make error handling easier by allowing the user to wrap multiple blocks of
    code in a single handler selected by error type or category.
3. Reduce the binary size of libraries and applications by:
    1. Using a single algorithm to allocate, construct, and transport errors
        and direct the CPU to the appropriate error handling code.
    2. Eliminating the need for functions to contain error return paths when
        participating in error propagation.
    3. Providing an error path using unwind instructions, a compressed form of
        machine instructions that simulate the epilog of a function, but without
        the requirement to return objects on the stack.
4. Need far fewer handlers: although handler code can be larger than plain
    code (`if`/`else`/`switch`), the number of error handling blocks (`catch`
    blocks) is typically much smaller than the number of checks required by a
    distributed error handling approach (`result<T, E>`, returning error codes,
    `optional/nil/null`).
5. Leave room for significant improvement beyond their current performance.

With that out of the way, let's delve into how libhal manages errors.

## How to use exceptions in C++

Let's start with signaling an error. This can be done by writing the following
bit of code:

```C++
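// Simplified: assumes this is a member function of a driver class, so `this`
// identifies the object reporting the error.
void check_if_device_is_valid(/* ... */) {
  constexpr hal::byte expected_id = 0xAD;

  // Get ID info from device and store it in `retrieved_id` ...

  if (expected_id != retrieved_id) {
    throw hal::no_such_device(this, expected_id);
  }
}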
```

To catch the thrown error, do the following:

```C++
void bar() {
  try {
    check_if_device_is_valid(/* ... */);
  } catch (const hal::no_such_device& p_error) {
    // do something using the error info.
  }
}
```

Note that this is a simplified example.

The `throw` keyword works as it does in other languages that let you throw or
raise an error object: it exits the function's scope without returning
normally. The exception mechanism then unwinds the stack, restoring the CPU's
state to that of the enclosing try scope, and moves the program counter to the
catch block that matches the thrown type. In this case, since we threw
`hal::no_such_device`, the catch block for that type is selected. If no scope
in the call chain has a catch block matching the thrown type, then
`std::terminate()` is called.

Once the catch block is entered, everything created within the scope of the
try block, including objects in any function called from it, is no longer
valid memory. The significance of this is that the exception unwinding
mechanism can and must skip spending cycles on constructing and bubbling
objects from a lower stack frame to a higher one. Since the thrown object is
the only thing that escapes the scope, any information needed for error
handling should be copied into the thrown object as it is being thrown.
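
A minimal sketch of that last point, using only the standard library. The names `sensor_error`, `read_sensor`, and `mismatched_id` are hypothetical and are not part of libhal; they only illustrate copying the needed values into the thrown object before the local variables are destroyed.

```C++
#include <cstdint>
#include <exception>

// Hypothetical error type: it stores copies of the values a handler needs.
struct sensor_error : std::exception
{
  sensor_error(std::uint8_t p_expected, std::uint8_t p_actual)
    : expected(p_expected)
    , actual(p_actual)
  {
  }
  const char* what() const noexcept override { return "sensor_error"; }
  std::uint8_t expected;
  std::uint8_t actual;
};

// Hypothetical function: `retrieved_id` is a local that is destroyed during
// unwinding, so its value is copied into the exception as it is thrown.
void read_sensor(std::uint8_t p_bus_value)
{
  constexpr std::uint8_t expected_id = 0xAD;
  const std::uint8_t retrieved_id = p_bus_value;
  if (retrieved_id != expected_id) {
    throw sensor_error(expected_id, retrieved_id);
  }
}

// Returns the ID carried by the exception, or -1 if nothing was thrown.
int mismatched_id(std::uint8_t p_bus_value)
{
  try {
    read_sensor(p_bus_value);
  } catch (const sensor_error& p_error) {
    // Only the copies stored inside `p_error` are still alive here.
    return p_error.actual;
  }
  return -1;
}
```

The handler never touches `retrieved_id` itself; it reads the copy that travelled with the exception.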

## `hal::exception` hierarchy

libhal has a hierarchy of errors, which looks like the following:

```plaintext
hal::exception
├── hal::no_such_device
│   └── hal::stm32f1::i2c_core_dump_io_error
├── hal::io_error
│   └── hal::lpc40::i2c_core_dump_io_error
├── hal::timed_out
├── ...
└── hal::unknown
```

`hal::exception` is the base exception for all libhal exceptions and is
typically not thrown directly. Its descendants are thrown instead, most having
a 1-to-1 correspondence with the enumerated constants in `std::errc`.
`std::errc` follows the POSIX error codes, providing a reasonable approximation
of the types of errors hardware might encounter. An exception to this rule is
`hal::unknown`, which represents an unknown error, used when the exact error is
undetermined. Such cases should be rare in code.

To see the full list of exception types available, refer to
the [error API docs](https://libhal.github.io/3.0/api/libhal/error.html#error).
It is important to consult this documentation to understand which exceptions
should be thrown and under what circumstances they can be recovered from.

## Expectation from libhal libraries

libhal libraries and utilities are required to throw only the direct
descendants of `hal::exception` or a more derived exception with additional
information.

Exceptions outside of the `hal::exception` hierarchy may still be thrown from
a libhal library if they come from a call to a user-defined callback. The user
is allowed to throw any type they wish, although care should be taken in
choosing the types to be thrown. This is useful for application code that wants
to bypass catch blocks provided by libhal libraries.
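
The bypass behavior can be sketched with plain C++. The types and functions below (`sketch::exception`, `shutdown_requested`, `library_retry`, `shutdown_reached_application`) are hypothetical and only model a library that catches its own hierarchy around a user callback.

```C++
namespace sketch {
struct exception {};             // hypothetical stand-in for hal::exception
struct io_error : exception {};  // hypothetical library error
}  // namespace sketch

// Hypothetical application type, deliberately outside the library hierarchy.
struct shutdown_requested {};

// Hypothetical library routine: retries the callback when a library error is
// thrown. Returns true if the callback completed within `p_attempts` tries.
bool library_retry(void (*p_callback)(), int p_attempts)
{
  for (int i = 0; i < p_attempts; i++) {
    try {
      p_callback();
      return true;
    } catch (const sketch::exception&) {
      // Library errors are swallowed here and the callback is retried.
    }
  }
  return false;
}

// Returns true only if the application's own exception type passed through
// the library's catch block untouched.
bool shutdown_reached_application(void (*p_callback)())
{
  try {
    library_retry(p_callback, 3);
  } catch (const shutdown_requested&) {
    return true;
  }
  return false;
}
```

Because `shutdown_requested` does not derive from the library's base type, the library's handler never matches it and the application receives it directly.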

## How Do You Know What Throws What?

C++ does not currently have a mechanism to inform the user at compile time if
an uncaught exception will terminate your application. Therefore, to know what
may be thrown from a function, you'll need to consult the API documentation
for the function. All libhal interfaces have strict requirements for their
implementations to throw very specific `hal::exception` derived types.

## Knowing when to catch an error

First and foremost, accept that your application may encounter an exception
that will terminate it. Plan with this possibility in mind. Use
`hal::set_exception` to set the terminate handler function as needed for your
application, for example to save state information and reset the device.
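
For reference, the standard library offers the same idea through `std::set_terminate`. The sketch below uses only that standard facility, not the libhal function named above; `save_state`, `on_terminate`, and `install_terminate_handler` are hypothetical names, and `std::abort()` stands in for a device reset.

```C++
#include <cstdlib>
#include <exception>

namespace {
bool state_saved = false;

// Hypothetical hook: a real application might write to non-volatile memory.
void save_state()
{
  state_saved = true;
}
}  // namespace

// A terminate handler must not return to its caller, so it ends by aborting
// (a stand-in for resetting the device).
[[noreturn]] void on_terminate()
{
  save_state();
  std::abort();
}

// Registers the handler; call this once, early in startup.
void install_terminate_handler()
{
  std::set_terminate(on_terminate);
}
```

Installing the handler before any code that may throw means an uncaught exception still leaves the device in a known state.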

With this in mind, ONLY catch the errors you know how to handle. If you do not
know how to handle an error, allow it to propagate to higher levels in the
call chain. This gives higher-level code the opportunity to handle errors.

Do not encase each function in a try/catch block, as this is detrimental to
code size and degrades the performance of the unwind mechanism by providing it
more scopes to search through.
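
A small sketch of this guidance, with hypothetical types and functions that are not part of libhal: the middle layer has no try/catch at all, and the top layer catches only the one error it can recover from.

```C++
struct timed_out {};  // hypothetical error types for illustration
struct io_error {};

// Hypothetical low-level read: `p_fault` selects which error to simulate.
void read_register(int p_fault)
{
  if (p_fault == 1) {
    throw timed_out{};
  }
  if (p_fault == 2) {
    throw io_error{};
  }
}

// Cannot recover from any error, so it contains no try/catch and simply lets
// exceptions pass through to its caller.
int read_temperature(int p_fault)
{
  read_register(p_fault);
  return 25;
}

// Knows how to handle a timeout (report a fallback value) and nothing else,
// so `io_error` keeps propagating upward.
int temperature_or_fallback(int p_fault)
{
  try {
    return read_temperature(p_fault);
  } catch (const timed_out&) {
    return -1;
  }
}
```

Only one catch block exists for three functions, and any error it does not name remains available to code higher in the call chain.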

## When to catch `hal::exception`

`hal::exception` should only be caught when code wants to swallow all possible
exceptions from libhal OR when translating exceptions from C++ to a C API that
needs an error code that roughly follows `std::errc`.

```C++
int c_callback() {
  try {
    foo();
    bar();
    baz();
  } catch (const hal::exception& p_error) {
    return static_cast<int>(p_error.error_code());
  }
  // No exception was thrown; 0 is assumed to mean success for this C API.
  return 0;
}
```

## Using `hal::exception::instance()`

```C++
try {
  read_timeout();
  bandwidth_timeout();
} catch (const hal::timed_out& p_exception) {
  if (&read_timeout == p_exception.instance()) {
    hal::print(console, "X");
    read_complete = true;
  }
  // TODO: Replace this exceptional bandwidth timeout with a variant that
  // simply returns if the timeout has occurred. This is not its intended
  // purpose but does demonstrate proper usage of `p_exception.instance()`.
  else if (&bandwidth_timeout == p_exception.instance()) {
    hal::print(console, "\n + |");
    bandwidth_timeout = hal::create_timeout(counter, graph_cutoff);
  } else {
    throw;
  }
}
```

In this case, `read_timeout` and `bandwidth_timeout` are callable objects that
live in a scope above the try block, allowing them to be modified and updated
in the error handling block. Because both of these objects can throw an
exception, we may want to know which one threw it. We can use the `instance()`
function to get the address of the object that threw the exception. If the
instance does not match anything in scope, then it may have come from an object
that was lower in the stack and is no longer valid.

Note the comment on `bandwidth_timeout`. `bandwidth_timeout` is a part of the
normal control flow and should not be reporting errors to move that control
flow along. `read_timeout`, on the other hand, does report an actual error in
this context. This example is taken from
`libhal-esp8266/demos/applications/at_benchmark.cpp`.

!!! caution

    DO NOT USE `const_cast` or `reinterpret_cast` to FORCE an address from
    `instance()` into a pointer to some other type and then attempt to use it.
    This is undefined behavior. ONLY use the address returned from `instance()`
    as a means to compare it to other objects.
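
The comparison-only pattern can be tried without hardware using the sketch below. The `sketch::timed_out` and `sketch::countdown` types are hypothetical stand-ins written for this example; they assume an exception that stores the address of the throwing object and are not the libhal implementation.

```C++
namespace sketch {
// Hypothetical exception that records which object threw it.
class timed_out
{
public:
  explicit timed_out(const void* p_instance)
    : m_instance(p_instance)
  {
  }
  const void* instance() const { return m_instance; }

private:
  const void* m_instance;
};

// Hypothetical callable that throws once its tick budget is used up.
class countdown
{
public:
  explicit countdown(int p_ticks)
    : m_ticks(p_ticks)
  {
  }
  void operator()()
  {
    if (m_ticks-- <= 0) {
      throw timed_out(this);
    }
  }

private:
  int m_ticks;
};
}  // namespace sketch

// Returns 1 if `p_first` expired, 2 if `p_second` expired, 0 if neither did.
int which_expired(sketch::countdown& p_first, sketch::countdown& p_second)
{
  try {
    p_first();
    p_second();
  } catch (const sketch::timed_out& p_exception) {
    // The address is only compared, never cast back or dereferenced.
    if (p_exception.instance() == &p_first) {
      return 1;
    }
    if (p_exception.instance() == &p_second) {
      return 2;
    }
    throw;  // Unknown source: let a higher level deal with it.
  }
  return 0;
}
```

Both objects outlive the try block, so comparing their addresses against `instance()` is safe even though the thrower's stack frame is gone.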

## Why you shouldn't throw an `int` or other primitives

Application callbacks are allowed to throw whatever type they wish, although
care should be taken to choose a good type to throw.

Throwing `int` is generally a bad choice because it gives little to no
information about what kind of error occurred. If an `int` is thrown, it
probably encodes an error code, meaning many sections of code would need to
catch it, check whether it is their error code, and rethrow it if it is not.
This results in a large number of catch blocks.
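
The difference shows up even in a tiny sketch. All names below (`sensor_failure`, `radio_failure`, `radio_error`, and the functions) are hypothetical and exist only to contrast the two styles.

```C++
// Error codes thrown as plain ints (hypothetical values).
constexpr int sensor_failure = 1;
constexpr int radio_failure = 2;

// Hypothetical typed alternative for the radio error.
struct radio_error {};

void transmit_with_int(bool p_fail)
{
  if (p_fail) {
    throw radio_failure;
  }
}

void transmit_with_type(bool p_fail)
{
  if (p_fail) {
    throw radio_error{};
  }
}

// With `int`, a layer that handles only sensor failures still catches every
// code, then has to filter and rethrow the ones that are not its own.
bool sensor_layer_int(bool p_fail)
{
  try {
    transmit_with_int(p_fail);
  } catch (int p_code) {
    if (p_code != sensor_failure) {
      throw;  // Not ours: pass it along.
    }
    return false;
  }
  return true;
}

// With a distinct type, the same layer needs no catch block at all because
// `radio_error` never matches a handler it does not care about.
bool sensor_layer_typed(bool p_fail)
{
  transmit_with_type(p_fail);
  return true;
}
```

Every additional layer written in the `int` style repeats the catch, check, and rethrow steps, while the typed style adds nothing to layers that cannot handle the error.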
