ProtocolEncoder is a C++ typestate library, making it easy to enforce a protocol on the usage of an object.
Typestate analysis is a way to specify at compile-time what operations are valid on an object in the current state, and enforce these restrictions.
In other words, it lets you define a protocol for how an object may be used, and makes sure that the program cannot breach this protocol.
HTTP connections have headers and a body (among other things). We want to create a builder for one such connection, making sure that:
- There is at least one header.
- There is a body.
- The headers are specified before the body.
Only then will the user be able to create a connection.
(Bear in mind, this is a simplistic example, which could be better solved with a constructor taking the parameters.)
Let's create a class to build this connection:
class HTTPConnectionBuilder {
public:
void add_header(std::string header) { ... }
void add_body(std::string body) { ... }
HTTPConnection build() { ... }
private:
...
};
Now, we need to enforce the protocol. We can represent it with the following graph: we start in the "START" state, then we have to take the add_header transition at least once (and as many more times as we want), then the add_body transition, and then we can build the connection.
After using ProtEnc, we get a wrapper object that we can use like this:
HTTPConnection connection = GetConnectionBuilder()
.add_header("First header")
.add_header("Second header")
.add_body("Body")
.build();
Any deviation from the protocol will result in a compilation error.
To define the protocol, we need to build the graph shown above. Let's start by declaring the possible states:
enum class HTTPBuilderState {
START,
HEADERS,
BODY
};
Now we can declare from which states it is valid to start; in our case, only START:
using MyInitialStates = InitialStates<HTTPBuilderState::START>;
Similarly, we can declare that we can finish only from the BODY state, by calling the build function:
using MyFinalTransitions = FinalTransitions<
FinalTransition<HTTPBuilderState::BODY, &HTTPConnectionBuilder::build>
>;
Let's declare the transitions: we can go from START to HEADERS by calling add_header. It looks like this:
Transition<HTTPBuilderState::START, HTTPBuilderState::HEADERS,
&HTTPConnectionBuilder::add_header>
We can add the other transitions in the same way, giving us:
using MyTransitions = Transitions<
// START -> HEADERS with add_header
Transition<HTTPBuilderState::START, HTTPBuilderState::HEADERS,
&HTTPConnectionBuilder::add_header>,
// HEADERS -> HEADERS with add_header
Transition<HTTPBuilderState::HEADERS, HTTPBuilderState::HEADERS,
&HTTPConnectionBuilder::add_header>,
// HEADERS -> BODY with add_body
Transition<HTTPBuilderState::HEADERS, HTTPBuilderState::BODY,
&HTTPConnectionBuilder::add_body>
>;
Then we just have to declare our wrapper class with all this info:
PROTENC_START_WRAPPER(
// Wrapper name
HTTPConnectionBuilderWrapper,
// Implementation class
HTTPConnectionBuilder,
// Name of the enum
HTTPBuilderState,
// Initial states
MyInitialStates,
// Transitions
MyTransitions,
// Final transitions
MyFinalTransitions,
// List of valid query methods (const methods on the implementation)
MyValidQueries);
// Declare the list of functions that we are wrapping:
// Transitions
PROTENC_DECLARE_TRANSITION(add_header);
PROTENC_DECLARE_TRANSITION(add_body);
// End functions.
PROTENC_DECLARE_FINAL_TRANSITION(build);
PROTENC_END_WRAPPER;
Now we can create an instance of our wrapper with any of the starting states:
HTTPConnection connection =
HTTPConnectionBuilderWrapper<HTTPBuilderState::START>{}
.add_header("Never")
.add_header("Gonna")
.add_body("Give you up")
.build();
The full code of this example is in example/http_connection.cc.
Any protocol that can be represented by an FSM (finite-state machine) can be encoded using this library. This provides a compile-time guarantee that the code respects the protocol.
This is a step towards formal verification of code, while being more lightweight and easy to integrate into existing codebases.
The definition of the wrapper could be outsourced to a code generator that checks some properties of the FSM, or simply takes an existing protocol specification and turns the FSM into a wrapper.
Oh, you want to get into the details? TL;DR: lots of template magic :)
The wrapper object is templated by the enum representing the states. The value of the template parameter is the state in which the wrapper is. Every transition method (including the final transition methods) consumes the object (taking it by r-value reference) and returns a new wrapper in the new state. The underlying object is never copied (provided it has a good move constructor).
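The mechanism described above can be sketched by hand, without the library. This is only an illustration under assumptions: `Builder`, `Wrapper`, `State` and `demo_header_count` are hypothetical names, the states are hard-coded instead of being looked up in a transition list, and the real ProtEnc code is organised differently. The wrapped class is deliberately move-only, so the sketch would not compile if the wrapper copied it anywhere.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

enum class State { START, HEADERS, BODY };

// Hypothetical wrapped class. Copying is disabled, so any accidental copy
// made by the wrapper below would be a compile error.
struct Builder {
  Builder() = default;
  Builder(Builder&&) = default;
  Builder(const Builder&) = delete;

  std::vector<std::string> headers;
  std::string body;
};

// The template parameter S is the state the wrapper is currently in.
template <State S>
class Wrapper {
 public:
  Wrapper() = default;
  explicit Wrapper(Builder&& impl) : impl_(std::move(impl)) {}

  // The trailing && means the method can only be called on an rvalue
  // wrapper: the old-state wrapper is consumed by the call.
  Wrapper<State::HEADERS> add_header(std::string header) && {
    static_assert(S == State::START || S == State::HEADERS,
                  "add_header is not allowed in this state");
    impl_.headers.push_back(std::move(header));
    return Wrapper<State::HEADERS>{std::move(impl_)};
  }

  Wrapper<State::BODY> add_body(std::string body) && {
    static_assert(S == State::HEADERS, "add_body needs at least one header");
    impl_.body = std::move(body);
    return Wrapper<State::BODY>{std::move(impl_)};
  }

  // Final transition: hands the wrapped object back to the caller.
  Builder build() && {
    static_assert(S == State::BODY, "build needs a body");
    return std::move(impl_);
  }

 private:
  Builder impl_;  // held by value, moved from wrapper to wrapper
};

// Walks the whole protocol and returns the number of headers collected.
std::size_t demo_header_count() {
  Builder result = Wrapper<State::START>{}
                       .add_header("A")
                       .add_header("B")
                       .add_body("body")
                       .build();
  assert(result.body == "body");
  return result.headers.size();
}
```

Because each method is `&&`-qualified, calling `add_header` on a named (lvalue) wrapper is rejected unless the caller writes `std::move(w)`, and calling `add_body` directly on a `Wrapper<State::START>` trips the `static_assert`.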
The wrapper mainly just forwards the calls (declared by PROTENC_DECLARE_TRANSITION and friends) to the GenericWrapper.
GenericWrapper does the heavy lifting:
- It contains an instance of the wrapped class (by value, there is no indirection).
- It is templated by all the parameters describing the FSM: initial states, transitions and so on.
- It has a call_transition method (and friends) that takes a function pointer, checks that it's a valid transition, and returns a new wrapper in the new state.
- It has a bunch of checks to make sure that we have "pretty" error messages for common errors (not using the correct types to define the FSM, not consuming the object for a final transition, etc.).
call_transition is called with a pointer to the function you want to call (and arguments). It has a static_assert that will go over the list of transitions, looking for one that starts from the current state and has the function pointer as label. If the transition is not found, it's a compile error. If the transition is found, we can extract from it the target state. We then call the function on the wrapped object, and return a new wrapper (constructed by moving the current wrapper) in the new state.
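The lookup described above can be sketched as follows. This is a simplified stand-in written for illustration, not the library's implementation: `Label`, `Lookup`, `find_transition`, `Edges` and the toy `Builder` are hypothetical, and the `Transition`/`Transitions` templates here only mimic the shape of the ones shown earlier.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

enum class State { START, HEADERS, BODY };

// Toy wrapped class (hypothetical): it just counts the calls it receives.
struct Builder {
  int calls = 0;
  void add_header(int) { ++calls; }
  void add_body(int) { ++calls; }
};

// Turns a value (here a member-function pointer) into a distinct type,
// so that labels can be compared with std::is_same_v.
template <auto V> struct Label {};

// One edge of the graph: origin, target, and the function used as label.
template <State From, State To, auto Fn>
struct Transition {
  static constexpr State from = From;
  static constexpr State to = To;
  using label = Label<Fn>;
};
template <typename... Ts> struct Transitions {};

struct Lookup {
  bool found;
  State target;
};

// Goes over the list looking for an edge that starts at Cur and is
// labelled with Fn. The first match wins.
template <State Cur, auto Fn, typename... Ts>
constexpr Lookup find_transition(Transitions<Ts...>) {
  Lookup result{false, Cur};
  ((result = (!result.found && Ts::from == Cur &&
              std::is_same_v<typename Ts::label, Label<Fn>>)
                 ? Lookup{true, Ts::to}
                 : result),
   ...);
  return result;
}

using Edges = Transitions<
    Transition<State::START, State::HEADERS, &Builder::add_header>,
    Transition<State::HEADERS, State::HEADERS, &Builder::add_header>,
    Transition<State::HEADERS, State::BODY, &Builder::add_body>>;

template <State S>
struct Wrapper {
  Builder impl;

  // Checks the edge at compile time, calls Fn on the wrapped object, then
  // moves the object into a wrapper typed with the target state.
  template <auto Fn, typename... Args>
  auto call_transition(Args&&... args) && {
    constexpr Lookup hit = find_transition<S, Fn>(Edges{});
    static_assert(hit.found, "Transition not found");
    (impl.*Fn)(std::forward<Args>(args)...);
    return Wrapper<hit.target>{std::move(impl)};
  }
};

static_assert(find_transition<State::START, &Builder::add_header>(Edges{}).found);
static_assert(!find_transition<State::START, &Builder::add_body>(Edges{}).found);

// Takes three valid transitions and returns the number of calls made.
int demo_calls() {
  auto done = Wrapper<State::START>{}
                  .call_transition<&Builder::add_header>(1)
                  .call_transition<&Builder::add_header>(2)
                  .call_transition<&Builder::add_body>(3);
  static_assert(std::is_same_v<decltype(done), Wrapper<State::BODY>>);
  return done.impl.calls;
}
```

The return type of `call_transition` is left to `auto` because it depends on the result of the lookup, which is one place where deduced return types keep the code short.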
The other functions (call_final_transition, call_valid_query) work in a similar fashion, deducing the return type from the function pointer and the arguments.
The GenericWrapper constructor checks that it was built with one of the initial states (with a static_assert).
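A check of this kind might look like the sketch below. The names are assumptions made for the example (`is_initial`, `Impl`, and this simplified `InitialStates` and `Wrapper` are not taken from the library source). It only shows the idea of a `static_assert` inside the public constructor, next to a second constructor that transitions would use to build wrappers in later states.

```cpp
#include <cassert>
#include <utility>

enum class State { START, HEADERS, BODY };

// Simplified stand-in for the list of initial states.
template <State... Ss> struct InitialStates {};
using MyInitialStates = InitialStates<State::START>;

// True if S appears in the list. An empty list yields false.
template <State S, State... Allowed>
constexpr bool is_initial(InitialStates<Allowed...>) {
  return ((S == Allowed) || ...);
}

struct Impl { int value = 0; };  // hypothetical wrapped class

template <State S>
class Wrapper {
 public:
  // User-facing constructor: only compiles when S is an initial state.
  // The assertion is checked when this constructor is actually used.
  Wrapper() {
    static_assert(is_initial<S>(MyInitialStates{}),
                  "The wrapper must be created in one of the initial states");
  }

  // Constructor meant for transitions: any state is allowed here.
  explicit Wrapper(Impl&& impl) : impl_(std::move(impl)) {}

  int value() const { return impl_.value; }

 private:
  Impl impl_;
};

// Builds a wrapper in the initial state and reads a value through it.
int demo_initial_value() {
  Wrapper<State::START> start;  // OK: START is listed as initial
  // Wrapper<State::BODY> bad;  // would fail the static_assert above
  return start.value();
}
```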
If you want to get into the template details, I invite you to have a look at the code itself.
- It requires C++17. This is mainly due to the auto template parameters. The omitted return types could be fixed with some redundancy (decltype), but dropping the auto template parameters would mean dropping support for enum classes. That could still be acceptable, and regular enums mapped to ints would also work (it would just be slightly less elegant). The code could be adapted to work with C++14 if needed (maybe even C++11, I'm not sure), but it would be more verbose.
- We could have more checks.
  - We currently only have basic checks on the template parameters, but we could check that no two transitions share the same origin and label, or that there are no unreachable states. I feel that this is more the responsibility of the user: they need to make sure the FSM is the right one for their protocol.
  - We check the top-level list parameters (Transitions, FinalTransitions, ...) but we could go down and check that the function pointers are from the wrapped class, that the types are Transition/FinalTransition/..., that the states are from the right enum, and so on.
- Better error messages. When the transition is not found, we have static information about the state and the function pointer, so it should theoretically be possible to craft an error message with more information than "Transition not found".
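To illustrate the first point in the list above: `auto` non-type template parameters (a C++17 feature) let a template accept a value of any suitable type, including an enum class value, and recover its type with `decltype`. The sketch below contrasts that with an `int` parameter, which accepts a regular enum but needs an explicit cast for an enum class. `AutoTag`, `IntTag` and the two enums are hypothetical names for this example only.

```cpp
#include <cassert>
#include <type_traits>

enum class Scoped { START, BODY };  // enum class: no implicit conversion to int
enum Plain { PLAIN_START, PLAIN_BODY };  // regular enum: converts to int

// C++17: the type of the value is deduced from the argument.
template <auto V>
struct AutoTag {
  using type = decltype(V);
  static constexpr auto value = V;
};

// Pre-C++17 style: the parameter type is fixed in advance.
template <int V>
struct IntTag {
  static constexpr int value = V;
};

// The auto version keeps the strong enum type.
static_assert(std::is_same_v<AutoTag<Scoped::BODY>::type, Scoped>);
static_assert(AutoTag<Scoped::BODY>::value == Scoped::BODY);

// The int version works directly with a regular enum...
static_assert(IntTag<PLAIN_BODY>::value == 1);
// ...but an enum class value has to be cast by hand:
// IntTag<Scoped::BODY> would not compile.
static_assert(IntTag<static_cast<int>(Scoped::BODY)>::value == 1);

// Returns the underlying integer of the value carried by an AutoTag.
int demo_auto_tag() {
  return static_cast<int>(AutoTag<Scoped::BODY>::value);
}
```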