Skip to content

Commit

Permalink
port and adapt cpp.react's Introduction.md as doc/introduction.adoc
Browse files Browse the repository at this point in the history
  • Loading branch information
YarikTH committed Aug 6, 2023
1 parent 23a78c9 commit 195106f
Showing 1 changed file with 294 additions and 0 deletions.
294 changes: 294 additions & 0 deletions doc/introduction.adoc
Original file line number Diff line number Diff line change
@@ -0,0 +1,294 @@
= Introduction to µReact
:toc:
:author: Sebastian Jeckel

NOTE: The following is an adapted version of the article
http://snakster.github.io/cpp.react/guides/Introduction.html[Introduction]
from the https://snakster.github.io/cpp.react/[cpp.react]'s documentation.
== Motivation
Reacting to events is a common task in modern applications, for example to respond to user input.
Often events themselves are defined and triggered by a framework, but handled in client code.
Callback mechanisms are typically used to implement this.
They allow clients to register and unregister functions at runtime, and have these functions called when specific events occur.
Conceptually, there is nothing wrong with this approach, but problems can arise when distributing complex program logic across multiple callbacks.
The three main reasons for this are:
* The control flow is "scattered"; events may occur at arbitrary times, callbacks may be added and removed on-the-fly, etc
* Data is exchanged via shared state and side effects
* Callback execution is uncoordinated; callbacks may trigger other callbacks etc. and change propagation follows a "flooding" scheme
In combination, these factors make it increasingly difficult to reason about program behaviour and properties like correctness or algorithmic complexity.
Further, debugging is hard and when adding concurrency to the mix, the situation gets worse.
Decentralized control flow is inherent to the creation of interactive applications, but issues of shared state and uncoordinated execution can be addressed.
This is what µReact does.
== Design outline
The issue of uncoordinated callback execution is handled by delegating this task to a dedicated entity - a so-called _propagation engine_.
This engine "intelligently" schedules callbacks that are ready to be executed, potentially using multiple threads, while guaranteeing certain safety and complexity properties.
The aforementioned usage of shared state and side effects is employed due to a lack of alternatives to implement dataflow between callbacks.
Thus, to improve the situation, proper abstractions to model dataflow explicitly are needed.
From a high-level perspective, this dataflow model consists of entities, which can emit and/or receive data, and pure functions to "wire" them together.
Instead of using side effects, these functions pass data through arguments and return values, based on semantics of the connected entities.
There exist multiple types of entities, representing different concepts like time changing values, event occurrences, or actions.
Essentially, this means that callbacks are chained and can pass data in different ways, all of which is done in a composable manner, backed by a clear semantic model.
The following sections will introduce the core abstractions in more detail.
== Signals
Signals are used to model dependency relations between mutable values.
A `signal<S>` instance represents a container holding a single value of type `S`, which will notify dependents when that value changes.
Dependents could be other signals, which will re-calculate their own values as a result.
To explain this by example, consider the following expression:
[source,c++]
----
X = A + B
----
This could be interpreted as an imperative statement:
[source,c++]
----
int X = A + B;
----
If `A` or `B` change later, the statement has to be re-executed to uphold the invariant of `X` being equal to the sum of `A` and `B`.
Alternatively, instead of storing `X` as a variable in memory, it could be defined as a function:
[source,c++]
----
int X() { return A + B; }
----
This removes the need to update `X` imperatively, but also repeats the calculation on every access rather than on changes only.
Signals are implemented as a combination of both approaches.
To demonstrate this, `A` and `B` are now declared as `signal<int>`.
A function is defined, which calculates the value of `X` based on its given arguments:
[source,c++]
----
int CalcX(int a, int b) { return a + b; }
----
This function is used to connect `X` to its dependencies `A` and `B`:
[source,c++]
----
signal<int> X = lift(with(A, B), CalcX);
----
The values of `A` and `B` are passed as arguments to `CalcX` and the return value is used as value of `X`.
`CalcX` is automatically called to set the initial value of `X`, or upon notification by one of its dependencies about a change.
If the new value of `X` is different from the current one, it would send change notifications to its own dependents (if it had any).
A special type of `signal` is `var_signal`.
Instead of automatically calculating its value based on other signals, it can be changed directly.
This enables interaction with signals from imperative code.
Here's the complete example:
[source,c++]
----
var_signal<int> A = make_var(1);
var_signal<int> B = make_var(2);
signal<int> X = lift(with(A, B), CalcX);
cout << X.get() << endl; // output: 3
A.set(2);
cout << X.get() << endl; // output: 4
----
Updates happen pushed-based, i.e. as part of `A.set(2)`.
`get()` is a pull-based interface to retrieve the stored result.
In summary, the value of signal is calculated by a function, stored in memory, and automatically updated when its dependencies change by re-applying that function with new arguments.
When considering deeply nested dependency relations involving computationally expensive operations, signals apply selective updating, as if only the necessary values were re-calculated imperatively, while the calculations themselves are defined as pure functions.
== Event streams
Unlike signals, event streams do not represent a single, time-varying and persistent value, but streams of fleeting, discrete values.
This allows to model general events like mouse clicks or values from a hardware sensor, which do not fit the category of state changes as covered by signals.
Event streams and signals are similar in the sense that both are reactive containers that can notify their dependents.
An event stream of type `events<E>` propagates a list of values `E` and will notify dependents, if that list is non-empty.
Dependents can then process the propagated values, after which they are discarded.
In other words, event streams don't hold on to values, they only forward them.
Here's an example:
[source,c++]
----
bool IsGreaterThan100(int v) { return v > 100; }
bool IsLessThan10(int v) { return v < 10; }
event_source<int> A = make_source<int>();
event_source<int> B = make_source<int>();
events<int> X = merge(A, B);
events<int> Y1 = filter(X, IsGreaterThan100);
events<int> Y2 = filter(X, IsLessThan10);
// Instead of declaring named functions, we can also use C++11 lambdas
events<float> Z = transform(Y1, [] (int v) { return v / 100.0f; });
A.emit(1);
B.emit(2);
----
`event_source` exists analogously to `var_signal` to provide an imperative interface.
Values enter through sources `A` and `B`, both of which get merged into a single stream `X`.
`Y1` and `Y2` are each filtered to contain only certain numbers forwarded from `X`.
`Z` takes values from `Y1` and transforms them.
Of course, for those events to have an effect, they should trigger actions or be stored somewhere persistently.
The example leaves that part out for now and instead focuses on composability, i.e. how functions like `merge`
or `transform` are used to create new event streams from existing ones.
This is possible, because event streams are first-class values, as opposed to being indirectly represented by the API (e.g. `OnMouseClick()` vs. `events<> MouseClick`).
== Signal-events interrelation
Both signal world and events world can be used separately if one of them is not needed.
But the real power comes from the fact that both of them can work together.
For example if we can only poll mouse positions from somewhere, we can easily produce events of new mouse positions if they are changed using `monitor` algorithm:
[source,c++]
----
var_signal<Point> mousePos = make_var(Point(0, 0));
events<Point> mouseMoved = monitor(mousePos);
----
And the opposite.
If we can only receive mouse position events, then we can store the last known mouse position as a signal using `hold` algorithm:
[source,c++]
----
event_source<Point> mouseMoved = make_source<Point>();
signal<Point> mousePos = hold(mouseMoved, Point(0, 0));
----
Additionally, signals can be used as additional arguments for some events related algorithms.
It is called synchronization, because evaluation of these algorithms is postponed after evaluation of all the dependant signals.
For example lets filter events using signal value as a threshold:
[source,c++]
----
event_source<int> src = make_source<int>();
var_signal<int> filterThreshold = make_var(10);
events<int> filtered =
filter(src, with(filterThreshold),
[](int v, int threshold){
return v >= threshold;
});
----
== Observers
Observers fill the role of "common" callbacks with side effects.
They allow to register functions at subjects, which can be any signal or event stream, and have these functions called when notified by the subject.
In particular, a signal observer is called every time the signal value changes, and an event observer is called for every event that passes the stream.
Observers are only ever on the receiving side of notifications, because they have no dependents.
Similar to signals and event streams, observers are first-class objects.
They're created with `observe(theSubject, theCallback)`, which returns an instance of `observer`.
This instance provides an interface to detach itself from the subject explicitly.
An example shows how observers complement event streams to trigger actions (e.g. console output):
[source,c++]
----
events<> LeftClick = ...;
events<> RightClick = ...;
// Note: If the event value type is omitted, the type 'unit' is used as default.

observer obs =
observe(
merge(LeftClick, RightClick),
[] (unit) {
std::cout << "Clicked" << std::endl;
});
...
----

Another example, this time using a signal observer:

[source,c++]
----
var_signal<int> LastValue = ...;
observer obs =
observe(
LastValue,
[] (int v) {
std::cout < "Value changed to " << v << std::endl;
});
...
----

The obvious benefit is that we don't have to implement callback registration and dispatch mechanisms ourselves.
Further, they are already in place and available on every reactive value without any additional steps.

== Coordinated change propagation

A problem that has been mentioned earlier is the uncoordinated execution of callbacks.
To construct a practical scenario for this, imagine a graphical user interface, consisting of multiple components that are positioned on the screen.
If the size of any of the components changes, the screen layout has to be updated.
There exist several controls the user can interact with to change the size of specific components.

In summary, this scenario defines three layers, connected by callbacks:

* the input controls
* the display components
* the screen layout

The question is, what happens if a single control affects multiple components:

* Each component individually triggers an update of the layout.
As we add more layers to our scenario, the number of updates caused by a single change might grow exponentially.
* Some of these updates are executed while part of the components have already been changed, while others have not.
This can lead to intermittent errors - or _glitches_ - which are hard to spot.
* Parallelization requires mutually exclusive access to shared data and critical operations (i.e. updating the layout cannot happen concurrently).

When building the same system based on signals, events and observers, execution of individual callback functions is ordered.
First, all inputs are processed; then, the components are changed; lastly, the layout is updated once.
This ordering is based on the inherent hierarchy of the presented model.

Enabling parallelization comes naturally with this approach, as it becomes an issue of implicitly synchronizing forks and joins of the control flow, rather than managing mutually exclusive access to data. µReact don't support parallelization though.

== Conclusion

The presented reactive types provide us with specialized tools to address requirements that would otherwise be implemented with callbacks and side effects:

* Signals, as an alternative to updating and propagating state changes manually.
* Event streams, as an alternative to transferring data between event handlers explicitly, i.e. through shared message queues.

For cases where callbacks with side effects are not just a means to an end, but what is actually intended, observers exist as an alternative to setting up registration mechanisms by hand.

Change propagation is not just based on flooding, but uses a coordinated approach for

* avoidance of intermittent updates
* glitch freedom
* implicit parallelism (not in µReact yet)

This concludes an overview of the main features µReact has to offer.

=== Further reading

The concepts described in this article are well-established in reactive programming and not original to this library, though semantics may slightly differ between implementations.
An academic paper which has been especially influential for the design of this library is http://lamp.epfl.ch/~imaier/pub/DeprecatingObserversTR2010.pdf[Deprecating the Observer Pattern] by Maier et al.

0 comments on commit 195106f

Please sign in to comment.