Design documentation for adding a raw-FFI thread manager #31

Open: wants to merge 42 commits into base: main
Changes from 16 commits
38791cf
Add raw-ffi design
acarbonetto Nov 1, 2023
29e2244
Update to add Shachars changes
acarbonetto Nov 1, 2023
32b751e
Update design documentation
acarbonetto Nov 20, 2023
799b248
Update docs
acarbonetto Nov 20, 2023
a86e424
Add API design doc
jonathanl-bq Nov 21, 2023
265daa0
Update section on supported commands in API design doc
jonathanl-bq Nov 21, 2023
8d43095
Update API design doc with more details
jonathanl-bq Nov 22, 2023
11ea2c7
Update type handling policy in API design doc
jonathanl-bq Nov 22, 2023
d7325a1
Push update
acarbonetto Nov 22, 2023
fe1690a
Update API design doc with Routing info
jonathanl-bq Nov 24, 2023
744e92f
Add example showing how executeRaw would work to API design doc
jonathanl-bq Nov 24, 2023
56ebe90
Add Redis to Java and Go encoding
acarbonetto Nov 26, 2023
64e034a
Change to supporting RESP2 instead of RESP3 for now
jonathanl-bq Nov 27, 2023
2d688c1
Add go and java-specific language
acarbonetto Nov 27, 2023
f6b702b
Clean up section on supported commands in API design doc
jonathanl-bq Nov 29, 2023
af4e2a4
Fix typo in API design doc
jonathanl-bq Nov 29, 2023
df00c54
Update docs/design-api.md
jonathanl-bq Nov 30, 2023
ce3eb6c
Add some more details to API design
jonathanl-bq Nov 30, 2023
28c672b
Add java design documentation
acarbonetto Dec 13, 2023
91490bc
Add use-cases as examples of using the API
acarbonetto Dec 15, 2023
7058b02
Add more examples; return Type directly
acarbonetto Dec 20, 2023
f339390
Update customCommand use case
acarbonetto Dec 20, 2023
c4a13da
Update transactional use-cases
acarbonetto Dec 20, 2023
8bb74ba
Add Go API documentation
aaron-congo Jan 23, 2024
eece5f9
add missing period
aaron-congo Jan 23, 2024
73ba55b
Address PR feedback
aaron-congo Jan 23, 2024
b88cbbd
Update struct diagram
aaron-congo Jan 24, 2024
30406bf
PR suggestions
aaron-congo Jan 24, 2024
041bd39
Add documentation for the Go API design
aaron-congo Jan 24, 2024
a25467e
Add documentation for the Go FFI design
aaron-congo Jan 26, 2024
0062ed9
Update diagrams so that maps and arrays of Redis values include an en…
aaron-congo Jan 26, 2024
ae090e6
Fix mistakes in the FFI request success struct diagram
aaron-congo Jan 26, 2024
a852c04
Scale up diagrams to be more readable
aaron-congo Jan 27, 2024
50bc303
Address PR feedback
aaron-congo Jan 30, 2024
fdd659a
Increase size of API struct diagram to make it more readable
aaron-congo Jan 30, 2024
c3b7ae6
Update request success struct diagram
aaron-congo Jan 30, 2024
d5edd9d
Update connection sequence diagram
aaron-congo Jan 30, 2024
3bd84ba
Update connection sequence diagram
aaron-congo Jan 30, 2024
26bafc1
Add glide-core to connection sequence diagram
aaron-congo Jan 31, 2024
765d220
Add documentation for the Go FFI design
aaron-congo Jan 31, 2024
969a896
Update Go use cases with current configuration implementation
aaron-congo Feb 24, 2024
3896446
Update Go use cases to use config without pointer fields
aaron-congo Feb 27, 2024
115 changes: 115 additions & 0 deletions docs/design-api.md
@@ -0,0 +1,115 @@
API Design

# Client Wrapper API design doc

## API requirements:
- The API will be thread-safe.

Review comment: Maybe use the present tense instead of the future?

- The API will accept as inputs all of [RESP2 types](https://github.com/redis/redis-specifications/blob/master/protocol/RESP2.md). We plan to add support for RESP3 types when they are available.
- The API will attempt authentication, topology refreshes, reconnections, etc., automatically. In case of failures, concrete errors will be returned to the user.

## Command Interface

### Unix Domain Socket solution
For clients based on Unix Domain Sockets (UDS), we will simply use the existing protobuf messages for creating a connection, sending requests, and receiving responses. Supported commands are enumerated in the [protobuf definition for requests](../babushka-core/src/protobuf/redis_request.proto) and we may add more in the future, although the `CustomCommand` request type is also adequate for all commands. As defined in the [protobuf definition for responses](../babushka-core/src/protobuf/response.proto), client wrappers will receive data as a pointer, which can be passed to Rust to marshal the data back into the wrapper language’s native data types.

Transactions will be handled by adding a list of `Command`s to the protobuf request. The response will be a `redis::Value::Bulk`, which should be handled in the same Rust function that marshals the data back into the wrapper language's native data types. This is handled by storing the results in a collection type native to the wrapper language.

When running Redis in Cluster Mode, several routing options will be provided. These are all specified in the protobuf request. The various options are detailed below in the ["Routing Options" section](#routing-options). We will also provide a separate client for handling Cluster Mode responses, which will convert the list of values and nodes into a map, as is done in existing client wrappers.
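
As an illustration of the last point, here is a minimal Go sketch of turning a multi-node response into a map. It assumes the core hands back a flat list of (node address, value) pairs; the `NodeValue` type and `toNodeMap` function are hypothetical names, not part of the existing wrappers.

```go
package main

// NodeValue pairs a node address with the value that node returned.
// Assumption: this shape is illustrative only; the real response layout
// is defined by the protobuf response and the Rust core.
type NodeValue struct {
	Node  string
	Value any
}

// toNodeMap converts the list of per-node results into a map keyed by
// node address. If an address appears twice, the last value wins.
func toNodeMap(pairs []NodeValue) map[string]any {
	out := make(map[string]any, len(pairs))
	for _, p := range pairs {
		out[p.Node] = p.Value
	}
	return out
}
```

A Cluster Mode client could call `toNodeMap` on the result of a command routed to all nodes, so that users look up each node's reply by address.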

### Raw FFI solution
For clients using a raw FFI solution, the Rust core will expose a general function that accepts any command and its arguments as strings.

Like in the UDS solution, we will support a separate client for Cluster Mode.

We have 2 options for passing the command, arguments, and any additional configuration to the Rust core from the wrapper language:

#### Protobuf
The wrapper language will pass the commands, arguments, and configuration as protobuf messages using the same definitions as in the UDS solution.

Transactions will be handled by adding a list of `Command`s to the protobuf request. The response will be a `redis::Value::Bulk`, which can be marshalled into a C array of values before being passed from Rust to the wrapper language. The wrapper language is responsible for converting the array of results to its own native collection type.

Cluster Mode support is the same here as in the UDS solution detailed above.

Pros:
- We get to reuse the protobuf definitions, meaning fewer files to update if we make changes to the protobuf definitions

Review comment (Collaborator): All Redis commands can be represented as a simple string array, so passing protobuf messages from the wrapper to the core adds unnecessary complication when we're talking about a raw FFI solution.

Review comment (Collaborator): e.g. we'll have a generic `execute_command` function in Rust that accepts a string array, and all FFI functions from the wrapper will call it.

- May be simpler to implement compared to the C data types solution, since we do not need to define our own C data types

Cons:
- There is additional overhead from marshalling data to and from protobuf, which could impact performance significantly

#### C Data Types
The wrapper language will pass commands, arguments, and configuration as C data types.

Transactions will be handled by passing a C array of argument arrays (one per command) from the wrapper language to Rust. The response will be a `redis::Value::Bulk`, which can be marshalled in the same way as explained in the protobuf solution.

For Cluster Mode support, [routing options](#routing-options) will be defined as C enums and structs. Like in the protobuf solution, we will provide a separate client for handling Cluster Mode responses, which will convert the list of values and nodes into a map.

Pros:
- No additional overhead from marshalling to and from protobuf, so this should perform better
- May be simpler to implement than the protobuf solution, since it can be tricky to construct protobuf messages in a performant way, and we also have to add a varint to the messages

Cons:
- Would add an additional file to maintain containing the C definitions (only one file though, since we could share between all raw FFI solutions), which we would need to update every time we want to update the existing protobuf definitions

We will test both approaches, comparing ease of implementation and performance impact, before deciding on a solution.
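
To make the varint overhead mentioned above concrete, here is a minimal Go sketch. It assumes the varint is a length prefix placed in front of each serialized message (the usual length-delimited framing); `frame` and `unframe` are hypothetical helper names, and the payload is treated as opaque bytes rather than a real protobuf message.

```go
package main

import (
	"encoding/binary"
	"errors"
)

// frame prepends the payload length, encoded as an unsigned varint.
func frame(payload []byte) []byte {
	buf := make([]byte, binary.MaxVarintLen64, binary.MaxVarintLen64+len(payload))
	n := binary.PutUvarint(buf, uint64(len(payload)))
	return append(buf[:n], payload...)
}

// unframe reads one length-prefixed message from data and returns the
// payload together with any bytes that follow it.
func unframe(data []byte) (payload, rest []byte, err error) {
	size, n := binary.Uvarint(data)
	if n <= 0 {
		return nil, nil, errors.New("invalid or incomplete length prefix")
	}
	if uint64(len(data)-n) < size {
		return nil, nil, errors.New("incomplete payload")
	}
	end := n + int(size)
	return data[n:end], data[end:], nil
}
```

With this framing, a 300-byte message gains a two-byte prefix (`0xAC 0x02`), and the reader can tell a partial read from a complete message before decoding.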

To marshal Redis data types back into the corresponding types for the wrapper language, we will convert them into appropriate C types, which can then be translated by the wrapper language into its native data types. Here is what a Redis result might look like:
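```c
typedef struct redisValue {
    /* Discriminant: selects which member of `payload` is valid. */
    enum {NIL, INT, DATA, STATUS, BULK, OKAY} kind;
    union Payload {
        long intValue;                 /* INT */
        unsigned char *dataValue;      /* DATA: raw bytes */
        char *statusValue;             /* STATUS: status string */
        struct redisValue *bulkValue;  /* BULK: array of nested values */
    } payload;
} RedisValue;
```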
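
On the wrapper side, a value like this would be translated into native types. The following Go sketch shows one possible translation; it mirrors the C struct with a plain Go struct rather than going through cgo, and the `RedisValue`, `Kind`, and `ToNative` names are hypothetical.

```go
package main

// Kind mirrors the `kind` discriminant of the C struct.
type Kind int

const (
	KindNil Kind = iota
	KindInt
	KindData
	KindStatus
	KindBulk
	KindOkay
)

// RedisValue is a Go stand-in for the C tagged union. Only the field
// matching Kind is meaningful.
type RedisValue struct {
	Kind   Kind
	Int    int64
	Data   []byte
	Status string
	Bulk   []RedisValue
}

// ToNative converts a RedisValue into plain Go values: int64, string,
// []any for bulk replies, and nil for a nil reply.
func ToNative(v RedisValue) any {
	switch v.Kind {
	case KindInt:
		return v.Int
	case KindData:
		return string(v.Data)
	case KindStatus:
		return v.Status
	case KindOkay:
		return "OK"
	case KindBulk:
		out := make([]any, len(v.Bulk))
		for i, item := range v.Bulk {
			out[i] = ToNative(item)
		}
		return out
	default:
		return nil
	}
}
```

Bulk values are converted recursively, which is how a transaction response (a `redis::Value::Bulk`) would end up as a native slice of results.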

## Routing Options
We will be supporting routing Redis requests to all nodes, all primary nodes, or a random node. For more specific routing to a node, we will also allow sending a request to a primary or replica node with a specified hash slot or key. When the wrapper is given a key route, the key is passed to the Rust core, which will find the corresponding hash slot for it.
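
For reference, the key-to-slot mapping can be sketched in Go as follows. This follows the Redis Cluster specification (CRC16/XMODEM of the key, or of its hash tag, modulo 16384) and is only an illustration: in this design the computation happens in the Rust core, and `KeySlot`, `hashTag`, and `crc16` are hypothetical names.

```go
package main

import "strings"

const slotCount = 16384

// crc16 computes CRC16/XMODEM (polynomial 0x1021, initial value 0),
// the variant named in the Redis Cluster specification.
func crc16(data []byte) uint16 {
	var crc uint16
	for _, b := range data {
		crc ^= uint16(b) << 8
		for i := 0; i < 8; i++ {
			if crc&0x8000 != 0 {
				crc = (crc << 1) ^ 0x1021
			} else {
				crc <<= 1
			}
		}
	}
	return crc
}

// hashTag returns the part of the key that is hashed: the text between
// the first '{' and the next '}' if that text is non-empty, otherwise
// the whole key.
func hashTag(key string) string {
	start := strings.IndexByte(key, '{')
	if start < 0 {
		return key
	}
	end := strings.IndexByte(key[start+1:], '}')
	if end <= 0 {
		return key
	}
	return key[start+1 : start+1+end]
}

// KeySlot maps a key to one of the 16384 cluster hash slots.
func KeySlot(key string) uint16 {
	return crc16([]byte(hashTag(key))) % slotCount
}
```

Keys that share a hash tag, such as `{user1000}.following` and `{user1000}.followers`, map to the same slot and are therefore routed to the same node.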

Review comment: "We will", or "We should", or "The client should/will/etc."? Also: future or present tense? (here and below)


## Supported Commands
We will be supporting all Redis commands. Commands with higher usage will be prioritized, as determined by usage numbers from AWS ElastiCache usage logs.

Two different methods of sending commands will be supported:

### Custom Command
We will expose an `executeRaw` method that does no validation of the input types or command on the client side, leaving it up to Redis to reject the command should it be malformed. This gives the user the flexibility to send any type of command they want, including ones not officially supported yet.

For example, if a user wants to implement support for the Redis ZADD command in Java, their implementation might look something like this:
```java
public Long zadd(K key, double score, V member) throws RequestException {
    // The command name goes first, followed by its arguments as strings.
    String[] args = { "ZADD", key.toString(), Double.toString(score), member.toString() };
    return (Long) executeRaw(args);
}
```

where `executeRaw` has the following signature:
```java
public Object executeRaw(String[] args) throws RequestException
```

### Explicitly Supported Command
We will expose separate methods for each supported command. There will be a separate version of each method for transactions, as well as another version for Cluster Mode clients. For statically typed languages, we will leverage the compiler of the wrapper language to validate the types of the command arguments as much as possible. Since wrappers should be as lightweight as possible, we will be performing very few to no checks for proper typing for non-statically typed languages.
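
A minimal Go sketch of how an explicitly supported command could sit on top of the raw command path, with the compiler checking argument types. All names here (`rawExecutor`, `Client`, `ZAdd`, `recordingExecutor`) are hypothetical, and the executor is a stand-in for the call into the Rust core.

```go
package main

import (
	"errors"
	"strconv"
)

// rawExecutor is a hypothetical stand-in for the untyped command path.
type rawExecutor interface {
	ExecuteRaw(args []string) (any, error)
}

// Client exposes typed methods built on the raw executor.
type Client struct {
	exec rawExecutor
}

// ZAdd has typed parameters, so passing a non-numeric score is a
// compile-time error rather than a Redis error at run time.
func (c *Client) ZAdd(key string, score float64, member string) (int64, error) {
	args := []string{"ZADD", key, strconv.FormatFloat(score, 'f', -1, 64), member}
	res, err := c.exec.ExecuteRaw(args)
	if err != nil {
		return 0, err
	}
	n, ok := res.(int64)
	if !ok {
		return 0, errors.New("unexpected response type for ZADD")
	}
	return n, nil
}

// recordingExecutor is a test double that records the last command and
// replies as if one member had been added.
type recordingExecutor struct {
	last []string
}

func (r *recordingExecutor) ExecuteRaw(args []string) (any, error) {
	r.last = args
	return int64(1), nil
}
```

The typed method does no validation beyond what the signature enforces, which keeps the wrapper thin while still catching type mistakes early.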

## Errors

Review comment: Reference where these errors came from.

`ClosingError`: Errors that report that the client has closed and is no longer usable.

Review comment: Suggested change: format the error name as code (`ClosingError`), here and below.

`RedisError`: Errors that were reported during a request.

`TimeoutError`: Errors that are thrown when a request times out.

`ExecAbortError`: Errors that are thrown when a transaction is aborted.

`ConnectionError`: Errors that are thrown when a connection disconnects. These errors can be temporary, as the client will attempt to reconnect.

Errors returned are subject to change as we update the protobuf definitions.
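
As a sketch of how a wrapper might surface these categories, the Go example below models each one as its own error type and uses `errors.As` to tell them apart. The type names follow the list above, but the struct layout and the `wrapRequest`, `isTemporary`, and `isClientUsable` helpers are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// One type per error category listed above (hypothetical layout).
type ClosingError struct{ Msg string }
type RedisError struct{ Msg string }
type TimeoutError struct{ Msg string }
type ExecAbortError struct{ Msg string }
type ConnectionError struct{ Msg string }

func (e *ClosingError) Error() string    { return "ClosingError: " + e.Msg }
func (e *RedisError) Error() string      { return "RedisError: " + e.Msg }
func (e *TimeoutError) Error() string    { return "TimeoutError: " + e.Msg }
func (e *ExecAbortError) Error() string  { return "ExecAbortError: " + e.Msg }
func (e *ConnectionError) Error() string { return "ConnectionError: " + e.Msg }

// wrapRequest adds context while keeping the category discoverable.
func wrapRequest(command string, err error) error {
	return fmt.Errorf("running %s: %w", command, err)
}

// isTemporary reports whether the error is a ConnectionError, which the
// client may recover from by reconnecting.
func isTemporary(err error) bool {
	var connErr *ConnectionError
	return errors.As(err, &connErr)
}

// isClientUsable reports false once a ClosingError has been seen.
func isClientUsable(err error) bool {
	var closing *ClosingError
	return !errors.As(err, &closing)
}
```

Callers can then decide whether to retry a request or discard the client without parsing error strings.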

## Java Specific Details
We will be using the UDS solution for communication between the wrapper and the Rust core. This thin layer is implemented using the [jni-rs library](https://github.com/jni-rs/jni-rs) to start the socket listener and marshal Redis values into native Java data types.

Errors in Rust are represented as Algebraic Data Types, which are not supported in Java by default (at least not in the versions of Java we want to support). Instead, we utilise the [jni-rs library](https://github.com/jni-rs/jni-rs) to throw Java `Exception`s when we receive errors from Redis.

## Golang Specific Details
We will be using a raw FFI solution for communication between the wrapper and the Rust core. TODO: Add more details here
240 changes: 240 additions & 0 deletions docs/design-raw-ffi.md

Review comment: Rename file - it shows both raw FFI and UDS approaches.

@@ -0,0 +1,240 @@
# Babushka Core Wrappers

Review comment: Please rename all references. Suggested change: `# Babushka Core Wrappers` to `# Glide Core Wrappers`.


## Summary

The Babushka client allows users to connect to Redis and run commands through a thin client optimized for
each supported language. The client uses a performant core to establish and manage connections and communicate with Redis. The thin-client
wrapper talks to the core using an FFI (foreign function interface) to Rust.

This document discusses two communication architectures for wrapping the Babushka core. Specifically,
it details how Java-Babushka and Go-Babushka each use a different protocol and describes the advantages of each language-specific approach.

# Unix Domain Socket Manager Connector

## High-Level Design

**Summary**: The Babushka "UDS" solution uses a socket listener to manage Rust-to-wrapper worker threads, and Unix domain sockets
to deliver command requests and responses between the wrapper and the redis-client threads. This works well because the Unix sockets pass messages and manage threads
through the OS, and Unix sockets are very performant. This results in simple, fast communication. The risk is that
Unix sockets can become a bottleneck for data-intensive commands, leaving the library spending too much time blocked on
I/O operations.

```mermaid
stateDiagram-v2
direction LR

Wrapper: Wrapper
UnixDomainSocket: Unix Domain Socket
RustCore: Rust-Core

[*] --> Wrapper: User
Wrapper --> UnixDomainSocket
UnixDomainSocket --> Wrapper
RustCore --> UnixDomainSocket
UnixDomainSocket --> RustCore
RustCore --> Redis
Redis --> RustCore
```
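
The wrapper-to-socket-to-core round trip in the diagram above can be sketched in Go as follows. This is an illustration under simplifying assumptions: a newline-terminated text message stands in for the protobuf request, a goroutine stands in for the Rust socket listener, and `tempSocketPath`, `serveOnce`, and `roundTrip` are hypothetical names.

```go
package main

import (
	"bufio"
	"net"
	"os"
	"path/filepath"
	"strings"
)

// tempSocketPath returns a socket path inside a fresh temporary directory.
func tempSocketPath() (string, error) {
	dir, err := os.MkdirTemp("", "uds")
	if err != nil {
		return "", err
	}
	return filepath.Join(dir, "core.sock"), nil
}

// serveOnce plays the role of the socket listener: it accepts one
// connection, reads one request line, and writes one reply line.
func serveOnce(l net.Listener) {
	conn, err := l.Accept()
	if err != nil {
		return
	}
	defer conn.Close()
	line, err := bufio.NewReader(conn).ReadString('\n')
	if err != nil {
		return
	}
	conn.Write([]byte("echo: " + line))
}

// roundTrip plays the role of the wrapper: it connects to the socket,
// sends a request, and blocks until the reply arrives.
func roundTrip(socketPath, msg string) (string, error) {
	l, err := net.Listen("unix", socketPath)
	if err != nil {
		return "", err
	}
	defer l.Close()
	go serveOnce(l)

	conn, err := net.Dial("unix", socketPath)
	if err != nil {
		return "", err
	}
	defer conn.Close()
	if _, err := conn.Write([]byte(msg + "\n")); err != nil {
		return "", err
	}
	reply, err := bufio.NewReader(conn).ReadString('\n')
	if err != nil {
		return "", err
	}
	return strings.TrimSuffix(reply, "\n"), nil
}
```

The blocking read in `roundTrip` is where the I/O wait described in the summary would show up for large payloads.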

## Decision to use UDS Sockets for a Java-Babushka Wrapper

The decision to use Unix Domain Sockets (UDS) to manage the Java-wrapper to Babushka Redis-client communication was based on the following:
1. Java has an efficient socket protocol library ([netty.io](https://netty.io/)) that provides a highly configurable environment to manage sockets.
2. Java object serialization/deserialization is an expensive operation, and performing multiple I/O operations between raw-FFI calls would be inefficient.
3. Async FFI requests with callbacks require that we manage multiple runtimes (Rust and Java thread management), and JNI does not provide an out-of-the-box solution for this.

### Decision Log

| Protocol | Details | Pros | Cons |
|----------------------------------------------|-------------------------------------------------------------|-----------------------------|----------------------------------------------------|
| Unix Domain Sockets (jni/netty) | JNI to submit commands; netty.io for message passing; async | netty.io standard lib; | complex configuration; limited by socket interface |
| Raw-FFI (JNA, uniffi-rs, j4rs, interoptopus) | FFI to submit commands; Rust for message processing | reusable in other languages | slow performance and uses JNI under the hood |
| Panama/jextract | Performance similar to a raw-ffi using JNI | modern | lacks early Java support (JDK 18+); prototype |

### Sequence Diagram

```mermaid
sequenceDiagram

participant Wrapper as Java-Wrapper
participant ffi as FFI
participant manager as Rust-Core
participant worker as Tokio Worker
participant SocketListener as Socket Listener
participant Socket as Unix Domain Socket
participant Client as Redis

activate Wrapper
activate Client
Wrapper -)+ ffi: connect_to_redis
ffi -)+ manager: start_socket_listener(init_callback)

%% Review suggestion: replace the line above with "ffi -)+ manager: start_socket_listener"

manager -) worker: Create Tokio::Runtime (count: CPUs)
activate worker
worker ->> SocketListener: listen_on_socket(init_callback)
SocketListener ->> SocketListener: loop: listen_on_client_stream
activate SocketListener
SocketListener -->> manager:
manager -->> ffi: socket_path
ffi -->>- Wrapper: socket_path
SocketListener -->> Socket: UnixStreamListener::new
activate Socket
SocketListener -->> Client: BabushkaClient::new
Wrapper ->> Socket: connect
Socket -->> Wrapper:
loop single_request
Wrapper ->> ffi: java_arg_to_redis
ffi -->> Wrapper:
Wrapper -> Wrapper: pack protobuf.redis_request
Wrapper ->> Socket: netty.writeandflush (protobuf.redis_request)
Socket -->> Wrapper:
Wrapper ->> Wrapper: wait
SocketListener ->> SocketListener: handle_request
SocketListener ->> Socket: read_values_loop(client_listener, client)
Socket -->> SocketListener:
SocketListener ->> Client: send(request)
Client -->> SocketListener: ClientUsageResult
SocketListener ->> Socket: write_result
Socket -->> SocketListener:
Wrapper ->> Socket: netty.read (protobuf.response)

%% Review comment: The line above is not correct. Maybe we need to split the diagram into two or three: Java to UDS, UDS to Rust, and a zoomed-out Java-UDS-Rust view.

Socket -->> Wrapper:
Wrapper ->> ffi: redis_value_to_java
ffi -->> Wrapper:
Wrapper ->> Wrapper: unpack protobuf.response
end
Wrapper ->> Socket: close()
Wrapper ->> SocketListener: shutdown
SocketListener ->> Socket: close()
deactivate Socket
SocketListener ->> Client: close()
SocketListener -->> Wrapper:
deactivate SocketListener
deactivate worker
deactivate Wrapper
deactivate Client
```

### Discussion
* `redis_value_to_java`: This FFI call is needed because the `Redis::Value` response that Redis returns to Rust-core
must be converted to a `JObject` before Java can evaluate it. We are looking for alternatives to this call
to avoid an unnecessary FFI call.
* `java_arg_to_redis`: This FFI call is currently unnecessary, because all arguments sent are Strings.


### Elements
* **Java-Wrapper**: Our Babushka wrapper that exposes a client API (java, python, node, etc)
* **Babushka FFI**: Foreign Function Interface definitions from our wrapper to our Rust Babushka-Core
* **Babushka impl**: public interface layer and thread manager
* **Tokio Worker**: Tokio worker threads (number of CPUs)
* **SocketListener**: listens for work from the Socket, and handles commands
* **Unix Domain Socket**: Unix Domain Socket to handle incoming requests and response payloads between Rust-Core and Wrapper
* **Redis**: Our data store

## Wrapper-to-Core Connector with raw-FFI calls

**Summary**: Foreign Function Interface (FFI) calls are a simple way to make cross-language calls. The setup between Golang and the Rust-core
is fairly simple using the well-supported cgo tool. While making cross-language calls is easy, making them asynchronous
requires that we handle async callbacks. Golang has a simple, lightweight solution for that, using goroutines and channels
to pass callbacks and execution between the languages.

```mermaid
stateDiagram-v2
direction LR

Wrapper: Golang Wrapper
FFI: Foreign Function Interface
RustCore: Rust-Core

[*] --> Wrapper: User
Wrapper --> FFI
FFI --> Wrapper
RustCore --> FFI
FFI --> RustCore
RustCore --> Redis
```

## Decision to use Raw-FFI calls directly to Rust-Core for Golang Wrapper

Review comment: Move Go to another doc too?

Reply (Author): I'd rather hold off on splitting until we have an idea of where these documents will be stored.


### Decision Log

The decision to use raw FFI requests from Golang to Rust-core was straightforward:
1. Golang provides goroutines, a lightweight and performant concurrency mechanism that is an obvious way to pass requests, even at scale.

Because Go's thread management is lightweight, we chose a solution that scales quickly and requires less configuration to reach performance
on par with existing industry standards ([go-redis](https://github.com/redis/go-redis)).

| Protocol | Details | Pros | Cons |
|--------------------------|---------|--------------------------------------------------------|--------------------------------------|
| Unix Domain Sockets | | UDS performance; consistent protocol between languages | complex configuration |
| Raw-FFI (CGO/goroutines) | | simplified and light-weight interface | separate management for each request |

## Sequence Diagram - Raw-FFI Client

**Summary**: By making direct FFI calls from our wrapper to Rust, we can initiate commands to Redis on demand.
Since the calls are async, we need to manage a callback object and populate it with the response and a payload.

We need to avoid busy-waiting on the async response. The wrapper and Rust-core track threads independently;
Rust-core uses a Tokio runtime to manage its threads. When Rust-core completes a request and returns a response,
we can use the callback object to wake the wrapper's thread manager and continue work.

Go has a performant solution for this in lightweight goroutines and channels. Instead of busy-waiting, the calling goroutine
blocks on a result channel and wakes when the callback triggered by the Tokio threads pushes the result to that channel.
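
The channel-based wait can be sketched in Go as follows. This is a simulation, not the real FFI: `fakeCoreExecute` is a hypothetical stand-in for the cgo call into Rust-core, and a goroutine plays the role of the Tokio worker that invokes the callback when the response is ready.

```go
package main

import "strings"

// response is what the callback delivers back to the wrapper.
type response struct {
	value string
	err   error
}

// fakeCoreExecute stands in for the FFI call. It returns immediately and
// invokes the callback later from another goroutine, as an async core would.
func fakeCoreExecute(args []string, callback func(response)) {
	go func() {
		callback(response{value: "done: " + strings.Join(args, " ")})
	}()
}

// executeCommand creates a result channel, issues the async call, and
// blocks on the channel until the callback pushes the result.
func executeCommand(args []string) (string, error) {
	// Buffer of one so the callback never blocks, even if the caller
	// has stopped waiting.
	resultCh := make(chan response, 1)
	fakeCoreExecute(args, func(r response) { resultCh <- r })
	r := <-resultCh
	return r.value, r.err
}
```

While `executeCommand` is blocked on the channel, the Go scheduler parks the goroutine and runs others, so no CPU time is spent polling for the response.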

### Sequence Diagram

```mermaid
sequenceDiagram

participant Wrapper as Go-Wrapper
participant channel as Result Channel
participant ffi as Babushka FFI
participant manager as Babushka impl
participant worker as Tokio Worker
participant Client as Redis

activate Wrapper
activate Client
Wrapper -)+ ffi: create_connection(connection_settings)
ffi ->>+ manager: start_thread_manager(init_callback)
manager ->> worker: Create Tokio::Runtime (count: CPUs)
activate worker
manager -->> Wrapper: Ok(BabushkaClient)
worker ->> Client: BabushkaClient::new
worker ->> worker: wait_for_work(init_callback)

loop single_request
Wrapper ->> channel: make channel
activate channel
Wrapper -) ffi: command: single_command(protobuf.redis_request, &channel)
Wrapper ->> channel: wait
ffi ->> manager: cmd(protobuf.redis_request)
manager ->> worker: command: cmd(protobuf.redis_request)
worker ->> Client: send(command, args)
Client -->> worker: Result
worker -->> ffi: Ok(protobuf.response<Redis::Value>)
ffi -->> channel: Ok(protobuf.response<Result>)
channel ->> Wrapper: protobuf.response<Result>
Wrapper ->> channel: close
deactivate channel
end

Wrapper -) worker: close_connection
worker -->> Wrapper:
deactivate worker
deactivate Wrapper
deactivate Client
```

### Discussion

Message format interface: When passing messages between the Go-wrapper and Rust-core, we need a format that both
languages can read. Protobuf, for example, passes messages in its binary wire format. We could also pass messages using a custom C data type.
Protobuf is available, but the overhead of encoding and decoding messages may make a custom C data type more worthwhile.

### Elements
* **Go-Wrapper**: Our Babushka wrapper that exposes a client API (Go, etc)
* **Result Channel**: Goroutine channel on the Babushka Wrapper
* **Babushka FFI**: Foreign Function Interface definitions from our wrapper to our Rust Babushka-Core
* **Babushka impl**: public interface layer and thread manager
* **Tokio Worker**: Tokio worker threads (number of CPUs)
* **Redis**: Our data store