docs: Added concepts section
jruaux committed Jun 12, 2024
1 parent b47b22b commit 80db98d
Showing 20 changed files with 374 additions and 351 deletions.
2 changes: 0 additions & 2 deletions docs/guide/src/docs/asciidoc/_links.adoc
@@ -1,12 +1,10 @@
:link_etl: link:https://en.wikipedia.org/wiki/Extract,_transform,_load[ETL]
:link_releases: link:{project-url}/releases[{project-title} Releases]
:link_redis_pipelining: link:https://redis.io/topics/pipelining[Redis Pipelining]
:link_redis_7: link:https://raw.githubusercontent.com/redis/redis/7.0/00-RELEASENOTES[Redis 7.0]
:link_redis_notif: link:https://redis.io/docs/manual/keyspace-notifications[Redis Keyspace Notifications]
:link_redis_enterprise: link:https://redis.com/redis-enterprise-software/overview/[Redis Enterprise]
:link_redis_crdb: link:https://redis.com/redis-enterprise/technology/active-active-geo-distribution/[Redis Enterprise CRDB]
:link_redis_bigkeys: link:https://developer.redis.com/operate/redis-at-scale/observability/identifying-issues/#scanning-keys[Big keys]
:link_pipeline_tuning: link:https://stackoverflow.com/a/32165090[Redis Pipeline Tuning]
:link_lettuce_api: link:https://lettuce.io/core/release/api/io/lettuce/core/api/sync/RedisCommands.html[Lettuce API]
:link_lettuce_uri: link:https://github.com/lettuce-io/lettuce-core/wiki/Redis-URI-and-connection-details#uri-syntax[Redis URI Syntax]
:link_lettuce_readfrom: link:https://github.com/lettuce-io/lettuce-core/wiki/ReadFrom-Settings#read-from-settings[Read-From Settings]
38 changes: 30 additions & 8 deletions docs/guide/src/docs/asciidoc/concepts.adoc
@@ -1,24 +1,24 @@
[[_concepts]]
= Concepts

{project-title} is essentially an {link_etl} tool where data is extracted from the source system, transformed (see <<_processing,Processing>>), and loaded into the target system.
{project-title} is essentially an {link_etl} tool where data is extracted from the source system, transformed (see <<_concepts_processing,Processing>>), and loaded into the target system.

image::architecture.svg[]

[[_batching]]
[[_concepts_batching]]
== Batching

Processing in {project-title} is done in batches: a fixed number of records is read from the source, processed, and written to the target.
The default batch size is `50`, which means that an execution step reads 50 items at a time from the source, processes them, and finally writes them to the target.
If the target is Redis, writing is done in a single command ({link_redis_pipelining}) to minimize the number of roundtrips to the server.
If the source/target is Redis, reading/writing of a batch is done in a single https://redis.io/topics/pipelining[command pipeline] to minimize the number of roundtrips to the server.

You can change the batch size (and hence pipeline size) using the `--batch` option.
The optimal batch size in terms of throughput depends on many factors like record size and command types (see {link_pipeline_tuning} for details).
The optimal batch size in terms of throughput depends on many factors like record size and command types (see https://stackoverflow.com/a/32165090[Redis Pipeline Tuning] for details).
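
As an illustrative sketch (the file name is arbitrary and `hset ...` stands in for the target command and its options), a larger pipeline size could be requested like this:

[source,console]
----
riot file-import --batch 500 orders.json hset ...
----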

[[_threads]]
[[_concepts_threads]]
== Multi-threading

It is possible to parallelize processing by using multiple threads.
By default, processing happens in a single thread, but it is possible to parallelize processing by using multiple threads.
In that configuration, each chunk of items is read, processed, and written in a separate thread of execution.
This is different from partitioning, where items would be read by multiple readers.
Here, only one reader is being accessed from multiple threads.
@@ -31,7 +31,7 @@ To set the number of threads, use the `--threads` option.
include::{testdir}/db-import-postgresql-multithreaded[]
----

[[_processing]]
[[_concepts_processing]]
== Processing

{project-title} lets you transform incoming records using processors.
@@ -69,7 +69,7 @@ You can register your own variables using `--var`.
include::{testdir}/file-import-process-var[]
----

[[_filters]]
[[_concepts_filtering]]
== Filtering

Filters allow you to exclude records that don't match a {link_spel} boolean expression.
@@ -80,3 +80,25 @@ For example, this filter will only keep records where the `value` field is a series of digits:
----
riot file-import --filter "value matches '\\d+'" ...
----

[[_concepts_replication]]
== Replication

Most Redis migration tools available today are offline in nature.
Migrating data from AWS ElastiCache to Redis Enterprise Cloud, for example, means backing up your ElastiCache data to an AWS S3 bucket and importing it into Redis Enterprise Cloud using its UI.
Redis has a replication command called https://redis.io/commands/replicaof[REPLICAOF] but it is not always available (see https://docs.aws.amazon.com/AmazonElastiCache/latest/red-ug/RestrictedCommands.html[ElastiCache restrictions]).

Instead, {project-title} implements client-side replication using *dump & restore* or *type-based read & write*.
Both snapshot and live replication modes are supported.

image::replication-architecture.svg[]

WARNING: Please note that {project-title} replication is NEITHER recommended NOR officially supported by Redis, Inc.

The basic replication mechanism is as follows:

1. Identify source keys to be replicated using scan and/or keyspace notifications depending on the <<_replication_mode,replication mode>>.

2. Read data associated with each key using <<_replication_type_dump,dump>> or <<_replication_type_struct,type-specific commands>>.

3. Write each key to the target using <<_replication_type_dump,restore>> or <<_replication_type_struct,type-specific commands>>.
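
For instance, a minimal snapshot replication between two databases might look like the following sketch (source and target Redis URIs are placeholders):

[source,console]
----
riot replicate redis://source-host:6379 redis://target-host:6379
----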

2 changes: 1 addition & 1 deletion docs/guide/src/docs/asciidoc/cookbook.adoc
@@ -4,7 +4,7 @@
Here are various recipes using {project-title}.

:leveloffset: +1
include::{includedir}/elasticache.adoc[]
include::{includedir}/changelog.adoc[]
include::{includedir}/elasticache.adoc[]
include::{includedir}/ping.adoc[]
:leveloffset: -1
122 changes: 122 additions & 0 deletions docs/guide/src/docs/asciidoc/databases.adoc
@@ -0,0 +1,122 @@
[[_db]]
= Databases

{project-title} includes two commands for interacting with relational databases:

* <<_db_import, `db-import`>>: Import database tables into Redis
* <<_db_export, `db-export`>>: Export Redis data structures to a database

[[_db_drivers]]
== Drivers

{project-title} relies on JDBC to interact with databases.
It includes JDBC drivers for the most common database systems:

* {link_jdbc_oracle}
+
`jdbc:oracle:thin:@myhost:1521:orcl`

* {link_jdbc_mssql}
+
`jdbc:sqlserver://[serverName[\instanceName][:portNumber]][;property=value[;property=value]]`

* {link_jdbc_mysql}
+
`jdbc:mysql://[host]:[port][/database][?properties]`

* {link_jdbc_postgres}
+
`jdbc:postgresql://host:port/database`

[TIP]
====
For databases not listed above, you must install the corresponding JDBC driver under the `lib` directory and modify the `CLASSPATH`:

* *nix: `bin/riot` -> `CLASSPATH=$APP_HOME/lib/myjdbc.jar:$APP_HOME/lib/...`
* Windows: `bin\riot.bat` -> `set CLASSPATH=%APP_HOME%\lib\myjdbc.jar;%APP_HOME%\lib\...`
====

[[_db_import]]
== Database Import

The `db-import` command imports data from a relational database into Redis.

NOTE: Ensure {project-title} has the relevant JDBC driver for your database.
See the <<_db_drivers,Drivers>> section for more details.

[source,console]
----
riot -h <redis host> -p <redis port> db-import --url <jdbc url> SQL [REDIS COMMAND...]
----

To show the full usage, run:

[source,console]
----
riot db-import --help
----

You must specify at least one Redis command as a target.

[IMPORTANT]
====
Redis connection options apply to the root command (`riot`) and not to subcommands.
In this example, the Redis options will not be taken into account:

[source,subs="verbatim,attributes"]
----
riot db-import "SELECT * FROM customers" hset -h myredis.com -p 6380
----
====
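
To have them taken into account, pass the Redis options to the root command instead:

[source,subs="verbatim,attributes"]
----
riot -h myredis.com -p 6380 db-import "SELECT * FROM customers" hset
----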


The keys that will be written are constructed from input records by concatenating the keyspace prefix and key fields.

image::mapping.svg[]
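
For instance, assuming an `orders` table with an `order_id` column (hypothetical names), hashes keyed `order:<order_id>` could be produced with a sketch like:

[source,console]
----
riot db-import --url "jdbc:postgresql://host:5432/mydb" "SELECT * FROM orders" hset --keyspace order --keys order_id
----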

.PostgreSQL Import Example
[source,console]
----
include::{testdir}/db-import-postgresql[]
----

.Import from PostgreSQL to JSON strings
[source,console]
----
include::{testdir}/db-import-postgresql-set[]
----

This will produce Redis strings that look like this:

[source,json]
----
include::{includedir}/../resources/order.json[]
----

[[_db_export]]
== Database Export

Use the `db-export` command to read from a Redis database and write to a SQL database.

NOTE: Ensure {project-title} has the relevant JDBC driver for your database.
See the <<_db_drivers,Drivers>> section for more details.

The general usage is:

[source,console]
----
riot -h <redis host> -p <redis port> db-export --url <jdbc url> SQL
----

To show the full usage, run:

[source,console]
----
riot db-export --help
----

.Example: export to PostgreSQL
[source,console]
----
include::{testdir}/db-export-postgresql[]
----


125 changes: 121 additions & 4 deletions docs/guide/src/docs/asciidoc/datagen.adoc
@@ -1,7 +1,124 @@
[[_datagen]]
= Data Generators
= Data Generation

{project-title} includes 2 commands for data generation.
{project-title} includes 2 commands for data generation:

* <<_datagen_struct,`generate`>>: Generate Redis data structures
* <<_datagen_faker,`faker`>>: Import data from {link_datafaker}

[[_datagen_struct]]
== Data Structure Generator

The `generate` command generates Redis data structures as well as JSON and TimeSeries data.

[source,console]
----
riot generate [OPTIONS]
----

.Example
[source,console]
----
include::{testdir}/generate[]
----

[[_datagen_faker]]
== Faker Generator

The `faker` command generates data using {link_datafaker}.

[source,console]
----
riot faker [OPTIONS] EXPRESSION... [REDIS COMMAND...]
----

where `EXPRESSION` is a {link_spel} field in the form `field="expression"`.

To show the full usage, run:

[source,console]
----
riot faker --help
----

You must specify at least one Redis command as a target.

[IMPORTANT]
====
Redis connection options apply to the root command (`riot`) and not to subcommands.
In this example, the Redis options will not be taken into account:

[source,subs="verbatim,attributes"]
----
riot faker id="index" hset -h myredis.com -p 6380
----
====
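
To have them taken into account, pass the Redis options to the root command instead:

[source,subs="verbatim,attributes"]
----
riot -h myredis.com -p 6380 faker id="index" hset
----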

[[_datagen_faker_keys]]
=== Keys

Keys are constructed from input records by concatenating the keyspace prefix and key fields.

image::mapping.svg[]

.Import into hashes
[source,console]
----
include::{testdir}/faker-hset[]
----

.Import into sets
[source,console]
----
include::{testdir}/faker-sadd[]
----

[[_datagen_faker_providers]]
=== Data Providers

Faker offers many data providers.
Most providers don't take any arguments and can be called directly:

.Simple Faker example
[source,console]
----
riot faker firstName="name.firstName"
----

Some providers take parameters:

.Parameter Faker example
[source,console]
----
riot faker lease="number.digits(2)"
----

Refer to {link_datafaker_doc} for complete documentation.

[[_datagen_faker_fields]]
=== Built-in Fields

In addition to the Faker fields specified with `field="expression"`, you can use these built-in fields:

`index`:: current iteration number.

`thread`:: current thread id.
Useful for multithreaded data generation.
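
For example, a sketch that keys records by the built-in `index` field (keyspace and field names are arbitrary):

[source,console]
----
riot faker id="index" firstName="name.firstName" hset --keyspace person --keys id
----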

.Multithreaded data generator
[source,console]
----
include::{testdir}/faker-threads[]
----

[[_datagen_faker_search]]
=== Redis Search

You can infer Faker fields from a Redis Search index using the `--infer` option:

[source,console]
----
include::{testdir}/faker-infer[]
----

include::{includedir}/faker.adoc[leveloffset=+1]
include::{includedir}/generate.adoc[leveloffset=+1]
25 changes: 0 additions & 25 deletions docs/guide/src/docs/asciidoc/db-export.adoc

This file was deleted.

