Batched PUT requests #205

Open · wants to merge 7 commits into base: dev/1.0.0
Conversation

@robagar commented Sep 3, 2024

This adds the ability to send multiple data points in a single PUT to InfluxDB, giving a huge performance gain for high frequency data.

Volume configuration:

  • put_batch_size - Maximum number of data points per PUT request
  • put_batch_timeout_ms - Milliseconds before sending a batch if not full

If put_batch_size is not set, the plugin reverts to the original behaviour of sending each data point in its own PUT request.
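For illustration, the options might be set on the InfluxDB volume roughly like this (a hedged sketch: the nesting and the url key follow the usual zenoh storage_manager volume configuration, the values are arbitrary, and only put_batch_size and put_batch_timeout_ms come from this PR):

{
  plugins: {
    storage_manager: {
      volumes: {
        influxdb: {
          url: "http://localhost:8086",  // assumed: standard volume setting
          put_batch_size: 100,           // from this PR: max points per PUT
          put_batch_timeout_ms: 250      // from this PR: flush a partial batch after 250 ms
        }
      }
    }
  }
}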

@gabrik (Contributor) commented Sep 3, 2024

Hi @robagar, thank you for your contribution.

In order to accept it we need you to sign the Eclipse Contributor Agreement (ECA).

@robagar (Author) commented Sep 3, 2024

ECA signed

@robagar marked this pull request as ready for review September 3, 2024 10:59
@Mallets (Member) commented Sep 4, 2024

@Charles-Schleich could you review the PR?
@robagar I see you are only targeting v1; before merging this kind of change it would be good to ensure that it is also available for v2.

@robagar (Author) commented Sep 4, 2024

We don't use InfluxDB v2, and have no intention to. It looks like a dead end, with Flux being abandoned for v3.

@Charles-Schleich (Member) commented

Hi @robagar, thank you for the contribution!
A couple of things to note:
We strive to keep the different versions of the InfluxDB plugin aligned in terms of features and behavior, so to consider merging PRs we would need the same batching feature in version 2 as well.
The configuration options would also need to be added to the README for both versions, so that users know they are available.
PRs are generally acceptable if they cover config, documentation, and tests, and ideally some performance tests/setup that we can reproduce on our side.

@robagar (Author) commented Sep 5, 2024

Skimming the code it looks fairly simple to add the batching to v2 - the client write method takes a stream as input - but frankly that would be better done by someone who is going to use & test it properly.
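For reference, a hedged sketch of what that could look like with the influxdb2 crate, whose Client::write takes a Stream of DataPoints, so a whole batch can go in one request (the batching plumbing around it is left out; treat the exact signatures as an assumption to verify against the crate docs):

use futures::stream;
use influxdb2::models::DataPoint;
use influxdb2::Client;

// Write a whole batch of points in a single request by passing
// them to the client as one stream.
async fn write_batch(
    client: &Client,
    bucket: &str,
    points: Vec<DataPoint>,
) -> Result<(), Box<dyn std::error::Error>> {
    client.write(bucket, stream::iter(points)).await?;
    Ok(())
}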

v1/src/lib.rs Outdated

let client_clone = client.clone();
let name_clone = config.name.clone();
TOKIO_RUNTIME.spawn(async move {
Review comment (Member):

The plugins can be linked statically or dynamically to an instance of zenohd.
In the case that we link statically, we want to reuse the Tokio runtime of zenohd and not use the TOKIO_RUNTIME stored in the lazy static.
This can be achieved by checking for a current Tokio handle and spawning on the correct executor:

let batch_future = async move { /* ... */ };
match tokio::runtime::Handle::try_current() {
    Ok(handle) => handle.spawn(batch_future),
    Err(_) => TOKIO_RUNTIME.spawn(batch_future),
};
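For context, here is a minimal, hypothetical sketch of what the spawned batch future could look like, assuming points arrive over a tokio mpsc channel as line-protocol strings; the names, the channel wiring, and the flush body are illustrative, not the PR's actual code:

use tokio::sync::mpsc;
use tokio::time::{sleep_until, Duration, Instant};

// Hypothetical batch loop: flushes when the batch is full or when
// `batch_timeout` has elapsed since the first point of the batch arrived.
async fn run_batcher(mut rx: mpsc::Receiver<String>, batch_size: usize, batch_timeout: Duration) {
    let mut batch: Vec<String> = Vec::with_capacity(batch_size);
    loop {
        // Block until the first point of the next batch arrives.
        match rx.recv().await {
            Some(point) => batch.push(point),
            None => break, // channel closed, nothing more to do
        }
        let deadline = Instant::now() + batch_timeout;
        // Keep filling until full, timed out, or the channel closes.
        while batch.len() < batch_size {
            tokio::select! {
                _ = sleep_until(deadline) => break,
                next = rx.recv() => match next {
                    Some(point) => batch.push(point),
                    None => break,
                },
            }
        }
        flush(&mut batch).await;
    }
}

async fn flush(batch: &mut Vec<String>) {
    if batch.is_empty() {
        return;
    }
    // A single PUT to InfluxDB would go here, e.g. joining the points
    // into one line-protocol body.
    let _body = batch.join("\n");
    batch.clear();
}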

Comment on lines +83 to +84
pub const PROP_STORAGE_PUT_BATCH_SIZE: &str = "put_batch_size";
pub const PROP_STORAGE_PUT_BATCH_TIMEOUT_MS: &str = "put_batch_timeout_ms";
Review comment (Member):

These config options must be documented in the README (there is a pending PR for an example config file).

Comment on lines +606 to +607
// TODO - add pending status
Ok(StorageInsertionResult::Inserted)
Review comment (Member):

I'm not 100% sure about returning a StorageInsertionResult::Inserted here, as the insertion could fail inside the batch after it has already been reported to zenoh as inserted. This will likely have implications for replication and storage alignment.

The suggestion to add a Pending status could work; however, we would then also need a Completed status to signal to zenoh that all of the Puts in a batch have succeeded as a group.
This may require an additional ID per Put when batching, and upon a successful write, the IDs of the batch would have to be signaled back to zenoh.

@JEnoch, what do we think about adding a Pending and Completed system?
It adds a little bit of complication, and I'd like to make sure we don't break alignment due to batching.
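To make the idea concrete, a rough sketch of what such a scheme might look like (entirely hypothetical: zenoh's actual StorageInsertionResult has other variants not shown here, and none of the Pending/Completed machinery below exists today):

// Hypothetical batch identifier assigned to each Put when batching.
pub type BatchId = u64;

// Sketch of an extended insertion result; existing variants elided.
pub enum StorageInsertionResult {
    Inserted,         // existing: written immediately, outcome known
    Pending(BatchId), // hypothetical: queued in a batch, outcome not yet known
}

// Hypothetical signal sent back to zenoh once a batched PUT resolves,
// so replication/alignment can account for every Put in the batch.
pub enum BatchOutcome {
    Completed(Vec<BatchId>), // all Puts in the batch succeeded as a group
    Failed(Vec<BatchId>),    // these Puts were lost and may need re-alignment
}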

v1/src/lib.rs Outdated
Comment on lines 410 to 413
Some(tx)
} else {
None
};
@Charles-Schleich (Member) commented Sep 5, 2024

Suggestion / nit: declare let mut put_batch_tx = None; above 322 and just overwrite it with a Some() value on 410.
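In shape, the suggestion amounts to something like this (illustrative only: the name mirrors the PR, the surrounding logic is elided, and the channel type is a stand-in):

use tokio::sync::mpsc;

// Declare once up front, overwrite only when batching is configured,
// instead of building the Option through an if/else expression.
fn batch_sender(batching_enabled: bool) -> Option<mpsc::Sender<String>> {
    let mut put_batch_tx = None;
    if batching_enabled {
        let (tx, _rx) = mpsc::channel::<String>(1024);
        put_batch_tx = Some(tx);
    }
    put_batch_tx
}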

@robagar (Author) commented Sep 26, 2024

Hiya, any chance of this making it into the 1.0.0 release?

@Mallets (Member) commented Oct 17, 2024

After some discussion, proper support of batching would require some work on the backend API.
There are internal mechanisms in Zenoh, like storage alignment, that require knowing whether the put of a message succeeded or not. With the current backend API, it's not possible to bubble up this information when batching is in place. This would lead to incorrect behaviour when replication is activated.

So I'm afraid some more work is required in Zenoh to properly support this use case.

@robagar (Author) commented Oct 17, 2024

OK, but please be aware that as it stands, writing to the InfluxDB backend is just too slow: it falls behind here even with ~100 Hz updates, which is hardly excessive.
