
Minimal reproduction code for a ClickHouse issue related to Distributed+Replicated+MaterializedView

ClickHouse/ClickHouse#54643

EDIT: This is actually expected behavior: if the internal_replication cluster parameter isn't set to true, your data will be replicated both at the Distributed engine level and at the ReplicatedMerge... engine level, leading to duplicates in the destination tables. Setting internal_replication=true thus solves the problem. As explained in ClickHouse's docs (https://clickhouse.com/docs/en/engines/table-engines/special/distributed#distributed-writing-data), you may choose one of two strategies (one, but not both 😅):

  • write to the local tables (e.g., by load-balancing your client connections across your various ClickHouse nodes via a TCP proxy) and SELECT from the Distributed tables,
  • write to the Distributed tables but always set internal_replication=true in your cluster's XML configuration (see the sketch below).
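For reference, here is a minimal sketch of what that XML could look like. The cluster name, hostnames, and ports are illustrative, not taken from this repo's docker-compose:

```xml
<clickhouse>
    <remote_servers>
        <my_cluster> <!-- illustrative cluster name -->
            <shard>
                <!-- With internal_replication=true, the Distributed engine
                     writes to only one replica per shard and lets the
                     Replicated* engine propagate the data to the others,
                     instead of both layers replicating (hence duplicates). -->
                <internal_replication>true</internal_replication>
                <replica>
                    <host>clickhouse-node1</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>clickhouse-node2</host>
                    <port>9000</port>
                </replica>
            </shard>
        </my_cluster>
    </remote_servers>
</clickhouse>
```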

This is a minimal docker-compose stack to reproduce a weird behavior/"bug" we observed when combining ClickHouse Distributed tables with Replicated... tables for which a Materialized View that computes some basic aggregates (e.g., sums) is set up.
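Concretely, the setup looks roughly like the following sketch. Table and column names are assumptions for illustration, not the repo's actual DDL:

```sql
-- Local replicated input table on every node of the cluster.
CREATE TABLE events_local ON CLUSTER my_cluster
(
    key   UInt64,
    value UInt64
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/events_local', '{replica}')
ORDER BY key;

-- Distributed facade over the local replicated tables.
CREATE TABLE events_dist ON CLUSTER my_cluster AS events_local
ENGINE = Distributed(my_cluster, default, events_local, rand());

-- Materialized view computing a basic aggregate (a sum) from the local table.
CREATE MATERIALIZED VIEW events_sums_mv ON CLUSTER my_cluster
ENGINE = ReplicatedAggregatingMergeTree('/clickhouse/tables/{shard}/events_sums', '{replica}')
ORDER BY key
AS SELECT key, sumState(value) AS value_sum
   FROM events_local
   GROUP BY key;
```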

  1. INSERTing into the Distributed table, then SELECTing the overall counts from the input table and the MV's aggregated sums, we see that the counts are completely broken:
     ./it_fails.sh
  2. If, instead, we INSERT into local Replicated... tables at random nodes (mimicking, e.g., an HAProxy round-robin load balancing), all the counts are perfectly correct in both the input tables and the MV's aggregates:
     ./it_works.sh

As you can see in main.py, the only change between ./it_fails.sh and ./it_works.sh is the USE_DISTRIBUTED flag, which chooses either the Distributed table as the INSERT point or a randomly picked node's local ReplicatedAggregatingMergeTree table.
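A rough sketch of what that branch looks like (names, hosts, and table identifiers are assumptions; the real logic lives in main.py, and clickhouse_driver is just one possible client library):

```python
import random
from clickhouse_driver import Client  # assumes the clickhouse-driver package

USE_DISTRIBUTED = True  # flip to False to get the ./it_works.sh behavior

NODES = ["node1", "node2", "node3"]  # illustrative hostnames
rows = [(i, 1) for i in range(1000)]

if USE_DISTRIBUTED:
    # INSERT through the Distributed table on any node. With
    # internal_replication=false, the Distributed engine itself fans the
    # rows out to every replica, duplicating what Replicated* also does.
    client = Client(host=NODES[0])
    client.execute("INSERT INTO events_dist (key, value) VALUES", rows)
else:
    # INSERT into a randomly picked node's local Replicated* table,
    # mimicking an HAProxy round robin; only Replicated* replicates.
    client = Client(host=random.choice(NODES))
    client.execute("INSERT INTO events_local (key, value) VALUES", rows)
```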
