Commit
Scale out Clickhouse to a multinode cluster (temporarily disabled) (#3494)

## Replication

This PR implements an initial 2-replica, 3-coordinator ClickHouse setup. I've settled on this initial lean architecture because I want to avoid cluttering the system with potentially unnecessary additional nodes and using up our customers' resources. As we gauge the system alongside our first customers, we can decide whether we really need more replicas. Adding a replica is very straightforward, as we only need to make a few changes to the templates/service count and restart the ClickHouse services.

## Sharding

Sharding can prove to be very resource intensive, and we have yet to fully understand our customers' needs. I'd like to avoid a situation where we are prematurely optimising when we have so many unknowns. We also have not had time to perform long-running testing. See the official ClickHouse [recommendations](https://clickhouse.com/blog/common-getting-started-issues-with-clickhouse#2-going-horizontal-too-early). As with additional replicas, we can add shards if we find them to be necessary down the track.

## Testing

I have left most tests as a single node setup. It feels unnecessary to spin up so many things constantly. If people disagree, I can modify this.

I have run many manual tests, starting and stopping services, and so far the setup has held up.

Using a ClickHouse client:

```console
root@oxz_clickhouse_af08dce0-41ce-4922-8d51-0f546f23ff3e:~# ifconfig
<redacted>
oxControlService13:1: flags=21002000841<UP,RUNNING,MULTICAST,IPv6,FIXEDMTU> mtu 9000 index 2
        inet6 fd00:1122:3344:101::f/64
root@oxz_clickhouse_af08dce0-41ce-4922-8d51-0f546f23ff3e:~# cd /opt/oxide/clickhouse/
root@oxz_clickhouse_af08dce0-41ce-4922-8d51-0f546f23ff3e:/opt/oxide/clickhouse# ./clickhouse client --host fd00:1122:3344:101::f
ClickHouse client version 22.8.9.1.
Connecting to fd00:1122:3344:101::f:9000 as user default.
Connected to ClickHouse server version 22.8.9 revision 54460.

oximeter_cluster node 2 :) SELECT * FROM oximeter.fields_i64

SELECT *
FROM oximeter.fields_i64

Query id: dedbfbba-d949-49bd-9f9c-0f81a1240798

┌─timeseries_name───┬───────timeseries_key─┬─field_name─┬─field_value─┐
│ data_link:enabled │  9572423277405807617 │ link_id    │           0 │
│ data_link:enabled │ 12564290087547100823 │ link_id    │           0 │
│ data_link:enabled │ 16314114164963669893 │ link_id    │           0 │
│ data_link:link_up │  9572423277405807617 │ link_id    │           0 │
│ data_link:link_up │ 12564290087547100823 │ link_id    │           0 │
│ data_link:link_up │ 16314114164963669893 │ link_id    │           0 │
└───────────────────┴──────────────────────┴────────────┴─────────────┘

6 rows in set. Elapsed: 0.003 sec.
```

To retrieve information about the keepers, you can use the provided [commands](https://clickhouse.com/docs/en/guides/sre/keeper/clickhouse-keeper#four-letter-word-commands) within each of the keeper zones. Example:

```console
root@oxz_clickhouse_keeper_9b70b23c-a7c4-4102-a7a1-525537dcf463:~# ifconfig
<redacted>
oxControlService17:1: flags=21002000841<UP,RUNNING,MULTICAST,IPv6,FIXEDMTU> mtu 9000 index 2
        inet6 fd00:1122:3344:101::11/64
root@oxz_clickhouse_keeper_9b70b23c-a7c4-4102-a7a1-525537dcf463:~# echo mntr | nc fd00:1122:3344:101::11 9181
zk_version v22.8.9.1-lts-ac5a6cababc9153320b1892eece62e1468058c26
zk_avg_latency 0
zk_max_latency 1
zk_min_latency 0
zk_packets_received 60
zk_packets_sent 60
zk_num_alive_connections 1
zk_outstanding_requests 0
zk_server_state leader
zk_znode_count 6
zk_watch_count 1
zk_ephemerals_count 0
zk_approximate_data_size 1271
zk_key_arena_size 4096
zk_latest_snapshot_size 0
zk_followers 2
zk_synced_followers 2
```

Closes: #2158

## Update

As mentioned previously, we came to an agreement at the last control plane meeting that software installed on the racks should not diverge due to replicated ClickHouse.
This means that while ClickHouse [replication is functional](https://github.com/oxidecomputer/omicron/runs/16327261230) in this PR, it has been disabled in the last commit in the following manner:

- The `method_script.sh` for the `clickhouse` service is set to run in single node mode by default, but can be switched to run in replicated mode by [swapping a variable to false](https://github.com/oxidecomputer/omicron/pull/3494/files#diff-5475f31ccc4d46ea5ed682a38970067eacc337f0c6cd52581b1609f4ecce6071R31-R49). When we migrate all racks to a replicated ClickHouse setup, all logic related to running in single node mode will be removed from that file.
- The number of zones defined through RSS will stay the same. Instructions on how to tweak them to launch in replicated mode have been [left in the form of comments](https://github.com/oxidecomputer/omicron/pull/3494/files#diff-9ea2b79544fdd0a21914ea354fba0b3670258746b1350d900285445d399861e1R59-R64).

### Testing

I ran the full CI testing suite in both replicated and single node mode. You can find the replicated test results [here](https://github.com/oxidecomputer/omicron/runs/16327261230), and the single node results with replication disabled [here](https://github.com/oxidecomputer/omicron/runs/16360694678).

Additionally, I have added tests that validate the replicated db_init file [here](https://github.com/oxidecomputer/omicron/pull/3494/files#diff-6f5af870905bd92fc2c62db62d674d1e033edee57adbe7ea70d929c79cd03ba1R671), and incorporated [checks](https://github.com/oxidecomputer/omicron/pull/3494/files#diff-6f5af870905bd92fc2c62db62d674d1e033edee57adbe7ea70d929c79cd03ba1R664) in tests that validate whether a CH instance is part of a cluster or not.

### Next steps

To keep this PR compact (if you can call 2000 lines compact), I have created several issues from the review comments to tackle after this PR is merged. In prioritised order, these are:

- #3982 from this [comment](#3494)
- #3824 from this [comment](#3494)
- #3823 from this [comment](#3494)
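
The manual keeper check from the Testing section (`echo mntr | nc <keeper address> 9181`) could also be scripted. The following is a minimal sketch, not part of this PR: the function names `send_four_letter_command`, `parse_mntr` and `all_followers_synced` are hypothetical, and the only assumptions taken from the output above are the port `9181`, the `mntr` command, and the `zk_server_state` and `zk_synced_followers` keys. With 3 keepers, the leader is expected to report 2 synced followers.

```python
import socket
from typing import Dict

# Sample taken from the keeper output shown in the Testing section (abridged).
SAMPLE_MNTR = """zk_version v22.8.9.1-lts-ac5a6cababc9153320b1892eece62e1468058c26
zk_server_state leader
zk_znode_count 6
zk_followers 2
zk_synced_followers 2
"""


def send_four_letter_command(host: str, port: int = 9181,
                             command: str = "mntr",
                             timeout: float = 5.0) -> str:
    """Send a four-letter-word command to a keeper and return its reply.

    Hypothetical helper: equivalent to `echo mntr | nc <host> <port>`.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the keeper closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mntr(output: str) -> Dict[str, str]:
    """Parse `mntr` output into a dict, one whitespace-separated key/value per line."""
    stats: Dict[str, str] = {}
    for line in output.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            stats[parts[0]] = parts[1].strip()
    return stats


def all_followers_synced(stats: Dict[str, str], expected_keepers: int = 3) -> bool:
    """Return True if these stats come from a leader whose followers are all synced.

    Assumption: only the leader reports follower counts, so stats from a
    follower node return False here.
    """
    if stats.get("zk_server_state") != "leader":
        return False
    synced = int(stats.get("zk_synced_followers", "0"))
    return synced == expected_keepers - 1


if __name__ == "__main__":
    # Uses the captured sample; replace with send_four_letter_command(<address>)
    # to query a live keeper zone.
    print(all_followers_synced(parse_mntr(SAMPLE_MNTR)))
```

Parsing is kept separate from the network call so the check can be exercised against captured output, such as the sample above, without a running keeper.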
Showing 39 changed files with 2,140 additions and 131 deletions.