# Scale out Clickhouse to a multinode cluster (temporarily disabled) (#3494)

## Replication

This PR implements an initial ClickHouse setup with two replicas and three coordinator (keeper) nodes.

I've settled on this lean initial architecture because I want to avoid
cluttering the deployment with what may be unnecessary additional nodes
and using up our customers' resources. As we gauge the system alongside
our first customers, we can decide whether we really need more replicas.
Adding a replica later is straightforward: we only need to make a few
changes to the templates/service count and restart the ClickHouse
services.

## Sharding

Sharding can prove to be very resource intensive, and we have yet to
fully understand our customers' needs. I'd like to avoid prematurely
optimising while we have so many unknowns, and we have not yet had time
to perform long-running testing. See the official ClickHouse
[recommendations](https://clickhouse.com/blog/common-getting-started-issues-with-clickhouse#2-going-horizontal-too-early).

As with additional replicas, we can add shards down the track if we find
them necessary.

## Testing

I have left most tests as a single-node setup; it feels unnecessary to
spin up so many services constantly. If people disagree, I can modify
this.
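
For reference, the single-node path these tests use goes through the constructor renamed in this PR. A rough sketch (inside an async test, with error handling collapsed to `expect`), based on the `dev::clickhouse` API shown in the diffs below:

```rust
// Rough sketch of the single-node test path; the replicated variant is
// exercised separately and is not shown here.
let mut db = dev::clickhouse::ClickHouseInstance::new_single_node(0)
    .await
    .expect("failed to start single-node ClickHouse");
let port = db.port();
// ... point an oximeter_db::Client at `port` and run the test ...
db.cleanup().await.expect("failed to clean up ClickHouse");
```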

I have run many manual tests, starting and stopping services, and so far
the setup has held up.


Using a ClickHouse client:

```console
root@oxz_clickhouse_af08dce0-41ce-4922-8d51-0f546f23ff3e:~# ifconfig
<redacted>
oxControlService13:1: flags=21002000841<UP,RUNNING,MULTICAST,IPv6,FIXEDMTU> mtu 9000 index 2
	inet6 fd00:1122:3344:101::f/64 
root@oxz_clickhouse_af08dce0-41ce-4922-8d51-0f546f23ff3e:~# cd /opt/oxide/clickhouse/
root@oxz_clickhouse_af08dce0-41ce-4922-8d51-0f546f23ff3e:/opt/oxide/clickhouse# ./clickhouse client --host fd00:1122:3344:101::f
ClickHouse client version 22.8.9.1.
Connecting to fd00:1122:3344:101::f:9000 as user default.
Connected to ClickHouse server version 22.8.9 revision 54460.

oximeter_cluster node 2 :) SELECT * FROM oximeter.fields_i64

SELECT *
FROM oximeter.fields_i64

Query id: dedbfbba-d949-49bd-9f9c-0f81a1240798

┌─timeseries_name───┬───────timeseries_key─┬─field_name─┬─field_value─┐
│ data_link:enabled │  9572423277405807617 │ link_id    │           0 │
│ data_link:enabled │ 12564290087547100823 │ link_id    │           0 │
│ data_link:enabled │ 16314114164963669893 │ link_id    │           0 │
│ data_link:link_up │  9572423277405807617 │ link_id    │           0 │
│ data_link:link_up │ 12564290087547100823 │ link_id    │           0 │
│ data_link:link_up │ 16314114164963669893 │ link_id    │           0 │
└───────────────────┴──────────────────────┴────────────┴─────────────┘

6 rows in set. Elapsed: 0.003 sec. 

```

To retrieve information about the keepers, you can use the provided
four-letter-word
[commands](https://clickhouse.com/docs/en/guides/sre/keeper/clickhouse-keeper#four-letter-word-commands)
within each of the keeper zones.

Example:

```console
root@oxz_clickhouse_keeper_9b70b23c-a7c4-4102-a7a1-525537dcf463:~# ifconfig
<redacted>
oxControlService17:1: flags=21002000841<UP,RUNNING,MULTICAST,IPv6,FIXEDMTU> mtu 9000 index 2
	inet6 fd00:1122:3344:101::11/64 
root@oxz_clickhouse_keeper_9b70b23c-a7c4-4102-a7a1-525537dcf463:~# echo mntr | nc fd00:1122:3344:101::11 9181
zk_version	v22.8.9.1-lts-ac5a6cababc9153320b1892eece62e1468058c26
zk_avg_latency	0
zk_max_latency	1
zk_min_latency	0
zk_packets_received	60
zk_packets_sent	60
zk_num_alive_connections	1
zk_outstanding_requests	0
zk_server_state	leader
zk_znode_count	6
zk_watch_count	1
zk_ephemerals_count	0
zk_approximate_data_size	1271
zk_key_arena_size	4096
zk_latest_snapshot_size	0
zk_followers	2
zk_synced_followers	2

```

Closes: #2158

## Update

As mentioned previously, we agreed at the last control plane meeting
that the software installed on the racks should not diverge because of
replicated ClickHouse. This means that while ClickHouse
[replication is
functional](https://github.com/oxidecomputer/omicron/runs/16327261230)
in this PR, it has been disabled in the last commit in the following
manner (a condensed sketch of the runtime mode detection follows the
list):

- The `method_script.sh` for the `clickhouse` service is set to run in
single-node mode by default, but can be switched to replicated mode by
[swapping a variable to
false](https://github.com/oxidecomputer/omicron/pull/3494/files#diff-5475f31ccc4d46ea5ed682a38970067eacc337f0c6cd52581b1609f4ecce6071R31-R49).
When we migrate all racks to a replicated ClickHouse setup, all logic
related to running in single-node mode will be removed from that file.
- The number of zones defined through RSS will stay the same.
Instructions on how to tweak them to launch in replicated mode have been
[left in the form of
comments](https://github.com/oxidecomputer/omicron/pull/3494/files#diff-9ea2b79544fdd0a21914ea354fba0b3670258746b1350d900285445d399861e1R59-R64).
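
For reference, the runtime side of this toggle lives in the oximeter collector: it asks ClickHouse whether the instance it connected to is part of the oximeter cluster and then runs the matching schema initialisation. A condensed sketch of that logic (the full change is in the `oximeter/collector/src/lib.rs` diff below):

```rust
// Condensed from the oximeter collector changes in this PR: detect whether
// the ClickHouse server is part of a cluster, then initialise either the
// replicated or the single-node schema.
let client = Client::new(db_address, &log);
if client.is_oximeter_cluster().await? {
    client.init_replicated_db().await?;
} else {
    client.init_single_node_db().await?;
}
```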

### Testing

I ran the full CI test suite in both replicated and single-node modes.
You can find the replicated test results
[here](https://github.com/oxidecomputer/omicron/runs/16327261230), and
the single-node (replication disabled) results
[here](https://github.com/oxidecomputer/omicron/runs/16360694678).

Additionally, I have added tests that validate the replicated db_init
file
[here](https://github.com/oxidecomputer/omicron/pull/3494/files#diff-6f5af870905bd92fc2c62db62d674d1e033edee57adbe7ea70d929c79cd03ba1R671),
and incorporated
[checks](https://github.com/oxidecomputer/omicron/pull/3494/files#diff-6f5af870905bd92fc2c62db62d674d1e033edee57adbe7ea70d929c79cd03ba1R664)
in tests that validate whether a ClickHouse instance is part of a
cluster or not.
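
For illustration, the cluster-membership check is essentially an assertion around `is_oximeter_cluster()`; a hypothetical sketch (not the exact test code) of what it asserts for a single-node instance:

```rust
// Hypothetical shape of the cluster-membership check (not the exact test):
// a single-node ClickHouse instance should report that it is not part of
// the oximeter cluster, while a replicated instance should report that it is.
let client = Client::new(address, &log);
assert!(!client.is_oximeter_cluster().await.unwrap());
```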

### Next steps

To keep this PR compact (if you can call 2000 lines compact), I have
created several issues from the review comments to tackle after this PR
is merged. In prioritised order, these are:

- #3982 from this [comment](#3494)
- #3824 from this [comment](#3494)
- #3823 from this [comment](#3494)
karencfv authored Sep 5, 2023
1 parent 4011e4a commit 0319d2c
Showing 39 changed files with 2,140 additions and 131 deletions.
1 change: 1 addition & 0 deletions .github/buildomat/jobs/package.sh
@@ -104,6 +104,7 @@ ptime -m ./tools/build-global-zone-packages.sh "$tarball_src_dir" /work
mkdir -p /work/zones
zones=(
out/clickhouse.tar.gz
out/clickhouse_keeper.tar.gz
out/cockroachdb.tar.gz
out/crucible-pantry.tar.gz
out/crucible.tar.gz
1 change: 1 addition & 0 deletions common/src/address.rs
@@ -36,6 +36,7 @@ pub const PROPOLIS_PORT: u16 = 12400;
pub const COCKROACH_PORT: u16 = 32221;
pub const CRUCIBLE_PORT: u16 = 32345;
pub const CLICKHOUSE_PORT: u16 = 8123;
pub const CLICKHOUSE_KEEPER_PORT: u16 = 9181;
pub const OXIMETER_PORT: u16 = 12223;
pub const DENDRITE_PORT: u16 = 12224;
pub const DDMD_PORT: u16 = 8000;
2 changes: 1 addition & 1 deletion dev-tools/src/bin/omicron-dev.rs
@@ -268,7 +268,7 @@ async fn cmd_clickhouse_run(args: &ChRunArgs) -> Result<(), anyhow::Error> {

// Start the database server process, possibly on a specific port
let mut db_instance =
dev::clickhouse::ClickHouseInstance::new(args.port).await?;
dev::clickhouse::ClickHouseInstance::new_single_node(args.port).await?;
println!(
"omicron-dev: running ClickHouse with full command:\n\"clickhouse {}\"",
db_instance.cmdline().join(" ")
18 changes: 16 additions & 2 deletions internal-dns-cli/src/bin/dnswait.rs
@@ -23,21 +23,31 @@ struct Opt {
#[clap(long, action)]
nameserver_addresses: Vec<SocketAddr>,

/// service name to be resolved (should be the target of a DNS name)
/// Service name to be resolved (should be the target of a DNS name)
#[arg(value_enum)]
srv_name: ServiceName,

/// Output service host names only, omitting the port
#[clap(long, short = 'H', action)]
hostname_only: bool,
}

#[derive(Debug, Clone, Copy, ValueEnum)]
#[value(rename_all = "kebab-case")]
enum ServiceName {
Cockroach,
Clickhouse,
ClickhouseKeeper,
}

impl From<ServiceName> for internal_dns::ServiceName {
fn from(value: ServiceName) -> Self {
match value {
ServiceName::Cockroach => internal_dns::ServiceName::Cockroach,
ServiceName::Clickhouse => internal_dns::ServiceName::Clickhouse,
ServiceName::ClickhouseKeeper => {
internal_dns::ServiceName::ClickhouseKeeper
}
}
}
}
@@ -91,7 +101,11 @@ async fn main() -> Result<()> {
.context("unexpectedly gave up")?;

for (target, port) in result {
println!("{}:{}", target, port)
if opt.hostname_only {
println!("{}", target)
} else {
println!("{}:{}", target, port)
}
}

Ok(())
4 changes: 4 additions & 0 deletions internal-dns/src/config.rs
@@ -422,6 +422,10 @@ mod test {
#[test]
fn display_srv_service() {
assert_eq!(ServiceName::Clickhouse.dns_name(), "_clickhouse._tcp",);
assert_eq!(
ServiceName::ClickhouseKeeper.dns_name(),
"_clickhouse-keeper._tcp",
);
assert_eq!(ServiceName::Cockroach.dns_name(), "_cockroach._tcp",);
assert_eq!(ServiceName::InternalDns.dns_name(), "_nameservice._tcp",);
assert_eq!(ServiceName::Nexus.dns_name(), "_nexus._tcp",);
3 changes: 3 additions & 0 deletions internal-dns/src/names.rs
@@ -17,6 +17,7 @@ pub const DNS_ZONE_EXTERNAL_TESTING: &str = "oxide-dev.test";
#[derive(Clone, Debug, Hash, Eq, Ord, PartialEq, PartialOrd)]
pub enum ServiceName {
Clickhouse,
ClickhouseKeeper,
Cockroach,
InternalDns,
ExternalDns,
@@ -38,6 +39,7 @@
fn service_kind(&self) -> &'static str {
match self {
ServiceName::Clickhouse => "clickhouse",
ServiceName::ClickhouseKeeper => "clickhouse-keeper",
ServiceName::Cockroach => "cockroach",
ServiceName::ExternalDns => "external-dns",
ServiceName::InternalDns => "nameservice",
@@ -61,6 +63,7 @@
pub(crate) fn dns_name(&self) -> String {
match self {
ServiceName::Clickhouse
| ServiceName::ClickhouseKeeper
| ServiceName::Cockroach
| ServiceName::InternalDns
| ServiceName::ExternalDns
2 changes: 1 addition & 1 deletion nexus/benches/setup_benchmark.rs
@@ -29,7 +29,7 @@ async fn do_crdb_setup() {
// Wraps exclusively the ClickhouseDB portion of setup/teardown.
async fn do_clickhouse_setup() {
let mut clickhouse =
dev::clickhouse::ClickHouseInstance::new(0).await.unwrap();
dev::clickhouse::ClickHouseInstance::new_single_node(0).await.unwrap();
clickhouse.cleanup().await.unwrap();
}

4 changes: 4 additions & 0 deletions nexus/db-model/src/dataset_kind.rs
@@ -19,6 +19,7 @@ impl_enum_type!(
Crucible => b"crucible"
Cockroach => b"cockroach"
Clickhouse => b"clickhouse"
ClickhouseKeeper => b"clickhouse_keeper"
ExternalDns => b"external_dns"
InternalDns => b"internal_dns"
);
@@ -35,6 +36,9 @@ impl From<internal_api::params::DatasetKind> for DatasetKind {
internal_api::params::DatasetKind::Clickhouse => {
DatasetKind::Clickhouse
}
internal_api::params::DatasetKind::ClickhouseKeeper => {
DatasetKind::ClickhouseKeeper
}
internal_api::params::DatasetKind::ExternalDns => {
DatasetKind::ExternalDns
}
2 changes: 1 addition & 1 deletion nexus/db-model/src/schema.rs
@@ -1130,7 +1130,7 @@ table! {
///
/// This should be updated whenever the schema is changed. For more details,
/// refer to: schema/crdb/README.adoc
pub const SCHEMA_VERSION: SemverVersion = SemverVersion::new(3, 0, 3);
pub const SCHEMA_VERSION: SemverVersion = SemverVersion::new(4, 0, 0);

allow_tables_to_appear_in_same_query!(
system_update,
4 changes: 4 additions & 0 deletions nexus/db-model/src/service_kind.rs
@@ -18,6 +18,7 @@ impl_enum_type!(

// Enum values
Clickhouse => b"clickhouse"
ClickhouseKeeper => b"clickhouse_keeper"
Cockroach => b"cockroach"
Crucible => b"crucible"
CruciblePantry => b"crucible_pantry"
@@ -54,6 +55,9 @@ impl From<internal_api::params::ServiceKind> for ServiceKind {
internal_api::params::ServiceKind::Clickhouse => {
ServiceKind::Clickhouse
}
internal_api::params::ServiceKind::ClickhouseKeeper => {
ServiceKind::ClickhouseKeeper
}
internal_api::params::ServiceKind::Cockroach => {
ServiceKind::Cockroach
}
4 changes: 3 additions & 1 deletion nexus/test-utils/src/lib.rs
@@ -321,7 +321,9 @@ impl<'a, N: NexusServer> ControlPlaneTestContextBuilder<'a, N> {
let log = &self.logctx.log;
debug!(log, "Starting Clickhouse");
let clickhouse =
dev::clickhouse::ClickHouseInstance::new(0).await.unwrap();
dev::clickhouse::ClickHouseInstance::new_single_node(0)
.await
.unwrap();
let port = clickhouse.port();

let zpool_id = Uuid::new_v4();
5 changes: 4 additions & 1 deletion nexus/tests/integration_tests/oximeter.rs
@@ -110,7 +110,10 @@ async fn test_oximeter_reregistration() {
);
let client =
oximeter_db::Client::new(ch_address.into(), &context.logctx.log);
client.init_db().await.expect("Failed to initialize timeseries database");
client
.init_single_node_db()
.await
.expect("Failed to initialize timeseries database");

// Helper to retrieve the timeseries from ClickHouse
let timeseries_name = "integration_target:integration_metric";
4 changes: 4 additions & 0 deletions nexus/types/src/internal_api/params.rs
@@ -125,6 +125,7 @@ pub enum DatasetKind {
Crucible,
Cockroach,
Clickhouse,
ClickhouseKeeper,
ExternalDns,
InternalDns,
}
Expand All @@ -136,6 +137,7 @@ impl fmt::Display for DatasetKind {
Crucible => "crucible",
Cockroach => "cockroach",
Clickhouse => "clickhouse",
ClickhouseKeeper => "clickhouse_keeper",
ExternalDns => "external_dns",
InternalDns => "internal_dns",
};
@@ -168,6 +170,7 @@ pub struct ServiceNic {
#[serde(rename_all = "snake_case", tag = "type", content = "content")]
pub enum ServiceKind {
Clickhouse,
ClickhouseKeeper,
Cockroach,
Crucible,
CruciblePantry,
@@ -186,6 +189,7 @@ impl fmt::Display for ServiceKind {
use ServiceKind::*;
let s = match self {
Clickhouse => "clickhouse",
ClickhouseKeeper => "clickhouse_keeper",
Cockroach => "cockroach",
Crucible => "crucible",
ExternalDns { .. } => "external_dns",
15 changes: 15 additions & 0 deletions openapi/nexus-internal.json
@@ -922,6 +922,7 @@
"crucible",
"cockroach",
"clickhouse",
"clickhouse_keeper",
"external_dns",
"internal_dns"
]
@@ -2803,6 +2804,20 @@
"type"
]
},
{
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"clickhouse_keeper"
]
}
},
"required": [
"type"
]
},
{
"type": "object",
"properties": {
33 changes: 33 additions & 0 deletions openapi/sled-agent.json
@@ -1091,6 +1091,20 @@
"type"
]
},
{
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"clickhouse_keeper"
]
}
},
"required": [
"type"
]
},
{
"type": "object",
"properties": {
@@ -2524,6 +2538,24 @@
"type"
]
},
{
"type": "object",
"properties": {
"address": {
"type": "string"
},
"type": {
"type": "string",
"enum": [
"clickhouse_keeper"
]
}
},
"required": [
"address",
"type"
]
},
{
"type": "object",
"properties": {
@@ -3115,6 +3147,7 @@
"type": "string",
"enum": [
"clickhouse",
"clickhouse_keeper",
"cockroach_db",
"crucible_pantry",
"crucible",
7 changes: 6 additions & 1 deletion oximeter/collector/src/lib.rs
@@ -321,7 +321,12 @@ impl OximeterAgent {
)
};
let client = Client::new(db_address, &log);
client.init_db().await?;
let replicated = client.is_oximeter_cluster().await?;
if !replicated {
client.init_single_node_db().await?;
} else {
client.init_replicated_db().await?;
}

// Spawn the task for aggregating and inserting all metrics
tokio::spawn(async move {
8 changes: 4 additions & 4 deletions oximeter/db/src/bin/oxdb.rs
@@ -148,7 +148,7 @@ async fn make_client(
let address = SocketAddr::new(address, port);
let client = Client::new(address, &log);
client
.init_db()
.init_single_node_db()
.await
.context("Failed to initialize timeseries database")?;
Ok(client)
@@ -261,13 +261,13 @@ async fn populate(
Ok(())
}

async fn wipe_db(
async fn wipe_single_node_db(
address: IpAddr,
port: u16,
log: Logger,
) -> Result<(), anyhow::Error> {
let client = make_client(address, port, &log).await?;
client.wipe_db().await.context("Failed to wipe database")
client.wipe_single_node_db().await.context("Failed to wipe database")
}

async fn query(
@@ -313,7 +313,7 @@ async fn main() {
.unwrap();
}
Subcommand::Wipe => {
wipe_db(args.address, args.port, log).await.unwrap()
wipe_single_node_db(args.address, args.port, log).await.unwrap()
}
Subcommand::Query {
timeseries_name,