[question] Distributed controlplane #2350
Running a geo-distributed controlplane does impose some challenges, for example around etcd latency and load balancing.

I'll break this down in a bit more detail below. As in many other cases in engineering, this boils down to compromises. 😄 What I'd propose, and what I've seen a lot of k0s deployments (IoT, manufacturing, etc. use cases) doing, is the following:

Etcd: The issue is that etcd has some practical limits on latency when spread over a network, so it is most often deployed as a cluster within a single "region". To mitigate HA issues you could run it e.g. in a 3-node setup where each node sits in a different "zone".

LB: The LB is indeed a bit of a challenge, not only for this kind of deployment scenario but for k0s/k8s in general. There are a few alternatives I can see, outlined below.
We're working on a solution to mitigate the need for an external LB in k0s itself. The WIP solution is to have each worker run a node-local LB (of sorts) and have all the node-local components (kubelet, kube-proxy, etc.) connect to the API via that. That provides HA connectivity to all controlplane nodes (the kube API, mostly). We're hoping to get that into the 1.26 release.
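For reference, a rough sketch of what enabling that node-local load balancer could look like in k0s.yaml once the feature ships; the field names (`nodeLocalLoadBalancing`, `EnvoyProxy`) are assumptions based on the work-in-progress design and may differ in the released version:

```yaml
# k0s.yaml -- sketch only; field names assumed from the WIP node-local LB design
apiVersion: k0s.k0sproject.io/v1beta1
kind: ClusterConfig
metadata:
  name: k0s
spec:
  network:
    nodeLocalLoadBalancing:
      enabled: true        # each worker runs its own small LB for the kube API
      type: EnvoyProxy     # assumed default implementation
```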
@jnummelin yes, I'm aware of etcd's limits, but I would probably use kine with either postgres/mysql or even nats.
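A minimal sketch of what that could look like in k0s.yaml, assuming a reachable MySQL endpoint (host, credentials and database name below are placeholders):

```yaml
# k0s.yaml -- storage backed by kine instead of the built-in etcd
apiVersion: k0s.k0sproject.io/v1beta1
kind: ClusterConfig
metadata:
  name: k0s
spec:
  storage:
    type: kine
    kine:
      # placeholder DSN; point this at your MySQL (or PostgreSQL) endpoint
      dataSource: "mysql://k0s:password@tcp(mysql.example.com:3306)/kine"
```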
In the case of mysql/postgres, if they can tolerate higher latencies etc. then it might work much better. (I'm definitely no expert in clustering those.) So the main problem in your use case boils down to how to create a single address (an LB or something else) to which all node components can connect. That includes kubelet, kube-proxy and konnectivity-agent. Konnectivity-agent is maybe actually the most problematic one due to how it manages HA connections: there has to be an LB of sorts, as the agent can be configured with only a single address (the LB) through which it expects to get "routed" to all the servers. I know this is a somewhat awkward way of doing it, and we've opened a discussion upstream to see whether there are other ways to handle it.
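In k0s terms, that single address is what you'd put into `spec.api.externalAddress`; kubelet, kube-proxy and konnectivity-agent are then all pointed at it. A minimal sketch, with a placeholder DNS name for the LB:

```yaml
# k0s.yaml -- single controlplane address fronting all controllers
apiVersion: k0s.k0sproject.io/v1beta1
kind: ClusterConfig
metadata:
  name: k0s
spec:
  api:
    externalAddress: cp.example.com   # placeholder; DNS name of the LB/tunnel
    sans:
      - cp.example.com                # make sure the API cert also covers the LB name
```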
One idea that came to mind for the "LB" is to utilize services like ngrok, Cloudflare Tunnels, etc. In a nutshell, each controller registers a (TCP) "tunnel" on the service (e.g. ngrok). A tunnel has a dedicated DNS name (I assume) which you'd use as the cluster's externalAddress. Note: I have not tested this approach at all, but conceptually it should work :)
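To make that concrete, here is an untested sketch of what a per-controller ngrok agent config might look like; the tunnel names, token and overall config layout are assumptions (ngrok's config format also varies between agent versions), and the ports are the defaults k0s uses for the API server and konnectivity:

```yaml
# ngrok.yml on each controller -- untested sketch, names/format are assumptions
authtoken: "<your-ngrok-token>"   # placeholder
tunnels:
  k0s-api:
    proto: tcp
    addr: 6443      # expose the local kube-apiserver
  k0s-konnectivity:
    proto: tcp
    addr: 8132      # expose the konnectivity server as well
```

Whatever stable hostname the tunnel service gives you would then go into `spec.api.externalAddress` / `spec.api.sans` as in the earlier sketch.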
@jnummelin interesting, I hadn't thought about ngrok etc., but that would be interesting as well. Not sure if either of these can distribute requests between hosts though? I thought they were all about exposing one host, but it looks like they have fallbacks in place.

I really like your idea of running local load balancers so that local worker nodes basically use local controlplane nodes as well. I'm not entirely sure where it would abstract the "these are the cp nodes" part though. As in, would it only use local controlplane nodes, or prefer local ones and then fall back to cp nodes further away?

"Clustering" mysql/postgresql is not simple either, but maybe a replication topology would help, for example to be able to do proper backup and DR. It doesn't look like kine can split operations (yet), and I'm not sure if e.g. MySQL Galera is the answer, or MaxScale (or whatever the MariaDB proxy was called). Maybe kine would take a contribution so it could act as a proxy layer itself and split reads to replicas while using the main server for INSERT/UPDATE/DELETE.

Besides that, on latency: I think 80ms is normal for mysql/postgresql if you run a distributed database setup in, let's say, AWS us-east and AWS us-west. I've seen these work even across continents (or oceans). I'm not entirely sure how k8s deals with the lag or if it matters. In the video I posted, they mention that nats is fine up until ~250ms.
In these sorts of deployment cases you'd run plain controller nodes (i.e. no worker role on the controller machines).
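For concreteness, a "plain" controller is just one started without the worker role; a hedged sketch of the two variants (flags as in current k0s, but double-check against the docs for your version):

```sh
# controller-only node: runs the controlplane components, no kubelet/workloads
sudo k0s install controller -c /etc/k0s/k0s.yaml
sudo k0s start

# controller that also schedules workloads (not what you'd use here)
sudo k0s install controller --enable-worker -c /etc/k0s/k0s.yaml
```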
As said, I'm no expert in MySQL/Postgres, but in many other data stores these sorts of things are often managed/configured at the driver level, i.e. the driver selects the "master" for writes and replicas for read operations. But maybe the driver compiled into kine for MySQL does not understand these sorts of things.
It seems like r/w splitting is not supported by the Go drivers used by kine. I asked in k3s-io/kine#117. I guess, depending on the SQL flavour, you could put a SQL-aware proxy such as pgpool in front of the database and let it do the splitting; see the sketch below.
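From k0s's point of view that would just mean pointing kine's dataSource at the proxy endpoint instead of the database itself. A hedged sketch, assuming a pgpool instance on its default port 9999 (the host name is a placeholder):

```yaml
# k0s.yaml -- kine talks to the proxy, which splits reads/writes across replicas
apiVersion: k0s.k0sproject.io/v1beta1
kind: ClusterConfig
metadata:
  name: k0s
spec:
  storage:
    type: kine
    kine:
      # placeholder endpoint; pgpool (or similar) sits between kine and postgres
      dataSource: "postgres://k0s:password@pgpool.example.com:9999/kine?sslmode=disable"
```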
I remember pgpool also had a "nice" consensus algorithm which was painful to work with (in my experience). Not sure about the others. My colleague also mentioned Tailscale. It could be helpful to have the controller nodes available globally.
Tailscale is pretty cool in general, and it actually works with k0s in this sort of use case. We did a demo with it about a year ago. Of course the Tailscale part is something you need to handle yourself; k0s is not able to integrate with it directly.
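Roughly, the idea would be to join all controllers (and workers) to the same tailnet and hand out the tailnet names/IPs as the addresses the node components use. A minimal, untested sketch with made-up MagicDNS names, assuming Tailscale is already up on each node:

```yaml
# k0s.yaml -- make the API cert valid for the controllers' tailnet names (placeholders)
apiVersion: k0s.k0sproject.io/v1beta1
kind: ClusterConfig
metadata:
  name: k0s
spec:
  api:
    sans:
      - controller-1.tailnet-example.ts.net
      - controller-2.tailnet-example.ts.net
      - controller-3.tailnet-example.ts.net
```

Workers would then reach the controllers over the WireGuard mesh instead of the public internet, which sidesteps the "announce IP space across providers" problem.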
The issue is marked as stale since no activity has been recorded in 30 days |
Hi,
was wondering if anyone had given any thought to a completely distributed controlplane setup?
As in, I want to run controlplane nodes across different providers/datacenters. Right now the loadbalancer setup etc. seems to assume that I run a loadbalancer which proxies all my controlplane nodes. But how would this work across different providers (where I don't have the liberty of announcing IP space, giant mesh networks, etc.)?
Or let's say in an IoT use case where clusters are heavily distributed. Is there an alternative (proposed) setup for these cases? E.g. run lots of single-node k0s instances?
The alternative is running the controlplane in one place and workers in others, but then it's kinda like putting all 🥚 in one 🧺.
Thanks!