add proposal for Locality LoadBalance #574

Open
wants to merge 5 commits into
base: main

derekwin
Contributor

What type of PR is this?
/kind enhancement

What this PR does / why we need it:
add proposal for Locality LB

@kmesh-bot added the kind/enhancement (New feature or request) label on Jul 15, 2024
@kmesh-bot
Collaborator

Welcome @derekwin! It looks like this is your first PR to kmesh-net/kmesh 🎉

@LiZhenCheng9527
Collaborator

Would you like to share your issue at Thursday's community meeting?


### Motivation

Currently, kmesh does not support locality topology-aware load balancing. Locality Load Balancing optimizes performance and reliability in distributed systems by directing traffic to the nearest service instances. This reduces latency, enhances availability, and lowers costs associated with cross-region data transfers. It also helps ensure compliance with data sovereignty regulations and improves overall user experience by providing faster and more reliable service responses.
Collaborator

Suggested change
- Currently, kmesh does not support locality topology-aware load balancing. Locality Load Balancing optimizes performance and reliability in distributed systems by directing traffic to the nearest service instances. This reduces latency, enhances availability, and lowers costs associated with cross-region data transfers. It also helps ensure compliance with data sovereignty regulations and improves overall user experience by providing faster and more reliable service responses.
+ Currently, Kmesh does not support locality topology-aware load balancing. Locality Load Balancing optimizes performance and reliability in distributed systems by directing traffic to the nearest service instances. This reduces latency, enhances availability, and lowers costs associated with cross-region data transfers. It also helps ensure compliance with data sovereignty regulations and improves overall user experience by providing faster and more reliable service responses.

Please capitalise "Kmesh" consistently.


#### case 1. locality failover
1. Destination Rule
Same as Istion. Parse rules specify configuration for Locality load balancing. (todo: outlier detection settings to detect and evict unhealthy hosts from the load balancing pool.)
Collaborator

What is "Istion"? Do you mean Istio?

@hzxuzhonghu
Member

/ok-to-test

codecov bot commented Jul 16, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 52.80%. Comparing base (433592b) to head (b2caa7c).
Report is 217 commits behind head on main.

see 29 files with indirect coverage changes



Currently, kmesh does not support locality topology-aware load balancing. Locality Load Balancing optimizes performance and reliability in distributed systems by directing traffic to the nearest service instances. This reduces latency, enhances availability, and lowers costs associated with cross-region data transfers. It also helps ensure compliance with data sovereignty regulations and improves overall user experience by providing faster and more reliable service responses.

#### Goals
Member

Suggested change
- #### Goals
+ ### Goals


1. Prioritize adding locality load balancing capabilities in workload mode.

2. Support two types of locality load balancing: locality failover and locality weighted distribution.
Member

I am not sure how locality weighted distribution can be implemented in workload mode. The workload API does not actually support weights.

#### case 1. locality failover
1. Destination Rule
Same as Istion. Parse rules specify configuration for Locality load balancing. (todo: outlier detection settings to detect and evict unhealthy hosts from the load balancing pool.)
- Outlier detection should occur before load balancing.
Member

This does not suit workload mode, as the workload API does not include outlier settings. It does LB based on where the endpoint resides.

@derekwin
Contributor Author

> Would you like to share your issue at Thursday's community meeting?

Yes.

@kmesh-bot added the size/L label and removed the size/M label on Jul 25, 2024
@derekwin
Contributor Author

I have updated the proposal.

@derekwin
Contributor Author

I propose a new implementation of the location matching algorithm that avoids circular computations and also reduces the amount of data that needs to be stored in BPF maps. Details: https://github.com/derekwin/treemap/tree/master
Suggestions for further improving the approach are welcome.

@Okabe-Rintarou-0
Member

Okabe-Rintarou-0 commented Jul 30, 2024

If there is no conflict, there is no need to merge the main branch.
If there are conflicts, you can get a clearer commit history by running:

git rebase main

then fix the conflicts, and then run

git rebase --continue
git push --force

The DCO GitHub action failed because it requires your commits to be signed off, which you can do with the -s flag:

git commit -s -m 'something to say'

Member

@hzxuzhonghu left a comment

I would like to see more API design in the proposal, rather than function implementation.

How do you express the priority level, and how do you match the client locality with the endpoints?


1. Prioritize adding locality load balancing capabilities in workload mode.

2. Locality load balancing mode: locality failover.
Member

How about strict mode?
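
For context, the Istio workload API's load-balancing mode distinguishes STRICT (only endpoints that meet all routing preferences are considered) from FAILOVER (full matches are preferred, with fallback to endpoints that match fewer preferences). A minimal sketch of that eligibility rule follows. The enum, function, and parameter names are hypothetical and are not taken from the proposal or from Kmesh.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; only the STRICT/FAILOVER semantics come from the
 * workload API's load-balancing mode. */
enum lb_mode { LB_MODE_STRICT, LB_MODE_FAILOVER };

/*
 * matched: how many of the service's routing preferences this endpoint
 *          satisfies (0..total).
 * total:   number of routing preferences configured for the service.
 *
 * Returns whether the endpoint may be selected at all.
 */
static bool endpoint_is_candidate(enum lb_mode mode, unsigned matched, unsigned total)
{
    assert(matched <= total);
    if (mode == LB_MODE_STRICT)
        return matched == total; /* strict: only full matches are eligible */
    return true;                 /* failover: eligible, ranked by `matched` */
}
```

Under this reading, strict mode can leave a service with no eligible endpoint even when endpoints exist outside the preferred locality, so the proposal would need to say what happens in that case.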

```
https://pkg.go.dev/istio.io/istio/pkg/workloadapi#LoadBalancing_Scope

2. calculate locality match rank
Member

Group endpoints by priority.


3. choose endpoint

Randomly select one endpoint from the group with the highest rank as the service backend.
Member

Suggested change
- Randomly select one endpoint from the group with the highest rank as the service backend.
+ Randomly select one endpoint from the group with the highest priority

Also, add more detail on what we do if all the endpoints of the highest priority are unhealthy.

Member

And for strict mode, I would like to see how you would select the endpoint.


4. maybe more? Panic threshold

When the proportion of healthy endpoints in the high-rank group falls below the panic threshold, select endpoints from the next rank group.
Member

I would not worry about this at first. First, respect the workload health status.

__u32 waypoint_addr;
__u32 waypoint_port;
// add health status: healthStatus
// add locality information
Member

Please add what this field looks like.

Add corresponding fields to the `pkg/controller/workload/bpfcache/service.go`, and update logic to `pkg/controller/workload/workload_processor.go`

2. Configure the locality (region, zone, subzone) and health status (HEALTHY, UNHEALTHY) of the backend. This corresponds to the message in workload.proto.
> Although the current workload API defines seven scopes, when configuring a pod's locality, only region, zone, and subzone are configured. Therefore, matching capabilities can only be realized for these three scopes.
Member

> only region, zone, and subzone are configured

Where did you get this conclusion? At least NODE is supported now.

Contributor Author

I misunderstood this before. I saw that ztunnel maintains the NODE, NETWORK, and CLUSTER information within the workload, and I am considering adding this information to the backend's BPF map later.

__u32 service[MAX_SERVICE_COUNT];
struct ip_addr wp_addr;
__u32 waypoint_port;
__u8 health_status; // workload_health_status_t: HEALTHY, UNHEALTHY
Member

Currently we filter out unhealthy workloads.

Contributor Author

So the control plane only stores healthy workloads in the BPF map. Does locality load balancing then not need to care whether a workload is healthy?

Member

We can make it simpler: even the priority set can be calculated in user space.

Contributor Author

Priority computation between localities happens when a new flow arrives. If the priority calculation takes place in the control plane, my understanding is that we would need to precompute all possible scenarios (we cannot do event-driven programming that interoperates with user space, right?), then hash the different situations and store them in a BPF map. The kernel side would then query the map with the source and destination locality information to obtain the priority. To simplify the problem, we could enumerate combinations based on the specific values of the six routing options in scope (including cases where only some of them match). This approach has two potential issues:
First, user space must enumerate all possible scenarios, which becomes particularly burdensome as the locality information gets richer, because the number of situations to store grows exponentially. Second, the BPF map would have to store all of these scenarios, each in the form of the prio_map as currently designed.

Member

My concerns are mainly: 1. the eBPF instruction limit; 2. data plane sorting performance. Worth a try, though.

struct ip_addr wp_addr;
__u32 waypoint_port;
__u8 health_status; // workload_health_status_t: HEALTHY, UNHEALTHY
locality_t locality;
Member

What is locality_t then?

Contributor Author

I will add it in the next commit.

@derekwin
Contributor Author

derekwin commented Sep 7, 2024

New proposal for locality LB, with the logic in user space.

typedef struct {
__u32 service_id; // service id
__u32 rank; // rank
} prio_key;
Member

What is the relationship with endpoint_map?

When do we use this map, and when do we use the other?

} prio_key;
typedef struct {
__u32 count; // count of current prio
__u32 uid_list[MAP_SIZE_OF_PRIO]; // workload_uid to backend
Member

This can waste memory.

So why not add the priority to the endpoint key?

Update endpoint_key:

typedef struct {
__u32 service_id; // service id
__u32 Priority; // priority group this endpoint belongs to
__u32 backend_index; // if endpoint_count = 3, then backend_index = 0/1/2
} endpoint_key;

@derekwin
Contributor Author

The proposal has been updated with the new design, and the corresponding code PR is #900.

```
typedef struct {
__u32 service_id; // service id
__u32 prio; // prio means rank, 6 means match all, and 0 means match nothing
Member

By adding this, how do we select an endpoint now?

Contributor Author

For random LB mode, a workload is only added to the endpoint map with the maximum prio (6).
For locality LB mode, a workload is added to the endpoint map with the rank calculated by matching the Kmesh processor's locality info against the workload's locality info.
We also record in serviceValue the number of endpoints that belong to each prio, so that we can use the count as before.
In the BPF program, if the service is in random LB mode, we look up endpoints with the maximum prio. If it is in locality LB mode, we iterate prio from the maximum down to 0. If the count for a prio is greater than 0, meaning that prio has one or more endpoints, we choose a workload index from a random integer bounded by the count, and get the endpoint with serviceId, prio, and workloadIndex.
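
The selection steps described above can be sketched in plain C as follows. This is only an illustration under stated assumptions: `service_value`, `lb_policy`, `prio_endpoint_count`, and `select_endpoint_key` are hypothetical stand-ins rather than the definitions in the code PR, the random value is passed in by the caller, and the index is 0-based as in the `backend_index` comment earlier in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PRIO 6 /* as described above: 6 = all scopes match, 0 = none */

enum lb_policy { LB_POLICY_RANDOM, LB_POLICY_LOCALITY }; /* hypothetical */

/* Hypothetical stand-in for the per-service value: endpoint count per prio. */
struct service_value {
    enum lb_policy policy;
    uint32_t prio_endpoint_count[MAX_PRIO + 1];
};

/* Hypothetical stand-in for the endpoint map key. */
struct endpoint_key {
    uint32_t service_id;
    uint32_t prio;
    uint32_t backend_index; /* 0-based index within the prio group */
};

/*
 * Fills `key` with the endpoint to look up and returns 0, or returns -1 when
 * no searchable prio group has any endpoint. `rnd` is caller-supplied
 * randomness (an eBPF program would typically use bpf_get_prandom_u32()).
 */
static int select_endpoint_key(const struct service_value *svc, uint32_t service_id,
                               uint32_t rnd, struct endpoint_key *key)
{
    assert(svc != NULL && key != NULL);
    /* Random mode only searches MAX_PRIO; locality mode walks down to 0. */
    int lowest = (svc->policy == LB_POLICY_RANDOM) ? MAX_PRIO : 0;

    for (int prio = MAX_PRIO; prio >= lowest; prio--) {
        uint32_t count = svc->prio_endpoint_count[prio];
        if (count == 0)
            continue; /* nothing at this prio, fall back to the next one */
        key->service_id = service_id;
        key->prio = (uint32_t)prio;
        key->backend_index = rnd % count;
        return 0;
    }
    return -1;
}
```

The loop has a fixed upper bound of MAX_PRIO + 1 iterations, which matters for the eBPF verifier. If the numbering is reversed so that 0 is the best match, only the loop direction changes.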

Member

Makes sense. Note that the BPF map update will be a little tricky.

Member

What does "0 means match nothing" mean?

Contributor Author

The prio value ranges from 0 to 6, and 0 means the lowest priority.

Member

Hmm, I would suggest the opposite, because then we can easily search starting from the highest priority.

Contributor Author

Ok, I have updated it.
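
One way to read the agreed numbering is sketched below: count how many routing preferences match between the client and the endpoint, then map that to a prio where 0 is the best match. This rests on assumptions rather than on the updated proposal text: the `locality` struct, its field sizes, and the function names are hypothetical, the scope values follow LoadBalancing_Scope in the Istio workload API, and stopping at the first mismatch is an assumed matching rule.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scope values as defined for LoadBalancing_Scope in the workload API. */
enum lb_scope {
    SCOPE_REGION = 1, SCOPE_ZONE, SCOPE_SUBZONE,
    SCOPE_NODE, SCOPE_CLUSTER, SCOPE_NETWORK,
};

/* Hypothetical locality record; field sizes are arbitrary. */
struct locality {
    char region[32], zone[32], subzone[32];
    char node[32], cluster[32], network[32];
};

/* Returns the string that a given scope refers to within a locality. */
static const char *scope_value(const struct locality *l, enum lb_scope s)
{
    switch (s) {
    case SCOPE_REGION:  return l->region;
    case SCOPE_ZONE:    return l->zone;
    case SCOPE_SUBZONE: return l->subzone;
    case SCOPE_NODE:    return l->node;
    case SCOPE_CLUSTER: return l->cluster;
    case SCOPE_NETWORK: return l->network;
    default:            return "";
    }
}

/*
 * Walks the routing preferences in order and stops at the first mismatch
 * (assumed rule). Returns n - matched, so 0 means every preference matched
 * and n means none did.
 */
static uint32_t locality_prio(const struct locality *client, const struct locality *ep,
                              const enum lb_scope *prefs, uint32_t n)
{
    assert(client != NULL && ep != NULL);
    uint32_t matched = 0;
    while (matched < n &&
           strcmp(scope_value(client, prefs[matched]),
                  scope_value(ep, prefs[matched])) == 0)
        matched++;
    return n - matched;
}
```

With all six scopes listed as preferences, the result ranges over 0 to 6, which gives seven priority groups.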


workload.h
```
#define MAX_PRIO 6
```
Member

IIUC, the maximum priority rank can be 7, with:
// Prefer traffic in the same region.
LoadBalancing_REGION LoadBalancing_Scope = 1
// Prefer traffic in the same zone.
LoadBalancing_ZONE LoadBalancing_Scope = 2
// Prefer traffic in the same subzone.
LoadBalancing_SUBZONE LoadBalancing_Scope = 3
// Prefer traffic on the same node.
LoadBalancing_NODE LoadBalancing_Scope = 4
// Prefer traffic in the same cluster.
LoadBalancing_CLUSTER LoadBalancing_Scope = 5
// Prefer traffic in the same network.
LoadBalancing_NETWORK LoadBalancing_Scope = 6

Contributor Author

The prio value ranges from 0 to 6, so I set MAX_PRIO to 6, which is actually the 7th rank.

@kmesh-bot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hzxuzhonghu


5 participants