Split fleet-agent #1905

Merged: 5 commits merged from split-agent into master on Nov 2, 2023

Conversation

manno (Member) commented Oct 26, 2023

Refers to #1772

To prepare the agent for controller-runtime, split the code into three processes:

  • registration
  • cluster status updater
  • agent controller, watches bundledeployments

Only the last part is an actual controller. Cluster status updates run on a timer, and registration happens when the agent starts up. A new function, "registration.Get", was added to return an existing registration to the other processes.
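
For orientation, a minimal sketch of how the other two processes can consume an existing registration through the new Get function; the import path, the kubeconfig parameter type and the helper name are assumptions based on the snippets quoted further down, not the PR's actual code.

package agent

import (
	"context"

	// import path assumed from the files touched in this PR
	"github.com/rancher/fleet/internal/cmd/agent/register"
)

// upstreamNamespace is a hypothetical helper: instead of registering again, it
// looks up the registration created when the agent started.
func upstreamNamespace(ctx context.Context, namespace, kubeConfig string) (string, error) {
	agentInfo, err := register.Get(ctx, namespace, kubeConfig)
	if err != nil {
		return "", err
	}
	// the returned client config points at the upstream cluster
	ns, _, err := agentInfo.ClientConfig.Namespace()
	return ns, err
}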

This PR switches the agent to a StatefulSet to avoid leader election; I don't expect the agent, as a controller, to scale horizontally.
Updating the cluster status with information about the cluster nodes seems unrelated to Fleet's purpose, and we might discuss this in the future.
The code behind the handlers in the bundledeployment controller is currently spread across the "handler", a deploy "manager" and the Helm deployer. The PR takes first steps to consolidate that functionality into the deployer package, making it more obvious which client is used at which point during reconciliation.
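
On the StatefulSet change, a minimal sketch of the controller-runtime side, assuming a standard manager setup (ctrl is "sigs.k8s.io/controller-runtime"; restConfig and ctx come from the surrounding code) rather than the PR's actual wiring:

	// A single-replica StatefulSet means there is only ever one agent pod,
	// so the manager can run without leader election.
	mgr, err := ctrl.NewManager(restConfig, ctrl.Options{
		LeaderElection: false,
	})
	if err != nil {
		return err
	}
	return mgr.Start(ctx)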

"Trigger" was renamed to "driftdetect", as it uses the Helm history to detect drift in deployments. This is similar to the "monitor", which updates the bundle deployment status.

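For context, a rough sketch of what "uses the Helm history" means, written against the plain Helm SDK rather than the actual driftdetect code; the action configuration, release name and helper name are assumed for illustration.

import (
	"fmt"

	"helm.sh/helm/v3/pkg/action"
	"helm.sh/helm/v3/pkg/release"
	"helm.sh/helm/v3/pkg/releaseutil"
)

// latestRelease returns the newest revision recorded in the Helm history for a
// release; drift detection can diff its manifest against the live cluster,
// while the monitor derives status from the same data.
func latestRelease(cfg *action.Configuration, name string) (*release.Release, error) {
	history, err := action.NewHistory(cfg).Run(name)
	if err != nil {
		return nil, err
	}
	if len(history) == 0 {
		return nil, fmt.Errorf("no history for release %q", name)
	}
	releaseutil.SortByRevision(history) // oldest first
	return history[len(history)-1], nil
}
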
@manno force-pushed the split-agent branch 9 times, most recently from 57892b5 to 77a8eca on October 30, 2023 at 14:59
@manno marked this pull request as ready for review on October 30, 2023 at 15:11
@manno requested a review from a team as a code owner on October 30, 2023 at 15:11

Resolved review threads:
  • internal/cmd/controller/agent/manifest_test.go (outdated)
  • internal/cmd/agent/root.go
  • internal/cmd/agent/register.go (outdated, 3 threads)

weyfonk (Contributor) left a comment:

Looking good, thanks! Leaving a few nitpicks.

Resolved review threads:
  • dev/setup-fleet
  • .github/scripts/deploy-fleet.sh
  • internal/cmd/agent/deployer/plan/plan.go

func (cs *ClusterStatus) Run(cmd *cobra.Command, args []string) error {
	// provide a logger in the context to be compatible with controller-runtime
	zopts := zap.Options{
		Development: true,

Contributor:
This enables stacktraces on warnings (instead of errors) and disables sampling; are we sure we want this level of verbosity in production?
Same question about register.go and root.go.

manno (Member, author):
Yes, I would say we will have to come up with a logging configuration later. I was hoping we could switch to https://pkg.go.dev/sigs.k8s.io/controller-runtime/pkg/log/zap#Options.BindFlags
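
For reference, the flag-based setup that link describes looks roughly like this; standard controller-runtime usage, not code from this PR.

import (
	"flag"

	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/log/zap"
)

func setupLogging() {
	zopts := zap.Options{}
	// registers --zap-devel, --zap-log-level, --zap-stacktrace-level, ...
	zopts.BindFlags(flag.CommandLine)
	flag.Parse()
	ctrl.SetLogger(zap.New(zap.UseFlagOptions(&zopts)))
}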

Resolved review threads:
  • internal/cmd/agent/clusterstatus.go (2 threads)

aruiz14 (Contributor) left a comment:

LGTM, just a minor suggestion

Comment on lines +45 to +64
ctx, cancel := context.WithCancel(ctx)
// try to register with upstream fleet controller by obtaining
// a kubeconfig for the upstream cluster
agentInfo, err := register.Register(ctx, r.Namespace, kc)
if err != nil {
	logrus.Fatal(err)
}

ns, _, err := agentInfo.ClientConfig.Namespace()
if err != nil {
	logrus.Fatal(err)
}

_, err = agentInfo.ClientConfig.ClientConfig()
if err != nil {
	logrus.Fatal(err)
}

setupLog.Info("successfully registered with upstream cluster", "namespace", ns)
cancel()

Contributor:
Why not use defer?

ctx, cancel := context.WithCancel(ctx)
defer cancel()

manno (Member, author):
I'll add it in the next PR


agentInfo, err := register.Get(ctx, namespace, kc)
if err != nil {
	logrus.Fatal(err)

Contributor:
Probably just moved from somewhere else, but why all these logrus.Fatal calls if this method already returns an error?

manno (Member, author) commented Nov 2, 2023:
Yes, all these logrus calls should be removed in the future https://github.com/rancher/fleet/pull/1772/files#diff-a08a0ecbbab37eaea426b6b8bc55a0d5107c75cac3db92153ef7d3329dc47650R79. But the second point you make is interesting. Why do we exit instead of return... I copied that from start.go 🤷

I'll change it in the next PR :)
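
For completeness, the change discussed in this thread would look roughly like this; a sketch only, not the follow-up PR.

	agentInfo, err := register.Get(ctx, namespace, kc)
	if err != nil {
		// return the error instead of exiting the process via logrus.Fatal
		return fmt.Errorf("getting existing registration: %w", err)
	}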

@manno merged commit ba319cb into master on Nov 2, 2023
10 checks passed
@manno deleted the split-agent branch on November 2, 2023 at 13:24
@manno mentioned this pull request on Jan 11, 2024 (2 tasks)
@manno changed the title from "Split agent" to "Split fleet-agent" on Jan 11, 2024