Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(tests): add sweeper to remove workspaces #355

Merged
merged 10 commits into from
Jan 16, 2025

Conversation

mitchnielsen
Copy link
Contributor

@mitchnielsen mitchnielsen commented Jan 10, 2025

Summary

Adds sweepers to clean up workspaces in cases where they were not automatically cleaned up.

https://developer.hashicorp.com/terraform/plugin/testing/acceptance-tests/sweepers

Related to https://linear.app/prefect/issue/PLA-642/ensure-ephemeral-workspaces-are-cleaned-up-consistently

Testing

Initially I want to disable the actual deletion so we can maintain some dangling workspaces in the account for testing purposes. So I made a small change:

diff --git a/internal/provider/resources/sweeper_test.go b/internal/provider/resources/sweeper_test.go
index ce3f4a9..bc58065 100644
--- a/internal/provider/resources/sweeper_test.go
+++ b/internal/provider/resources/sweeper_test.go
@@ -55,10 +55,11 @@ func addWorkspaceSweeper() {
 				if strings.HasPrefix(workspace.Name, testutils.TestAccPrefix) {
 					log.Printf("found acceptance testing workspace %s, deleting...\n", workspace.Name)
 
-					err := workspacesClient.Delete(context.Background(), workspace.ID)
-					if err != nil {
-						log.Printf("unable to delete workspaces %s during sweep: %s\n", workspace.Name, err)
-					}
+					log.Printf("pretending to delete workspaces %s\n", workspace.Name)
 				} else {
 					log.Printf("workspace %s does not match acceptance testing prefix, skipping...\n", workspace.Name)
 				}

And ran the following:

$ SWEEP="yes" make testacc-dev
./scripts/testacc-dev "" "INFO" yes
sweeping...
2025/01/13 16:46:13 [DEBUG] Running Sweepers for region (all):
2025/01/13 16:46:13 [DEBUG] Running Sweeper (workspaces) in region (all)
2025/01/13 16:46:13 [DEBUG] POST https://api.stg.prefect.dev/api/accounts/bb19c492-73c2-4ecd-9cd7-d82c4aac08e6/workspaces/filter
2025/01/13 16:46:13 workspace github-ci-tests does not match acceptance testing prefix, skipping...
2025/01/13 16:46:13 found acceptance testing workspace terraformacc07to6nvdgi, deleting...
2025/01/13 16:46:13 pretending to delete workspaces terraformacc07to6nvdgi

... and so on ...

2025/01/13 16:46:13 found acceptance testing workspace terraformaccy3jwhpmhkx, deleting...
2025/01/13 16:46:13 pretending to delete workspaces terraformaccy3jwhpmhkx
2025/01/13 16:46:13 found acceptance testing workspace terraformaccz6zhmdb8ep, deleting...
2025/01/13 16:46:13 pretending to delete workspaces terraformaccz6zhmdb8ep
2025/01/13 16:46:13 workspace test does not match acceptance testing prefix, skipping...
2025/01/13 16:46:13 [DEBUG] Completed Sweeper (workspaces) in region (all) in 188.201667ms
2025/01/13 16:46:13 Completed Sweepers for region (all) in 188.543042ms
2025/01/13 16:46:13 Sweeper Tests for region (all) ran successfully:
2025/01/13 16:46:13     - workspaces
ok      github.com/prefecthq/terraform-provider-prefect/internal/provider/resources     0.731s

@mitchnielsen mitchnielsen added the maintenance Maintenance work - won't show in release notes label Jan 10, 2025
@mitchnielsen mitchnielsen self-assigned this Jan 10, 2025
@mitchnielsen
Copy link
Contributor Author

Hitting some snags here. The sweeper docs are pretty short and vague, which led someone to create hashicorp/terraform-plugin-testing#378 with very fair questions. It's a few months old now with no response, so will keep an eye on that. Haven't found many other sweeper-related issues logged yet.

I also noticed that even if I put an os.Exit(1) in the sweeper function, I never see that exit failure in the output. It just always looks like:

2025/01/10 16:51:12 [DEBUG] Running Sweepers for region (foo):
2025/01/10 16:51:12 Completed Sweepers for region (foo) in 145.458µs
2025/01/10 16:51:12 Sweeper Tests for region (foo) ran successfully:

Note that foo is what you pass in with -sweep=foo. I've tried different values here and aligned them with the sweeper function, but it always just exits without any of my logging statements present (even with tflog.Debug).

In general, I'm a bit hesitant to pursue this much further because:

  • Documentation is so scarce
  • It's not very clear when in the testing workflow this runs
  • I'm not able to get the sweeper to run so far

I'll dig into this a bit more, but if I don't make much progress (assuming no developments in the sweeper docs) we can use the logic in this function and just call it directly after the tests run, rather than using Terraform's sweeper facilities.

FYI @parkedwards (no action required)

@@ -33,4 +33,5 @@ TF_ACC=1 \
PREFECT_API_URL=$(op read "${vault_entry}/PREFECT_API_URL") \
PREFECT_API_KEY=$(op read "${vault_entry}/PREFECT_API_KEY") \
PREFECT_CLOUD_ACCOUNT_ID=$(op read "${vault_entry}/PREFECT_CLOUD_ACCOUNT_ID") \
gotestsum --max-fails=50 ./... -count=1 -v ${run_arg}
go test ./... -v ${run_arg} -sweep=foo
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

im guessing -sweep=all didn't work? or is all supposed to be an arbitrary selector?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like sweep specifies the provider to sweep, and sweep-run lets you specify specific sweepers. Also learned that providing the sweep flag only runs the sweepers, and that it's often recommended to:

  1. Run the sweepers (for a clean slate)
  2. Run the acceptance tests
  3. Run the sweepers again

Only sets the workspace handle filter if one is provided. Otherwise,
passes in an empty filter struct.

This allows the List method to return all workspaces. Before this
change, the empty "Any" field was causing no matches to be returned.

This will help us with the sweeper so it can return all workspaces and
let us filter the ones that match the prefix we use for acceptance
testing.
The ephemeral workspaces we make during tests have random values.
Because sweepers are run in a separate command from the acceptance
tests, there's currently no way to persist those unique values between
commands so the sweeper can delete those specific workspaces.

Instead, as a first iteration, this change lists all workspaces in an
account and deletes any that match the acceptance testing prefix we use.
This is designed to run either by hand or by CI at a given interval,
such as overnight, when other acceptance tests are not running.
Comment on lines 25 to 30
// addWorkspaceSweeper adds a sweeper that deletes any workspaces that match
// the prefix we use for ephemeral workspaces in acceptance tests.
//
// This is designed to run at a given interval when other acceptance tests are
// not likely running.
func addWorkspaceSweeper() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More context in a4d16f7

The ephemeral workspaces we make during tests have random values.
Because sweepers are run in a separate command from the acceptance
tests, there's currently no way to persist those unique values between
commands so the sweeper can delete those specific workspaces.

Instead, as a first iteration, this change lists all workspaces in an
account and deletes any that match the acceptance testing prefix we use.
This is designed to run either by hand or by CI at a given interval,
such as overnight, when other acceptance tests are not running.

Welcoming feedback here. If we want to do the more traditional implementation, we'll need to find a way to persist the unique names of the ephemeral workspaces between running the acceptance tests and then running the sweepers.

Also, the nature of us using an entirely separate (ephemeral) workspace for each test is not reflected in the basic scenario shown in the documentation. It would be more fitting if we had a static prefix for the resource in the test, and could then turn around and delete all of those specific resources matching that prefix.

I think there's a few ways we can approach this, so now that I understand sweepers a bit better, I'm very open to making adjustments here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put another way: we're not fully following the true spirit of sweepers here, in the sense that we're just making a sweeper separated from any specific resource test. That's because of our approach with making ephemeral workspaces as part of each test, so it's most effective to just go clean up those ephemeral workspaces.

But it gets the ball rolling because we can start moving in that direction if it makes sense, and we implemented the cleanup following (loosely) a Terraform framework rather than writing external scripts/etc to do the work.

It even makes me wonder if ephemeral workspaces are as vital as long as we use ephemeral name prefixes for each resource, and then use sweepers to make sure they're cleaned up. Both seem to have their own pros and cons.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

im OK with this. maybe I had misunderstood the original intention of implementing sweepers, because I had always imagined them running independently / being idempotent based off of that resource name prefix we use

@@ -50,7 +51,7 @@ testacc:
.PHONY: testacc

testacc-dev:
./scripts/testacc-dev $(TESTS) $(LOG_LEVEL)
./scripts/testacc-dev $(TESTS) $(LOG_LEVEL) $(SWEEP)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: pending the approach we land on, we'll want to add a scheduled CI action to run the sweeper at a given interval (such as overnight when things are quiet).

@mitchnielsen mitchnielsen marked this pull request as ready for review January 13, 2025 22:56
@mitchnielsen mitchnielsen requested a review from a team as a code owner January 13, 2025 22:56
Makefile Outdated
@@ -17,7 +18,7 @@ help:
@echo " lint - run static code analysis"
@echo " test - run automated unit tests"
@echo " testacc - run automated acceptance tests"
@echo " testacc-dev - run automated acceptance tests from a local machine (args TESTS=<tests> LOG_LEVEL=<level>)"
@echo " testacc-dev - run automated acceptance tests from a local machine (args: TESTS=<tests or empty> LOG_LEVEL=<level> SWEEP=<yes or empty>)"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should SWEEP be a bool?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh i see, the shell script checks for any value

Follows other patterns and allows us to test this specific package
instead of other packages with other tests.
Adds a CI workflow that will run the Terraform sweepers each night.
@mitchnielsen mitchnielsen force-pushed the clean-up-ephemeral-workspaces branch from 4eb557f to ebbe68b Compare January 16, 2025 19:55
@mitchnielsen mitchnielsen deployed to Acceptance Tests January 16, 2025 19:55 — with GitHub Actions Active
@mitchnielsen mitchnielsen merged commit 5a6386d into main Jan 16, 2025
7 checks passed
@mitchnielsen mitchnielsen deleted the clean-up-ephemeral-workspaces branch January 16, 2025 20:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
maintenance Maintenance work - won't show in release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants