Add support for MSK Connect resources #1162
Conversation
/test-examples="examples/kafkaconnect/v1beta1/customplugin.yaml" |
/test-examples="examples/kafkaconnect/v1beta1/workerconfiguration.yaml" |
/test-examples="examples/kafkaconnect/v1beta1/connector-broken.yaml" |
/test-examples="examples/kafkaconnect/v1beta1/connector-broken.yaml" The previous run actually failed because of the aws account hitting a VPC limit. It would be great if someone from upbound could submit some AWS quota increase requests for the number of VPCs per region. |
/test-examples="examples/kafkaconnect/v1beta1/connector-broken.yaml" |
The previous run failed with
I haven't seen this particular failure before, but it doesn't seem related to this PR at all. Maybe it will pass a second time? /test-examples="examples/kafkaconnect/v1beta1/connector-broken.yaml"
The previous failure was caused by upbound/universal-crossplane#442. Pinning the UXP version in the Makefile resolved the issue. /test-examples="examples/kafkaconnect/v1beta1/connector-nokafka.yaml"
Uptest is failing because of too many VPCs. This is a problem in the Upbound AWS account, which I don't have access to. @jeanduplessis Could you please take a look at why cloud-nuke doesn't seem to be cleaning them up? Requesting a limit increase from AWS on the number of VPCs per region would also be very helpful.
/test-examples="examples/kafkaconnect/v1beta1/connector-nokafka.yaml"
@mbbush it should be fine now. Ping me directly if the uptest run fails again with that issue.
/test-examples="examples/kafkaconnect/v1beta1/connector-nokafka.yaml"
I believe this is ready for review. @turkenf do you think you could take a look before the next provider-aws release?

I don't see any path forward to improving the behavior of the provider when creating a Connector results in a connector in state Failed in AWS. The state is not exposed by the Terraform provider, so the only way I could even query it would be to extract the AWS client from the provider and make an explicit API call. I tried that, and was able to extract the AWS client, but it was wrapped in a different type, and I couldn't figure out how to extract the underlying AWS SDK client (it's probably possible, I just don't know how). But regardless of whether it's possible, I'm not sure it's wise.

My preference would be to merge this as-is, and release it, with the known issue that the provider does not expose error states for the connector. If there's consensus that it is wise to add the connector's status field to the schema by invoking the SDK client directly, I can work on that for a future release, as it would be a non-breaking change.

Nothing significant has changed, so I'm confident the e2e tests will still pass. I'll rerun the ones that are testable automatically, but the full example, which requires manual intervention, is a pain to run and takes a very long time, so I'd rather only do it one more time after this is otherwise approved.
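For context, here is a minimal sketch of the kind of explicit API call described above, assuming the AWS SDK for Go v2 and its `kafkaconnect` service package; the connector ARN is a hypothetical placeholder:

```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/kafkaconnect"
)

func main() {
	ctx := context.Background()

	// Load region and credentials from the default sources.
	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		log.Fatal(err)
	}
	client := kafkaconnect.NewFromConfig(cfg)

	// Hypothetical ARN; in the provider this would come from the
	// managed resource's external name / status.
	arn := "arn:aws:kafkaconnect:us-west-2:123456789012:connector/example/abc"

	// DescribeConnector returns the connector State (CREATING,
	// RUNNING, FAILED, ...) that the Terraform schema never exposes.
	out, err := client.DescribeConnector(ctx, &kafkaconnect.DescribeConnectorInput{
		ConnectorArn: aws.String(arn),
	})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("connector state:", out.ConnectorState)
}
```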
/test-examples="examples/kafkaconnect/v1beta1/customplugin.yaml"
/test-examples="examples/kafkaconnect/v1beta1/connector-nokafka.yaml"
/test-examples="examples/kafkaconnect/v1beta1/connector-nokafka.yaml"
Thank you for your effort in this PR @mbbush, I left a few initial review comments for you to consider.
Force-pushed from 4e6dfb0 to 7a22d4a
Thank you @mbbush, we can merge after resolving the conflicts.
Signed-off-by: Matt Bush <[email protected]>
Co-authored-by: Fatih Türken <[email protected]>
@turkenf Rebased and resolved conflicts. It's not clear to me if you've already started the process of releasing v1.3.0. It would be great if this made it in, but it's not a huge problem for me if it doesn't.
Signed-off-by: Matt Bush <[email protected]>
@mbbush, we will include this PR.
Thanks! I'm rebasing/committing from the train on my way in to the office, so I didn't have the bandwidth to run
/test-examples="examples/ec2/v1beta1/vpc.yaml" Uptest run: https://github.com/crossplane-contrib/provider-upjet-aws/actions/runs/8470993317 |
Description of your changes
Adds support for the three MSK Connect resources.
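For readers unfamiliar with upjet-based providers, here is a minimal sketch of how resources like these are typically registered, assuming upjet's `config` package; the exact configuration in this PR (references, external names, timeouts) may differ:

```go
package kafkaconnect

import "github.com/crossplane/upjet/pkg/config"

// Configure registers the three MSK Connect Terraform resources
// with the provider. A sketch under assumptions, not the exact
// code in this PR.
func Configure(p *config.Provider) {
	for _, name := range []string{
		"aws_mskconnect_connector",
		"aws_mskconnect_custom_plugin",
		"aws_mskconnect_worker_configuration",
	} {
		p.AddResourceConfigurator(name, func(r *config.Resource) {
			// Group the generated CRDs under the kafkaconnect API
			// group, matching the examples/kafkaconnect/v1beta1
			// manifests referenced throughout this PR.
			r.ShortGroup = "kafkaconnect"
		})
	}
}
```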
Note: AWS has very limited API support for these resources. Other than the `scaling` object, every field on the Terraform schema has `ForceNew: true`, so from the Crossplane perspective the resources are mostly immutable once created. If you want to make a change, you'll have to delete the resource and create a new one.

The WorkerConfiguration also has no delete method in the AWS API, so once you've created a configuration with a given name, that's it, forever.
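As an illustration of what `ForceNew` means here, a hypothetical fragment in the style of the upstream Terraform provider's schema, assuming hashicorp's terraform-plugin-sdk/v2; the field shown is just an example:

```go
package main

import (
	"fmt"

	"github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema"
)

// Hypothetical excerpt of an aws_mskconnect_connector-style schema.
// ForceNew: true means a change to the field cannot be applied in
// place; Terraform must destroy and recreate the resource, which is
// why Crossplane sees these resources as effectively immutable.
var connectorSchema = map[string]*schema.Schema{
	"service_execution_role_arn": {
		Type:     schema.TypeString,
		Required: true,
		ForceNew: true,
	},
}

func main() {
	fmt.Println("force-new:", connectorSchema["service_execution_role_arn"].ForceNew)
}
```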
There is some surprising lifecycle behavior around the Connector resource that I'm not sure is possible to resolve.

The AWS API exposes the `State` of a connector, which can be "Creating", "Running", "Deleting", or "Failed". The Terraform provider uses this state to determine whether a creation call succeeds, but does not expose it in the observed `status` as a property of the resource, so I can't think of a way to get it into Crossplane.

When I create a Connector resource that is broken (as in the `connector-nokafka.yaml` example, where the Kafka cluster doesn't exist), the create call runs for just over 20 minutes (I had to increase the timeout slightly from the 20 minutes configured in the Terraform provider), then fails and sets the following conditions:

The provider then immediately runs an Observe, which finds that the connector does exist and that its `status` matches the `spec` (since the `State` is not exposed in the Terraform schema), decides that creation must have succeeded after all, fills in `status.atProvider`, and sets the following conditions:

I'm not quite sure how best to reconcile this behavior of the AWS resource and the Terraform provider with what Crossplane expects.
Fixes #374
Fixes #1146
I have:
- Run `make reviewable test` to ensure this PR is ready for review.
How has this code been tested
Manually, in a kind cluster, using the manifests I committed. I was able to get the connector to enter a Running state, in which it did nothing (there was no traffic and no actual working code to run), but the infrastructure was all healthy.
I'll run uptest on the manifests that don't require manual intervention.
Output from local run with the `connector.yaml` example: