[Bug]: Replicationgroup.elasticache.aws.upbound.io in async after upgrade provider from 1.1.0 to 1.1.4 #1351

Closed
1 task done
mihaelabalas84 opened this issue Jun 10, 2024 · 6 comments
Labels
bug Something isn't working needs:triage stale

Comments

@mihaelabalas84
mihaelabalas84 commented Jun 10, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Affected Resource(s)

ReplicationGroup.elasticache.aws.upbound.io/v1beta2

Resource MRs required to reproduce the bug

The following replication group was created using the provider version 1.1.0

apiVersion: elasticache.aws.upbound.io/v1beta2
kind: ReplicationGroup
metadata:
  name: ***-staging-redis
spec:
  deletionPolicy: Orphan
  forProvider:
    applyImmediately: true
    autoMinorVersionUpgrade: "true"
    automaticFailoverEnabled: true
    description: '***-staging Redis cache '
    engine: redis
    engineVersion: "7.1"
    ipDiscovery: ipv4
    maintenanceWindow: fri:05:00-fri:06:00
    multiAzEnabled: true
    networkType: ipv4
    nodeType: cache.t3.small
    numNodeGroups: 1
    parameterGroupName: default.redis7
    port: 6379
    region: eu-west-1
    replicasPerNodeGroup: 1
    securityGroupIdRefs:
    - name: ***-staging-redis-security-group
    securityGroupIds:
    - ****
    snapshotWindow: 03:30-04:30
    subnetGroupName: ***-staging-redis-csg
    subnetGroupNameRef:
      name: ***-staging-redis-csg
  providerConfigRef:
    name: provider-aws-upbound-elasticache

Steps to Reproduce

Using the manifest above, create a replication group with all Upbound providers and the AWS family provider at version 1.1.0. Then upgrade the elasticache provider to 1.1.4 (all providers were upgraded, including provider-family-aws).

What happened?

All replication groups went into Async state.

Relevant Error Output Snippet

conditions:
  - lastTransitionTime: "2024-06-10T09:15:08Z"
    message: "update failed: async update failed: failed to update the resource: [{0
      changing auth_token for ElastiCache Replication Group (***-staging-redis):
      InvalidParameterValue: The AUTH token modification is only supported when encryption-in-transit
      is enabled.\n\tstatus code: 400, request id: daa20ded-8655-41bd-a278-f2bed79877b6
      \ []}]"
    reason: ReconcileError
    status: "False"
    type: Synced
  - lastTransitionTime: "2024-06-10T09:15:08Z"
    message: "async update failed: failed to update the resource: [{0 changing auth_token
      for ElastiCache Replication Group (***-staging-redis): InvalidParameterValue:
      The AUTH token modification is only supported when encryption-in-transit is
      enabled.\n\tstatus code: 400, request id: daa20ded-8655-41bd-a278-f2bed79877b6
      \ []}]"
    reason: AsyncUpdateFailure
    status: "False"
    type: LastAsyncOperation
  - lastTransitionTime: "2024-06-06T07:59:48Z"
    reason: Available
    status: "True"
    type: Ready
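
To find every affected resource quickly, the conditions above can be scanned for this specific error. The following is only a minimal sketch: `failing_auth_token_conditions` is a hypothetical helper (not part of Crossplane or the provider), it assumes the `status.conditions` list has already been loaded into Python dictionaries, and the sample messages are shortened versions of the ones shown above.

```python
# Hypothetical helper for illustration; not part of Crossplane or provider-aws.
AUTH_TOKEN_ERROR = (
    "The AUTH token modification is only supported "
    "when encryption-in-transit is enabled"
)


def failing_auth_token_conditions(conditions):
    """Return the types of failed conditions that carry the AUTH token error."""
    hits = []
    for cond in conditions:
        # Messages are line-wrapped in the YAML output, so collapse
        # all whitespace before looking for the error text.
        message = " ".join(cond.get("message", "").split())
        if cond.get("status") == "False" and AUTH_TOKEN_ERROR in message:
            hits.append(cond.get("type"))
    return hits


# Sample data modelled on the snippet above (messages shortened).
conditions = [
    {
        "type": "Synced",
        "status": "False",
        "reason": "ReconcileError",
        "message": "update failed: async update failed: InvalidParameterValue: "
                   "The AUTH token modification is only supported when\n"
                   "      encryption-in-transit is enabled.",
    },
    {
        "type": "LastAsyncOperation",
        "status": "False",
        "reason": "AsyncUpdateFailure",
        "message": "async update failed: InvalidParameterValue: The AUTH token "
                   "modification is only supported when encryption-in-transit "
                   "is enabled.",
    },
    {"type": "Ready", "status": "True", "reason": "Available"},
]

print(failing_auth_token_conditions(conditions))
# prints ['Synced', 'LastAsyncOperation']
```

Note that `Ready` stays `True` here: the cache itself keeps working, and only the reconcile loop is stuck.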

Crossplane Version

1.15.2

Provider Version

1.1.4

Kubernetes Version

1.28.1

Kubernetes Distribution

EKS

Additional Info

I understand where this comes from: the terraform-provider-aws change hashicorp/terraform-provider-aws#34460 now forces auth_token_update_strategy to be set. For replication groups where in-transit encryption is not enabled, AWS rejects this update, so all our replication groups remain out of sync. So far the only solutions are to downgrade the provider or to recreate the cache with the new version.

@mihaelabalas84 mihaelabalas84 added bug Something isn't working needs:triage labels Jun 10, 2024
@turkenf
Collaborator

turkenf commented Jun 10, 2024

Hi @mihaelabalas84,

Thank you for raising this issue. Kindly consider the following:

  • please add a title, briefly state the problem/bug, and indicate which family provider is causing the problem
  • check the versions and make sure you wrote it correctly
  • add explicit reproduction steps so we can reproduce the issue again

@mihaelabalas84 mihaelabalas84 changed the title [Bug]: [Bug]: Replicationgroup.elasticache.aws.upbound.io in error after upgrade provider from 1.1.0 to 1.1.4 Jun 11, 2024
@mihaelabalas84
Author

Done. Sorry for the mess.

@mihaelabalas84 mihaelabalas84 changed the title [Bug]: Replicationgroup.elasticache.aws.upbound.io in error after upgrade provider from 1.1.0 to 1.1.4 [Bug]: Replicationgroup.elasticache.aws.upbound.io in async after upgrade provider from 1.1.0 to 1.1.4 Jun 11, 2024
@caiofralmeida

The same issue occurs in version 1.3.1:

async update failed: failed to update the resource: [{0 changing auth_token for ElastiCache Replication Group (kafka-operator): InvalidParameterValue: The AUTH token modification is only supported when encryption-in-transit is enabled.

I noticed that version v1.6.1 has a new field, autoGenerateAuthToken, to disable this behavior.

@chlunde
Contributor

chlunde commented Jun 20, 2024

I wonder if it is related to hashicorp/terraform-provider-aws@0b7e4ba#diff-5d55dcf3aa8ffba3437fb3ff6b7a96b74c9f9196d47dbb4bb63369259cc083bc, which was introduced a few releases back.

I think that if someone can install the old provider (1.1.4?) in a lab cluster, set up a cluster without auth, run kubectl get -o yaml --show-managed-fields=true, then upgrade to >= 1.3.1 and run the same command again, we might get a hint.
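
The before/after comparison suggested here could be sketched roughly as follows. This is an illustration under stated assumptions, not a tested procedure: `diff_manifests` is a hypothetical helper, the two strings stand in for the saved output of the `kubectl get -o yaml --show-managed-fields=true` command run before and after the upgrade, and the `authTokenUpdateStrategy` line in the sample is made-up data used only to show what a difference would look like.

```python
import difflib


def diff_manifests(before: str, after: str) -> list[str]:
    """Return a unified diff (no context lines) between two YAML dumps."""
    return list(
        difflib.unified_diff(
            before.splitlines(),
            after.splitlines(),
            fromfile="before-upgrade.yaml",
            tofile="after-upgrade.yaml",
            lineterm="",
            n=0,  # show only the changed lines
        )
    )


# Stand-ins for the two kubectl dumps; the added field is hypothetical.
before = "spec:\n  forProvider:\n    engine: redis\n"
after = (
    "spec:\n  forProvider:\n    engine: redis\n"
    "    authTokenUpdateStrategy: ROTATE\n"
)

for line in diff_manifests(before, after):
    print(line)
```

Any field that appears only in the second dump, together with its entry in managedFields, would point at what the upgraded provider started setting.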

You are running without any auth, right? The AWS API has an explicit field for that, but the Terraform and Crossplane providers do not: https://docs.aws.amazon.com/AmazonElastiCache/latest/red-ug/in-transit-encryption-disable.html

Same as #1370


This provider repo does not have enough maintainers to address every issue. Since there has been no activity in the last 90 days it is now marked as stale. It will be closed in 14 days if no further activity occurs. Leaving a comment starting with /fresh will mark this issue as not stale.

@github-actions github-actions bot added the stale label Sep 19, 2024

github-actions bot commented Oct 3, 2024

This issue is being closed since there has been no activity for 14 days since marking it as stale. If you still need help, feel free to comment or reopen the issue!

@github-actions github-actions bot closed this as not planned Oct 3, 2024