[Bug]: Replicationgroup.elasticache.aws.upbound.io in async after upgrade provider from 1.1.0 to 1.1.4 #1351

mihaelabalas84 · 2024-06-10T10:29:47Z

Is there an existing issue for this?

I have searched the existing issues

Affected Resource(s)

ReplicationGroup.elasticache.aws.upbound.io/v1beta2

Resource MRs required to reproduce the bug

The following replication group was created using the provider version 1.1.0

apiVersion: elasticache.aws.upbound.io/v1beta2
kind: ReplicationGroup
metadata:
  name: ***-staging-redis
spec:
  deletionPolicy: Orphan
  forProvider:
    applyImmediately: true
    autoMinorVersionUpgrade: "true"
    automaticFailoverEnabled: true
    description: '***-staging Redis cache '
    engine: redis
    engineVersion: "7.1"
    ipDiscovery: ipv4
    maintenanceWindow: fri:05:00-fri:06:00
    multiAzEnabled: true
    networkType: ipv4
    nodeType: cache.t3.small
    numNodeGroups: 1
    parameterGroupName: default.redis7
    port: 6379
    region: eu-west-1
    replicasPerNodeGroup: 1
    securityGroupIdRefs:
    - name: ***-staging-redis-security-group
    securityGroupIds:
    - ****
    snapshotWindow: 03:30-04:30
    subnetGroupName: ***-staging-redis-csg
    subnetGroupNameRef:
      name: ***-staging-redis-csg
  providerConfigRef:
    name: provider-aws-upbound-elasticache

Steps to Reproduce

Using the manifest above, create a replication group with all Upbound providers, including the AWS family provider, at version 1.1.0. Then upgrade the elasticache provider to 1.1.4 (all providers were upgraded, including provider-family-aws).

What happened?

All replication groups went into Async state.

Relevant Error Output Snippet

conditions:
  - lastTransitionTime: "2024-06-10T09:15:08Z"
    message: "update failed: async update failed: failed to update the resource: [{0
      changing auth_token for ElastiCache Replication Group (***-staging-redis):
      InvalidParameterValue: The AUTH token modification is only supported when encryption-in-transit
      is enabled.\n\tstatus code: 400, request id: daa20ded-8655-41bd-a278-f2bed79877b6
      \ []}]"
    reason: ReconcileError
    status: "False"
    type: Synced
  - lastTransitionTime: "2024-06-10T09:15:08Z"
    message: "async update failed: failed to update the resource: [{0 changing auth_token
      for ElastiCache Replication Group (***-staging-redis): InvalidParameterValue:
      The AUTH token modification is only supported when encryption-in-transit is
      enabled.\n\tstatus code: 400, request id: daa20ded-8655-41bd-a278-f2bed79877b6
      \ []}]"
    reason: AsyncUpdateFailure
    status: "False"
    type: LastAsyncOperation
  - lastTransitionTime: "2024-06-06T07:59:48Z"
    reason: Available
    status: "True"
    type: Ready

Crossplane Version

1.15.2

Provider Version

1.1.4

Kubernetes Version

1.28.1

Kubernetes Distribution

EKS

Additional Info

I understand where this comes from: the terraform-provider-aws change hashicorp/terraform-provider-aws#34460, which now forces auth_token_update_strategy to be set. For replication groups where in-transit encryption is not enabled, AWS rejects this update, and all our replication groups remain unsynced. So far the only workarounds are to downgrade the provider or to recreate the cache with the new version.
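
For anyone who wants to see how many replication groups are hit after an upgrade, here is a minimal Python sketch. It assumes the resources have been dumped to JSON (for example with kubectl get replicationgroups.elasticache.aws.upbound.io -o json) and matches on the AWS error text quoted in the conditions above. The function names are made up for illustration and are not part of any provider tooling.

```python
# Substring taken from the AWS error quoted in the conditions above.
AUTH_TOKEN_ERROR = (
    "The AUTH token modification is only supported when encryption-in-transit"
)


def _normalize(text):
    # Condition messages may be line-wrapped, so collapse all whitespace runs.
    return " ".join(text.split())


def affected_replication_groups(resource_list):
    """Return names of resources whose failed conditions carry the AUTH token error.

    `resource_list` is the parsed JSON of a Kubernetes List object
    (a dict with an "items" array), as produced by `kubectl get ... -o json`.
    """
    affected = []
    for item in resource_list.get("items", []):
        conditions = item.get("status", {}).get("conditions", [])
        for cond in conditions:
            message = _normalize(cond.get("message", ""))
            if cond.get("status") == "False" and AUTH_TOKEN_ERROR in message:
                affected.append(item["metadata"]["name"])
                break  # one matching condition is enough for this resource
    return affected
```

Loading the dump with json.load and passing the result to affected_replication_groups should give the list of names to look at. It only checks the condition text, so it will miss resources that fail for other reasons.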


turkenf · 2024-06-10T22:18:54Z

Hi @mihaelabalas84,

Thank you for raising this issue. Kindly consider the following:

- please add a title, briefly state the problem/bug, and indicate which family provider is causing the problem
- check the versions and make sure you wrote them correctly
- add explicit reproduction steps so we can reproduce the issue again

mihaelabalas84 · 2024-06-11T07:43:32Z

Done. Sorry for the mess.

caiofralmeida · 2024-06-19T11:57:43Z

The same issue also occurs in version 1.3.1:

async update failed: failed to update the resource: [{0 changing auth_token for ElastiCache Replication Group (kafka-operator): InvalidParameterValue: The AUTH token modification is only supported when encryption-in-transit is enabled.

I noticed that version v1.6.1 adds a new field, autoGenerateAuthToken, to disable this behavior.
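
If that field works as described, a merge patch setting it could be built roughly as below. This is only a sketch: the field name comes from the comment above, but its placement under spec.forProvider and its boolean type are assumptions that have not been checked against the v1.6.1 CRD, and the helper name is made up.

```python
import json


def auth_token_patch(auto_generate=False):
    """Build a JSON merge patch that sets autoGenerateAuthToken.

    Assumption: the field lives under spec.forProvider and takes a boolean.
    """
    patch = {"spec": {"forProvider": {"autoGenerateAuthToken": auto_generate}}}
    return json.dumps(patch)
```

The resulting string could then be passed to kubectl patch with --type merge -p against the affected ReplicationGroup, on a provider version that actually has the field.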

chlunde · 2024-06-20T09:01:29Z

I wonder if it is related to the introduction of hashicorp/terraform-provider-aws@0b7e4ba#diff-5d55dcf3aa8ffba3437fb3ff6b7a96b74c9f9196d47dbb4bb63369259cc083bc a few releases back.

I think if someone can install the old provider (1.1.4?) in a lab cluster, set up a cluster without auth, run kubectl get -o yaml --show-managed-fields=true, then upgrade to >= 1.3.1 and run the same command again, maybe we get a hint?

You are running without any auth, right? The AWS API has an explicit field for that, but the Terraform and Crossplane providers do not: https://docs.aws.amazon.com/AmazonElastiCache/latest/red-ug/in-transit-encryption-disable.html

Same as #1370
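
The before/after comparison suggested above could be sketched in Python roughly as follows, assuming the two kubectl get -o yaml --show-managed-fields=true outputs were saved as text. The helper names and the file labels in the diff header are made up for illustration.

```python
import difflib


def diff_dumps(before_text, after_text):
    """Return a unified diff (list of lines) between two YAML dumps."""
    return list(
        difflib.unified_diff(
            before_text.splitlines(),
            after_text.splitlines(),
            fromfile="before-upgrade.yaml",
            tofile="after-upgrade.yaml",
            lineterm="",
        )
    )


def lines_mentioning(diff_lines, needle="auth"):
    """Keep only added or removed lines that mention `needle` (case-insensitive)."""
    return [
        line
        for line in diff_lines
        if line.startswith(("+", "-"))
        and not line.startswith(("+++", "---"))  # skip the file header lines
        and needle.lower() in line.lower()
    ]
```

Filtering the diff on "auth" is just a guess at where the hint would show up; looking at the full diff, including managedFields, is probably more informative.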

github-actions · 2024-09-19T04:32:24Z

This provider repo does not have enough maintainers to address every issue. Since there has been no activity in the last 90 days it is now marked as stale. It will be closed in 14 days if no further activity occurs. Leaving a comment starting with /fresh will mark this issue as not stale.

github-actions · 2024-10-03T04:34:35Z

This issue is being closed since there has been no activity for 14 days since marking it as stale. If you still need help, feel free to comment or reopen the issue!

mihaelabalas84 added the bug and needs:triage labels Jun 10, 2024

mihaelabalas84 changed the title ~~[Bug]:~~ [Bug]: Replicationgroup.elasticache.aws.upbound.io in error after upgrade provider from 1.1.0 to 1.1.4 Jun 11, 2024

mihaelabalas84 changed the title ~~[Bug]: Replicationgroup.elasticache.aws.upbound.io in error after upgrade provider from 1.1.0 to 1.1.4~~ [Bug]: Replicationgroup.elasticache.aws.upbound.io in async after upgrade provider from 1.1.0 to 1.1.4 Jun 11, 2024

github-actions bot added the stale label Sep 19, 2024

github-actions bot closed this as not planned Oct 3, 2024
