(aws-cdk-lib/aws-glue): Cannot update AWS Glue schema after the first version is deployed #25129

Open
DanazSdiq opened this issue Apr 14, 2023 · 2 comments
Labels
@aws-cdk/aws-glue Related to AWS Glue blocked Work is blocked on this issue for this codebase. Other labels or comments may indicate why. bug This issue is a bug. effort/medium Medium work item – several days of effort needs-cfn This issue is waiting on changes to CloudFormation before it can be addressed. p2

Comments

@DanazSdiq

Describe the bug

I have an existing schema that is defined in JSON format and deployed using AWS CDK. The first deployment succeeds. However, when I update the schema (for example, by adding a new column) and re-deploy the stack, I get an error saying that the schema already exists.

Expected Behavior

I want to be able to evolve the schema and add new columns/modify existing ones without throwing an error.

Current Behavior

I am getting the following error:

4:43:53 PM | UPDATE_FAILED        | AWS::Glue::Schema  | SchemasUsers
Resource handler returned message: "Resource of type 'AWS::Glue::Schema' with identifier 'users' already exists." (RequestToken: 4b0fbe99-10d5-5a75-84e3-5efd3208514b, HandlerErrorCode: AlreadyExists)

Reproduction Steps

I have defined a schema in JSON format and I am trying to release new updates of the same schema after it is deployed.

The schema I am trying to deploy the first time:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "$id": "https://random.com/users.json#",
  "title": "users",
  "type": "object",
  "properties": {
    "id": {"type": "string", "format": "uuid"},
    "payload": {"type": "object"}
  }
}

After creating the schema for the very first time and deploying it, add a new column to properties, like:

"name": {"type": "string"}

I am deploying this schema with the following script:

import { Construct } from "constructs";
import { Stack, StackProps } from "aws-cdk-lib";
import * as glue from "aws-cdk-lib/aws-glue";

import * as users from "../../schemas/users.json";

export class Schemas extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    const namespace = "Schemas";

    new glue.CfnSchema(this, `${namespace}_Users`, {
      compatibility: "NONE",
      dataFormat: "JSON",
      name: "users",
      schemaDefinition: JSON.stringify(users),
      checkpointVersion: {
        versionNumber: 1,
      },
    });
  }
}

I then run the following command to deploy the stack:

npm run cdk deploy Schemas

My package.json:

{
  "name": "cdk",
  "version": "0.1.0",
  "bin": {
    "cdk": "bin/cdk.js"
  },
  "scripts": {
    "build": "tsc",
    "watch": "tsc -w",
    "test": "jest",
    "cdk": "cdk"
  },
  "devDependencies": {
    "@types/jest": "^27.5.0",
    "@types/node": "^10.17.27",
    "@types/prettier": "2.6.0",
    "aws-cdk": "^2.73.0",
    "jest": "^27.5.1",
    "ts-jest": "^27.1.4",
    "ts-node": "^10.7.0",
    "typescript": "~3.9.7"
  },
  "dependencies": {
    "@aws-cdk/aws-glue-alpha": "^2.73.0-alpha.0",
    "aws-cdk-lib": "^2.73.0",
    "constructs": "^10.0.0"
  }
}

Possible Solution

No response

Additional Information/Context

No response

CDK CLI Version

8.5.5

Framework Version

No response

Node.js Version

16.16.0

OS

Mac OS

Language

Typescript

Language Version

3.9.7

Other information

No response

@DanazSdiq DanazSdiq added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Apr 14, 2023
@github-actions github-actions bot added the @aws-cdk/aws-glue Related to AWS Glue label Apr 14, 2023
@pahud pahud added needs-cfn This issue is waiting on changes to CloudFormation before it can be addressed. blocked Work is blocked on this issue for this codebase. Other labels or comments may indicate why. effort/medium Medium work item – several days of effort p2 and removed needs-triage This issue or PR still needs to be triaged. labels Apr 17, 2023
@pahud
Contributor

pahud commented Apr 17, 2023

Updating schemaDefinition requires CloudFormation to replace the resource, and the replacement is created with the same schema name ("users") while the old one still exists, which causes the duplication error. I believe this could be a bug or limitation in CloudFormation. As a workaround, you will need to change the name property as well, e.g. from users to new-users. I have reported this to the CFN team at aws-cloudformation/cloudformation-coverage-roadmap#1598

@ashishdhingra
Contributor

Refer to aws-cloudformation/cloudformation-coverage-roadmap#1598 (comment) for details.
To add a new schema version, you need to use AWS::Glue::SchemaVersion. The schema definition used in AWS::Glue::Schema is the initial schema version and cannot be updated, because schema versions are immutable.
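
To make the suggestion above concrete, here is a minimal sketch of the resulting template shape: the first definition stays frozen in AWS::Glue::Schema, and every later definition becomes its own AWS::Glue::SchemaVersion. The helper name buildGlueSchemaResources, the TemplateResource type, and the generated logical IDs are hypothetical, and the DependsOn chain is only an assumption about keeping versions registered in order. It also omits the checkpointVersion setting from the original script.

```typescript
// Hypothetical helper: builds plain CloudFormation resource objects.
// It does not call CDK or AWS; it only illustrates the resource layout.
type TemplateResource = {
  Type: string;
  Properties: Record<string, unknown>;
  DependsOn?: string[];
};

// definitions[0] is the initial schema version and must stay unchanged after
// the first deployment; each later entry becomes an AWS::Glue::SchemaVersion.
function buildGlueSchemaResources(
  logicalId: string,
  schemaName: string,
  definitions: object[],
): Record<string, TemplateResource> {
  if (definitions.length === 0) {
    throw new Error("at least one schema definition is required");
  }

  const resources: Record<string, TemplateResource> = {
    [logicalId]: {
      Type: "AWS::Glue::Schema",
      Properties: {
        Name: schemaName,
        DataFormat: "JSON",
        Compatibility: "NONE",
        SchemaDefinition: JSON.stringify(definitions[0]),
      },
    },
  };

  definitions.slice(1).forEach((definition, index) => {
    const versionNumber = index + 2;
    // Assumption: chain the versions so they are registered in list order.
    const previousId =
      index === 0 ? logicalId : `${logicalId}V${versionNumber - 1}`;
    resources[`${logicalId}V${versionNumber}`] = {
      Type: "AWS::Glue::SchemaVersion",
      Properties: {
        // Reference the existing schema by ARN instead of redefining it.
        Schema: { SchemaArn: { "Fn::GetAtt": [logicalId, "Arn"] } },
        SchemaDefinition: JSON.stringify(definition),
      },
      DependsOn: [previousId],
    };
  });

  return resources;
}

// Usage: v1 is the schema from the report, v2 adds the "name" column.
const usersV1 = {
  title: "users",
  type: "object",
  properties: {
    id: { type: "string", format: "uuid" },
    payload: { type: "object" },
  },
};
const usersV2 = {
  ...usersV1,
  properties: { ...usersV1.properties, name: { type: "string" } },
};
const usersResources = buildGlueSchemaResources("SchemasUsers", "users", [
  usersV1,
  usersV2,
]);
```

With the L1 constructs from the reproduction script, the equivalent would presumably be a glue.CfnSchemaVersion whose schema property points at the existing CfnSchema (for example via its ARN attribute), while the schemaDefinition passed to CfnSchema itself is left untouched.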
