
(pipelines): bundle assets in the assets project instead of synth #19157

Open
pedrovanzella opened this issue Feb 25, 2022 · 8 comments
Labels
@aws-cdk/pipelines CDK Pipelines library effort/large Large work item – several weeks of effort feature-request A feature should be added or improved. p1


@pedrovanzella

Description

Whenever you use a Docker image in a stack deployed by the pipelines module, it gets built in the assets project. However, if you use an asset bundled in Docker (for example, via a lambda_nodejs_function or an s3_deployment), the asset gets bundled in the synth project.

Ideally all assets would be built in the same project, so they can be built in parallel.

Use Case

I want to use s3_deployment inside a pipeline to deploy a React app. This is already possible, but the bundling happens in the synth step rather than in parallel with the other assets, which adds 15+ minutes to every deploy.

Proposed Solution

In the synth step, just copy or zip the code from the asset, and do the bundling in the assets step. This could be a feature flag or an option, so as not to break any workflow that depends on the current behavior.
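To make the proposal concrete, here is a minimal sketch of what the synth-time step could do. It is purely illustrative (none of these names exist in the CDK): copy the raw asset source into the cloud assembly untouched, so the expensive Docker bundling can run later in the assets CodeBuild project instead.

```typescript
import * as fs from 'fs';
import * as path from 'path';

// Hypothetical synth-time step for the proposal above: instead of running
// the Docker bundling command, copy the raw asset source into the cloud
// assembly as-is. The bundling command itself would be recorded in the
// asset manifest and executed later by the assets CodeBuild project.
function copySourceToAssembly(sourceDir: string, assemblyDir: string): string[] {
  const copied: string[] = [];
  const walk = (dir: string) => {
    for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
      const src = path.join(dir, entry.name);
      if (entry.isDirectory()) {
        walk(src);
      } else {
        const rel = path.relative(sourceDir, src);
        const dest = path.join(assemblyDir, rel);
        fs.mkdirSync(path.dirname(dest), { recursive: true });
        fs.copyFileSync(src, dest);
        copied.push(rel);
      }
    }
  };
  walk(sourceDir);
  return copied;
}
```

Since no Docker container runs here, this step is cheap, and all the real bundling work would move to the assets project where it can be parallelized.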

I might simply be missing something here; there could be an escape hatch or some other way to control this. In that case, the solution might just be to document it.

Other information

This is what I've added to my stack:

new s3deploy.BucketDeployment(this, 'DeployWithInvalidation', {
  sources: [
    s3deploy.Source.asset('projects/frontend', {
      bundling: {
        image: DockerImage.fromRegistry(
          'public.ecr.aws/docker/library/node:lts-alpine'
        ),
        command: [
          'sh',
          '-c',
          'yarn install && yarn full-build && cp -R ./build/* /asset-output/',
        ],
        environment: {
          // redacted
          REACT_APP_API_URL: props.apiURL,
        },
      },
    }),
  ],
  destinationBucket: siteBucket,
  distribution,
  distributionPaths: ['/*'],
});

Acknowledge

  • I may be able to implement this feature request
  • This feature might incur a breaking change
@pedrovanzella pedrovanzella added feature-request A feature should be added or improved. needs-triage This issue or PR still needs to be triaged. labels Feb 25, 2022
@github-actions github-actions bot added the @aws-cdk/pipelines CDK Pipelines library label Feb 25, 2022
@rix0rrr
Contributor

rix0rrr commented Feb 26, 2022

Bundling currently happens during synth because we need to have the produced output files in order to calculate their hashes (which means we need to have done the work already).

I'm not happy with how bundling works, and have thoughts in my head for how to change it, but I can't promise it's going to happen any time soon.

@rix0rrr rix0rrr added effort/large Large work item – several weeks of effort p2 and removed needs-triage This issue or PR still needs to be triaged. labels Feb 26, 2022
@rix0rrr rix0rrr removed their assignment Feb 26, 2022
@pedrovanzella
Author

Couldn't we just hash the source instead (as mentioned here)?
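As a sketch of what "hash the source" means: derive the asset hash from the input files alone, so the hash is known before any bundling runs. (aws-cdk-lib does expose a related option, `assetHashType: AssetHashType.SOURCE`, on asset props, though by itself that does not move the bundling out of synth.) The function below is a self-contained illustration, not CDK's actual implementation.

```typescript
import * as crypto from 'crypto';
import * as fs from 'fs';
import * as path from 'path';

// Minimal sketch of source-based hashing: hash the input directory's
// relative paths and file contents, so no bundling output is needed
// before the asset hash can be computed.
function hashSourceDir(sourceDir: string): string {
  const hash = crypto.createHash('sha256');
  const walk = (dir: string) => {
    // Sort entries so the hash is deterministic across machines.
    for (const name of fs.readdirSync(dir).sort()) {
      const full = path.join(dir, name);
      if (fs.statSync(full).isDirectory()) {
        walk(full);
      } else {
        hash.update(path.relative(sourceDir, full)).update('\0');
        hash.update(fs.readFileSync(full)).update('\0');
      }
    }
  };
  walk(sourceDir);
  return hash.digest('hex');
}
```

The trade-off is that a source hash changes whenever any input file changes, even if the bundled output would have been identical.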

@adri1wald

adri1wald commented May 21, 2022

I have also run into this issue. Is there any way to avoid it in user land by changing when the deployment to S3 is done?

I've been trying to implement something along the lines of a custom Step that contains a build action for my webapp (it would handle the build, deployment to S3, and CloudFront invalidation). However, it is dependent on a CodeBuild project within the webapp stack (part of my Stage), since this is where the S3 bucket and CloudFront distribution are defined.

It's not clear to me how I can untangle that mess of dependencies, or whether it is even possible.

Maybe another approach could be to use a Lambda DockerImageFunction with a CloudFormation custom resource? E.g.

  • React app, but there is a Dockerfile and a lambda handler
  • Dockerfile builds the react app into the image but runs as a lambda
  • A custom cloudformation resource that triggers the lambda
  • Provide the custom resource with the image uri so it updates when the image changes, e.g.
function getImageUri(lambda: lambda.DockerImageFunction): string | undefined {
  const cfnFunction = lambda.node.defaultChild as lambda.CfnFunction
  const code = cfnFunction.code as lambda.CfnFunction.CodeProperty
  return code.imageUri
}
  • The lambda when invoked copies the react build to s3 bucket and creates the cloudfront invalidation

Solution?

I implemented the second approach and it seems to work. It's definitely a bit out there, but I've copied the code below in case anyone finds it useful. Would love to get some feedback!

Webapp deployment stack

import * as path from 'path'
import { Construct } from 'constructs'
import { Stack, StackProps, RemovalPolicy, Duration, CustomResource } from 'aws-cdk-lib'
import * as cf from 'aws-cdk-lib/aws-cloudfront'
import * as cm from 'aws-cdk-lib/aws-certificatemanager'
import * as cr from 'aws-cdk-lib/custom-resources'
import * as iam from 'aws-cdk-lib/aws-iam'
import * as lambda from 'aws-cdk-lib/aws-lambda'
import * as origins from 'aws-cdk-lib/aws-cloudfront-origins'
import * as route53 from 'aws-cdk-lib/aws-route53'
import * as route53Targets from 'aws-cdk-lib/aws-route53-targets'
import * as s3 from 'aws-cdk-lib/aws-s3'
import { TApiConfig, TWebappConfig } from '../config'
import { extractTopLevelDomain } from '../utils'

type WebappStackProps = StackProps & {
  /**
   * Api configuration
   */
  apiConfig: TApiConfig
  /**
   * Webapp configuration
   */
  webappConfig: TWebappConfig
}

export class WebappStack extends Stack {
  constructor(scope: Construct, id: string, props: WebappStackProps) {
    super(scope, id, props)

    const { apiConfig, webappConfig } = props

    const zone = route53.HostedZone.fromLookup(this, 'HostedZone', {
      domainName: extractTopLevelDomain(webappConfig.domainName)
    })

    const certificate = new cm.DnsValidatedCertificate(this, 'Certificate', {
      hostedZone: zone,
      domainName: webappConfig.domainName,
      region: 'us-east-1'
    })

    const bucket = new s3.Bucket(this, 'Bucket', {
      autoDeleteObjects: true,
      blockPublicAccess: s3.BlockPublicAccess.BLOCK_ALL,
      removalPolicy: RemovalPolicy.DESTROY
    })

    const originAccessIdentity = new cf.OriginAccessIdentity(this, 'OriginAccessIdentity')
    bucket.grantRead(originAccessIdentity)

    const distribution = new cf.Distribution(this, 'Distribution', {
      domainNames: [webappConfig.domainName],
      certificate,
      defaultBehavior: {
        origin: new origins.S3Origin(bucket, { originAccessIdentity }),
        viewerProtocolPolicy: cf.ViewerProtocolPolicy.REDIRECT_TO_HTTPS
      },
      minimumProtocolVersion: cf.SecurityPolicyProtocol.TLS_V1_2_2021,
      defaultRootObject: 'index.html',
      errorResponses: [
        {
          httpStatus: 404,
          responseHttpStatus: 200,
          responsePagePath: '/index.html',
          ttl: Duration.minutes(5)
        },
        {
          httpStatus: 403,
          responseHttpStatus: 200,
          responsePagePath: '/index.html',
          ttl: Duration.minutes(5)
        }
      ]
    })

    new route53.ARecord(this, 'Alias', {
      zone,
      recordName: webappConfig.domainName,
      target: route53.RecordTarget.fromAlias(new route53Targets.CloudFrontTarget(distribution))
    })

    const deploymentLambda = new lambda.DockerImageFunction(this, 'DeploymentLambda', {
      code: lambda.DockerImageCode.fromImageAsset(path.join(__dirname, '..', '..', 'services', 'webapp'), {
        buildArgs: {
          REACT_APP_SERVER_BASE_URL: `https://${apiConfig.domainName}`
        }
      }),
      timeout: Duration.minutes(10),
      memorySize: 512,
      environment: {
        BUCKET_NAME: bucket.bucketName,
        DISTRIBUTION_ID: distribution.distributionId
      }
    })

    bucket.grantReadWrite(deploymentLambda)
    deploymentLambda.addToRolePolicy(
      new iam.PolicyStatement({
        actions: [
          'cloudfront:CreateInvalidation'
        ],
        resources: [
          `arn:aws:cloudfront::${this.account}:distribution/${distribution.distributionId}`
        ]
      })
    )

    const deploymentProvider = new cr.Provider(this, 'DeploymentProvider', {
      onEventHandler: deploymentLambda
    })

    new CustomResource(this, 'DeploymentResource', {
      serviceToken: deploymentProvider.serviceToken,
      properties: {
        refresh: getImageUri(deploymentLambda)
      }
    })
  }
}

// TODO this is _super_ hacky, but...
// As long as this takes a DockerImageFunction, should be no problemo
function getImageUri(lambda: lambda.DockerImageFunction): string | undefined {
  const cfnFunction = lambda.node.defaultChild as lambda.CfnFunction
  const code = cfnFunction.code as lambda.CfnFunction.CodeProperty
  return code.imageUri
}

Dockerfile in React app folder

#*---------- Base ----------
FROM public.ecr.aws/lambda/nodejs:14 AS base

WORKDIR "${LAMBDA_TASK_ROOT}"

#*---------- React App Builder ----------
FROM base AS builder

COPY package*.json tsconfig.json ./

RUN npm config set ignore-engines true
RUN npm ci

COPY src ./src
COPY public ./public

RUN npm run build

#*---------- Lambda ----------
FROM base AS lambda

COPY deployment-lambda/package*.json deployment-lambda/tsconfig.json ./

RUN npm config set ignore-engines true
RUN npm ci

COPY deployment-lambda/src ./src

RUN npm run build

RUN npm prune --production # Remove dev dependencies

#*---------- Release ----------
FROM base AS release

COPY --from=lambda "${LAMBDA_TASK_ROOT}/package*.json" ./
COPY --from=lambda "${LAMBDA_TASK_ROOT}/node_modules" ./node_modules
COPY --from=lambda "${LAMBDA_TASK_ROOT}/dist" ./dist
COPY --from=builder "${LAMBDA_TASK_ROOT}/build" ./build

CMD [ "dist/index.handler" ]

Deployment lambda package in React app deployment-lambda subfolder

import { promises as fs } from 'fs'
import * as path from 'path'
import * as response from 'cfn-response'
import * as mime from 'mime-types'
import { PutObjectCommand, S3Client } from '@aws-sdk/client-s3'
import { CloudFrontClient, CreateInvalidationCommand } from '@aws-sdk/client-cloudfront'

type Params = Parameters<typeof response.send>
type Event = Params[0]
type Context = Params[1]

const BUCKET_NAME = getenv('BUCKET_NAME', true)
const DISTRIBUTION_ID = getenv('DISTRIBUTION_ID', true)

export async function handler(event: Event, context: Context) {
  console.log('Received event: ', JSON.stringify(event, null, 2))
  if (event.RequestType === 'Delete') return
  await deployReactApp()
    .then(() => response.send(event, context, 'SUCCESS', { message: 'ok' }))
    .catch((error) => response.send(event, context, 'FAILED', { message: error.message }))
}

async function deployReactApp() {
  const buildDir = path.join(__dirname, '..', 'build')
  await deployBuild(buildDir)
  await createInvalidation()
}

async function deployBuild(buildDir: string) {
  const s3 = new S3Client({})
  for await (const [currPath, isDir] of getEntries(buildDir)) {
    if (isDir) continue
    const ext = path.extname(currPath)
    const key = currPath.substring(buildDir.length + 1)
    const body = await fs.readFile(currPath)
    const contentType = mime.lookup(ext)
    const command = new PutObjectCommand({
      Bucket: BUCKET_NAME,
      Key: key,
      Body: body,
      ContentType: contentType || undefined
    })
    await s3.send(command)
  }
}

async function createInvalidation() {
  const cf = new CloudFrontClient({})
  const command = new CreateInvalidationCommand({
    DistributionId: DISTRIBUTION_ID,
    InvalidationBatch: {
      CallerReference: new Date().toISOString(),
      Paths: {
        Quantity: 1,
        Items: ['/*']
      }
    }
  })
  await cf.send(command)
}

export function getenv(key: string): string | undefined
export function getenv(key: string, defaultValue: string): string
export function getenv(key: string, mustExist: false): string | undefined
export function getenv(key: string, mustExist: true): string
export function getenv(key: string, mustExist: boolean | string = false) {
  const value = process.env[key]
  if (typeof mustExist === 'string') {
    return value || mustExist
  }
  if (!mustExist) {
    return value
  }
  if (value == null) {
    throw new Error(`Missing value for environment variable ${key}`)
  }
  return value
}

async function* getEntries(dirPath: string): AsyncGenerator<[string, boolean]> {
  const dirents = await fs.readdir(dirPath, { withFileTypes: true })
  for (const dirent of dirents) {
    const fileOrDir = path.join(dirPath, dirent.name)
    const isDir = dirent.isDirectory()
    yield [ fileOrDir, isDir ]
    if (isDir) {
      yield* getEntries(fileOrDir)
    }
  }
}

@github-actions

This issue has received a significant amount of attention so we are automatically upgrading its priority. A member of the community will see the re-prioritization and provide an update on the issue.

@github-actions github-actions bot added p1 and removed p2 labels May 22, 2022
@kamzil

kamzil commented Jun 16, 2022

I'm having this same issue. @rix0rrr, care to share the thoughts you've had regarding this? Could hashing the source work instead, as the OP suggested?

@Naumel
Contributor

Naumel commented Aug 31, 2022

Hello all,
We have noted the community engagement on this issue and we understand its importance.
Since there are a number of such items, we need to make sure that we’re working on the most important ones.
This means that the issue will be moved into the backlog; it will be checked against the other projects for complexity, risk and effort versus benefit.
We will keep posting updates as they become available!

@Richardmbs12

I believe this will be valuable for my use case as well, and for anyone running multiple native Java functions/stacks.

We run Java Spring Boot functions compiled with GraalVM. Each Maven module/function takes about 10-15 minutes to compile (a lot of time, but that's how long native Java/Spring Native takes). We have 10+ Maven modules/functions.

I've split every function into its own stack to be able to run them in parallel with CDK Pipelines' parallel stage/wave capability. My main hope was that the asset building would be done in parallel for the deployment, making everything build in 10-15 minutes instead of 150.

I then saw that cdk synth runs each function's asset bundling (via bundling options) sequentially. It takes forever to build and deploy, so I was sadly left without an easy way to run asset bundling in parallel.

@Hi-Fi
Contributor

Hi-Fi commented Oct 1, 2022

Just to think out loud, this would require:

  1. A new asset type (e.g. LambdaAsset) in cloud-assembly-schema that works with bundling (the commands etc. needed to bundle in a separate step). Or just extending FileAsset, since the props for bundling already exist in s3_assets.
  2. cdk-assets to build that new asset type correctly from its definition
  3. aws-s3-assets to write the bundling information into the manifest
    • It might be easier at first to go with a flag that forces the build to run inside a container, since representing local bundling in the manifest could be hard to implement
  4. A flag for pipelines to use manifest-based bundling instead of even attempting the local one.

The type itself is a bit of a mix of the existing FileAsset and DockerImageAsset: the output still has to be a FileDestination, but the input is closer to a DockerImageSource.

The only possible breaking change might come from step 3, as the first two are just additions that can be made separately, as long as there's a clear understanding of the props needed.

For the pipelines part, it would probably be clearest to use the bundling image directly in CodeBuild rather than even trying to build with the default image (e.g. Standard:5). But that probably doesn't need much thought until the pipelines part is designed.
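To illustrate steps 1-2, the new asset type's manifest entry could be sketched as below. All names here are hypothetical, not part of cloud-assembly-schema; the point is just the shape: a DockerImageSource-like input (image plus command) with a FileDestination-like output.

```typescript
// Hypothetical shape for a bundled-file asset entry in the cloud assembly
// manifest (illustrative only, not real cloud-assembly-schema types).
interface BundledFileSource {
  path: string;              // asset source directory, relative to the assembly
  bundlingImage: string;     // Docker image to run the bundling in
  bundlingCommand: string[]; // command executed inside that image
  environment?: Record<string, string>;
}

interface FileDestination {
  bucketName: string;
  objectKey: string;
  assumeRoleArn?: string;
}

interface BundledFileAssetEntry {
  source: BundledFileSource;
  destinations: Record<string, FileDestination>;
}

// Example of what cdk-assets (step 2) would consume: it runs the bundling
// command inside the image, zips /asset-output, and uploads to the destination.
const example: BundledFileAssetEntry = {
  source: {
    path: 'asset.1234abcd',
    bundlingImage: 'public.ecr.aws/docker/library/node:lts-alpine',
    bundlingCommand: ['sh', '-c', 'yarn install && yarn build && cp -R build/* /asset-output/'],
  },
  destinations: {
    'current_account-current_region': {
      bucketName: 'cdk-assets-bucket',
      objectKey: '1234abcd.zip',
    },
  },
};
```

Because the bundling command and image live in the manifest, the assets CodeBuild project could then execute every such entry in parallel, which is exactly what this issue asks for.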
