Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

modernization #8

Open
wants to merge 2 commits into
base: airflow-v2
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
26 changes: 22 additions & 4 deletions airflow/Dockerfile
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
FROM apache/airflow:1.10.11
FROM apache/airflow:2.2.2-python3.8


ENV AIRFLOW_HOME=/usr/local/airflow

Expand All @@ -8,9 +9,24 @@ RUN apt-get update && apt-get install -y python3-pip \
libcurl4-gnutls-dev \
librtmp-dev \
python3-dev \
libpq-dev

RUN python3 -m pip install PyGreSQL argcomplete pycurl
python3-setuptools \
libpq-dev \
gcc \
build-essential \
g++ \
git-all \
unixodbc-dev \
apt-utils \
apt-transport-https \
debconf-utils \
vim \
telnet
Comment on lines +12 to +23
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of bundling in too many packages, consider moving them to optional requirements.txt, which user can fill in.
If needed, update ReadMe with details

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't requirements.txt a Python thing?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. And all these packages are being installed using pip.
I should have included Line #11. It looks like this..

RUN python3 -m pip install PyGreSQL argcomplete pycurl
    python3-setuptools \
    .....


RUN python3 -m pip install --upgrade pip

RUN python3 -m pip install awscli
RUN export PATH=~/.local/bin:$PATH; \
aws --version

COPY ./config/* /
COPY ./dags ${AIRFLOW_HOME}/dags
Expand All @@ -21,6 +37,8 @@ EXPOSE 8080

USER airflow

RUN python3 -m pip install --user argcomplete pycurl concurrent-log-handler

WORKDIR ${AIRFLOW_HOME}

# ENTRYPOINT ["/entrypoint.sh"]
5 changes: 2 additions & 3 deletions airflow/config/webserver_entry.sh
Original file line number Diff line number Diff line change
@@ -1,11 +1,10 @@
#!/usr/bin/env bash

set -Eeuxo pipefail

airflow initdb
airflow db init
sleep 5

airflow create_user -r Admin -u admin -f FirstName -l LastName -p ${ADMIN_PASS} -e [email protected]
airflow users create -r Admin -u admin -f FirstName -l LastName -p ${ADMIN_PASS} -e [email protected]
sleep 5

airflow webserver
3 changes: 2 additions & 1 deletion airflow/config/worker_entry.sh
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
#!/usr/bin/env bash

set -Eeuxo pipefail

sleep 30
airflow worker
airflow celery worker
6 changes: 3 additions & 3 deletions app/config.ts
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
import { DBConfig } from "./constructs/rds";
import {InstanceClass, InstanceSize, InstanceType} from "@aws-cdk/aws-ec2";
import { RetentionDays } from "@aws-cdk/aws-logs";
import {InstanceClass, InstanceSize, InstanceType} from "aws-cdk-lib/aws-ec2";
import { RetentionDays } from "aws-cdk-lib/aws-logs";

export interface AirflowTaskConfig {
readonly cpu: number;
Expand Down Expand Up @@ -66,7 +66,7 @@ export const defaultDBConfig: DBConfig = {
dbName: "farflow",
port: 5432,
masterUsername: "airflow",
instanceType: InstanceType.of(InstanceClass.T2, InstanceSize.SMALL),
instanceType: InstanceType.of(InstanceClass.T3, InstanceSize.SMALL),
allocatedStorageInGB: 25,
backupRetentionInDays: 30
};
Expand Down
15 changes: 8 additions & 7 deletions app/constructs/airflow-construct.ts
Original file line number Diff line number Diff line change
@@ -1,10 +1,11 @@
import {CfnOutput, Construct} from "@aws-cdk/core";
import { IVpc } from "@aws-cdk/aws-ec2";

import ecs = require('@aws-cdk/aws-ecs');
import ec2 = require("@aws-cdk/aws-ec2");
import { DockerImageAsset } from '@aws-cdk/aws-ecr-assets';
import { FargateTaskDefinition } from '@aws-cdk/aws-ecs';
import { CfnOutput } from "aws-cdk-lib/core";
import { Construct } from 'constructs';
import { IVpc } from "aws-cdk-lib/aws-ec2";

import ecs = require('aws-cdk-lib/aws-ecs');
import ec2 = require("aws-cdk-lib/aws-ec2");
import { DockerImageAsset } from 'aws-cdk-lib/aws-ecr-assets';
import { FargateTaskDefinition } from 'aws-cdk-lib/aws-ecs';

import {airflowTaskConfig, ContainerConfig} from "../config";
import { ServiceConstruct } from "./service-construct";
Expand Down
13 changes: 7 additions & 6 deletions app/constructs/dag-tasks.ts
Original file line number Diff line number Diff line change
@@ -1,9 +1,10 @@
import { Construct } from "@aws-cdk/core";
import {AwsLogDriver, } from "@aws-cdk/aws-ecs";
import { RetentionDays } from "@aws-cdk/aws-logs";
import {IVpc, ISecurityGroup, Port} from "@aws-cdk/aws-ec2";
import efs = require('@aws-cdk/aws-efs');
import { LogGroup } from '@aws-cdk/aws-logs';

import { Construct } from 'constructs';
import {AwsLogDriver, } from "aws-cdk-lib/aws-ecs";
import { RetentionDays } from "aws-cdk-lib/aws-logs";
import {IVpc, ISecurityGroup, Port} from "aws-cdk-lib/aws-ec2";
import efs = require('aws-cdk-lib/aws-efs');
import { LogGroup } from 'aws-cdk-lib/aws-logs';

import { AirflowDagTaskDefinition, EfsVolumeInfo } from "./task-construct"

Expand Down
15 changes: 8 additions & 7 deletions app/constructs/rds.ts
Original file line number Diff line number Diff line change
@@ -1,16 +1,17 @@
import { Duration, Construct } from "@aws-cdk/core";
import { Duration} from "aws-cdk-lib/core";
import { Construct } from 'constructs';
import {
DatabaseInstance,
DatabaseInstanceEngine, PostgresEngineVersion,
StorageType
} from "@aws-cdk/aws-rds";
import { ISecret, Secret } from "@aws-cdk/aws-secretsmanager";
} from "aws-cdk-lib/aws-rds";
import { ISecret, Secret } from "aws-cdk-lib/aws-secretsmanager";
import {
InstanceType,
ISecurityGroup,
IVpc,
SubnetType
} from "@aws-cdk/aws-ec2";
} from "aws-cdk-lib/aws-ec2";

import { defaultDBConfig } from "../config";

Expand Down Expand Up @@ -60,13 +61,13 @@ export class RDSConstruct extends Construct {

this.rdsInstance = new DatabaseInstance(this, "RDSInstance", {
engine: DatabaseInstanceEngine.postgres({
version: PostgresEngineVersion.VER_12_4
version: PostgresEngineVersion.VER_13_4
}),
instanceType: defaultDBConfig.instanceType,
instanceIdentifier: defaultDBConfig.dbName,
vpc: props.vpc,
securityGroups: [props.defaultVpcSecurityGroup],
vpcPlacement: { subnetType: SubnetType.PRIVATE },
vpcSubnets: { subnetType: SubnetType.PRIVATE_WITH_NAT },
storageEncrypted: true,
multiAz: false,
autoMinorVersionUpgrade: false,
Expand Down Expand Up @@ -94,6 +95,6 @@ export class RDSConstruct extends Construct {
endpoint: string,
password: string
): string {
return `postgresql+pygresql://${dbConfig.masterUsername}:${password}@${endpoint}:${dbConfig.port}/${dbConfig.dbName}`;
return `postgresql+psycopg2://${dbConfig.masterUsername}:${password}@${endpoint}:${dbConfig.port}/${dbConfig.dbName}`;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the motive behind this change? Are there any added benefits, as this is used only for metadata?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

psycopg is much more active open source project than pygresql https://github.com/psycopg https://github.com/PyGreSQL so that is why the change was made

}
}
18 changes: 10 additions & 8 deletions app/constructs/service-construct.ts
Original file line number Diff line number Diff line change
@@ -1,12 +1,13 @@
import {CfnOutput, Construct, Duration} from "@aws-cdk/core";
import {IVpc} from "@aws-cdk/aws-ec2";
import {FargatePlatformVersion, FargateTaskDefinition} from '@aws-cdk/aws-ecs';
import { CfnOutput, Duration } from "aws-cdk-lib/core";
import { Construct } from 'constructs';
import {IVpc} from "aws-cdk-lib/aws-ec2";
import { FargatePlatformVersion, FargateTaskDefinition } from 'aws-cdk-lib/aws-ecs';

import {PolicyConstruct} from "../policies";
import {workerAutoScalingConfig} from "../config";
import ecs = require('@aws-cdk/aws-ecs');
import ec2 = require("@aws-cdk/aws-ec2");
import elbv2 = require("@aws-cdk/aws-elasticloadbalancingv2");
import ecs = require('aws-cdk-lib/aws-ecs');
import ec2 = require("aws-cdk-lib/aws-ec2");
import elbv2 = require("aws-cdk-lib/aws-elasticloadbalancingv2");

export interface ServiceConstructProps {
readonly vpc: IVpc;
Expand All @@ -29,14 +30,15 @@ export class ServiceConstruct extends Construct {
policies.managedPolicies.forEach(managedPolicy => props.taskDefinition.taskRole.addManagedPolicy(managedPolicy));
}
if (policies.policyStatements) {
policies.policyStatements.forEach(policyStatement => props.taskDefinition.taskRole.addToPolicy(policyStatement));
policies.policyStatements.forEach(policyStatement => props.taskDefinition.taskRole.addToPrincipalPolicy(policyStatement));
}

// Create Fargate Service for Airflow
this.fargateService = new ecs.FargateService(this, name, {
cluster: props.cluster,
taskDefinition: props.taskDefinition,
securityGroup: props.defaultVpcSecurityGroup,
securityGroups: [props.defaultVpcSecurityGroup,],
enableExecuteCommand: true,
platformVersion: FargatePlatformVersion.VERSION1_4
});
const allowedPorts = new ec2.Port({
Expand Down
13 changes: 8 additions & 5 deletions app/constructs/task-construct.ts
Original file line number Diff line number Diff line change
@@ -1,9 +1,12 @@
import { Construct } from "@aws-cdk/core";
//import { Construct } from "aws-cdk-lib/core";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete commented code from this file and other files as well

import { Construct } from 'constructs';

import ecs = require('aws-cdk-lib/aws-ecs');
import { DockerImageAsset } from 'aws-cdk-lib/aws-ecr-assets';
import { FargateTaskDefinition } from 'aws-cdk-lib/aws-ecs';
import {ManagedPolicy} from "aws-cdk-lib/aws-iam";


import ecs = require('@aws-cdk/aws-ecs');
import { DockerImageAsset } from '@aws-cdk/aws-ecr-assets';
import { FargateTaskDefinition } from '@aws-cdk/aws-ecs';
import {ManagedPolicy} from "@aws-cdk/aws-iam";

export interface AirflowDagTaskDefinitionProps {
readonly taskFamilyName: string;
Expand Down
6 changes: 3 additions & 3 deletions app/farflow.ts
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
import ec2 = require('@aws-cdk/aws-ec2');
import ecs = require('@aws-cdk/aws-ecs');
import cdk = require('@aws-cdk/core');
import ec2 = require('aws-cdk-lib/aws-ec2');
import ecs = require('aws-cdk-lib/aws-ecs');
import cdk = require('aws-cdk-lib/core');
import {RDSConstruct} from "./constructs/rds";
import {AirflowConstruct} from "./constructs/airflow-construct";
import { DagTasks } from './constructs/dag-tasks';
Expand Down
5 changes: 3 additions & 2 deletions app/policies.ts
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
import { Construct } from "@aws-cdk/core";
import { IManagedPolicy, ManagedPolicy, PolicyStatement } from "@aws-cdk/aws-iam";
//import { Construct } from "aws-cdk-lib/core";
import { Construct } from 'constructs';
import { IManagedPolicy, ManagedPolicy, PolicyStatement } from "aws-cdk-lib/aws-iam";

export class PolicyConstruct extends Construct {
public readonly policyStatements?: PolicyStatement[];
Expand Down
Loading