Skip to content
This repository has been archived by the owner on Jun 18, 2020. It is now read-only.

PostgreSQL -> Redshift? #72

Open
joannechan opened this issue Apr 20, 2017 · 1 comment
Open

PostgreSQL -> Redshift? #72

joannechan opened this issue Apr 20, 2017 · 1 comment

Comments

@joannechan
Copy link

joannechan commented Apr 20, 2017

Is there any working sample / template for loading postgresql data onto redshift?
What is the ideal way to handle schema creation, and deleted / updated data?

@joannechan joannechan changed the title Postgres -> Redshift? PostgreSQL -> Redshift? Apr 20, 2017
@jonathanpdiaz
Copy link

Here is a template built with Architect.
What we did is create a RDS to S3 task from scratch.
Then, used the S3 to Redshift from the template.

Here is the JSON definition:

  "objects": [
    {
      "failureAndRerunMode": "CASCADE",
      "resourceRole": "DataPipelineDefaultResourceRole",
      "role": "DataPipelineDefaultRole",
      "pipelineLogUri": "#{myS3LogsPath}",
      "scheduleType": "ONDEMAND",
      "name": "Default",
      "id": "Default"
    },
    {
      "database": {
        "ref": "DatabaseId_WC2j5"
      },
      "name": "DefaultSqlDataNode1",
      "id": "SqlDataNodeId_VevnE",
      "type": "SqlDataNode",
      "selectQuery": "#{myRDSSelectQuery}",
      "table": "#{myRDSTable}"
    },
    {
      "*password": "#{*myRDSPassword}",
      "name": "RDS_database",
      "id": "DatabaseId_WC2j5",
      "type": "RdsDatabase",
      "rdsInstanceId": "#{myRDSId}",
      "username": "#{myRDSUsername}"
    },
    {
      "output": {
        "ref": "S3DataNodeId_iYhHx"
      },
      "input": {
        "ref": "SqlDataNodeId_VevnE"
      },
      "name": "DefaultCopyActivity1",
      "runsOn": {
        "ref": "ResourceId_G9GWz"
      },
      "id": "CopyActivityId_CapKO",
      "type": "CopyActivity"
    },
    {
      "dependsOn": {
        "ref": "CopyActivityId_CapKO"
      },
      "filePath": "#{myS3Container}#{format(@scheduledStartTime, 'YYYY-MM-dd-HH-mm-ss')}",
      "name": "DefaultS3DataNode1",
      "id": "S3DataNodeId_iYhHx",
      "type": "S3DataNode"
    },
    {
      "resourceRole": "DataPipelineDefaultResourceRole",
      "role": "DataPipelineDefaultRole",
      "instanceType": "m1.medium",
      "name": "DefaultResource1",
      "id": "ResourceId_G9GWz",
      "type": "Ec2Resource",
      "terminateAfter": "30 Minutes"
    }
  ],
  "parameters": [
  ]
}

Hope it helps.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants