S3_ETL_BUCKET default and mode values being concatenated #264

Open
JeremyColton opened this issue Jun 5, 2017 · 1 comment
JeremyColton commented Jun 5, 2017

Hi,

Great product!

I see an issue though: I want to specify mode-specific config, e.g. S3_ETL_BUCKET. If I don't specify this key in the default config, then when validating my pipeline with:

dataduct pipeline validate -m production -f test2.yaml

I see the error: "KeyError: 'S3_ETL_BUCKET'". So I must include it, e.g.:

etl:
    S3_ETL_BUCKET: ABC

But the value ABC is then joined to the mode-specific value below:

production:
    etl:
        S3_BASE_PATH: prod

and the following value is seen in my Pipeline for eg logging:
s3://ABC/prod/logs/jeremy_example_upsert/version_20170605153433

This is a bug. Your docs say "Modes define override settings for running a pipeline", but here the default value and the mode-specific value are joined together instead of the override replacing the default.
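A rough reconstruction of the observed behaviour (a sketch only, not dataduct's actual path-building code; the key names are taken from the configs above):

```python
import posixpath

S3_ETL_BUCKET = 'ABC'   # value from the default etl config
S3_BASE_PATH = 'prod'   # value from the production mode override

# Instead of the mode value replacing the default, the two values
# end up joined into one S3 path:
uri = 's3://' + posixpath.join(S3_ETL_BUCKET, S3_BASE_PATH, 'logs')
print(uri)  # -> s3://ABC/prod/logs
```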

Thanks so much and please help...


JeremyColton commented Jun 5, 2017

The code fix is to edit this file:
dataduct/etl/etl_pipeline.py

At line 39, change:
S3_ETL_BUCKET = config.etl['S3_ETL_BUCKET']

to:
S3_ETL_BUCKET = config.etl.get('S3_ETL_BUCKET', const.EMPTY_STR)

This allows the default S3_ETL_BUCKET to be empty instead of raising a KeyError.
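A minimal sketch of the difference, using a plain dict to stand in for config.etl (assuming dataduct's const.EMPTY_STR is the empty string):

```python
EMPTY_STR = ''   # stand-in for dataduct's const.EMPTY_STR (assumed to be '')

etl_config = {}  # a default etl config section with no S3_ETL_BUCKET key

# The original lookup raises KeyError when the key is missing:
try:
    bucket = etl_config['S3_ETL_BUCKET']
except KeyError:
    print('KeyError: S3_ETL_BUCKET')

# The proposed .get() call falls back to an empty default instead:
bucket = etl_config.get('S3_ETL_BUCKET', EMPTY_STR)
print(repr(bucket))  # -> ''
```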
