How to use multiple scripts or directories in a transform step? #262

EAbis · 2017-05-04T20:42:36Z

First of all, thank you for making this project; it has made AWS DataPipeline usable.

My question is: how do you pass multiple scripts, or multiple directories, into a pipeline's YAML file? I'm asking because I want to consolidate common functionality without having to pass the entire directory for every job to every datapipeline.

For example we currently have a project structure that looks something like this:

Jobs
- Job1
- - job1.py
- - job1.yaml
- - duplicated_utility.py
- Job2
- - job2.py
- - job2.yaml
- - duplicated_utility.py
- Job3
- - job3.py
- - job3.yaml
- - duplicated_utility.py
...

What I want to do is consolidate the duplicated utility files into a single file, or a collection of files, in a shared lib directory. The structure would then look like this:

Jobs
- Job1
- - job1.py
- - job1.yaml
- Job2
- - job2.py
- - job2.yaml
- Job3
- - job3.py
- - job3.yaml
lib
- utility.py
...

The problem with this layout is that each job would have to pass in the whole parent directory and point to a specific script name, which means moving a lot of unrelated files around for no reason. Creating symlinks to the lib directory inside each job would also work, but that adds a lot of overhead and isn't ideal.
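To make the trade-off concrete, the only stopgap I can picture is a small staging step that runs before the pipeline is created. The sketch below is plain Python, not dataduct functionality: `stage_job` and its three path arguments are hypothetical names made up for illustration. It copies one job directory into a build folder and drops the shared lib modules next to the job script, so the source tree keeps a single copy of the utility code.

```python
import shutil
from pathlib import Path


def stage_job(job_dir, lib_dir, staging_root):
    """Copy one job directory plus the shared lib modules into a staging directory.

    All three arguments are hypothetical paths; nothing here is dataduct API.
    Returns the staged directory, which the job's YAML could then point at.
    """
    job_dir, lib_dir = Path(job_dir), Path(lib_dir)
    staged = Path(staging_root) / job_dir.name
    if staged.exists():
        shutil.rmtree(staged)  # start clean so stale files are not shipped
    shutil.copytree(job_dir, staged)
    # Place the shared modules next to the job script so a plain
    # `import utility` keeps working without any sys.path changes.
    for module in sorted(lib_dir.glob("*.py")):
        shutil.copy2(module, staged / module.name)
    return staged
```

Calling something like `stage_job("Jobs/Job1", "lib", "build")` would leave a `build/Job1` directory containing `job1.py`, `job1.yaml`, and `utility.py`. It avoids the symlinks, but it is still an extra build step to maintain for every job, which is why I'm hoping there is a cleaner way.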

Is there some functionality I'm not aware of, or a best practice to be used?

Thanks!

Eric
