is there any solution to get last record from stdout? #151

Open
makkaba opened this issue Nov 14, 2018 · 5 comments
Comments

@makkaba

makkaba commented Nov 14, 2018

Hi, I am Jeff.

I am currently using Embulk in a slightly unusual way:

  1. A batch Python script generates some dynamic YAML files every few minutes.
  2. It reads the diff.yaml file generated by the previous batch job.
  3. It applies the last record from diff.yaml to the dynamic YAML files from step 1.
  4. It executes Embulk several times, once per dynamic YAML file, by invoking a shell from Python.
  5. Embulk writes a new diff.yaml file, which contains the last record.
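
Step 4 above could be sketched roughly as follows. This is a minimal sketch, not the script from this thread: the function name run_configs, the pairing of one diff file per config file, and the injectable runner parameter are all assumptions made for illustration. Only the command line embulk run <config> -c <diff> comes from this discussion.

```python
import subprocess


def run_configs(jobs, runner=subprocess.run):
    """Run Embulk once per (config_path, diff_path) pair.

    Returns the exit codes in the same order as the jobs.
    `runner` is a hypothetical hook so the loop can be tested without Embulk.
    """
    exit_codes = []
    for config_path, diff_path in jobs:
        # Passing an argument list means no shell is involved,
        # which avoids quoting problems with file names.
        command = ["embulk", "run", config_path, "-c", diff_path]
        result = runner(command)
        exit_codes.append(result.returncode)
    return exit_codes
```

Calling run_configs([("a.yml", "a.diff.yml")]) would launch Embulk through subprocess.run and report its exit code, so the batch script can react to failures without parsing shell output.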

Is there a better way to get the last record?
Or a way to call Embulk from code rather than through a shell?

Thanks

@hiroyuki-sato
Member

Hello, @makkaba

I don't think I fully understand your use case yet.
If you execute Embulk from Python,
how about using the Liquid template engine?
You would just set an environment variable from Python.
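
A rough sketch of that idea: Embulk treats a config file whose name ends in .yml.liquid as a Liquid template, and environment variables can be referenced in it as {{ env.NAME }}. The variable name EMBULK_LAST_IDX, the file name config.yml.liquid, and the helper build_embulk_run are hypothetical names chosen for this example.

```python
import os


def build_embulk_run(config_path, last_idx, base_env=None):
    """Return (command, environment) for one templated Embulk run.

    EMBULK_LAST_IDX is a hypothetical variable name; the template would
    reference it as {{ env.EMBULK_LAST_IDX }}.
    """
    # Copy the environment so the parent process is left unchanged.
    env = dict(os.environ if base_env is None else base_env)
    # Environment values must be strings.
    env["EMBULK_LAST_IDX"] = str(last_idx)
    command = ["embulk", "run", config_path]
    return command, env


command, env = build_embulk_run("config.yml.liquid", 111111111)
print(" ".join(command), "with EMBULK_LAST_IDX =", env["EMBULK_LAST_IDX"])
```

Passing the result to subprocess.run(command, env=env, check=True) would start Embulk with that variable set, so the Python script no longer has to rewrite the YAML file for every run.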

Another idea: use incremental loading. For example, given this table:

embulk_test=# select * from incremental_test;
 id | name
----+------
  1 | var1
  2 | var2
  3 | var3
(3 rows)

test.yml

in:
  type: postgresql
  host: localhost
  port: 5432
  user: user
  password: ****
  database: embulk_test
  table: incremental_test
  incremental: true
  incremental_columns:
  - id
out:
  type: stdout

Running embulk run test.yml -c diff.yml prints the records to stdout as shown below and creates a diff.yml file.

1,var1
2,var2
3,var3

diff.yml

in:
  last_record: [3]
out: {}

The number 3 is the last record.
diff.yml is a plain YAML file, so I think you can simply read or create it with Python.
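
For instance, the last record could be pulled out of diff.yml along these lines. This is only a sketch using the standard library: it assumes the flow-style form shown above (last_record: [3]) with values that are also valid JSON, and read_last_record is a hypothetical helper name. For anything more general, a real YAML parser would be the safer choice.

```python
import json
import re


def read_last_record(diff_text):
    """Return the last_record list from the text of a diff.yml file, or None."""
    # Only handles the flow-style form, e.g. "last_record: [3]".
    match = re.search(r"last_record:\s*(\[[^\]]*\])", diff_text)
    if match is None:
        return None
    # A simple flow sequence of numbers is also valid JSON.
    return json.loads(match.group(1))


example = "in:\n  last_record: [3]\nout: {}\n"
print(read_last_record(example))
```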

@makkaba
Author

makkaba commented Nov 19, 2018

Thank you for your reply.
Let me clarify my purpose:

last record = 300000
for 4
input => db
output => db [1,2,3,4]

I am already using a diff.yaml file and a dynamic YAML file,
as you suggested (the Python template approach).

Each time, a YAML file like this is generated:

in:
  type: sqlserver
  driver_path: ~
  host: ~
  user: ~
  password: ~
  query: "SELECT ~ FROM WHERE ~ AND [idx] > :idx"
  use_raw_query_with_incremental: true
  incremental_columns:
  - idx
  incremental: true

  last_record:
  - 111111111
out:
  type: mysql
  host: ~
  user: ~
  password: ~
  database: ~
  table: ~
  mode: merge
  options: {useUnicode: true, characterEncoding: UTF-8}

But now I want to get the returned value via stdout or as a programmatic return value, for some reason. (The output must still be mysql.)

When the output is mysql, stdout looks something like this:

********************************** INFORMATION **********************************
Join us! Embulk-announce mailing list is up for IMPORTANT announcement such as
compatibility-breaking changes and key feature updates.
https://groups.google.com/forum/#!forum/embulk-announce


java.lang.RuntimeException: java.nio.file.NoSuchFileException: sample.yaml
at org.embulk.EmbulkRunner.run(EmbulkRunner.java:152)
at org.embulk.cli.EmbulkRun.runSubcommand(EmbulkRun.java:437)
...
... 3 more
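
One way to handle output like the above from Python is to capture the child process's stdout, stderr, and exit code instead of reading the terminal. The sketch below uses a small Python child process as a stand-in so that it runs without Embulk installed; run_and_capture is a hypothetical helper, and it is an assumption here that Embulk signals a failure such as the exception above with a non-zero exit code.

```python
import subprocess
import sys


def run_and_capture(command):
    """Run a command and return (exit_code, stdout_lines, stderr_text)."""
    result = subprocess.run(command, capture_output=True, text=True)
    # Drop blank lines so the last element is the last printed line.
    lines = [line for line in result.stdout.splitlines() if line.strip()]
    return result.returncode, lines, result.stderr


# Stand-in child process; a real call would pass ["embulk", "run", ...].
demo = [sys.executable, "-c", "print('1,var1'); print('2,var2'); print('3,var3')"]
code, lines, _ = run_and_capture(demo)
print(code, lines[-1])
```

With an Embulk command, the captured lines would also include Embulk's own banner and log messages, so any record lines would still need to be filtered out of them.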

I just wanted to share my use case.
Thanks!

@hiroyuki-sato
Member

Hello, @makkaba

Does this mean that you want to get a value back from out.type: mysql?
I think that is outside the scope of Embulk.
It would be better to use a workflow engine such as Digdag.

Best regards

@sakama
Contributor

sakama commented Nov 20, 2018

"i want to get returned value by stdout or programmatic return for some reason."

Embulk provides EmbulkEmbed, which allows Embulk to be executed from a Java program.
https://github.com/embulk/embulk/blob/master/embulk-core/src/main/java/org/embulk/EmbulkEmbed.java
We (Arm Treasure Data) use this mechanism in our platform to execute Embulk from other code.

Unfortunately, this mechanism is intended to be used from Java, not Python.

@makkaba
Author

makkaba commented Nov 24, 2018

EmbulkEmbed would be helpful.
Thank you all!

3 participants