[WIP] DF/60/69/10 -- EOM/EOP/EOB markers. #242

Open: wants to merge 21 commits into base: master

anastasiakaida
Contributor

@anastasiakaida anastasiakaida commented Apr 5, 2019

To support communication between "worker" and "supervisor" processes, the following markers (signals) were implemented for the Sink/Source connectors:

  1. Stage 010 (Source-connector):
    *End-of-message
    *End-of-process
    Note: For some reason, there is no need to implement EOB.

  2. Stage 069 (Sink-connector):
    *End-of-message
    *End-of-batch
    *End-of-process

  3. Stage 060 (Sink-connector):
    *End-of-message
    *End-of-batch
    *End-of-process

NOTE: There are still some logical collisions in the reading of EOB markers (a pre-sink stage is now defined to solve this issue -- see #278), so this should also be fixed here according to the new convention.

EOProcess: -E|--eop
EOMessage: -e|--eom

*No default EOM/EOP values specified
*EOM/EOP variables ($EOM, $EOP) are not yet applied in this .sh script
About $EOP output:
While we have only one message (the so-called JSON array), EOM and EOP
should simply be sent after that message. It's possible to write both of them
into the $tmp file, but it would be better to leave only EOM in $tmp
and to output EOP explicitly (to be prepared for the case where we
separate the "data channel" from the "command channel").
If -p or -f specified -- it's "file mode"
Otherwise -- "stream mode".

$EOM and $EOP variables are internal and keep data from command line.
$EOMarkers and $EOProcess variables are for output.
The previous solution didn't allow the user to pass '' as the EOM/EOP marker.

$EOM_set (Y/N) -- a variable which keeps the info if EOM is set by user
$EOP_set (Y/N) -- a variable which keeps the info if EOP is set by user
Let's write command-line options in case statements the following way,
simply so that every script handles them the same way:

-short-option|--long-option)
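
A minimal sketch of that convention, assuming a hypothetical helper named parse_marker_opts (it is not part of this PR). It follows the -short-option|--long-option) pattern and records in $EOM_set / $EOP_set whether each marker was passed, so that an explicitly passed '' can be told apart from an option that was never given:

```shell
# Hypothetical helper (illustration only): parse the EOM/EOP options.
# Sets EOM, EOP and the flags EOM_set, EOP_set ('Y' or 'N').
parse_marker_opts() {
  EOM=''; EOP=''
  EOM_set='N'; EOP_set='N'
  while [ $# -gt 0 ]; do
    case "$1" in
      -e|--eom)
        [ $# -ge 2 ] || { echo "missing value for $1" >&2; return 2; }
        EOM="$2"; EOM_set='Y'; shift 2 ;;
      -E|--eop)
        [ $# -ge 2 ] || { echo "missing value for $1" >&2; return 2; }
        EOP="$2"; EOP_set='Y'; shift 2 ;;
      --) shift; break ;;   # explicit end of options
      *)  break ;;          # first non-marker argument: stop parsing
    esac
  done
}
```

For example, `parse_marker_opts -e '' --eop '\x06'` leaves EOM empty with EOM_set=Y, and EOP holding the still-escaped string '\x06'.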
Because custom markers may contain escape sequences,
and because we must somehow handle control and whitespace characters
like '\0', '\n', etc., we should choose one proper approach
and stick to it.

The most suitable way is to store escape sequences in variables and then
interpret them when the marker is used.

Like this:
{{{

  EOProcess='\x06'
  <...>
  while read -d $(echo -ne "$DELIMITER") line ; do
    <...>
    echo -ne "$EOProcess"
  done

}}}

This solution also works with whitespace symbols.
@anastasiakaida
Contributor Author

@anastasiakaida anastasiakaida requested a review from mgolosova April 5, 2019 07:17
@mgolosova
Collaborator

@anastasiakaida, although this isn't a brand new PR and I know that we have discussed various questions about this task, I can't remember everything.
Can you please add a description with the following information:

  • what is the purpose of the changes / what changed (just as a general rule, PRs should contain this information);
  • links to the previous PRs.

Thank you!

@anastasiakaida
Contributor Author

@mgolosova
Sure! I'll add the description asap.

EOP -- End-of-process

-E|--eop

Default:
"" -- for file mode
'\0' -- for stream mode
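
A small sketch of how these defaults could be applied, assuming a hypothetical helper resolve_eop and a one-letter mode argument ('f' or 's'); neither name comes from the PR. The marker stays in its escaped form and would only be interpreted when it is emitted:

```shell
# Hypothetical helper (illustration only): choose the EOP marker.
#   $1 -- mode: 'f' (file mode) or 's' (stream mode)
#   $2 -- EOP_set: 'Y' if -E|--eop was passed, 'N' otherwise
#   $3 -- EOP value from the command line (may be '')
resolve_eop() {
  if [ "$2" = 'Y' ]; then
    printf '%s' "$3"      # user's choice wins, even if it is ''
  elif [ "$1" = 's' ]; then
    printf '%s' '\0'      # stream-mode default (escaped, not a real NUL)
  else
    printf '%s' ''        # file-mode default: no marker
  fi
}
```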

$EOP_set (Y/N) -- a variable which keeps the info if EOP is set by user
-b|--batch (e)enabled|(d)isabled
Default: (d)isabled

This option means batch-mode support.
If no argument (e|d) is given with the option, batch-mode will
be switched on automatically.

-B|--eob
Default: "NOT_SPECIFIED" -- just a "foolproof" decision for cases
where EOB marker is not passed via the command line.

EOB -- End-of-batch (old: $DELIMITER)
BATCHMODE -- an option to support batch-mode processing.

Combinations of markers lead to the following EOB markers ('\x11'
means here any custom marker):

   -b   ||  X   |  X   |   X    |   X    ||  'e'   | 'e'  |  'e'   |  'e'   |
------- || ---- | ---- | ------ | ------ || ------ | ---- | ------ | ------ |
   -B   ||  X   |  ''  | '\x17' | '\x11' ||   X    |  ''  | '\x17' | '\x11' |
======= || ==== | ==== | ====== | ====== || ====== | ==== | ====== | ====== |
EOBatch || '\n' | '\n' | '\x17' | '\x11' || '\x17' | '\n' | '\x17' | '\x11' |

'\x17' - End Of Transmission Block (ASCII)
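
The table can be read as a short decision rule. The sketch below assumes a hypothetical function resolve_eob and uses the $EOB_set flag rather than the "NOT_SPECIFIED" sentinel to detect a missing -B; the script in this PR may do it differently:

```shell
# Hypothetical helper (illustration only): derive EOBatch from -b and -B.
#   $1 -- batch mode: 'e' (enabled) or 'd' (disabled)
#   $2 -- EOB_set: 'Y' if -B|--eob was passed, 'N' otherwise
#   $3 -- EOB value from the command line (may be '')
# Prints the marker in escaped form, as in the table.
resolve_eob() {
  if [ "$2" = 'Y' ] && [ -n "$3" ]; then
    printf '%s' "$3"      # explicit non-empty -B always wins
  elif [ "$2" = 'Y' ]; then
    printf '%s' '\n'      # -B '' falls back to newline
  elif [ "$1" = 'e' ]; then
    printf '%s' '\x17'    # batch mode on, no -B: ETB
  else
    printf '%s' '\n'      # batch mode off, no -B: newline
  fi
}
```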

Note:
$EOB_set (Y/N) -- a variable which keeps the info if EOB is set by user
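Unfortunately, the previous solution for the 'read' command doesn't work
when EOB is an ASCII control character such as '\n' or '\0'.

"$(echo -ne $EOBatch)" is empty in those cases (command substitution strips
trailing newlines and cannot hold NUL bytes), so it can't be used as a delimiter.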

For that case it's better to use 'eval'.

The 'eval' statement tells the shell to take eval's arguments
and run them through the command-line processing steps all over again.
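
One possible shape of the 'eval' approach, as a bash sketch. The function name read_until_marker is hypothetical, and the exact form used in the 069 script may differ. Here eval builds an ANSI-C quoted string ($'...') from the escaped marker, so '\n' becomes a real newline and '\0' becomes the empty string, which 'read -d' treats as a NUL delimiter:

```shell
#!/usr/bin/env bash
# Hypothetical helper (illustration only): split stdin on an escaped marker.
#   $1 -- escaped marker, e.g. '\n', '\0' or '\x17'
# Assumption: the marker comes from a trusted command line and contains no
# single quote, because eval executes whatever it is given.
read_until_marker() {
  local delim line
  eval "delim=\$'$1'"                    # '\n' -> newline, '\0' -> ''
  # read -d uses only the first character of $delim; a trailing chunk
  # without a closing marker is dropped by this loop.
  while IFS= read -r -d "$delim" line; do
    printf '[%s]\n' "$line"
  done
}
```

Usage: `printf 'a\027b\027' | read_until_marker '\x17'` prints each chunk in brackets on its own line.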
EOP -- End-of-process

-E|--eop
Default: '\n' ("") -- for file mode, '\0' -- for stream mode

$EOP_set (Y/N) -- a variable which keeps the info if EOP is set by user
-b|--batch (e)enabled|(d)isabled
Default: (d)isabled

This option means batch-mode support.
If no argument (e|d) is given with the option, batch-mode will
be switched on automatically.

-B|--eob
Default: "NOT_SPECIFIED" -- just a "foolproof" decision for cases
where EOB marker is not passed via the command line.

EOB -- End-of-batch (old: $DELIMITER)
BATCHMODE -- an option to support batch-mode processing.

Combinations of markers lead to the following EOB markers ('\x11'
means here any custom marker):

   -b   ||  X   |  X   |   X    |   X    ||  'e'   | 'e'  |  'e'   |  'e'   |
------- || ---- | ---- | ------ | ------ || ------ | ---- | ------ | ------ |
   -B   ||  X   |  ''  | '\x17' | '\x11' ||   X    |  ''  | '\x17' | '\x11' |
======= || ==== | ==== | ====== | ====== || ====== | ==== | ====== | ====== |
EOBatch || '\n' | '\n' | '\x17' | '\x11' || '\x17' | '\n' | '\x17' | '\x11' |

'\x17' - End Of Transmission Block (ASCII)

Note:
1. $EOB_set (Y/N) -- a variable which keeps the info if EOB is set by user
2. Unfortunately, the previous solution used for EOP with the 'read' command doesn't work
when EOB is an ASCII control character such as '\n' or '\0'.

"$(echo -ne $EOBatch)" is empty in those cases (command substitution strips
trailing newlines and cannot hold NUL bytes), so it can't be used as a delimiter.

In that case it's better to use 'eval', as is done for 069.

The 'eval' statement tells the shell to take eval's arguments
and run them through the command-line processing steps all over again.
-e|--eom
Default: '\n'

We assume that EOM in stream mode is '\n' by default, because
this is the one and only acceptable EOM signal. To prevent errors
during the upload process, it's better to replace a custom EOM delimiter
in the line with the standard one.
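
A rough bash sketch of that normalization, assuming a hypothetical function normalize_eom, a single-character custom marker, and messages that contain no newlines themselves:

```shell
#!/usr/bin/env bash
# Hypothetical helper (illustration only): re-emit messages with '\n' as EOM.
#   $1 -- escaped custom EOM marker, e.g. '\x1e'
# Assumption: the marker is trusted input without single quotes (eval runs it).
normalize_eom() {
  local eom line
  eval "eom=\$'$1'"                      # interpret the escape sequence
  # The "|| [ -n ... ]" part keeps a last message with no trailing marker.
  while IFS= read -r -d "$eom" line || [ -n "$line" ]; do
    printf '%s\n' "$line"
  done
}
```

Usage: `printf '{"a":1}\036{"b":2}\036' | normalize_eom '\x1e'` writes one message per line.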
-e|--eom
Default: '\n'
@anastasiakaida anastasiakaida force-pushed the df-connectors-markers branch from 1fee438 to 9ed37e5 Compare April 23, 2019 19:08
@anastasiakaida anastasiakaida force-pushed the df-connectors-markers branch from 9ed37e5 to f4f3769 Compare June 25, 2019 11:37
@mgolosova
Collaborator

@anastasiakaida,
Is this PR still relevant, and does it require a review? If so, please add the description, as we discussed above.
If not, please close it or add the [WIP] prefix (if it was put aside and you're planning on getting back to it eventually).

@anastasiakaida anastasiakaida changed the title DF/60/69/10 -- EOM/EOP/EOB markers. [WIP] DF/60/69/10 -- EOM/EOP/EOB markers. Nov 5, 2019
@anastasiakaida
Contributor Author

anastasiakaida commented Nov 5, 2019

@mgolosova [WIP] prefix added. This PR needs some changes before a review, in connection with organizing data4es data processing on node 170.
