Internal communication protocol
Marina Golosova edited this page Oct 21, 2020 · 15 revisions
This page specifies the internal communication protocol developed to transfer data and flow control commands between the supervisor and the worker of an ETL process stage.
- The protocol is used for inter-process communication.
- The purpose of the protocol is data transfer and data flow control between two processes: the stage supervisor (`S`) and the worker (`W`).
- The connection between `S` and `W` is established by `S` executing `W`'s run instructions, through the standard input (`W-STDIN`) and output (`W-STDOUT`) streams of the `W` process.
- Protocol elements are:
  - `<marker>` -- an ASCII symbol that is never encountered in the transferred data;
  - `<message>` -- raw data (message content), ending with the `EOM` (end-of-message) marker;
  - `<batch>` -- a group of messages ending with the `EOB` (end-of-batch) marker.
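A minimal sketch of how these elements could be framed, assuming that a message is terminated by the `EOM` marker (as the marker list on this page describes). The marker values and the function names `frame_message`, `frame_batch` and `split_messages` are hypothetical; the page does not state the default marker values.

```python
# Hypothetical marker values: the page does not give the defaults.
EOM = "\x1e"  # end-of-message (assumed value)
EOB = "\x17"  # end-of-batch (assumed value)


def frame_message(content: str) -> str:
    """Return the message content followed by the EOM marker."""
    # A marker must never occur inside the transferred data.
    if EOM in content or EOB in content:
        raise ValueError("marker symbol found in message content")
    return content + EOM


def frame_batch(contents: list[str]) -> str:
    """Return a group of framed messages followed by the EOB marker."""
    return "".join(frame_message(c) for c in contents) + EOB


def split_messages(stream: str) -> list[str]:
    """Split a stream of EOM-terminated messages back into contents."""
    parts = stream.split(EOM)
    # Whatever follows the last EOM is an incomplete message (or empty).
    return parts[:-1]
```

The content of each message stays opaque here: the sketch only adds and removes markers, which matches the protocol leaving the message format to the application level.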
- The `<message>` content format and the encode/decode rules are not defined by the protocol and are to be coordinated at the application level.
- All `<marker>`s have default values, which can be altered by `S` and passed to `W` as a part of its run instruction.
- List of `<marker>`s:
  - `EOM` (end-of-message):
    - can be sent by: `S`, `W`;
    - usage: placed in the stream after the raw data (message content) to indicate the end of the message content;
  - `EOP` (end-of-process):
    - can be sent by: `W`;
    - usage: placed in `W-STDOUT` after the last message of the result of `W`'s operation (or on its own, if no messages were produced) to indicate that the requested operation on the data is finished and `W` is ready for the next command;
  - `EOB` (end-of-batch):
    - can be sent by: `S`;
    - usage: placed in `W-STDIN` after the last message in a group to indicate the end of the group passed to `W` for batch processing;
  - `BNC` (batch-not-complete):
    - can be sent by: `W`;
    - usage: placed in `W-STDOUT` after `EOP` or a previous `BNC` to request one more message for batch processing from `S`;
  - `GET` (get-new-data):
    - can be sent by: `S`;
    - usage:
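The "can be sent by" rules in the marker list can be written down as a small lookup table. This is only an illustrative sketch; the names `ALLOWED_SENDERS`, `can_send` and `markers_for` are hypothetical and not part of the protocol.

```python
# Which side may send each marker, taken from the marker list above.
ALLOWED_SENDERS = {
    "EOM": {"S", "W"},
    "EOP": {"W"},
    "EOB": {"S"},
    "BNC": {"W"},
    "GET": {"S"},
}


def can_send(side: str, marker: str) -> bool:
    """Return True if the given side ("S" or "W") may send the marker."""
    return side in ALLOWED_SENDERS[marker]


def markers_for(side: str) -> list[str]:
    """Return the sorted names of all markers the given side may send."""
    return sorted(m for m, sides in ALLOWED_SENDERS.items() if side in sides)
```

A supervisor or worker implementation could use such a table to reject a marker arriving from the wrong side instead of silently treating it as data.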
- The communication scenario between the `S` and `W` processes depends on the stage type (E-, T- or L-).
- `S` executes `W`'s run instruction and starts waiting for messages at `W`'s `STDOUT`.
- `W` executes the "extraction" operation and generates messages for the data flow.
- `W` sends messages to `STDOUT` one by one.
- `W` stops operation (closes `STDOUT`).
- `S` reads all messages from `W`'s `STDOUT`.
- `S` passes the read messages to the next stage.
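The steps above can be sketched with Python's `subprocess` module. Everything specific here is an assumption made for illustration: the worker is a short inline script, `EOM` is taken to be a newline, and `run_extraction_stage` is a hypothetical name.

```python
import subprocess
import sys

# Assumed end-of-message marker; the page does not give the default value.
EOM = "\n"

# Hypothetical worker: an "extraction" operation that emits three messages,
# each followed by EOM, and then exits (which closes its STDOUT).
WORKER_CODE = r'''
import sys
for i in range(3):
    sys.stdout.write('{"id": %d}' % i + "\n")
'''


def run_extraction_stage() -> list[str]:
    """Run the worker, read its STDOUT until it closes, return the messages."""
    proc = subprocess.run(
        [sys.executable, "-c", WORKER_CODE],  # stand-in for W's run instruction
        stdout=subprocess.PIPE,
        text=True,
        check=True,
    )
    # Drop the empty tail left after the final EOM.
    return [message for message in proc.stdout.split(EOM) if message]
```

Calling `run_extraction_stage()` returns the list of message contents, which the supervisor would then pass to the next stage.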
- `S` sends the `GET` marker to the (already running) `W`'s `STDIN` and starts waiting for messages at `W`'s `STDOUT`.
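A hedged sketch of this step follows. The page does not yet describe how `W` answers `GET`, so the sketch assumes that `W` replies with messages followed by `EOP`, in line with the `EOP` usage given in the marker list. The marker values, the inline worker and the names `request_data` and `demo` are all hypothetical.

```python
import subprocess
import sys

# Assumed marker values (single ASCII control characters), not real defaults.
GET, EOM, EOP = b"\x05", b"\x1e", b"\x04"

# Hypothetical long-running worker: answers every GET with one message + EOP.
WORKER_CODE = r'''
import sys
GET, EOM, EOP = b"\x05", b"\x1e", b"\x04"
stdin, stdout = sys.stdin.buffer, sys.stdout.buffer
count = 0
while True:
    symbol = stdin.read(1)
    if not symbol:  # STDIN closed: the supervisor is done
        break
    if symbol == GET:
        stdout.write(b"record-%d" % count + EOM + EOP)
        stdout.flush()
        count += 1
'''


def request_data(worker: subprocess.Popen) -> list[bytes]:
    """Send GET to the worker's STDIN and read messages until EOP."""
    worker.stdin.write(GET)
    worker.stdin.flush()
    messages, current = [], bytearray()
    while True:
        symbol = worker.stdout.read(1)
        if not symbol:
            raise EOFError("worker closed STDOUT before sending EOP")
        if symbol == EOP:
            return messages
        if symbol == EOM:
            messages.append(bytes(current))
            current.clear()
        else:
            current += symbol


def demo() -> list[list[bytes]]:
    """Start the worker once and issue two GET requests to it."""
    worker = subprocess.Popen(
        [sys.executable, "-c", WORKER_CODE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    try:
        return [request_data(worker), request_data(worker)]
    finally:
        worker.stdin.close()  # lets the worker loop end
        worker.wait()
        worker.stdout.close()
```

Because the worker keeps running between requests, the supervisor relies on `EOP` rather than on end-of-file to know that a response is complete.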
To Be Continued...