pl-topologicalcopy in pipelines #481
Decided
These four proposed changes shall be implemented, in order:
1. ts-type plugins shall wait on all plugin instances given by
Cycle detection is not necessary for a correct implementation of multiple inheritance in ChRIS pipelines. "Correct" here means acyclic. Concern: pipelines containing ts-plugins might contain cycles.
Layman's explanation:
Think of a family tree. Is it possible for a person to be their own biological parent, grandparent, great-grandparent, great-great-grandparent, …? No! But what if it were possible for a child to have only a single parent, or 3 parents? Still, probably not...
Formal proof
Setup
Problem
Suppose G is a graph constructed by adding vertices to it one by one. Each time a vertex v is added to G, a set E of [0, n) directed edges is added as well, where every edge e in E goes from v to a vertex already in G. Prove that all graphs constructed in this manner are acyclic.
Restating the Problem
The "worst case" is where G has the maximum number of edges allowed. That is, for every vertex v to be added, E is a set of edges from v to every vertex in G (read: in a pipeline, every non-root piping is a
Every possible graph constructed as described in the above section is a subgraph of some graph in S.
Lemma: E is acyclic implies R is acyclic, where R is a subgraph of E.
Solution
QED
The Importance of Breadth-First Search
The above proof can guarantee that there are no cycles in ChRIS pipelines containing ts-type pipings. Pipelines must be parsed in BFS order. Pipeline instances and workflows must be constructed in BFS order. Let's look at a simple example:
Parsing pipelines and creating pipeline instances/workflows in BFS order ensures that all valid plugin instances/pipings which may be a "previous" to the next item to be added already exist.
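A minimal sketch of the BFS argument, in Python. The pipeline below is hypothetical (four pipings, with index 3 standing in for a ts-type piping that joins 1 and 2), and the function names are made up for illustration; none of this is CUBE code. Items are created in BFS order, each creation checks that everything it depends on already exists, and the standard-library graphlib module is used only to confirm that the result has no cycle.

```python
from collections import deque
from graphlib import TopologicalSorter

# Hypothetical pipeline: index -> (previous index, extra ts dependencies).
# Index 3 plays the role of a ts-type piping joining 1 and 2.
PIPINGS = {
    0: (None, []),
    1: (0, []),
    2: (0, []),
    3: (1, [1, 2]),
}


def bfs_order(pipings):
    """Return piping indices in breadth-first order, starting at the root."""
    children = {}
    root = None
    for index, (previous, _) in pipings.items():
        if previous is None:
            root = index
        else:
            children.setdefault(previous, []).append(index)
    order, queue = [], deque([root])
    while queue:
        current = queue.popleft()
        order.append(current)
        queue.extend(children.get(current, []))
    return order


def create_in_order(pipings, order):
    """Add vertices one by one; every edge must point at an existing vertex."""
    graph = {}
    for index in order:
        previous, joined = pipings[index]
        deps = set(joined) | ({previous} if previous is not None else set())
        missing = deps.difference(graph)
        if missing:
            raise ValueError(f"piping {index} depends on uncreated {sorted(missing)}")
        graph[index] = deps
    return graph


order = bfs_order(PIPINGS)
graph = create_in_order(PIPINGS, order)
# static_order() raises graphlib.CycleError if the graph contains a cycle.
creation_order = list(TopologicalSorter(graph).static_order())
```

Because an item can only point at items created before it, a reference to something not yet created is rejected at creation time, so no separate cycle detection pass is needed in this sketch.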
Abstract
Several limitations of the current support for "feed join" operations, implemented by pl-topologicalcopy, make it impossible for "feed join" to be used in pipelines. To support "feed join" operations in pipelines, a solution is proposed here: (A) ts-plugins should be able to wait on multiple parents, (B) ts-plugins should be supported in pipelines by using Piping.id for values of the parameter plugininstances.
Definitions
Feed join: creating a plugin instance in a feed where its inputs are the files of two or more plugin instances in the same feed.
Background
"Topology synthesis" or ts-type plugins were conceived to be a type of ChRIS plugin which interacts with its own feed in special ways: in contrast to ds-type plugins, the inputs of a ts-plugin include more than just the files of its "previous"/parent plugin instance. Currently, ts-plugins are implemented as a special case in the backend:
ChRIS_ultron_backEnd/chris_backend/plugininstances/services/manager.py
Lines 281 to 329 in 5d3c2e2
ts plugins can be described more specifically by explaining the code: a ts plugin is a ds plugin which is assumed to have the parameters plugininstances, filter, and groupByInstance. Before a ts plugin runs, CUBE executes some special code to collect the files from every plugin instance specified by plugininstances. These files are used as the input to the ts-plugin instance.
The ts-plugin pl-topologicalcopy is used to perform "join" operations.
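To make the file collection step concrete, here is a rough Python sketch. It is not the code in manager.py. The function name, the outputs dictionary, and the exact meaning given to filter (a regular expression on file paths) and groupByInstance (prefix each file with the ID of the instance it came from) are all assumptions made for illustration.

```python
import re


def collect_inputs(outputs, plugininstances, filter_regex="", group_by_instance=False):
    """Gather files from several plugin instances for one ts-plugin instance.

    outputs is a hypothetical dict: plugin instance ID -> list of file paths.
    plugininstances is a comma-separated string of plugin instance IDs.
    """
    ids = [int(part) for part in plugininstances.split(",") if part.strip()]
    pattern = re.compile(filter_regex) if filter_regex else None
    collected = []
    for instance_id in ids:
        for path in outputs[instance_id]:
            # Assumed semantics: keep only paths matching the regex, if one is given.
            if pattern and not pattern.search(path):
                continue
            # Assumed semantics: grouping places files under their source instance ID.
            collected.append(f"{instance_id}/{path}" if group_by_instance else path)
    return collected


example_outputs = {112: ["left.mgz", "notes.txt"], 113: ["right.mgz"]}
joined = collect_inputs(example_outputs, "112,113", r"\.mgz$", group_by_instance=True)
```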
In the current implementation, ts plugin instances have a data dependency on the plugin instances specified by their plugininstances parameter. However, this data dependency is not satisfied, because it is possible for a ts-plugin to start running before its dependencies are in the state finishedSuccessfully. Due to this issue, ts-plugins cannot be used in pipelines.
A second issue is that the value for the parameter plugininstances can only be figured out at runtime, which makes it impossible to specify its value in a pipeline.
Proposal
To support feed join operations in pipelines, ts plugins should be able to wait on multiple parents, and pipelines should be allowed to specify a value for the parameter plugininstances by list index.
Wait on multiple parents
Simple as that. ts-plugins are currently implemented as "special cases" in manager.py, so why not implement them as a special case in tasks.py as well?
ChRIS_ultron_backEnd/chris_backend/plugininstances/tasks.py
Lines 72 to 84 in 5d3c2e2
Counterarguments: algorithmic complexity and poor scalability. Response: we don't care about scalability right now. A reasonable ChRIS deployment might have fewer than 20,000 plugin instances, with CUBE running on a single machine with 16GB RAM.
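The "wait on multiple parents" check could look roughly like the following. This is a sketch only, not the code in tasks.py: the function names and the status_by_id dictionary are hypothetical, and the only detail taken from this proposal is that every instance listed in plugininstances must be in the state finishedSuccessfully before the ts-plugin instance starts.

```python
FINISHED = "finishedSuccessfully"


def parse_ids(plugininstances):
    """Split a comma-separated plugininstances value into integer IDs."""
    return [int(part) for part in plugininstances.split(",") if part.strip()]


def pending_parents(plugininstances, status_by_id):
    """Return the IDs that have not yet finished successfully.

    status_by_id is a hypothetical dict: plugin instance ID -> status string.
    """
    return [i for i in parse_ids(plugininstances) if status_by_id.get(i) != FINISHED]


def ready_to_run(plugininstances, status_by_id):
    """A ts-plugin instance may start only when no parent is still pending."""
    return not pending_parents(plugininstances, status_by_id)


statuses = {112: "finishedSuccessfully", 113: "started"}
```

The check is a linear scan over the listed parents each time it is evaluated, which is the algorithmic cost the counterargument refers to.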
plugininstances parameter in pipelines
1. The plugininstances parameter in a pipeline should accept a CSV of list indices, just like those accepted by the previous_index argument.
2. When serializing a ts-type piping, the value for plugininstances should be mapped from list indices to Piping.id.
3. During workflow creation, plugin instances should be created breadth-first.
4. During creation of a plugin instance of a ts-type piping, the value for plugininstances should be mapped from Piping.id to PluginInstance.id.
Example of a ts pipeline
1. Canonical JSON representation of example pipeline
ASCII art diagram of example pipeline
1.1. As an RFC #2 YAML file
It is also proposed that piping titles may be used in the value for plugininstances instead of list indices.
2. Mapping list indices -> Piping.id
For example, CUBE assigns the following piping IDs:
The plugin parameter defaults for piping 79 will be:
3-4 Breadth-first creation of plugin instances
First, the user requests a workflow to be created.
CUBE creates plugin instances from pipings breadth-first:
Piping 79 is a ts-type piping, which is handled as a special case. Its parameter plugininstances is mapped from piping IDs to plugin instance IDs: the pipeline string parameter value 77,78 gets transformed to a plugin instance parameter value 112,113.
Final result:
ASCII art diagram of created workflow
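The chain of value transformations in this example can be sketched in a few lines of Python. The remap helper and the three lookup tables are hypothetical: the values come from this example, but the pairing of title to index and of index to piping ID is assumed to preserve order. In CUBE these mappings would come from the pipeline and workflow being created.

```python
def remap(csv_value, mapping):
    """Translate every item of a comma-separated value through a mapping."""
    return ",".join(str(mapping[item.strip()]) for item in csv_value.split(","))


# Assumed lookup tables, using the values from this example.
TITLE_TO_INDEX = {"Rename left mask files": "1", "Rename right mask files": "2"}
INDEX_TO_PIPING_ID = {"1": "77", "2": "78"}
PIPING_ID_TO_INSTANCE_ID = {"77": "112", "78": "113"}

by_title = "Rename left mask files,Rename right mask files"
by_index = remap(by_title, TITLE_TO_INDEX)                    # pipeline upload
by_piping = remap(by_index, INDEX_TO_PIPING_ID)               # piping serialization
by_instance = remap(by_piping, PIPING_ID_TO_INSTANCE_ID)      # workflow creation
```

Splitting on commas means a piping title that itself contains a comma would not survive this sketch; a real implementation would need to settle on an escaping rule or forbid such titles.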
Summary of Value Mapping
Transformation of parameter value for plugininstances:
"Rename left mask files,Rename right mask files" (piping titles)
"1,2" (list indices)
"77,78" (Piping.id)
"112,113" (PluginInstance.id)
Alternative Solutions
Dynamic plugins: see #482. tl;dr: since a reasonable implementation of dynamic plugins would require new features in CUBE anyway, it is better to implement this proposal instead.
Multiple parents: see #483. tl;dr: all plugins may have [0, n) parents, and a plugin with >1 parent will automatically do a feed join. This solution would be too much work.
Summary and Considerations:
Meeting Attendance
This topic was discussed during the ChRIS roundtable on 2023-01-10
In attendance: Rudolph, Jennings, Gideon, Sandip
Jorge was absent.