Define Data Structures for simulations, tasks & dataGenerators #59

matthiaskoenig · 2017-10-05T08:29:02Z

Issue

One of the main problems I have with simulations, tasks and dataGenerators is that their data structures are not defined: it is unclear what kind of data structure they are, which dimensions they have, and what data type they hold (all of which determines the allowed operations and math on them).
All of these points are currently only implicit assumptions. They work for very simple timecourse and steady-state simulations, but break down as soon as one wants to encode simulation experiments more complicated than a simple timecourse with an ODE model.
This also creates many problems in the outputs, which have to work with these data structures. In the current form, many problems result from this:

complex operations on the data structures, e.g. mean, max, std, are not possible or ill-defined
more complex plots, i.e. everything above plot2D (and even plot2D for repeated tasks), are not defined, and it is unclear how to implement them
additional simulations that are not simple timecourses, such as boolean networks, are ill-defined
it is not clear what DataGenerators should contain
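
To illustrate the first point, here is a minimal sketch of why a reduction such as mean is ill-defined without explicit dimensions. The 3 x 4 layout (repeats x time points) and all variable names are assumptions made up for this example; nothing here is prescribed by the specification.

```python
from statistics import mean

# Hypothetical result of a repeated task: 3 repeats x 4 time points.
results = [
    [0.0, 1.0, 2.0, 3.0],
    [0.0, 2.0, 4.0, 6.0],
    [0.0, 3.0, 6.0, 9.0],
]

# Reading 1: mean over time, giving one value per repeat.
mean_over_time = [mean(row) for row in results]

# Reading 2: mean over repeats, giving one value per time point.
mean_over_repeats = [mean(column) for column in zip(*results)]

print(mean_over_time)     # 3 values
print(mean_over_repeats)  # 4 values
```

Both readings are valid, but they return results of different length and meaning, so "mean" only becomes well defined once the dimensions of the data are declared.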

This is basically the core issue which creates most of the other issues, i.e. dealing with multi-dimensional data (#21), the new more complex plots (#20), simulation on logical models (#8), how to plot repeatedTasks (#58), calculating math over repeated tasks (#53), the new tasks like Jacobian (#27), ...

Proposal

Define Data Structures and dimensions on Simulation & Task, and what this means for DataGenerators.

This will also make implementation much easier and less error-prone
Exchange formats and outputs can easily be based on the data structures, e.g. boolean transition graphs are directed graphs, so possible output formats are graph exchange formats like GML or GraphML
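
As a rough sketch of what such a definition could look like in an implementation, the descriptor below attaches a data type and named dimensions to a result and uses them to decide which operations are allowed. `DataSpec`, its fields, and the type names are hypothetical and only serve the illustration; they are not taken from any specification.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumed set of type names for which numeric reductions make sense.
NUMERIC_TYPES = {"double", "int"}


@dataclass(frozen=True)
class DataSpec:
    """Hypothetical descriptor of a simulation, task or dataGenerator result."""
    dtype: str             # e.g. "double", "boolean", "string"
    dims: Tuple[str, ...]  # named dimensions, e.g. ("repeat", "time")

    @property
    def ndim(self) -> int:
        return len(self.dims)

    def supports(self, operation: str) -> bool:
        """Return True if the reduction is well defined for this data type."""
        return operation in {"mean", "max", "std"} and self.dtype in NUMERIC_TYPES

    def reduce(self, dim: str) -> "DataSpec":
        """Describe the result of reducing over one named dimension."""
        if dim not in self.dims:
            raise ValueError(f"unknown dimension: {dim}")
        return DataSpec(self.dtype, tuple(d for d in self.dims if d != dim))


repeated_timecourse = DataSpec("double", ("repeat", "time"))
boolean_trace = DataSpec("boolean", ("time",))

print(repeated_timecourse.supports("mean"))       # numeric data
print(boolean_trace.supports("mean"))             # boolean data
print(repeated_timecourse.reduce("repeat").dims)  # remaining dimensions
```

With a descriptor like this, a tool can reject an ill-defined operation up front instead of relying on implicit assumptions about the data.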

edit: typos and clarifications


fbergmann · 2017-10-05T08:58:00Z

Ideally the data description element of L1V3 can be used for exactly that purpose. Just like higher dimensional external data can be described, the idea would be that the individual dimensions are described for data generators as well. Then slices could be defined to get at the individual elements.
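
A minimal sketch of the slicing idea, assuming the data is held as a nested list with one named dimension per nesting level. The helper `take` and the dimension names are hypothetical and are not part of the L1V3 data description.

```python
def take(data, dims, dim, index):
    """Select one index along a named dimension of a nested list.

    Returns the sliced data and the names of the remaining dimensions.
    """
    axis = dims.index(dim)  # raises ValueError for an unknown dimension

    def _take(block, depth):
        if depth == axis:
            return block[index]
        return [_take(sub, depth + 1) for sub in block]

    remaining = tuple(d for d in dims if d != dim)
    return _take(data, 0), remaining


# Hypothetical 2D result: 2 repeats x 3 time points.
dims = ("repeat", "time")
data = [
    [0.0, 1.0, 2.0],
    [0.0, 2.0, 4.0],
]

second_repeat, dims_a = take(data, dims, "repeat", 1)
last_timepoint, dims_b = take(data, dims, "time", 2)
print(second_repeat, dims_a)
print(last_timepoint, dims_b)
```

Because every dimension has a name, a slice can say which dimension it removes, and the result still carries a description of what is left.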

matthiaskoenig · 2017-10-05T10:41:10Z

Yes, this should work for all multidimensional data arrays.
Especially the new more complex plots will expect (and only work with) dataGenerators of certain dimensions and with certain content. For instance, a heatmap would expect one 2D dataGenerator for the actual data (double) and two corresponding 1D dataGenerators for the axes (string or double) to label what is plotted in each cell of the heatmap.
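
The heatmap requirement could be checked along these lines. This is only a sketch: `check_heatmap_inputs` and the example data are made up, and the rule that the rows follow the y axis is an assumption.

```python
def check_heatmap_inputs(values, x_labels, y_labels):
    """Validate one 2D data grid against two 1D axis label lists.

    Assumes rows follow the y axis and columns follow the x axis.
    Returns the grid shape as (rows, columns).
    """
    if len(values) != len(y_labels):
        raise ValueError("number of rows must match the y axis")
    for row in values:
        if len(row) != len(x_labels):
            raise ValueError("number of columns must match the x axis")
        if not all(isinstance(v, float) for v in row):
            raise TypeError("heatmap data must be double")
    for labels in (x_labels, y_labels):
        if not all(isinstance(label, (str, float)) for label in labels):
            raise TypeError("axis labels must be string or double")
    return len(y_labels), len(x_labels)


# Hypothetical inputs: 2 parameter values x 3 species.
values = [
    [0.1, 0.2, 0.3],
    [0.4, 0.5, 0.6],
]
x_labels = ["A", "B", "C"]
y_labels = [1.0, 2.0]

shape = check_heatmap_inputs(values, x_labels, y_labels)
print(shape)
```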

matthiaskoenig added feature L1V4 labels Oct 5, 2017

matthiaskoenig removed feature L1V4 labels Jun 21, 2018

jonrkarr mentioned this issue Jan 18, 2021

Clarify the semantics of mathematical expressions of data generators #82

Closed

luciansmith added the L2 label Mar 29, 2021
