Define Data Structures for simulations, tasks & dataGenerators #59

Open
matthiaskoenig opened this issue Oct 5, 2017 · 2 comments
@matthiaskoenig
Collaborator

matthiaskoenig commented Oct 5, 2017

Issue

One of the main problems I have with simulations, tasks and dataGenerators is that it is not defined what kind of data structure they are, which dimensions they have, and what data type they hold (together these determine the allowed operations and math on them).
All these points are currently only implicit assumptions. They work for very simple timecourse and steady-state simulations, but break down as soon as one wants to encode simulation experiments more complicated than a simple timecourse with an ODE model.
This also creates a lot of problems in the outputs, which have to work with these data structures. In the current form, lots of problems result from this:

  • complex operations on the data structures, e.g. mean, max, std, are not possible or ill-defined
  • more complex plots, i.e. everything beyond plot2D (and even plot2D for repeated tasks), are not defined, and it is unclear how to implement them
  • simulations that are not simple timecourses, such as Boolean networks, are ill-defined
  • it is not clear what DataGenerators should contain

This is basically the core issue behind most of the other issues, i.e.
dealing with multi-dimensional data (#21), the new more complex plots (#20), simulation on logical models (#8), how to plot repeatedTasks (#58), calculating math over repeated tasks (#53), the new tasks like Jacobian (#27), ...

Proposal

  • Define Data Structures and dimensions on Simulation & Task, and what this means for DataGenerators.
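To make the proposal concrete, here is a minimal sketch of what an explicitly described result could look like. Everything in it is an assumption for illustration: the `ArrayResult` container, the dimension names `"repeat"` and `"time"`, and the `mean_over` helper are hypothetical and not part of SED-ML or any existing library. The point is only that once dimensions and data type are explicit, an operation like mean becomes well defined (or is rejected with a clear error).

```python
from dataclasses import dataclass
from statistics import mean
from typing import Tuple


@dataclass(frozen=True)
class ArrayResult:
    """Hypothetical container: named dimensions, a data type, nested-list values."""
    dims: Tuple[str, ...]   # e.g. ("repeat", "time")
    dtype: str              # e.g. "double", "boolean", "string"
    values: list            # nested lists, one nesting level per dimension

    @property
    def shape(self) -> Tuple[int, ...]:
        # Walk down the first element of each level to read off the sizes.
        shape, level = [], self.values
        for _ in self.dims:
            shape.append(len(level))
            level = level[0] if level else []
        return tuple(shape)


def mean_over(result: ArrayResult, dim: str) -> ArrayResult:
    """Average a 2D double result over one named dimension."""
    if result.dtype != "double":
        raise TypeError(f"mean is not defined for dtype {result.dtype!r}")
    if len(result.dims) != 2 or dim not in result.dims:
        raise ValueError(f"need a 2D result with dimension {dim!r}")
    if result.dims[0] == dim:
        # Collapse the outer dimension: average column-wise.
        reduced = [mean(column) for column in zip(*result.values)]
        kept = result.dims[1]
    else:
        # Collapse the inner dimension: average row-wise.
        reduced = [mean(row) for row in result.values]
        kept = result.dims[0]
    return ArrayResult(dims=(kept,), dtype="double", values=reduced)


# A repeated timecourse: 2 repeats, 3 time points each (made-up numbers).
repeated = ArrayResult(
    dims=("repeat", "time"),
    dtype="double",
    values=[[0.0, 1.0, 2.0], [2.0, 3.0, 4.0]],
)
averaged = mean_over(repeated, "repeat")  # one value per time point
```

With the dimension named explicitly, "mean over repeats" and "mean over time" are no longer ambiguous, and a boolean result fails loudly instead of silently producing numbers.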

This will also make implementation much easier and less error-prone.
Exchange formats and outputs can easily be based on the data structures. For instance, Boolean transition graphs are directed graphs, so possible output formats are graph exchange formats like GML or GraphML.
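As a rough illustration of the graph case, the sketch below builds the state transition graph of a tiny Boolean network and serializes it as GML text. The two-node update rule in `step` is made up for the example, and `transition_graph` and `to_gml` are hypothetical helper names, not an existing API.

```python
from itertools import product
from typing import Dict, Tuple

State = Tuple[bool, ...]


def step(state: State) -> State:
    """Hypothetical synchronous update rule for a two-node network."""
    a, b = state
    return (b, a and b)


def transition_graph() -> Dict[State, State]:
    """Map every state of the two-node network to its successor state."""
    return {state: step(state) for state in product((False, True), repeat=2)}


def to_gml(graph: Dict[State, State]) -> str:
    """Serialize the transition graph as a directed graph in GML."""
    ids = {state: i for i, state in enumerate(sorted(graph))}
    lines = ["graph [", "  directed 1"]
    for state, node_id in ids.items():
        label = "".join("1" if bit else "0" for bit in state)
        lines.append(f'  node [ id {node_id} label "{label}" ]')
    for source, target in graph.items():
        lines.append(f"  edge [ source {ids[source]} target {ids[target]} ]")
    lines.append("]")
    return "\n".join(lines)


gml_text = to_gml(transition_graph())
```

Because the result is declared as a directed graph rather than an array, the output format follows from the data structure instead of having to be guessed.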

edit: typos and clarifications

@fbergmann
Member

Ideally the data description element of L1V3 can be used for exactly that purpose. Just like higher dimensional external data can be described, the idea would be that the individual dimensions are described for data generators as well. Then slices could be defined to get at the individual elements.
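The slicing idea can be sketched without reference to the actual L1V3 syntax. The helper below is purely illustrative: `take`, the dimension names, and the nested-list layout are assumptions, not the SED-ML data description API. It selects one index along a named dimension and returns the remaining dimensions alongside the reduced values.

```python
from typing import Tuple


def take(values: list, dims: Tuple[str, ...], dim: str, index: int):
    """Select one index along a named dimension of nested lists.

    Returns the reduced values and the tuple of remaining dimension names.
    """
    axis = dims.index(dim)  # raises ValueError for an unknown dimension

    def pick(level, depth):
        if depth == axis:
            return level[index]
        return [pick(item, depth + 1) for item in level]

    remaining = dims[:axis] + dims[axis + 1:]
    return pick(values, 0), remaining


# 2 repeats x 3 time points x 2 species (made-up numbers).
dims = ("repeat", "time", "species")
values = [
    [[0.0, 10.0], [1.0, 11.0], [2.0, 12.0]],
    [[0.5, 20.0], [1.5, 21.0], [2.5, 22.0]],
]

# Slice out the second species, then the first repeat.
species_1, dims_2d = take(values, dims, "species", 1)
first_repeat, dims_1d = take(species_1, dims_2d, "repeat", 0)
```

Chaining slices this way reduces a 3D result to the 1D series a simple plot would need, while the dimension names record what is left.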

@matthiaskoenig
Collaborator Author

Yes, this should work for all multidimensional data arrays.
Especially the new more complex plots will expect (only work with) dataGenerators of certain dimensions and with certain content. For instance, a heatmap would expect one 2D dataGenerator for the actual data (double) and two corresponding 1D dataGenerators for the axes (string or double) to describe what is plotted in each cell of the heatmap.
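The heatmap requirement could be checked along the following lines. This is only a sketch: `check_heatmap_inputs` is a hypothetical helper, and the convention that rows of the data correspond to the y axis and columns to the x axis is an assumption made for the example.

```python
from numbers import Real
from typing import List, Sequence, Tuple, Union

AxisValue = Union[str, float]


def _is_number(value) -> bool:
    # bool is a subclass of int, so exclude it explicitly.
    return isinstance(value, Real) and not isinstance(value, bool)


def _is_valid_axis(axis: Sequence[AxisValue]) -> bool:
    """An axis must be all strings or all numbers."""
    return all(isinstance(v, str) for v in axis) or all(_is_number(v) for v in axis)


def check_heatmap_inputs(
    data: List[List[float]],
    x_axis: Sequence[AxisValue],
    y_axis: Sequence[AxisValue],
) -> Tuple[int, int]:
    """Validate one 2D data array plus two 1D axes; return (rows, columns)."""
    if not data or not all(len(row) == len(data[0]) for row in data):
        raise ValueError("data must be a non-empty rectangular 2D array")
    if not all(_is_number(v) for row in data for v in row):
        raise TypeError("data cells must be numeric (double)")
    if not (_is_valid_axis(x_axis) and _is_valid_axis(y_axis)):
        raise TypeError("each axis must be all strings or all numbers")
    rows, columns = len(data), len(data[0])
    # Assumed convention: rows follow the y axis, columns follow the x axis.
    if len(y_axis) != rows or len(x_axis) != columns:
        raise ValueError("axis lengths must match the data shape")
    return rows, columns


heatmap_shape = check_heatmap_inputs(
    data=[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
    x_axis=["a", "b", "c"],
    y_axis=[0.0, 1.0],
)
```

A plot type that declares its expected dimensions and types up front can reject an unsuitable dataGenerator before any rendering is attempted.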
