[STORY] Implement Dataset.apply() and arcana apply analysis #31

Open
2 of 18 tasks
tclose opened this issue Mar 10, 2022 · 1 comment
Assignees
Labels
analysis-design Complex analysis workflow epic deep Requires advanced framework/domain-specific knowledge pipelines story a unit of work
Milestone

Comments

tclose commented Mar 10, 2022

Description

API and CLI interfaces for applying Analysis classes to datasets need to be implemented as described at https://arcana.readthedocs.io/en/develop/processing.html#analysis-classes

Acceptance Criteria

  • 1. Analysis classes can be applied to datasets via API
  • 2. Analysis classes can be applied to datasets via CLI
  • 3. Newly added methods have at least 90% test coverage
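
For criterion 1, a minimal sketch of what applying an analysis class through the API might look like: the dataset receives the class plus its arguments, builds the object, and keeps it. `ToyDataset`, `ToyAnalysis`, the `analyses` attribute, the `name` argument and `smoothing_mm` are all hypothetical stand-ins for illustration, not the actual Arcana classes or signatures.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Type


@dataclass
class ToyDataset:
    """Hypothetical stand-in for a dataset; NOT the real Arcana class."""

    name: str
    # Applied analysis objects, keyed by name (assumed storage layout)
    analyses: Dict[str, Any] = field(default_factory=dict)

    def apply(self, analysis_cls: Type, name: Optional[str] = None, **kwargs):
        """Create an `analysis_cls` object from `kwargs` and save it in the dataset."""
        analysis = analysis_cls(dataset=self, **kwargs)
        # Fall back to the class name if no explicit name is given
        key = name if name is not None else analysis_cls.__name__
        self.analyses[key] = analysis
        return analysis


@dataclass
class ToyAnalysis:
    """Hypothetical analysis class with one made-up parameter."""

    # Excluded from repr/eq to avoid recursing back through the dataset
    dataset: ToyDataset = field(repr=False, compare=False)
    smoothing_mm: float = 4.0


dataset = ToyDataset("demo")
analysis = dataset.apply(ToyAnalysis, smoothing_mm=6.0)
```

In this sketch the object is only held in memory; persisting it with the dataset's saved definition is left out.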

Dependencies

Tasks

  • dataset apply method takes an analysis class and the args to be passed to it, then creates the object and saves it in the dataset
  • add pipeline.stack method with column salience filter
    • builds stack of Pipeline objects with workflows created by builder methods
    • workflows merged into single pipelines where the outputs of the builder pipelines are below the Salience threshold and there is no merging over columns
    • Switches will need to be stored as part of Pipeline object
    • ColumnSpec objects to be returned alongside each pipeline object, signifying new columns to be added to the dataset
    • have method to display this nicely by passing the right flag to stack
  • Save pipelines in dataset alongside independent pipelines. Use them by default when creating new Derivatives, unless a "rebuild_workflow" flag or similar is passed. Saved pipelines can be compared with those that would be generated by the analysis class as a quick way of picking up changes that need to be overwritten.
  • add 'menu' method to analysis classes, make it dynamically determine which columns can be created and filter to these, unless the --all flag or similar has been set.
  • "checks" method in columnspec to return all checks on column
  • manual checks to be specified by basic check that simply returns the "questionable" value
  • Support for 'subanalysis.column' in AnalysisSpec.column(). Return mapped ColumnSpec with '.' in the name and "mapped_from" set to the appropriate value
  • map pipeline outputs when importing pipelines from sub-analyses by recreating a modified copy of the PipelineBuilder spec
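
The salience-filtered stacking described in the pipeline.stack task could be sketched roughly as below. It assumes a simple linear chain of builder steps, each producing one output column, and merges a step into the following one whenever its output falls below the threshold. The `Salience` levels, `Step`, `Pipeline` and the `stack` signature are hypothetical, and the "no merging over columns" condition, switches and ColumnSpec objects are not modelled.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class Salience(IntEnum):
    """Hypothetical salience levels, ordered from least to most important."""

    temp = 0
    debug = 1
    qa = 2
    primary = 3


@dataclass
class Step:
    """One builder-generated workflow step and the column it produces."""

    name: str
    output: str
    salience: Salience


@dataclass
class Pipeline:
    """A merged group of steps that writes a single column to the dataset."""

    steps: List[str]
    output: str


def stack(steps: List[Step], threshold: Salience) -> List[Pipeline]:
    """Group a linear chain of steps, cutting after each output at or above the threshold."""
    pipelines: List[Pipeline] = []
    pending: List[str] = []
    for step in steps:
        pending.append(step.name)
        if step.salience >= threshold:
            # Output is salient enough to be saved, so close off a pipeline here
            pipelines.append(Pipeline(steps=pending, output=step.output))
            pending = []
    # Trailing below-threshold steps produce nothing that would be saved; drop them
    return pipelines


chain = [
    Step("brain_extract", "brain_mask", Salience.debug),
    Step("register", "registered", Salience.primary),
    Step("smooth", "smoothed", Salience.temp),
    Step("stats", "stat_map", Salience.primary),
]
merged = stack(chain, threshold=Salience.qa)
```

With the `qa` threshold the four steps collapse into two pipelines, one writing "registered" and one writing "stat_map"; lowering the threshold to `temp` keeps every step as its own pipeline.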
@tclose tclose self-assigned this Mar 10, 2022
@tclose tclose added this to the 2.0.0a milestone Mar 10, 2022
@tclose tclose moved this to Backlog in AIS Master Project Mar 15, 2022
@tclose tclose moved this from Backlog to Todo in AIS Master Project Mar 28, 2022
@tclose tclose moved this from Todo to Backlog in AIS Master Project Mar 28, 2022
@tclose tclose modified the milestones: 2.0.0a, 2.0.0b Mar 28, 2022
tclose commented Jun 19, 2022

@tclose tclose moved this to Backlog in AIS Master Project Jun 20, 2022
@tclose tclose removed cli labels Jun 21, 2022
@tclose tclose changed the title Implement Dataset.apply() and arcana apply analysis [STORY] Implement Dataset.apply() and arcana apply analysis Jun 21, 2022
@tclose tclose added pipelines incomplete-desc mid-level Requires moderate framework/domain-specific knowledge analysis-design Complex analysis workflow epic labels Jun 21, 2022
@tclose tclose added story a unit of work deep Requires advanced framework/domain-specific knowledge and removed mid-level Requires moderate framework/domain-specific knowledge incomplete-desc labels Jun 21, 2022
@tclose tclose modified the milestones: 2.0, 2.0b Jun 23, 2022
@tclose tclose removed 2pt labels Aug 9, 2022
@tclose tclose moved this to Todo in AIS Master Project Aug 25, 2022
@tclose tclose moved this from Todo to In Progress in AIS Master Project Aug 31, 2022
@tclose tclose moved this from In Progress to Todo in AIS Master Project Sep 6, 2022