Road Map


Overview

| Task                              | Priority | Current State                  |
|-----------------------------------|----------|--------------------------------|
| Write Documentation               | High     | Started, with a long way to go |
| Simplify configurations           | High     | First draft complete           |
| Develop Data Conventions          | High     | First draft complete           |
| Improve Blockwise Post-Processing | Low      | Not started                    |
| Simplify Array handling           | High     | Almost done (Up/Down sampling) |


Detailed Road Map

  • [ ] Write Documentation
    • [ ] Tutorials: not more than three, simple and continuously tested (with GitHub Actions; a small U-Net on CPU could work)
      • [x] Basic tutorial: train a U-Net on a toy dataset
      • [ ] Parametrize the basic tutorial across tasks (instance/semantic segmentation)
      • [ ] Improve visualizations; move some simple plotting functions into DaCapo
      • [ ] Add a pure PyTorch implementation to show the benefits side by side
      • [ ] Track performance metrics (e.g., loss, accuracy) so we can make sure we aren't regressing
      • [ ] Semantic segmentation (LM and EM)
      • [ ] Instance segmentation (LM or EM; can be simulated)
    • [ ] General documentation of the CLI, plus the API for developers (curate docstrings)
  • [x] Simplify configurations
    • [x] Deprecate old configs
    • [x] Add simplified config for simple cases
    • [x] Can still get rid of *Config classes
  • [x] Develop Data Conventions
    • [x] Document conventions
    • [ ] Convenience scripts to convert datasets into our convention (even starting from directories of PNG files)
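One possible shape for such a convenience script is sketched below. This is only an illustration: the function name is hypothetical, the (z, y, x) axis order is an assumption about the convention, and the actual PNG decoding (e.g., via Pillow) and writing to a chunked store are elided.

```python
import numpy as np

def stack_sections(sections):
    """Stack per-section 2D arrays (e.g., decoded PNGs, ordered by z)
    into a single (z, y, x) volume.

    Assumes all sections share the same shape and dtype.
    """
    volume = np.stack(list(sections), axis=0)  # new leading z axis
    if volume.ndim != 3:
        raise ValueError(f"expected 2D sections, got volume shape {volume.shape}")
    return volume

# Stand-in for a directory of PNGs: three 4x5 grayscale sections.
sections = [np.full((4, 5), i, dtype=np.uint8) for i in range(3)]
volume = stack_sections(sections)
print(volume.shape)  # (3, 4, 5)
```

From here, the volume (plus voxel size and offset metadata) would be written to whatever container the convention prescribes.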
  • [ ] Improve Blockwise Post-Processing
    • [ ] De-duplicate code between "in-memory" and "block-wise" processing
      • [ ] Have only block-wise algorithms; use those also for "in-memory"
      • [ ] No more "in-memory": this is just a run with a different Compute Context
    • [ ] Incorporate volara into DaCapo (embargoed until January)
    • [ ] Improve debugging support (logging the chain of commands for reproducible runs)
    • [ ] Split long post-processing steps into several smaller ones for composability (e.g., support running each step independently if we want to support choosing between waterz and mutex_watershed for fragment generation or agglomeration)
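The "no more in-memory" idea above can be illustrated with a toy sketch (names hypothetical; DaCapo's actual Compute Context handles scheduling and distribution, none of which is shown): keep a single block-wise code path, and treat "in-memory" as simply running with one block that covers the whole volume.

```python
import itertools
import numpy as np

def run_blockwise(volume, block_shape, process_block):
    """Apply `process_block` to each block of `volume` and write the
    results into an output array of the same shape."""
    out = np.empty_like(volume)
    ranges = [range(0, s, b) for s, b in zip(volume.shape, block_shape)]
    for corner in itertools.product(*ranges):
        slices = tuple(
            slice(c, min(c + b, s))
            for c, b, s in zip(corner, block_shape, volume.shape)
        )
        out[slices] = process_block(volume[slices])
    return out

volume = np.arange(24, dtype=np.uint8).reshape(4, 6)
threshold = lambda block: (block > 10).astype(np.uint8)

distributed = run_blockwise(volume, (2, 3), threshold)      # many small blocks
in_memory = run_blockwise(volume, volume.shape, threshold)  # one block = "in-memory"
assert np.array_equal(distributed, in_memory)
```

This works as-is only for point-wise operations; steps with cross-block effects (e.g., agglomeration) additionally need context padding or a merge step, which is where the real block-wise machinery earns its keep.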
  • [x] Incorporate funlib.persistence adapters
    • [x] All of these can be adapters:
      • [x] Binarize labels into a mask
      • [x] Scale/shift intensities
      • [ ] Up/down sample (if easily possible)
      • [ ] DVID source
      • [x] Datatype conversions
      • [x] Everything else
    • [x] Simplify array configs accordingly
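The adapter idea can be sketched as small composable callables applied to data on read, which covers binarization, scale/shift, and dtype conversion uniformly. This is a conceptual sketch only: the function names are hypothetical and this is not the funlib.persistence API.

```python
import numpy as np

# Each adapter is a callable applied to data as it is read.
def binarize(labels_to_keep):
    """Turn a label array into a uint8 mask of the given label IDs."""
    def adapter(data):
        return np.isin(data, labels_to_keep).astype(np.uint8)
    return adapter

def scale_shift(scale, shift):
    """Affine intensity transform: data * scale + shift."""
    def adapter(data):
        return data * scale + shift
    return adapter

def astype(dtype):
    """Datatype conversion."""
    def adapter(data):
        return data.astype(dtype)
    return adapter

def apply_adapters(data, adapters):
    """Chain adapters in order; in a lazy array this would wrap reads."""
    for adapter in adapters:
        data = adapter(data)
    return data

raw = np.array([[0, 3], [7, 3]], dtype=np.uint64)
mask = apply_adapters(raw, [binarize([3, 7])])
print(mask)  # [[0 1], [1 1]]
```

The appeal is that one chain-of-callables mechanism replaces a zoo of special-purpose *Config classes for each transformation.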

Can Have

  • [ ] Support other stats stores. Too much time, effort, and code went into the stats store, and it still doesn't provide a very nice interface for:
    • [ ] defining variables to store
    • [ ] efficiently batch-writing, storing, and reading stats to both files and MongoDB
    • [ ] visualizing stats
    • [ ] Jeff and Marwan suggest MLflow instead of WandB
  • [ ] Support for Slurm clusters
  • [ ] Support for cloud computing (AWS)
  • [ ] Lazy loading of dependencies (import takes too long)
  • [ ] Support the bioimage model spec for model dissemination

Non-Goals (for v1.0)

  • Custom dashboard
  • GUI to run experiments