-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unified annotation versioning #7917
base: master
Are you sure you want to change the base?
Conversation
@philippotto A first rudimentary version seems to work: I can create an annotation on l4_sample, (note: volume buckets won’t load), take the annotation id and skeleton tracing id, and insert them in here (both in uri and in request body):
Then reload and see two skeleton nodes. Adding/Removing layer, volume annotation, proofreading is all still broken. So are the version assertions. But this can be used as a starting point to adapt the frontend:
Note: when switching to and from this branch, I clear the fossildb content with |
The next part of the frontend could be added now: The Restore Older Version flow. The version list view should no longer feature tabs per layer, but instead only one global Also, the revertToVersion update action is no longer layer specific (I guess this may already be covered by philipps changes). It is required that the revertToVersion update action is the only action in its update group (so that it gets its own version number). Not sure if the frontend already does that. Since Philipp is out of office, maybe @MichaelBuessemeyer could have a look at this after #6613 is done? Note that this branch opens fossildb with different column families. I run |
Sure :) |
@MichaelBuessemeyer the next batch of frontend todos has arrived 😇 I have not turned all of them into checkboxes in the description, feel free to do that if you prefer.
|
Regarding the first point:
It seems like @philippotto already implemented this 🥳 But regarding:
The frontend does not know this update action. Maybe it was deprecated and is no longer sent by the frontend? At least cannot find it in the frontend code of this branch and the master as well 🥴. Maybe this is an old update action that is still supported by the backend for compatibility reasons? But on the other side, the version restore view does not have such an entry for legacy purposes. => Maybe the |
Important Review skippedMore than 25% of the files skipped due to max files limit. The review is being skipped to prevent a low-quality review. 103 files out of 212 files are above the max files limit of 75. Please upgrade to Pro plan to get higher limits. You can disable this status message by setting the Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
@fm3 While implementing the new update actions which replaced some routes I came across the following questions:
Why is the Answer: This is only used for more detailed information in the version restore view. For update action
the previous equivalent route was able to update the following annotation properties:
Meaning the route was not only able to update the description and name but also the tags, visibility and viewConfig. Could this also be added to the Answer: The name and description will be versioned as update actions but the |
Some more questions :D
What about sandbox annotations? The current route for this is: Answer: Keep it. & Sandbox is only supported for skeleton annotations. Moreover, the old annotation info fetching route ( TODO [ ]: Always only use the info from the TS route. Maybe the core backend has layers that were deleted already and so on. TODO: micha [ ] Test me with version restore Moreover, is it possible to split up an annotation (its layers) across multiple tracing stores? If that's the case, the frontend needs to call all tracing stores available in the maybe outdated annotation info object from the backend and then merge this information? There can be only one TS at a time for a single WK instance. => So this is not possible |
…nds/webknossos into unified-annotation-versioning
…nds/webknossos into unified-annotation-versioning
…ing annotation info with that information when loading an annotation
note to self:
|
…nds/webknossos into unified-annotation-versioning
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reviewed the migration script. Looks good to me, but I left a few comments to understand some details better.
MIGRATIONS.unreleased.md
Outdated
- FossilDB must now be opened with new column family set `skeletons,volumes,volumeData,volumeSegmentIndex,editableMappingsInfo,editableMappingsAgglomerateToGraph,editableMappingsSegmentToAgglomerate,annotations,annotationUpdates`. [#7917](https://github.com/scalableminds/webknossos/pull/7917) | ||
- The FossilDB content needs to be migrated. For that, use the python program at `tools/migration-unified-annotation-versioning` (see python main.py --help for instructions). Note that it writes to a completely new FossilDB, that must first be opened with the new column families, see above. The migration code needs to connect to postgres, to the old FossilDB and to the new. After the migration, replace the old FossilDB by the new one. The migration can also be run in several steps so that the majority of the data can already be migrated while WEBKNOSSOS is still running. Then only annotations that have been edited again since the first run need to be migrated in the incremental second run during a WEBKNOSSOS downtime. [#7917](https://github.com/scalableminds/webknossos/pull/7917) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- FossilDB must now be opened with new column family set `skeletons,volumes,volumeData,volumeSegmentIndex,editableMappingsInfo,editableMappingsAgglomerateToGraph,editableMappingsSegmentToAgglomerate,annotations,annotationUpdates`. [#7917](https://github.com/scalableminds/webknossos/pull/7917) | |
- The FossilDB content needs to be migrated. For that, use the python program at `tools/migration-unified-annotation-versioning` (see python main.py --help for instructions). Note that it writes to a completely new FossilDB, that must first be opened with the new column families, see above. The migration code needs to connect to postgres, to the old FossilDB and to the new. After the migration, replace the old FossilDB by the new one. The migration can also be run in several steps so that the majority of the data can already be migrated while WEBKNOSSOS is still running. Then only annotations that have been edited again since the first run need to be migrated in the incremental second run during a WEBKNOSSOS downtime. [#7917](https://github.com/scalableminds/webknossos/pull/7917) | |
- The versioning scheme of annotations has been changed. That requires a larger migration including the FossilDB content. [#7917](https://github.com/scalableminds/webknossos/pull/7917) | |
- FossilDB must now be opened with new column family set `skeletons,volumes,volumeData,volumeSegmentIndex,editableMappingsInfo,editableMappingsAgglomerateToGraph,editableMappingsSegmentToAgglomerate,annotations,annotationUpdates`. [#7917](https://github.com/scalableminds/webknossos/pull/7917) | |
- The FossilDB content needs to be migrated. For that, use the python program at `tools/migration-unified-annotation-versioning` (see python main.py --help for instructions). Note that it writes to a completely new FossilDB, that must first be opened with the new column families, see above. The migration code needs to connect to postgres, to the old FossilDB and to the new. After the migration, replace the old FossilDB by the new one. The migration can also be run in several steps so that the majority of the data can already be migrated while WEBKNOSSOS is still running. Then only annotations that have been edited again since the first run need to be migrated in the incremental second run during a WEBKNOSSOS downtime. [#7917](https://github.com/scalableminds/webknossos/pull/7917) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it would also be helpful to explain how to start a second fossildb instance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No version constraints?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what the best variant for ths is. I now added the versions that pip show gave me with ==
but I guess that may be too restrictive? What do you think?
tools/migration-unified-annotation-versioning/Annotation_pb2.py
Outdated
Show resolved
Hide resolved
Co-authored-by: Norman Rzepka <[email protected]>
5cb76ad
to
60139a0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hej, I finished the first backend review round. 🎉
The code looks in a very good state to me. Well done @fm3 🥇
I mostly have question for understanding regarding functionality, design and only few high level remarks / questions. Moreover, I also answered the questions myself sometimes but left them for you to double check in case I understood something wrong. So please don't feel overwhelmed by all my comments 🙈
Open TODOs for me:
- Check backend tests (I skipped those)
- Do testing
- Review frontend
If I may have one wish for such big future PRs & I am just guessing here whether that was actually the problem: I noticed that some files were moved and also changed. Somehow my IDE which I used to review the code did not realize that a file was moved, thus showing the changes not in an optimal way (github as well). This might be originating from the fact that git itself did not realize that a file was moved. In case you did not do that: Please make sure when moving a file to commit the deletion and adding the new file in the same commit. This hopefully makes git realize that the file was moved instead of deleted and a new one was added. Changes to the moved file can then be done in another commit. Hopefully, in this case the changes will be shown as a moved file with changes instead of deletion a file and addition of a new file.
@@ -160,7 +161,19 @@ class UserTokenController @Inject()(datasetDAO: DatasetDAO, | |||
private def handleTracingAccess(tracingId: String, | |||
mode: AccessMode, | |||
userBox: Box[User], | |||
token: Option[String]): Fox[UserAccessAnswer] = { | |||
token: Option[String]): Fox[UserAccessAnswer] = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cant this also use the new token context?
private def handleAnnotationAccess(annotationId: String, | ||
mode: AccessMode, | ||
userBox: Box[User], | ||
token: Option[String]): Fox[UserAccessAnswer] = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cant this also use the new token context?
def resetToBase(annotation: Annotation)(implicit ctx: DBAccessContext): Fox[Unit] = | ||
for { | ||
_ <- bool2Fox(annotation.typ == AnnotationType.Task) ?~> "annotation.revert.tasksOnly" | ||
dataset <- datasetDAO.findOne(annotation._dataset) | ||
tracingStoreClient <- tracingStoreService.clientFor(dataset) | ||
_ <- tracingStoreClient.resetToBase(annotation._id) ?~> "annotation.revert.failed" | ||
} yield () |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here the postgres data is not updated (previously it wasn't as well). This is not needed because it will automatically happen in case the update action resetToBase
(or so) is applied? Is this ensured by the frontend by e.g. auto reloading the annotation after triggering the action?
Just wanted to double check this as it came up in my mind. I am not familiar with the reset process of task annotations :)
def duplicateSkeletonTracing(skeletonTracingId: String, | ||
versionString: Option[String] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just double checking: we do no longer support duplicating a specific version becasue this code is only called when creating a task based on the base annotation and this does not have versions anyway?
POST /tracingstores/:name/validateUserAccess controllers.UserTokenController.validateAccessViaTracingstore(name: String, key: String, token: Option[String]) | ||
PUT /tracingstores/:name controllers.TracingStoreController.update(name: String) | ||
GET /tracingstores/:name/dataSource controllers.WKRemoteTracingStoreController.dataSourceForTracing(name: String, key: String, tracingId: String) | ||
GET /tracingstores/:name/dataSourceId controllers.WKRemoteTracingStoreController.dataSourceIdForTracing(name: String, key: String, tracingId: String) | ||
GET /tracingstores/:name/annotationId controllers.WKRemoteTracingStoreController.annotationIdForTracing(name: String, key: String, tracingId: String) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a random thought: If this route is called often, wouldn't it be possible to save the annotation id in the tracing proto object and just get ir from there instead of asking he core backend?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah right, I remember now seeing the TSRemoveWKClient -> this is cached. So it should be fine imo.
)) | ||
} yield () | ||
} else Fox.successful(()) | ||
} | ||
_ = Instant.logSince( | ||
before, | ||
s"Duplicating $bucketCount volume buckets from $sourceTracingId v${sourceTracing.version} to $newTracingId v${newTracing.version}.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need to be removed? Was this for performance testing or do you want to keep it for general monitoring?
userToken: Option[String])(implicit mp: MessagesProvider): Fox[MergedVolumeStats] = { | ||
def mergeVolumeData( | ||
tracingIds: Seq[String], | ||
tracings: Seq[VolumeTracing], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On a high level:
Honestly, I think it would be useful to include the tracing id in the volume and skeleton proto. This helps in the code not having to always pass both. This might make things easier. e.g. the tracingIds.zip(tracings)
from below might not be needed.
this.copy(actionAuthorId = authorId) | ||
override def isViewOnlyChange: Boolean = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you remove this?
this.copy(actionAuthorId = authorId) | ||
override def isViewOnlyChange: Boolean = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why did you remove this here as well?
|
||
object UpdateUserBoundingBoxVisibility { | ||
implicit val jsonFormat: OFormat[UpdateUserBoundingBoxVisibility] = Json.format[UpdateUserBoundingBoxVisibility] | ||
override def isViewOnlyChange: Boolean = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 in some skeleton actions you explicitly removed this isViewOnlyChange
. Here you added it
re-requesting review for frontend |
sorry, misclick 🙈 |
const params = new URLSearchParams(); | ||
|
||
const segmentIdBuffer = serializeProtoListOfLong<T>(segmentIds); | ||
const listArrayBuffer: ArrayBuffer = await doWithToken((token) => | ||
Request.receiveArraybuffer( | ||
`${dataStoreUrl}/data/datasets/${dataSourceId.owningOrganization}/${dataSourceId.directoryName}/layers/${layerName}/agglomerates/${mappingId}/agglomeratesForSegments?token=${token}`, | ||
const listArrayBuffer: ArrayBuffer = await doWithToken((token) => { | ||
params.append("token", token); | ||
return Request.receiveArraybuffer( | ||
`${dataStoreUrl}/data/datasets/${dataSourceId.owningOrganization}/${dataSourceId.directoryName}/layers/${layerName}/agglomerates/${mappingId}/agglomeratesForSegments?${params}`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nitpicky but what about
const params = new URLSearchParams(); | |
const segmentIdBuffer = serializeProtoListOfLong<T>(segmentIds); | |
const listArrayBuffer: ArrayBuffer = await doWithToken((token) => | |
Request.receiveArraybuffer( | |
`${dataStoreUrl}/data/datasets/${dataSourceId.owningOrganization}/${dataSourceId.directoryName}/layers/${layerName}/agglomerates/${mappingId}/agglomeratesForSegments?token=${token}`, | |
const listArrayBuffer: ArrayBuffer = await doWithToken((token) => { | |
params.append("token", token); | |
return Request.receiveArraybuffer( | |
`${dataStoreUrl}/data/datasets/${dataSourceId.owningOrganization}/${dataSourceId.directoryName}/layers/${layerName}/agglomerates/${mappingId}/agglomeratesForSegments?${params}`, | |
const segmentIdBuffer = serializeProtoListOfLong<T>(segmentIds); | |
const listArrayBuffer: ArrayBuffer = await doWithToken((token) => { | |
const params = new URLSearchParams({token}); | |
return Request.receiveArraybuffer( | |
`${dataStoreUrl}/data/datasets/${dataSourceId.owningOrganization}/${dataSourceId.directoryName}/layers/${layerName}/agglomerates/${mappingId}/agglomeratesForSegments?${params}`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok 🎉 here is my frontend review.
Thanks for implementing the frontend part @philippotto.
The code was mainly very straight forward imo. So no big suggestions from my side.
Open TODOs for me
- Testing
- Review backend tests
// The action creators pushSaveQueueTransaction and pushSaveQueueTransactionIsolated | ||
// are typed so that update actions that need isolation are isolated in a group each. | ||
// From this point on, we can assume that the groups fulfil the isolation requirement. | ||
export const pushSaveQueueTransaction = ( | ||
items: Array<UpdateAction>, | ||
saveQueueType: SaveQueueType, | ||
tracingId: string, | ||
transactionId: string = getUid(), | ||
) => | ||
items: Array<UpdateActionWithoutIsolationRequirement>, | ||
): PushSaveQueueTransaction => | ||
({ | ||
type: "PUSH_SAVE_QUEUE_TRANSACTION", | ||
items, | ||
saveQueueType, | ||
tracingId, | ||
transactionId, | ||
transactionId: getUid(), | ||
}) as const; | ||
|
||
export const pushSaveQueueTransactionIsolated = ( | ||
item: UpdateActionWithIsolationRequirement, | ||
): PushSaveQueueTransaction => | ||
({ | ||
type: "PUSH_SAVE_QUEUE_TRANSACTION", | ||
items: [item], | ||
transactionId: getUid(), | ||
}) as const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice 🎉
looks like you found a very good solution to type support the isolation requirement and also made it readable in the code 🎉 💪
const volumeTracing = yield* select((state) => getActiveSegmentationTracing(state)); | ||
if (!volumeTracing || !volumeTracing.mappingName) { | ||
// This should never occur, because the proofreading tool is only available when a volume tracing layer is active. | ||
throw new Error("No active segmentation tracing layer. Cannot create editable mapping."); | ||
} | ||
const upToDateVolumeTracing = yield* select((state) => getActiveSegmentationTracing(state)); | ||
if (upToDateVolumeTracing == null) { | ||
throw new Error("No active segmentation tracing layer. Cannot create editable mapping."); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One of the selects can be removed. I don't see any scenario in which the might differ as no reducer or such should be able to run inbetween the two select statements. Or am I wrong here?
Moreover, the second check (ll. 272-274) is already included in the first if check. Therefore, this can also be removed IMO
|
||
if (showVersionRestore) { | ||
yield* put(setVersionRestoreVisibilityAction(true)); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for moving cleaning this up by moving it to annotation_saga
(if I remember correctly)
export enum AnnotationLayerEnum { | ||
Skeleton = "Skeleton", | ||
Volume = "Volume", | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@philippotto I think I added this change. As you did not undo this refactoring I can assume that you agree that this is an improvement?
@@ -220,14 +227,14 @@ export function AnnotationStats({ | |||
{asInfoBlock && <p className="sidebar-label">Statistics</p>} | |||
<table className={asInfoBlock ? "annotation-stats-table" : "annotation-stats-table-slim"}> | |||
<tbody> | |||
{"treeCount" in stats ? ( | |||
{skeletonStats && "treeCount" in skeletonStats ? ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't the "treeCount" in skeletonStats
too much checking here? Shouldn't checking skeletonStats != null
be sufficient?
if (initialMaybeCompoundType != null) { | ||
annotation = await getAnnotationCompoundInformation(annotationId, initialMaybeCompoundType); | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fm3, I am just double checking here Does the frontend also need to fetch an annotation proto object in order to get an up to date compound annotation? Afaik not as this is "just" a merged annotation which is in the temporary cache in the tracing store.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So imo the code here should be correct not requesting an annotation proto object from the tracing store
@@ -118,10 +124,11 @@ export const annotation: APIAnnotation = { | |||
allowDownload: true, | |||
}, | |||
annotationLayers: [ | |||
// does this still exist? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean by that?
@@ -2,10 +2,10 @@ import mockRequire from "mock-require"; | |||
import test from "ava"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for improving the typing in this file 🙏
t.deepEqual(updateActions[1], { | ||
name: "deleteTree", | ||
value: { | ||
actionTracingId: "tracingId", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe define actionTracingId
for the whole file and use it like this
actionTracingId: "tracingId", | |
value: { | |
actionTracingId, | |
id: 2, | |
}, |
@@ -1544,7 +1539,7 @@ Generated by [AVA](https://avajs.dev). | |||
interpolation: true, | |||
mag: 2, | |||
position: { | |||
x: 9120, | |||
x: 9286, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did the coordinates change here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's actually pretty important to clarify this (IMO)
Summary
Steps to test:
WK Testing
Note: Since the FossilDB is opened with a new set of column families, switching to and from this branch requires a fresh FossilDB data dir. I use
rm -rf fossildb/data; git co fossildb/data/.gitignore
Migration Testing
TODOs
Backend
// TODO
commentsisCompacted
bool in json, just display those in version log, no applyagglomerateIdsForSegments
needs to support aversion
parameter so that previewing an older version worksFrontend
version
needs to be used when mapping segments to agglomerates inagglomerateIdsForSegments
same for initialize editable mapping -> why? code reads like this is not done and it works. A saved state is simply required for proofreading actionsSyncthese are two different concepts.layerIndependentActions
with backend -> should include only actions which the backend has set to be in a single transaction groupdeleteAnnotationLayer
, which should be removed and replaced with the update actionnewestVersion
sometimes called with emptystring annotationId? looks like this happens in dashboard/during navigation?Migration
SELECT COUNT(*)
? is the pagination buggy? Nope, some annotations have zero layers, they get thrown out by the JOIN. That’s ok.Issues: