asset metadata not appearing on DLP #1734
Comments
cc @weiglszonja |
i suspect the assets summary generator is failing, but the archive is ignoring that failure. on the archive side, two things should happen:
could one of you try generating the asset summary directly in python using the dandischema function (i believe it's in the metadata module)? it takes a list of asset metadata as input. |
@weiglszonja, could you give this a shot? |
@satra can you tell me how to generate the asset metadata? I'm not sure I found what you were referring to.
from dandi.metadata import nwb2asset
asset_md = nwb2asset("000691/sub-4mm-mouse-1/sub-4mm-mouse-1_image+ophys.nwb")
This returns:
{
"id": None,
"schemaKey": "Asset",
"schemaVersion": "0.6.4",
"name": None,
"description": None,
"contributor": None,
"about": None,
"studyTarget": None,
"license": None,
"protocol": None,
"ethicsApproval": None,
"keywords": None,
"acknowledgement": None,
"access": [
{
"id": None,
"schemaKey": "AccessRequirements",
"status": "dandi:OpenAccess",
"contactPoint": None,
"description": None,
"embargoedUntil": None
}
],
"url": None,
"repository": None,
"relatedResource": None,
"wasGeneratedBy": [
{
"id": None,
"schemaKey": "Session",
"identifier": "2021-07-26T13-50-50",
"name": "2021-07-26T13-50-50",
"description": "This session includes calcium imaging recorded from a head-mounted microscope in a freely moving mouse while simultaneously recording more than a thousand neurons in cortex.",
"startDate": "2021-07-26T13-50-50",
"endDate": None,
"wasAssociatedWith": None,
"used": None
},
{
"id": "urn:uuid:21285aea-6c45-4d5e-a05b-5c8f9a541045",
"schemaKey": "Activity",
"identifier": None,
"name": "Metadata generation",
"description": "Metadata generated by DANDI cli",
"startDate": "2023-11-07T21:46:06.850514+0100",
"endDate": "2023-11-07T21:46:36.501100+0100",
"wasAssociatedWith": [
{
"id": None,
"schemaKey": "Software",
"identifier": "RRID:SCR_019009",
"name": "DANDI Command Line Interface",
"version": "0.55.1",
"url": "https://github.com/dandi/dandi-cli"
}
],
"used": None
}
],
"contentSize": 15232760735,
"encodingFormat": "application/x-nwb",
"digest": {},
"path": "/Volumes/t7-ssd/fee-lab-to-nwb/ophys/final/000691/sub-4mm-mouse-1/sub-4mm-mouse-1_image+ophys.nwb",
"dateModified": "2023-11-07T21:46:36.501290+0100",
"blobDateModified": "2023-11-02T17:54:28+0100",
"dataType": None,
"sameAs": None,
"approach": [
{
"id": None,
"schemaKey": "ApproachType",
"identifier": None,
"name": "microscopy approach; cell population imaging"
}
],
"measurementTechnique": [
{
"id": None,
"schemaKey": "MeasurementTechniqueType",
"identifier": None,
"name": "surgical technique"
},
{
"id": None,
"schemaKey": "MeasurementTechniqueType",
"identifier": None,
"name": "two-photon microscopy technique"
},
{
"id": None,
"schemaKey": "MeasurementTechniqueType",
"identifier": None,
"name": "analytical technique"
}
],
"variableMeasured": [
{
"id": None,
"schemaKey": "PropertyValue",
"maxValue": None,
"minValue": None,
"unitText": None,
"value": "TwoPhotonSeries",
"valueReference": None,
"propertyID": None
},
{
"id": None,
"schemaKey": "PropertyValue",
"maxValue": None,
"minValue": None,
"unitText": None,
"value": "ImagingPlane",
"valueReference": None,
"propertyID": None
},
{
"id": None,
"schemaKey": "PropertyValue",
"maxValue": None,
"minValue": None,
"unitText": None,
"value": "ProcessingModule",
"valueReference": None,
"propertyID": None
},
{
"id": None,
"schemaKey": "PropertyValue",
"maxValue": None,
"minValue": None,
"unitText": None,
"value": "PlaneSegmentation",
"valueReference": None,
"propertyID": None
},
{
"id": None,
"schemaKey": "PropertyValue",
"maxValue": None,
"minValue": None,
"unitText": None,
"value": "OpticalChannel",
"valueReference": None,
"propertyID": None
}
],
"wasDerivedFrom": None,
"wasAttributedTo": [
{
"id": None,
"schemaKey": "Participant",
"identifier": "4mm-mouse-1",
"altName": None,
"strain": None,
"cellLine": None,
"vendor": None,
"age": {
"id": None,
"schemaKey": "PropertyValue",
"maxValue": None,
"minValue": None,
"unitText": "ISO-8601 duration",
"value": "P3M",
"valueReference": {
"id": None,
"schemaKey": "PropertyValue",
"maxValue": None,
"minValue": None,
"unitText": None,
"value": "dandi:BirthReference",
"valueReference": None,
"propertyID": None
},
"propertyID": None
},
"sex": {
"id": None,
"schemaKey": "SexType",
"identifier": "http://purl.obolibrary.org/obo/PATO_0000384",
"name": "Male"
},
"genotype": "C57/B6",
"species": {
"id": None,
"schemaKey": "SpeciesType",
"identifier": "http://purl.obolibrary.org/obo/NCBITaxon_10090",
"name": "Mus musculus - House mouse"
},
"disorder": None,
"relatedParticipant": None,
"sameAs": None
}
]
}
|
@weiglszonja - here you go:
from dandi.dandiapi import DandiAPIClient
from dandischema.metadata import aggregate_assets_summary
api = DandiAPIClient()
ds = api.get_dandiset("000691")
aggregate_assets_summary(ds.get_assets())
it seems that the avi file metadata is missing |
Thank you @satra, I see the error now. Do you have any suggestion how to fix it and why this happened? |
@AlmightyYakob - do you know how a metadata field without a |
@weiglszonja - was the avi file uploaded by skipping around validation? and if not, @yarikoptic, is there a reason why you think the asset did not get @AlmightyYakob - can we check on the database side how many assets in which dandisets are missing |
@satra I followed these steps in the terminal, I used
|
thanks @weiglszonja - that by itself should not result in a lack of |
I do not see any .avi among assets and all of those .nwb I see have schemaVersion
❯ curl --silent -X 'GET' 'https://api.dandiarchive.org/api/dandisets/000631/versions/draft/assets/?metadata=true' -H 'accept: application/json' | jq '.results | .[] | { path: .path, schemaVersion: .metadata.schemaVersion}'
{
"path": "sub-600ns-4kV-0,8MHz-BP-8-24-21-BPAE-20/sub-600ns-4kV-0,8MHz-BP-8-24-21-BPAE-20_ses-600ns-4kV-0-8MHz paired pulse trains_image.nwb",
"schemaVersion": "0.6.4"
}
{
"path": "sub-600ns-4kV-0,8MHz-BP-8-24-21-BPAE-13/sub-600ns-4kV-0,8MHz-BP-8-24-21-BPAE-13_ses-BPAE-5xBipolar-0,83MHz_image.nwb",
"schemaVersion": "0.6.4"
}
{
"path": "sub-600ns-4kV-0,8MHz-BP-8-24-21-BPAE-15/sub-600ns-4kV-0,8MHz-BP-8-24-21-BPAE-15_ses-600ns-4kV-0-8MHz paired pulse trains_image.nwb",
"schemaVersion": "0.6.4"
}
{
"path": "sub-600ns-4kV-0,8MHz-UP-10-5-21-BPAE-06/sub-600ns-4kV-0,8MHz-UP-10-5-21-BPAE-06_ses-600ns-4kV-0-8MHz single pulse trains_image.nwb",
"schemaVersion": "0.6.4"
}
{
"path": "sub-600ns-4kV-0,8MHz-UP-10-5-21-BPAE-05/sub-600ns-4kV-0,8MHz-UP-10-5-21-BPAE-05_ses-600ns-4kV-0-8MHz single pulse trains_image.nwb",
"schemaVersion": "0.6.4"
}
{
"path": "sub-600ns-5kV-1HzBP-8-9-21-BPAE-13/sub-600ns-5kV-1HzBP-8-9-21-BPAE-13_ses-600ns-5kV-1Hz paired pulse trains_image.nwb",
"schemaVersion": "0.6.4"
}
{
"path": "sub-600ns-5kV-1HzBP-8-9-21-BPAE-15/sub-600ns-5kV-1HzBP-8-9-21-BPAE-15_ses-600ns-5kV-1Hz paired pulse trains_image.nwb",
"schemaVersion": "0.6.4"
}
{
"path": "sub-600ns-4kV-0,8MHz-UP-10-5-21-BPAE-10/sub-600ns-4kV-0,8MHz-UP-10-5-21-BPAE-10_ses-600ns-4kV-0-8MHz single pulse trains_image.nwb",
"schemaVersion": "0.6.4"
}
{
"path": "sub-600ns-5kV-1HzUP-7-31-21-BPAE-03/sub-600ns-5kV-1HzUP-7-31-21-BPAE-03_ses-600ns-5kV-1Hz single pulse trains_image.nwb",
"schemaVersion": "0.6.4"
}
{
"path": "sub-600ns-5kV-1HzBP-8-9-21-BPAE-8/sub-600ns-5kV-1HzBP-8-9-21-BPAE-8_ses-600ns-5kV-1Hz paired pulse trains_image.nwb",
"schemaVersion": "0.6.4"
}
{
"path": "sub-600ns-5kV-1HzUP-7-31-21-BPAE-10/sub-600ns-5kV-1HzUP-7-31-21-BPAE-10_ses-600ns-5kV-1Hz single pulse trains_image.nwb",
"schemaVersion": "0.6.4"
}
{
"path": "sub-600ns-5kV-1HzUP-7-31-21-BPAE-18/sub-600ns-5kV-1HzUP-7-31-21-BPAE-18_ses-600ns-5kV-1Hz single pulse trains_image.nwb",
"schemaVersion": "0.6.4"
}
{
"path": "sub-600ns-5kV-1HzUP-7-31-21-BPAE-5/sub-600ns-5kV-1HzUP-7-31-21-BPAE-5_ses-600ns-5kV-1Hz single pulse trains_image.nwb",
"schemaVersion": "0.6.4"
}
{
"path": "sub-600ns-5kV-1HzUP-8-9-21-BPAE-14/sub-600ns-5kV-1HzUP-8-9-21-BPAE-14_ses-600ns-5kV-1Hz single pulse trains_image.nwb",
"schemaVersion": "0.6.4"
}
{
"path": "sub-600ns-5kV-1HzUP-8-9-21-BPAE-10/sub-600ns-5kV-1HzUP-8-9-21-BPAE-10_ses-600ns-5kV-1Hz single pulse trains_image.nwb",
"schemaVersion": "0.6.4"
} |
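For anyone who prefers Python to jq, the same check can be sketched as below. This is only an illustration: it assumes the response has already been parsed into a dict shaped like the output above (a `results` list whose items carry `path` and `metadata.schemaVersion`). The function names and the sample data are made up and are not part of dandi or dandischema.

```python
from typing import Any, Dict, List


def schema_versions(page: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Mirror the jq filter: one {path, schemaVersion} record per asset."""
    return [
        {
            "path": asset.get("path"),
            # Metadata may be absent or empty; report None in that case.
            "schemaVersion": (asset.get("metadata") or {}).get("schemaVersion"),
        }
        for asset in page.get("results", [])
    ]


def missing_schema_version(page: Dict[str, Any]) -> List[Any]:
    """Return the paths of assets whose metadata has no schemaVersion."""
    return [r["path"] for r in schema_versions(page) if r["schemaVersion"] is None]


# Hypothetical sample page (not real archive data), in the assumed shape.
sample_page = {
    "results": [
        {"path": "sub-01/sub-01_image.nwb", "metadata": {"schemaVersion": "0.6.4"}},
        {"path": "sub-01/sub-01_behavior.avi", "metadata": {}},
    ]
}

print(missing_schema_version(sample_page))
```

An empty list here would mean every asset on that page carries a `schemaVersion`.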
@yarikoptic - you are looking at 631 not 691. also in dandiset 631 that you looked at, there are |
I noted spaces... for both spaces and
I don't know ;) |
so what are we missing? |
It shouldn't be possible to store an asset with a missing
Indeed, querying for assets that have a missing |
There's been a lot of discussion here around @bendichter It appears the primary reason that the assets summary is missing on |
the discussion was based on a returned error here #1734 (comment), but that code is incorrect because that function doesn't accept an iterable of dandi RemoteAssets. instead it should have been:
aggregate_assets_summary([asset.get_raw_metadata() for asset in ds.get_assets()])
indeed, once this is done the schema version error doesn't show up. |
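To make the mix-up above harder to repeat, one option is a small guard that normalizes whatever is passed in to plain metadata dicts before aggregating. The sketch below is a hypothetical helper, not part of dandi or dandischema. `to_raw_metadata` and `StubAsset` are invented names, and the stub only imitates the `get_raw_metadata()` method mentioned in the comment so the example runs without network access.

```python
from typing import Any, Dict, Iterable, List


def to_raw_metadata(assets: Iterable[Any]) -> List[Dict[str, Any]]:
    """Return plain metadata dicts from a mix of dicts and asset objects.

    Hypothetical helper: dicts pass through unchanged, objects exposing
    get_raw_metadata() are converted, anything else is rejected early.
    """
    records: List[Dict[str, Any]] = []
    for asset in assets:
        if isinstance(asset, dict):
            records.append(asset)
        elif hasattr(asset, "get_raw_metadata"):
            records.append(asset.get_raw_metadata())
        else:
            raise TypeError(f"cannot extract metadata from {type(asset).__name__}")
    return records


class StubAsset:
    """Stand-in for a remote asset object; only for this illustration."""

    def __init__(self, metadata: Dict[str, Any]) -> None:
        self._metadata = metadata

    def get_raw_metadata(self) -> Dict[str, Any]:
        return dict(self._metadata)


stub_assets = [
    StubAsset({"schemaVersion": "0.6.4", "path": "a.nwb"}),
    {"schemaVersion": "0.6.4", "path": "b.nwb"},
]
raw = to_raw_metadata(stub_assets)
print([m["path"] for m in raw])
```

With dandi installed, the result of `to_raw_metadata(ds.get_assets())` should be equivalent to the list comprehension above and could be passed to `aggregate_assets_summary` the same way.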
@AlmightyYakob - it also doesn't show up on dandiset 000026. |
I came looking for reports on the same issue which I'm facing with two datasets currently in draft status: Both uploaded using the cli using the same commands @weiglszonja reported:
A very similar dataset we validated and uploaded earlier this year shows Asset Summary information with no issues: Any suggestions for how to proceed? Can we publish the datasets and have the asset summary generated afterwards? |
@jjnesbitt could you please try running reaggregation on the originally reported 000691 and then 000768 and 000769 while keeping an eye on whether any errors get triggered? |
I've re-run the aggregation and all three of these dandisets now show a proper assets summary. |
@jjnesbitt and @yarikoptic Thank you for helping out so quickly, it's much appreciated. |
https://dandiarchive.org/dandiset/000691/draft
Does anyone know why asset metadata might not be appearing here?