[bug] attempt to fetch volumes with multiple sources can occasionally result in nondeterministic behaviour #318

xgui3783 · 2023-03-30T17:17:13Z

as of 0.4a35

When fetch is called on a volume with multiple providers (e.g. nii + neuroglancer/precomputed), network conditions determine which provider is used.

e.g. for the mni152 jb29 labelled map, where both nii and precomputed are provided, nii is attempted first (per https://github.com/FZJ-INM1-BDA/siibra-python/blob/01d3d09/siibra/volumes/volume.py#L40-L44 ). If it fails (the source is unreachable, in maintenance, or being rate limited), the neuroglancer precomputed source is used instead.
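A minimal sketch of the fallback described above. The provider callables and the `SourceUnavailable` exception are hypothetical stand-ins, not siibra's actual classes; the point is only that the same call returns a different source depending on whether the first one is reachable.

```python
from typing import Callable, Dict, List


class SourceUnavailable(Exception):
    """Raised by a (hypothetical) provider when its source cannot be reached."""


def fetch_with_fallback(providers: Dict[str, Callable[[], str]], order: List[str]) -> str:
    """Try each provider in order and return the first result that succeeds."""
    errors = []
    for fmt in order:
        if fmt not in providers:
            continue
        try:
            return providers[fmt]()
        except SourceUnavailable as exc:
            # Silently move on to the next source: this is the problematic step.
            errors.append(f"{fmt}: {exc}")
    raise RuntimeError("no provider could be fetched: " + "; ".join(errors))


def nii_ok() -> str:
    return "nii (full resolution)"


def nii_down() -> str:
    raise SourceUnavailable("404")


def precomputed() -> str:
    return "neuroglancer/precomputed (lowest resolution)"


ORDER = ["nii", "neuroglancer/precomputed"]
healthy = fetch_with_fallback({"nii": nii_ok, "neuroglancer/precomputed": precomputed}, ORDER)
degraded = fetch_with_fallback({"nii": nii_down, "neuroglancer/precomputed": precomputed}, ORDER)
print(healthy)
print(degraded)
```

The caller gets no signal that the second call was served from a different source.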

The precomputed provider's fetch seems to default to the lowest resolution.

As a result, the voxel count will likely be lower by a factor of 8.
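The factor of 8 follows if the lower resolution halves the sampling along each of the three axes (2 ** 3 = 8); that halving is an assumption here, not something confirmed from the precomputed metadata. The counts reported in this issue (6203 vs 757) give a ratio of roughly 8.2, which is consistent with it. A small sketch on a synthetic mask, not the actual map:

```python
import numpy as np

# Synthetic stand-in for a labelled map: a solid ball of radius 20 in a 64^3 grid.
grid = np.indices((64, 64, 64)) - 32
full = (np.sum(grid ** 2, axis=0) <= 20 ** 2).astype(np.uint8)

# Halve the resolution by keeping every second voxel along each axis.
half = full[::2, ::2, ::2]

ratio = np.count_nonzero(full) / np.count_nonzero(half)
print(ratio)  # close to 2 ** 3 = 8
```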

code to reproduce:

caveats:

requests-mock (a test dependency) needs to be installed
siibra.cache.clear() needs to be run between each run

import siibra
import numpy as np
import requests_mock

siibra.cache.clear()

url = "https://neuroglancer.humanbrainproject.eu/precomputed/data-repo-ng-bot/20210714-julichbrain_v290_correctaffine/hbp-d000007_julichbrain-cytoatlas_dev/MPMs/29.0/GapMapPublicMPMAtlas_l_N10_nlin2StdICBM152asym2009c_29_publicDOI_b1bcc0b5a127f5a917f0d8b0e869be5d.nii.gz"

def main():
    region = siibra.get_region("2.9", "area 1 (postcg) left")
    rmap = region.fetch_regional_map("mni152")
    print(np.count_nonzero(rmap.get_fdata()))

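def mocked():
    # real_http=True lets every other request through; only the nii URL is forced to 404
    with requests_mock.Mocker(real_http=True) as mocker:
        mocker.register_uri(url=url, method="get", status_code=404, text="oh no")
        main()

main()  # nii source reachable: prints 6203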

After clearing the siibra cache, run the mocked variant:

import siibra
import numpy as np
import requests_mock

siibra.cache.clear()

url = "https://neuroglancer.humanbrainproject.eu/precomputed/data-repo-ng-bot/20210714-julichbrain_v290_correctaffine/hbp-d000007_julichbrain-cytoatlas_dev/MPMs/29.0/GapMapPublicMPMAtlas_l_N10_nlin2StdICBM152asym2009c_29_publicDOI_b1bcc0b5a127f5a917f0d8b0e869be5d.nii.gz"

def main():
    region = siibra.get_region("2.9", "area 1 (postcg) left")
    rmap = region.fetch_regional_map("mni152")
    print(np.count_nonzero(rmap.get_fdata()))

def mocked():
    with requests_mock.Mocker(real_http=True) as mocker:
        mocker.register_uri(url=url, method="get", status_code=404, text="oh no")
        main()

mocked() # prints 757


AhmetNSimsek · 2023-05-11T11:45:36Z

Using df51a05, I cannot reproduce #318. I always get 6203.

xgui3783 · 2023-05-11T11:53:11Z

I can still reproduce the bug on df51a05.

The key is, one needs to clear the cache between each run; otherwise, the cached map skews the test result.

I have clarified that in the original issue.

AhmetNSimsek · 2023-05-11T11:54:41Z

The key is, one needs to clear the cache between each run; otherwise, the cached map skews the test result.

Yes, I cleared the cache between each run and tested 10 times 😅

xgui3783 · 2023-05-11T12:01:21Z

hmm did you restart your interpreter between each run?

yea, I don't know what to say. The error is still occurring here.

could it be OS specific?

AhmetNSimsek · 2023-05-11T12:39:30Z

Yes, I have. I tried again now and it always gives me 6203. I tried with the current main (051e7ec) and it then sometimes prints 757, but for both.

It could be OS-specific indeed.

xgui3783 · 2023-05-11T13:15:14Z

Yes, I have. I tried again now and it always gives me 6203. I tried with the current main (051e7ec) and it then sometimes prints 757, but for both.

It could be OS-specific indeed.

I suspect this is because the cache was not cleared.

I was not super clear on how to reproduce the issue when I first posted it. I have since updated it.

Can you try the following code:

import siibra
import numpy as np
import requests_mock

siibra.cache.clear()

url = "https://neuroglancer.humanbrainproject.eu/precomputed/data-repo-ng-bot/20210714-julichbrain_v290_correctaffine/hbp-d000007_julichbrain-cytoatlas_dev/MPMs/29.0/GapMapPublicMPMAtlas_l_N10_nlin2StdICBM152asym2009c_29_publicDOI_b1bcc0b5a127f5a917f0d8b0e869be5d.nii.gz"

def main():
    region = siibra.get_region("2.9", "area 1 (postcg) left")
    rmap = region.fetch_regional_map("mni152")
    print(np.count_nonzero(rmap.get_fdata()))

def mocked():
    with requests_mock.Mocker(real_http=True) as mocker:
        mocker.register_uri(url=url, method="get", status_code=404, text="oh no")
        main()

mocked() # prints 757

AhmetNSimsek · 2023-05-11T13:25:12Z

Can you try the following code:

This produces 757

AhmetNSimsek · 2023-05-16T13:01:46Z

How about adding a base_resolution parameter in the JSON (under the provider section) for each precomputed neuroglancer image? If it is absent, the default behaviour stays the same (go to the lowest resolution). But if there is a value, then that value is used.
Since we have to compute the neuroglancer volumes anyway, this value could be output by the pipeline.
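A rough sketch of how such a lookup could behave. The `base_resolution` key is the one proposed above; the function name, the shape of the provider spec, and the list of available resolutions are all hypothetical.

```python
from typing import Any, Dict, Sequence


def pick_resolution_mm(provider_spec: Dict[str, Any], available_mm: Sequence[float]) -> float:
    """Return the voxel size (in mm) to fetch for a precomputed volume."""
    base = provider_spec.get("base_resolution")
    if base is None:
        # Current default: lowest resolution, i.e. the largest voxel size.
        return max(available_mm)
    if base not in available_mm:
        raise ValueError(f"base_resolution {base} not among {list(available_mm)}")
    return base


scales = [1.0, 2.0, 4.0]
default_choice = pick_resolution_mm({}, scales)
pinned_choice = pick_resolution_mm({"base_resolution": 1.0}, scales)
print(default_choice, pinned_choice)
```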

What do you think @xgui3783 and @dickscheid?

xgui3783 · 2023-05-23T09:21:37Z

I do not like that we are expanding the base schema so much.

in no time we will not remember why we added the attributes.

I propose the following to resolve the issue:

Rather than trying the volume sources one by one, if no spec is provided, try (and retry) only the first source in the order of preference.

This way we do not introduce more schema that we need to maintain, and we improve reproducibility.
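A minimal sketch of "try and retry the first, only". The function name, the retry count, and the use of `OSError` as the network failure type are assumptions for illustration, not what siibra does.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def fetch_first_choice(fetch: Callable[[], T], attempts: int = 3, delay_s: float = 0.0) -> T:
    """Retry a single preferred source; raise instead of switching sources."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except OSError as exc:  # ConnectionError and TimeoutError are subclasses
            last_error = exc
            if attempt < attempts:
                time.sleep(delay_s)
    raise RuntimeError(f"preferred source failed after {attempts} attempts") from last_error


calls = {"count": 0}


def flaky_nii_fetch() -> str:
    # Hypothetical source that fails twice before succeeding.
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporarily unreachable")
    return "nii volume"


result = fetch_first_choice(flaky_nii_fetch, attempts=3)
print(result, calls["count"])
```

If every attempt fails, the caller sees an error rather than a lower-resolution volume.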

AhmetNSimsek · 2023-05-23T10:27:28Z

I see what you mean. Do you think the retry window would be enough for a network issue?

xgui3783 · 2023-05-23T10:52:33Z

I see what you mean. Do you think the retry window would be enough for a network issue?

I would argue that a thrown error is better than a silent breakdown of reproducibility.

AhmetNSimsek · 2023-05-23T10:54:25Z

Ahhh you mean never actually try other sources and throw an error?

xgui3783 · 2023-05-23T11:32:12Z

Ahhh you mean never actually try other sources and throw an error?

Correct. Unless the user specifically requests format=neuroglancer/precomputed
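The selection rule agreed on here could look roughly like the following. `select_format` and its arguments are hypothetical names for illustration; this is not the code in #391.

```python
from typing import List, Optional


def select_format(available: List[str], preference: List[str], format: Optional[str] = None) -> str:
    """Pick exactly one source format; the caller never falls back to another."""
    if format is not None:
        # An explicit request is honoured or rejected, never substituted.
        if format not in available:
            raise ValueError(f"requested format {format!r} is not available")
        return format
    for fmt in preference:
        if fmt in available:
            return fmt
    raise ValueError("none of the preferred formats is available")


AVAILABLE = ["neuroglancer/precomputed", "nii"]
PREFERENCE = ["nii", "neuroglancer/precomputed"]

default_fmt = select_format(AVAILABLE, PREFERENCE)
explicit_fmt = select_format(AVAILABLE, PREFERENCE, format="neuroglancer/precomputed")
print(default_fmt, explicit_fmt)
```

With this rule, a failed fetch of the selected format surfaces as an error, and the lower-resolution source is only used when it was asked for.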

AhmetNSimsek · 2023-05-23T11:33:57Z

Okay now I get what you mean. Makes sense. I'll make the change

AhmetNSimsek · 2023-06-01T13:55:37Z

#391 provides a solution.
TODO:

add a unit test (EDIT: carried over to Add a unit test for fetching volumes with multiple sources can occassionally result in undeterministic behaviour #428)
add an e2e test (test: add e2e test to ensure labelled/stat map is of expected size #367)

xgui3783 added the bug Something isn't working label Mar 30, 2023

This was referenced Apr 19, 2023

Continuous map does not have the same shape as the template FZJ-INM1-BDA/siibra-api#122

Closed

Predifine the order in which a volume source is taken #339

Closed

AhmetNSimsek added the important label Apr 19, 2023

xgui3783 mentioned this issue May 3, 2023

[bug] labelled map and/or mask and/or statistic map inconsistent result #368

Closed

AhmetNSimsek closed this as completed May 11, 2023

AhmetNSimsek reopened this May 11, 2023

AhmetNSimsek mentioned this issue May 31, 2023

If no format is given, only try to fetch single format else fail #391

Merged

AhmetNSimsek mentioned this issue Aug 3, 2023

Add a unit test for fetching volumes with multiple sources can occassionally result in undeterministic behaviour #428

Open

AhmetNSimsek closed this as completed Aug 3, 2023
