Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Embargo compatability #112

Closed
CodyCBakerPhD opened this issue Sep 25, 2023 · 17 comments
Closed

Embargo compatability #112

CodyCBakerPhD opened this issue Sep 25, 2023 · 17 comments

Comments

@CodyCBakerPhD
Copy link
Contributor

It's possible to attach DANDI credentials with curl API calls to DANDI to grant permissions to access embargo'd contents; I think in theory it ought to work here as well

So I'd like to request the ability (or instructions if already an ability) to run Neurosift on an embargoed DANDI set

@magland
Copy link
Collaborator

magland commented Sep 27, 2023

Thanks @CodyCBakerPhD

I'll work on this soon - we will definitely need to support embargoed datasets.

@magland
Copy link
Collaborator

magland commented Sep 27, 2023

Here's an example of a view of an nwb asset from an embargoed dandiset
https://flatironinstitute.github.io/neurosift/?p=/nwb&url=https://dandi-api-staging-embargo-dandisets.s3.amazonaws.com/210194/blobs/945/44c/94544c84-0a89-4b16-9cad-b72af6d67552

As expected, it doesn't load, because access is denied to the s3 bucket. I don't expect to be able to easily get that to download directly. But maybe there's hope to get it from this URL:

https://api-staging.dandiarchive.org/api/assets/203750ef-018f-45a4-b8cb-f990b09e20ba/download/

When I curl that, I get {"detail":"Authentication credentials were not provided."}

How can I provide my dandi key? Do I use the header?

@CodyCBakerPhD @bendichter

@bendichter
Copy link
Contributor

@magland
Copy link
Collaborator

magland commented Sep 27, 2023

Looking at the dandi API code, you might try this:

https://github.com/dandi/dandi-cli/blob/697af7c43c407ca8ccaa815cecd844d810174cbb/tools/update-assets-on-server#L120

That sort of did the trick in that I no longer get the Authentication credentials were not provided error. However, the download is empty (zero bytes).

@magland
Copy link
Collaborator

magland commented Sep 27, 2023

Oh, I needed to use -L to follow redirects. That seems to be working.

@magland
Copy link
Collaborator

magland commented Sep 27, 2023

@bendichter @CodyCBakerPhD

I DANDI-shared the following embargoed dataset with you
https://gui-staging.dandiarchive.org/dandiset/210194

Here is a neurosift link to one of the nwb assets
https://flatironinstitute.github.io/neurosift?p=/nwb&url=https://api-staging.dandiarchive.org/api/assets/203750ef-018f-45a4-b8cb-f990b09e20ba/download/

...which will not load UNTIL you have provided neurosift with your DANDI "Staging" API Key, which you can now do in the GUI. See the key icon in the top menu.

image

image

After you paste in your keys, you can reload the page.

FYI, the keys are stored in local browser storage. It's okay for now, but maybe we'll want to think of a better way.

Note: we'll probably want to modify the dandi neurosift plugin to use the dandiarchive api url rather than the s3 bucket url.

@CodyCBakerPhD
Copy link
Contributor Author

...which will not load UNTIL you have provided neurosift with your DANDI "Staging" API Key, which you can now do in the GUI. See the key icon in the top menu.

For the example you have shared, this works - but was there something specific you had to do for this particular embargoed DANDI set?

I ask because I tried the same approach on another embargoed dataset and it did not work on that (gets stuck on loading and at the bottom says unable to connect to the rtcshare

FYI, the keys are stored in local browser storage. It's okay for now, but maybe we'll want to think of a better way.

We've been dealing with this on the NWB GUIDE as well, check out NeurodataWithoutBorders/nwb-guide#389

@magland
Copy link
Collaborator

magland commented Sep 27, 2023

For the example you have shared, this works - but was there something specific you had to do for this particular embargoed DANDI set?

I ask because I tried the same approach on another embargoed dataset and it did not work on that (gets stuck on loading and at the bottom says unable to connect to the rtcshare

It won't work directly from neurosift the menu option in DANDI. That's because the s3 bucket url won't work with the authentication. You need to feed it the DANDI api asset url (the one that ends with /download/). I think we'll want to have that changed in dandiarchive.

@CodyCBakerPhD
Copy link
Contributor Author

I think we'll want to have that changed in dandiarchive.

I see now - the DANDI menu option gives a URL that points to the S3 directly instead of through the DANDI API

Would you be able to make that change on the DANDI side? Would you like me to make a separate issue for that specifically?

@magland
Copy link
Collaborator

magland commented Sep 27, 2023

I see now - the DANDI menu option gives a URL that points to the S3 directly instead of through the DANDI API

Would you be able to make that change on the DANDI side? Would you like me to make a separate issue for that specifically?

I'll create a DANDI PR for that.

@magland
Copy link
Collaborator

magland commented Sep 27, 2023

Created PR dandi/dandi-archive#1690

@CodyCBakerPhD
Copy link
Contributor Author

Thank you! I'll try it again once that gets merged

@waxlamp
Copy link

waxlamp commented Sep 29, 2023

I wanted to offer an alternative solution strategy, a variant of using the -L option with curl. When curl (or any client) follows redirects, what it's doing is receiving a 300-level response, plucking out the Location header that contains the redirect URL, and making a subsequent request to that URL. If all you need is the presigned (i.e., duly authorized) S3 URL, you can make the request to the DANDI API asset download endpoint and pull the S3 URL from the response, and then make use of that wherever an S3 URL is needed.

In that case, there's one extra step for the client to perform to get that URL, and then everything after that point should "just work". But I may not be fully understanding the problem here--let me know if I'm missing anything.

@magland
Copy link
Collaborator

magland commented Sep 29, 2023

@waxlamp thanks.

Yes I believe that's what I'm doing here, although I might be misunderstanding what you are suggesting:

const urlResolved = await getResolvedUrl(url || defaultUrl)

const getResolvedUrl = async (url: string) => {
if (isDandiAssetUrl(url)) {
const authorizationHeader = getAuthorizationHeaderForUrl(url)
const headers = authorizationHeader ? {Authorization: authorizationHeader} : undefined
const redirectUrl = await getRedirectUrl(url, headers)
if (redirectUrl) {
return redirectUrl
}
}
return url
}

const getRedirectUrl = async (url: string, headers: any) => {
// This is tricky. Normally we would do a HEAD request with a redirect: 'manual' option.
// and then look at the Location response header.
// However, we run into mysterious cors problems
// So instead, we do a HEAD request with no redirect option, and then look at the response.url
const response = await headRequest(url, headers)
if (response.url) return response.url
// if (response.type === 'opaqueredirect' || (response.status >= 300 && response.status < 400)) {
// return response.headers.get('Location')
// }
return null // No redirect
}

I do a head request, get the new s3 signed URL and then use that from that point forward.

@waxlamp
Copy link

waxlamp commented Sep 29, 2023

Ah, no, you are doing exactly what I suggested.

If I have it right, the remaining problem is that outgoing links from DANDI Archive don't supply the right URL for embargoed assets, and it doesn't make sense in those situations for you to somehow modify that URL into one that works, hence the need for changes in the archive codebase.

I will continue this discussion on your dandi-archive PR--I think I have a simpler solution that introduces less technical debt to that codebase.

Thanks for indulging my questions here.

@CodyCBakerPhD
Copy link
Contributor Author

I believe this has been resolved now?

@magland
Copy link
Collaborator

magland commented Feb 8, 2024

Yes thanks @CodyCBakerPhD . Closing

@magland magland closed this as completed Feb 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants