Allow users to browse s3 buckets in JupyterLab interface #11

Closed
batpad opened this issue Mar 21, 2024 · 18 comments

Comments

@batpad
Collaborator

batpad commented Mar 21, 2024

It would be useful to have a browser similar to the local file browser in JupyterLab for s3 buckets that the users on the Hub have access to.

From speaking to @yuvipanda, this feature might already exist and might just need some configuration to set up. @yuvipanda, whenever possible, if you could point me in the right direction, I can fill out this issue in more detail - thanks!

cc @wildintellect @abarciauskas-bgse

@wildintellect

I think it was suggested we try https://github.com/IBM/jupyterlab-s3-browser <- is this maintained?

@yuvipanda
Collaborator

That one isn't compatible with JupyterLab 4. https://github.com/Navteca/jupyterlab-bxplorer seems to be, and perhaps is worth trying. Needs an image with it in there to be tried out.

@j08lue
Collaborator

j08lue commented Apr 2, 2024

Bucket explorer could be helpful to some users, so if there are easy and sustainable pathways, we should consider adding such a feature.

However, we do want to promote workflows where users share and discover assets via (user-group-specific or public) STAC. File names can only hold so much information and there are no standard conventions, so discovery of assets works a lot better (and more interoperably) via STAC.

(NB we are connected to Navteca via the NASA SMCE, which they support, and have earlier on in the project been talking to them about developing solutions that VEDA needs. It should be easy to open up those direct channels again, if need be.)

@wildintellect

@j08lue this feature request is not for sharing, it's for users to find their own files since VEDA does not auto-mount buckets like MAAP. ➕ we want open sharing to be done via STAC.

@wildintellect

See also MAAP-Project/Community#768

@yuvipanda
Collaborator

@minrk, the primary creator of JupyterHub, is also exploring this for another project - destination-earth/DestinE_ESA_GFTS#13 has more information. It uses https://github.com/jpmorganchase/jupyter-fs and seems to look good?

image

@batpad
Collaborator Author

batpad commented Jul 11, 2024

This is probably a requirement for the MAAP migration - #43

Next steps here:

@sunu let's add this to our agenda to chat about when we meet next.

cc @yuvipanda

@wildintellect

@batpad can we get one of these options deployed at least in staging before Aug 14 (demo to JPL/HQ of VEDA Hub as a future of MAAP)?

cc: @jsignell @yuvipanda @maxrjones

@batpad
Collaborator Author

batpad commented Jul 31, 2024

I created an image to test with the bxplorer extension installed.

I see the extension show up in the left panel and it shows me s3 buckets, lets me browse them, etc.

@wildintellect would you be able to help test and we can figure out if this works / what else we might need?

Steps to test:

  • Select the Bring Your Own Image option
  • Use this image: public.ecr.aws/nasa-veda/pangeo-notebook-veda-image:683db507c7a5
  • You should see the bxplorer icon in the left panel - clicking on it should show an s3 browser in the left pane

I didn't spend a lot of time on this, so there are probably things we can configure, etc. - this is pretty much the default install currently.

(this is the branch / PR for the image building, in case anyone wants to try any changes to config etc: NASA-IMPACT/pangeo-notebook-veda-image#18 )

cc @sunu

@yuvipanda
Collaborator

I just tried it out and it works. yay.

Double-clicking a file explored this way doesn't do anything, though. I'd expect it to open in the JupyterLab browser, but I suppose that's not a universal expectation. Still, I'd like it to at least do something. Downloading is a bit clunky, but that's probably OK too. I'd love for it to have a 'copy URI' option so we can use it in code. All of these can be contributed upstream, of course.

But I'd love to hear from actual MAAP users whether this is good enough :)

@yuvipanda
Collaborator

@batpad I'd also love for you to take a quick look at the code (https://github.com/Navteca/jupyterlab-bxplorer) to see if it feels like something we can contribute to.

@wildintellect

Noting that Navteca (Ramon) is the main support behind SMCE.

I gave bxplorer a try:

  1. Adding a bucket is done on the Favorites page. The bigger issue is that permissions on such a bucket have to be pretty open: I tried maap-ops-workspace and nasa-maap-data-store and neither worked, though technically the role the hub uses should be able to see some of the files in those buckets. Does anyone know another bucket to test with? I'm not sure it's using our role; lp-prod-protected also didn't work.
  2. Right click gets you the Download option. We don't want that; we want copy URI for inserting into code. I'm glad, in a way, that it doesn't open files; that could be a painful way to lock up the system.
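
To make the "copy URI" request concrete, here is a minimal sketch of what such an action might hand to user code and how a notebook could take it apart again. The function names and the example key are hypothetical; nothing here is part of bxplorer's actual API.

```python
def make_s3_uri(bucket: str, key: str) -> str:
    """Build the s3:// URI that a hypothetical 'copy URI' action could put on the clipboard."""
    return f"s3://{bucket}/{key.lstrip('/')}"


def split_s3_uri(uri: str) -> tuple[str, str]:
    """Split an s3:// URI back into (bucket, key) for use with an S3 client."""
    prefix = "s3://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an S3 URI: {uri!r}")
    # Everything up to the first "/" is the bucket; the rest is the object key.
    bucket, _, key = uri[len(prefix):].partition("/")
    if not bucket:
        raise ValueError(f"missing bucket name in: {uri!r}")
    return bucket, key


# Hypothetical key inside a bucket mentioned in this thread.
uri = make_s3_uri("maap-ops-workspace", "shared/demo/out.tif")
bucket, key = split_s3_uri(uri)
```

A string like this can be pasted straight into libraries that accept `s3://` paths, which is why a URI is more useful here than a download.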

@j08lue
Collaborator

j08lue commented Aug 1, 2024

I think it is fine that publicly catalogued buckets do not support listing (only reading).
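
For context on "reading but not listing": in IAM these are separate permissions, with `s3:GetObject` granted on object ARNs and `s3:ListBucket` granted on the bucket ARN itself. The sketch below builds an illustrative policy document to show the difference. The bucket name is a placeholder and this is not the policy VEDA or MAAP actually uses.

```python
import json


def read_only_policy(bucket: str, allow_listing: bool = False) -> dict:
    """Return an illustrative IAM policy: always read objects, optionally list the bucket."""
    statements = [
        {
            "Sid": "ReadObjects",
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            # Object-level permission: note the trailing /*.
            "Resource": [f"arn:aws:s3:::{bucket}/*"],
        }
    ]
    if allow_listing:
        statements.append(
            {
                "Sid": "ListBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                # Bucket-level permission: the bucket ARN without /*.
                "Resource": [f"arn:aws:s3:::{bucket}"],
            }
        )
    return {"Version": "2012-10-17", "Statement": statements}


# Placeholder bucket name, for illustration only.
print(json.dumps(read_only_policy("example-catalogued-bucket"), indent=2))
```

With only the first statement, a user who already knows an object's key (for example from a STAC item) can read it, but a bucket browser has nothing to show because listing is denied.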

What kinds of buckets do MAAP users browse via the explorer, @wildintellect? Some kind of shared temp space?

@wildintellect

@j08lue maap-ops-workspace is the critical one right now, which includes both dps_output and shared spaces.
Longer term (later this quarter), different groups like EIS fire have their own buckets they'd want to access. It's hard to predict but other buckets come up occasionally that don't have catalog service over them. So you're right that this is for buckets that lack catalogs.
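
One way to narrow down the open question from earlier in the thread (which role the hub session is using, and whether it can list a given bucket) is a small check run from a notebook. This is a sketch under assumptions: `check_bucket_access` is a hypothetical helper, and the clients are passed in so that it works with `boto3.client("sts")` and `boto3.client("s3")` or with stand-ins.

```python
def check_bucket_access(sts_client, s3_client, bucket: str, max_keys: int = 5) -> dict:
    """Report the caller identity and whether it can list `bucket`."""
    report = {
        "bucket": bucket,
        "caller_arn": sts_client.get_caller_identity()["Arn"],
    }
    try:
        page = s3_client.list_objects_v2(Bucket=bucket, MaxKeys=max_keys)
    except Exception as exc:
        # botocore's ClientError carries a .response dict with the error code;
        # fall back to the exception class name for anything else.
        error = getattr(exc, "response", {}).get("Error", {})
        report["can_list"] = False
        report["error_code"] = error.get("Code", type(exc).__name__)
    else:
        report["can_list"] = True
        report["sample_keys"] = [obj["Key"] for obj in page.get("Contents", [])]
    return report


# Usage inside a hub session (requires boto3 and AWS credentials):
# import boto3
# print(check_bucket_access(boto3.client("sts"), boto3.client("s3"), "maap-ops-workspace"))
```

If `caller_arn` is not the expected hub role, or `error_code` comes back as `AccessDenied`, the problem is probably with credentials or bucket policy rather than with the browser extension.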

@batpad
Collaborator Author

batpad commented Aug 2, 2024

> @batpad I'd also love for you to take a quick look at the code (https://github.com/Navteca/jupyterlab-bxplorer) to see if it feels like something we can contribute to.

Gave things a quick look and it looks all great - the frontend seems standard React and easy to work with if we need to.

> Right click gets you the Download option. We don't want that; we want copy URI for inserting into code. I'm glad, in a way, that it doesn't open files; that could be a painful way to lock up the system.

Is this something we should ticket adding? The code seems reasonable enough that we should be able to submit a PR for this.

I also took a bit of a look at https://github.com/jpmorganchase/jupyter-fs - @wildintellect do you think it's worth trying out as well, or does bxplorer seem workable? The jupyter-fs extension seems like a bit more work to set up, with some required configuration, etc. I'm happy to spend some time early next week if that seems useful, @wildintellect.

@wildintellect

bxplorer seems fine for now, we can show it off in our VEDA demo, and talk about how it comes from another part of NASA ...
later we should try other options to compare (sometime this PI)

@batpad batpad closed this as completed Sep 17, 2024
@anayeaye

@wildintellect

@anayeaye interesting, but I don't think that applies to this use case, which requires Python interoperability, unless we want to write our own Jupyter extension in JS.

This might apply to some VEDA front end ideas around browsing catalog items for download.
