Migrate Pangeo Forge API catalog into LEAP Data Catalog #74

Closed
wants to merge 21 commits into from

cisaacstern
Contributor

I'm winding down the Pangeo Forge hosted catalog as part of the project's refocus towards ETL tooling and supporting users (including LEAP!) to host their own catalogs, xref:

As part of that, we don't want the existing data in the catalog to be forgotten, so hoping LEAP will be able to host references to it (the data itself will continue to live on OSN).

So far, this PR is just a script I'm working on to automatically extract the Pangeo Forge catalog data into the format required by the LEAP catalog. I will delete this script before the PR is complete. Opening as a draft for now.
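
For illustration only (this is not the script from this PR): a minimal sketch of the kind of mapping such an export might perform, turning one catalog record into a `catalog/datasets/<name>.yaml` path plus an entry dict. The input keys (`title`, `url`, `feedstock`), the output field names, and the helper names are all hypothetical; the real Pangeo Forge API response and the LEAP catalog schema are not shown in this thread and may differ.

```python
import re


def slugify(title: str) -> str:
    """Lowercase the title and collapse non-alphanumeric runs into single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def to_catalog_entry(record: dict) -> tuple[str, dict]:
    """Map one hypothetical source record to (target file path, entry dict).

    All field names here are assumptions made for the example, not the
    actual LEAP catalog schema.
    """
    entry = {
        "title": record["title"],
        # Keep a pointer back to the feedstock repo that built the data.
        "feedstock": record.get("feedstock", ""),
        # The data itself stays where it is; only a reference is recorded.
        "stores": [{"url": record["url"]}],
    }
    path = f"catalog/datasets/{slugify(record['title'])}.yaml"
    return path, entry


if __name__ == "__main__":
    example = {
        "title": "CMIP6 PMIP",
        "url": "https://example.org/store.zarr",  # placeholder URL
        "feedstock": "https://example.org/feedstock",  # placeholder URL
    }
    path, entry = to_catalog_entry(example)
    print(path)
    print(entry)
```

Serializing the entry dict to YAML is left out so the sketch stays standard-library only.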

@cisaacstern cisaacstern marked this pull request as ready for review December 6, 2023 17:54
@cisaacstern
Contributor Author

I've exported all of the entries from https://pangeo-forge.org/catalog into the catalog here, with the exception of HadISST, which was already here!

@jbusecke, do we want to put the following cmip6 entries somewhere else:

  • catalog/datasets/cmip6-pmip.yaml
  • catalog/datasets/cmip6-static-ocean-grids.yaml
  • catalog/datasets/cmip6.yaml

?

@cisaacstern
Contributor Author

@andersy005, welcome your review, though I do think I've got the technical side worked out based on the very clear README! 🙏 Mostly tagged you for visibility, since you've worked on both the Pangeo Forge catalog and this one, of course.

Collaborator

@andersy005 andersy005 left a comment

This looks 💯 solid to me. I'm really glad that the LEAP catalog will become a new home for the existing datasets from Pangeo Forge.

@jbusecke
Contributor

Sorry I just saw this late (today is going to be purely catch up on github hahaha). This is a fantastic idea in general and I support it 100%.

I think it might be useful to chat through this, and see how we can streamline this. I am in particular curious if anything here is affected by a likely refactor in data-management.

@cisaacstern
Contributor Author

@jbusecke sorry for the delayed response.

I don't think this would be affected by the refactor because the data added here do not have corresponding feedstocks/recipes in this repo.

This is purely a migration of data built previously in Pangeo Forge, all of which have feedstock repos linked in the catalog entries added by this PR.

So I think we can merge unless you have other questions?

@cisaacstern
Contributor Author

I think the one thing we may want to exclude here is possibly the CMIP things?

@jbusecke
Contributor

I think that is the right intuition. My suggestion is:
For each of the CMIP-based recipes, open a request issue in the new feedstock. That way we can make sure these datasets are eventually ingested (but in a consistent manner).
I would like to have a link to CMIP data in the catalog eventually, but we can discuss that separately.

@andersy005 andersy005 temporarily deployed to export-pangeo-forge-api March 9, 2024 16:09 — with GitHub Actions Inactive
@andersy005 andersy005 temporarily deployed to export-pangeo-forge-api March 11, 2024 22:42 — with GitHub Actions Inactive
@andersy005 andersy005 temporarily deployed to export-pangeo-forge-api March 11, 2024 22:46 — with GitHub Actions Inactive
@andersy005 andersy005 temporarily deployed to export-pangeo-forge-api March 11, 2024 23:14 — with GitHub Actions Inactive
@andersy005 andersy005 temporarily deployed to export-pangeo-forge-api March 11, 2024 23:18 — with GitHub Actions Inactive

3 participants