Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

transform dataset coordinates into specified crs #97

Merged
merged 9 commits into from
Dec 21, 2023

Conversation

boothmanrylan
Copy link
Contributor

Fixes #96

setup.py Outdated Show resolved Hide resolved
@boothmanrylan
Copy link
Contributor Author

I believe I fixed the linting issues, but the tests were failing on the Authenticate to Google Cloud step with:

Error: google-github-actions/auth failed with: retry function failed after 4 attempts: the GitHub Action workflow must specify exactly one of "workload_identity_provider" or "credentials_json"! If you are specifying input values via GitHub secrets, ensure the secret is being injected into the environment. By default, secrets are not passed to workflows triggered from forks, including Dependabot.

And I don't know how to fix that, I'm not very familiar with github actions or secrets, so any suggestions would be appreciated.

@naschmitz naschmitz self-requested a review November 16, 2023 02:35
@naschmitz
Copy link
Collaborator

@boothmanrylan Could you sync and resolve the conflicts? Then, I should be able to approve your CL and run the integration tests.

xee/ext.py Outdated
lnglat_img,
grid=lon_grid,
dtype=np.float32,
bandIds=['longitude', 'latitude'],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason you're including latitude in here?

Same with lat = self.image_to_array( (except reversed).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, pyproj.Transformer.transform requires both the x and y coordinates be given.

If my understanding of lon_grid and lat_grid is correct, then the original calls to self.image_to_array return the longitude of the topmost and latitude of the leftmost edges of the bounds respectively. If we give those as the x and y inputs to the transformer, what we get back is the transformed coordinates of the diagonal of the bounds which differs from transforming the coordinates of the leftmost and topmost edges separately which is what I'm doing and why latitude is included in the image_to_array call to get the longitude and vice versa.

Looking at this again, I think that using multidimensional coordinates may be a better solution because the difference between the two methods occurs because the x coordinates vary along the y axis and the y coordinates vary along the x axis...

Open to input on that though.

Here is a notebook that shows the difference

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went back and looked at this again and I believe the way I'm doing it is correct.

When I open a dataset -> grab the first image -> write to a geotiff using rioxarray -> upload back to earthengine -> compare against the original first image in the collection I get an exact match. See this notebook

Additionally, extracting the top edge and left edge without grabbing both the longitude and latitude and transforming those together will fail if the region is not perfectly square because pyproj.Transformer.transform needs the x inputs and y inputs to have the same length.

@@ -22,6 +22,10 @@
import numpy as np
import xarray as xr
from xarray.core import indexing
import os
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple import suggestions:

  • os has already been imported. Please remove.
  • tempfile is a built-in Python module. Please move it up below pathlib.
  • rioxarray is unused. Please remove.
  • rasterio is not defined as a test dependency. Please include it in project.optional-dependencies tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do for os tempfile and rasterio

However, from the rioxarray docs:

"rioxarray extends xarray with the rio accessor. The rio accessor is activated by importing rioxarray like so: import rioxarray"

so without that import, ds.rio.to_raster(...) etc. fail with:

AttributeError: 'Dataset' object has no attribute 'rio'

I can add rioxarray as a test dependency along with rasterio though.

@boothmanrylan
Copy link
Contributor Author

@naschmitz I believe I have addressed your requested changes, please let me know if you need more information or if there are other changes I need to make.

xee/ext.py Outdated
@@ -555,11 +559,19 @@ def get_variables(self) -> utils.Frozen[str, xarray.Variable]:
lon_grid = self.project((0, 0, v0.shape[1], 1))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @boothmanrylan we figured out that the issue is with pixelLonLat() actually as it gives the coordinates always into the degree so instead of that if you can use this lnglat_img = ee.Image.pixelCoordinates(ee.Projection(self.crs_arg)) then it gives the coordinates into given crs format.

Also change the bandIds from longitude to x into at line 558.
-> Same with latitude to y.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

dabhicusp

This comment was marked as resolved.

Copy link
Collaborator

@dabhicusp dabhicusp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @boothmanrylan I think you have too change here, too.

@boothmanrylan
Copy link
Contributor Author

Hi @dabhicusp I can switch that pixelLonLat to pixelCoordinates, but I'm not sure it is necessary. As far as I can tell that pixelLonLat is just being used to create an image collection with known values to be used in tests e.g. getting the first element of the array and expecting the value to be -179.5

Also, all of the tests that use it are passing for me locally with my changes.

@mahrsee1997 mahrsee1997 self-requested a review December 20, 2023 15:34
@copybara-service copybara-service bot merged commit 7bb1407 into google:main Dec 21, 2023
5 of 10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Opening a dataset with a projected CRS does not convert the dataset's coordinates
5 participants