Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor filenames handling in Field.from_netcdf #1787

Open
wants to merge 15 commits into
base: main
Choose a base branch
from
Open

Conversation

VeckoTheGecko
Copy link
Contributor

@VeckoTheGecko VeckoTheGecko commented Dec 4, 2024

Changes:

  • Sanitize the filenames file object passed to Field.from_netcdf making clear the data format (results in Field.from_netcdf having support for Path objects contributing to Codebase wide Path object support #1706)
  • Separate field tests into separate file test_field.py
  • Ensuring filenames are converted to strings for downstream processing

@VeckoTheGecko VeckoTheGecko marked this pull request as ready for review December 5, 2024 15:21
Comment on lines +574 to 610
data_filenames = _get_dim_filenames(filenames, "data")
lonlat_filename_lst = _get_dim_filenames(filenames, "lon")
if isinstance(filenames, dict):
assert len(lonlat_filename) == 1
if lonlat_filename != cls._get_dim_filenames(filenames, "lat"):
assert len(lonlat_filename_lst) == 1
if lonlat_filename_lst != _get_dim_filenames(filenames, "lat"):
raise NotImplementedError(
"longitude and latitude dimensions are currently processed together from one single file"
)
lonlat_filename = lonlat_filename[0]
lonlat_filename = lonlat_filename_lst[0]
if "depth" in dimensions:
depth_filename = cls._get_dim_filenames(filenames, "depth")
if isinstance(filenames, dict) and len(depth_filename) != 1:
depth_filename_lst = _get_dim_filenames(filenames, "depth")
if isinstance(filenames, dict) and len(depth_filename_lst) != 1:
raise NotImplementedError("Vertically adaptive meshes not implemented for from_netcdf()")
depth_filename = depth_filename[0]
depth_filename = depth_filename_lst[0]

netcdf_engine = kwargs.pop("netcdf_engine", "netcdf4")
gridindexingtype = kwargs.get("gridindexingtype", "nemo")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the *_lst additions are just to make mypy happy



def _get_dim_filenames(filenames: T_SanitizedFilenames, dim: T_Dimensions) -> list[str]:
"""Get's the relevant filenames for a given dimension."""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""Get's the relevant filenames for a given dimension."""
"""Get the relevant filenames for a given dimension."""

Comment on lines +2601 to +2602
1. A sorted list of strings with the expanded glob expression
2. A sorted list of strings with the expanded glob expressions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the difference between 1 and 2? Only whether there are multiple glob expressions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 and 2 are the same. Just that in 1 the input was outside of a list. Perhaps it would be easier to illustrate these with examples actually rather than explain it here.

files = []
for f in filenames:
files.extend(_expand_filename(f))
return sorted(files)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So does it mean we will always sort a list? What if users provide a non-sorted list by intention?

1. A sorted list of strings with the expanded glob expression
2. A sorted list of strings with the expanded glob expressions
3. A dictionary with same keys but values as in (1) or (2).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also mention what the recursed flag does?


See tests for examples.
"""
allowed_dimension_keys = ("lon", "lat", "depth", "data")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should time not also be in this list?

tests/test_advection.py Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Backlog
Development

Successfully merging this pull request may close these issues.

2 participants