Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Add API tests for metadata handling #107

Open
wants to merge 3 commits into
base: develop
Choose a base branch
from
Open
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
91 changes: 91 additions & 0 deletions tests/metadata/test_metadata_api.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,91 @@
#!/usr/bin/env python3

# (C) Copyright 2020 ECMWF.
#
# This software is licensed under the terms of the Apache Licence Version 2.0
# which can be obtained at http://www.apache.org/licenses/LICENSE-2.0.
# In applying this licence, ECMWF does not waive the privileges and immunities
# granted to it by virtue of its status as an intergovernmental organisation
# nor does it submit to any jurisdiction.
#

import pytest

from earthkit.data import from_source
from earthkit.data.testing import earthkit_examples_file

try:
from earthkit.data.metadata import Metadata
except ImportError:
Metadata = None

try:
from earthkit.data.metadata import RawMetadata
except ImportError:
RawMetadata = None

try:
from earthkit.data.readers.grib.metadata import GribMetadata
except ImportError:
GribMetadata = None


def test_abstract_metadata():
# Metadata properties:
# * immutable
# * behaves like a dict
# * can be updated (creating a new object, either a copy or a view)
# Updates can be made with compatible subclasses, e.g. update a GribMetadata with a RawMetadata
assert Metadata is not None


def test_raw_metadata():
assert RawMetadata is not None
assert issubclass(RawMetadata, Metadata)

md = RawMetadata({"shortName": "2t", "perturbationNumber": 5})
assert md["shortName"] == "2t"
assert md["perturbationNumber"] == 5
assert md.get("shortName") == "2t"
with pytest.raises(KeyError):
md["nonExistentKey"]
assert md.get("nonExistentKey", 12) == 12
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What should happen for:

md.get("nonExistentKey")

Should we raise KeyError or return None as a dict would do?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would return None, to match dict and the behaviour of the default value when given

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @oiffrig! There is another question related to the type cast. On a Field we can call metadata() like this to specify the value type we want:

f.metadata("centre", astype=str)
f.metadata("centre", astype=int)
f.metadata("centre:s")
f.metadata("centre:str")
f.metadata("centre:l")
f.metadata("centre:int")

How would the Metadata object support these? The last 4 calls are of course supported for GRIB by [] and get().

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not totally sure. What I would suggest is to have that as an overridable behaviour: the metadata object knows a few types and can convert from them (raising an error if the type is not supported), and specific implementations (e.g. GRIB) can extend the default behaviour by using functionalities of the underlying backend

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Would that mean that get should work like that:

md.get("centre", astype=int)


md2 = md.update({"centre": "ecmf", "perturbationNumber": 8})
sentinel = object()
assert md.get("centre", sentinel) is sentinel
assert md["perturbationNumber"] == 5
assert md2["shortName"] == "2t"
assert md2["perturbationNumber"] == 8
assert md2["centre"] == "ecmf"

md3 = RawMetadata({"step": 24, "centre": "unknown"}).update(md2)
assert md3["step"] == 24
assert md3["centre"] == "ecmf"
assert md3["shortName"] == "2t"


def test_grib_metadata():
assert GribMetadata is not None
assert issubclass(GribMetadata, Metadata)

with pytest.raises(TypeError):
GribMetadata({"shortName": "u", "typeOfLevel": "pl", "levelist": 1000})

f = from_source("file", earthkit_examples_file("test_single.grib"))
md = f[0].metadata()

assert isinstance(md, GribMetadata)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you suggest only to change the behaviour of GribField.metadata() when called without arguments, so, the rest of the functionality would be kept?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that was the idea. We keep the current API, but change the underlying metadata object, and we return it if the user is not asking for specific keys/namespaces


assert str(md["dataDate"]) == "20170427"
assert md["shortName"] == "2t"

extra = RawMetadata({"shortName": "2ta"})
with pytest.raises(TypeError):
extra.update(md)

updated = md.update(extra)
assert md["shortName"] == "2t"
assert updated["shortName"] == "2ta"
# When we write the GribMetadata, the GRIB keys should be set in the correct
# order to produce the expected result