Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Best Practices] Add check for TimeSeries unit follows CMIXF-12 convention #281

Draft
wants to merge 15 commits into
base: dev
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
83 changes: 82 additions & 1 deletion docs/best_practices/time_series.rst
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,88 @@ measurement for that data, using the appropriate type from the

Check function: :py:meth:`~nwbinspector.checks.time_series.check_missing_unit`


For **formatting** of SI units, the `CMIXF-12 <https://people.csail.mit.edu/jaffer/MIXF/CMIXF-12>`_
convention is recommended to minimize the variability of representation of unit values.

If the unit of measurement is unknown or unavailable, the unit can be declared as "a.u."
which stands for arbitrary units.
Please note the appropriate upper- or lower- casing when using CMIXF-12.

The `cmixf <https://github.com/sensein/cmixf>`_ Python package is used to check whether
the formatting of unit is compliant.

Examples for CMIXF-12 representation of units:

.. list-table::
:widths: 25 25 50
:align: center
:header-rows: 1

* - Neurodata type
- Unit in CMIXF-12
- Unit Name
* - :py:class:`pynwb.behavior.Position`
- "m"
- Meters (e.g. position)
* - :py:class:`~pynwb.behavior.PupilTracking`
- "m"
- Meters (pupil size)
* - :py:class:`~pynwb.behavior.SpatialSeries`
- "m/s"
- Meters per second (speed, velocity)
* - :py:class:`~pynwb.behavior.SpatialSeries`
- "m/s^2"
- Meters per second squared (acceleration)
* - :py:class:`pynwb.behavior.CompassDirection`
- "o" or "rad"
- Degrees or Radians (view angle)
* - :py:class:`pynwb.behavior.EyeTracking`
- "o"
- Degrees (gaze directions)
* - :py:class:`~pynwb.ecephys.ElectricalSeries`
- "V"
- Volts
* - :py:class:`~pynwb.ecephys.SpikeEventSeries`
- "V"
- Volts
* - :py:class:`~pynwb.icephys.PatchClampSeries`
- "V"
- Volts
* - :py:class:`~pynwb.icephys.CurrentClampSeries`
- "V"
- Volts
* - :py:class:`~pynwb.icephys.IZeroClampSeries`
- "V"
- Volts
* - :py:class:`~pynwb.icephys.CurrentClampStimulusSeries`
- "A"
- Amperes
* - :py:class:`~pynwb.icephys.VoltageClampSeries`
- "A"
- Amperes
* - :py:class:`~pynwb.icephys.VoltageClampStimulusSeries`
- "V"
- Volts
* - :py:class:`~pynwb.image.ImageSeries`
- "a.u."
- Arbitrary Units (unknown)
* - :py:class:`~pynwb.image.IndexSeries`
- "a.u."
- Arbitrary Units (unknown)
* - :py:class:`~pynwb.image.ImageMaskSeries`
- "a.u."
- Arbitrary Units (unknown)
* - :py:class:`~pynwb.image.OpticalSeries`
- "a.u."
- Arbitrary Units (unknown)
* - :py:class:`~pynwb.retinotopy.AxisMap`
- "o"
- Degrees
Comment on lines +114 to +116
Copy link
Contributor Author

@weiglszonja weiglszonja Oct 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not TimeSeries but I thought could be good to include for example for representing degrees?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CompassDirection is more typical for that as an example (be sure to include radians as well)

* - :py:class:`~pynwb.ogen.OptogeneticSeries`
- "W"
- Watts

Check function: :py:meth:`~nwbinspector.checks.time_series.check_unit_formatting`

.. _best_practice_time_series_global_time_reference:

Expand Down
21 changes: 21 additions & 0 deletions nwbinspector/checks/time_series.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@
import numpy as np

from pynwb import TimeSeries
from cmixf.parser import CMIXFLexer

# from ..tools import all_of_type
from ..register_checks import register_check, Importance, Severity, InspectorMessage
Expand Down Expand Up @@ -92,3 +93,23 @@ def check_resolution(time_series: TimeSeries):
return InspectorMessage(
message=f"'resolution' should use -1.0 or NaN for unknown instead of {time_series.resolution}."
)


@register_check(importance=Importance.BEST_PRACTICE_VIOLATION, neurodata_type=TimeSeries)
def check_unit_formatting(time_series: TimeSeries):
"""
Check the unit value of a TimeSeries that it complies with CMIXF-12 convention for formatting the units.

Best Practice: :ref:`best_practice_unit_of_measurement`
"""

# Early return for arbitrary units that are unknown or unavailable.
if time_series.unit == "a.u.":
return

lexer = CMIXFLexer()
tokens = lexer.tokenize(time_series.unit)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lexer.tokenize() a generator object, when a bad character is found it immediately raises an error. e.g.

tokens = lexer.tokenize("ta")
next(tokens) 

would return the recognized token value for unit t
Token(type='UNITC', value='t', lineno=1, index=0)

next(tokens)
would return:
{ValueError}Line 1: Bad character 'a'

try:
list(tokens)
except ValueError:
return InspectorMessage(message=f"The 'unit' should adhere to CMIXF-12 format instead of '{time_series.unit}'.")
1 change: 1 addition & 0 deletions requirements.txt
Original file line number Diff line number Diff line change
Expand Up @@ -7,3 +7,4 @@ click
tqdm
numpy>=1.20.0,<1.21.0;python_version<'3.8'
numpy>=1.22.0,<1.23.0;python_version>='3.8' # <1.23 bound is from PyNWB
cmixf
48 changes: 48 additions & 0 deletions tests/unit_tests/test_time_series.py
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@
check_timestamps_ascending,
check_missing_unit,
check_resolution,
check_unit_formatting,
)
from nwbinspector.testing import check_streaming_tests_enabled
from nwbinspector.utils import get_package_version, robust_s3_read
Expand Down Expand Up @@ -202,3 +203,50 @@ def test_check_resolution_fail():
object_name="test",
location="/",
)


def test_check_data_unit_fail():
time_series = pynwb.TimeSeries(
name="test",
unit="test_unit",
data=[1, 2, 3],
timestamps=[1, 2, 3],
)
assert check_unit_formatting(time_series) == InspectorMessage(
message="The 'unit' should adhere to CMIXF-12 format instead of 'test_unit'.",
importance=Importance.BEST_PRACTICE_VIOLATION,
check_function_name="check_unit_formatting",
object_type="TimeSeries",
object_name="test",
location="/",
)


def test_check_data_unit_pass():
time_series = pynwb.TimeSeries(
name="test",
unit="m",
data=[1, 2, 3],
timestamps=[1, 2, 3],
)
assert check_unit_formatting(time_series) is None


def test_check_data_unit_with_derived_unit():
time_series = pynwb.TimeSeries(
name="test",
unit="m/s",
data=[1, 2, 3],
timestamps=[1, 2, 3],
)
assert check_unit_formatting(time_series) is None


def test_check_data_unit_with_arbitrary_unit():
time_series = pynwb.TimeSeries(
name="test",
unit="a.u.",
data=[1, 2, 3],
timestamps=[1, 2, 3],
)
assert check_unit_formatting(time_series) is None