Spellchecking with codespell (#1576)
### What kind of change does this PR introduce?

* Adds `codespell` for checking word spelling across the codebase and documentation
* Configures `codespell` to ignore several French words and all current translation configurations
* Adds a `pre-commit` hook for performing these checks on commit
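
As a usage sketch (not part of this diff), the new check can be run by hand with the same configuration; the exact invocation below is an assumption pieced together from the Makefile and `pre-commit` changes in this PR:

```shell
# Sketch: spell-check the same paths as `make lint`, reading the
# [tool.codespell] settings from pyproject.toml (the pre-commit hook adds
# tomli so the TOML config can be read on older Python versions).
codespell --toml=pyproject.toml xclim tests docs
```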

### Does this PR introduce a breaking change?

Yes, `codespell` is now a development dependency.

### Other information:

https://github.com/codespell-project/codespell
Zeitsperre authored Jan 9, 2024
2 parents 58dc43b + 9888d95 commit e335ff4
Showing 31 changed files with 97 additions and 79 deletions.
12 changes: 10 additions & 2 deletions .pre-commit-config.yaml
@@ -40,7 +40,7 @@ repos:
hooks:
- id: isort
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.1.9
rev: v0.1.11
hooks:
- id: ruff
- repo: https://github.com/pycqa/flake8
@@ -53,22 +53,30 @@ repos:
rev: 1.7.1
hooks:
- id: nbqa-pyupgrade
additional_dependencies: [ 'pyupgrade==3.15.0' ]
args: [ '--py38-plus' ]
- id: nbqa-black
additional_dependencies: [ 'black==23.12.1' ]
- id: nbqa-isort
additional_dependencies: [ 'isort==5.13.2' ]
- repo: https://github.com/kynan/nbstripout
rev: 0.6.1
hooks:
- id: nbstripout
files: '.ipynb'
args: [ '--extra-keys', 'metadata.kernelspec' ]
args: [ '--extra-keys=metadata.kernelspec' ]
- repo: https://github.com/keewis/blackdoc
rev: v0.3.9
hooks:
- id: blackdoc
additional_dependencies: [ 'black==23.12.1' ]
exclude: '(xclim/indices/__init__.py|docs/installation.rst)'
- repo: https://github.com/codespell-project/codespell
rev: v2.2.6
hooks:
- id: codespell
additional_dependencies: [ 'tomli' ]
args: [ '--toml=pyproject.toml' ]
- repo: https://github.com/python-jsonschema/check-jsonschema
rev: 0.27.3
hooks:
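With the codespell hook added above, the spell check runs on every commit once the git hooks are installed. A minimal local sketch, assuming `pre-commit` itself is already installed:

```shell
# One-time setup: install the git hooks defined in .pre-commit-config.yaml.
pre-commit install
# Run only the new codespell hook against every file in the repository.
pre-commit run codespell --all-files
```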
14 changes: 8 additions & 6 deletions CHANGES.rst
@@ -25,13 +25,15 @@ Bug fixes
^^^^^^^^^
* Fixed passing ``missing=0`` to ``xclim.core.calendar.convert_calendar``. (:issue:`1562`, :pull:`1563`).
* Fix wrong `window` attributes in ``xclim.indices.standardized_precipitation_index``, ``xclim.indices.standardized_precipitation_evapotranspiration_index``. (:issue:`1552` :pull:`1554`).
* Several spelling mistakes have been corrected within the documentation and codebase. (:pull:`1576`).

Internal changes
^^^^^^^^^^^^^^^^
* The `flake8` configuration has been migrated from `setup.cfg` to `.flake8`; `setup.cfg` has been removed. (:pull:`1569`)
* The `bump-version.yml` workflow has been adjusted to bump the `patch` version when the last version is determined to have been a `release` version; otherwise, the `build` version is bumped. (:issue:`1557`, :pull:`1569`).
* The GitHub Workflows now use the `step-security/harden-runner` action to monitor source code, actions, and dependency safety. All workflows now employ more constrained permissions rule sets to prevent security issues. (:pull:`1577`).
* Updated the CONTRIBUTING.rst directions to showcase the new versioning system. (:issue:`1557`, :pull:`1573`).
* The `codespell` library is now a development dependency for the `dev` installation recipe with configurations found within `pyproject.toml`. This is also now a linting step and integrated as a `pre-commit` hook. For more information, see the `codespell documentation <https://github.com/codespell-project/codespell>`_ (:pull:`1576`).


v0.47.0 (2023-12-01)
@@ -317,7 +319,7 @@ New features and enhancements
* ``xclim.core.calendar.yearly_interpolated_doy``
* ``xclim.core.calendar.yearly_random_doy``
* `scipy` is no longer pinned below v1.9 and `lmoments3>=1.0.5` is now a core dependency and installed by default with `pip`. (:issue:`1142`, :pull:`1171`).
* Fix bug on number of bins in ``xclim.sdba.propeties.spatial_correlogram``. (:pull:`1336`)
* Fix bug on number of bins in ``xclim.sdba.properties.spatial_correlogram``. (:pull:`1336`)
* Add `resample_before_rl` argument to control when resampling happens in `maximum_consecutive_{frost|frost_free|dry|tx}_days` and in heat indices (in `_threshold`) (:issue:`1329`, :pull:`1331`)
* Add ``xclim.ensembles.make_criteria`` to help create inputs for the ensemble-reduction methods. (:issue:`1338`, :pull:`1341`).

@@ -1071,7 +1073,7 @@ Bug fixes
* Dimensions in a grouper's ``add_dims`` are now taken into consideration in function wrapped with ``map_blocks/groups``. This feature is still not fully tested throughout ``sdba`` though, so use with caution.
* Better dtype preservation throughout ``sdba``.
* "constant" extrapolation in the quantile mappings' adjustment is now padding values just above and under the target's max and min, instead of ``±np.inf``.
* Fixes in ``sdba.LOCI`` for the case where a grouping with additionnal dimensions is used.
* Fixes in ``sdba.LOCI`` for the case where a grouping with additional dimensions is used.

Internal Changes
^^^^^^^^^^^^^^^^
@@ -1139,7 +1141,7 @@ New indicators
Internal Changes
^^^^^^^^^^^^^^^^
* ``aggregate_between_dates`` (introduced in v0.27.0) now accepts ``DayOfYear``-like strings for supplying start and end dates (e.g. ``start="02-01", end="10-31"``).
* The indicator call sequence now considers "variable" the inputs annoted so. Dropped the ``nvar`` attribute.
* The indicator call sequence now considers "variable" the inputs annotated so. Dropped the ``nvar`` attribute.
* Default cfcheck is now to check metadata according to the variable name, using CMIP6 names in xclim/data/variable.yml.
* ``Indicator.missing`` defaults to "skip" if ``freq`` is absent from the list of parameters.
* Minor modifications to the GitHub Pull Requests template.
@@ -1186,7 +1188,7 @@ New indicators
Internal Changes
^^^^^^^^^^^^^^^^
* `run_length.rle_statistics` now accepts a `window` argument.
* Common arguments to the `op` parameter now have better adjective and noun formattings.
* Common arguments to the `op` parameter now have better adjective and noun formatting.
* Added and adjusted typing in call signatures and docstrings, with grammar fixes, for many `xclim.indices` operations.
* Added internal function ``aggregate_between_dates`` for array aggregation operations using xarray datetime arrays with start and end DayOfYear values.

@@ -1422,7 +1424,7 @@ Breaking changes
* The python library `pandoc` is no longer listed as a docs build requirement. Documentation still requires a current
version of `pandoc` binaries installed at system-level.
* ANUCLIM indices have seen their `input_freq` parameter renamed to `src_timestep` for clarity.
* A clean-up and harmonization of the indicators metadata has changed some of the indicator identifiers, long_names, abstracts and titles. `xclim.atmos.drought_code` and `fire_weather_indexes` now have indentifiers "dc" and "fwi" (lowercase version of the previous identifiers).
* A clean-up and harmonization of the indicators metadata has changed some of the indicator identifiers, long_names, abstracts and titles. `xclim.atmos.drought_code` and `fire_weather_indexes` now have identifiers "dc" and "fwi" (lowercase version of the previous identifiers).
* `xc.indices.run_length.run_length_with_dates` becomes `xc.indices.run_length.season_length`. Its argument `date` is now optional and the default changes from "07-01" to `None`.
* `xc.indices.consecutive_frost_days` becomes `xc.indices.maximum_consecutive_frost_days`.
* Changed the `history` indicator output attribute to `xclim_history` in order to respect CF conventions.
@@ -1569,7 +1571,7 @@ v0.14.x (2020-02-21)
* Refactoring of the documentation.
* Added support for pint 0.10
* Add `atmos.heat_wave_total_length` (fixing a namespace issue)
* Fixes in `utils.percentile_doy` and `indices.winter_rain_ratio` for multidimensionnal datasets.
* Fixes in `utils.percentile_doy` and `indices.winter_rain_ratio` for multidimensional datasets.
* Rewrote the `subset.subset_shape` function to allow for dask.delayed (lazy) computation.
* Added utility functions to compute `time_bnds` when resampling data encoded with `CFTimeIndex` (non-standard calendars).
* Fix in `subset.subset_gridpoint` for dask array coordinates.
1 change: 1 addition & 0 deletions Makefile
@@ -60,6 +60,7 @@ lint: ## check style with flake8 and black
nbqa black --check docs
blackdoc --check --exclude=xclim/indices/__init__.py xclim
blackdoc --check docs
codespell xclim tests docs
yamllint --config-file=.yamllint.yaml xclim

test: ## run tests quickly with the default Python
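With this one-line addition, the spelling pass becomes part of the standard linting target; a sketch, assuming the `dev` dependencies are installed:

```shell
# Runs the existing style checks (black, flake8, blackdoc, yamllint, ...)
# plus the new codespell pass in one go.
make lint
```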
2 changes: 1 addition & 1 deletion docs/installation.rst
@@ -24,7 +24,7 @@ Anaconda release
For ease of installation across operating systems, we also offer an Anaconda Python package hosted on conda-forge.
This version tends to be updated at around the same frequency as the PyPI-hosted library, but can lag by a few days at times.

`xclim` can be installed from conda-forge wth the following:
`xclim` can be installed from conda-forge with the following:

.. code-block:: shell
4 changes: 1 addition & 3 deletions docs/notebooks/ensembles.ipynb
@@ -17,8 +17,6 @@
"\n",
"from __future__ import annotations\n",
"\n",
"from pathlib import Path\n",
"\n",
"import numpy as np\n",
"import pandas as pd\n",
"import xarray as xr\n",
@@ -290,7 +288,7 @@
"\n",
"We can then divide the plotted points into categories each with its own hatching pattern, usually leaving the robust data (models agree and enough show a significant change) without hatching. \n",
"\n",
"Xclim provides some tools to help in generating these hatching masks. First is [xc.ensembles.robustness_fractions](../apidoc/xclim.ensembles.rst#xclim.ensembles._robustness.robustness_fractions) that can characterize the change significance and sign agreement accross ensemble members. To demonstrate its usage, we'll first generate some fake annual mean temperature data. Here, `ref` is the data on the reference period and `fut` is a future projection. There are 5 different members in the ensemble. We tweaked the generation so that all models agree on significant change in the \"south\" while agreement and signifiance of change decreases as we go north and east."
"Xclim provides some tools to help in generating these hatching masks. First is [xc.ensembles.robustness_fractions](../apidoc/xclim.ensembles.rst#xclim.ensembles._robustness.robustness_fractions) that can characterize the change significance and sign agreement across ensemble members. To demonstrate its usage, we'll first generate some fake annual mean temperature data. Here, `ref` is the data on the reference period and `fut` is a future projection. There are 5 different members in the ensemble. We tweaked the generation so that all models agree on significant change in the \"south\" while agreement and signifiance of change decreases as we go north and east."
]
},
{
2 changes: 1 addition & 1 deletion docs/notebooks/partitioning.ipynb
@@ -10,7 +10,7 @@
"Here we estimate the sources of uncertainty for an ensemble of climate model projections. The data is the same as used in the [IPCC WGI AR6 Atlas](https://github.com/IPCC-WG1/Atlas). \n",
"\n",
"## Fetch data\n",
"We'll only fetch a small sample of the full ensemble to illustrate the logic and data structure expected by the partitioning algorith."
"We'll only fetch a small sample of the full ensemble to illustrate the logic and data structure expected by the partitioning algorithm."
]
},
{
2 changes: 1 addition & 1 deletion docs/notebooks/sdba-advanced.ipynb
@@ -840,7 +840,7 @@
"ref_prop = sdba.properties.spell_length_distribution(\n",
" da=ref_future, thresh=\"28 degC\", op=\">\", stat=\"mean\", group=\"time.season\"\n",
")\n",
"# Properties are often associated with the same measures. This correspondance is implemented in xclim:\n",
"# Properties are often associated with the same measures. This correspondence is implemented in xclim:\n",
"measure = sdba.properties.spell_length_distribution.get_measure()\n",
"measure_sim = measure(sim_prop, ref_prop)\n",
"measure_scen = measure(scen_prop, ref_prop)\n",
4 changes: 2 additions & 2 deletions docs/notebooks/sdba.ipynb
@@ -369,7 +369,7 @@
"metadata": {},
"outputs": [],
"source": [
"# To get an exagerated example we select different points\n",
"# To get an exaggerated example we select different points\n",
"# here \"lon\" will be our dimension of two \"spatially correlated\" points\n",
"reft = ds.air.isel(lat=21, lon=[40, 52]).drop_vars([\"lon\", \"lat\"])\n",
"simt = ds.air.isel(lat=18, lon=[17, 35]).drop_vars([\"lon\", \"lat\"])\n",
@@ -570,7 +570,7 @@
" base=sdba.QuantileDeltaMapping, # Use QDM as the univariate adjustment.\n",
" base_kws={\"nquantiles\": 20, \"group\": \"time\"},\n",
" n_iter=20, # perform 20 iteration\n",
" n_escore=1000, # only send 1000 points to the escore metric (it is realy slow)\n",
" n_escore=1000, # only send 1000 points to the escore metric (it is really slow)\n",
" )\n",
"\n",
"scenh_npdft = out.scenh.rename(time_hist=\"time\") # Bias-adjusted historical period\n",
4 changes: 2 additions & 2 deletions docs/notebooks/usage.ipynb
@@ -66,7 +66,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"This computation was made using the `growing_degree_days` **indicator**. The same computation could be made through the **index**. You can see how the metadata is alot poorer here."
"This computation was made using the `growing_degree_days` **indicator**. The same computation could be made through the **index**. You can see how the metadata is a lot poorer here."
]
},
{
@@ -202,7 +202,7 @@
"):\n",
" # Change the missing method to \"percent\", instead of the default \"any\"\n",
" # Set the tolerance to 10%, periods with more than 10% of missing data\n",
" # in the input will be masked in the ouput.\n",
" # in the input will be masked in the output.\n",
" gdd = xclim.atmos.growing_degree_days(daily_ds.air, thresh=\"10.0 degC\", freq=\"MS\")\n",
"gdd"
]
1 change: 1 addition & 0 deletions environment.yml
@@ -32,6 +32,7 @@ dependencies:
- blackdoc
- bump-my-version
- cairosvg
- codespell
- coverage
- distributed >=2.0
- filelock
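For conda-based development environments, the new tool is picked up by refreshing the environment; a sketch, assuming the environment was originally created from this file:

```shell
# Update an existing conda environment in place so that codespell is installed.
conda env update --file environment.yml
```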
4 changes: 4 additions & 0 deletions pyproject.toml
@@ -60,6 +60,7 @@ dev = [
"black >=23.3.0",
"blackdoc",
"bump-my-version",
"codespell",
"coverage[toml]",
"flake8",
"flake8-alphabetize",
@@ -148,6 +149,9 @@ values = [
"release"
]

[tool.codespell]
skip = 'xclim/data/*.json,docs/_build,docs/notebooks/xclim_training/*.ipynb,docs/references.bib,__pycache__,*.nc,*.png,*.gz,*.whl'
ignore-words-list = "absolue,astroid,bloc,bui,callendar,degreee,environnement,hanel,inferrable,lond,nam,nd,ressources,vas"

[tool.coverage.run]
relative_files = true
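For reference, the `[tool.codespell]` table above mirrors codespell's command-line options, so an equivalent invocation without the TOML file would look roughly like this (a sketch; see the codespell documentation linked in the PR description for authoritative flag names):

```shell
# Rough command-line equivalent of the [tool.codespell] settings above.
codespell \
  --skip="xclim/data/*.json,docs/_build,docs/notebooks/xclim_training/*.ipynb,docs/references.bib,__pycache__,*.nc,*.png,*.gz,*.whl" \
  --ignore-words-list="absolue,astroid,bloc,bui,callendar,degreee,environnement,hanel,inferrable,lond,nam,nd,ressources,vas" \
  xclim tests docs
```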
2 changes: 1 addition & 1 deletion tests/test_ensembles.py
@@ -412,7 +412,7 @@ def test_kmeans_variweights(self, open_dataset, random_state):
make_graph=False,
variable_weights=var_weights,
)
# Results here may change according to sklearn version, hence the *isin* intead of ==
# Results here may change according to sklearn version, hence the *isin* instead of ==
assert all(np.isin([12, 13, 16], ids))
assert len(ids) == 6

4 changes: 2 additions & 2 deletions tests/test_indices.py
@@ -1465,7 +1465,7 @@ def test_jetstream_metric_woollings(self):
# Should raise ValueError as longitude is in 0-360 instead of -180.E-180.W
with pytest.raises(ValueError):
_ = xci.jetstream_metric_woollings(da_ua)
# redefine longitude coordiantes to -180.E-180.W so function runs
# redefine longitude coordinates to -180.E-180.W so function runs
da_ua = da_ua.cf.assign_coords(
{
"X": (
@@ -2888,7 +2888,7 @@ def test_humidex(tas_series):
# expected values from https://en.wikipedia.org/wiki/Humidex
expected = np.array([16, 29, 47, 52]) * units.degC

# Celcius
# Celsius
hc = xci.humidex(tas, dtps)
np.testing.assert_array_almost_equal(hc, expected, 0)

4 changes: 2 additions & 2 deletions tests/test_sdba/test_base.py
@@ -89,7 +89,7 @@ def test_grouper_apply(tas_series, use_dask, group, n):
exp = tas.mean(dim=grouper.dim).expand_dims("group").T
np.testing.assert_array_equal(out_mean, exp)

# With additionnal dimension included
# With additional dimension included
grouper = Grouper(group, add_dims=["lat"])
out = grouper.apply("mean", tas)
assert out.ndim == 1
@@ -98,7 +98,7 @@ def test_grouper_apply(tas_series, use_dask, group, n):
assert out.attrs["group_compute_dims"] == [grouper.dim, "lat"]
assert out.attrs["group_window"] == 1

# Additionnal but main_only
# Additional but main_only
out = grouper.apply("mean", tas, main_only=True)
np.testing.assert_array_equal(out, out_mean)

2 changes: 1 addition & 1 deletion tests/test_temperature.py
@@ -275,7 +275,7 @@ def test_TN_3d_data(self, open_dataset):
~np.isnan(tnmean).values & ~np.isnan(tnmax).values & ~np.isnan(tnmin).values
)

# test maxes always greater than mean and mean alwyas greater than min (non nan values only)
# test maxes always greater than mean and mean always greater than min (non nan values only)
assert np.all(tnmax.values[no_nan] > tnmean.values[no_nan]) & np.all(
tnmean.values[no_nan] > tnmin.values[no_nan]
)
4 changes: 2 additions & 2 deletions tests/test_utils.py
@@ -65,7 +65,7 @@ def test_ensure_chunk_size():

class TestNanCalcPercentiles:
def test_calc_perc_type7(self):
# Exemple array from: https://en.wikipedia.org/wiki/Percentile#The_nearest-rank_method
# Example array from: https://en.wikipedia.org/wiki/Percentile#The_nearest-rank_method
arr = np.asarray([15.0, 20.0, 35.0, 40.0, 50.0])
res = nan_calc_percentiles(arr, percentiles=[40.0], alpha=1, beta=1)
# The expected is from R `quantile(arr, probs=c(0.4), type=7)`
@@ -87,7 +87,7 @@ def test_calc_perc_type8(self):
assert np.all(res[0][1] == 27)

def test_calc_perc_2d(self):
# Exemple array from: https://en.wikipedia.org/wiki/Percentile#The_nearest-rank_method
# Example array from: https://en.wikipedia.org/wiki/Percentile#The_nearest-rank_method
arr = np.asarray(
[[15.0, 20.0, 35.0, 40.0, 50.0], [15.0, 20.0, 35.0, 40.0, 50.0]]
)
12 changes: 6 additions & 6 deletions xclim/core/bootstrapping.py
@@ -23,7 +23,7 @@ def percentile_bootstrap(func):
This feature is experimental.
Bootstraping avoids discontinuities in the exceedance between the reference period over which percentiles are
Bootstrapping avoids discontinuities in the exceedance between the reference period over which percentiles are
computed, and "out of reference" periods. See `bootstrap_func` for details.
Declaration example:
@@ -71,12 +71,12 @@ def bootstrap_func(compute_index_func: Callable, **kwargs) -> xarray.DataArray:
at the beginning and end of the reference period used to calculate percentiles. The bootstrap procedure can reduce
those discontinuities by iteratively computing the percentile estimate and the index on altered reference periods.
Theses altered reference periods are themselves built iteratively: When computing the index for year x, the
bootstrapping create as many altered reference period as the number of years in the reference period.
To build one altered reference period, the values of year x are replaced by the values of another year in the
These altered reference periods are themselves built iteratively: When computing the index for year `x`, the
bootstrapping creates as many altered reference periods as the number of years in the reference period.
To build one altered reference period, the values of year `x` are replaced by the values of another year in the
reference period, then the index is computed on this altered period. This is repeated for each year of the reference
period, excluding year x, The final result of the index for year x, is then the average of all the index results on
altered years.
period, excluding year `x`. The final result of the index for year `x` is then the average of all the index results
on altered years.
Parameters
----------
