NREL · mdeceglie · Oct 17, 2024 · Sep 10, 2024 · Sep 10, 2024 · Sep 11, 2024
diff --git a/.github/workflows/nbval.yaml b/.github/workflows/nbval.yaml
@@ -10,10 +10,10 @@ jobs:
       fail-fast: false  # don't cancel other matrix jobs when one fails
       matrix:
         notebook-file: [
-          'TrendAnalysis_example_pvdaq4.ipynb',
-          'degradation_and_soiling_example_pvdaq_4.ipynb',
+          'TrendAnalysis_example.ipynb',
+          'TrendAnalysis_example_NSRDB.ipynb',
+          'degradation_and_soiling_example.ipynb',
           'system_availability_example.ipynb'
-          # can't run the DKASC notebook here because it requires pre-downloaded data
         ]
 
     steps:

diff --git a/README.md b/README.md
@@ -12,11 +12,11 @@ Code coverage:
 RdTools is an open-source library to support reproducible technical analysis of
 time series data from photovoltaic energy systems. The library aims to provide
 best practice analysis routines along with the building blocks for users to
-tailor their own analyses.
-Current applications include the evaluation of PV production over several years to obtain
-rates of performance degradation and soiling loss. RdTools can handle
-both high frequency (hourly or better) or low frequency (daily, weekly,
-etc.) datasets. Best results are obtained with higher frequency data.
+tailor their own analyses. Current applications include the evaluation of PV
+production over several years to obtain rates of performance degradation and 
+soiling loss. RdTools can handle both high frequency (hourly or better) and low
+frequency (daily, weekly, etc.) datasets. Best results are obtained with higher
+frequency data.
 
 RdTools can be installed automatically into Python from PyPI using the
 command line:
@@ -27,7 +27,7 @@ pip install rdtools
 
 For API documentation and full examples, please see the [documentation](https://rdtools.readthedocs.io).
 
-RdTools currently is tested on Python 3.7+.
+RdTools currently is tested on Python 3.9+.
 
 ## Citing RdTools
 

diff --git a/docs/TrendAnalysis_example_pvdaq4.ipynb → docs/TrendAnalysis_example.ipynb b/docs/TrendAnalysis_example_pvdaq4.ipynb → docs/TrendAnalysis_example.ipynb
diff --git a/docs/TrendAnalysis_example_NSRDB.ipynb b/docs/TrendAnalysis_example_NSRDB.ipynb
diff --git a/docs/cods_example.ipynb b/docs/cods_example.ipynb
diff --git a/...adation_and_soiling_example_pvdaq_4.ipynb → docs/degradation_and_soiling_example.ipynb b/...adation_and_soiling_example_pvdaq_4.ipynb → docs/degradation_and_soiling_example.ipynb
diff --git a/docs/sphinx/source/changelog/pending.rst b/docs/sphinx/source/changelog/pending.rst
@@ -20,6 +20,8 @@ Enhancements
 * Added a new wrapper function for clearsky filters (:pull:`412`)
 * Improve test coverage, especially for the newly added filter capabilities (:pull:`413`)
 * Added codecov.yml configuration file (:pull:`420`)
+* Availability module no longer considered experimental (:pull:`429`)
+* Add capability to seed the CircularBlockBootstrap (:pull:`429`)
 
 Bug fixes
 ---------
@@ -43,7 +45,7 @@ Documentation
 Requirements
 ------------
 * Specified versions in ``requirements.txt``, ``requirements_min.txt`` and ``docs/notebook_requirements.txt``
-have been updated (:pull:`412`, :pull:`428`)
+have been updated (:pull:`412`, :pull:`428`, :pull:`429`)
 
     * Updated certifi==2024.7.4 in ``requirements.txt`` (:pull:`428`)
     * Updated chardet==5.2.0 in ``requirements.txt`` (:pull:`428`)
@@ -86,7 +88,7 @@ have been updated (:pull:`412`, :pull:`428`)
     * Updated h5py==3.7.0 in ``requirements_min.txt`` (:pull:`428`)
     * Updated pvlib==0.11.0 in ``requirements_min.txt`` (:pull:`428`)
     * Updated scikit-learn==1.1.3 in ``requirements_min.txt`` (:pull:`428`)
-    * Updated arch==4.11 in ``requirements_min.txt`` (:pull:`428`)
+    * Updated arch==5.0 in ``requirements_min.txt`` (:pull:`429`)
     * Updated filterpy==1.4.5 in ``requirements_min.txt`` (:pull:`428`)
     * Updated xgboost==1.6.0 in ``requirements_min.txt`` (:pull:`431`)
 
@@ -143,7 +145,7 @@ have been updated (:pull:`412`, :pull:`428`)
     * Increase maximum version of pvlib to <0.12 (:pull:`423`)
     * Updated classifiers to accommodate new python versions (:pull:`428`)
     * Add pytest-cov to TESTS_REQUIRE (:pull:`420`)
-    * Add arch >= 4.11 to INSTALL_REQUIRES (:pull:`428`)
+    * Add arch >= 5.0 to INSTALL_REQUIRES (:pull:`429`)
     * Add filterpy >= 1.4.2 to INSTALL_REQUIRES (:pull:`428`)
     * Updated matplotlib >= 3.5.3 in INSTALL_REQUIRES (:pull:`428`)
     * Updated numpy >= 1.22.4 in INSTALL_REQUIRES (:pull:`428`)

diff --git a/docs/sphinx/source/examples.rst b/docs/sphinx/source/examples.rst
@@ -22,7 +22,7 @@ This page shows example usage of the RdTools analysis functions.
 
 .. nbgallery::
 
-    examples/degradation_and_soiling_example_pvdaq_4
-    examples/TrendAnalysis_example_pvdaq4
+    examples/degradation_and_soiling_example
+    examples/TrendAnalysis_example
+    examples/TrendAnalysis_example_NSRDB
     examples/system_availability_example
-    examples/cods_example
diff --git a/docs/sphinx/source/examples/TrendAnalysis_example.nblink b/docs/sphinx/source/examples/TrendAnalysis_example.nblink
@@ -0,0 +1,3 @@
+{
+    "path": "../../../TrendAnalysis_example.ipynb"
+}
diff --git a/docs/sphinx/source/examples/TrendAnalysis_example_NSRDB.nblink b/docs/sphinx/source/examples/TrendAnalysis_example_NSRDB.nblink
@@ -0,0 +1,3 @@
+{
+    "path": "../../../TrendAnalysis_example_NSRDB.ipynb"
+}
diff --git a/docs/sphinx/source/examples/TrendAnalysis_example_pvdaq4.nblink b/docs/sphinx/source/examples/TrendAnalysis_example_pvdaq4.nblink
diff --git a/docs/sphinx/source/examples/cods_example.nblink b/docs/sphinx/source/examples/cods_example.nblink
diff --git a/docs/sphinx/source/examples/degradation_and_soiling_example.nblink b/docs/sphinx/source/examples/degradation_and_soiling_example.nblink
@@ -0,0 +1,3 @@
+{
+    "path": "../../../degradation_and_soiling_example.ipynb"
+}
diff --git a/docs/sphinx/source/examples/degradation_and_soiling_example_pvdaq_4.nblink b/docs/sphinx/source/examples/degradation_and_soiling_example_pvdaq_4.nblink
diff --git a/docs/sphinx/source/index.rst b/docs/sphinx/source/index.rst
@@ -36,17 +36,15 @@ supported. A typical analysis of soiling and degradation contains the following:
 
 0. Import and preliminary calculations
 1. Normalize data using a performance metric
-2. Filter data that creates bias
+2. Filter data to reduce error
 3. Aggregate data
-4. Analyze aggregated data to estimate the degradation rate and/or
+4. Filter aggregated data to remove anomalies
+5. Analyze aggregated data to estimate the degradation rate and/or
    soiling loss
 
-Steps 1 and 2 may be accomplished with the clearsky workflow (see the
-:ref:`examples`) which can help eliminate problems from irradiance sensor
-drift.
-
-.. image:: _images/RdTools_workflows.png
-  :alt: RdTools workflow diagram
+It can be helpful to repeat the above steps with both ground-based measurements of weather
+and satellite weather to check for drift in the ground-based measurements. This is illustrated
+in the TrendAnalysis with NSRDB example.
 
 Degradation
 ^^^^^^^^^^^
@@ -63,28 +61,26 @@ the uncertainty in the estimate via a bootstrap calculation. The
 .. image:: _images/Clearsky_result_updated.png
    :alt: RdTools degradation results plot
 
-Two workflows are available for system performance ratio calculation,
-and illustrated in an example notebook. The sensor-based approach
-assumes that site irradiance and temperature sensors are calibrated and
-in good repair. Since this is not always the case, a 'clear-sky'
-workflow is provided that is based on modeled temperature and
-irradiance. Note that site irradiance data is still required to identify
-clear-sky conditions to be analyzed. In many cases, the 'clear-sky'
-analysis can identify conditions of instrument errors or irradiance
-sensor drift, such as in the above analysis.
-
-The clear-sky analysis tends to provide less stable results than sensor-based
+Drift of weather sensors over time (particularly irradiance) can bias the results
+of this workflow. The preferred way to check for this is to also run the workflow using
+satellite-derived weather data such as the National Solar Radiation Database (NSRDB) and
+compare results to the sensor-based analysis. If satellite data is not available,
+a 'clear-sky' workflow is also available in RdTools. This workflow is based on modeled
+temperature and irradiance. Note that site irradiance data is still required to identify
+clear-sky conditions to be analyzed.
+
+Satellite and clear-sky analyses tend to provide less stable results than sensor-based
 analysis when details such as filtering are changed. We generally recommend
-that the clear-sky analysis be used as a check on the sensor-based results,
+that these be used only as a check on the sensor-based results,
 rather than as a stand-alone analysis.
 
 Soiling
 ^^^^^^^
 
-Soiling can be estimated with the stochastic rate and recovery (SRR)
-method (Deceglie 2018). This method works well when soiling patterns
-follow a "sawtooth" pattern, a linear decline followed by a sharp
-recovery associated with natural or manual cleaning.
+RdTools provides two methods for soiling analysis. The first is the
+stochastic rate and recovery (SRR) method (Deceglie 2018). This method works
+well when soiling patterns follow a "sawtooth" pattern, a linear decline followed
+by a sharp recovery associated with natural or manual cleaning.
 :py:func:`.soiling.soiling_srr` performs the calculation and returns the P50
 insolation-weighted soiling ratio, confidence interval, and additional
 information (``soiling_info``) which includes a summary of the soiling
@@ -97,6 +93,12 @@ identified soiling rates for the dataset.
    :width: 320
    :height: 216
 
+The combined estimation of degradation and soiling (CODS) method (Skomedal 2020) is also available
+in RdTools. CODS self-consistently extracts degradation, soiling, and seasonality
+of the daily-aggregated normalized performance signal. It is particularly useful
+when soiling trends are biasing degradation results. Its use is shown in both the TrendAnalysis
+example notebook and the functional API example notebook for degradation and soiling.
+
 TrendAnalysis
 ^^^^^^^^^^^^^
 An object-oriented API for complete soiling and degradation analysis including 
@@ -151,7 +153,7 @@ Usage and examples
 ------------------
 
 Full workflow examples are found in the notebooks in :ref:`examples`.
-The examples are designed to work with python 3.10. For a consistent
+The examples are designed to work with python 3.12. For a consistent
 experience, we recommend installing the packages and versions documented
 in ``docs/notebook_requirements.txt``. This can be achieved in your
 environment by first installing RdTools as described above, then running
@@ -259,6 +261,10 @@ appropriate:
    Directly From PV Yield," in IEEE Journal of Photovoltaics, 8(2),
    pp. 547-551, 2018 DOI: `10.1109/JPHOTOV.2017.2784682 <https://doi.org/10.1109/JPHOTOV.2017.2784682>`_
 
+-  Åsmund Skomedal and Michael G. Deceglie, "Combined Estimation of Degradation and Soiling Losses in
+   Photovoltaic Systems," in IEEE Journal of Photovoltaics, 10(6) pp. 1788-1796, 2020.
+   DOI: `10.1109/JPHOTOV.2020.3018219 <https://doi.org/10.1109/JPHOTOV.2020.3018219>`_
+
 -  Kevin Anderson and Ryan Blumenthal, "Overcoming Communications Outages in
    Inverter Downtime Analysis", 2020 IEEE 47th Photovoltaic Specialists
    Conference (PVSC). DOI: `10.1109/PVSC45281.2020.9300635 <https://doi.org/10.1109/PVSC45281.2020.9300635>`_

diff --git a/docs/system_availability_example.ipynb b/docs/system_availability_example.ipynb
diff --git a/rdtools/analysis_chains.py b/rdtools/analysis_chains.py
@@ -69,21 +69,28 @@ class TrendAnalysis:
     ----------
     (not all attributes documented here)
     filter_params: dict
-        parameters to be passed to rdtools.filtering functions. Keys are the
+        Parameters to be passed to rdtools.filtering functions. Keys are the
         names of the rdtools.filtering functions. Values are dicts of parameters
-        to be passed to those functions. Also has a special key `ad_hoc_filter`
-        the associated value is a boolean mask joined with the rest of the filters.
-        filter_params defaults to empty dicts for each function in rdtools.filtering,
-        in which case those functions use default parameter values,  `ad_hoc_filter`
-        defaults to None. See examples for more information.
+        to be passed to those functions. Allowed keys are `normalized_filter`*,
+        `poa_filter`*, `tcell_filter`*, `clip_filter`*, `hour_angle_filter`,
+        `clearsky_filter`* (only used in a clear sky analysis), and
+        `sensor_clearsky_filter` (only used in a sensor analysis). (* indicates a
+        filter included by default). To invoke `clearsky_filter` for a sensor
+        analysis, use the special key `sensor_clearsky_filter`. Also has a special
+        key `ad_hoc_filter`; the associated value is a boolean mask joined with the
+        rest of the filters. Defaults to empty dicts for each function as described
+        above, in which case those functions use default parameter values.
+        `ad_hoc_filter` defaults to None. See examples for more information.
     filter_params_aggregated: dict
         parameters to be passed to rdtools.filtering functions that specifically handle
-        aggregated data (dily filters, etc). Keys are the names of the rdtools.filtering functions.
-        Values are dicts of parameters to be passed to those functions. To invoke `clearsky_filter`
-        for a sensor analysis, use the special key `sensor_clearsky_filter`. Also has a special key
+        aggregated data (daily filters, etc). Keys are the names of rdtools.filtering
+        functions. Allowed keys are `two_way_window_filter`*, `insolation_filter`,
+        `hampel_filter`, and `directional_tukey_filter` (* indicates filters included by
+        default). Values are dicts of parameters to be passed to those functions (empty
+        dict calls the function with its default parameters). Also has a special key
         `ad_hoc_filter`; this filter is a boolean mask joined with the rest of the filters.
-        filter_params_aggregated defaults to empty dicts for each function in rdtools.filtering,
-        in which case those functions use default parameter values,  `ad_hoc_filter`
+        filter_params_aggregated defaults to an empty dict for two_way_window_filter,
+        in which case the filter is run with its default parameter values. `ad_hoc_filter`
         defaults to None. See examples for more information.
     results : dict
         Nested dict used to store the results of methods ending with `_analysis`

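As a reading aid for the revised docstring, here is a minimal sketch of what a `filter_params_aggregated` dict might look like. The key names and the `quantile`, `k`, and `t0` parameter names are taken from this diff. The `unknown_keys` helper and the chosen values are hypothetical illustrations and are not part of RdTools.

```python
# Keys the revised docstring lists as allowed for filter_params_aggregated.
ALLOWED_AGGREGATED_KEYS = {
    "two_way_window_filter",
    "insolation_filter",
    "hampel_filter",
    "directional_tukey_filter",
    "ad_hoc_filter",
}

# Illustrative configuration (values are assumptions, chosen to mirror
# the default arguments shown in the filtering.py hunks of this diff).
filter_params_aggregated = {
    "two_way_window_filter": {},             # empty dict: use the filter's defaults
    "insolation_filter": {"quantile": 0.1},  # opt in to a non-default filter
    "hampel_filter": {"k": "14d", "t0": 3},
    "ad_hoc_filter": None,                   # or a boolean mask joined with the others
}


def unknown_keys(params, allowed=ALLOWED_AGGREGATED_KEYS):
    """Hypothetical helper: return keys not listed in the docstring."""
    return sorted(set(params) - set(allowed))


print(unknown_keys(filter_params_aggregated))
```

A check like this can catch a misplaced key, such as `poa_filter`, which the docstring lists under `filter_params` rather than `filter_params_aggregated`.
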
diff --git a/rdtools/availability.py b/rdtools/availability.py
@@ -14,12 +14,6 @@
 from scipy.interpolate import interp1d
 import warnings
 
-warnings.warn(
-    'The availability module is currently experimental. The API, results, '
-    'and default behaviors may change in future releases (including MINOR '
-    'and PATCH releases) as the code matures.'
-)
-
 
 class AvailabilityAnalysis:
     """

diff --git a/rdtools/bootstrap.py b/rdtools/bootstrap.py
@@ -15,7 +15,8 @@
 
 
 def _make_time_series_bootstrap_samples(
-    signal, model_fit, sample_nr=1000, block_length=90, decomposition_type='multiplicative'
+    signal, model_fit, sample_nr=1000, block_length=90,
+    decomposition_type='multiplicative', bootstrap_seed=None
 ):
     '''
     Generate bootstrap samples based on a time series signal and its model fit
@@ -34,6 +35,10 @@ def _make_time_series_bootstrap_samples(
     decomposition_type : string, default 'multiplicative'
         The type of decomposition to use with the model,
         either 'multiplicative' or 'additive'
+    bootstrap_seed: {Generator, RandomState, int}, default None
+        Seed passed to CircularBlockBootstrap to ensure reproducible results.
+        If an int, the value is passed to ``np.random.default_rng``.
+        If None (default), a fresh Generator is constructed with system-provided entropy.
 
     Returns
     -------
@@ -54,7 +59,7 @@ def _make_time_series_bootstrap_samples(
         index=signal.index, columns=range(sample_nr))
 
     # Create circular blocks of bootstrap samples
-    bs = CircularBlockBootstrap(block_length, residuals)
+    bs = CircularBlockBootstrap(block_length, residuals, seed=bootstrap_seed)
     for b, bootstrapped_residuals in enumerate(bs.bootstrap(sample_nr)):
         if decomposition_type == 'multiplicative':
             bootstrap_samples.loc[:, b] = \

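To illustrate why a seed makes the bootstrap reproducible, the following is a minimal sketch of a circular block bootstrap. It is an assumption-laden stand-in: it uses the standard library `random.Random` rather than `arch`'s `CircularBlockBootstrap` or a NumPy `Generator`, and the function name and sample data are hypothetical.

```python
import random


def circular_block_bootstrap(values, block_length, seed=None):
    """Return one circular block bootstrap resample of `values`.

    Illustrative only; this is not the arch implementation.
    """
    rng = random.Random(seed)  # seed=None falls back to system entropy
    n = len(values)
    sample = []
    while len(sample) < n:
        start = rng.randrange(n)
        # Take a block of consecutive points, wrapping past the end of the series.
        sample.extend(values[(start + i) % n] for i in range(block_length))
    return sample[:n]  # trim the last block to the original length


residuals = [0.1, -0.2, 0.05, 0.3, -0.1, 0.0, 0.2, -0.3]
first = circular_block_bootstrap(residuals, block_length=3, seed=42)
second = circular_block_bootstrap(residuals, block_length=3, seed=42)
print(first == second)
```

With the same seed the two resamples are identical, which is the behavior the new `bootstrap_seed` argument is meant to expose. With `seed=None` each call draws fresh randomness.
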
diff --git a/rdtools/filtering.py b/rdtools/filtering.py
@@ -742,6 +742,7 @@ def _calculate_xgboost_model_features(df, sampling_frequency):
     # Get the max value for the day and see how each value compares
     df["date"] = list(pd.to_datetime(pd.Series(df.index)).dt.date)
     df["daily_max"] = df.groupby(["date"])["scaled_value"].transform("max")
+
     # Get percentage of daily max
     df["percent_daily_max"] = df["scaled_value"] / (df["daily_max"] + 0.00001)
     # Get the standard deviation, median and mean of the first order
@@ -871,6 +872,7 @@ def xgboost_clip_filter(power_ac, mounting_type="fixed"):
     # data frequency.
     xgb_predictions = xgb_predictions.reindex(index=power_ac.index, method="ffill")
     xgb_predictions.loc[xgb_predictions.isnull()] = False
+
     # Regenerate the features with the original sampling frequency
     # (pre-resampling or interpolation).
     power_ac_df = power_ac.to_frame()
@@ -924,6 +926,7 @@ def two_way_window_filter(
     """
     Removes anomalies based on forward and backward window of the rolling median. Points beyond
     outlier_threshold from both the forward and backward-looking median are excluded by the filter.
+    Designed for use after the aggregation step in the RdTools trend analysis workflows.
 
     Parameters
     ----------
@@ -964,7 +967,8 @@ def two_way_window_filter(
 def insolation_filter(insolation, quantile=0.1):
     """
     A simple quantile filter. Primary application in RdTools is to exclude
-    low insolation points after the aggregation step.
+    low insolation points after the aggregation step in the trend analysis
+    workflows.
 
     Parameters
     ----------
@@ -986,7 +990,8 @@ def insolation_filter(insolation, quantile=0.1):
 
 def hampel_filter(series, k="14d", t0=3):
     """
-    Hampel outlier filter primarily applied after aggregation step, but broadly
+    Hampel outlier filter designed for use after the aggregation step
+    in the RdTools trend analysis workflows, but broadly
     applicable.
 
     Parameters
@@ -1026,8 +1031,9 @@ def _tukey_fence(series, k=1.5):
 def directional_tukey_filter(series, roll_period=pd.to_timedelta("7 Days"), k=1.5):
     """
     Performs a forward and backward looking rolling Tukey filter. Points more than k*IQR
-    above the third quartile or below the first quartile are classified as outliers.Points
-    must only pass one of either the forward or backward looking filters to be kept.
+    above the third quartile or below the first quartile are classified as outliers. Points
+    must only pass one of either the forward or backward looking filters to be kept. Designed
+    for use after the aggregation step in the RdTools trend analysis workflows.
 
 
     Parameters

diff --git a/rdtools/plotting.py b/rdtools/plotting.py
@@ -347,11 +347,6 @@ def availability_summary_plots(power_system, power_subsystem, loss_total,
     :py:meth:`.availability.AvailabilityAnalysis.plot` instead of running
     this function manually.
 
-    .. warning::
-        The availability module is currently experimental. The API, results,
-        and default behaviors may change in future releases (including MINOR
-        and PATCH releases) as the code matures.
-
     Parameters
     ----------
     power_system : pandas.Series
@@ -389,11 +384,6 @@ def availability_summary_plots(power_system, power_subsystem, loss_total,
     ...     aa.power_subsystem, aa.loss_total, aa.energy_cumulative,
     ...     aa.energy_expected_rescaled, aa.outage_info)
     """
-    warnings.warn(
-        'The availability module is currently experimental. The API, results, '
-        'and default behaviors may change in future releases (including MINOR '
-        'and PATCH releases) as the code matures.'
-    )
 
     fig = plt.figure(figsize=(16, 8))
     gs = fig.add_gridspec(3, 2)