feat: add timeseries section to model schema #99

Batch21 · 2024-02-07T10:47:02Z

No description provided.

Batch21 · 2024-02-07T11:00:07Z

@jetuk there is quite a bit left to do here but it would be good to get your opinion on the approach.

I have a few queries:

Should Timeseries be a variant of MetricFloatValue? Originally I made it a variant of MetricFloatReference but as the load method of this enum is called in the model build method when setting up the inter-network transfers, it required creating an empty timeseries collection, which was a bit ugly.
Should the timeseries inputs be able to handle index timeseries? The load_column and load_df methods could probably take a generic type arg for loading either floats or ints.
Should timeseries and tables be combined into a single inputs section?

jetuk

I've had a quick read through. I've left a couple of comments. I think this is what I was thinking it would look like.

pywr-schema/src/test_models/timeseries.json

pywr-schema/src/timeseries/mod.rs

pywr-schema/src/timeseries/polars_dataset.rs

pywr-schema/src/nodes/annual_virtual_storage.rs

Batch21 · 2024-02-11T23:28:45Z

@jetuk I've had a go at some resampling using polars. It still needs some work but one change that might simplify the code a bit would be to use chrono across the project instead of time-rs. This would also open up the possibility of using polars duration strings to specify model timestep frequency and the polars date_range function to generate the timesteps. What do you think?

jetuk · 2024-02-12T10:55:26Z

I think the polars integration and the duration strings are a good argument for switching to chrono. How should we do that? Should we do a separate PR to swap, and the rebase this against it?

Batch21 · 2024-03-03T19:49:07Z

@jetuk could you take another look at this. I responded to most of your comments thought there are still some issues to fix:

Resampling to or from timesteps with non-uniform durations does not currently work:
- If the input data has a non-uniform datetime column the code will currently hit a todo!(). One issue with implementing something here is that polars does not have a direct way to infer frequency from a datetime column. I can't find any other rust crates with this functionality but looking at the Pandas code for it, it might not be too hard to implement it for a few frequencies (maybe we only need monthly initially?).
- For the model timesteps, currently the TimeDomain derives step duration using only the first timestep so will not account for non-uniform timestep durations. This needs fixing alongside the last part of Time domain improvements #89, which I think you said you wanted to have a look at?
Its probably unnecessary to use nanosecond precision when comparing the durations. Seconds is probably fine?
I've fixed an issue where a ParameterNameAlreadyExist error occurred if a dataframe or dataframe column was referenced more than once. However there is still the issue that a user could define a parameter with a name that matches with one that would be created for a timeseries input. In this case, it would now use the values from the user-defined parameter if this was loaded first and not load the timeseries data. I wonder if parameters should have some kind of namespacing so that we can easily separate user-defined parameters from those created internally.

At least the first of these is probably worth doing in another PR?

jetuk · 2024-03-03T20:55:54Z

@jetuk could you take another look at this. I responded to most of your comments thought there are still some issues to fix:

* Resampling to or from timesteps with non-uniform durations does not currently work:
  
  * If the input data has a non-uniform datetime column the code will currently hit a `todo!()`. One issue with implementing something here is that polars does not have a direct way to infer frequency from a datetime column. I can't find any other rust crates with this functionality but looking at the Pandas code for it, it might not be too hard to implement it for a few frequencies (maybe we only need monthly initially?).

That's probably OK. There are other TODOs and unfinished items at the moment. We can get this framework in and work on it.

  * For the model timesteps, currently the `TimeDomain` derives step duration using only the first timestep so will not account for non-uniform timestep durations. This needs fixing alongside the last part of [Time domain improvements #89](https://github.com/pywr/pywr-next/issues/89), which I think you said you wanted to have a look at?

Yes, I had some ideas, but nothing implemented. Can we merge this with a limitation? That method could do a check that all time-steps have the same duration and panic if not?

* Its probably unnecessary to use nanosecond precision when comparing the durations. Seconds is probably fine?

Yes!

* I've fixed an issue where a `ParameterNameAlreadyExist` error occurred if a dataframe or dataframe column was referenced more than once. However there is still the issue that a user could define a parameter with a name that matches with one that would be created for a timeseries input. In this case, it would now use the values from the user-defined parameter if this was loaded first and not load the timeseries data. I wonder if parameters should have some kind of namespacing so that we can easily separate user-defined parameters from those created internally.

This could mirror the "sub-names" for the nodes? I hadn't anticipated a schema parameter creating many core parameters.

At least the first of these is probably worth doing in another PR?

Probably all of them can be addressed with additional updates?

Batch21 · 2024-03-07T21:29:10Z

@jetuk this is ready for another look. All your original comments should be addressed in some fashion.

One update that's probably necessary is adding the parameters_mut method to all the nodes. However, I wanted to check first if you though there was a better approach?

jetuk · 2024-03-08T21:24:14Z

I think there are other approaches, but they will delay this PR further.

Two ideas:

Remove the manually writing of parameters and parameters_mut using a derive macro (see v1 schema).
Move to a builder pattern for mutating nodes (I'm not sure about this).
Use a visitor pattern instead. Here there would be a visit or visit_mut method which you would give a closure that's called with each field or parameter. I am more used to (1), but thought about this after reading some of the tracing stuff.

So, probably (1) would just remove this boiler plate. I guess it might be more efficient to do (1) in a separate PR and then update this one? Assuming we can do that in the short-term to not delay this further.

Batch21 · 2024-03-09T11:31:13Z

I forgot that the proc macros you created in the v1 schema did this. Looking at them, we should be able to copy them across without too many modifications. I can have a go in a separate PR this evening.

This supports using PyParameter as an index parameter with updated schema to define the return type of the Python method.

Restores some removed functions required for the highs solver. Adds highs feature to CI.

jetuk

This looks good. I've left a few, but minor, comments to improve docstrings and tidy things up etc.

I think this might need a cargo fmt too.

jetuk · 2024-03-21T12:29:46Z

pywr-core/src/timestep.rs

@@ -58,6 +58,10 @@ impl PywrDuration {
    pub fn fractional_days(&self) -> f64 {
        self.0.num_seconds() as f64 / SECS_IN_DAY as f64
    }
+


Add docstring. Would be good to make the one above a proper docstring as well.

jetuk · 2024-03-21T12:34:38Z

pywr-schema/src/model.rs

@@ -412,8 +445,9 @@ impl PywrModel {
            Timestepper::default()
        });

-        let nodes = v1
+        let nodes_and_ts: Vec<NodeAndTimeseries> = v1


It might be worthwhile adding some comments here about this process.

jetuk · 2024-03-21T12:40:48Z

pywr-schema/src/model.rs

+            (None, None)
+        };
+
+        ts_data.extend(param_ts_data.into_iter().flatten());


I think you can move this up into the above if let Some(v1_parameters) ... block to avoid wrapping in an Option?

jetuk · 2024-03-21T12:49:56Z

pywr-schema/src/timeseries/align_and_resample.rs

+
+    if durations.len() > 1 {
+        // Non-uniform timestep are not yet supported
+        todo!();


Put the comment in the todo!() macro.

jetuk · 2024-03-21T12:50:37Z

pywr-schema/src/timeseries/align_and_resample.rs

+        .time()
+        .step_duration()
+        .whole_nanoseconds()
+        .expect("Nano seconds could not be extracted from model step duration");


Should this be a TimesreiesError variant?

jetuk · 2024-03-21T12:57:55Z

pywr-schema/src/model.rs

 use crate::outputs::Output;
-use crate::parameters::{MetricFloatReference, TryIntoV2Parameter};
+use crate::parameters::{


Some unused imports to resolve here.

jetuk · 2024-03-21T12:58:09Z

pywr-schema/src/parameters/control_curves.rs

@@ -5,6 +5,7 @@ use crate::nodes::NodeAttribute;
 use crate::parameters::{
    DynamicFloatValue, IntoV2Parameter, NodeReference, ParameterMeta, TryFromV1Parameter, TryIntoV2Parameter,
 };
+use crate::timeseries::{self, LoadedTimeseriesCollection};


Unused import self

jetuk · 2024-03-21T12:58:46Z

pywr-schema/src/parameters/offset.rs

@@ -1,5 +1,6 @@
 use crate::data_tables::LoadedTableCollection;
-use crate::parameters::{ConstantValue, DynamicFloatValue, DynamicFloatValueType, ParameterMeta};
+use crate::parameters::{ConstantValue, DynamicFloatValue, DynamicFloatValueType, ParameterMeta, VariableSettings};


Unused import VariableSettings

jetuk · 2024-03-21T12:59:03Z

pywr-schema/src/timeseries/align_and_resample.rs

@@ -0,0 +1,245 @@
+use chrono::TimeDelta;


Unused import

jetuk · 2024-03-21T12:59:21Z

pywr-schema/src/timeseries/mod.rs

+use pywr_core::parameters::{Array1Parameter, Array2Parameter, ParameterIndex};
+use pywr_core::PywrError;
+use pywr_v1_schema::tables::TableVec;
+use std::sync::Arc;


Unused import

Batch21 · 2024-03-23T11:25:30Z

@jetuk thanks for the review. I think I've addressed all your comments apart from the one I reply to above.

…ries

jetuk · 2024-03-24T13:37:19Z

Should we remove DataframeParameter in this PR? Or mark it deprecated in some fashion?

Batch21 · 2024-03-24T19:57:35Z

Should we remove DataframeParameter in this PR? Or mark it deprecated in some fashion?

yeah, I did think of doing this earlier on. At this stage of development it could probably just be removed?

jetuk · 2024-03-25T18:27:33Z

@Batch21 Are you finished with this now?

Batch21 · 2024-03-25T18:42:07Z

@Batch21 Are you finished with this now?

Yes, though I did spot another issue when removing the df parameter. The polars timeseries backend does not load gzip files, which will be needed when the test that loads the simple-wasm test model is enabled. I can add an issue for this.

WIP: feat: add timeseries section to model schema

32b5e5e

jetuk reviewed Feb 7, 2024

View reviewed changes

add a align_and_resample timeseries mod

2b308ec

Batch21 added 3 commits February 29, 2024 10:11

Merge branch 'main' into schema-timeseries

e5c6822

fix: check if param exists before creating param from timeseries input

1b5c5b1

create seperate timeseries error enum

72fe90e

jetuk mentioned this pull request Mar 6, 2024

Add Turbine and HydropowerDam nodes #107

Open

Batch21 added 2 commits March 7, 2024 00:10

wip: conversion of v1 dataframe parameters to timeseries inputs

5320983

update timeseries url in test model + remove dbg statement

1300e77

jetuk mentioned this pull request Mar 11, 2024

feat: add pywr-schema-macro crate #136

Merged

jetuk and others added 8 commits March 11, 2024 13:53

feat: Implement Parameter<usize> for PyParameter.

6b29780

This supports using PyParameter as an index parameter with updated schema to define the return type of the Python method.

Merge branch 'main' into schema-timeseries

da16f71

Merge branch 'py-index-param' into schema-timeseries

d7d011c

fix: Fix and upgrade highs_sys to v1.6.2

483b17e

Restores some removed functions required for the highs solver. Adds highs feature to CI.

Merge branch 'main' into schema-timeseries

413a6be

Merge remote-tracking branch 'origin/fix-highs' into schema-timeseries

e9805f4

fix: conversion of inline dataframe parameter to timeseries

4040da2

Merge branch 'main' into schema-timeseries

bbe8187

jetuk requested changes Mar 21, 2024

View reviewed changes

response to review comments

38f16d4

fix: use index_col for time_col when converting v1 df param to timese…

a021b2b

…ries

Merge branch 'main' into schema-timeseries

7de6996

remove dataframe parameter from schema and update test models

4150d54

jetuk approved these changes Mar 26, 2024

View reviewed changes

jetuk changed the title ~~WIP: feat: add timeseries section to model schema~~ feat: add timeseries section to model schema Mar 26, 2024

jetuk merged commit 2e38724 into main Mar 26, 2024
18 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add timeseries section to model schema #99

feat: add timeseries section to model schema #99

Batch21 commented Feb 7, 2024

Batch21 commented Feb 7, 2024

jetuk left a comment

Batch21 commented Feb 11, 2024

jetuk commented Feb 12, 2024

Batch21 commented Mar 3, 2024

jetuk commented Mar 3, 2024

Batch21 commented Mar 7, 2024

jetuk commented Mar 8, 2024

Batch21 commented Mar 9, 2024

jetuk left a comment

jetuk Mar 21, 2024

jetuk Mar 21, 2024

jetuk Mar 21, 2024

jetuk Mar 21, 2024

jetuk Mar 21, 2024

jetuk Mar 21, 2024

jetuk Mar 21, 2024

jetuk Mar 21, 2024

jetuk Mar 21, 2024

jetuk Mar 21, 2024

Batch21 commented Mar 23, 2024

jetuk commented Mar 24, 2024

Batch21 commented Mar 24, 2024

jetuk commented Mar 25, 2024

Batch21 commented Mar 25, 2024

feat: add timeseries section to model schema #99

feat: add timeseries section to model schema #99

Conversation

Batch21 commented Feb 7, 2024

Batch21 commented Feb 7, 2024

jetuk left a comment

Choose a reason for hiding this comment

Batch21 commented Feb 11, 2024

jetuk commented Feb 12, 2024

Batch21 commented Mar 3, 2024

jetuk commented Mar 3, 2024

Batch21 commented Mar 7, 2024

jetuk commented Mar 8, 2024

Batch21 commented Mar 9, 2024

jetuk left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Batch21 commented Mar 23, 2024

jetuk commented Mar 24, 2024

Batch21 commented Mar 24, 2024

jetuk commented Mar 25, 2024

Batch21 commented Mar 25, 2024