Add helper functions for PFT geometry and canopy shape. #290

davidorme · 2024-09-20T15:15:40Z

We need some neat wrappers to make it easier to visualise PFT scaling without having to go through a Community object. This is partly to make the docs easier, but also to allow users to visualise the effects of the PFT settings. We need:

A geometry method or class that takes a PFT and an array of diameter at breast heights (DBH) or stem heights and generates all the other predictions of the T model, for use in plotting allometries and scaling.
A similar canopy method or class that takes a PFT, a stem height / DBH value and generates the vertical canopy data for plotting.

davidorme · 2024-09-20T17:51:52Z

OK - been playing around with idea on this, and I think a nice API for users on this issue is to extend Flora:

class Flora():
    ...
    def get_allometries(dbh: NDArray) -> dict[str, NDArray]:
        """Generate T Model predictions for PFTs in Flora for a set of DBH values."""
        ...
    def get_canopy_shape(z: NDArray) -> dict[str, NDArray]:
        """Generate canopy shape predictions for PFTs in Flora for a set of vertical height values."""
        ...

Those aren't methods of PlantFunctionalType as described above but immediately makes predictions across a set of PFTs that a user is working with, which seems like 90% of the use cases. If someone genuinely does only want one PFT, then creating a single PFT flora is pretty low cost.

If the return values in the dictionary are a set of (n_dbh, n_pfts) arrays for different predictions then it is also a very clean plotting structure for matplotlib. A usage sketch would be:

import numpy as np
from matplotlib import pyplot as plt
from pyrealm.demography.flora import PlantFunctionalType, Flora

# Create flora
pft1 = PlantFunctionalType(name='pft1', h_max=20)
pft2 = PlantFunctionalType(name='pft2', h_max=30)
flora = Flora([pft1, pft2])

# Get allometries for DBH values from 0 to 1 m in 1cm increments
dbh = np.linspace(0, 1, num=101) 
allometries = flora.get_allometries(dbh=dbh)

# Plot stem_height vs dbh for both PFTs in one go.
plt(dbh, allometries['stem_height'], label = flora.pft_names)

I like that a lot as an API but it immediately provides another angle on why pandas.DataFrame is the wrong choice for Flora.data and pd.Series is unhelpful for the arguments in the t_model_functions.py functions . If we have the PFT data as numpy arrays and the T model functions as pure numpy, then the get_allometry method should be able to turn dbh into a column vector and calculate across all PFTs by broadcasting traits as row vectors and size values as column vectors. For example, this works and results in a (100, 2) array of stem heights.

def calculate_heights(h_max: NDArray, a_hd: NDArray, dbh: NDArray) -> NDArray:
   
    return h_max * (1 - np.exp(-a_hd * dbh / h_max))

dbh = np.linspace(0, 1, num=101)
h_max = np.array([20,30])
a_hd = np.array([116.0,116.0])

h = calculate_heights(h_max, a_hd, dbh[:, None]) # turn DBH into a column vector.

I think that restructure would be make the functions more flexible and general. I was going to concentrate on expanding functionality through this issue and #289 before thinking about replacing pandas, but I think that API sketch and lot of the canopy code would be far cleaner if I fixed the internal data structures first.

@omarjamil, @MarionBWeinzierl and @j-emberton Thoughts?

j-emberton · 2024-09-21T10:11:33Z

OK - been playing around with idea on this, and I think a nice API for users on this issue is to extend Flora:
class Flora():
    ...
    def get_allometries(dbh: NDArray) -> dict[str, NDArray]:
        """Generate T Model predictions for PFTs in Flora for a set of DBH values."""
        ...
    def get_canopy_shape(z: NDArray) -> dict[str, NDArray]:
        """Generate canopy shape predictions for PFTs in Flora for a set of vertical height values."""
        ...
Those aren't methods of PlantFunctionalType as described above but immediately makes predictions across a set of PFTs that a user is working with, which seems like 90% of the use cases. If someone genuinely does only want one PFT, then creating a single PFT flora is pretty low cost.

If the return values in the dictionary are a set of (n_dbh, n_pfts) arrays for different predictions then it is also a very clean plotting structure for matplotlib. A usage sketch would be:
import numpy as np
from matplotlib import pyplot as plt
from pyrealm.demography.flora import PlantFunctionalType, Flora

# Create flora
pft1 = PlantFunctionalType(name='pft1', h_max=20)
pft2 = PlantFunctionalType(name='pft2', h_max=30)
flora = Flora([pft1, pft2])

# Get allometries for DBH values from 0 to 1 m in 1cm increments
dbh = np.linspace(0, 1, num=101) 
allometries = flora.get_allometries(dbh=dbh)

# Plot stem_height vs dbh for both PFTs in one go.
plt(dbh, allometries['stem_height'], label = flora.pft_names)
I like that a lot as an API but it immediately provides another angle on why pandas.DataFrame is the wrong choice for Flora.data and pd.Series is unhelpful for the arguments in the t_model_functions.py functions . If we have the PFT data as numpy arrays and the T model functions as pure numpy, then the get_allometry method should be able to turn dbh into a column vector and calculate across all PFTs by broadcasting traits as row vectors and size values as column vectors. For example, this works and results in a (100, 2) array of stem heights.
def calculate_heights(h_max: NDArray, a_hd: NDArray, dbh: NDArray) -> NDArray:
   
    return h_max * (1 - np.exp(-a_hd * dbh / h_max))

dbh = np.linspace(0, 1, num=101)
h_max = np.array([20,30])
a_hd = np.array([116.0,116.0])

h = calculate_heights(h_max, a_hd, dbh[:, None]) # turn DBH into a column vector.
I think that restructure would be make the functions more flexible and general. I was going to concentrate on expanding functionality through this issue and #289 before thinking about replacing pandas, but I think that API sketch and lot of the canopy code would be far cleaner if I fixed the internal data structures first.

@omarjamil, @MarionBWeinzierl and @j-emberton Thoughts?

I like this approach, but I would change the implementation slightly so that the end user doesn't need to worry about transposing the array.

I would prefer that we handle the transposition inside (perhaps using a generic shape test and conditional transposition helper function) to keep the interface cleaner. Alongside, for this type of broadcasting we might want to enforce checks at runtime to make sure the arrays are broadcastable.

Its slightly messy to try and type hint the appropriate relative shapes for all the inputs, so easier just to do it at runtime.

But this is secondary to the overall approach which I think is nice.

davidorme · 2024-09-21T10:58:27Z

I like this approach, but I would change the implementation slightly so that the end user doesn't need to worry about transposing the array.

I would prefer that we handle the transposition inside (perhaps using a generic shape test and conditional transposition helper function) to keep the interface cleaner. Alongside, for this type of broadcasting we might want to enforce checks at runtime to make sure the arrays are broadcastable.

I'd agree completely except for the use case where someone wants e.g. a 6x6 output with 6 heights for each of 6 PFTs in a Flora. If the broadcasting is internal, then we can't tell the user intention from the input shapes alone, so we would have to add an intended output shape argument (out_2d: bool = False) to every function. That's not a huge deal, but it feels cleaner to use the shapes themselves to show intention and clearly document the usage?

And I don't think I'd bother type hinting the shapes - or at least, we've not done it anywhere else, so handling shape checking in a validation function like the ones in #288 seems ok to me.

omarjamil · 2024-09-23T10:02:06Z

This seems like a reasonable approach to me as well. On a broader point, I agree that we should leave the transposition to the user. As long the functions are consistent in how the input arrays are arranged and what is returned, the rest should be left to the user. This would be the easiest to understand and debug.

davidorme · 2024-09-23T12:40:06Z

Thanks @omarjamil and @j-emberton . So my plan now is to:

Remove the use of pandas for what is now in develop (Replace use of pandas in pyrealm.demography #292)
Update the in progress PR for the Canopy model (Implementation of the canopy model #288 ) to match
Then come back here and implement the sketch above.

davidorme added this to the Demography and allocation model milestone Sep 20, 2024

davidorme mentioned this issue Sep 20, 2024

Look at replacing pandas in pyrealm.demography #291

Closed

davidorme mentioned this issue Sep 21, 2024

Replace use of pandas in pyrealm.demography #292

Merged

7 tasks

davidorme mentioned this issue Sep 25, 2024

Add Flora helper methods #296

Merged

10 tasks

davidorme linked a pull request Sep 25, 2024 that will close this issue

Add Flora helper methods #296

Merged

10 tasks

davidorme closed this as completed in #296 Sep 27, 2024

davidorme mentioned this issue Sep 27, 2024

Update traits data structures #298

Closed

j-emberton mentioned this issue Oct 10, 2024

[Meta] Demography and allocation #321

Open

25 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add helper functions for PFT geometry and canopy shape. #290

Add helper functions for PFT geometry and canopy shape. #290

davidorme commented Sep 20, 2024

davidorme commented Sep 20, 2024

j-emberton commented Sep 21, 2024

davidorme commented Sep 21, 2024 •

edited

Loading

omarjamil commented Sep 23, 2024

davidorme commented Sep 23, 2024 •

edited

Loading

Add helper functions for PFT geometry and canopy shape. #290

Add helper functions for PFT geometry and canopy shape. #290

Comments

davidorme commented Sep 20, 2024

davidorme commented Sep 20, 2024

j-emberton commented Sep 21, 2024

davidorme commented Sep 21, 2024 • edited Loading

omarjamil commented Sep 23, 2024

davidorme commented Sep 23, 2024 • edited Loading

davidorme commented Sep 21, 2024 •

edited

Loading

davidorme commented Sep 23, 2024 •

edited

Loading