Implementation of Community object init and factory methods. #282

davidorme · 2024-09-12T15:59:27Z

Description

This PR brings in a Community object based on @AmyOctoCat's work in PR #230. From the features listed in #278, we have a (mostly) complete implementation with some elements postponed for later PRs:

provide something close to a final definition of the Community building on @AmyOctoCat's work and review by @davidorme,

Note

The Community object takes cohort data as arrays, merges in PFT traits for those arrays and then calculates the core predictions of the T Model for the initialisation data.
I've switched to using a pandas.DataFrame for storing cohort data. I think that reverses something I told @AmyOctoCat (sorry!). Although there is overhead in using a DataFrame rather than simple arrays as attributes, it simplifies the namespace of the class a lot, and it is much cleaner to add and delete cohorts as rows than to iterate over individual attributes.

methods to create instances from data in files - and also handle multiple communities in files.

Note

This does not implement loading multiple Community objects yet - might as well sense check the single case before doing that!
The serialisation formats for CSV, JSON and TOML are revised
Factory methods are now implemented to load from those formats.
Marshmallow schemas are used to validate and post-process all of those formats into the format needed to initialise a Community object.

test that code.

Note

It currently tests the T Model functions code rather lightly and it would be better to have deeper regression tests against the R implementation. I'd like to break that out into a new PR as complete tests would include some elements missing from the T model functions at the moment and I don't want to bloat this basic Community PR.

provide API docs

This PR also updates Flora to provide a pandas.DataFrame view of the flora data, which makes it super easy to merge onto the cohort data.
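As a rough illustration of why the DataFrame representation is convenient, here is a minimal sketch of the row-wise add/delete and trait merge described above. It is not the code in this PR: the column names (`name`, `pft_names`, `dbh_values`, `n_individuals`) and the `h_max` trait are assumptions chosen for the example.

```python
import pandas as pd

# Hypothetical flora view: one row per PFT, one column per trait.
flora_data = pd.DataFrame(
    {"name": ["broadleaf", "conifer"], "h_max": [30.0, 20.0]}
)

# Hypothetical cohort data: one row per cohort.
cohorts = pd.DataFrame(
    {
        "pft_names": ["broadleaf", "conifer"],
        "dbh_values": [0.2, 0.25],
        "n_individuals": [6, 0],
    }
)

# Adding a cohort is a row append...
recruits = pd.DataFrame(
    {"pft_names": ["conifer"], "dbh_values": [0.01], "n_individuals": [50]}
)
cohorts = pd.concat([cohorts, recruits], ignore_index=True)

# ...and deleting cohorts is a row filter, with no per-attribute bookkeeping.
cohorts = cohorts[cohorts["n_individuals"] > 0].reset_index(drop=True)

# Merge PFT traits onto each cohort. A left merge keeps one row per cohort,
# and validate="many_to_one" raises if a PFT name is duplicated in the flora.
with_traits = cohorts.merge(
    flora_data,
    how="left",
    left_on="pft_names",
    right_on="name",
    validate="many_to_one",
)
```

A cohort whose PFT name is missing from the flora would end up with missing trait values after a left merge, so names still need validating separately before merging.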

There's some complexity in the serialisation formats and schemas:

the Community object needs single values for cell_area and cell_id but arrays for the cohort data.
CSV is great for rows of cohort data, but isn't structured for the single values, so I've imposed those as needing to be constant fields in the CSV. That is easy for users to maintain and feeds forward into having multiple cells in a file.
JSON/TOML allow structured data, but editing arrays of cohort data is ugly and confusing because the values are not neatly aligned in grids. So here, I've made the cohort data a list of objects, which is much easier to maintain in these formats.

The validation schemas take these two slightly different approaches, validate them and then coerce to the common arguments used for initialisation.
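To make the two layouts concrete, here is a small sketch of the idea using only the standard library rather than marshmallow. The field names other than `cell_id` and `cell_area`, and the helper functions, are hypothetical and chosen for illustration. It shows a CSV with constant cell fields and a JSON document with a list of cohort objects both being coerced to the same set of initialisation arguments.

```python
import csv
import io
import json

# CSV layout: one row per cohort, with cell_id and cell_area repeated as
# constant fields (column names here are assumptions for the example).
CSV_TEXT = """cell_id,cell_area,cohort_pft_names,cohort_dbh_values,cohort_n_individuals
1,100,broadleaf,0.2,6
1,100,conifer,0.25,6
"""

# JSON layout: single cell values plus a list of cohort objects.
JSON_TEXT = """
{
  "cell_id": 1,
  "cell_area": 100,
  "cohorts": [
    {"pft_name": "broadleaf", "dbh_value": 0.2, "n_individuals": 6},
    {"pft_name": "conifer", "dbh_value": 0.25, "n_individuals": 6}
  ]
}
"""


def _constant(rows, field):
    """Return the single value of a field, failing if it varies across rows."""
    values = {row[field] for row in rows}
    if len(values) != 1:
        raise ValueError(f"Field {field!r} must be constant, found {sorted(values)}")
    return values.pop()


def args_from_csv(text):
    """Coerce the row-wise CSV layout to the common argument dictionary."""
    rows = list(csv.DictReader(io.StringIO(text)))
    return {
        "cell_id": int(_constant(rows, "cell_id")),
        "cell_area": float(_constant(rows, "cell_area")),
        "cohort_pft_names": [r["cohort_pft_names"] for r in rows],
        "cohort_dbh_values": [float(r["cohort_dbh_values"]) for r in rows],
        "cohort_n_individuals": [int(r["cohort_n_individuals"]) for r in rows],
    }


def args_from_json(text):
    """Coerce the list-of-objects JSON layout to the same argument dictionary."""
    data = json.loads(text)
    cohorts = data["cohorts"]
    return {
        "cell_id": int(data["cell_id"]),
        "cell_area": float(data["cell_area"]),
        "cohort_pft_names": [c["pft_name"] for c in cohorts],
        "cohort_dbh_values": [float(c["dbh_value"]) for c in cohorts],
        "cohort_n_individuals": [int(c["n_individuals"]) for c in cohorts],
    }
```

Both functions return the same dictionary for these inputs, which is the point of the design: the file formats can differ to suit how people edit them, while initialisation sees a single set of arguments.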

Fixes #278

Type of change

New feature (non-breaking change which adds functionality)
Optimization (back-end change that speeds up the code)
Bug fix (non-breaking change which fixes an issue)

Key checklist

Make sure you've run the pre-commit checks: $ pre-commit run -a
All tests pass: $ poetry run pytest

Further checks

Code is commented, particularly in hard-to-understand areas
Tests added that prove fix is effective or that feature works

codecov-commenter · 2024-09-12T16:05:41Z

Codecov Report

Attention: Patch coverage is 93.82716% with 10 lines in your changes missing coverage. Please review.

Project coverage is 95.13%. Comparing base (1f315ba) to head (89e5670).
Report is 48 commits behind head on develop.

Files with missing lines	Patch %	Lines
pyrealm/demography/community.py	93.54%	8 Missing ⚠️
pyrealm/demography/t_model_functions.py	92.30%	2 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##           develop     #282      +/-   ##
===========================================
- Coverage    95.29%   95.13%   -0.16%     
===========================================
  Files           28       32       +4     
  Lines         1720     2077     +357     
===========================================
+ Hits          1639     1976     +337     
- Misses          81      101      +20

davidorme · 2024-09-13T07:54:46Z

There is still a doc failure - sphinx is failing to link to pandas DataFrame from the flora.data attribute. It is linking other refs to pandas.DataFrame but not here. No idea why.

We could just add this to the growing pile of noise in our nitpick_ignore section in the sphinx conf.py but I think there's something wrong in the setup - we surely shouldn't be getting this much trouble from intersphinx and numpy?

j-emberton · 2024-09-13T13:33:09Z

I've had a good poke around this code and it looks ok to my eye. I can't claim to be completely abreast of the science, but I like the fact you're rolling out this 'schema' approach that flows from what we discussed for the prior flora PR. The approach seems sensible.

I did a quick check and there is a GPU drop-in replacement for pandas (cuDF). Hopefully this means all of this is compatible with the array API work I've been talking about.

Happy to approve once the docs are fixed

j-emberton · 2024-09-13T13:34:47Z

There is still a doc failure - sphinx is failing to link to pandas DataFrame from the flora.data attribute. It is linking other refs to pandas.DataFrame but not here. No idea why.

We could just add this to the growing pile of noise in our nitpick_ignore section in the sphinx conf.py but I think there's something wrong in the setup - we surely shouldn't be getting this much trouble from intersphinx and numpy?

I haven't had time to dig into this issue at all I'm afraid. I'd hope we can fix it rather than ignore.

davidorme · 2024-09-16T09:22:35Z

@j-emberton Thanks for the worked example and other requests. Should all be good now.

I have just nitpick-ignored the intersphinx issue with pandas. I'd much rather solve it properly, but this kind of thing always seems to turn into a huge time sink and I don't have the time now.
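For anyone picking this up later, the workaround is of roughly this shape in the Sphinx conf.py. This is a sketch rather than the exact change made here: `intersphinx_mapping`, `nitpicky` and `nitpick_ignore` are standard Sphinx options, but the target string in the ignore entry is an assumption about which reference fails to resolve.

```python
# Sketch of the relevant parts of a Sphinx conf.py.
extensions = ["sphinx.ext.autodoc", "sphinx.ext.intersphinx"]

# Inventories used to resolve cross-references to external projects.
intersphinx_mapping = {
    "pandas": ("https://pandas.pydata.org/docs/", None),
    "numpy": ("https://numpy.org/doc/stable/", None),
}

# Nitpicky mode warns about every unresolved reference.
nitpicky = True

# Each entry is a (reference type, target) pair to silence.
# The target below is a guess at the unresolved reference, not the real one.
nitpick_ignore = [
    ("py:class", "pandas.core.frame.DataFrame"),
]
```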

j-emberton

Following discussion and changes, happy to review

davidorme added 18 commits September 6, 2024 11:04

Initial import of @AmyOctoCat files for this subset from #231

6e1856c

Firming up example proposed input formats for community

58458ab

Amalgamating T model function modules

484d4e5

Initial refactor to dataclass

1856b9d

Adding array structured PFT data to Flora

a0e6a40

Validating pft names in cohorts in Community __init__

2280d69

Merge branch 'develop' into 278-implementation-of-community-object

ba2e004

Use pandas to represent flora data as arrays

f081777

Implementing community as a pandas dataframe of cohort data

a3b9689

Updating T Model functions and populating T model predictions in cohort data

08922fa

Moving q_m and p_zm calculations back into t model functions module from flora

f4fe6a7

Update T model and fix broken tests

f1750a5

T Model docstring updates

7305b47

Explicit testing of exception messages in community data validation

2087367

Implementation of Community.from_csv and testing

7fb901f

Firming up community serialisation formats, more direct factory methods with different marshmallow schemas

c6a186e

Removing old marshmallow schema, firming up validation and testing

a5f39dd

Simplified and aligned test inputs, added meaningful test of correct behaviour

0b7fa05

davidorme linked an issue Sep 12, 2024 that may be closed by this pull request

Implementation of Community object #278

Closed

davidorme requested review from MarionBWeinzierl and j-emberton September 12, 2024 16:01

davidorme added this to the Demography and allocation model milestone Sep 12, 2024

davidorme added 2 commits September 12, 2024 19:18

Extending intersphinx details and whacking docs issues - some not fixed

89a6a21

Docstring updates

81088e0

davidorme added 4 commits September 13, 2024 09:06

Forgot to merge in @AmyOctoCat's tests for the t model functionality

5df492d

Fixed incorrect yield value in TModelConst and updated regression tests

2032adc

Merge branch '281-yield-incorrect-in-tmodel' into 278-implementation-of-community-object

be40279

Merging in @AmyOctoCat's t model function tests

2dea74a

davidorme mentioned this pull request Sep 13, 2024

Complete implementation of T Model functions in demography module and test better. #284

Closed

j-emberton reviewed Sep 13, 2024

pyrealm/demography/community.py (resolved)

j-emberton reviewed Sep 13, 2024

pyrealm/demography/community.py (outdated, resolved)

j-emberton reviewed Sep 13, 2024

pyrealm/demography/community.py (resolved)

davidorme and others added 2 commits September 16, 2024 09:23

Worked example of community use.

6cf5c1b

Co-authored-by: James Emberton <[email protected]>

Updates from @j-emberton review

89e5670

j-emberton self-requested a review September 16, 2024 09:40

j-emberton approved these changes Sep 16, 2024

davidorme merged commit ce78db1 into develop Sep 16, 2024
12 checks passed

davidorme deleted the 278-implementation-of-community-object branch September 16, 2024 10:12

This was referenced Sep 16, 2024

Implementation of the Canopy model #286

Closed

Replace use of pandas in pyrealm.demography #292

Merged

Add helper functions for PFT geometry and canopy shape. #290

Closed
