diff --git a/CHANGELOG.md b/CHANGELOG.md index 46a11ac..a582c98 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -36,6 +36,8 @@ Keep it human-readable, your future self will thank you! - feat: Add CONTRIBUTORS.md file. (#72) - Fixed issue when computing area weights with scipy.Voronoi. (#79) +- Create package documentation. + ### Changed - ci: small fixes and updates pre-commit, downsteam-ci (#49) diff --git a/docs/cli/inspect.rst b/docs/cli/inspect.rst index 394d529..27227c8 100644 --- a/docs/cli/inspect.rst +++ b/docs/cli/inspect.rst @@ -4,7 +4,8 @@ inspect ======== -Use this command to inspect a graph stored in your filesystem. +Use this command to inspect a graph stored in your filesystem. A set of interactive and static visualisations +are generated to allow visual inspection of the graph design. The syntax of the recipe file is described in :doc:`building graphs <../graphs/introduction>`. diff --git a/docs/graphs/edge_attributes.rst b/docs/graphs/edge_attributes.rst index c7d713b..1c99149 100644 --- a/docs/graphs/edge_attributes.rst +++ b/docs/graphs/edge_attributes.rst @@ -20,8 +20,9 @@ coordinates. .. code:: yaml edges: - - ... - edge_builder: ... + - source_name: ... + target_name: ... + edge_builders: ... attributes: edge_length: _target_: anemoi.graphs.edges.attributes.EdgeLength @@ -37,8 +38,9 @@ latitude and longitude coordinates of the source and target nodes. .. code:: yaml edges: - - ... - edge_builder: ... + - source_name: ... + target_name: ... + edge_builders: ... attributes: edge_length: _target_: anemoi.graphs.edges.attributes.EdgeDirection diff --git a/docs/graphs/edges.rst b/docs/graphs/edges.rst index 57aebd2..2d40ffa 100644 --- a/docs/graphs/edges.rst +++ b/docs/graphs/edges.rst @@ -14,8 +14,8 @@ for each (`source name`, `target name`) pair specified. 
edges: - source_name: data target_name: hidden - edge_builder: - _target_: anemoi.graphs.edges.CutOff + edge_builders: + - _target_: anemoi.graphs.edges.CutOff cutoff_factor: 0.7 Below are the available methods for defining the edges: @@ -26,3 +26,19 @@ Below are the available methods for defining the edges: edges/cutoff edges/knn edges/multi_scale + +Additionally, there are 2 extra arguments (``source_mask_attr_name`` and +``target_mask_attr_name``) that can be used in the edge configuration to +mask source and/or target nodes. This can be useful to different use +cases, such as Limited Area Modeling (LAM) where your decoder edges +should only connect to the nodes in the limited area. + +.. code:: yaml + + edges: + - source_name: hidden + target_name: data + edge_builders: + - _target_: anemoi.graphs.edges.KNNEdges + num_nearest_neighbours: 5 + target_mask_attr_name: cutout diff --git a/docs/graphs/edges/cutoff.rst b/docs/graphs/edges/cutoff.rst index ba999e6..56c3774 100644 --- a/docs/graphs/edges/cutoff.rst +++ b/docs/graphs/edges/cutoff.rst @@ -39,13 +39,13 @@ YAML configuration: edges: - source_name: source target_name: destination - edge_builder: - _target_: anemoi.graphs.edges.CutOffEdges + edge_builders: + - _target_: anemoi.graphs.edges.CutOffEdges cutoff_factor: 0.6 .. note:: - The cut-off method is recommended for the encoder edges, to connect + The cut-off method is recommended for the encoder edge, to connect all data nodes to hidden nodes. The optimal ``cutoff_factor`` value will be the lowest value without orphan nodes. 
This optimal value depends on the node distribution, so it is recommended to tune it for diff --git a/docs/graphs/edges/knn.rst b/docs/graphs/edges/knn.rst index b90c048..e0dab3b 100644 --- a/docs/graphs/edges/knn.rst +++ b/docs/graphs/edges/knn.rst @@ -1,11 +1,11 @@ -##################### - K-Nearest Neighbors -##################### +###################### + K-Nearest Neighbours +###################### The knn method is a method for establishing connections between two sets of nodes. Given two sets of nodes, (`source`, `target`), the knn method -connects all destination nodes, to their ``num_nearest_neighbours`` -nearest source nodes. +connects all target nodes, to their ``num_nearest_neighbours`` nearest +source nodes. To use this method to build your connections, you can use the following YAML configuration: @@ -14,12 +14,12 @@ YAML configuration: edges: - source_name: source - target_name: destination - edge_builder: - _target_: anemoi.graphs.edges.KNNEdges + target_name: target + edge_builders: + - _target_: anemoi.graphs.edges.KNNEdges num_nearest_neighbours: 3 .. note:: - The knn method is recommended for the decoder edges, to connect all - data nodes with the surrounding hidden nodes. + The KNNEdges method is recommended for the decoder edges, to connect + all target nodes with the surrounding source nodes. diff --git a/docs/graphs/edges/multi_scale.rst b/docs/graphs/edges/multi_scale.rst index 7f5878a..0c876ec 100644 --- a/docs/graphs/edges/multi_scale.rst +++ b/docs/graphs/edges/multi_scale.rst @@ -5,7 +5,7 @@ The multi-scale connections can only be defined with the same source and target nodes. Edges of different scales are defined based on the refinement level of an icosahedron. The higher the refinement level, the -shorter the length of the edges. By default, all possible refinements +shorther the length of the edges. By default, all possible refinements levels are considered. 
To use this method to build your connections, you can use the following @@ -16,8 +16,8 @@ YAML configuration: edges: - source_name: source target_name: source - edge_builder: - _target_: anemoi.graphs.edges.MultiScaleEdges + edge_builders: + - _target_: anemoi.graphs.edges.MultiScaleEdges x_hops: 1 where `x_hops` is the number of hops between two nodes of the same @@ -28,8 +28,12 @@ refinement level to be considered neighbours, and then connected. This method is used by data-driven weather models like GraphCast to process the latent/hidden state. +.. csv-table:: Triangular refinements specifications (x_hops=1) + :file: ./tri_refined_edges.csv + :header-rows: 1 + .. warning:: - This connection method is only supported for building the connections + This connection method is only support for building the connections within a set of nodes defined with the ``TriNodes`` or ``HexNodes`` classes. diff --git a/docs/graphs/edges/tri_refined_edges.csv b/docs/graphs/edges/tri_refined_edges.csv new file mode 100644 index 0000000..a29e943 --- /dev/null +++ b/docs/graphs/edges/tri_refined_edges.csv @@ -0,0 +1,8 @@ +Refinement,Num Nodes,Num Edges,Num Multilevel Edges +0,12,60,60 +1,42,240,300 +2,162,960,1260 +3,642,3840,5100 +4,2562,15360,20460 +5,10242,61440,81900 +6,40962,245760,327660 diff --git a/docs/graphs/introduction.rst b/docs/graphs/introduction.rst index b0d6383..e91b397 100644 --- a/docs/graphs/introduction.rst +++ b/docs/graphs/introduction.rst @@ -65,6 +65,7 @@ following classes define different behaviour: - :doc:`node_coordinates/zarr_dataset` - :doc:`node_coordinates/npz_file` +- :doc:`node_coordinates/icon_mesh` - :doc:`node_coordinates/tri_refined_icosahedron` - :doc:`node_coordinates/hex_refined_icosahedron` - :doc:`node_coordinates/healpix` diff --git a/docs/graphs/node_attributes.rst b/docs/graphs/node_attributes.rst index 9856625..19195be 100644 --- a/docs/graphs/node_attributes.rst +++ b/docs/graphs/node_attributes.rst @@ -21,3 +21,12 @@ nodes. 
:maxdepth: 1 node_attributes/weights + node_attributes/zarr_attribute + +Additionally, different boolean operations have been implemented to +support more complex use cases: + +.. toctree:: + :maxdepth: 1 + + node_attributes/boolean_operations diff --git a/docs/graphs/node_attributes/boolean_operations.rst b/docs/graphs/node_attributes/boolean_operations.rst new file mode 100644 index 0000000..60d5f20 --- /dev/null +++ b/docs/graphs/node_attributes/boolean_operations.rst @@ -0,0 +1,12 @@ +#################### + Boolean operations +#################### + +_anemoi-graphs_ package implements a set of boolean opearations to +support these operations when defining node attributes. Below, an +attribute `mask` is computed as the intersection of two other masks, +that are generated as the non-missing values in 2 different variables in +a Zarr dataset. + +.. literalinclude:: ../yaml/attributes_boolean_operation.yaml + :language: yaml diff --git a/docs/graphs/node_attributes/weights.rst b/docs/graphs/node_attributes/weights.rst index b3cfccd..bf98640 100644 --- a/docs/graphs/node_attributes/weights.rst +++ b/docs/graphs/node_attributes/weights.rst @@ -2,7 +2,7 @@ Weights ######### -The `weights` are a node attribute useful for defining the importance of +The `weights` are node attributes useful for defining the importance of a node in the loss function. You can set the weights to follow an uniform distribution or to match the area associated with that node. diff --git a/docs/graphs/node_attributes/zarr_attribute.rst b/docs/graphs/node_attributes/zarr_attribute.rst new file mode 100644 index 0000000..c0fecac --- /dev/null +++ b/docs/graphs/node_attributes/zarr_attribute.rst @@ -0,0 +1,18 @@ +################### + From Zarr dataset +################### + +Zarr datasets are the standard format to define data nodes in +_anemoi-graphs_. The user can define node attributes based on a zarr +dataset variable. 
For example, the following recipe will define an +attribute `land_mask` based on the _lsm_ variable of the dataset. + +.. literalinclude:: ../yaml/attributes_nonmissingzarr.yaml + :language: yaml + +In addition, if an user is using "cutout" operation to build their +dataset, it may be helpful to create a `cutout_mask` to track the +provenance of the resulting nodes. An example is shown below: + +.. literalinclude:: ../yaml/attributes_cutout.yaml + :language: yaml diff --git a/docs/graphs/node_coordinates.rst b/docs/graphs/node_coordinates.rst index aff5b70..76973d2 100644 --- a/docs/graphs/node_coordinates.rst +++ b/docs/graphs/node_coordinates.rst @@ -24,6 +24,7 @@ a file: node_coordinates/zarr_dataset node_coordinates/npz_file + node_coordinates/icon_mesh or based on other algorithms. A commonn approach is to use an icosahedron to project the earth's surface, and refine it iteratively to diff --git a/docs/graphs/node_coordinates/healpix.rst b/docs/graphs/node_coordinates/healpix.rst index 2f5fc17..de071f5 100644 --- a/docs/graphs/node_coordinates/healpix.rst +++ b/docs/graphs/node_coordinates/healpix.rst @@ -10,7 +10,7 @@ to the number of refinements of the sphere. .. code:: yaml nodes: - data: + data: # name of the nodes node_builder: _target_: anemoi.graphs.nodes.HEALPixNodes resolution: 3 diff --git a/docs/graphs/node_coordinates/hex_refined_icosahedron.rst b/docs/graphs/node_coordinates/hex_refined_icosahedron.rst index 7f665cd..a39edfb 100644 --- a/docs/graphs/node_coordinates/hex_refined_icosahedron.rst +++ b/docs/graphs/node_coordinates/hex_refined_icosahedron.rst @@ -4,32 +4,56 @@ This method allows us to define the nodes based on the Hexagonal Hierarchical Geospatial Indexing System, which uses hexagons to divide -the sphere. Each refinement level divides each hexagon into seven -smaller hexagons. +the sphere. With each refinement, each hexagon into seven smaller +hexagons. 
To define the `node coordinates` based on the hexagonal refinements of an icosahedron, you can use the following YAML configuration: +*************** + Global graphs +*************** + +The class `HexNodes` allows us to define the nodes over the entire +globe. + .. code:: yaml nodes: - data: + hidden: # name of the nodes node_builder: _target_: anemoi.graphs.nodes.HexNodes resolution: 4 attributes: ... -where resolution is the number of refinements to be applied. +where `resolution` is the number of refinements to be applied. + +********************* + Limited Area graphs +********************* + +The class `LimitedAreaHexNodes` allows us to define the nodes only for a +specific area of interest. + +.. code:: yaml + + nodes: + hidden: # name of the nodes + node_builder: + _target_: anemoi.graphs.nodes.LimitedAreaHexNodes + resolution: 4 + reference_node_name: nodes_name + mask_attr_name: mask_name # optional + margin_radius_km: 100 # optional + attributes: ... + +where `reference_node_name` is the name of the nodes to define the area +of interest. .. csv-table:: Hexagonal Hierarchical refinements specifications :file: ./hex_refined.csv :header-rows: 1 -Note that the refinement level is the parameter used to control the -resolution of the nodes, but the resolution also depends on the -refinement method. Then, for the same refinement level, ``HexNodes`` -will have a higher resolution than ``TriNodes``. - .. 
warning:: This class will require the `h3 `_ package to be diff --git a/docs/graphs/node_coordinates/icon_mesh.rst b/docs/graphs/node_coordinates/icon_mesh.rst index dc306ed..dbba0b6 100644 --- a/docs/graphs/node_coordinates/icon_mesh.rst +++ b/docs/graphs/node_coordinates/icon_mesh.rst @@ -47,7 +47,6 @@ following YAML example: node_builder: _target_: anemoi.graphs.nodes.ICONCellGridNodes icon_mesh: "icon_mesh" - attributes: ${graph.attributes.nodes} # Hidden nodes hidden: node_builder: @@ -56,9 +55,8 @@ following YAML example: edges: # Processor configuration - - source_name: ${graph.hidden} - target_name: ${graph.hidden} - edge_builder: - _target_: anemoi.graphs.edges.ICONTopologicalProcessorEdges + - source_name: "hidden" + target_name: "hidden" + edge_builders: + - _target_: anemoi.graphs.edges.ICONTopologicalProcessorEdges icon_mesh: "icon_mesh" - attributes: ${graph.attributes.edges} diff --git a/docs/graphs/node_coordinates/npz_file.rst b/docs/graphs/node_coordinates/npz_file.rst index 266687d..9b720db 100644 --- a/docs/graphs/node_coordinates/npz_file.rst +++ b/docs/graphs/node_coordinates/npz_file.rst @@ -8,13 +8,13 @@ following YAML configuration: .. code:: yaml nodes: - data: + data: # name of the nodes node_builder: _target_: anemoi.graphs.nodes.NPZFileNodes - grids_definition_path: /path/to/folder/with/grids/ + grid_definition_path: /path/to/folder/with/grids/ resolution: o48 -where `grids_definition_path` is the path to the folder containing the +where `grid_definition_path` is the path to the folder containing the grid definition files and `resolution` is the resolution of the grid to be used. diff --git a/docs/graphs/node_coordinates/tri_nodes.csv b/docs/graphs/node_coordinates/tri_nodes.csv new file mode 100644 index 0000000..360681b --- /dev/null +++ b/docs/graphs/node_coordinates/tri_nodes.csv @@ -0,0 +1,8 @@ +Refinement,Num Nodes,Num Faces,Avg. 
Area (sq km) +0,12,20,25503600 +1,42,80,6375900 +2,162,320,1593975 +3,642,1280,398494 +4,2562,5120,99623 +5,10242,20480,24905 +6,40962,81920,6226 diff --git a/docs/graphs/node_coordinates/tri_refined_icosahedron.rst b/docs/graphs/node_coordinates/tri_refined_icosahedron.rst index 44b3e44..2d0b8b6 100644 --- a/docs/graphs/node_coordinates/tri_refined_icosahedron.rst +++ b/docs/graphs/node_coordinates/tri_refined_icosahedron.rst @@ -3,27 +3,78 @@ ################################ This class allows us to define nodes based on iterative refinements of -an icoshaedron with triangles. +an icosahedron with triangles. To define the `node coordinates` based on icosahedral refinements of an -icosahedron, you can use the following YAML configuration: +icosahedron, you can use the following YAML configurations: + +*************** + Global graphs +*************** + +The class `TriNodes` allows us to define the nodes over the entire globe .. code:: yaml nodes: - data: + hidden: # name of the nodes node_builder: _target_: anemoi.graphs.nodes.TriNodes resolution: 4 attributes: ... -where resolution is the number of refinements to be applied to the +where `resolution` is the number of refinements to be applied to the icosahedron. -Note that the refinement level is the parameter used to control the -resolution of the nodes, but the resolution also depends on the -refinement method. Then, for the same refinement level, ``HexNodes`` -will have a higher resolution than ``TriNodes``. +********************* + Limited Area graphs +********************* + +The class `LimitedAreaTriNodes` allows us to define the nodes only for a +specific area of interest. + +.. code:: yaml + + nodes: + hidden: # name of the nodes + node_builder: + _target_: anemoi.graphs.nodes.LimitedAreaTriNodes + resolution: 4 + reference_node_name: nodes_name + mask_attr_name: mask_name # optional + margin_radius_km: 100 # optional + attributes: ... 
+ +where `reference_node_name` is the name of the nodes to define the area +of interest. These nodes must be defined in the recipe beforehand. + +***************** + Stretched graph +***************** + +The class `StretchedTriNodes` allows us to define the nodes with a +different resolution for inside and outside the area of interest. + +.. code:: yaml + + nodes: + hidden: # name of the nodes + node_builder: + _target_: anemoi.graphs.nodes.StretchedTriNodes + global_resolution: 3 + lam_resolution: 5 + reference_node_name: nodes_name + mask_attr_name: mask_name # optional + margin_radius_km: 100 # optional + attributes: ... + +where `resolution` argument is dropped divided into `global_resolution` +and `lam_resolution`, which are the number of refinements to be applied +to the icosahedron outside and inside the area of interest respectively. + +.. csv-table:: Triangular refinements specifications + :file: ./tri_nodes.csv + :header-rows: 1 .. warning:: diff --git a/docs/graphs/node_coordinates/zarr_dataset.rst b/docs/graphs/node_coordinates/zarr_dataset.rst index 3723c0e..a8d7203 100644 --- a/docs/graphs/node_coordinates/zarr_dataset.rst +++ b/docs/graphs/node_coordinates/zarr_dataset.rst @@ -13,27 +13,26 @@ the following YAML configuration: .. code:: yaml nodes: - data: + data: # name of the nodes node_builder: _target_: anemoi.graphs.nodes.ZarrDatasetNodes dataset: /path/to/dataset.zarr attributes: ... -where `dataset` is the path to the Zarr dataset. The -``ZarrDatasetNodes`` class supports operations compatible with -:ref:`anemoi-datasets `, such as "cutout". -Below, an example of how to use the "cutout" operation directly within -:ref:`anemoi-graphs `. +where `dataset` is the path to the Zarr dataset. + +The ``CutOutZarrDatasetNodes`` class supports 2 input datasets, one for +the LAM model and one for the boundary forcing. To define the `node +coordinates` combining multiple Zarr datasets, you can use the following +YAML configuration: .. 
code:: yaml nodes: - data: + data: # name of the nodes node_builder: - _target_: anemoi.graphs.nodes.ZarrDatasetNodes - dataset: - cutout: - dataset: /path/to/lam_dataset.zarr - dataset: /path/to/boundary_forcing.zarr - adjust: "all" + _target_: anemoi.graphs.nodes.CutOutZarrDatasetNodes + lam_dataset: /path/to/lam_dataset.zarr + forcing_dataset: /path/to/boundary_forcing.zarr + thinning: 25 # sample every n-th point (only for lam_dataset) attributes: ... diff --git a/docs/graphs/yaml/attributes_boolean_operation.yaml b/docs/graphs/yaml/attributes_boolean_operation.yaml new file mode 100644 index 0000000..ae1c479 --- /dev/null +++ b/docs/graphs/yaml/attributes_boolean_operation.yaml @@ -0,0 +1,14 @@ +nodes: + data: + node_builder: + _target_: anemoi.graphs.nodes.ZarrDatasetNodes + # ... + attributes: + mask: + _target_: anemoi.graphs.nodes.attributes.BooleanAndMask + masks: + - _target_: anemoi.graphs.nodes.attributes.NonmissingZarrVariable + variable: var1 + - _target_: anemoi.graphs.nodes.attributes.NonmissingZarrVariable + variable: var2 + hidden: ... diff --git a/docs/graphs/yaml/attributes_cutout.yaml b/docs/graphs/yaml/attributes_cutout.yaml new file mode 100644 index 0000000..eabc2bc --- /dev/null +++ b/docs/graphs/yaml/attributes_cutout.yaml @@ -0,0 +1,10 @@ +nodes: + data: + node_builder: + _target_: anemoi.graphs.nodes.ZarrDatasetNodes + dataset: + cutout: ... + attributes: + cutout_mask: + _target_: anemoi.graphs.nodes.attributes.CutOutMask + hidden: ... diff --git a/docs/graphs/yaml/attributes_nonmissingzarr.yaml b/docs/graphs/yaml/attributes_nonmissingzarr.yaml new file mode 100644 index 0000000..d38eccc --- /dev/null +++ b/docs/graphs/yaml/attributes_nonmissingzarr.yaml @@ -0,0 +1,10 @@ +nodes: + data: + node_builder: + _target_: anemoi.graphs.nodes.ZarrDatasetNodes + # ... + attributes: + land_mask: + _target_: anemoi.graphs.nodes.attributes.NonmissingZarrVariable + variable: lsm + hidden: ... 
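Review note on the three attribute recipes above: the boolean-operations page describes `mask` as the intersection of two non-missing masks. The sketch below illustrates that semantics in plain Python. It is not the `anemoi.graphs` implementation; the helper names `nonmissing_mask` and `boolean_and` and the sample values are hypothetical, and it assumes "missing" means `None` or NaN.

```python
import math


def nonmissing_mask(values):
    """Return True for every entry that is present (neither None nor NaN)."""
    return [v is not None and not math.isnan(v) for v in values]


def boolean_and(*masks):
    """Element-wise intersection of equally sized boolean masks."""
    return [all(flags) for flags in zip(*masks)]


# Hypothetical per-node values of two dataset variables (var1, var2).
var1 = [0.3, float("nan"), 1.2, 0.0]
var2 = [5.0, 2.0, float("nan"), 7.5]

# A node is kept only if both variables are present at that node.
mask = boolean_and(nonmissing_mask(var1), nonmissing_mask(var2))
print(mask)
```

Only the first and last nodes have both variables present, so only those two remain in the mask.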
diff --git a/docs/graphs/yaml/attributes_weights.yaml b/docs/graphs/yaml/attributes_weights.yaml index f889ef7..6a5d574 100644 --- a/docs/graphs/yaml/attributes_weights.yaml +++ b/docs/graphs/yaml/attributes_weights.yaml @@ -1,10 +1,8 @@ nodes: data: - node_builder: - _target_: anemoi.graphs.nodes.nodes.ZarrDatasetNodeBuilder - dataset: /path/to/dataset.zarr + node_builder: ... attributes: weights: - _target_: anemoi.graphs.nodes.weights.Area - norm: unit-max + _target_: anemoi.graphs.nodes.attributes.AreaWeights # options: AreaWeights, UniformWeights + norm: unit-max # options: unit-max, unit-range, unit-std hidden: ... diff --git a/docs/index.rst b/docs/index.rst index 3baa098..383b4bb 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -19,8 +19,10 @@ framework it seeks to handle many of the complexities that meteorological organisations will share, allowing them to easily train models from existing recipes but with their own data. -This package provides a series of utility functions for used by the rest -of the *Anemoi* packages. +The `anemoi-graphs` package allows you to design custom graphs for +training data-driven weather models. The graphs are built using a +`recipe`, which is a YAML file that specifies the nodes and edges of the +graph. - :doc:`overview` @@ -61,6 +63,7 @@ of the *Anemoi* packages. - :doc:`modules/edge_attributes` - :doc:`modules/graph_creator` - :doc:`modules/graph_inspector` +- :doc:`modules/post_processor` .. toctree:: :maxdepth: 1 @@ -73,6 +76,7 @@ of the *Anemoi* packages. modules/edge_attributes modules/graph_creator modules/graph_inspector + modules/post_processor ******************* Command line tool @@ -122,6 +126,7 @@ of the *Anemoi* packages. :caption: Usage usage/getting_started + usage/limited_area ***************** Anemoi packages diff --git a/docs/modules/edge_builder.rst b/docs/modules/edge_builder.rst index 1fa555b..208e3ad 100644 --- a/docs/modules/edge_builder.rst +++ b/docs/modules/edge_builder.rst @@ -6,6 +6,6 @@ .. 
automodule:: anemoi.graphs.edges.builder :members: - :exclude-members: BaseEdgeBuilder + :exclude-members: BaseEdgeBuilder,NodeMaskingMixin :no-undoc-members: :show-inheritance: diff --git a/docs/modules/node_attributes.rst b/docs/modules/node_attributes.rst index 3193409..d67a043 100644 --- a/docs/modules/node_attributes.rst +++ b/docs/modules/node_attributes.rst @@ -6,6 +6,6 @@ .. automodule:: anemoi.graphs.nodes.attributes :members: - :exclude-members: BaseWeights + :exclude-members: BaseWeights,BooleanBaseNodeAttribute,BooleanOperation :no-undoc-members: :show-inheritance: diff --git a/docs/modules/node_builder.rst b/docs/modules/node_builder.rst index 9a83de5..a9558f5 100644 --- a/docs/modules/node_builder.rst +++ b/docs/modules/node_builder.rst @@ -4,8 +4,24 @@ Node builder ############## -.. automodule:: anemoi.graphs.nodes.builder +.. automodule:: anemoi.graphs.nodes.builders.from_file :members: - :exclude-members: BaseNodeBuilder,IcosahedralNodes + :no-undoc-members: + :show-inheritance: + +.. automodule:: anemoi.graphs.nodes.builders.from_icon + :members: + :no-undoc-members: + :exclude-members: ICONTopologicalBaseEdgeBuilder + :show-inheritance: + +.. automodule:: anemoi.graphs.nodes.builders.from_healpix + :members: + :no-undoc-members: + :show-inheritance: + +.. automodule:: anemoi.graphs.nodes.builders.from_refined_icosahedron + :members: + :exclude-members: IcosahedralNodes,LimitedAreaIcosahedralNodes,StretchedIcosahedronNodes :no-undoc-members: :show-inheritance: diff --git a/docs/modules/post_processor.rst b/docs/modules/post_processor.rst new file mode 100644 index 0000000..8220709 --- /dev/null +++ b/docs/modules/post_processor.rst @@ -0,0 +1,11 @@ +.. _modules-post_processor: + +################ + Post processor +################ + +.. 
automodule:: anemoi.graphs.processors.post_process + :members: + :no-undoc-members: + :show-inheritance: + :exclude-members: PostProcessor,BaseMaskingProcessor diff --git a/docs/usage/getting_started.rst b/docs/usage/getting_started.rst index 4c322a2..12aac99 100644 --- a/docs/usage/getting_started.rst +++ b/docs/usage/getting_started.rst @@ -4,17 +4,19 @@ Getting started ################# +The simplest use case is to build an encoder-processor-decoder graph for +a global weather model. + ************** First recipe ************** -The simplest use case is to build an encoder-processor-decoder graph for -a global weather model. In this case, the recipe must contain a -``nodes`` section where the keys will be the names of the sets of -`nodes`, that will later be used to build the connections. Each `nodes` -configuration must include a ``node_builder`` section describing how to -generate the `nodes`, and it may include an optional ``attributes`` -section to define additional attributes (weights, mask, ...). +In this case, the recipe must contain a ``nodes`` section where the keys +will be the names of the sets of `nodes`, that will later be used to +build the connections. Each `nodes` configuration must include a +``node_builder`` section describing how to generate the `nodes`, and it +may include an optional ``attributes`` section to define additional +attributes (weights, mask, ...). .. literalinclude:: yaml/nodes.yaml :language: yaml @@ -54,7 +56,7 @@ following command: .. code:: console - $ anemoi-graphs inspect graph.pt output_plots + $ anemoi-graphs inspect graph.pt This will generate the following graph: @@ -65,7 +67,8 @@ This will generate the following graph: Note that that the resulting graph will only work with a Transformer processor because there are no connections between the `hidden - nodes`. + nodes`. This is the default behaviour for :ref:`anemoi-training + `. 
****************************** Adding processor connections diff --git a/docs/usage/limited_area.rst b/docs/usage/limited_area.rst new file mode 100644 index 0000000..273936d --- /dev/null +++ b/docs/usage/limited_area.rst @@ -0,0 +1,89 @@ +.. _usage-limited_area: + +############################# + Limited Area Modeling (LAM) +############################# + +AnemoI Graphs brings another level of flexibility to the user by +allowing the definition of limited area graphs. + +***************************************** + Define hidden nodes in area of interest +***************************************** + +The user can use a regional dataset to define the `data` nodes over the +region of interest. Then, it can define the hidden nodes only over the +region of interest using any of the ``LimitedArea_____Nodes`` classes. + +.. literalinclude:: yaml/lam_nodes_wo_boundary.yaml + :language: yaml + +************************************************ + Cut out regional dataset into a global dataset +************************************************ + +In this case, the user may want to include boundary forcings to the +region of interest. AnemoI Graphs allows the user to use 2 datasets to +build the `data` nodes, combining nodes from the LAM dataset and the +global dataset (as boundary forcings). The class ``ZarrDatasetNodes`` +allows this functionality: + +.. literalinclude:: yaml/cutout_zarr.yaml + :language: yaml + +The ``ZarrDatasetNodes`` supports an optional ``thinning`` argument +which can be used to sampling points from the regional dataset to reduce +computation during development stage. + +In addition, this node builder class will create an additional node +attribute with a mask showing which node correspond to each of the 2 +datasets. + +.. 
code:: console + + >>> graph + HeteroData( + data={ + x=[40320, 2], + node_type='ZarrDatasetNodes', + area_weight=[40320, 1], + cutout_mask=[40320, 1], + } + ) + +********************************************* + Define hidden nodes over region of interest +********************************************* + +Once the `data` nodes are defined, the user can define the hidden nodes +only over the region of interest. In this case, the area of interest is +defined by the `data` nodes masked by the ``cutout`` attribute. + +.. literalinclude:: yaml/limited_area_nodes.yaml + :language: yaml + +.. code:: console + + >>> graph + HeteroData( + data={ + x=[40320, 2], + node_type='ZarrDatasetNodes', + area_weight=[40320, 1], + cutout_mask=[40320, 1], + }, + hidden={ + x=[10242, 2], + node_type='TriNodes', + } + ) + +************** + Adding edges +************** + +The user may define the edges using the same configuration as for the +global graphs. + +.. literalinclude:: yaml/global.yaml + :language: yaml diff --git a/docs/usage/yaml/cutout_zarr.yaml b/docs/usage/yaml/cutout_zarr.yaml new file mode 100644 index 0000000..8fd72f5 --- /dev/null +++ b/docs/usage/yaml/cutout_zarr.yaml @@ -0,0 +1,15 @@ +nodes: + data: + node_builder: + _target_: anemoi.graphs.nodes.ZarrDatasetNodes + dataset: + cutout: + - dataset: regional-dataset.zarr + thinning: 25 + - dataset: /path/to/global-dataset.zarr + adjust: all + min_distance_km: 10 + attributes: ... + hidden: ... + +edges: ... 
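Review note: the rows of `tri_nodes.csv` and `tri_refined_edges.csv` added in this change appear to follow simple closed forms, which also give the 10242 `hidden` nodes reported for resolution 5. The sketch below reproduces the node, face and edge columns under that assumption. The function name is hypothetical, and reading the edge column as each undirected edge counted in both directions is an inference from the table (60 edges for the 12-node icosahedron), not a statement about the library.

```python
def tri_refinement_counts(max_refinement):
    """Rows of (refinement, nodes, faces, edges, multilevel_edges)."""
    rows = []
    multilevel_edges = 0
    for r in range(max_refinement + 1):
        nodes = 10 * 4**r + 2   # vertices of an icosahedron refined r times
        faces = 20 * 4**r       # each refinement splits a triangle into four
        edges = 60 * 4**r       # 30 * 4**r undirected edges, counted both ways
        multilevel_edges += edges  # running total over refinements 0..r
        rows.append((r, nodes, faces, edges, multilevel_edges))
    return rows


rows = tri_refinement_counts(6)
for row in rows:
    print(row)
```

The "Num Multilevel Edges" column matches the running total of the per-level edge counts, which is consistent with the multi-scale description that all refinement levels are considered by default.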
diff --git a/docs/usage/yaml/global.yaml b/docs/usage/yaml/global.yaml index a0bc3bb..f00746d 100644 --- a/docs/usage/yaml/global.yaml +++ b/docs/usage/yaml/global.yaml @@ -6,18 +6,19 @@ edges: # A) Encoder connections - source_name: data target_name: hidden - edge_builder: - _target_: anemoi.graphs.edges.CutOffEdges + edge_builders: + - _target_: anemoi.graphs.edges.CutOffEdges cutoff_factor: 0.7 # B) Decoder connections - source_name: hidden - target_name: hidden - edge_builder: - _target_: anemoi.graphs.edges.KNNEdges - nearest_neighbours: 3 + target_name: data + target_mask_attr_name: cutout + edge_builders: + - _target_: anemoi.graphs.edges.KNNEdges + num_nearest_neighbours: 3 # C) Processor connections - source_name: hidden - target_name: data - edge_builder: - _target_: anemoi.graphs.edges.KNNEdges - nearest_neighbours: 3 + target_name: hidden + edge_builders: + - _target_: anemoi.graphs.edges.MultiScaleEdges + x_hops: 1 diff --git a/docs/usage/yaml/global_with-attrs.yaml b/docs/usage/yaml/global_with-attrs.yaml index fc292d0..8fddb1a 100644 --- a/docs/usage/yaml/global_with-attrs.yaml +++ b/docs/usage/yaml/global_with-attrs.yaml @@ -6,27 +6,27 @@ edges: # A) Encoder connections - source_name: data target_name: hidden - edge_builder: - _target_: anemoi.graphs.edges.CutOffEdges + edge_builders: + - _target_: anemoi.graphs.edges.CutOffEdges cutoff_factor: 0.7 attributes: edge_length: _target_: anemoi.graphs.edges.attributes.EdgeLength # B) Decoder connections - source_name: hidden - target_name: hidden - edge_builder: - _target_: anemoi.graphs.edges.KNNEdges - nearest_neighbours: 3 + target_name: data + edge_builders: + - _target_: anemoi.graphs.edges.KNNEdges + num_nearest_neighbours: 3 attributes: edge_length: _target_: anemoi.graphs.edges.attributes.EdgeLength # C) Processor connections - source_name: hidden - target_name: data - edge_builder: - _target_: anemoi.graphs.edges.KNNEdges - nearest_neighbours: 3 + target_name: hidden + edge_builders: + - 
_target_: anemoi.graphs.edges.MultiScaleEdges + x_hops: 1 attributes: edge_length: _target_: anemoi.graphs.edges.attributes.EdgeLength diff --git a/docs/usage/yaml/global_wo-proc.yaml b/docs/usage/yaml/global_wo-proc.yaml index c1e3ad3..696dbe4 100644 --- a/docs/usage/yaml/global_wo-proc.yaml +++ b/docs/usage/yaml/global_wo-proc.yaml @@ -6,12 +6,12 @@ edges: # A) Encoder connections - source_name: data target_name: hidden - edge_builder: - _target_: anemoi.graphs.edges.CutOffEdges + edge_builders: + - _target_: anemoi.graphs.edges.CutOffEdges cutoff_factor: 0.7 # B) Decoder connections - source_name: hidden - target_name: hidden - edge_builder: - _target_: anemoi.graphs.edges.KNNEdges - nearest_neighbours: 3 + target_name: data + edge_builders: + - _target_: anemoi.graphs.edges.KNNEdges + num_nearest_neighbours: 3 diff --git a/docs/usage/yaml/lam_nodes_wo_boundary.yaml b/docs/usage/yaml/lam_nodes_wo_boundary.yaml new file mode 100644 index 0000000..011e99f --- /dev/null +++ b/docs/usage/yaml/lam_nodes_wo_boundary.yaml @@ -0,0 +1,23 @@ +nodes: + data: + node_builder: + _target_: anemoi.graphs.nodes.ZarrDatasetNodes + dataset: + cutout: + - dataset: regional-dataset.zarr + thinning: 25 + - dataset: /path/to/global-dataset.zarr + adjust: all + min_distance_km: 10 + attributes: + cutout_mask: + _target_: anemoi.graphs.nodes.attributes.CutOutMask + hidden: + node_builder: + _target_: anemoi.graphs.nodes.LimitedAreaTriNodes + resolution: 5 + reference_node_name: data + mask_attr_name: cutout_mask + attributes: ... + +edges: ... diff --git a/docs/usage/yaml/limited_area_nodes.yaml b/docs/usage/yaml/limited_area_nodes.yaml new file mode 100644 index 0000000..c63294f --- /dev/null +++ b/docs/usage/yaml/limited_area_nodes.yaml @@ -0,0 +1,10 @@ +nodes: + data: ... + hidden: + node_builder: + _target_: anemoi.graphs.nodes.LimitedAreaTriNodes + resolution: 5 + reference_node_name: data + mask_attr_name: cutout_mask + +edges: ... 
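Review note on `reference_node_name`, `mask_attr_name` and `margin_radius_km` in the limited-area recipes above: one way to read these arguments is that candidate hidden nodes are kept when they lie within the margin of at least one masked reference node. The sketch below illustrates that reading only. It is an assumption, not the `LimitedAreaTriNodes` implementation; the function names, the coordinates and the use of great-circle distance are all hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (lat1, lon1, lat2, lon2))
    a = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def select_limited_area(candidates, reference, mask, margin_radius_km):
    """Keep candidates within the margin of any reference node whose mask is True."""
    area = [point for point, keep in zip(reference, mask) if keep]
    return [
        c
        for c in candidates
        if any(haversine_km(*c, *p) <= margin_radius_km for p in area)
    ]


# Hypothetical `data` nodes (lat, lon) and their cutout mask.
reference = [(50.0, 10.0), (51.0, 11.0), (-30.0, 120.0)]
cutout_mask = [True, True, False]

# Hypothetical candidate hidden nodes from a global refinement.
candidates = [(50.5, 10.5), (-30.0, 120.0), (0.0, 0.0)]

selected = select_limited_area(candidates, reference, cutout_mask, 100.0)
print(selected)
```

The second candidate coincides with a reference node, but that node is masked out, so it is not selected. This is the behaviour the optional mask is meant to illustrate here.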
diff --git a/docs/usage/yaml/nodes.yaml b/docs/usage/yaml/nodes.yaml index 7c97851..c65fbc5 100644 --- a/docs/usage/yaml/nodes.yaml +++ b/docs/usage/yaml/nodes.yaml @@ -5,6 +5,5 @@ nodes: dataset: /path/to/dataset.zarr hidden: node_builder: - _target_: anemoi.graphs.nodes.NPZFileNodes - grid_definition_path: /path/to/grids/ - resolution: o48 + _target_: anemoi.graphs.nodes.TriNodes + resolution: 5 diff --git a/src/anemoi/graphs/edges/attributes.py b/src/anemoi/graphs/edges/attributes.py index 3560fb2..6be801b 100644 --- a/src/anemoi/graphs/edges/attributes.py +++ b/src/anemoi/graphs/edges/attributes.py @@ -66,7 +66,7 @@ class EdgeDirection(BaseEdgeAttribute): Attributes ---------- norm : Optional[str] - Normalisation method. + Normalisation method. Options: None, "l1", "l2", "unit-max", "unit-range", "unit-std". luse_rotated_features : bool Whether to use rotated features. @@ -109,10 +109,10 @@ class EdgeLength(BaseEdgeAttribute): Attributes ---------- - norm : str - Normalisation method. + norm : Optional[str] + Normalisation method. Options: None, "l1", "l2", "unit-max", "unit-range", "unit-std". invert : bool - Whether to invert the edge lengths, i.e. 1 - edge_length. + Whether to invert the edge lengths, i.e. 1 - edge_length. Defaults to False. Methods ------- diff --git a/src/anemoi/graphs/inspect.py b/src/anemoi/graphs/inspect.py index fa59018..ec8ac5c 100644 --- a/src/anemoi/graphs/inspect.py +++ b/src/anemoi/graphs/inspect.py @@ -26,7 +26,19 @@ class GraphInspector: - """Inspect the graph.""" + """Inspect the graph. + + Attributes + ---------- + path: Union[str, Path] + Path to the graph file. + output_path: Path + Path to the output directory where the plots will be saved. + show_attribute_distributions: Optional[bool] + Whether to show the distribution of the node and edge attributes. + show_nodes: Optional[bool] + Whether to show the interactive plots of the nodes. + """ def __init__( self,