State-Dumping Debug Tool #197
…emistry_data and grackle_version to a separate file.
… in a json format)
Previously, we were dumping `chemistry_data`, `code_units`, and `grackle_version` to a json-string that was stored in the hdf5 file. While this certainly got the job done, there were 2 drawbacks:

1. we did it in a rather round-about manner...
2. if anyone ever wanted to deserialize the dumps into new instances, they would need to support hdf5 reading AND json-parsing

With this new change, we now dump these structs in a way that individual key-value pairs are stored as attributes. This simplifies hypothetical deserialization
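A minimal sketch of the key-value idea described in this commit message, under stated assumptions: `demo_params`, `demo_visitor`, and every function below are hypothetical names, not part of Grackle, and the backend here formats pairs into a text buffer as a stand-in for creating one HDF5 attribute per pair (so the sketch runs without HDF5). The point is only that each struct member is handed to a backend as an individual (key, value) pair, which is what would make deserialization straightforward.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a parameter struct (NOT Grackle's chemistry_data). */
typedef struct {
  int use_cooling;
  double gamma;
} demo_params;

/* A backend supplies one callback per value type, plus its own context. */
typedef struct {
  void (*on_int)(void *ctx, const char *key, int value);
  void (*on_double)(void *ctx, const char *key, double value);
  void *ctx;
} demo_visitor;

/* Hand every member to the backend as an individual (key, value) pair. */
static void demo_visit_params(const demo_params *p, const demo_visitor *v) {
  v->on_int(v->ctx, "use_cooling", p->use_cooling);
  v->on_double(v->ctx, "gamma", p->gamma);
}

/* Stand-in backend: appends "key = value" lines to a fixed-size buffer.
 * A real dump would create an HDF5 attribute here instead. */
typedef struct {
  char buf[256];
  size_t len;
} demo_text_sink;

static void demo_sink_advance(demo_text_sink *s, int n) {
  /* Only advance when snprintf wrote the complete line. */
  if (n > 0 && (size_t)n < sizeof(s->buf) - s->len) s->len += (size_t)n;
}

static void demo_text_on_int(void *ctx, const char *key, int value) {
  demo_text_sink *s = ctx;
  demo_sink_advance(
      s, snprintf(s->buf + s->len, sizeof(s->buf) - s->len, "%s = %d\n", key, value));
}

static void demo_text_on_double(void *ctx, const char *key, double value) {
  demo_text_sink *s = ctx;
  demo_sink_advance(
      s, snprintf(s->buf + s->len, sizeof(s->buf) - s->len, "%s = %g\n", key, value));
}

/* Convenience wrapper: reset the sink, then visit all pairs. */
static void demo_dump_as_text(const demo_params *p, demo_text_sink *s) {
  s->buf[0] = '\0';
  s->len = 0;
  demo_visitor v = {demo_text_on_int, demo_text_on_double, s};
  demo_visit_params(p, &v);
}
```

With this shape, a second backend (for example, one that prints to stdout) could reuse `demo_visit_params` unchanged; only the callbacks differ.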
…a (it's now a part of the public API
6d87b1e to 5632e76
@brittonsmith, when you have time, this is now ready for review. Sorry this got so large. I think this could create some minor conflicts with #177, but I'm happy to help resolve those (it should be very easy).
This change was split off from GH PR grackle-project#197. This should be a lot easier to review in isolation (and I think that it is a generally useful change in its own right).

Overview
--------
The objective of this PR is conceptually very simple. It introduces the following function:

```c++
int gr_initialize_field_data(grackle_field_data *my_fields);
```

The idea is to have users immediately call this function right after they initialize a new `grackle_field_data` instance

- When it's called, the handful of non data-field members are assigned sensible defaults. It also initializes all data-field members to NULL pointers.
- this is analogous to the way that `local_initialize_chemistry_parameters` is used.

Motivation
----------
While this function is not strictly necessary, I think it is a useful addition and I think we should tell people to use it (though old code won't break).

A large motivation is peace of mind. I have often wished for this sort of thing. I'm always a little concerned when using a new configuration of grackle that I didn't initialize all of the required fields. I would definitely feel a little more comfortable knowing that a field that I forgot about is assigned a ``NULL`` pointer rather than some garbage data that might let the code chug along.

It could also facilitate more error checking:

- such as checks in the style of the ``grid_dx`` check introduced in PR grackle-project#190 (applying similar checks to `grid_rank`, `grid_dimension`, `grid_start`, and `grid_end` requires that we know their default value).
- we could explicitly check that the user specified all required data-fields. We can only do this if we know that unset data-fields have a default value of `NULL`.
  - **From a user friendliness perspective, I think this check alone is enough to justify the function's existence.**
  - We could also potentially warn if unnecessary fields were specified
  - Plus, we could account for the fact that functions like ``calculate_pressure`` require fewer fields than ``solve_chemistry``
- we could also add checks that none of the fields are aliases (an implicit Fortran requirement). This would be much easier to implement if we knew we could just ignore fields that were set to ``NULL`` pointers

We would probably need to make these checks opt-in somehow, both because we can't be absolutely certain whether a user actually invoked ``gr_initialize_field_data`` and because users might not want the runtime-cost (which could be relatively large for very small grids)

Naming
------
I chose to name this function based on my suggestion in grackle-project#189 that we adopt a convention that all new functions in the stable public API share a common prefix (`gr_` or `grackle_`). For the sake of proposing something concrete, I assumed that we choose the `gr_` prefix. (If we decide on a different prefix, this would be easy to change).

Alternatively, if we don't want any prefix, we could name it `initialize_grackle_field_data` (I do worry a little that could introduce naming conflicts with existing functions in downstream codes)
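To make the error-checking ideas from this commit message concrete, here is a rough sketch under stated assumptions. `demo_field_data` is a hypothetical, heavily reduced stand-in for `grackle_field_data`; the sentinel defaults (`-1`), the return codes, and all `demo_` functions are illustrative choices, not Grackle's actual implementation. It shows how known defaults allow a "required fields" check and a simple alias check that ignores `NULL` fields.

```c
#include <stddef.h>

#define DEMO_SUCCESS 1 /* return codes are an assumption of this sketch */
#define DEMO_FAIL 0

/* Hypothetical, reduced stand-in for grackle_field_data (NOT the real struct). */
typedef struct {
  int grid_rank;
  double grid_dx;
  double *density;
  double *internal_energy;
  double *metal_density;
} demo_field_data;

/* Give non data-field members a recognizable default and set every
 * data-field member to NULL, so that "unset" becomes detectable later. */
static int demo_initialize_field_data(demo_field_data *f) {
  if (f == NULL) return DEMO_FAIL;
  f->grid_rank = -1; /* sentinel value: an assumption */
  f->grid_dx = -1.0; /* sentinel value: an assumption */
  f->density = NULL;
  f->internal_energy = NULL;
  f->metal_density = NULL;
  return DEMO_SUCCESS;
}

/* Count required data-fields that are still NULL. Which fields are
 * required depends on the configuration (here: an optional metal field). */
static int demo_count_missing(const demo_field_data *f, int use_metals) {
  int missing = 0;
  if (f->density == NULL) missing++;
  if (f->internal_energy == NULL) missing++;
  if (use_metals && f->metal_density == NULL) missing++;
  return missing;
}

/* Report whether two in-use fields share the same base pointer.
 * NULL fields are skipped; partially overlapping buffers are NOT detected. */
static int demo_has_aliases(const demo_field_data *f) {
  const double *ptrs[] = {f->density, f->internal_energy, f->metal_density};
  const size_t n = sizeof(ptrs) / sizeof(ptrs[0]);
  for (size_t i = 0; i < n; i++) {
    if (ptrs[i] == NULL) continue;
    for (size_t j = i + 1; j < n; j++) {
      if (ptrs[i] == ptrs[j]) return 1;
    }
  }
  return 0;
}
```

Neither check is meaningful on a struct that was never passed through the initializer, which is one reason such checks would probably have to be opt-in.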
@mabruzzo, I will try to review this when I am back from holiday in a couple weeks. If you could resolve the current merge conflicts before then, that would be great.
I haven't quite gotten around to this yet (hopefully tomorrow or the day after)! There are a number of structural improvements that could be made. In particular #209 will simplify the API a lot!
To make this PR easier to review, I split off some of the changes into #205. That should be reviewed first.
Description
This PR introduces "tooling" to help with diagnosing and debugging Grackle errors.
Motivation
One of Grackle's greatest weaknesses is that it's really hard to diagnose and debug errors:
This PR introduces "tooling" to help address this challenge. We primarily introduce a C function to dump Grackle's internal state to an hdf5 file. We also introduce convenience python functions (that we need anyway to test this functionality) to reproduce grackle's state from the hdf5 file and make it easy to use that state to reproduce the issue.
This makes things easier both for users, who can communicate enough information to get help, and for the people who are actually trying to debug the problem.
Detailed Description
In this section I will provide some more details about the changes. See the website docs for a more detailed description of what they do and how to use them.
Essentially this PR seeks to introduce a new function, `grunstable_h5dump_state`, to assist with debugging.
The idea here is simple: when you are working on reproducing a grackle-error, you can use this to dump grackle's state to an hdf5 file just before the error occurs. Specifically, the data is dumped to a newly created HDF5 file (at the path specified by `fname`) OR within an existing hdf5 file (at a location specified by `dest_hid`).

To properly dump the state of `grackle_field_data`, we need people to initialize unused fields to `NULL`. This is accomplished by a new function called `gr_initialize_field_data`. The idea is to have users immediately call this function right after they initialize a new `grackle_field_data` instance, analogous to the way that `local_initialize_chemistry_parameters` is used.[^1]

I also added some python functions to help load in the dumps and use them for debugging.
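The reason the dump needs unused fields to be `NULL` can be sketched as follows. This is only an illustration under stated assumptions: `sketch_fields` and `sketch_dump_fields` are hypothetical names, the struct is a reduced stand-in for `grackle_field_data`, and plain text output stands in for writing HDF5 datasets. The dumper has no other way to tell an unused field from a used one, so a garbage pointer would be dereferenced as if it held data.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, reduced stand-in for grackle_field_data (NOT the real struct). */
typedef struct {
  double *density;
  double *metal_density;
  double *e_density;
} sketch_fields;

/* Write every field that is in use to `out`, one line per field.
 * A NULL pointer is read as "this field is unused" and is skipped.
 * Returns the number of fields written. */
static int sketch_dump_fields(const sketch_fields *f, size_t n_cells, FILE *out) {
  const struct {
    const char *name;
    const double *data;
  } table[] = {
      {"density", f->density},
      {"metal_density", f->metal_density},
      {"e_density", f->e_density},
  };
  int n_written = 0;
  for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
    if (table[i].data == NULL) continue; /* unused field: nothing to dump */
    fprintf(out, "%s:", table[i].name);
    for (size_t c = 0; c < n_cells; c++) fprintf(out, " %g", table[i].data[c]);
    fprintf(out, "\n");
    n_written++;
  }
  return n_written;
}
```

If `metal_density` had been left uninitialized instead of `NULL`, the loop above would read through an invalid pointer, which is exactly the situation an initializer function rules out.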
Stability Concerns
I'm very confident about the internal changes to Grackle. They may seem a little complicated, but I think they are robust and provide a lot of flexibility for changing things in the future.
With that said, I have some concerns about stabilizing 2 aspects of this PR. In particular I'm worried about:
- […] `grunstable_h5dump_state`.
- […] `code_units` state) and could definitely be improved. For example, maybe we have people create a dump right after initialization and then dump again when they think the problem occurs. Or, if we introduce the changes suggested in "Auto-computing the Required Units" (#198), this could be simplified a lot more.

If we needed to guarantee stability, I'm not sure I would ever feel confident proposing these changes. Furthermore, I think these changes would be useful to people, right now, even without a stability guarantee (people don't need to call `grunstable_h5dump_state` in the mainline version of their simulation code -- there is no need to create these dumps outside of debugging purposes).

Thus, my proposed compromise is to integrate these things into Grackle as unstable features. I've tried to communicate that in the docs.
To make it extra clear that `grunstable_h5dump_state` isn't stable, I have:

- […] `grackle_unstable.h` public-header (a downstream application would need to explicitly include this header in addition to the regular `grackle.h` header)
- […] `gr_` or `grackle_`)
- […] `grunstable_` under the assumption that we'll start prefixing all public components of the stable-public API with `gr_`
- […] `grunstable_` prefix would need to change

Other changes
This includes a few other changes:
- […] `chemistry_data`, `code_units`, and `grackle_version` structs to stdout (when running grackle in VERBOSE mode), have been refactored to use similar functionality to the hdf5-dumping machinery

Future Ideas
It might be nice to ship Grackle with a simple C/C++ program to help with diagnosing issues. More details are provided below the fold.
More about a C/C++ diagnostic tool
In the simplest form, the program […]

- […] `grackle_data_file` variable

Currently, the only way to load in the internal state is with the new python functions. This C/C++ program would be a useful addition because:

- […] `gdb`

There are also some fairly cool things you could extend this tool to do:
Footnotes

[^1]: The particular change can be separately reviewed as part of PR "Introduce `gr_initialize_field_data`" (#205).

[^2]: In particular, if people serialize just the `chemistry_data` contents in a standardized way, we could support some cool things with `yt` analysis (building on existing functionality in pygrackle, or even adding some basic things directly to `yt` without needing `pygrackle`).

[^3]: If I'm feeling motivated, I might introduce a separate pull-request to try to introduce some of this functionality (but I probably won't get to it).