Size of the mechanism and speed of the simulation #436

Open
rs028 opened this issue Dec 1, 2020 — with Slack · 9 comments

Comments

Collaborator

rs028 commented Dec 1, 2020

The model is very slow when the entire MCM is used as the chemical mechanism, as opposed to just a subset of it. It can take up to 5 days to run a 3-hour simulation (although this obviously depends on the type of machine).

Possible cause: most of the species have zero concentration and/or are unused.

@rs028 rs028 added the question label Dec 1, 2020 — with Slack
@rs028 rs028 added the testing label Dec 4, 2020
Collaborator

spco commented Jan 6, 2021

For my own education, could you confirm you're referring to the case where the .fac file is very large?

In that case, I'd hope there's something we can do, but I'm not sure right now.
If most of the values are zero because the initial values of most of them are zero, then it's quite hard to identify that case in the code itself, I would think.

You'd need to identify which species must stay zero based upon the mechanism and initial conditions, to be able to remove them from the matrix. That's probably easier to do in pre-processing, except that it's dependent on the initial conditions, not just the .fac file. So your mechanism.f90 and mechanism.species etc would depend on the initial conditions as well as the .fac. That could well work happily, but you would need to recompile mechanism.so whenever you change the initial conditions, not just when you change the .fac (and you need to ensure that is done consistently etc).

Does that make sense? It sounds feasible but there are some design choices to be made, and it's a reasonable amount of work to get right.

(I guess the mechanism is in a way a directed graph, with the nodes being species, node-weights being the species concentration, edges the reactions, and the edge-weights the rates (but the rates can depend directly on other species (nodes)). I'm not sure how to best handle that.)
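A minimal sketch of the pre-processing idea above, assuming the mechanism has already been parsed into (reactants, products) pairs. The function name, the data layout and the toy reactions are all hypothetical and are not taken from the codebase or the MCM. It computes which species can ever become non-zero given the initially non-zero ones; everything else must stay zero and could in principle be dropped.

```python
# Hypothetical sketch: forward reachability over a reaction list.
# The (reactants, products) layout is an assumption for illustration,
# not the model's internal representation of the .fac file.

def reachable_species(reactions, initial_nonzero):
    """Return the set of species that can ever become non-zero.

    A reaction can only proceed once all of its reactants are reachable;
    when it does, all of its products become reachable too.
    """
    reachable = set(initial_nonzero)
    changed = True
    while changed:  # repeat until no reaction adds a new species
        changed = False
        for reactants, products in reactions:
            if set(reactants) <= reachable and not set(products) <= reachable:
                reachable.update(products)
                changed = True
    return reachable


# Toy mechanism (made up for this example).
reactions = [
    (("A", "B"), ("C",)),
    (("C",), ("D",)),
    (("E",), ("F",)),      # E is never produced, so F stays zero
    (("D", "E"), ("G",)),  # needs E, so G stays zero as well
]
all_species = {s for reactants, products in reactions for s in reactants + products}
live = reachable_species(reactions, {"A", "B"})
always_zero = all_species - live

print(sorted(live))         # ['A', 'B', 'C', 'D']
print(sorted(always_zero))  # ['E', 'F', 'G']
```

This only looks at which reactions can fire, not at their rates. Any species that is constrained or otherwise supplied during the run would have to be added to the initial set, which is one of the design choices mentioned above.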

Collaborator Author

rs028 commented Jan 6, 2021

Yes, I think that's right. But to be honest the "zero values" hypothesis is just that. I don't know what controls the speed of the simulation. I guess we need to do some testing and use some sort of profiling tool.

That being said, I am intrigued by the "graph" option. It could have interesting applications, so we may want to keep it in mind in any case.

Collaborator

spco commented Feb 1, 2021

On the general speed, have you experimented with changing the optimisation flag? gfortran defaults to -O0 (no optimisation) so building with -O2 might give some sizeable speedup. I have no feel for how much difference that would make in this specific case, but in general the speedup can be orders of magnitude.

Collaborator Author

rs028 commented Feb 1, 2021

I haven't. This problem was actually reported by another user. I think the general point here is to understand why this is happening.
Let's say we run the test mechanism and the entire MCM with the same default initial conditions (model/configuration/initialConcentrations.config):

CH4    4.9e+13
CO     3.6e+12
O3     5.2e+11
NO2    2.4e+11

You would expect the two runtimes not to differ by much, but that is not the case, and I think it would be good to know why. It may give some clues as to how to speed up the model generally (i.e., identify the bottlenecks).

Collaborator

spco commented Feb 1, 2021

Sure, yes, I don't doubt there's an issue there. The use of -O2 is probably useful regardless, and should in some ways be treated as the 'default', as it is designed to apply 'safe' optimisations only: those that can be guaranteed not to affect numerical precision etc. That would be the usual approach in most codebases, and it might be worth considering making it the default flag in the Makefile.skel. Users can always modify it if required. Thoughts?

Collaborator Author

rs028 commented Feb 1, 2021

I honestly don't know enough to make a call here :) I think in general if we can speed it up without sacrificing accuracy it is a good thing.

Is it maybe all part of the same package of issues we generally called numeric stability? See notes at #265, #340 (comment), #384 (comment) etc...

Collaborator

spco commented Feb 1, 2021

-O2 will do nothing to the numerics, so it is 'safe' and won't have any effect on the numeric stability issues. It just takes a little longer to compile the executable. That is the only downside, and a very small one: our codebase is small anyway, so the compiler will have no difficulty compiling it to a higher optimisation level, and it would probably not be noticeable!

Collaborator Author

rs028 commented Feb 1, 2021

Then yes. I don't particularly care if the compilation is a tad longer :)

Collaborator

spco commented Feb 1, 2021

I just tested it on my Mac - -O0 is about 2 seconds to compile, -O2 is about 4.4s.

Just to make sure there is a real effect, I ran some of the testcases (with many more steps than normal), and timed just the run (so ignoring the compile, which stays below 5s for all of them). I think for several of these, the I/O will be the limiting factor, which the optimisation does very little about, so the 1.2-3x speedups here would very likely actually be an underestimate of the speedup in more realistic setups.

| Test case | -O2 | -O0 | Speedup |
|---|---|---|---|
| static | 26s | 1m17s | 2.96x |
| spec_yes_env+no_with_photo | 26s | 31s | 1.19x |
| short_no_pre | 23s | 29.5s | 1.28x |
| spec_yes_env_no_with_jfac_fixed | 12.5s | 32s | 2.56x |
I will open a PR to set the default to -O2 😄
