Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Aggregation of incidence quantity inside of numba functions in transmission.py not returning correct values when parallel=True #9

Open
KevinMcCarthyAtIDM opened this issue Dec 10, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@KevinMcCarthyAtIDM
Copy link
Collaborator

In these simple SI, SIR type (not for SIS) models, the cumulative sum of the incidence output should be equal to the total change in susceptibility, and for SI models should also be equal to the total infected count (since I = 1-S). However, while the total cases and the susceptibility behave as expected (and notably, aggregation into these output quantities happens outside of numba njit functions), the cumulative incidence is consistently smaller than what it should be. Switching to numba parallel=False fixes the issue.

With parallel=True
image

With parallel=False
image

There are ways to do aggregation inside of numba functions with parallel=True, but we may need a more careful implementation than what is here. I'll keep moving forward with parallel off for now.

@KevinMcCarthyAtIDM KevinMcCarthyAtIDM added the bug Something isn't working label Dec 10, 2024
@KevinMcCarthyAtIDM
Copy link
Collaborator Author

KevinMcCarthyAtIDM commented Dec 21, 2024

Instead of this pattern, which gets a race condition when accumulating incidences:

        for i in nb.prange(count):
            susceptibility = susceptibilities[i]
            if susceptibility > 0:
                nodeid = nodeids[i]
                force = susceptibility * forces[nodeid]  # force of infection attenuated by personal susceptibility
                if (force > 0) and (np.random.random_sample() < force):  # draw random number < force means infection
                    susceptibilities[i] = 0  # no longer susceptible
                    doi[i] = tick
                    incidence[nodeid] += 1

Use this thread-safe pattern for doing this:

        max_node_id = np.max(nodeids)
        thread_incidences = np.zeros((nb.config.NUMBA_DEFAULT_NUM_THREADS, max_node_id+1), dtype=np.uint32)
        for i in nb.prange(count):
            susceptibility = susceptibilities[i]
            if susceptibility > 0:
                nodeid = nodeids[i]
                force = susceptibility * forces[nodeid]  # force of infection attenuated by personal susceptibility
                if (force > 0) and (np.random.random_sample() < force):  # draw random number < force means infection
                    susceptibilities[i] = 0  # no longer susceptible
                    doi[i] = tick
                    thread_incidences[nb.get_thread_id(), nodeid] += 1
        for t in range(nb.config.NUMBA_DEFAULT_NUM_THREADS):
            for j in range(max_node_id+1):
                incidence[j] += thread_incidences[t, j]

Now we can set parallel = TRUE, which can be a substantial speedup; gets my simple SI model running in ~40% of the time, and that one is pretty much dominated by the transmission step since almost nothing else is going on.

@clorton
Copy link
Contributor

clorton commented Dec 22, 2024

The reduce step (last three lines) isn't being parallelized (which is probably fine as prange() has some overhead).

Should be able to reduce those lines to this:

indicidence[:] = thread_incidences.sum(axis=0)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants