GPU: Fix data transfers if already on device #131

Open: wants to merge 4 commits into base: develop

lukasm91
Collaborator

@lukasm91 lukasm91 commented Aug 6, 2024

(requested by Judicael Grasset)

If data is already on device, we should not make an extra copy (similar to the openacc semantics: copy means copy if not present)

Should go after #127

@samhatfield
Collaborator

@lukasm91 could you rebase this branch against latest develop?

@lukasm91
Collaborator Author

lukasm91 commented Aug 9, 2024

> @lukasm91 could you rebase this branch against latest develop?

done

@samhatfield added the bug and gpu labels on Aug 12, 2024
@samhatfield
Collaborator

One thing I don't get about this change: first time we call DIR_TRANS, ACC_IS_PRESENT(PGPUV) (say) is .FALSE.. So a host -> device update happens for PGPUV. But the second time, ACC_IS_PRESENT(PGPUV) is .TRUE., so the update does not happen. That means the second call to DIR_TRANS is not operating on the latest values in PGPUV on the host side.

In other words, don't we always have to do a host->device update for all PRESENT (Fortran, not OpenACC) arrays in TRGTOL? And vice versa for TRLTOG.

@lukasm91
Collaborator Author

Let me explain in more detail: I can think of using ectrans in two modes of operations.

  • Currently, the underlying assumption is that when you enter ectrans, the inputs/outputs are never on device. ectrans has to take care of copying the data onto the device, and at the end has to make sure that the data is also copied back. The spectral space is fine: this is handled in ltdir_mod/ltinv_mod, because those use acc data copyout. The grid-point space is also fine, because we do the updates.
  • On the other hand, we can use ectrans with PGP, ... already on device. In that case ectrans should not make any copies, otherwise it would overwrite the data on the device. The spectral space is fine again, because acc data copyin/out only makes the copy if the data is not present. The grid-point space should only do the copy if the data was not present before we potentially create it.

In the current main branch, ectrans behaves inconsistently: in the grid-point space it does the copy in any case, while in the spectral space it only copies if the data is not present. This PR changes the behaviour for both spaces to "only copy if not present".
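The "only copy if not present" rule for the grid-point space can be sketched as below. This is a minimal illustration, not the ecTrans source: the module, the routine `double_field`, and the variable `was_present` are hypothetical names, and the kernel is a placeholder. It assumes an OpenACC-capable Fortran compiler and uses only `acc_is_present` and the standard `if` clause on `update`.

```fortran
! Illustrative sketch only: not the ecTrans source.
module copy_if_not_present_sketch
  use openacc, only: acc_is_present
  implicit none
contains

  ! Hypothetical stand-in for a routine such as TRGTOL/TRLTOG.
  subroutine double_field(pgp)
    real, contiguous, intent(inout) :: pgp(:)
    logical :: was_present
    integer :: i

    ! Query presence BEFORE the data region below creates a device copy;
    ! inside the region the array is always present.
    was_present = acc_is_present(pgp)

    ! CREATE allocates device memory only if pgp is not yet present.
    !$acc data create(pgp)

    ! Host -> device transfer only when this routine owns the device copy.
    !$acc update device(pgp) if(.not. was_present)

    !$acc parallel loop present(pgp)
    do i = 1, size(pgp)
      pgp(i) = 2.0 * pgp(i)
    end do

    ! Device -> host transfer under the same condition.
    !$acc update self(pgp) if(.not. was_present)

    !$acc end data
  end subroutine double_field

end module copy_if_not_present_sketch

program demo
  use copy_if_not_present_sketch, only: double_field
  implicit none
  real :: pgp(4)

  ! Mode 1: host-only data; the routine copies in and out itself.
  pgp = 1.0
  call double_field(pgp)

  ! Mode 2: the caller owns the device copy; the routine makes no transfers.
  pgp = 1.0
  !$acc data copy(pgp)
  call double_field(pgp)
  call double_field(pgp)
  !$acc end data

  print *, pgp
end program demo
```

The key design point is that presence is queried before the `create` clause runs, since inside the data region the array is present in either mode.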

@samhatfield
Collaborator

samhatfield commented Aug 19, 2024

Okay, let me put what you're saying in other terms to see if I understand:

Current behaviour

  • DIR_TRANS:
    • Inputs (PGP etc.) always copied from host to device (TRGTOL)
    • Outputs (PSPVOR etc.) always copied from device to host (LTDIR)
  • INV_TRANS:
    • Inputs (PSPVOR etc.) always copied from host to device (LTINV)
    • Outputs (PGP etc.) always copied from device to host (TRLTOG)

New behaviour

  • DIR_TRANS:
    • Inputs (PGP etc.) only copied from host to device when not already present on device (TRGTOL)
    • Outputs (PSPVOR etc.) always copied from device to host (LTDIR)
  • INV_TRANS:
    • Inputs (PSPVOR etc.) always copied from host to device (LTINV)
    • Outputs (PGP etc.) only copied from device to host when not already present on device (TRLTOG)

But writing out all that only raises more questions...

  1. DIR_TRANS: PGP etc. are arguments, not module variables. Hence PGP etc. cannot be made present on device by anyone other than an earlier call to TRGTOL. In that case, they would be present, but would contain stale values. So an update would be required, always.
  2. For INV_TRANS, the first time you call it, PGP etc. won't be present, and PGP etc. will be copied from device to host. The second time, PGP etc. will be present, so PGP etc. won't be copied from device to host. Why would you not want to copy the output back to host? Again, PGP etc. are arguments, so you cannot access them anywhere else.

@lukasm91
Collaborator Author

Let me modify your explanation and correct it (current behaviour is not consistent):

Current behaviour

* `DIR_TRANS`:
  
  * Inputs (`PGP` etc.) _always_ copied from host to device (`TRGTOL`)
  * Outputs (`PSPVOR` etc.) only copied from device to host when not already present on device (`LTDIR`)

* `INV_TRANS`:
  
  * Inputs (`PSPVOR` etc.) only copied from host to device when not already present on device (`LTINV`)
  * Outputs (`PGP` etc.) _always_ copied from device to host (`TRLTOG`)

New behaviour

* `DIR_TRANS`:
  
  * Inputs (`PGP` etc.) only copied from host to device when not already present on device (`TRGTOL`)
  * Outputs (`PSPVOR` etc.) only copied from device to host when not already present on device (`LTDIR`)

* `INV_TRANS`:
  
  * Inputs (`PSPVOR` etc.) only copied from host to device when not already present on device (`LTINV`)
  * Outputs (`PGP` etc.) only copied from device to host when not already present on device (`TRLTOG`)

To answer the other questions:

> 1. `DIR_TRANS`: `PGP` etc. are arguments, not module variables. Hence `PGP` etc. cannot be made present on device by anyone other than an earlier call to `TRGTOL`. In that case, they would be present, but would contain stale values. So an update would be required, always.

Yes, they are arguments. PGP can be made present by the caller:

!$ACC DATA CREATE(PGP, ...)
call dir_trans(...)
call inv_trans(...)
!$ACC END DATA

> 2. For `INV_TRANS`, the first time you call it, `PGP` etc. won't be present, and `PGP` etc. will be copied from device to host. The second time, `PGP` etc. will be present, so `PGP` etc. won't be copied from device to host. Why would you not want to copy the output back to host? Again, `PGP` etc. are arguments, so you cannot access them anywhere else.

Same reason. This is not about the first/second/third call, it is about what happens if the data is managed outside of ectrans. Assume the whole code of IFS is ported to GPU, in this case all data always live on the GPU, including PGP, PSPVOR, ... There should be 0 copies, which is why the "current behaviour" is not what we want.

@wdeconinck
Collaborator

@samhatfield is @lukasm91's explanation sufficient? Is there anything else blocking, besides conflicts that need to be resolved (rebase please)?

@samhatfield
Collaborator

To the best of my memory, I didn't feel confident we could merge this as what Lukas is proposing (which is reasonable of course!) has implications outside of just ecTrans, and influences our overall strategy for device-side memory handling in the IFS. I need to refresh my memory though - let me take a look again.

@wdeconinck
Collaborator

So what Sam understood is that PGP could be present (acc_mapped) on device just because of previous calls to inv_trans or dir_trans.
So a new call to dir_trans will then see that PGP is acc_mapped and assume its values are up to date.

The IFS currently has no notion of any device copy and only works with host data, treating the GPU as an offload-accelerator for ectrans.
The IFS itself should become GPU-aware, so that it can track multiple memory spaces and tell ectrans whether a field is up to date on the GPU or not. I guess that is the real problem. A new inv_trans / dir_trans API would need to be created for this.

Notwithstanding, there are inconsistencies discovered by @lukasm91 in the treatment of PSPVOR that should be corrected. Probably towards the safe side, where the copy is always made?
A new inv_trans/dir_trans API could then cater for the "copy if not present" case?

@lukasm91
Collaborator Author

lukasm91 commented Nov 19, 2024

I could easily rebase this branch if needed, and I can do that if we consider it for merging.

> The IFS currently has no notion of any device copy and only works with host data, treating the GPU as an offload-accelerator for ectrans.

That's fine. If it only works with host data, there is no issue with this implementation. This implementation only causes issues if you call it with PGP already allocated on the GPU while the up-to-date values are on the host, e.g.

!$ACC DATA COPYIN(PGP) COPYOUT(PSP)
!$ACC UPDATE HOST (PGP)
call ectrans
!$ACC UPDATE DEVICE(PSP)
!$ACC END DATA

If you do that you run into trouble, because if the data exists on the device, this implementation assumes that the "live" copy of the data is on the device. But this is not a very obvious use-case and I believe it is reasonable to assume that a user makes the "device" copy alive if a "device" version of data exists.
The option you mention: GPU is offload-accelerator for ectrans - whenever you pass something to GPU, it's gonna be a host-only pointer, and ectrans takes care of it.

The current main implementation assumes that if data exists on the device, for PGP, the live data is the device data, for PSP the live data is the host data, which is certainly inconsistent.

We could revert the behaviour and say: Even if you pass data that exists on the device, ectrans is always assuming it is a host pointer, but that would trigger double copies: The user has to update the host data, in order to call ectrans, which does the device copy.

> So what Sam understood is that PGP could be present (acc_mapped) on device just because of previous calls to inv_trans or dir_trans.

That won't happen. If the present counter of PGP is 0 when we call ectrans, it is incremented when PGP is copied in, and before the end of ectrans it becomes 0 again, so PGP is no longer mapped afterwards. I am happy to explain in more detail how this works.
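The present-counter argument can be checked with a small stand-alone program. This is a sketch under assumptions, not part of the PR: the program name and array size are made up, and it needs an OpenACC-capable compiler. On a device with separate memory, the OpenACC reference-counting rules imply that the array is present only inside the structured data region; on a shared-memory or host-fallback build `acc_is_present` may report true throughout.

```fortran
! Illustrative sketch only: probes presence around a structured data region.
program present_counter_sketch
  use openacc, only: acc_is_present
  implicit none
  real :: pgp(8)

  pgp = 1.0

  ! Nothing has mapped pgp yet, so its present counter is 0.
  print *, 'before region: ', acc_is_present(pgp)

  ! Entering the region increments the counter (and copies in, since it was 0).
  !$acc data copyin(pgp)
  print *, 'inside region: ', acc_is_present(pgp)
  !$acc end data
  ! Leaving the region decrements the counter; at 0 the device copy is released.

  print *, 'after region:  ', acc_is_present(pgp)
end program present_counter_sketch
```

If the caller had wrapped the whole program body in its own data region, the counter would never drop to 0 inside it, which is the situation the "copy if not present" logic is meant to detect.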
