Data Structures in C
Suppose you have an array of structs, where each struct has members that point to arrays, and you want to move the whole structure to the device.
typedef struct{
float *a, *b;
int n;
}mystruct;
. . .
mystruct *x;
. . .
#pragma acc data copy( x[0:nn] )
{
. . . // what gets copied here?
}
This probably won't do what you want. It copies x[0], x[1], x[2], ..., x[nn-1] from the host to the device, including the host pointer values stored in members a and b, which are not valid addresses on the device. You wanted the arrays that a and b point to copied as well, and that takes much more work. You want to (1) copy the array x, (2) for each element of x, copy the arrays that its a and b members point to, and (3) attach the device copy of the pointer x[i].a to the device copy of the array it points to, and similarly for b. That is 2*nn+1 copy operations and 2*nn attach operations.
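To make the problem concrete, here is a minimal host-only sketch in plain C; it does not use OpenACC at all. It stands in for device memory with an ordinary malloc'd buffer, which is an assumption made purely for illustration, and `shallow_copy` and `count_deep_copy_ops` are hypothetical names, not OpenACC routines.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    float *a, *b;
    int n;
} mystruct;

/* Simulate what copy(x[0:nn]) transfers: the raw bytes of the struct
 * array and nothing else. The returned buffer stands in for device
 * memory (an assumption for illustration only). */
mystruct *shallow_copy(const mystruct *x, int nn)
{
    assert(nn >= 0);
    mystruct *dx = (mystruct *)malloc(sizeof(mystruct) * (size_t)nn);
    if (dx != NULL)
        memcpy(dx, x, sizeof(mystruct) * (size_t)nn);
    return dx; /* dx[i].a and dx[i].b still hold the host addresses */
}

/* Count the operations a full deep copy needs: one copy for x itself,
 * plus one copy and one attach per pointer member per element. */
void count_deep_copy_ops(int nn, int *copies, int *attaches)
{
    *copies = 2 * nn + 1;
    *attaches = 2 * nn;
}
```

After `shallow_copy`, every `dx[i].a` compares equal to the host pointer `x[i].a`. On a real device with separate memory, dereferencing that value would be invalid, which is why steps (2) and (3) are needed.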
There are several ways to do this. One method is standard-conforming and uses API routines to move the data:
mystruct* dx; // device copy of x
dx = (mystruct*)acc_copyin( x, sizeof(mystruct)*nn ); // copy nn elements of x;
for( i = 0; i < nn; ++i ){
float* da, *db;
da = (float*)acc_copyin( x[i].a, sizeof(float)*x[i].n ); // copy x[i].a array
acc_memcpy_to_device( &dx[i].a, &da, sizeof(float*) ); // attach x[i].a
db = (float*)acc_copyin( x[i].b, sizeof(float)*x[i].n ); // copy x[i].b
acc_memcpy_to_device( &dx[i].b, &db, sizeof(float*) ); // attach x[i].b
}
Now, when you want the data to come out, it's important NOT to copy out the array x itself from the device to the host, since the device copy of x has its pointer members pointing to device memory, which is invalid on the host.
for( i = 0; i < nn; ++i ){
acc_update_self( &x[i].n, sizeof(int) ); // update nonpointer members
acc_copyout( x[i].a, sizeof(float)*x[i].n ); // copy x[i].a array and delete
acc_copyout( x[i].b, sizeof(float)*x[i].n ); // copy x[i].b array and delete
}
acc_delete( x, sizeof(mystruct)*nn );
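Because the copy-in and copy-out loops above must mirror each other exactly, it can help to package them as a matched pair of functions. The following is a sketch, not a definitive implementation: `mystruct_copyin` and `mystruct_copyout` are hypothetical helper names, and the `#ifdef _OPENACC` guard (a macro that OpenACC compilers define) is added here so the same file also builds as plain host code, where the helpers do nothing.

```c
#include <stddef.h>
#ifdef _OPENACC
#include <openacc.h>
#endif

typedef struct {
    float *a, *b;
    int n;
} mystruct;

/* Deep-copy x[0:nn] and every a/b array to the device, then patch the
 * device copies of the pointers so they refer to device memory. */
void mystruct_copyin(mystruct *x, int nn)
{
#ifdef _OPENACC
    mystruct *dx = (mystruct *)acc_copyin(x, sizeof(mystruct) * (size_t)nn);
    for (int i = 0; i < nn; ++i) {
        float *da = (float *)acc_copyin(x[i].a, sizeof(float) * (size_t)x[i].n);
        acc_memcpy_to_device(&dx[i].a, &da, sizeof(float *)); /* attach a */
        float *db = (float *)acc_copyin(x[i].b, sizeof(float) * (size_t)x[i].n);
        acc_memcpy_to_device(&dx[i].b, &db, sizeof(float *)); /* attach b */
    }
#else
    (void)x; (void)nn; /* host-only build: nothing to move */
#endif
}

/* Bring the arrays back and release device memory. The struct array
 * itself is deleted, never copied out, so the host pointers survive. */
void mystruct_copyout(mystruct *x, int nn)
{
#ifdef _OPENACC
    for (int i = 0; i < nn; ++i) {
        acc_update_self(&x[i].n, sizeof(int)); /* nonpointer member */
        acc_copyout(x[i].a, sizeof(float) * (size_t)x[i].n);
        acc_copyout(x[i].b, sizeof(float) * (size_t)x[i].n);
    }
    acc_delete(x, sizeof(mystruct) * (size_t)nn);
#else
    (void)x; (void)nn; /* host-only build: nothing to move */
#endif
}
```

A caller would invoke `mystruct_copyin(x, nn)` before the compute regions that use x and `mystruct_copyout(x, nn)` afterwards. Either way, the host values of x[i].a and x[i].b are the same before and after the pair of calls.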
The PGI compiler has two extensions to simplify this. The first is a new API routine, acc_attach, which updates the device copy of a pointer with the device address of its pointee. The first loop above, which does the copyin, would be simplified (somewhat) to:
acc_copyin( x, sizeof(mystruct)*nn ); // copy nn elements of x;
for( i = 0; i < nn; ++i ){
acc_copyin( x[i].a, sizeof(float)*x[i].n ); // copy x[i].a array
acc_attach( &x[i].a ); // attach x[i].a
acc_copyin( x[i].b, sizeof(float)*x[i].n ); // copy x[i].b
acc_attach( &x[i].b ); // attach x[i].b
}
The second extension applies to data clauses: if the data being moved is reached through a struct member, and the struct itself is already present on the device, the device copy of the member pointer is attached to the device copy of its pointee. The corresponding code using directives would be:
#pragma acc enter data copyin(x[0:nn])
for( i = 0; i < nn; ++i ){
#pragma acc enter data copyin( x[i].a[0:x[i].n] )
#pragma acc enter data copyin( x[i].b[0:x[i].n] )
}
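The enter data directives above only cover the trip to the device. A sketch of the full lifecycle, including a compute region and the matching exit data directives, might look like the following. It assumes the attach-on-copyin behavior described above, and further assumes that the corresponding exit data directives undo the attachment before releasing the member arrays. `scale_all` is a hypothetical function name. When compiled without OpenACC the pragmas are ignored and the code runs on the host.

```c
typedef struct {
    float *a, *b;
    int n;
} mystruct;

/* Compute b = s * a for every element of x. */
void scale_all(mystruct *x, int nn, float s)
{
    /* Copy the struct array first, then each member array; the member
     * copies are assumed to attach to the already-present structs. */
    #pragma acc enter data copyin(x[0:nn])
    for (int i = 0; i < nn; ++i) {
        #pragma acc enter data copyin(x[i].a[0:x[i].n])
        #pragma acc enter data copyin(x[i].b[0:x[i].n])
    }

    /* x and its member arrays are already on the device. */
    #pragma acc parallel loop present(x[0:nn])
    for (int i = 0; i < nn; ++i) {
        for (int j = 0; j < x[i].n; ++j)
            x[i].b[j] = s * x[i].a[j];
    }

    /* Release in the reverse order: member arrays first, then x.
     * x itself is deleted, not copied out, so the host pointers in
     * x[i].a and x[i].b are never overwritten with device addresses. */
    for (int i = 0; i < nn; ++i) {
        #pragma acc exit data delete(x[i].a[0:x[i].n])
        #pragma acc exit data copyout(x[i].b[0:x[i].n])
    }
    #pragma acc exit data delete(x[0:nn])
}
```

Only the b arrays are copied out because they are the only data the compute region modifies; the a arrays and the struct array are simply deleted.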
If these extensions prove useful, we will consider proposing them to the OpenACC committee in the future.