Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update master to most recent dev #54

Merged
merged 75 commits into from
May 30, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
75 commits
Select commit Hold shift + click to select a range
1028c5c
Turn memset to async variant in split tools compactions
kstppd Jan 6, 2024
07d1ee4
Added some overloads, increased use of streams in methods
markusbattarbee Jan 8, 2024
922d248
Added option of providing device-side memory buffer to use in extract…
markusbattarbee Jan 11, 2024
bec8da5
Merge branch 'reuse_allocation_for_extract_keys' into use
markusbattarbee Jan 11, 2024
7ef9466
Add unit test for extractKeysByPattern overload
kstppd Jan 11, 2024
dae6d11
Fix to warp wide erase
kstppd Jan 17, 2024
57116ea
commented out erase optimizeGPU
markusbattarbee Jan 30, 2024
15b1d99
Merge branch 'dev' into use
markusbattarbee Jan 30, 2024
e63a493
Add prefetching to stream compactions and fix issue where compactions…
kstppd Feb 1, 2024
d21a3dd
Fix splitvec's .data() method to not dereferecne with no reason
kstppd Feb 1, 2024
3afb0ee
fixes to sizes, prefetches, and asserts
markusbattarbee Feb 1, 2024
b1e3eda
Merge branch 'dev' into use
markusbattarbee Feb 1, 2024
a7dde83
removed one comment
markusbattarbee Feb 1, 2024
453542a
Merge branch 'use' into markus_fix
markusbattarbee Feb 1, 2024
19a344a
Added back CPU-only methods without streams
markusbattarbee Feb 1, 2024
b24806a
Fix copy-paste error
markusbattarbee Feb 1, 2024
913ab95
Merge pull request #37 from markusbattarbee/markus_fix
kstppd Feb 1, 2024
e0c4392
Remove nodiscard options which are breaking compilation on lumi
markusbattarbee Feb 6, 2024
6e964cd
Merge pull request #39 from markusbattarbee/fix_lumni_nodiscard
kstppd Feb 6, 2024
f266573
Reduce stream syncs in split tools
kstppd Feb 7, 2024
4b360ae
WIP in faster stream compactions
kstppd Feb 19, 2024
73d152c
Revert a few stream syncs that are actually needed
kstppd Feb 19, 2024
e6f1ad3
Block compaction with one kernel up to 1024 elements
kstppd Mar 1, 2024
99f0c52
Remove sync
kstppd Mar 1, 2024
602108c
Rename Cuda_mempool to SplitStackArena
kstppd Mar 1, 2024
ce93cb0
Merge pull request #48 from kstppd/fasterCompactions
kstppd Mar 8, 2024
9e0a83c
Remove forgotten optimize call and add conditional prefetches to retr…
kstppd Mar 15, 2024
a980b63
first attempt at simple looping reduction kernel on cuda
markusbattarbee Mar 24, 2024
4e6628a
Added loop reduction kernels
markusbattarbee Mar 24, 2024
d641b49
add loop-extraction of key-value pairs (untested)
markusbattarbee Mar 24, 2024
28d6cc4
Added new extraction loop accessor for all valid key-value pairs
markusbattarbee Mar 26, 2024
376e9f6
Add methods that return emptybucket and tombstone values
kstppd Mar 26, 2024
3cf6b02
Grow size of target vectors in loop reductions by blocksize at a time
markusbattarbee Mar 28, 2024
1968e14
Fix signed-unsigned error which was causing erroneous outputs from wa…
markusbattarbee Mar 28, 2024
5e4590d
Use asserts after all in loop reductions if insufficient capacity
markusbattarbee Apr 1, 2024
3f214f5
Fix signed-unsigned error which was causing erroneous outputs from wa…
markusbattarbee Mar 28, 2024
ab4d280
Bring also HIP loop kernels up to speed
markusbattarbee Apr 2, 2024
859b2e7
Fixes to loop reductions, made both loop and block reductions single-…
markusbattarbee Apr 3, 2024
99d285a
Fix namespace issue with Blocksize in new loop variant compactions. A…
kstppd Apr 3, 2024
51d6c1c
Test on the same number of elements
kstppd Apr 3, 2024
05387e3
Namespace fixes; change loop reduction resize to be only required ins…
markusbattarbee Apr 3, 2024
91d5e6f
Merge branch 'loop_variant_kp' into loop_reductions
markusbattarbee Apr 3, 2024
cf28b06
Fix test to use UM. Guard against loop reductions using host memory
kstppd Apr 4, 2024
1631477
Sync it.
kstppd Apr 4, 2024
b1b59d2
Add support for HIP in loop variant compaction checks
kstppd Apr 6, 2024
acee2ef
Merge pull request #3 from kstppd/loop_variant_kp
markusbattarbee Apr 9, 2024
a702a9b
Merge pull request #49 from markusbattarbee/loop_reductions
kstppd Apr 9, 2024
d2ea45f
clean reallocation, add device_resize option to not clear/construct
markusbattarbee Apr 10, 2024
2400e5a
modify fill in-kernel to reduce paging back and forth
kstppd Apr 11, 2024
aa8daa5
delete forgotten fill reset, keep check only if in debug mode
kstppd Apr 11, 2024
a9d1a92
Different apparoch for splitvector upload
markusbattarbee Apr 11, 2024
9943af0
Merge branch 'in_kernel_fill_resets' into device_resize_no_construct
markusbattarbee Apr 11, 2024
39c2d21
Merge pull request #51 from kstppd/in_kernel_fill_resets
kstppd Apr 11, 2024
2896c0f
switch cleanup task order
markusbattarbee Apr 11, 2024
828e11f
Merge branch 'dev' into device_resize_no_construct
markusbattarbee Apr 11, 2024
20dc6bc
Added device_buckets accessor to hashinator
markusbattarbee Apr 16, 2024
59f7afa
Added a few prefetches to overwrite accessor
markusbattarbee Apr 19, 2024
246ea19
reverted cleanup tombstone ratio cutoff point due to hangs
markusbattarbee Apr 25, 2024
5bf1a0b
Fix include
kstppd May 27, 2024
03ebd10
automatically upload device_buckets when resizing. Made prefetches fl…
markusbattarbee May 28, 2024
b45ccee
compile fix and pass prefetches flag via performCleanupTasks
markusbattarbee May 28, 2024
ee32b91
Merge pull request #52 from markusbattarbee/device_resize_no_construct
kstppd May 28, 2024
f45ff8b
Place GPU commands inside ifdefs to allow CPU mode again. Reverted de…
markusbattarbee May 28, 2024
f2f33e6
Found culprit causing issues in Vlasiator and re-implemented device r…
markusbattarbee May 28, 2024
5bf8de0
Fix ifdef compilation error. Prefer nullptr instead of NULL
kstppd May 28, 2024
0f57fdd
Clang format pass
kstppd May 28, 2024
d9379ec
Make device_pointer const
kstppd May 28, 2024
a2cf21f
Few stylistic changed
kstppd May 28, 2024
2cc5a48
also add CPU-only ifdef to splitvec
markusbattarbee May 29, 2024
1fafa6e
Merge branch 'device_resize_no_construct' into fixes_premerge
markusbattarbee May 29, 2024
cbb899a
Merge pull request #4 from kstppd/fixes_premerge
markusbattarbee May 30, 2024
2dc8472
Added error checking to unit tests, removed double free of device spl…
markusbattarbee May 30, 2024
ba838da
Reverted alloc_multiplier behaviour in reserve, made upload() optimiz…
markusbattarbee May 30, 2024
550be65
Merge pull request #53 from markusbattarbee/device_resize_no_construct
kstppd May 30, 2024
5ecd282
Clang format pass
kstppd May 30, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
24 changes: 23 additions & 1 deletion include/common.h
Original file line number Diff line number Diff line change
Expand Up @@ -50,12 +50,34 @@ constexpr inline size_t nextPow2(size_t v) noexcept {
* @brief Computes the next optimal overflow for the hasher kernels
*/
HASHINATOR_HOSTDEVICE
[[nodiscard]]
//[[nodiscard]]
constexpr inline size_t nextOverflow(size_t currentOverflow, size_t virtualWarp) noexcept {
size_t remainder = currentOverflow % virtualWarp;
return ((remainder)==0)?currentOverflow: currentOverflow + (virtualWarp - remainder);
}

inline bool isDeviceAccessible(void* ptr){
#ifdef __NVCC__
cudaPointerAttributes attributes;
cudaPointerGetAttributes(&attributes, ptr);
if (attributes.type != cudaMemoryType::cudaMemoryTypeManaged &&
attributes.type != cudaMemoryType::cudaMemoryTypeDevice) {
return false;
}
return true;
#endif

#ifdef __HIP__
hipPointerAttribute_t attributes;
hipPointerGetAttributes(&attributes, ptr);
if (attributes.type != hipMemoryType::hipMemoryTypeManaged &&
attributes.type != hipMemoryType::hipMemoryTypeDevice) {
return false;
}
return true;
#endif
}

/**
* @brief Enum for error checking in Hahsinator.
*/
Expand Down
13 changes: 13 additions & 0 deletions include/hashinator/defaults.h
Original file line number Diff line number Diff line change
Expand Up @@ -40,4 +40,17 @@ constexpr int MAX_BLOCKSIZE = 1024;
template <typename T>
using DefaultHashFunction = HashFunctions::Fibonacci<T>;
} // namespace defaults

struct Info {
Info(){};
Info(int sz)
: sizePower(sz), fill(0), currentMaxBucketOverflow(defaults::BUCKET_OVERFLOW), tombstoneCounter(0),
err(status::invalid) {}
int sizePower;
size_t fill;
size_t currentMaxBucketOverflow;
size_t tombstoneCounter;
status err;
};

} // namespace Hashinator
10 changes: 6 additions & 4 deletions include/hashinator/hashers.h
Original file line number Diff line number Diff line change
Expand Up @@ -143,20 +143,22 @@ class Hasher {

// Reset wrapper
static void reset(hash_pair<KEY_TYPE, VAL_TYPE>* src, hash_pair<KEY_TYPE, VAL_TYPE>* dst, const int sizePower,
size_t maxoverflow, size_t len, split_gpuStream_t s = 0) {
size_t maxoverflow, Hashinator::Info* info, size_t len, split_gpuStream_t s = 0) {
size_t blocks, blockSize;
launchParams(len, blocks, blockSize);
Hashinator::Hashers::reset_to_empty<KEY_TYPE, VAL_TYPE, EMPTYBUCKET, HashFunction, defaults::WARPSIZE,
elementsPerWarp>
<<<blocks, blockSize, 0, s>>>(src, dst, sizePower, maxoverflow, len);
<<<blocks, blockSize, 0, s>>>(src, dst, sizePower, maxoverflow, info, len);
SPLIT_CHECK_ERR(split_gpuStreamSynchronize(s));
}

// Reset wrapper for all elements
static void reset_all(hash_pair<KEY_TYPE, VAL_TYPE>* dst, size_t len, split_gpuStream_t s = 0) {
static void reset_all(hash_pair<KEY_TYPE, VAL_TYPE>* dst, Hashinator::Info* info, size_t len,
split_gpuStream_t s = 0) {
size_t blocksNeeded = len / defaults::MAX_BLOCKSIZE;
blocksNeeded = blocksNeeded + (blocksNeeded == 0);
reset_all_to_empty<KEY_TYPE, VAL_TYPE, EMPTYBUCKET><<<blocksNeeded, defaults::MAX_BLOCKSIZE, 0, s>>>(dst, len);
reset_all_to_empty<KEY_TYPE, VAL_TYPE, EMPTYBUCKET>
<<<blocksNeeded, defaults::MAX_BLOCKSIZE, 0, s>>>(dst, info, len);
SPLIT_CHECK_ERR(split_gpuStreamSynchronize(s));
}

Expand Down
Loading
Loading