Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

a good design #1

Open
ktprime opened this issue Feb 6, 2022 · 4 comments
Open

a good design #1

ktprime opened this issue Feb 6, 2022 · 4 comments

Comments

@ktprime
Copy link

ktprime commented Feb 6, 2022

hi,
I have added your qc-hash into my benchmark (https://github.com/ktprime/emhash/blob/master/bench/ebench.cpp),
some cases qc-hash is very fast but others quite slow. and it can not run(hanged) and martin bench(martin_bench.cpp).
can you fix the issuse?
I add my owern emhash7/emhash5 into your bench file benchmark.cpp(https://github.com/ktprime/emhash/blob/master/bench/qbench.cpp), it's a good benchmark.

@Daskie
Copy link
Owner

Daskie commented Feb 8, 2022

Interesting, thanks for taking the time to add this to your benchmark. I have not encountered any hanging in my own tests, but if you can provide the operation and kind of data that makes it hang, I would love to fix any bugs I may have missed.

As a side note, I noticed you're using std::hash with the map in your benchmark. I recommend using qc::hash::IdentityHash (the default) for keys with reasonable low-order entropy (which I think your random gamma distributed keys are) or qc::hash::FastHash otherwise. I would expect either to yield a notable step-up in speed.

@ktprime
Copy link
Author

ktprime commented Feb 9, 2022

qc-hash is slow in insert-erase case. (ex https://martin.ankerl.com/2019/04/01/hashmap-benchmarks-03-03-result-RandomInsertErase/), you can test and reproduce https://github.com/ktprime/emhash/blob/master/bench/martin_bench.cpp (case 6 bench_insert, which is copyed from martin bench)
mybe the main reason is simple linear probing(I use 2-3 complex probling).

how to build my bench on bench dir
hyb@PC210500096:/mnt/f/emhash/bench$ clang++ -O3 -march=native -mtune=native -I.. -I../thirdparty -DQC_HASH=2 -std=c++20 martin_bench.cpp -o mbenc

or
clang++ -O3 -march=native -I.. -I../thirdparty -DFIB_HASH=2 -DQC_HASH=2 -std=c++20 ebench.cpp -o ebench

my bench can choose 6 hash functions by set FIB_HASH=(1-6) in build cmd.

template<typename T>
struct Int64Hasher
{
    static constexpr uint64_t KC = UINT64_C(11400714819323198485);
    inline std::size_t operator()(T key) const
    {
#if FIB_HASH == 1
        return key;
#elif FIB_HASH == 2
        return hashfib(key);
#elif FIB_HASH == 3
        return hash_mur3(key);
#elif FIB_HASH == 4
        return hashmix(key);
#elif FIB_HASH == 5
        return rrxmrrxmsx_0(key);
#elif FIB_HASH > 10000
        return key % FIB_HASH; //bad hash
#elif FIB_HASH > 100
        return key / FIB_HASH; //bad hash
#elif FIB_HASH == 6
        return wyhash64(key, KC);
#else
        auto x = key;
        x = (x ^ (x >> 30)) * UINT64_C(0xbf58476d1ce4e5b9);
        x = (x ^ (x >> 27)) * UINT64_C(0x94d049bb133111eb);
        x = x ^ (x >> 31);
        return x;
#endif
    }
};

@Daskie
Copy link
Owner

Daskie commented Feb 18, 2022

Thanks for the feedback! That's not a performance case I had considered. When I get some time I'll expand my benchmarking and decide whether that case is worth the increased complexity of a more sophisticated probing strategy, like the one you've come up with

@ktprime
Copy link
Author

ktprime commented Jun 18, 2022

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants