perf: Amortise cost of make_secret for RandomWyHashState (#12)
* perf: Amortise cost of `make_secret` for `RandomWyHashState`

* doc: Fix example for `new_with_secret`
Bluefinger authored Apr 17, 2024
1 parent e436ed7 commit 34512d3
Showing 6 changed files with 114 additions and 56 deletions.
4 changes: 2 additions & 2 deletions Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "wyrand"
version = "0.1.6"
version = "0.2.0"
edition = "2021"
authors = ["Gonçalo Rica Pais da Silva <[email protected]>"]
description = "A fast & portable non-cryptographic pseudorandom number generator and hashing algorithm"
@@ -12,7 +12,7 @@ exclude = ["/.*"]
include = ["src/", "LICENSE-*", "README.md"]
autobenches = true
resolver = "2"
rust-version = "1.60.0"
rust-version = "1.70.0"

[features]
debug = []
8 changes: 4 additions & 4 deletions README.md
@@ -11,7 +11,7 @@ The implementations for both the PRNG and hasher are based on the C reference im

This crate provides both the v4.2 final implementation of the WyRand/WyHash algorithm, or the older final v4 implementation. The two versions have different outputs due to changes in the algorithm and also with the constants used. Currently by default, the older final v4 algorithm will be used. In the future, this will be changed to the newer algorithm to be the default, but the old implementation will remain for backwards compatibility reasons.

This crate can be used on its own or be integrated with `rand_core`/`rand`, and it is `no-std` compatible. Minimum compatible Rust version is 1.60. This crate is also implemented with no unsafe code via `#![forbid(unsafe_code)]`.
This crate can be used on its own or be integrated with `rand_core`/`rand`, and it is `no-std` compatible. Minimum compatible Rust version is 1.70. This crate is also implemented with no unsafe code via `#![forbid(unsafe_code)]`.

## Example

@@ -35,9 +35,9 @@ The crate will always export `WyRand` and will do so when set as `default-featur
- **`serde1`** - Enables `Serialize` and `Deserialize` derives on `WyRand`.
- **`hash`** - Enables `core::hash::Hash` implementation for `WyRand`.
- **`wyhash`** - Enables `WyHash`, a fast & portable hashing algorithm. Based on the final v4 C implementation.
- **`randomised_wyhash`** - Enables `RandomisedWyHashBuilder`, a means to source a randomised state for `WyHash` for use in collections like `HashMap`/`HashSet`. Enables `wyhash` feature if it is not already enabled.
- **`fully_randomised_wyhash`** - Randomises not just the seed for `RandomisedWyHashBuilder`, but also the secret. Incurs a performance hit every time `WyHash` is initialised but it is more secure as a result. Enables `randomised_wyhash` if not already enabled.
- **`threadrng_wyhash`** - Enables sourcing entropy from `rand`'s `thread_rng()` method. Much quicker than `getrandom` and best used without the `fully_randomised_wyhash` flag as the overhead of calculating new secrets dwarfs any gains in entropy sourcing. Enables `randomised_wyhash` if not already enabled. Requires `std` environments.
- **`randomised_wyhash`** - Enables `RandomWyHashState`, a means to source a randomised state for `WyHash` for use in collections like `HashMap`/`HashSet`. Enables `wyhash` feature if it is not already enabled.
- **`fully_randomised_wyhash`** - Randomises not just the seed for `RandomWyHashState`, but also the secret. The new secret is generated once per runtime, and then is used for every subsequent new `WyHash` (with each `WyHash` instance having its own unique seed). Enables `randomised_wyhash` if not already enabled, and requires `std` environments.
- **`threadrng_wyhash`** - Enables sourcing entropy from `rand`'s `thread_rng()` method. Much quicker than `getrandom`. Enables `randomised_wyhash` if not already enabled. Requires `std` environments.
- **`v4_2`** - Switches the PRNG/Hashing algorithms to use the final v4.2 implementation.

## Building for WASM/Web
23 changes: 13 additions & 10 deletions src/hasher.rs
@@ -19,10 +19,9 @@ use crate::{
utils::{wymix, wymul},
};

use self::{
read::{is_over_48_bytes, read_4_bytes, read_8_bytes, read_upto_3_bytes},
secret::make_secret,
};
use self::read::{is_over_48_bytes, read_4_bytes, read_8_bytes, read_upto_3_bytes};

pub use self::secret::make_secret;

/// The WyHash hasher, a fast & portable hashing algorithm. This implementation is
/// based on the final v4/v4.2 C reference implementations (depending on whether the
@@ -58,7 +57,7 @@ pub struct WyHash {
}

impl WyHash {
/// Create hasher with a seed and a newly generated secret
/// Create hasher with seeds for the state and secret (generates a new secret, expensive to compute).
pub const fn new(seed: u64, secret_seed: u64) -> Self {
Self::new_with_secret(seed, make_secret(secret_seed))
}
@@ -69,8 +68,10 @@ impl WyHash {
Self::new_with_secret(seed, [WY0, WY1, WY2, WY3])
}

/// Create hasher with a seed value and a secret. Assumes the user created the secret with [`make_secret`],
/// else the hashing output will be weak/vulnerable.
#[inline]
const fn new_with_secret(mut seed: u64, secret: [u64; 4]) -> Self {
pub const fn new_with_secret(mut seed: u64, secret: [u64; 4]) -> Self {
seed ^= wymix(seed ^ secret[0], secret[1]);

WyHash {
@@ -85,7 +86,7 @@ impl WyHash {
#[inline]
fn consume_bytes(&self, bytes: &[u8]) -> (u64, u64, u64) {
let length = bytes.len();
if length <= 0 {
if length == 0 {
(0, 0, self.seed)
} else if length <= 3 {
(read_upto_3_bytes(bytes), 0, self.seed)
@@ -213,7 +214,9 @@ impl Default for WyHash {
impl Debug for WyHash {
fn fmt(&self, f: &mut core::fmt::Formatter<'_>) -> core::fmt::Result {
// Do not expose the internal state of the Hasher
f.debug_tuple("WyHash").finish()
f.debug_struct("WyHash")
.field("size", &self.size)
.finish_non_exhaustive()
}
}

@@ -234,8 +237,8 @@ mod tests {

assert_eq!(
format!("{rng:?}"),
"WyHash",
"Debug should not be leaking internal state"
"WyHash { size: 0, .. }",
"Debug should not be leaking sensitive internal state"
);
}
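
The `Debug` change above swaps `debug_tuple(...).finish()` for `debug_struct(...).finish_non_exhaustive()`, which prints the chosen fields followed by `..` to signal that others were left out. A minimal sketch of the pattern, assuming a hypothetical `SketchHasher` type (not the crate's `WyHash`) with made-up sensitive fields:

```rust
use std::fmt;

// Hypothetical stand-in for a hasher that holds sensitive state.
// Only `size` is considered safe to print.
#[allow(dead_code)]
pub struct SketchHasher {
    seed: u64,
    secret: [u64; 4],
    size: u64,
}

impl fmt::Debug for SketchHasher {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // List the non-sensitive field, then mark the struct as
        // non-exhaustive so the output ends in `..`.
        f.debug_struct("SketchHasher")
            .field("size", &self.size)
            .finish_non_exhaustive()
    }
}
```

Formatting a value of this type with `{:?}` yields `SketchHasher { size: 0, .. }` when `size` is zero, mirroring the string asserted in the updated test.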

127 changes: 89 additions & 38 deletions src/hasher/builder.rs
@@ -3,25 +3,55 @@ use core::hash::BuildHasher;
#[cfg(feature = "debug")]
use core::fmt::Debug;

#[cfg(feature = "fully_randomised_wyhash")]
use std::sync::OnceLock;

use crate::WyHash;

#[cfg(feature = "fully_randomised_wyhash")]
static SECRET: OnceLock<[u64; 4]> = OnceLock::new();

#[inline]
fn get_random_u64() -> u64 {
#[cfg(not(feature = "threadrng_wyhash"))]
{
const SIZE: usize = core::mem::size_of::<u64>();

let mut state = [0; SIZE];

// Don't bother trying to handle the result. If we can't obtain
// entropy with getrandom, then there is no hope and we might as
// well panic. It is up to the user to ensure getrandom is configured
// correctly for their platform.
getrandom::getrandom(&mut state)
.expect("Failed to source entropy for WyHash randomised state");

u64::from_ne_bytes(state)
}
#[cfg(feature = "threadrng_wyhash")]
{
use rand::RngCore;

// This is faster than doing `.fill_bytes()`. User-space entropy goes brrr.
rand::thread_rng().next_u64()
}
}

#[cfg_attr(docsrs, doc(cfg(feature = "randomised_wyhash")))]
#[derive(Clone, Copy)]
#[repr(align(8))]
/// Randomised state constructor for [`WyHash`]. This builder will source entropy in order
/// to provide random seeds for [`WyHash`]. This will yield a hasher with not just a random
/// seed, but also a new random secret, granting extra protection against DOS and prediction
/// attacks.
/// to provide random seeds for [`WyHash`]. If the `fully_randomised_wyhash` feature is enabled,
/// this will yield a hasher with not just a random seed, but also a new random secret,
/// granting extra protection against DOS and prediction attacks.
pub struct RandomWyHashState {
#[cfg(feature = "fully_randomised_wyhash")]
state: [u8; 16],
#[cfg(not(feature = "fully_randomised_wyhash"))]
state: [u8; 8],
state: u64,
secret: [u64; 4],
}

impl RandomWyHashState {
/// Create a new [`RandomWyHashState`] instance. Calling this method will attempt to
/// draw entropy from hardware/OS sources.
/// draw entropy from hardware/OS sources. If `fully_randomised_wyhash` feature is enabled,
/// then it will use a randomised `secret` as well, otherwise it uses the default wyhash constants.
///
/// # Panics
///
@@ -38,27 +68,47 @@ impl RandomWyHashState {
/// let mut hasher = s.build_hasher(); // Creates a WyHash instance with random state
/// ```
#[must_use]
#[inline]
pub fn new() -> Self {
#[cfg(feature = "fully_randomised_wyhash")]
const SIZE: usize = core::mem::size_of::<u64>() * 2;
use crate::hasher::secret::make_secret;

#[cfg(not(feature = "fully_randomised_wyhash"))]
const SIZE: usize = core::mem::size_of::<u64>();
use crate::constants::{WY0, WY1, WY2, WY3};

let mut state = [0; SIZE];
#[cfg(feature = "fully_randomised_wyhash")]
let secret = *SECRET.get_or_init(|| make_secret(get_random_u64()));
#[cfg(not(feature = "fully_randomised_wyhash"))]
let secret = [WY0, WY1, WY2, WY3];

#[cfg(not(feature = "threadrng_wyhash"))]
{
getrandom::getrandom(&mut state)
.expect("Failed to source entropy for WyHash randomised state");
}
#[cfg(feature = "threadrng_wyhash")]
{
use rand::RngCore;
Self::new_with_secret(secret)
}

rand::thread_rng().fill_bytes(&mut state);
/// Create a new [`RandomWyHashState`] instance with a provided secret. Calling this method
/// will attempt to draw entropy from hardware/OS sources, and assumes the user provided the
/// secret via [`super::secret::make_secret`].
///
/// # Panics
///
/// This method will panic if it was unable to source enough entropy.
///
/// # Examples
///
/// ```
/// use wyrand::{RandomWyHashState, make_secret};
/// use core::hash::BuildHasher;
///
/// let s = RandomWyHashState::new_with_secret(make_secret(42));
///
/// let mut hasher = s.build_hasher(); // Creates a WyHash instance with random state
/// ```
#[must_use]
#[inline]
pub fn new_with_secret(secret: [u64; 4]) -> Self {
Self {
state: get_random_u64(),
secret,
}

Self { state }
}
}

@@ -67,21 +117,7 @@ impl BuildHasher for RandomWyHashState {

#[inline]
fn build_hasher(&self) -> Self::Hasher {
#[cfg(feature = "fully_randomised_wyhash")]
{
let (first_seed, second_seed) = self.state.split_at(core::mem::size_of::<u64>());

let first_seed = u64::from_ne_bytes(first_seed.try_into().unwrap());
let second_seed = u64::from_ne_bytes(second_seed.try_into().unwrap());

WyHash::new(first_seed, second_seed)
}
#[cfg(not(feature = "fully_randomised_wyhash"))]
{
let seed = u64::from_ne_bytes(self.state);

WyHash::new_with_default_secret(seed)
}
WyHash::new_with_secret(self.state, self.secret)
}
}

@@ -127,5 +163,20 @@ mod tests {

// The two builders' internal states are different to each other
assert_ne!(&builder1.state, &builder2.state);

// The two builders' internal secrets are the same to each other
assert_eq!(&builder1.secret, &builder2.secret);

// When fully randomised, the generated secrets should not be the
// same as the default secret.
#[cfg(feature = "fully_randomised_wyhash")]
{
use crate::constants::{WY0, WY1, WY2, WY3};

let default_secret = [WY0, WY1, WY2, WY3];

assert_ne!(&builder1.secret, &default_secret);
assert_ne!(&builder2.secret, &default_secret);
}
}
}
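
The amortisation in this file rests on a `OnceLock`: the expensive secret is derived on first use and copied into every later builder, while each builder still draws its own seed. A rough, std-only sketch of that shape follows. Everything in it is an assumption for illustration: `SketchState`, `SketchHasher`, `expensive_secret` and `fake_random_u64` are hypothetical names, the entropy source is a plain counter rather than real randomness, and the toy hasher is not WyHash.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasher, Hasher};
use std::sync::atomic::{AtomicU64, AtomicUsize, Ordering};
use std::sync::OnceLock;

// Counts how often the stand-in secret derivation actually runs.
static SECRET_INITS: AtomicUsize = AtomicUsize::new(0);
// Process-wide secret, filled in lazily by the first builder.
static SECRET: OnceLock<[u64; 4]> = OnceLock::new();
// Stand-in entropy: a counter, NOT random. Real code would ask the OS.
static FAKE_ENTROPY: AtomicU64 = AtomicU64::new(0x9E37_79B9_7F4A_7C15);

fn fake_random_u64() -> u64 {
    // Atomic addition wraps on overflow, so every call returns a new value.
    FAKE_ENTROPY.fetch_add(0x9E37_79B9_7F4A_7C15, Ordering::Relaxed)
}

// Hypothetical stand-in for a costly secret derivation.
fn expensive_secret(seed: u64) -> [u64; 4] {
    SECRET_INITS.fetch_add(1, Ordering::Relaxed);
    [
        seed | 1,
        seed.rotate_left(16) | 1,
        seed.rotate_left(32) | 1,
        seed.rotate_left(48) | 1,
    ]
}

#[derive(Clone, Copy)]
pub struct SketchState {
    state: u64,
    secret: [u64; 4],
}

impl SketchState {
    pub fn new() -> Self {
        // The closure runs at most once per process; later calls copy the result.
        let secret = *SECRET.get_or_init(|| expensive_secret(fake_random_u64()));
        Self { state: fake_random_u64(), secret }
    }
}

// Toy hasher, only here so the builder can plug into a HashMap.
pub struct SketchHasher {
    acc: u64,
    secret: [u64; 4],
}

impl Hasher for SketchHasher {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.acc = (self.acc ^ u64::from(b)).wrapping_mul(self.secret[0]);
        }
    }

    fn finish(&self) -> u64 {
        self.acc ^ self.secret[1]
    }
}

impl BuildHasher for SketchState {
    type Hasher = SketchHasher;

    // Cheap: no secret derivation happens here, only two copies.
    fn build_hasher(&self) -> SketchHasher {
        SketchHasher { acc: self.state, secret: self.secret }
    }
}

// Usage: the builder is handed to HashMap like any other BuildHasher.
pub fn demo() -> Option<i32> {
    let mut map = HashMap::with_hasher(SketchState::new());
    map.insert("key", 1);
    map.get("key").copied()
}
```

The design point is that `build_hasher` becomes a pair of copies, so the per-hasher cost no longer includes deriving a secret. The trade-off, as the README change notes, is that all hashers in one process share a secret and differ only in their seeds.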
5 changes: 3 additions & 2 deletions src/hasher/secret.rs
@@ -3,8 +3,9 @@ use crate::WyRand;
#[cfg(feature = "v4_2")]
use crate::hasher::primes::is_prime;

/// Generate new secret for wyhash
pub(super) const fn make_secret(mut seed: u64) -> [u64; 4] {
/// Generate new secret for wyhash. Takes a seed value and outputs an array of 4 suitable `u64` constants
/// for use with the hasher. The PRNG will always use the default constants provided.
pub const fn make_secret(mut seed: u64) -> [u64; 4] {
const C_VALUES: &[u8] = &[
15, 23, 27, 29, 30, 39, 43, 45, 46, 51, 53, 54, 57, 58, 60, 71, 75, 77, 78, 83, 85, 86, 89,
90, 92, 99, 101, 102, 105, 106, 108, 113, 114, 116, 120, 135, 139, 141, 142, 147, 149, 150,
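
To give a feel for why generating a secret is costly enough to be worth caching, here is a loose sketch of a rejection-sampling scheme in the spirit of the wyhash C reference. It is not the crate's `make_secret`. It assumes the `C_VALUES` table consists of the bytes with exactly four bits set (the visible prefix 15, 23, 27, 29, 30, ... fits that pattern), it uses splitmix64 as a stand-in PRNG instead of `WyRand`, and it leaves out the primality check that the `v4_2` build appears to add through `is_prime`. The names `sketch_make_secret`, `balanced_bytes` and `next_u64` are hypothetical.

```rust
// Stand-in PRNG (splitmix64), NOT the crate's WyRand.
fn next_u64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

// All bytes with exactly four bits set: 15, 23, 27, 29, 30, ...
// Assumed to match the table in the diff; computed rather than copied.
fn balanced_bytes() -> Vec<u8> {
    (0u8..=255).filter(|b| b.count_ones() == 4).collect()
}

pub fn sketch_make_secret(mut seed: u64) -> [u64; 4] {
    let table = balanced_bytes();
    let mut secret = [0u64; 4];
    let mut i = 0;

    while i < 4 {
        // Assemble a 64-bit candidate from eight balanced bytes.
        let mut candidate = 0u64;
        for shift in (0..64).step_by(8) {
            let idx = (next_u64(&mut seed) % table.len() as u64) as usize;
            candidate |= u64::from(table[idx]) << shift;
        }

        // Keep only odd candidates that differ from every previously
        // accepted entry in exactly 32 bit positions; otherwise retry.
        let odd = candidate & 1 == 1;
        let spread = secret[..i]
            .iter()
            .all(|&s| (s ^ candidate).count_ones() == 32);

        if odd && spread {
            secret[i] = candidate;
            i += 1;
        }
    }

    secret
}
```

Because later entries must satisfy the distance constraint against all earlier ones, most candidates are rejected and the loop typically runs many times per call. That retry cost is what makes a one-off derivation, reused across hashers, attractive.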
3 changes: 3 additions & 0 deletions src/lib.rs
@@ -5,6 +5,9 @@
#![no_std]
#![doc = include_str!("../README.md")]

#[cfg(feature = "fully_randomised_wyhash")]
extern crate std;

mod constants;
#[cfg(feature = "wyhash")]
mod hasher;