core::simd-only crate #107

Draft: wants to merge 87 commits into base: main
Commits
8fc060d
portable wip
hkratz Oct 20, 2024
22903f8
wip
hkratz Oct 20, 2024
633cd46
disable edition 2024 for now
hkratz Oct 20, 2024
051c22d
Merge branch 'main' into portable-only
hkratz Oct 20, 2024
a4c2eb6
wip
hkratz Oct 20, 2024
9a239d4
rm flexpect
hkratz Oct 20, 2024
b7b762a
wip
hkratz Oct 20, 2024
9e3f8eb
benchmark for portable simdutf8
hkratz Oct 22, 2024
4839d1f
assembly baseline
hkratz Oct 22, 2024
1ff0037
Merge branch 'main' into portable-only
hkratz Oct 22, 2024
906fb2d
don't check in Cargo.toml for lib
hkratz Oct 22, 2024
4076f77
Cargo.toml for benchmark
hkratz Oct 22, 2024
465b29f
update bench deps
hkratz Oct 22, 2024
ed8e6e1
portable basic tentative impl
hkratz Oct 22, 2024
865f675
fix
hkratz Oct 22, 2024
ef96a12
add tests
hkratz Oct 22, 2024
47abea7
clippy
hkratz Oct 22, 2024
365d671
clippy
hkratz Oct 22, 2024
0f6e5f6
clippy
hkratz Oct 22, 2024
59dfaae
move unsafe around
hkratz Oct 22, 2024
a41d672
updated bench lock file
hkratz Oct 22, 2024
3e4b66f
don't use custom memcpy
hkratz Oct 22, 2024
c5117f9
upd
hkratz Oct 22, 2024
b2b3314
simplify
hkratz Oct 22, 2024
de5f06b
simplify
hkratz Oct 22, 2024
4373047
upd
hkratz Oct 22, 2024
fbd4207
masked load experimentation
hkratz Oct 23, 2024
69dad0f
more experiments
hkratz Oct 23, 2024
7eb9ab0
wip
hkratz Oct 24, 2024
898d124
new baseline asm
hkratz Oct 24, 2024
60c78f6
Merge branch 'main' into portable-only
hkratz Oct 24, 2024
da72ebf
Merge branch 'main' into portable-only
hkratz Oct 24, 2024
c3c24a0
only used masked loads if fast (avx512 later)
hkratz Oct 24, 2024
df99787
inlining
hkratz Oct 24, 2024
b7d0bc8
cleanup
hkratz Oct 24, 2024
c0c7289
rm unsafe
hkratz Oct 24, 2024
e14fd37
upd
hkratz Oct 24, 2024
556d85e
doc
hkratz Oct 25, 2024
17fb65c
compat impl
hkratz Oct 25, 2024
723462e
wip
hkratz Oct 25, 2024
c663988
rename
hkratz Oct 25, 2024
8fe083d
rename
hkratz Oct 25, 2024
6f47f13
clippy
hkratz Oct 25, 2024
a562a70
public imp
hkratz Oct 25, 2024
f088015
wip
hkratz Oct 25, 2024
d5ed064
forbid unsafe impl, restructure
hkratz Oct 25, 2024
dc02eac
cleanup
hkratz Oct 25, 2024
934d666
wip
hkratz Oct 25, 2024
5d45066
wip
hkratz Oct 25, 2024
a7d93ca
wip
hkratz Oct 25, 2024
ef1c44b
simd 256 fixes
hkratz Oct 25, 2024
473361a
simplify, use v128 for now
hkratz Oct 25, 2024
bdf642b
cleanup
hkratz Oct 25, 2024
12d1420
make some fns const
hkratz Oct 25, 2024
7398350
nostd
hkratz Oct 25, 2024
6fec796
fix benchmark
hkratz Oct 27, 2024
ea024f0
use 256-bit impl on avx2
hkratz Oct 27, 2024
b6ad12b
missing inline
hkratz Oct 28, 2024
0602f21
don't always check the remainder
hkratz Oct 28, 2024
01373c5
fix check remainder only if present
hkratz Oct 28, 2024
8a22e8a
cleanup
hkratz Oct 28, 2024
5344baf
cleanup: don't use aligned buffers
hkratz Oct 31, 2024
cbed922
cmt
hkratz Oct 31, 2024
ce74394
simplify
hkratz Oct 31, 2024
5a42598
simplify
hkratz Oct 31, 2024
c26a718
fallback, etc.
hkratz Oct 31, 2024
0e16dd2
simplify/optimize
hkratz Oct 31, 2024
b65f1ec
cleanup
hkratz Oct 31, 2024
48168d0
simplify
hkratz Oct 31, 2024
a168c22
simplify
hkratz Oct 31, 2024
add0555
nit
hkratz Nov 3, 2024
e8b3a71
nit
hkratz Nov 3, 2024
d586ea1
more supported archs
hkratz Nov 3, 2024
6b27479
simplify, nits
hkratz Nov 4, 2024
bd42150
bench fix
hkratz Nov 4, 2024
64a9c87
fixes
hkratz Nov 9, 2024
b7301ab
bench partial
hkratz Nov 9, 2024
d23f569
workspace (nightly only)
hkratz Nov 10, 2024
3cecf1f
update
hkratz Nov 10, 2024
9dd881b
doc
hkratz Nov 10, 2024
c935c4b
.prettier cfg
hkratz Nov 10, 2024
563fb03
initial README
hkratz Nov 10, 2024
9766fb6
README wip
hkratz Nov 10, 2024
803b4bd
wasm32 simd128 support
hkratz Nov 10, 2024
7a58746
README WIP
hkratz Nov 10, 2024
79e883f
doc wip
hkratz Nov 27, 2024
565492b
edition 2024
hkratz Nov 27, 2024
56 changes: 32 additions & 24 deletions bench/Cargo.lock


9 changes: 9 additions & 0 deletions bench/Cargo.toml
@@ -30,6 +30,7 @@ simdutf8_wasmtime = ["wasmtime"]
core_affinity = "0.8.1"
criterion = "0.5.1"
simdutf8 = { version = "*", path = "..", features = ["aarch64_neon"] }
simdutf8-portable = { version = "*", path = "../portable" }
simdjson-utf8 = { version = "*", path = "simdjson-utf8", optional = true }
# default is cranelift which is not as performant as the llvm backend
wasmer = { version = "2.1", optional = true, default-features = false }
@@ -47,6 +48,14 @@ harness = false
name = "throughput_compat"
harness = false

[[bench]]
name = "throughput_basic_portable"
harness = false

[[bench]]
name = "throughput_compat_portable"
harness = false

[[bench]]
name = "throughput_std"
harness = false
3 changes: 3 additions & 0 deletions bench/benches/throughput_basic_portable.rs
@@ -0,0 +1,3 @@
use simdutf8_bench::define_throughput_benchmark;

define_throughput_benchmark!(BenchFn::BasicPortable);
3 changes: 3 additions & 0 deletions bench/benches/throughput_compat_portable.rs
@@ -0,0 +1,3 @@
use simdutf8_bench::define_throughput_benchmark;

define_throughput_benchmark!(BenchFn::CompatPortable);
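
The two new benchmark targets cover the `basic` and `compat` flavours of the portable crate. In the existing `simdutf8` crate, `basic::from_utf8` only reports success or failure, while `compat::from_utf8` returns an error exposing `valid_up_to()` and `error_len()` like the standard library. The sketch below assumes the portable crate keeps that split. It uses `std::str::from_utf8` as a stand-in so it runs on stable without the new crate; the helper names `is_valid_basic` and `first_error_compat` are hypothetical and not part of this PR.

```rust
use std::str::from_utf8;

/// Stand-in for a `basic`-style check: only answers valid / invalid.
fn is_valid_basic(bytes: &[u8]) -> bool {
    from_utf8(bytes).is_ok()
}

/// Stand-in for a `compat`-style check: returns the number of valid leading
/// bytes and the length of the invalid sequence (`None` if the input simply
/// ends in the middle of a multi-byte character).
fn first_error_compat(bytes: &[u8]) -> Option<(usize, Option<usize>)> {
    from_utf8(bytes)
        .err()
        .map(|e| (e.valid_up_to(), e.error_len()))
}

fn main() {
    let valid = "héllo".as_bytes();
    let invalid = b"ab\xFFcd"; // 0xFF can never appear in UTF-8
    let truncated = b"ab\xC3"; // lead byte of a 2-byte sequence, then EOF

    println!("valid ok: {}", is_valid_basic(valid));
    println!("invalid: {:?}", first_error_compat(invalid));
    println!("truncated: {:?}", first_error_compat(truncated));
}
```

The extra bookkeeping in the `compat`-style path is why the two flavours are benchmarked separately.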
25 changes: 24 additions & 1 deletion bench/src/lib.rs
@@ -1,6 +1,8 @@
use criterion::{measurement::Measurement, BenchmarkGroup, BenchmarkId, Criterion, Throughput};
use simdutf8::basic::from_utf8 as basic_from_utf8;
use simdutf8::compat::from_utf8 as compat_from_utf8;
use simdutf8_portable::basic::from_utf8 as basic_from_utf8_portable;
use simdutf8_portable::compat::from_utf8 as compat_from_utf8_portable;

use std::str::from_utf8 as std_from_utf8;

@@ -29,6 +31,8 @@ pub enum BenchFn {
Basic,
BasicNoInline,
Compat,
BasicPortable,
CompatPortable,
Std,

#[cfg(feature = "simdjson")]
@@ -134,11 +138,12 @@ fn get_valid_slice_of_len_or_more_aligned(
fn bench<M: Measurement>(c: &mut Criterion<M>, name: &str, bytes: &[u8], bench_fn: BenchFn) {
let mut group = c.benchmark_group(name);
for i in [1, 8, 64, 512, 4096, 65536, 131072].iter() {
+let i = i + 33;
 let alignment = Alignment {
 boundary: 64,
 offset: 8, // 8 is the default alignment on 64-bit, so this is what can be expected worst-case
 };
-let (vec, offset) = get_valid_slice_of_len_or_more_aligned(bytes, *i, alignment);
+let (vec, offset) = get_valid_slice_of_len_or_more_aligned(bytes, i, alignment);
let slice = &vec[offset..];
assert_eq!(
(slice.as_ptr() as usize) % alignment.boundary,
@@ -192,6 +197,24 @@ fn bench_input<M: Measurement>(
},
);
}
BenchFn::BasicPortable => {
group.bench_with_input(
BenchmarkId::from_parameter(format!("{:06}", input.len())),
&input,
|b, &slice| {
b.iter(|| assert_eq!(basic_from_utf8_portable(slice).is_ok(), expected_ok));
},
);
}
BenchFn::CompatPortable => {
group.bench_with_input(
BenchmarkId::from_parameter(format!("{:06}", input.len())),
&input,
|b, &slice| {
b.iter(|| assert_eq!(compat_from_utf8_portable(slice).is_ok(), expected_ok));
},
);
}
BenchFn::Std => {
group.bench_with_input(
BenchmarkId::from_parameter(format!("{:06}", input.len())),
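
The benchmark loop above asks for an input slice that starts 8 bytes past a 64-byte boundary. The body of `get_valid_slice_of_len_or_more_aligned` is not part of this diff, so the following is only a sketch of one way to get such a slice: over-allocate, then skip just enough bytes. The helper `offset_for_alignment` is hypothetical.

```rust
/// Hypothetical helper: number of bytes to skip from address `addr` so that
/// the resulting address satisfies `address % boundary == offset`.
fn offset_for_alignment(addr: usize, boundary: usize, offset: usize) -> usize {
    assert!(boundary > 0 && offset < boundary);
    // Distance from the current position inside the boundary to the target one.
    (offset + boundary - addr % boundary) % boundary
}

fn main() {
    let (boundary, offset) = (64, 8);
    // Smallest size produced by the loop after the `+ 33` change.
    let len = 1 + 33;
    // Over-allocate by `boundary` so any skip (always < boundary) still leaves `len` bytes.
    let buf = vec![b'a'; len + boundary];
    let skip = offset_for_alignment(buf.as_ptr() as usize, boundary, offset);
    let slice = &buf[skip..skip + len];
    assert_eq!(slice.as_ptr() as usize % boundary, offset);
    println!("skipped {skip} bytes, slice len {}", slice.len());
}
```

Returning the backing `Vec` together with the offset, as the benchmark code does with `(vec, offset)`, keeps the allocation alive while the misaligned slice is in use.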
2 changes: 2 additions & 0 deletions nightly_workspace/.gitignore
@@ -0,0 +1,2 @@
/Cargo.lock
/target
6 changes: 6 additions & 0 deletions nightly_workspace/Cargo.toml
@@ -0,0 +1,6 @@
[workspace]
members = [
"simdutf8",
"simdutf8/portable",
"simdutf8/bench"
]
1 change: 1 addition & 0 deletions nightly_workspace/simdutf8
6 changes: 6 additions & 0 deletions portable/.gitignore
@@ -0,0 +1,6 @@
/target
/.vscode
/.idea
/.zed
/.cargo
/Cargo.lock
2 changes: 2 additions & 0 deletions portable/.prettierrc.toml
@@ -0,0 +1,2 @@
proseWrap = "always"
printWidth = 100
40 changes: 40 additions & 0 deletions portable/Cargo.toml
@@ -0,0 +1,40 @@
[package]
name = "simdutf8-portable"
version = "0.0.1"
authors = ["Hans Kratz <[email protected]>"]
edition = "2024"
description = "SIMD-accelerated UTF-8 validation using core::simd (experimental)"
documentation = "https://docs.rs/simdutf8-portable/"
homepage = "https://github.com/rusticstuff/simdutf8/tree/main/portable"
repository = "https://github.com/rusticstuff/simdutf8"
readme = "README.md"
keywords = ["utf-8", "unicode", "string", "validation", "simd"]
categories = ["encoding", "algorithms", "no-std"]
license = "MIT OR Apache-2.0"

[features]
default = ["std"]

std = []

# expose SIMD implementations in basic::imp::* and compat::imp::*
public_imp = []

# features to force a certain implementation. Features earlier in the list take
# precedence.

# force non-SIMD fallback implementation (for testing)
force_fallback = []
# force 128-bit/256-bit SIMD implementation.
# CAVE: slower than even the fallback implementation if not all SIMD functions
# have a fast implementation, in particular `swizzle_dyn` needs to be fast.
force_simd128 = []
force_simd256 = []

[package.metadata.docs.rs]
features = ["public_imp"]
rustdoc-args = ["--cfg", "docsrs"]
default-target = "x86_64-unknown-linux-gnu"

[dependencies]
cfg-if = "1.0.0"
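
The comment on the `force_*` features says that features earlier in the list take precedence. The selection code itself is not shown in this diff; since the crate depends on `cfg-if`, it presumably uses `cfg_if!`, but that is an assumption. A minimal sketch of the same precedence rule using only the built-in `cfg!` macro, with a hypothetical `Impl` enum and hypothetical `select` / `forced_impl` functions:

```rust
/// Hypothetical tag for the implementation that ends up being used.
#[derive(Debug, PartialEq, Eq)]
enum Impl {
    Fallback,
    Simd128,
    Simd256,
    /// No `force_*` feature set: choose based on the target's capabilities.
    Auto,
}

/// Applies the precedence from the feature list: fallback first, then
/// 128-bit, then 256-bit.
fn select(force_fallback: bool, force_simd128: bool, force_simd256: bool) -> Impl {
    if force_fallback {
        Impl::Fallback
    } else if force_simd128 {
        Impl::Simd128
    } else if force_simd256 {
        Impl::Simd256
    } else {
        Impl::Auto
    }
}

/// Reads the Cargo features at compile time. Without any of them enabled
/// (as when this file is compiled on its own) every `cfg!` is `false`.
fn forced_impl() -> Impl {
    select(
        cfg!(feature = "force_fallback"),
        cfg!(feature = "force_simd128"),
        cfg!(feature = "force_simd256"),
    )
}

fn main() {
    println!("{:?}", forced_impl());
}
```

With this ordering, enabling both `force_fallback` and `force_simd256` yields the fallback, which matches the "for testing" intent of that feature.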