Integrating fast_float to optionally replace strtod #1260

Merged
merged 56 commits into from
Nov 25, 2024

parthpatel
Member

@parthpatel parthpatel commented Nov 4, 2024

fast_float is a header-only C++ library that parses doubles using SIMD instructions. The goal is to speed up sorted sets and other commands that parse doubles. A single-file copy of fast_float is included in this repo, which introduces an optional dependency on a C++ compiler.

The use of fast_float is enabled at compile time using the make variable USE_FAST_FLOAT=yes. It is disabled by default.
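
A minimal sketch of how such a compile-time switch could look on the C side, assuming the build passes `-DUSE_FAST_FLOAT` when `USE_FAST_FLOAT=yes` is set. The names `parse_double_sketch` and `fast_float_strtod_sketch`, and the shim's signature, are illustrative assumptions and not necessarily what this PR ships:

```c
#include <assert.h>
#include <stdlib.h>

#ifdef USE_FAST_FLOAT
/* Assumption: the C++ shim exports a C-linkage function shaped like strtod.
 * The name and signature here are hypothetical. */
double fast_float_strtod_sketch(const char *str, char **endptr);
#endif

/* Hypothetical wrapper: callers use one name, the backend is picked at build time. */
static inline double parse_double_sketch(const char *str, char **endptr) {
    assert(str != NULL);
#ifdef USE_FAST_FLOAT
    return fast_float_strtod_sketch(str, endptr);
#else
    return strtod(str, endptr); /* default path: plain libc strtod */
#endif
}
```

In this sketch, a build without the flag reduces to a direct `strtod` call, so the default configuration would not need a C++ compiler.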

Fixes #1069.

@madolson
Member

madolson commented Nov 4, 2024

I suppose I don't really understand the benefit of a submodule vs inlining. Since we aren't tightly controlling the versioning between the main repo and releases, it becomes much harder to know if there is a security issue impacting a specific release. At the very least we should be pinning a specific version of fast_float that we are pulling.

@parthpatel
Member Author

> Since we aren't tightly controlling the versioning between the main repo and releases, it becomes much harder to know if there is a security issue impacting a specific release. At the very least we should be pinning a specific version of fast_float that we are pulling.

With this change, we track the fast_float commit ID in the valkey repository; see the line copied from my commit below. For every valkey commit, we can always look up the exact fast_float version this way. Pulling a new fast_float version is as simple as running git pull in the submodule.


Submodule fast_float added at e800ca

@parthpatel parthpatel added the pending-refinement This issue/request is still a high level idea that needs to be further refined label Nov 4, 2024
@parthpatel parthpatel marked this pull request as draft November 4, 2024 21:19
@madolson
Member

madolson commented Nov 4, 2024

Does it get pulled automatically as part of the release into the release artifacts?

@parthpatel
Member Author

> Does it get pulled automatically as part of the release into the release artifacts?

There is a "git submodule update --init" command in the Makefile that initializes it automatically. So yes, it will automatically check out the same commit on every build.

@madolson
Member

madolson commented Nov 4, 2024

> There is a "git submodule update --init" command in the Makefile that initializes it automatically. So yes, it will automatically check out the same commit on every build.

So we are adding a new dependency to the release process, since you need to be able to fetch the code from GitHub. I think we should figure out how to pull the code in when we do a release, so that folks don't need to run git submodule update --init.

@parthpatel
Member Author

parthpatel commented Nov 4, 2024

> > There is a "git submodule update --init" command in the Makefile that initializes it automatically. So yes, it will automatically check out the same commit on every build.
>
> So we are adding a new dependency to the release process, since you need to be able to fetch the code from GitHub. I think we should figure out how to pull the code in when we do a release, so that folks don't need to run git submodule update --init.

The other approach would be to check in the whole fast_float repo as a folder under valkey, which should work. What is the issue with a git dependency in GitHub release workflows? I can add a post-checkout hook that always initializes submodules on checkout, as long as the checkout happens outside of the release process.

codecov bot commented Nov 5, 2024

Codecov Report

Attention: Patch coverage is 83.33333% with 3 lines in your changes missing coverage. Please review.

Project coverage is 70.76%. Comparing base (86f33ea) to head (3f7cee8).
Report is 25 commits behind head on unstable.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/valkey-cli.c | 0.00% | 3 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #1260      +/-   ##
============================================
+ Coverage     70.55%   70.76%   +0.20%     
============================================
  Files           115      117       +2     
  Lines         63158    63305     +147     
============================================
+ Hits          44561    44797     +236     
+ Misses        18597    18508      -89     
| Files with missing lines | Coverage Δ |
|---|---|
| src/debug.c | 53.17% <ø> (+0.12%) ⬆️ |
| src/resp_parser.c | 98.47% <100.00%> (ø) |
| src/sort.c | 94.82% <100.00%> (+0.01%) ⬆️ |
| src/t_zset.c | 95.65% <100.00%> (ø) |
| src/util.c | 71.47% <100.00%> (+0.04%) ⬆️ |
| src/valkey_strtod.h | 100.00% <100.00%> (ø) |
| src/valkey-cli.c | 55.53% <0.00%> (+1.69%) ⬆️ |

... and 25 files with indirect coverage changes

@parthpatel
Member Author

I am dropping the submodule idea for now, as it requires a larger discussion about the release process. I don't have access to @swaingotnochill's repository or PR, so I pulled his commit into this CR to preserve his authorship of the code he wrote. I integrated it with Redis and fixed the Makefile. It should build now and will be ready to push.

I will work on benchmarking this separately. I can also turn off fast_float by default if folks have concerns and test it on my branch.

Contributor

@swaingotnochill swaingotnochill left a comment

Feel free to modify. I am on vacation, so it will be difficult for me to work on this until I am back. Cheers

@parthpatel
Member Author

@zuiderkwast or @madolson, any pointers on how to solve the AlmaLinux build failure caused by the missing g++?

cd fast_float && make
make[3]: Entering directory '/__w/valkey/valkey/deps/fast_float'
g++ -std=c++11 -O3 -fPIC -c fast_float_strtod.cpp -o fast_float_strtod.o
make[3]: g++: Command not found
make[3]: *** [Makefile:9: fast_float_strtod.o] Error 127
make[3]: Leaving directory '/__w/valkey/valkey/deps/fast_float'
make[2]: *** [Makefile:80: fast_float] Error 2
make[2]: *** Waiting for unfinished jobs....

@parthpatel parthpatel changed the title Integrating fast_float as a git submodule with Valkey to replace strtod invocation Integrating fast_float with Valkey to replace strtod invocation Nov 5, 2024
@zuiderkwast
Contributor

I'm skeptical of submodules too. Offline builds get complicated. Source releases get complicated. So far, we've used either vendored dependencies or system-installed ones (like OpenSSL).

I prefer that we vendor this one. We could copy only one or a few files, as we've done with crc64 and some other small libraries?

@parthpatel parthpatel marked this pull request as ready for review November 6, 2024 21:48
swaingotnochill and others added 10 commits November 6, 2024 22:01
* Simplified the interface to remove if branches.
* Simplified Makefile to be more readable.
* Integrating fast_float with the redis code base in resp_parser.c file.

Signed-off-by: Parth Patel <[email protected]>
…fy it explicitly in Makefile to fix 32-bit compilation issues.

Signed-off-by: Parth Patel <[email protected]>
Member

@madolson madolson left a comment

Looks really close now, I'm going to just apply some of these comments right away.

@madolson
Member

Member

@madolson madolson left a comment

LGTM. @eifrah-aws, can you help follow up with the CMake changes like we discussed offline?

Contributor

@zuiderkwast zuiderkwast left a comment

Thanks for pushing this through! Just one nit.

@swaingotnochill
Contributor

Should we benchmark this change on different architectures?

@zuiderkwast
Contributor

> should we do a benchmark on different architectures for this change?

@swaingotnochill That never hurts! I think it only affects commands that parse floats, like INCRBYFLOAT and the sorted set commands, so I would only benchmark those.
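
A parser-level micro-benchmark could complement command-level benchmarks. The following is only a sketch that times libc `strtod`; `bench_strtod` is a hypothetical helper, and comparing against fast_float would require swapping in the other parser under the same harness:

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical micro-benchmark: parse every input `rounds` times with strtod.
 * Returns elapsed CPU seconds and stores the sum of all parsed values in
 * *checksum so the compiler cannot discard the loop. */
static double bench_strtod(const char *const *inputs, size_t n, int rounds, double *checksum) {
    assert(inputs != NULL && checksum != NULL);
    double sum = 0.0;
    clock_t start = clock();
    for (int r = 0; r < rounds; r++) {
        for (size_t i = 0; i < n; i++) {
            sum += strtod(inputs[i], NULL);
        }
    }
    clock_t stop = clock();
    *checksum = sum;
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Running it with a large `rounds` value on each target architecture, with inputs that resemble real sorted set scores, would give a rough idea of how much time goes into parsing alone.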

Errno is never set to zero by any C standard library function. This function mimics strtod.

Signed-off-by: Viktor Söderqvist <[email protected]>
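
The errno convention mentioned in this commit message can be illustrated with plain `strtod`: since the library never resets errno to zero, the caller clears it before the call and inspects it afterwards. A minimal sketch, where `parse_double_strict` is a hypothetical helper and not a function from this PR:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 and stores the value in *out on success,
 * returns 0 on empty input, trailing garbage, or a range error. */
static int parse_double_strict(const char *s, double *out) {
    assert(s != NULL && out != NULL);
    char *end = NULL;
    errno = 0; /* the caller clears errno; strtod itself never resets it */
    double v = strtod(s, &end);
    if (end == s || *end != '\0') return 0; /* nothing parsed, or trailing characters */
    if (errno == ERANGE) return 0;          /* overflow or underflow reported by strtod */
    *out = v;
    return 1;
}
```

Without the `errno = 0` line, a stale `ERANGE` left over from an earlier call would make a perfectly valid input look like a range error.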
@zuiderkwast zuiderkwast changed the title Integrating fast_float with Valkey to replace strtod invocation Integrating fast_float to optionally replace strtod Nov 25, 2024
@zuiderkwast zuiderkwast merged commit c4920bc into valkey-io:unstable Nov 25, 2024
53 of 57 checks passed
@zuiderkwast zuiderkwast added the release-notes This issue should get a line item in the release notes label Nov 25, 2024
Labels
enhancement New feature or request major-decision-approved Major decision approved by TSC team performance release-notes This issue should get a line item in the release notes run-extra-tests Run extra tests on this PR (Runs all tests from daily except valgrind and RESP)
Development

Successfully merging this pull request may close these issues.

Investigate integrating with fast_float
4 participants