Improve cryptography tests. #3979
29d039a
to
bbc768c
Compare
tiledb/sm/crypto/crypto_win32.cc
Outdated
@@ -32,11 +32,15 @@

#ifdef _WIN32

#include "tiledb/sm/crypto/crypto_win32.h"
#include <windows.h>
The `NOMINMAX` macro definition needs to accompany the inclusion of `windows.h`.
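A minimal sketch of the pattern this comment asks for, assuming a translation unit that also calls `std::min` and `std::max`. The helper `clamp_to_byte` is hypothetical and only illustrates why the macro matters; the `#ifdef _WIN32` guard is there so the snippet also compiles on other platforms.

```cpp
#ifdef _WIN32
// Without NOMINMAX, <windows.h> defines min() and max() as macros,
// which breaks calls such as std::min and std::max further down.
#ifndef NOMINMAX
#define NOMINMAX
#endif
#include <windows.h>
#endif

#include <algorithm>
#include <cassert>

// Hypothetical helper, not part of TileDB: clamps a value into [0, 255].
int clamp_to_byte(int value) {
  return std::min(std::max(value, 0), 255);
}
```

Defining the macro immediately before the include keeps the two together, so a later reordering of headers cannot silently drop it.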
Reverted.
tiledb/sm/crypto/crypto_win32.cc
Outdated
#include "tiledb/sm/crypto/crypto_win32.h"
#include <windows.h>

#include <bcrypt.h>
Do we need `<windows.h>`, or is this header alone sufficient?
Also, if there's an inclusion order dependency between external headers, that should be documented.
tiledb/sm/crypto/test/unit_crypto.cc
Outdated
// hex. The function is generic over the hash and the input type.
template <class Hash>
static void test_hash(
const uint8_t* input, uint64_t length, const std::string& expected_hash) {
Prefer `span` over a separate character pointer and length.
You mean `std::span`? It is in C++20. Until then I used `std::string`, to unify tests for SHA256 where the input is in hex and MD5 where it isn't.
We have a replacement for `span` in our code base. It's available through `common.h`.
tiledb/sm/crypto/test/unit_crypto.cc
Outdated
expected_hash);
}

struct MD5Hash {
This is a traits class and should be named accordingly.
I changed the implementation a bit. Instead of trait types the test function directly takes the hash function and its digest size as template arguments.
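A rough sketch of what passing the hash function and its digest size as template arguments could look like. Everything here is an assumption for illustration: `toy_hash` (32-bit FNV-1a, not cryptographic), `to_hex`, `hash_matches`, and the `bool`/`std::vector` signature stand in for the real `Crypto` functions, `Status`, and `Buffer`, which are not reproduced.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a digest function: 32-bit FNV-1a with
// big-endian output. NOT cryptographic and NOT part of TileDB.
constexpr std::size_t toy_digest_bytes = 4;

bool toy_hash(
    const void* input, std::uint64_t length, std::vector<std::uint8_t>* output) {
  const auto* bytes = static_cast<const std::uint8_t*>(input);
  std::uint32_t h = 2166136261u;  // FNV offset basis
  for (std::uint64_t i = 0; i < length; ++i) {
    h ^= bytes[i];
    h *= 16777619u;  // FNV prime
  }
  output->clear();
  for (int shift = 24; shift >= 0; shift -= 8)
    output->push_back(static_cast<std::uint8_t>(h >> shift));
  return true;
}

// Encodes bytes as lower-case hex.
std::string to_hex(const std::vector<std::uint8_t>& bytes) {
  static const char digits[] = "0123456789abcdef";
  std::string hex;
  for (std::uint8_t b : bytes) {
    hex.push_back(digits[b >> 4]);
    hex.push_back(digits[b & 0x0f]);
  }
  return hex;
}

// Generic over the hash function and its digest size. Rejects an expected
// value whose length does not match the digest size before hashing.
template <
    bool hash(const void*, std::uint64_t, std::vector<std::uint8_t>*),
    std::size_t digest_bytes>
bool hash_matches(const std::string& input, const std::string& expected_hex) {
  if (expected_hex.size() != 2 * digest_bytes)
    return false;
  std::vector<std::uint8_t> digest;
  if (!hash(input.data(), input.size(), &digest))
    return false;
  return digest.size() == digest_bytes && to_hex(digest) == expected_hex;
}

// One instantiation per algorithm, with no traits class needed.
const auto test_toy = hash_matches<toy_hash, toy_digest_bytes>;
```

With this shape, adding another algorithm means adding one more instantiation line rather than another traits struct.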
tiledb/sm/crypto/test/unit_crypto.cc
Outdated
};

TEST_CASE("Crypto: Test MD5", "[crypto][md5]") {
auto test_md5 = [](const std::string input, const std::string expected_hash) {
This function definition is more or less repeated below. It would be appropriate to define a function template and to instantiate it in a `using` statement: `using test_md5 = validate_argument_vs_return<MD5Hash>`
My refactorings have reduced the code duplication.
Also, it would be better to put the citations for where the test vectors come from up in the file-level documentation at the top. `@section Test Plan` can contain them. It should explain that the test plan is to verify the test vectors in the reference materials.
The test plan should also have argument validation, for example against `nullptr`. I'm not seeing those tests.
We fill two 64-byte buffers with random data and check that their content is not equal. Also `crypto_win32.h` was refactored to not require `windows.h`.
They are redundant now that we have the new ones and were using the deprecated sprintf.
bbc768c
to
7083a61
Compare
(force-pushed to update the branch) |
…ointers. It became easier to use.
…of trait types. This also validates that the expected hash has the correct length.
The hash and RNG implementations don't check for null. I don't think anything more than an |
New code needs `span`. We have incorporated it into the global namespace in `common.h`. The implementation is a replacement for C++20 `std::span`. It's in our code base as part of an effort to write out occurrences of `Buffer`. Using `string` instead of `span` is a hack that we don't need.
Switching to `span` and adding citations for the AES-GCM look like the only further changes we'll need.
FYI: We also want to replace `Buffer` with `span` in all the crypto functions. I think there's a story for that already.
tiledb/sm/crypto/crypto.cc
Outdated
}

Status Crypto::get_random_bytes(unsigned char* output, unsigned num_bytes) {
return PlatformCrypto::get_random_bytes(output, num_bytes);
This new function should use `span`.
tiledb/sm/crypto/test/unit_crypto.cc
Outdated
*
* SHA-256 test vectors were taken from SHA256ShortMsg.rsp, from "SHA Test
* Vectors for Hashing Bit-Oriented Messages"
* (https://csrc.nist.gov/Projects/cryptographic-algorithm-validation-program/Secure-Hashing)
Exactly what's needed. Citations allow auditing the code against published standards.
tiledb/sm/crypto/test/unit_crypto.cc
Outdated
0);
auto test_md5 =
test_expected_hash_value<Crypto::md5, Crypto::MD5_DIGEST_BYTES>;
static const std::vector<std::pair<std::string, std::string>> test_cases{
`string_view` can be initialized with string literals and converts readily to `span`.
243fbb3
to
10ab36d
Compare
CI is green; this is ready for review.
Major: Why do we need `get_random_bytes` exposed? It doesn't seem to have any upside.
tiledb/sm/crypto/CMakeLists.txt
Outdated
#this_target_link_libraries(OpenSSL::Crypto)
target_link_libraries(tiledb_crypto PRIVATE OpenSSL::Crypto)
If you're not going to use `this_target_link_libraries`, you need documentation explaining why.
Nit: indentation
Fixed indentation. I don't know why we didn't use `this_target_link_libraries`, let me investigate…
tiledb/sm/crypto/test/main.cc
Outdated
(void)sizeof(tiledb::sm::Crypto);
return 0;
}
#define CATCH_CONFIG_MAIN
Since there's only one test file, and there are unlikely to ever be more than that one, this definition could just go in the test source file, which would allow this file to be eliminated.
Actually this file is not needed at all and the entry point is provided by the `Catch2::Catch2WithMain` target. I removed it.
tiledb/sm/crypto/crypto.h
Outdated
@@ -158,6 +158,14 @@ class Crypto {
*/
static Status sha256(
const void* input, uint64_t input_read_size, Buffer* output);

/**
* Generates a number of cryptographically random bytes.
A single line of documentation for something acting as a PRNG is inadequate.
- How's the IV initialized?
- Where's the state held between calls?
- Is the state global?
- Is it thread safe?
As long as these functions were not exposed to the outside, these issues didn't matter as much. If you're going to run tests against them, it matters. And the one test that does anything would pass with poor RNGs such as CRC shift or linear congruential.
Better yet, simplify this PR and eliminate this function and its do-almost-nothing tests.
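To make the objection concrete, here is a small sketch showing that a "two buffers differ" check passes for a deliberately weak generator. The names `WeakLcg`, `Block`, and `two_fills_differ` are hypothetical and are not TileDB code; the generator is an 8-bit linear congruential generator chosen so that its behavior is easy to reason about.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Block = std::array<std::uint8_t, 64>;

// Deliberately weak, fully predictable generator (NOT for real use):
// an 8-bit linear congruential generator whose period is only 256.
class WeakLcg {
 public:
  explicit WeakLcg(std::uint8_t seed)
      : state_(seed) {
  }

  // Fills the block with successive generator states.
  void fill(Block& out) {
    for (auto& byte : out) {
      state_ = static_cast<std::uint8_t>(state_ * 5u + 1u);
      byte = state_;
    }
  }

 private:
  std::uint8_t state_;
};

// Mirrors the shape of the check under review: fill two buffers and
// report whether their contents differ.
bool two_fills_differ(WeakLcg& rng) {
  Block buf1{}, buf2{};
  rng.fill(buf1);
  rng.fill(buf2);
  return buf1 != buf2;
}
```

The check succeeds because the 128 bytes come from distinct states within one period, even though anyone who knows the seed can reproduce every byte. Inequality of two outputs therefore says nothing about cryptographic quality.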
Also, we have a PRNG in the code that has all this documented. We don't need another, and we certainly do not want another that's poorly documented.
@@ -129,16 +143,6 @@ class Win32CNG {
uint64_t input_read_size,
Buffer* output,
LPCWSTR hash_algorithm);

private:
Leave it private. Or better yet, since it's static, take it out of the class and define it only in the `.cc` file.
CHECK(Crypto::get_random_bytes(buf1).ok());
CHECK(Crypto::get_random_bytes(buf2).ok());
CHECK(buf1 != buf2);
This test doesn't do anything useful. What's it here for?
* Test that the given input (optionally in hex) has the expected hash value in
* hex. The function is generic over the hash and the input size.
*/
template <Status Hash(const void*, uint64_t, Buffer*), int Digest_Bytes>
Sizes should be `size_t` unless they need to be compatible with existing code that (incorrectly) uses any other type for in-memory sizes. Here that's `Digest_Bytes`.
Nit: template arguments should be lower case.
The functions in the `Crypto` class use `uint64_t`, but changing the test code here to `size_t` works. Updated.
Reverted the change to `size_t` due to macOS CI failures. It will go away soon either way when we use spans instead of pointer-size pairs.
This file is no longer necessary with Catch2 v3, which provides an entry point through the `Catch2::Catch2WithMain` target.
Feedback addressed. I had added the |
Switching it to `size_t` caused failures in macOS CI.
e717a68
to
8b08750
Compare
One remaining question about `OpenSSL::Crypto` in the object library specification.
#this_target_link_libraries(OpenSSL::Crypto)
target_link_libraries(tiledb_crypto PRIVATE OpenSSL::Crypto)
What's the resolution on this? It hasn't been addressed from last time.
The difference is that `this_target_link_libraries` links with `PUBLIC` visibility, which is undesirable in this case, because we use OpenSSL as an internal implementation detail. Changing it right now does not break anything, but it will have undesirable effects once the `tiledb_crypto` object library is incorporated into the main build (and the `OpenSSL::Crypto` target would be a requirement even for shared libraries).
I am going to remove the commented line, and add a comment on why we can't use `this_target_link_libraries`.
Done.
@eric-hughes-tiledb do you have any additional comments?
@eric-hughes-tiledb, @KiterLuc is there anything left to do here? The last review comment was on a line that already existed before this PR.
LGTM
SC-51030
Improved test coverage for the hash algorithms by using official test vectors.
Prerequisite for #3973.
TYPE: NO_HISTORY
DESC: Improve cryptography tests.