Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update our encryption to match changes in Facebook sources #18

Open
wants to merge 59 commits into
base: stardog/develop
Choose a base branch
from

Conversation

matthewvon
Copy link
Member

PENDING

abhimadan and others added 30 commits December 17, 2018 15:33
Summary:
To support the flush/compaction use cases of RangeDelAggregator
in v2, FragmentedRangeTombstoneIterator now supports dropping tombstones
that cannot be read in the compaction output file. Furthermore,
FragmentedRangeTombstoneIterator supports the "snapshot striping" use
case by allowing an iterator to be split by a list of snapshots.
RangeDelAggregatorV2 will use these changes in a follow-up change.

In the process of making these changes, other miscellaneous cleanups
were also done in these files.
Pull Request resolved: facebook/rocksdb#4740

Differential Revision: D13287382

Pulled By: abhimadan

fbshipit-source-id: f5aeb03e1b3058049b80c02a558ee48f723fa48c
Summary:
RangeDelAggregatorV2 now supports ShouldDelete calls on
snapshot stripes and creation of range tombstone compaction iterators.
RangeDelAggregator is no longer used on any non-test code path, and will
be removed in a future commit.
Pull Request resolved: facebook/rocksdb#4758

Differential Revision: D13439254

Pulled By: abhimadan

fbshipit-source-id: fe105bcf8e3d4a2df37a622d5510843cd71b0401
Summary:
Now that v2 is fully functional, the v1 aggregator is removed.
The v2 aggregator has been renamed.
Pull Request resolved: facebook/rocksdb#4778

Differential Revision: D13495930

Pulled By: abhimadan

fbshipit-source-id: 9d69500a60a283e79b6c4fa938fc68a8aa4d40d6
Summary: Pull Request resolved: facebook/rocksdb#4793

Differential Revision: D13509363

Pulled By: abhimadan

fbshipit-source-id: 530b4765e3335d6ecd016bfaa89645f8aa98c61f
Summary:
Declare Jemalloc non-standard APIs as weak symbols, so that if Jemalloc is linked with the binary, these symbols will be replaced by Jemalloc's, otherwise they will be nullptr. This is similar to how folly detect jemalloc, but we assume the main program use jemalloc as long as jemalloc is linked: https://github.com/facebook/folly/blob/master/folly/memory/Malloc.h#L147
Pull Request resolved: facebook/rocksdb#4844

Differential Revision: D13574934

Pulled By: yiwu-arbug

fbshipit-source-id: 7ea871beb1be7d5a1259cc38f9b78078793db2db
Summary:
1. DBImplReadOnly::GetLiveFiles should not return NotSupported. Instead, it
   should call DBImpl::GetLiveFiles(flush_memtable=false).
2. In DBImp::Recover, we should also recover the OPTIONS file name and/or
   number so that an immediate subsequent GetLiveFiles will get the correct
   OPTIONS name.
Pull Request resolved: facebook/rocksdb#4681

Differential Revision: D13069205

Pulled By: riversand963

fbshipit-source-id: 3e6a0174307d06db5a01feb099b306cea1f7f88a
Summary:
If one column family is dropped, we should simply skip it and continue to flush
other active ones.
Currently we use Status::ShutdownInProgress to notify caller of column families
being dropped. In the future, we should consider using a different Status code.
Pull Request resolved: facebook/rocksdb#4708

Differential Revision: D13378954

Pulled By: riversand963

fbshipit-source-id: 42f248cdf2d32d4c0f677cd39012694b8f1328ca
Summary:
in certain cases, we do not perform memtable switching if the active
memtable of the column family is empty. Two exceptions:
1. In manual flush, if cached_recoverable_state_empty_ is false, then we need
   to switch memtable due to requirement of transaction.
2. In switch WAL, we need to switch memtable anyway because we have to seal the
   memtable if the WAL on which it depends will be closed.

This change can potentially delay the occurence of write stalls because number
of memtables increase more slowly.
Pull Request resolved: facebook/rocksdb#4792

Differential Revision: D13499501

Pulled By: riversand963

fbshipit-source-id: 91c9b17ae753578578039f3851667d93610005e1
Summary:
as titled.
Since different bg flush threads can flush different sets of column families
(due to column family creation and drop), we decide not to let one thread
perform atomic flush result installation for other threads. Bg flush threads
will install their atomic flush results sequentially to MANIFEST, using
a conditional variable, i.e. atomic_flush_install_cv_ to coordinate.
Pull Request resolved: facebook/rocksdb#4791

Differential Revision: D13498930

Pulled By: riversand963

fbshipit-source-id: dd7482fc41f4bd22dad1e1ef7d4764ef424688d7
Summary:
Previously for point lookup we decided which file to look into based on user key overlap only. We also did not truncate range tombstones in the point lookup code path. These two ideas did not interact well in cases like this:

- L1 has range tombstone [a, c)#1 and point key b#2. The data is split between file1 with range [a#1,1, b#72057594037927935,15], and file2 with range [b#2, c#1].
- L1's file2 gets compacted to L2.
- User issues `Get()` for b#3.
- L1's file1 is opened and the range tombstone [a, c)#1 is found for b, while no point-key for b is found in L1.
- `Get()` assumes that the range tombstone must cover all data in that range in lower levels, so short circuits and returns `NotFound`.

The solution to this problem is to not look into files that only overlap with the point lookup at a range tombstone sentinel endpoint. In the above example, this would mean not opening L1's file1 or its tombstones during the `Get()`.
Pull Request resolved: facebook/rocksdb#4829

Differential Revision: D13561355

Pulled By: ajkr

fbshipit-source-id: a13c21c816870a2f5d32a48af6dbd719a7d9d19f
Summary:
as titled.
Currently it's possible to create a local object of type PerfContext since it's
part of public API. Then it's safe to initialize the two members to 0.
If PerfContext is created as thread-local object, then all members are
zero-initialized according to C++ standard.
Pull Request resolved: facebook/rocksdb#4859

Differential Revision: D13614504

Pulled By: riversand963

fbshipit-source-id: 406ff548e105a074f379ad1054d56fece5f524a0
Summary:
By convention, time_t almost always stores the integral number of seconds since
00:00 hours, Jan 1, 1970 UTC, according to http://www.cplusplus.com/reference/ctime/time_t/.
We surely want more precision than seconds.
Pull Request resolved: facebook/rocksdb#4868

Differential Revision: D13633046

Pulled By: riversand963

fbshipit-source-id: 4e01e23a22e8838023c51a91247a286dbf3a5396
Summary:
Right now, deleting blob files is not rate limited, even if SstFileManger is specified.
On the other hand, rate limiting blob deletion is not supported. With this change, Blob file
deletion will go through SstFileManager too.
Pull Request resolved: facebook/rocksdb#4904

Differential Revision: D13772545

Pulled By: siying

fbshipit-source-id: bd1b1d0beb26d5167385e00b7ecb8b94b879de84
Summary:
Blob DB files are not tracked by the SFM, so they currently don't get
deleted in the background. Force them to be deleted in background so
rate limiting can be applied
Pull Request resolved: facebook/rocksdb#4928

Differential Revision: D13854649

Pulled By: anand1976

fbshipit-source-id: 8031ce66842ff0af440c715d886b377983dad7d8
Summary:
If we do not do this, then reading MutableCFOptions may have a race condition
with SetOptions which modifies MutableCFOptions.

Also reserve space in advance for vectors to avoid reallocation changing the
address of its elements.

Test plan
```
$make clean && make -j32 all check
$make clean && COMPILE_WITH_TSAN=1 make -j32 all check
$make clean && COMPILE_WITH_ASAN=1 make -j32 all check
```
Pull Request resolved: facebook/rocksdb#4876

Differential Revision: D13644500

Pulled By: riversand963

fbshipit-source-id: 4b8112c5c819d5a2922bb61ad1521b3d2fb2fd47
Summary:
1. this commit fixes our handling of a combination of two separate edge
cases. If a flush job does not pick any memtable to flush (because another
flush job has already picked the same memtables), and the column family
assigned to the flush job is dropped right before RocksDB calls
rocksdb::InstallMemtableAtomicFlushResults, our original code passes
a FileMetaData object whose file number is 0, failing the assertion in
rocksdb::InstallMemtableAtomicFlushResults (assert(m->GetFileNumber() > 0)).
2. Also piggyback a small change: since we already create a local copy of column family's mutable CF options to eliminate potential race condition with `SetOptions` call, we might as well use the local copy in other function calls in the same scope.
Pull Request resolved: facebook/rocksdb#4932

Differential Revision: D13901322

Pulled By: riversand963

fbshipit-source-id: b936580af7c127ea0c6c19ea10cd5fcede9fb0f9
Summary:
Fix the ouput overlap bug when using subcompactions, the upper bound of output
file was extended incorrectly.
Pull Request resolved: facebook/rocksdb#4898

Differential Revision: D13736107

Pulled By: ajkr

fbshipit-source-id: 21dca09f81d5f07bf2766bf566f9b50dcab7d8e3
James Pack and others added 28 commits March 19, 2019 19:03
* mostly compiling version of env_encrypt_2.h ... one std::move error ... fix next

* cleans up build (must add export EXTRA_LDFLAGS=-lcrypto -lssl manually)

* hmm, somebody messed with SYNC_POINT defines ... and made debug unbuildable ... unit tests too

* Add the original encryption Env to the env_basic_test unit test suite.

* I lied.  Forced to clean up Java style code in env_encryption by moving declaration half to env_encryption.h to be able to unit test new OpenSSL encryption.  What a pain.

* Going back to original layout.  Rebuilt env_encryption.h/.cc into proper C++ declaration / definition split to allow proper inheritance (resuse).

* first batch of tests, other than one for file size, work with new encryption code that does not yet encrypt (infrastructure validation).

* ok, GetFileSize() corrected (slow but corrected).  Considering whether or not to do same for GetChildrenFileAttributes

* and add GetChildrenFileAttributes() update

* Activate AESBlockAccessCipherStream.  Remove dead code from original copying of env_encryption.cc.  Unit test works.

* change unique_ptr with deleter to traditional pointer code.  Circle build did not want to compile it.  And add some error checking to EncryptBlock()

* use EVP_MD_CTX_create/destroy instead of new/free.  This is openssl 1.0 syntax that is compatible in openssl 1.1

* move the definition of Sha1Description_t(std::string &) to .cc file in hopes of eliminating link issues in starrocks unit tests

* again move some AES stuff from .h to .cc

* add helper constructor to AesCtrKey_t (NOT TESTED). add IsValid() to Sha1Description and AesCtrKey_t

* attempt to make initialization easier with Sha1Description_t as const.

* const was a really bad idea

* need explicit copy constructor with move disabled

* removed delete of move constructor ... removal seems suspect

* add env_encrypt_2_test to builds.  Test Sha1Description_t.

* saving for safety.  first NIST AES case matches.  code is in hack state.  will clean and add other cases tomorrow.

* code clean up within EncryptBlock.  push to see if circle compiles

* add remaining NIST cases.

* added operator== for unit testing

* remove non-portable byteswap.h ... not using it anymore ... and breaks OSX build

* attempt include fix for osx

* make openssl dependency OS specific

* backport files used in Facebook/rocksdb PR

* rename our env_encrypt_2 to more rocksdb-like env_encrypt2

* create conditional build of EnvEncrypt2 based on flag ROCKSDB_OPENSSL_AES_CTR

* linux library loader code.  not integrated.  not yet supporting OSX

* Linux library load ready for libcrypto SHA1 and RAND functions (includes unit tests).  AES CTR functions next.

* Add remaining functions from libcrypto that are used in EncryptedEnv2

* slight change to have .dylib names instead of .so names on OSX build

* Create EncryptedEnv2::WriteKey_t and ReadKeys_t to simplify look of code.

* create EncryptedEnv2::Default() to help time static loading of libcrypto.

* clang-format applied

* remove conditional openssl from OSX build

* hmm, missed removing include files for openssl

* Revert "hmm, missed removing include files for openssl"

This reverts commit e22a1f6.

* Revert "remove conditional openssl from OSX build"

This reverts commit 4eef8d4.

* address PR comments from Alex

Co-authored-by: matthewvon <[email protected]>
Co-authored-by: MatthewVon <[email protected]>
…uffers. Raising to 17 to match Basho. Have seen 5% improvement in bsbm100m create. (#9)

Co-authored-by: MatthewVon <[email protected]>
* add ROCKSDB_LITE ifdef around encryption

* adjust code per Facebook PR comments

* adjust code per Facebook PR comments

Co-authored-by: matthewvon <[email protected]>
…nique_ptr in env_encryption declarations. added change to build_detect_platform that only benefits developer builds of rocksdb, i.e. Alex and me (#11)

Co-authored-by: matthewvon <[email protected]>
* this is a current file in facebook/rocksdb.  copying it into our older code to simplify backporting encryption.

* backport DynamicLibrary and LoadLibrary from rocksdb master

* add missing line for CTR AES build

* copy encryption updates required by Facebook

* add new include file to stardog bazel build file

* there is a likely code path where valid_ is not set in EncryptedEnvV2.  this corrects.

* correct code and usage of EncryptedRandomRWFileV2

* fix formating and grammar in some logging

* per man page, switch to EVP_EncryptFinal_ex as soft reset of context object for reuse

Co-authored-by: matthewvon <[email protected]>
* this is a current file in facebook/rocksdb.  copying it into our older code to simplify backporting encryption.

* backport DynamicLibrary and LoadLibrary from rocksdb master

* add missing line for CTR AES build

* copy encryption updates required by Facebook

* add new include file to stardog bazel build file

* there is a likely code path where valid_ is not set in EncryptedEnvV2.  this corrects.

* correct code and usage of EncryptedRandomRWFileV2

* fix formating and grammar in some logging

* per man page, switch to EVP_EncryptFinal_ex as soft reset of context object for reuse

* add test for RAND_poll and RAND_bytes loading

* change method for making sure libcrypto is available ... matches changes in facebook code

* add ToString to Sha Aes and provider

* backport two changes from Facebook encryption PR: make key_lock_ a member variable instead of static to fix Mac OSX crash, and switch to ReadLock() and WriteLock() wrapper objects previously suggested by Alex.

Co-authored-by: matthewvon <[email protected]>
I confirm facebook/rocksdb has made this same change.  The size_t function is const.  It was trying to use lock_ and a recent compiler update spit that out as "bad".  Must make the lock mutable so it can change state in a const function
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants