Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/wipe gribjump #29

Merged
merged 8 commits into from
Aug 5, 2024
Merged

Feature/wipe gribjump #29

merged 8 commits into from
Aug 5, 2024

Conversation

ChrisspyB
Copy link
Member

Add support for fdb-move, fdb-wipe and fdb-purge to cleanup "auxiliary" files created by external programs or plugins.
Developed to facilitate removal of .data.gribjump files generated by the gribjump cache.

@codecov-commenter
Copy link

codecov-commenter commented Jul 30, 2024

Codecov Report

Attention: Patch coverage is 92.93478% with 13 lines in your changes missing coverage. Please review.

Project coverage is 63.92%. Comparing base (c2fbb55) to head (6a52320).

Files Patch % Lines
src/fdb5/toc/TocWipeVisitor.cc 71.42% 8 Missing ⚠️
src/fdb5/database/Store.h 0.00% 2 Missing ⚠️
src/fdb5/toc/TocStore.cc 93.33% 2 Missing ⚠️
src/fdb5/database/PurgeVisitor.h 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop      #29      +/-   ##
===========================================
+ Coverage    63.52%   63.92%   +0.40%     
===========================================
  Files          236      237       +1     
  Lines        13410    13587     +177     
  Branches      1291     1322      +31     
===========================================
+ Hits          8519     8686     +167     
- Misses        4891     4901      +10     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member Author

@ChrisspyB ChrisspyB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:)

src/fdb5/database/AuxRegistry.cc Outdated Show resolved Hide resolved
src/fdb5/toc/TocWipeVisitor.cc Outdated Show resolved Hide resolved
src/fdb5/toc/TocWipeVisitor.cc Show resolved Hide resolved
src/fdb5/toc/TocPurgeVisitor.cc Outdated Show resolved Hide resolved
Copy link
Contributor

@simondsmart simondsmart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really good. Only minor comments from me now. Once you've addressed, feel free to merge (or to poke me to merge it)

src/fdb5/api/local/PurgeVisitor.cc Outdated Show resolved Hide resolved
src/fdb5/daos/DaosStore.h Outdated Show resolved Hide resolved
src/fdb5/database/PurgeVisitor.h Outdated Show resolved Hide resolved
for (const auto& it : dataUsage_) { // <std::string, size_t>

// Check if .data file is deletable
bool deletable = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this logic being duplicated from somewhere?

Copy link
Member Author

@ChrisspyB ChrisspyB Aug 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sort of, similar logic is run immediately before deleting the .data files. But we need to run it earlier in order to collect the auxiliary files (for reporting purposes).

It would be better to not duplicate the logic, but that would require rewriting the tool to explicitly split the dataUsage_ object into keepData_ and deleteData_ objects, which is a bigger rewrite than I want to do right now.

out << std::endl;

out << "Auxiliary files to be kept:" << std::endl;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Above we only list the stuff to remove (there is likely to be a lot more stuff to keep than remove). Not sure that we should be listing the keepAuxFiles_ here.

Copy link
Member Author

@ChrisspyB ChrisspyB Aug 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we don't just list what we are going to remove: all of the .data files in dataUsage_ are listed, alongside their usage (0 if to be deleted, N otherwise). So I do the same, by listing all of the aux files related to these files whether they are to be deleted or not.

src/fdb5/toc/TocPurgeVisitor.cc Show resolved Hide resolved
eckit::URI TocStore::getAuxiliaryURI(const eckit::URI& uri, const std::string& ext) const {
// Filebackend: ext is a suffix to append to the file name
ASSERT(uri.scheme() == type());
eckit::PathName path = uri.path() + "." + ext;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like an odd function to have here in light of the auxFileExtensions() method.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is intended to be the generic method, that all of the stores implement. So DaosStore::getAuxiliaryURI(uri, "gribjump") would return whatever DAOS's implementation of auxiliary objects would be.

auxFileExtensions Is a private method used by the store's moveTo method.


std::set<std::string> TocStore::auxFileExtensions() const {
std::set<std::string> extensions;
for (const auto& e : LibFdb5::instance().auxiliaryRegistry()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this does a lot of string manipulation/memory allocation every time it is called.

Given that this will be constant for all instances, is it worth (a) making it static, (b) memoising the result, (c) returning that result by const reference?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe. Though, it is only called once, and only in the moveTo method.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've decided to rewrite this a bit. TocStore now has a member std::set<std::string> auxFileExtensions_ which is set using the auxFileExtensions() method in the ctor. So, definitely only called once now.

src/fdb5/toc/TocStore.cc Outdated Show resolved Hide resolved
tests/fdb/api/test_auxiliary.cc Show resolved Hide resolved
@simondsmart simondsmart added the approved-for-ci Approved for CI run label Aug 1, 2024
@simondsmart simondsmart merged commit 1122f01 into develop Aug 5, 2024
111 checks passed
@simondsmart simondsmart deleted the feature/wipe-gribjump branch August 5, 2024 14:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved-for-ci Approved for CI run
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants