-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/wipe gribjump #29
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #29 +/- ##
===========================================
+ Coverage 63.52% 63.92% +0.40%
===========================================
Files 236 237 +1
Lines 13410 13587 +177
Branches 1291 1322 +31
===========================================
+ Hits 8519 8686 +167
- Misses 4891 4901 +10 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks really good. Only minor comments from me now. Once you've addressed, feel free to merge (or to poke me to merge it)
for (const auto& it : dataUsage_) { // <std::string, size_t> | ||
|
||
// Check if .data file is deletable | ||
bool deletable = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this logic being duplicated from somewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sort of, similar logic is run immediately before deleting the .data files. But we need to run it earlier in order to collect the auxiliary files (for reporting purposes).
It would be better to not duplicate the logic, but that would require rewriting the tool to explicitly split the dataUsage_ object into keepData_ and deleteData_ objects, which is a bigger rewrite than I want to do right now.
out << std::endl; | ||
|
||
out << "Auxiliary files to be kept:" << std::endl; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Above we only list the stuff to remove (there is likely to be a lot more stuff to keep than remove). Not sure that we should be listing the keepAuxFiles_ here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we don't just list what we are going to remove: all of the .data files in dataUsage_ are listed, alongside their usage (0 if to be deleted, N otherwise). So I do the same, by listing all of the aux files related to these files whether they are to be deleted or not.
eckit::URI TocStore::getAuxiliaryURI(const eckit::URI& uri, const std::string& ext) const { | ||
// Filebackend: ext is a suffix to append to the file name | ||
ASSERT(uri.scheme() == type()); | ||
eckit::PathName path = uri.path() + "." + ext; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels like an odd function to have here in light of the auxFileExtensions() method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is intended to be the generic method, that all of the stores implement. So DaosStore::getAuxiliaryURI(uri, "gribjump") would return whatever DAOS's implementation of auxiliary objects would be.
auxFileExtensions Is a private method used by the store's moveTo method.
|
||
std::set<std::string> TocStore::auxFileExtensions() const { | ||
std::set<std::string> extensions; | ||
for (const auto& e : LibFdb5::instance().auxiliaryRegistry()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this does a lot of string manipulation/memory allocation every time it is called.
Given that this will be constant for all instances, is it worth (a) making it static, (b) memoising the result, (c) returning that result by const reference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe. Though, it is only called once, and only in the moveTo method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've decided to rewrite this a bit. TocStore now has a member std::set<std::string> auxFileExtensions_
which is set using the auxFileExtensions() method in the ctor. So, definitely only called once now.
Add support for fdb-move, fdb-wipe and fdb-purge to cleanup "auxiliary" files created by external programs or plugins.
Developed to facilitate removal of
.data.gribjump
files generated by the gribjump cache.