-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix list_ for presense detection #10
base: main
Are you sure you want to change the base?
Fix list_ for presense detection #10
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First of all, thanks for taking the time to contribute!
I took a while to look into your bug description and this PR. Please note that in general, this cache module is not broken completely, so we can't just introduce several breaking changes without considering the necessity. It's just the ls
/list_
function that needs fixing.
The solution appears to be subtly different:
First, note that a cache module is a general data cache, not a minion_data_cache
-specific one, so we can't do this since it's specific to a different layer:
Cache modules are modeled after a filesystem-like hierarchy, where the cache bank is a path of nested directories, the cache key a file. The master stores minion data inside the "directory" minions/<minion_id>
in a file named data
:
The manage.alived
runner function ends up listing the minions
"directory", expecting a list of sub-"directories" (i.e. the minion names):
The mysql cache module does not support this hierarchy properly though, it treats the bank
as a monolithic key and lists "files" only:
query = "SELECT etcd_key FROM {} WHERE bank=%s".format(__context__["mysql_table_name"]) |
Imho the (basic, to explain the issue) fix to this is using a LIKE
query to amend the returned result:
@@ -3,11 +3,19 @@
Return an iterable object containing all entries stored in the specified
bank.
"""
+ bank = bank.strip("/")
_init_client()
- query = "SELECT etcd_key FROM {} WHERE bank=%s".format(
+ keys_query = "SELECT etcd_key FROM {} WHERE bank=%s".format(
__context__["mysql_table_name"]
)
- cur, _ = run_query(__context__.get("mysql_client"), query, args=(bank,))
+ cur, _ = run_query(__context__.get("mysql_client"), keys_query, args=(bank,))
out = [row[0] for row in cur.fetchall()]
cur.close()
+ banks_query = "SELECT bank FROM {} WHERE bank LIKE %s".format(
+ __context__["mysql_table_name"]
+ )
+ cur, _ = run_query(__context__.get("mysql_client"), banks_query, args=(f"{bank}/%",))
+ bank_len = len(bank.split("/"))
+ out += list(set(row[0].split("/")[bank_len] for row in cur.fetchall()))
+ cur.close()
return out
This can probably be optimized down to a single query for both bank
and etcd_key
by deciding whether to include the etcd_key
or the bank
part in Python.
@lkubb I have made adjustments based on your comments, I got it wrong when i make it only work for minions only. Indeed I came across cases where in the code they use the cache.list function with these parameters:
These are the bank, the assumption I made is that the text behind the last / is the minion name or domain name like shown with other banks. Also while its possible to make use of the etcd_key, its not a argument for this function therefor I did choose to not use it at all. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for updating this :)
I'd really recommend to install and use pre-commit
to avoid simple lint errors.
It would be awesome if you could add test coverage as well (https://salt-extensions.github.io/salt-extension-copier/topics/testing/index.html), otherwise I'll need to find the time to do it myself.
""" | ||
bank_like = bank + '%' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: This should work with and without trailing /
bank_like = bank + '%' | |
bank = bank.rstrip("/") | |
bank_like = bank + "/%" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not know, that in the code they are using a trailing /
But its easy to handle like this indeed.
out = [row[0] for row in cur.fetchall()] | ||
cur.close() | ||
return out | ||
data = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue(blocking): This does not deduplicate, is missing files and does not list subdirectories correctly. Take the following structure (key == etc_key for brevity):
- bank: root/sub/dir
key: foo
- bank: root/sub/dir
key: bar
- bank: root/another_sub/dir
key: baz
- bank: root
key: file
Listing root
should yield:
- sub
- another_sub
- file
But this returns:
- dir
- dir
- dir
Please see the last paragraph of my previous review comment for suggestions.
Edit: Noticed another issue caused by returning the last path segment. You need to strip the bank
that is being listed from the entry, then return the first path segment only. If you fix this, the function would still incorrectly return:
- sub
- sub
- another_sub
Thus, I'd use a set for data
and only cast it to a list when returning from the function, which would yield:
- sub
- another_sub
The last thing that's missing is etcd_key
entries where the bank
entry matches the bank
parameter exactly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the review. I'm thinking of using the last part returned by mysql of the bank this should contain sub, another_sub, file
. Using sets sounds like a great idea to fix the duplicate entries. Does it happen the etcd_key that matches the full bank parameter, perhaps it could be fixed by one SQL query without LIKE. I really don't like the idea of having another SQL query.
I'm grateful for all that you have done, I will do some thinking of my own.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should work with a single query similar to this mockup (definitely needs adjustments, e.g. client initialization, possibly the for
loop):
def list_(bank):
bank = bank.strip("/")
query = "SELECT bank, etcd_key FROM {} WHERE bank LIKE %s".format(
__context__["mysql_table_name"]
)
cur, _ = run_query(__context__.get("mysql_client"), keys_query, args=(bank,))
res = set()
for res_bank, res_key in cur.fetchall():
if res_bank == bank:
res.add(res_key)
else:
res.add(res_bank[len(bank) + 1:].split("/")[0])
return list(res)
Co-authored-by: jeanluc <[email protected]>
What does this PR do?
It fixes the ls function now named list_
As far as I am aware this never worked or it used to work a very long time ago.
What issues does this PR fix or reference?
Fixes: #9
Previous Behavior
No list from manage.alived
And the salt/presence/present event list is empty
New Behavior
The variable names have been changed to mysql_cache because it is not possible to use both MySQL for the returner and the cache with different variables. Therefore, I needed to change the variable names.
Merge requirements satisfied?
[NOTICE] Bug fixes or features added to Salt require tests.
Commits signed with GPG?
Yes
Please review Salt's Contributing Guide for best practices.
See GitHub's page on GPG signing for more information about signing commits with GPG.