-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Catch littered status feedback in csv_parser #338
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please change the loop to a comprehension. For the other comment I don't have a strong opinion.
tardis/adapters/sites/slurm.py
Outdated
cleaned_stdout = [] | ||
for row in slurm_status.stdout.splitlines(): | ||
if "|" in row: | ||
cleaned_stdout.append(row) | ||
for row in csv_parser( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be better to directly use csv.DictReader
here. This would avoid the useless roundtrip of creating an iterable of lines, joining that to newline separated lines, and then creating an iterable of lines from it again.
Not sure how straightforward that is nor whether I'm overlooking something, though.
I'm fine either way, please follow your preference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually it uses csv.DictReader
.
tardis/tardis/utilities/utils.py
Lines 36 to 66 in 4f93e96
Parses CSV formatted input | |
:param input_csv: CSV formatted input | |
:type input_csv: str | |
:param fieldnames: corresponding field names | |
:type fieldnames: [List, Tuple] | |
:param delimiter: delimiter between entries | |
:type delimiter: str | |
:param replacements: fields to be replaced | |
:type replacements: dict | |
:param skipinitialspace: ignore whitespace immediately following the delimiter | |
:type skipinitialspace: bool | |
:param skiptrailingspace: ignore whitespace at the end of each csv row | |
:type skiptrailingspace: bool | |
""" | |
if skiptrailingspace: | |
input_csv = "\n".join((line.strip() for line in input_csv.splitlines())) | |
replacements = replacements or {} | |
with StringIO(input_csv) as csv_input: | |
csv_reader = csv.DictReader( | |
csv_input, | |
fieldnames=fieldnames, | |
delimiter=delimiter, | |
skipinitialspace=skipinitialspace, | |
) | |
for row in csv_reader: | |
yield { | |
key: value if value not in replacements.keys() else replacements[value] | |
for key, value in row.items() | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I should not touch this if there is no clear request from you side. So I leave it for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the changes should definitely go to
tardis/tardis/utilities/utils.py
Line 53 in 1c9cd84
csv_parser
like Moab and HTCondor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition, it would be nice to have a unittest for that issue as well. It should go to the following class
tardis/tests/utilities_t/test_utils.py
Line 90 in 4bc109d
class TestCSVParser(TestCase): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought as well that the csv_parser could be modified to skip lines which don't contain the delimiter. But I wasn't sure if this is always intended or expected. Should we add another parameter skiplineswithoutdelimiter with some default to make it adjustable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the unittest be a separate test, which would duplicate some of the existing tests' code, or just an additional nonsense line in the input of one of the existing tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the unittest be a separate test, which would duplicate some of the existing tests' code, or just an additional nonsense line in the input of one of the existing tests?
Should be a separate member function of that class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought as well that the csv_parser could be modified to skip lines which don't contain the delimiter. But I wasn't sure if this is always intended or expected. Should we add another parameter skiplineswithoutdelimiter with some default to make it adjustable?
I think we have no use-case, where the delimiter is expected to be missing. To be save, I would suggest to require len(fieldnames)>1
in addtion, so the delimiter is expected to exist. There is still the possibility to run in this issue, if the delimiter is a space. However, I think this is something that is hard to avoid and in addition, our site adapters currently use either \t
or |
as delimiter and generally using a space as a delimiter in a csv is close to suicide ;-).
Co-authored-by: Max Fischer <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See inline comments.
I believe I addressed all comments. Please check again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR-Reviewers wish you happy merges!
(Yam, missed waiting for the CI to run... 😓 )
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #338 +/- ##
==========================================
+ Coverage 98.87% 98.92% +0.04%
==========================================
Files 55 55
Lines 2225 2226 +1
==========================================
+ Hits 2200 2202 +2
+ Misses 25 24 -1 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for your contribution. LGTM!
I observed a case where the ssh-executor running the squeue command returned slurm_status.stdout containing
Switch to work directory on corresponding scratch directory.
Switch to work directory on corresponding scratch directory.
Switch to work directory on corresponding scratch directory.
5545112||PENDING
where only the last line is expected to be returned from squeue. The other lines might be some system messages (?).
This PR adds a filter that only keeps lines with "|" before feeding them into the csv-Parser.