Avoid ResourceWarning: unclosed file #395
  # embed SVG diagram (only if used)
  def svgdata() -> str:
      return re.sub(
          "^<[?]xml [^?>]*[?]>[^<]*<!DOCTYPE [^>]*>",
          "<!-- XML and DOCTYPE declarations from SVG file removed -->",
-         open_file_read(f"{filename}.tmp.svg").read(),
+         file_read_text(f"{filename}.tmp.svg"),
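The substitution in this hunk can be tried in isolation. The sketch below reuses the pattern and replacement string from the diff; the sample SVG text is a hypothetical stand-in shaped like a typical SVG prolog, not actual WireViz or Graphviz output.

```python
import re

# Pattern and replacement copied from the diff hunk.
PATTERN = "^<[?]xml [^?>]*[?]>[^<]*<!DOCTYPE [^>]*>"
REPLACEMENT = "<!-- XML and DOCTYPE declarations from SVG file removed -->"

# Hypothetical sample input (assumption): an XML declaration followed by a
# DOCTYPE, then the <svg> element that should survive the substitution.
sample_svg = (
    '<?xml version="1.0" encoding="UTF-8" standalone="no"?>\n'
    '<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"\n'
    ' "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">\n'
    '<svg width="8pt" height="8pt"></svg>\n'
)

# The leading declarations are replaced by a single HTML-safe comment.
embedded = re.sub(PATTERN, REPLACEMENT, sample_svg)
print(embedded)
```

Note that the pattern is anchored with `^` and no `re.MULTILINE` flag is passed, so it only matches when the XML declaration is the very first thing in the text.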
This question might be slightly out of scope for this PR, but needs to be raised: Here, the SVG file is specified to be utf-8 encoded, but it's not specified at:
https://github.com/wireviz/WireViz/blob/close_files/src/wireviz/svgembed.py#L62-L64
What is correct? Should we specify utf-8 at both places, or does it really depend on what encoding is specified in the leading part of the SVG file itself?
Or doesn't the encoding matter at the latter location because it's just copying a file to another file? In that case, maybe read_bytes() and write_bytes() (or something like shutil.copyfile()) is a better alternative?
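To illustrate the byte-level alternative raised in this comment, here is a minimal sketch showing that both `Path.read_bytes()`/`Path.write_bytes()` and `shutil.copyfile()` copy a file without ever decoding it, so the source encoding does not matter. The file names and the Latin-1 sample content are assumptions made for the demonstration only.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical source file, deliberately NOT valid UTF-8:
    # the byte 0xD8 (Latin-1 "Ø") is followed by an ASCII "<".
    src = Path(tmp) / "diagram.tmp.svg"
    src.write_bytes(
        '<?xml version="1.0" encoding="ISO-8859-1"?><svg><text>Ø</text></svg>'
        .encode("iso-8859-1")
    )

    # Alternative 1: copy through bytes, no text decoding involved.
    dst_a = Path(tmp) / "copy_a.svg"
    dst_a.write_bytes(src.read_bytes())

    # Alternative 2: let the standard library copy the file contents.
    dst_b = Path(tmp) / "copy_b.svg"
    shutil.copyfile(src, dst_b)

    original = src.read_bytes()
    same = dst_a.read_bytes() == original == dst_b.read_bytes()
    print(same)
```

A text-mode copy with an explicit `encoding="utf-8"` would raise `UnicodeDecodeError` on this particular input, which is the situation a byte-level copy sidesteps.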
Good question. Since the HTML template contains <meta charset="UTF-8">, it would be best if the embedded SVG also was UTF-8. Not sure if we know or have control over how Graphviz chooses to encode its output. Are we just lucky that it's already UTF-8?
We probably should assume utf-8 and specify that encoding at both places. I also found a third place where that is already done: https://github.com/wireviz/WireViz/blob/close_files/src/wireviz/Harness.py#L662
Then, we probably also should (at a minimum where embedding SVG into HTML, or ideally at all places reading SVG) verify that the encoding property of the leading xml tag is either absent (utf-8 is default, I believe) or equal to any of the legal value variations that specify utf-8. If we detect a discrepancy, should we raise an exception or just print a warning stating e.g. that some characters might be rendered wrongly due to an unexpected encoding in the SVG file?
If we can get or create some SVG with e.g. encoding="ISO-8859-1" containing some known characters outside the common ASCII range, we could test to see the effect of assuming the wrong encoding at the different parts of our code. Then it'll be easier to describe possible consequences in a warning message.
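One possible shape for the check proposed here is sketched below. The helper names `declared_encoding()` and `declares_utf8()` are hypothetical and not part of WireViz. Alias handling is delegated to Python's `codecs.lookup()`, whose alias table is an assumption of convenience and is not guaranteed to coincide with the encoding names XML allows. A missing declaration is treated as UTF-8, and byte order marks are not handled.

```python
import codecs
import re
from typing import Optional

# Hypothetical helper pattern: the encoding attribute inside a leading
# XML declaration, e.g. <?xml version="1.0" encoding="UTF-8"?>
_XML_DECL_ENCODING = re.compile(
    rb'^<\?xml[^>]*?\bencoding\s*=\s*["\']([^"\']+)["\']'
)

def declared_encoding(raw: bytes) -> Optional[str]:
    """Return the encoding named in the XML declaration, or None if absent."""
    match = _XML_DECL_ENCODING.match(raw)
    return match.group(1).decode("ascii", errors="replace") if match else None

def declares_utf8(raw: bytes) -> bool:
    """True if no encoding is declared or the declared one resolves to UTF-8."""
    name = declared_encoding(raw)
    if name is None:
        return True  # assumption: absent declaration means UTF-8
    try:
        return codecs.lookup(name).name == "utf-8"
    except LookupError:
        return False  # unknown encoding name

# Test input along the lines suggested above (made up for this sketch).
latin1_svg = (
    '<?xml version="1.0" encoding="ISO-8859-1"?><svg><text>Ø</text></svg>'
    .encode("iso-8859-1")
)

if not declares_utf8(latin1_svg):
    print(f"Warning: SVG declares {declared_encoding(latin1_svg)}, expected UTF-8")

try:
    latin1_svg.decode("utf-8")
except UnicodeDecodeError as exc:
    print(f"Decoding as UTF-8 failed: {exc}")
```

Whether a mismatch should raise or only warn is left open, as in the comment; the sketch only prints.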
TBH, I would prefer to keep it simple here and work with the assumption, in order to finish the release of v4.1.
Verification can be done as a separate feature/PR.
I created #400 to follow up on this issue. Please add your concerns and opinions there.
Update: I also added TODOs at a few code locations that might need attention about such encoding and charset issues, because line number references like those I've used earlier in this thread will probably not survive the #251 merge-in. 😃
OK thanks. I see some recent force-pushes so please re-request review once it's ready :)
A number of such warnings showed up when running (with wireviz v0.4) e.g.
PYTHONWARNINGS=always python build_examples.py
PYTHONWARNINGS=always wireviz ../../examples/demo0?.yml
See #309 (comment)
Fix: All open() calls should be in a "with open() as x" statement to ensure the file is closed when exiting the block in any way. Otherwise, use the new file_read_text() or file_write_text() thin wrapper functions to read or write the whole utf-8 text file and close it.
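For readers unfamiliar with the pattern, thin wrappers of the kind described could look roughly like the sketch below. The function names come from the PR description, but the signatures, return values, and bodies are assumptions and may differ from the actual WireViz implementation.

```python
import tempfile
from pathlib import Path
from typing import Union

PathLike = Union[str, Path]  # assumption: accept both str and Path

def file_read_text(filepath: PathLike) -> str:
    """Read a whole UTF-8 text file; the with-block closes it on exit."""
    with open(filepath, "r", encoding="utf-8") as f:
        return f.read()

def file_write_text(filepath: PathLike, text: str) -> int:
    """Write a whole UTF-8 text file; returns the number of characters written."""
    with open(filepath, "w", encoding="utf-8") as f:
        return f.write(text)

# Usage: round-trip some non-ASCII text through a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "example.txt"
    written = file_write_text(path, "Ø 2.5 mm²\n")
    text_back = file_read_text(path)
    print(text_back == "Ø 2.5 mm²\n")
```

Because the file object never escapes the `with` block, it is closed even if `read()` or `write()` raises, which is what avoids the unclosed-file `ResourceWarning` that a bare `open(...).read()` can trigger.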