Avoid ResourceWarning: unclosed file #395

Merged on Jul 7, 2024 (2 commits)

@kvid (Collaborator) commented Jun 23, 2024

A number of such warnings showed up when running (with WireViz v0.4), e.g.:

PYTHONWARNINGS=always python build_examples.py
PYTHONWARNINGS=always wireviz ../../examples/demo0?.yml

See #309 (comment)

Fix: Every open() call should be in a "with open() as x" statement, which ensures the file is closed however the block is exited. Where a whole UTF-8 text file is read or written in one go, use the new thin wrapper functions file_read_text() or file_write_text() instead; they also close the file.
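
A minimal sketch of the pattern described in the fix. The names file_read_text() and file_write_text() come from this PR, but the signatures and bodies below are assumptions for illustration and may differ from the actual WireViz code.

```python
import tempfile
from pathlib import Path


def file_read_text(filename: str) -> str:
    """Read a whole UTF-8 text file and close it (assumed signature)."""
    with open(filename, "r", encoding="utf-8") as file:
        return file.read()


def file_write_text(filename: str, text: str) -> int:
    """Write a whole UTF-8 text file and close it (assumed signature)."""
    with open(filename, "w", encoding="utf-8") as file:
        return file.write(text)


# Each "with" block closes its file on exit, even if an exception is raised,
# so no "ResourceWarning: unclosed file" is emitted.
with tempfile.TemporaryDirectory() as tmp:
    path = str(Path(tmp) / "demo.txt")  # hypothetical file name
    written = file_write_text(path, "wire Ω\n")
    roundtrip = file_read_text(path)
```

Running such a script with PYTHONWARNINGS=always, as in the commands above, should show no warning for these calls.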

@kvid kvid mentioned this pull request Jun 23, 2024
25 tasks

  # embed SVG diagram (only if used)
  def svgdata() -> str:
      return re.sub(
          "^<[?]xml [^?>]*[?]>[^<]*<!DOCTYPE [^>]*>",
          "<!-- XML and DOCTYPE declarations from SVG file removed -->",
-         open_file_read(f"{filename}.tmp.svg").read(),
+         file_read_text(f"{filename}.tmp.svg"),
@kvid (Collaborator, Author) commented Jun 24, 2024

This question might be slightly out of scope for this PR, but needs to be raised: Here, the SVG file is specified to be utf-8 encoded, but it's not specified at:
https://github.com/wireviz/WireViz/blob/close_files/src/wireviz/svgembed.py#L62-L64

What is correct? Should we specify utf-8 at both places, or does it really depend on what encoding is specified in the leading part of the SVG file itself?

Or doesn't the encoding matter at the latter location because it's just copying a file to another file? In that case, maybe read_bytes() and write_bytes() (or something like shutil.copyfile()) is a better alternative?
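
To make the byte-copy alternative concrete, here is a small sketch. It assumes the code at the linked location only copies one SVG file to another; the file names and SVG content are hypothetical. Neither variant decodes the data, so whatever encoding the source file uses is preserved unchanged.

```python
import shutil
import tempfile
from pathlib import Path

# Hypothetical SVG content, declared and encoded as ISO-8859-1.
svg_bytes = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>'
    "<svg><text>Ø 5 mm</text></svg>"
).encode("iso-8859-1")

with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "in.svg"
    src.write_bytes(svg_bytes)

    # Option 1: copy through pathlib without decoding the content.
    dst1 = Path(tmp) / "out1.svg"
    dst1.write_bytes(src.read_bytes())

    # Option 2: let shutil copy the file content.
    dst2 = Path(tmp) / "out2.svg"
    shutil.copyfile(src, dst2)

    copies_identical = (
        dst1.read_bytes() == svg_bytes and dst2.read_bytes() == svg_bytes
    )
```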

@formatc1702 (Collaborator) commented Jul 1, 2024

Good question. Since the HTML template contains <meta charset="UTF-8">, it would be best if the embedded SVG also was UTF-8. Not sure if we know or have control over how Graphviz chooses to encode its output. Are we just lucky that it's already UTF-8?

@kvid (Collaborator, Author) commented Jul 4, 2024

We probably should assume utf-8 and specify that encoding at both places. I also found a third place where that is already done: https://github.com/wireviz/WireViz/blob/close_files/src/wireviz/Harness.py#L662

Then we probably also should verify that the encoding attribute of the leading XML declaration is either absent (utf-8 is the default, I believe) or equal to one of the legal spellings of utf-8. As a minimum, this should be done where SVG is embedded into HTML; ideally, everywhere SVG is read. If we detect a discrepancy, should we raise an exception, or just print a warning stating e.g. that some characters might be rendered wrongly due to an unexpected encoding in the SVG file?
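
One way such a check could look, as a rough sketch only. The function names are hypothetical, the regular expression covers only the common form of the XML declaration, and treating every name that Python's codec registry resolves to "utf-8" as acceptable is an assumption rather than a rule taken from the XML specification.

```python
import codecs
import re
import warnings
from typing import Optional

# Encoding pseudo-attribute inside a leading XML declaration.
XML_ENCODING = re.compile(
    rb"""^<\?xml[^>]*?\bencoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def declared_svg_encoding(head: bytes) -> Optional[str]:
    """Return the declared encoding, or None if no encoding is declared."""
    match = XML_ENCODING.match(head)
    return match.group(1).decode("ascii") if match else None


def svg_is_declared_utf8(head: bytes) -> bool:
    """Warn and return False if the SVG declares a non-UTF-8 encoding."""
    encoding = declared_svg_encoding(head)
    if encoding is None:
        return True  # no declared encoding: assume the utf-8 default
    try:
        is_utf8 = codecs.lookup(encoding).name == "utf-8"
    except LookupError:
        is_utf8 = False  # encoding name unknown to Python
    if not is_utf8:
        warnings.warn(
            f"SVG declares encoding {encoding!r}; some characters "
            "might be rendered wrongly when it is read as utf-8"
        )
    return is_utf8
```

Returning a bool and warning keeps the caller free to decide whether a mismatch should stop the build; raising an exception instead would be a one-line change.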

If we can get or create some SVG with e.g. encoding="ISO-8859-1" containing some known characters outside the ASCII range, we could test to see the effect of assuming the wrong encoding at the different parts of our code. Then it'll be easier to describe possible consequences in a warning message.
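
A possible starting point for such a test, sketched with a hand-written SVG rather than real Graphviz output. The label text is a made-up example; it only needs to contain characters that exist in ISO-8859-1 but are not ASCII.

```python
# Hypothetical label with non-ASCII characters available in ISO-8859-1.
label = "Ø 2.5 mm² ±5 %"
svg_text = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>\n'
    '<svg xmlns="http://www.w3.org/2000/svg">'
    f"<text>{label}</text></svg>\n"
)
svg_bytes = svg_text.encode("iso-8859-1")

# Strict decoding with the wrong encoding fails outright ...
try:
    svg_bytes.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# ... while lenient decoding silently replaces the affected characters
# with U+FFFD, the Unicode replacement character.
lenient = svg_bytes.decode("utf-8", errors="replace")

# Decoding with the declared encoding restores the original text.
correct = svg_bytes.decode("iso-8859-1")
```

So with these bytes, code that reads the file as utf-8 either raises UnicodeDecodeError or, if errors are replaced, loses the non-ASCII characters.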

Collaborator:

TBH, I would prefer to keep it simple here and work with the utf-8 assumption, in order to finish the release of v0.4.1.
Verification can be done as a separate feature/PR.

@kvid (Collaborator, Author) commented Jul 5, 2024

I created #400 to follow up on this issue. Please add your concerns and opinions there.

Update: I also added TODOs at a few code locations that might need attention regarding such encoding and charset issues, because line number references like those I've used earlier in this thread will probably not survive the #251 merge-in. 😃

Collaborator:

OK thanks. I see some recent force-pushes so please re-request review once it's ready :)

@formatc1702 formatc1702 merged commit ae03bd6 into release/v0.4.1-rc Jul 7, 2024
4 checks passed
formatc1702 pushed a commit that referenced this pull request Jul 7, 2024
@formatc1702 formatc1702 deleted the close_files branch July 7, 2024 14:16