Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

print-circle is not used, so circular structures cause enormous memory allocations, and general performance issues #30

Open
alphapapa opened this issue Sep 18, 2023 · 9 comments

Comments

@alphapapa
Copy link

Hi,

I've found that, since print-circle isn't used by inspector, inspecting objects that contain circular or repeated references can cause enormous amounts of memory to be allocated, sometimes to the point of exhausting all available virtual memory and invoking the OOM killer on Emacs.

For example, when inspecting the ement-syncs variable from my Ement.el Matrix client, which points to the ement-session struct for the session being synced, when the session had been connected for hours and had a decent amount of data stored in it, attempting to inspect it caused Emacs to use up all my RAM and swap and invoke the OOM killer.

After restarting Emacs, I tried binding print-circle like this:

(defun inspector--princ-to-string (object)
  "Print OBJECT to string using `princ'."
  (with-output-to-string
    (let ((print-circle t))
      (princ object))))

And when I inspected ement-syncs again, it seemed to perform better: it still allocated over 3 GB of memory, but that was a far cry from the ~32 GB it had before (although the Ement session was just-connected, so it didn't have as much data in it, so it's hard to compare the two attempts exactly).

More generally, it seems that the performance of the inspector doesn't quite make sense, given how the objects being printed are so shortly truncated. For example, inspecting the list of rooms in the ement-session object, I get a buffer showing 100 room structs, with the printed representation of each being truncated to 80 characters, yet Emacs takes over 30 seconds to prepare it and allocates 6.5 GB of RAM in the process.

I'm guessing that the problem is that each object is printed to a string and then the string is truncated; this must result in heavy consing and GC'ing, only to show 80 characters of text for an object potentially hundreds of MB in its full printed representation.

Unfortunately, this makes the inspector nearly useless for what ought to be its "star moment", inspecting objects that are too large to navigate easily by other means.

I have some vague ideas about how to improve on these issues, but I don't know how feasible they are, and I don't expect to have time to work on them anytime soon. But at least this issue will serve to note the problems, and maybe we can figure out good solutions. Emacs sorely needs a tool like this, but it needs to be able to handle large objects well.

Thanks for your work on this tool.

@mmontone
Copy link
Owner

Ok. Thanks for reporting. It is good to know this is an issue, and your explanation is very useful.

I'll have a look and see what can be done, and probably apply your patch in the meantime.

@mmontone
Copy link
Owner

30 seconds to prepare it and allocates 6.5 GB of RAM in the process.

wow, that's insane

@mmontone
Copy link
Owner

My tests didn't involve huge data sizes . If you have ideas on some big dataset I can use to test, that's welcomed too.

mmontone added a commit that referenced this issue Sep 18, 2023
This is code by @alphapapa.

See Github issue #30
@mmontone
Copy link
Owner

I'm releasing a version with your patch, as it helps at least with this issue.

@mmontone
Copy link
Owner

mmontone commented Sep 18, 2023

inspector--print-truncated uses with-temp-buffer and I suspect that must be one of the possible causes of the problem.

@alphapapa
Copy link
Author

My tests didn't involve huge data sizes . If you have ideas on some big dataset I can use to test, that's welcomed too.

Well, as I said, I discovered it when trying to inspect Ement's ement-session structure on a connected session. If you're willing to register a free account on matrix.org, you could easily try it for yourself. I can even help you join a few rooms and exchange a few messages to start.

I'm releasing a version with your patch, as it helps at least with this issue.

Thanks. Note, though, that I don't know how enabling print-circle will affect the results. That is, if a value is replaced with a circular reference, it would probably be helpful for that reference to be a link to the actual value.

inspector--print-truncated uses with-temp-buffer and I suspect that must be one of the possible causes of the problem.

Yes, I'm guessing that would lead to excessive GC. A simple and common mitigation technique is to use a single temp buffer for such a function and erase it before each use.

@mmontone
Copy link
Owner

inspector--print-truncated uses with-temp-buffer and I suspect that must be one of the possible causes of the problem.

Yes, I'm guessing that would lead to excessive GC. A simple and common mitigation technique is to use a single temp buffer for such a function and erase it before each use.

I'm going to try that.

mmontone added a commit that referenced this issue Sep 25, 2023
@mmontone
Copy link
Owner

I've pushed an implementation. It feels fast, but I haven't measured with big data yet.

@mmontone
Copy link
Owner

mmontone commented Sep 27, 2023

I'm releasing a version with your patch, as it helps at least with this issue.

Thanks. Note, though, that I don't know how enabling print-circle will affect the results. That is, if a value is replaced with a circular reference, it would probably be helpful for that reference to be a link to the actual value.

It could be interesting to try a custom implementation of printing of circular structures for the inspector.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants