-
Notifications
You must be signed in to change notification settings - Fork 117
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Re-added the string cache, disabled by default (bonus: also thread-safe) #172
base: master
Are you sure you want to change the base?
Conversation
…lso very importantly memory fragmentation). Added StringCache.pas. Changed attributes to use the string cache. Note required a define USESTRINGCACHE which is off by default. The cache is not threadsafe.
…ionary of attributes, but attributes no longer work as a dictionary.)
…en threadsafe, any individual instance has get/add method contents wrapped in a lock, to lock the internal structures. The instance is never cleared when the count of objects using it drops to 0, since that's a bit more complex to get right (should only clear while the refcount is 0, but have to lock that to ensure it's not changed while clearing is happening, which severely slows inc/decrementing the count. I couldn't think of a good compare-lock-exchange algorithm. So just don't bother; it's never cleared while alive, when threadsafe.)
…both operations; now holds it for both. Safe to enter a CS twice.)
This reverts commit 95233e9.
Thanks, @vintagedave Btw, I'm asking because I've never tried. You use class as a dictionary key. Does it work? I afraid it could use pointer, not a content. |
Great :) I also have another branch (not done yet) for using a memory pool for all syntax nodes, using this technique: https://parnassus.co/custom-object-memory-allocation-in-delphi-bypassing-fastmm-for-fun-and-profit/ Is that of interest too? Re dictionary - is that TStringCache.FStringToId? It would use the pointer, but I construct the dictionary with a TStringRecValueEqualityComparer instance which is use for the comparisons. It compares by the string value in each TStringRec only. |
Ah, OK. I see :) |
Do you think memory fragmentation is an issue? |
For me it is, but that only occurs when using DAST a lot, and under memory pressure, eg in the pre-XE8 IDE. It might be for you for FixInsight over a large project, for example. I think FI still runs DAST a lot less than Navigator or Bookmarks do, which basically re-parse after a short delay after every single code change. |
Can't this be solved way easier?
|
Yes, though the current code allows usage tracking - you can get stats out On 28 April 2016 at 18:44, Stefan Glienke [email protected] wrote:
|
Stefan's approach seems to be less intrusive for me. I think it can be improved by adding a method that will output statistics. We can read string's ref counter for usage tracking. What do you think @vintagedave ? |
Yes, I think it's much better. @sglienke told me he submitted a patch where strings are interned in the lexer itself, which means it's much lower-level too. I'd go with that option as much better than mine. |
I don't see his pull request :) |
Look at the feature/stringinterning branch of my fork to see my approach - I did not submit a PR because this is just to show the concept. |
So everyone agree to take Stefan's implementation of string cache? |
For me, yes. |
My string pool implementation was taken into master long ago - this can be closed |
I added a string cache to DAST in August last year, which kept down memory fragmentation when it was used repeatedly. It was never pulled into the main branch and that branch is now very hard to merge. I've re-implemented it.
It's controlled by a define, which is off by default, so behaviour is unchanged unless you explicitly turn it on. If you turn it on, attributes and valued syntax nodes share strings. This greatly reduces memory fragmentation and also memory usage. The actual changed code is quite small so it should be easy to maintain or keep in the product.
I investigated using injection or some similar "clean code" behaviour to control this, instead of a define, but it's hard when also keeping it performant. Since not much code is affected, ifdef-ing the required places is cheap and hidden from the external user.
The string cache also has basic thread safety added, where the Add and Get operations hold a critical section while modifying the values. This too is controlled by a define, but it's on by default. It hasn't been extensively reviewed, but I couldn't see any major logic errors.