new ICL mechanism (TIP 280+530 replacement) v3, partial string segments back-port #6

Open: wants to merge 16 commits into base: core-8-6-branch

@sebres (Owner) commented Sep 19, 2019

Interim, artificial PR.

It will be rebased (partially) onto the Fossil repository, so this PR is temporary and exists only for review, comments, etc.
I'll provide an RFE as soon as it is introduced in the Fossil repo.

Contents:

This is a replacement implementation of TIP 280 (and of the TIP 530 workaround) with faster and more precise handling of ICL (invisible continuation lines) in the parsing, substitution and compilation routines.
The implementation has no bottleneck (there is no global ICL hash table anymore), so it has no line-continuation overhead (and thus supersedes TIP 530).
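
For readers unfamiliar with the term: a continuation line is a backslash-newline pair inside a script, which Tcl substitutes away, so its position has to be remembered separately if line numbers are to stay accurate. The sketch below only illustrates that bookkeeping as a plain array of offsets that can live next to the parsed source instead of in a global table. `FindContinuations` is a hypothetical helper, not a function from this PR or from Tcl.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of Tcl or of this PR): stores the byte
 * offset of every backslash-newline pair found in `script` into `offsets`
 * (at most `max` entries) and returns the total number of pairs found.
 */
static size_t
FindContinuations(const char *script, size_t len, size_t *offsets, size_t max)
{
    size_t i, count = 0;

    for (i = 0; i + 1 < len; i++) {
        if (script[i] != '\\') {
            continue;
        }
        if (script[i + 1] == '\n') {
            if (count < max) {
                offsets[count] = i;   /* offset of the backslash itself */
            }
            count++;
        }
        /* Skip the escaped character, so an escaped backslash followed
         * by a newline is not mistaken for a continuation. */
        i++;
    }
    return count;
}
```

Because the offsets are relative to the scanned text, they can be stored together with that text and discarded with it, which is one way to avoid a process-wide lookup keyed by object address.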

Because some parts (back-ported from my feature branches) are incompatible, and because of many artificial tests (as well as obvious bugs and mistakes in some Tcl tests), this variant does not implement the whole functionality I originally intended.
My other parser implementation is still more precise and faster (thanks to in-place wrapping of continuations and escape sequences, i.e. on demand), but as already said it is not quite compatible.
Therefore some parts are reverted or rewritten here, and this variant is almost 100% compatible now.

As for bugs in the tests and in the original implementation: commit f80a3ec "fixes" the bug in a test case and illustrates why it is basically an error.

More info later...

Additionally, it contains new internal primitives (string segments) that allow objects to share the same code or string and parts of it, unless the object shimmers to an unsupported type or an NTS string representation is expected...
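
The sharing idea can be sketched roughly as follows. This is a minimal illustration under assumptions, not the PR's data structure: the type `StrSeg` and the functions `StrSegNewRoot`, `StrSegSlice` and `StrSegRelease` are hypothetical names. A root segment owns the bytes; a slice only points into its parent and keeps the parent alive through a reference count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout; the real string-segment type is not shown here. */
typedef struct StrSeg {
    struct StrSeg *parent;  /* NULL for a root segment that owns `bytes` */
    char *bytes;            /* slices point into the parent's buffer */
    size_t length;          /* slices are NOT NUL-terminated at `length` */
    int refCount;
} StrSeg;

/* Creates a root segment holding its own copy of `src`. */
static StrSeg *
StrSegNewRoot(const char *src, size_t len)
{
    StrSeg *seg = malloc(sizeof(StrSeg));

    if (seg == NULL) {
        return NULL;
    }
    seg->bytes = malloc(len + 1);
    if (seg->bytes == NULL) {
        free(seg);
        return NULL;
    }
    memcpy(seg->bytes, src, len);
    seg->bytes[len] = '\0';
    seg->parent = NULL;
    seg->length = len;
    seg->refCount = 1;
    return seg;
}

/* Creates a slice that shares the parent's bytes without copying them. */
static StrSeg *
StrSegSlice(StrSeg *parent, size_t offset, size_t len)
{
    StrSeg *seg;

    assert(offset <= parent->length && len <= parent->length - offset);
    seg = malloc(sizeof(StrSeg));
    if (seg == NULL) {
        return NULL;
    }
    seg->parent = parent;
    parent->refCount++;     /* the slice keeps its parent alive */
    seg->bytes = parent->bytes + offset;
    seg->length = len;
    seg->refCount = 1;
    return seg;
}

/* Drops one reference; frees the segment and, transitively, its parents. */
static void
StrSegRelease(StrSeg *seg)
{
    while (seg != NULL && --seg->refCount == 0) {
        StrSeg *parent = seg->parent;

        if (parent == NULL) {
            free(seg->bytes);   /* only a root owns the buffer */
        }
        free(seg);
        seg = parent;
    }
}
```

In this sketch a slice is not NUL-terminated, so a consumer that needs an NTS string would have to make its own copy. That is one plausible reading of the restriction mentioned above, not a statement about how the PR handles it.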

More info later...

Pros:

  • several objects or parts of objects (byte code, etc.) can share the same string segment (and so the same string
    representation);
  • it is a faster replacement for TIP #280, as well as for TIP #530 (which becomes unneeded if this is merged);
  • therefore lower memory consumption (and fewer CPU cache washouts);
    for example, the Tcl test suite running 10266 tests (filtered) within a single interpreter previously consumed 185MB, whereas the new version reserves only 140MB of memory on the same set of tests.
  • it corrects several issues produced by the previous implementation of the TclContinuations* routines (e.g. the clLocPtr
    binding to the object address is not affected by in-place modification of the object, so if a previously
    non-canonical unshared list gets new elements (or shrinks), its string representation changes, but
    it still retains its old clLocPtr (bound in the global hash table to the address of the object only);
    this is bad and can produce very unexpected results when invisible continuation lines are delivered).
  • the new feature opens an opportunity to share segments across threads (if the string segment gets thread-safe ref-counting and
    source/thread- or tpool-send/etc. are extended); similar handling has been used successfully in my own fork
    for several years already;
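
The stale-binding problem from the list above can be reproduced in miniature. The sketch below uses hypothetical stand-ins (`Obj`, `ClEntry`, `ClEnter`, `ClLookup`, `ObjSetInPlace`, `ClEntryStillValid`), not Tcl's real types or the TclContinuations* routines: a side table keyed only by object address keeps returning an old offset after the object's content has been changed in place.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins, not Tcl's real types. */
typedef struct Obj {
    char bytes[64];
    size_t length;
} Obj;

typedef struct ClEntry {
    const Obj *key;     /* bound to the object's address only */
    size_t offset;      /* offset of a backslash-newline in the old content */
} ClEntry;

#define TABLE_SIZE 8
static ClEntry table[TABLE_SIZE];
static size_t tableUsed = 0;

/* Registers a continuation offset for the object at this address. */
static void
ClEnter(const Obj *objPtr, size_t offset)
{
    assert(tableUsed < TABLE_SIZE);
    table[tableUsed].key = objPtr;
    table[tableUsed].offset = offset;
    tableUsed++;
}

/* Looks the entry up by address; content changes are invisible here. */
static const ClEntry *
ClLookup(const Obj *objPtr)
{
    size_t i;

    for (i = 0; i < tableUsed; i++) {
        if (table[i].key == objPtr) {
            return &table[i];
        }
    }
    return NULL;
}

/* Replaces the object's content without changing its address. */
static void
ObjSetInPlace(Obj *objPtr, const char *s)
{
    size_t len = strlen(s);

    assert(len < sizeof(objPtr->bytes));
    memcpy(objPtr->bytes, s, len + 1);
    objPtr->length = len;
}

/* Checks whether the recorded offset still marks a backslash-newline. */
static int
ClEntryStillValid(const Obj *objPtr, const ClEntry *entry)
{
    return entry->offset + 1 < objPtr->length
        && objPtr->bytes[entry->offset] == '\\'
        && objPtr->bytes[entry->offset + 1] == '\n';
}
```

After `ObjSetInPlace` the lookup still succeeds because the address is unchanged, but the recorded offset no longer describes the content. Keeping the offsets with the string data rather than with the address removes that mismatch by construction.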

Cons:

  • although the solution manages memory more economically than the original, at the moment it can reserve more memory than really needed (for example, if some part of the source code is executed only once during evaluation of a script), because that code currently stays reserved in the root of the string segment.
    Possible solutions are:
    • either code segments are created during parsing (each body would get its own code segment,
      or the unneeded parts are cut out at the end of compilation, so large segments are already split at parse time);
    • or a small "GC" rewrites or splits root code segments after compilation (e.g. during optimization);
  • a certain incompatibility (or rather discrepancy) for some Tcl objects (such as parts of code, bodies, etc.) that initially have no string representation, whereas previously it was always available;
    this mostly affects internals only, for example Tcl_ObjHasBytes(objPtr) (which is still internal) vs. objPtr->bytes
…tLineInfo which is allocated with ByteCode now during compile process).
…h an implementation of code- and string-segment objects (amend/review needed)
…object in compEnv.strSegPtr (env.test, exec.test)
… table (replacement of TIP 530 applied), still shimmering problem with list objects becoming continuation (amend expected);

extend List to avoid possible shimmering issues on invisible continuations.
… with support of list-type (allows to find string segment of list)
…8, this change illustrating "broken" behavior of previous implementation of TclContinuations* routines:

  the clLocPtr binding to the object address is not affected by in-place modification of the object (which is basically a different object then),
  so if an unshared string gets amended, or a previously non-canonical unshared list gets new elements (or shrinks),
  its string representation changes, but it still retains its old clLocPtr (bound in the global hash table to the address of the object only);
  this is bad and can produce very unexpected results when invisible continuation lines are delivered
  (including SF or the panic "Derived ICL data for object using offsets from before the script").
…EGREP really create full (own) segment representation, which avoids rewriting ICL in shared segments (supplied as clNext to TclContinuationsEnterDerived); todo: rewrite this without using TclContinuationsEnterDerived at all (it would not be needed if segments were completely back-ported);

+ protect against several cases that could occur very rarely;
@sebres changed the title from "Sebres line cont tip280 530 v3" to "new ICL mechanism (TIP 280+530 replacement) v3, partial string segments back-port" on Sep 19, 2019