Clock scan - replacing regexp with scan #34

UnitedMarsupials-zz · 2019-05-03T00:53:33Z

In competition with #4, I took a shot today at modifying the ::tcl::clock::scan to replace usage of regexp with scan. Obviously, the gains are different depending on the format-string complexity.

The below output from my non-regression test shows the timings for both the current code (in Tcl-8.7a1) and mine for several formats.

In both cases the first iteration was ignored -- those take the longest to parse the format initially. The results show average timings for 1000 subsequent iterations, after the scanner-procedure is already created.

With TCL compiled for debugging (no optimization)

Thu May 02 19:18:21 2019:       1024.239 microseconds per iteration
Thu May 02 19:18:21 2019:       230.092 microseconds per iteration
Monday 2019:    412.82 microseconds per iteration
Monday 2019:    195.325 microseconds per iteration
Thu May 02 19:18:21 EDT 2019:   1780.403 microseconds per iteration
Thu May 02 19:18:21 EDT 2019:   246.275 microseconds per iteration
Thu 02 19:18:21 2019:   753.468 microseconds per iteration
Thu 02 19:18:21 2019:   252.612 microseconds per iteration
02/11/18:       273.898 microseconds per iteration
02/11/18:       188.197 microseconds per iteration

TCL compiled with `-O2`

Thu May 02 19:18:21 2019:       835.164 microseconds per iteration
Thu May 02 19:18:21 2019:       159.321 microseconds per iteration
Monday 2019:    266.272 microseconds per iteration
Monday 2019:    139.571 microseconds per iteration
Thu May 02 19:18:21 EDT 2019:   1396.415 microseconds per iteration
Thu May 02 19:18:21 EDT 2019:   168.879 microseconds per iteration
Thu 02 19:18:21 2019:   532.426 microseconds per iteration
Thu 02 19:18:21 2019:   273.006 microseconds per iteration
02/11/18:       188.973 microseconds per iteration
02/11/18:       134.863 microseconds per iteration

The default -- the longest of the tested -- format shows the biggest improvement (7-8 times -- I hope, the bounty would be based on this :-) ).

Should I continue working in this direction -- making sure, existing test-cases all pass, etc., or is @aidanhs so far ahead of me already, I may as well not bother?

The text was updated successfully, but these errors were encountered:

sebres · 2019-05-03T10:54:35Z

I think this bounty is closed.
Flightaware uses my module tclclockmod as long as it is still not released in tcl.core (branch "sebres-8-6-clock-speedup-cr2"), and looks like already accepted from TCT in large part.

BTW, the complete implementation is impossible with pure scan, for example because to be backwards compatible one would need greedy as well as non greedy matching rules for parts with dynamic length, etc pp and you'd almost end up with own implementing of RE-engine (which would be then slower as NRE using scan with dynamic coating).

UnitedMarsupials-zz · 2019-05-03T13:39:16Z

My approach -- so far -- was not a complete rewrite, I still use the existing functions to parse various names, for example. My change only replaces regexp with scan, which is quite a bit faster. I expect it to be fully compatible with the existing implementation -- warts and all -- whatever that's worth.

Another approach would be to continue using regexp, but move the regular expression-strings outside of the scanner-procedures. I suspect, the expressions are currently recompiled anew every time the proc is called. Making them global (inside the namespace) would retain the first compilations' result for each one, I think.

resuna · 2019-05-03T14:32:20Z

This bounty has been completed and we are just waiting for the TCL core team to pick it up. I will update the readme.

UnitedMarsupials-zz · 2019-05-03T19:58:48Z

Oh well, as a TCL user I'm certainly glad to hear about this development and I can't wait for the improvements to hit a TCL release (and, maybe, be backported into 8.6 as well).

For the record, here is the gist of my own changes. As I indicated, I replace the regexp-call followed by multiple calls to scan with a single scan. For example, here are the changes for the handling of %y - changes for other for other formatting characters are very similar:

                    y {                 # Two-digit Gregorian year
-                       append re \\s*(\\d\\d?)
+                       append re " %02d"
                        dict set fieldSet yearOfCentury [incr fieldCount]
-                       append postcode "dict set date yearOfCentury \["  "::scan \$field" [incr captureCount] " %d"  "\]\n"
+                       append postcode "dict set date yearOfCentury "  "\$field" [incr captureCount] "\n"
                    }

And then, at the end:

@@ -340,13 +336,9 @@
     }

-    # Clean up any unfinished format groups
-
-    append re $state \\s*\$
-
     # Build the procedure

     set procBody {}
     append procBody "variable ::tcl::clock::TZData" \n
-    append procBody "if \{ !\[ regexp -nocase [list $re] \$string ->"
+    append procBody "if \{ !\[ ::scan \$string {$re} "
     for { set i 1 } { $i <= $captureCount } { incr i } {
        append procBody " " field $i

On my humble i5, running FreeBSD-i386, I get the following timings (microseconds per iteration):

Format	String	Current regexp-based implementation	New scan-based implementation
%a %b %d %H:%M:%S %Y	Thu May 02 19:18:21 2019	822.6088	128.5014
%A %Y	Monday 2019	272.235	129.8344
%A %Y	Monday 2011	275.4886	125.038
%a %b %d %H:%M:%S %Z %Y	Thu May 02 19:18:21 EDT 2019	1344.6232	136.2222
%a %d %H:%M:%S %Y	Thu 02 19:18:21 2019	536.1878	146.4762
%Z %Z	EDT EST	227.217	135.1222
%w %y	3 11	172.109	123.082
%A %Y	Wombat 2019	Both fail
%m/%d/%y	02/11/18	191.5808	117.0322

May this be a lesson to anyone using regexp, where scan would do... :)

(Oh, and making the regular-expression strings global didn't help at all. I guess, the compilation-results are cached with the procedure's bytecode, when the regex is a known constant string.)

sebres · 2019-05-04T12:17:43Z

May this be a lesson to anyone using regexp, where scan would do... :)

The emphasis is here on the word "where". Again, it is not really always possible and I'm sure you'll see that not later than you would start a test-cases for you modified clock (like tclsh tests/clock.test).

BTW, I use neither the regexp nor the scan in my engine... (C-lang rules;).
Anyway on my humble i5, I don't see the times (822µs) you have measured by regexp (you may missed warming up or something else). Anyway here are my times:

-# original tcl
+# tclclockmod (or clock-perf-branch)
 % timerate {clock scan "Thu May 02 19:18:21 2019" -format "%a %b %d %H:%M:%S %Y" -gmt 1 -locale en}
-167.844 µs/# 5957 # 5957.9 #/sec 999.846 nett-ms
+1.006097 µs/# 947962 # 993939 #/sec 953.742 nett-ms

Additionally I cannot believe the regexp should be so slower as scan (vise versa I saw regexp many time faster sometimes). Like here in this example:

% timerate {scan 02 "%02d"}
0.579908 µs/# 1598753 # 1724412 #/sec 927.129 nett-ms
% timerate {regexp {^\d\d?$} 02}
0.269321 µs/# 3175604 # 3713044 #/sec 855.256 nett-ms

UnitedMarsupials-zz · 2019-05-05T20:37:02Z

C-lang rules
It certainly does -- and the best illustration is you getting the benchmark down 160 times!

The i5 I used before was at 2GHz -- used inside a FreeBSD/i386 VM running inside a Windows guest.

Not sure, what your timerate does... Trying the below simple script:

puts [time {scan 11 %2d ns} 10000]
puts [time {regexp {^(\d{2})$} 11 nr} 10000]

puts "$ns vs. $nr"

on a FreeBSD/amd64 using the E5-1620 CPUs at 3.6GHz, I get (with tcl-8.6):

0.763 microseconds per iteration
1.1125 microseconds per iteration
11 vs. 11

That is, scan is a lot faster even for simple anchored expressions...

Now, the default compiler on FreeBSD is clang. So I built tcl8.7 with gcc8 and tried the same script again. Same thing:

0.7488 microseconds per iteration
1.1592 microseconds per iteration
11 vs. 11

sebres · 2019-05-05T21:16:19Z

Not sure, what your timerate does...

timerate is a new measurement command for all newest Tcl-versions (since TIP 527 got merged in Tcl >= 8.5)

That is, scan is a lot faster even for simple anchored expressions...

You may be a bit wrong:

your number of iterations (10000) is too small - the time and overhead of the compilation of regex (occurred 1st time only) is too large compared to the whole execution time (in my example above using timerate it has reached 3175604/1598753 iterations by regexp/scan accordingly in 1 second); this is that what I meant as warming-up.
command time is not really suitable for a performance measurement (because the script body does not invoked as compiled inside single iterations, so the overhead for jit-related thing is simply too large, especially on very fast commands); this means also it is executed not as byte-compiled instructions (and then scan could be faster, due to simpler and faster "non compiled" execution).

Anyway, also by your times, 0.7 to 1.1 is not really a lot faster in my opinion (factor 1.5) and does not explain the times you meant before - 128 vs. 822 (factor 6.5).

UnitedMarsupials-zz · 2019-05-06T02:19:16Z

timerate is a new measurement command for all newest Tcl-versions

Interesting -- FreeBSD ports do not install it (neither for tcl86 nor for tcl87). I'll investigate...

You may be a bit wrong:

Ok, here is the updated script, taking your concerns into account: 100-times more iterations, one separate iteration each to have the expression compiled:

set scan	{scan 11 %2d ns}
set regexp	{regexp {^(\d{2})$} 11 nr}

eval $scan	;# The first, throw-
eval $regexp	;# away evaluations.

puts [time $scan 1000000]
puts [time $regexp 1000000]

puts "$ns vs. $nr"

The per-iteration numbers are as before...

0.7 to 1.1 is not really a lot faster in my opinion (factor 1.5)

That's consistent with what I've got for the short strings above (like %Z %Z).

128 vs. 822 (factor 6.5)

These differences are what I've got for the longer expressions -- %a %b %d %H:%M:%S %Y.

sebres · 2019-05-06T08:40:32Z

FreeBSD ports do not install it (neither for tcl86 nor for tcl87). I'll investigate...

Although it is merged but not yet released. :) You must compile newest sources from fossil-repository or clones (e. g. on github).

resuna closed this as completed May 3, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Clock scan - replacing regexp with scan #34

Clock scan - replacing regexp with scan #34

UnitedMarsupials-zz commented May 3, 2019

sebres commented May 3, 2019

UnitedMarsupials-zz commented May 3, 2019

resuna commented May 3, 2019

UnitedMarsupials-zz commented May 3, 2019 •

edited

Loading

sebres commented May 4, 2019

UnitedMarsupials-zz commented May 5, 2019

sebres commented May 5, 2019

UnitedMarsupials-zz commented May 6, 2019

sebres commented May 6, 2019

Clock scan - replacing regexp with scan #34

Clock scan - replacing regexp with scan #34

Comments

UnitedMarsupials-zz commented May 3, 2019

With TCL compiled for debugging (no optimization)

TCL compiled with -O2

sebres commented May 3, 2019

UnitedMarsupials-zz commented May 3, 2019

resuna commented May 3, 2019

UnitedMarsupials-zz commented May 3, 2019 • edited Loading

sebres commented May 4, 2019

UnitedMarsupials-zz commented May 5, 2019

sebres commented May 5, 2019

UnitedMarsupials-zz commented May 6, 2019

sebres commented May 6, 2019

TCL compiled with `-O2`

UnitedMarsupials-zz commented May 3, 2019 •

edited

Loading