-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tclx profile support for 8.6 #11
Comments
Further simplified the changes for Tcl 8.6 in check-in |
This is the patch for tclXprofile.c, no Tcl core changes seem necessary https://www.androwish.org/index.html/fdiff?v1=a6f5cf9f8b21bf4d&v2=eaeaabfa2e4ac909&patch |
Stupid me, I was on a wrong old TclX checkout. Seems some of the work was done 2 years ago. An unoptimal draft implementation for TclX's profile command using "trace add|remove execution ..." internally is in https://www.androwish.org/index.html/info/c6070d8891e3d4a0 It is rudimentary tested with a somewhat fused profile.test and may still have memory leaks. As I have no more time to spend on this topic, if anybody finds it useful, it could serve as a starting point. |
Thanks for the effort. The bounty is still up for grabs. |
I ran valgrind full leak check on a small example and did not see any leaks. |
By Tcl_Obj'fying parts of TraceExecutionProc() of tclTrace.c the overhead can be reduced to about 60% which is still slow compared to unprofiled execution, see attached patch. But this is a core change and as usual unfinished, untested, and untipped ;-) |
Here's the slightly improved patch: And this script (including some recorded output) was used to get an idea on profile's timing impact: Conclusion: profiled execution is 100 to 250 times slower on 8.6.6 (core-8-6-branch). |
For some reason unknown to me, the Tcl_FindCommand() call does not find the "yield" when called from the Tclx profiler. If the profile code ignores/works-around this error it appears to collect profile data for coroutines without crashing. Here is a patch to ignore the issue with "yield" as proof. It should do until the issue with Tcl_FindCommand() is understood and/or fixed. I don't think the Tcl_Obj-ing of the Tcl trace code requires a TIP. A Core maintainer should be able to preview and apply the patch. The "crash" is fixed and the profile results seem reasonable. I think the next step is for FA to try the profiler on some real code to see if the performance and results are expectable. |
Regarding the "yield" effect/crash please would you post a test case? I've tried this snippet (w/o the yield patch) and had no crashes (on 32bit i386 with core-8-6-branch plus Tcl_Obj'fication):
|
Modify coroutine.test to load package Tclx and turn profile on.
This panic only occurs in tcltest and is due to Tcl_FindCommand returning NULL. I may have gone off the deep end with this. It looks like the Tcl_Obj-ifying of the TclTrace code is causing failures and a hang in the test suite. I'll work on getting a complete list of failures and post the results here. |
There are 11 failures, including event-5.3 which hangs so it has to be skipped in order to run the full suite. Looking into the problem I see that the patch is trying to build a list for the callback command. The basic problem is that although a list is a valid command, a command is not necessarily a valid list. Many of the failing cases have callback commands that are not valid lists, or when made into a list, end up not being the intended command. I conclude that the Tcl_Obj-ification of the tclTrace code is not possible. It must be a string that is subsequently parsed by Tcl. An alternative would be to add profiling directly into the core. The trace infrastructure could be used without having to bother with callbacks. |
Apparently, the first version of Tcl_Obj'fication wasn't a big hit. Here's the second version which uses an array of parsed Tcl_Obj's attached to the TraceCommandInfo struct: |
This patch is better, it has only 3 regressions in the test suite:
|
Turns out that the latest patch doesn't honour this important sentence from "man Tcl_EvalObjv": Thus, the new strategy is to build a pre-parsed Tcl_Obj array only for simple (single command) cases in order to speed up execution tracing, and to fall back to the old way of building a string to be Tcl_EvalEx'ed otherwise. See this patch: tclTrace-v3.diff.txt This does not fix the failure of "info-40.1" since in the optimized case the Tcl "info frame" layout is different and must be "info frame -2" instead of "info frame -3" for that test to succeed. As some centuries ago in Denmark, the fine question now is to claim or not to claim ;-) What are FA's expectations regarding impacts of profiling on overall timing, how compares this approach to Tcl 8.5? |
Draft implementation in AndroWish (and undroidwish), see this check-in
https://www.androwish.org/index.html/info/c29648d90f7440aa
Somewhat tested on Android x86 and Linux x86 where it didn't crash anymore.
The text was updated successfully, but these errors were encountered: