
Snoopi "misses" top-level functions if the whole block gets inferred. i.e. @time @snoopi breaks results. #141

Open
NHDaly opened this issue Oct 1, 2020 · 3 comments

Comments


NHDaly commented Oct 1, 2020

I've been meaning to post about this issue for a while, because its behavior has been baffling me for some time.

I was finally gearing up to post about it today, but thanks to some great sleuthing work together with @Sacha0, we managed to understand the situation! :) I'm not sure about a fix yet, but I'm glad to be able to understand and explain everything that's going on.


First, here is the bizarre behavior I've observed, which maybe others have seen before, too:

Adding @time to a @snoopi call changes the results!

Consider this example: I create a module with some functions, forcing an inference barrier so we can see inference at work. It includes a forced runtime dispatch, so we see three separate results from @snoopi - one for each of the top-level calls to inference. This result is repeatable; as long as you redefine the module and re-snoop, you again get [i, i2, g].

The interesting behavior is that when we then @time the call to @snoopi, we are missing the top-level inference call! We instead only get the calls for the functions called via dynamic dispatch.

julia> module M
          i(x) = x+5
          i2(x) = x+2
          h(a::Array) = i2(a[1]::Integer) + i(a[1]::Integer) + 2
          g(y::Integer, x) = h(Any[y]) + Int(x)
       end;

julia> timings = SnoopCompileCore.@snoopi let
              M.g(2,3.0)
          end;             # Run once to compile all the downstream functions (like + and getindex) to make the subsequent logs clearer.

julia> module M
          i(x) = x+5
          i2(x) = x+2
          h(a::Array) = i2(a[1]::Integer) + i(a[1]::Integer) + 2
          g(y::Integer, x) = h(Any[y]) + Int(x)
       end;
WARNING: replacing module M.

julia> timings = SnoopCompileCore.@snoopi let
              M.g(2,3.0)
          end
3-element Array{Tuple{Float64,Core.MethodInstance},1}:
 (0.0002079010009765625, MethodInstance for i(::Int64))
 (0.000225067138671875, MethodInstance for i2(::Int64))
 (0.0015878677368164062, MethodInstance for g(::Int64, ::Float64))

julia> module M
          i(x) = x+5
          i2(x) = x+2
          h(a::Array) = i2(a[1]::Integer) + i(a[1]::Integer) + 2
          g(y::Integer, x) = h(Any[y]) + Int(x)
       end;
WARNING: replacing module M.

julia> @time timings = SnoopCompileCore.@snoopi let
              M.g(2,3.0)
          end
  0.004031 seconds (1.45 k allocations: 86.355 KiB)
2-element Array{Tuple{Float64,Core.MethodInstance},1}:
 (0.00015401840209960938, MethodInstance for i(::Int64))
 (0.00024318695068359375, MethodInstance for i2(::Int64))

I found this result quite surprising! But don't worry, if you don't already see the problem, it'll be clear in the end.


Next, here is another very surprising behavior I've seen with the @time macro, which baffled us for a few hours:

julia> elapsedtime1 = Base.time_ns(); @time db = Delve.API.create_database();   elapsedtime = Base.time_ns() - elapsedtime1;
 29.711536 seconds (34.38 M allocations: 1.745 GiB, 2.00% gc time)

julia> elapsedtime / 1e9
98.999127862

create_database() is some internal function in our codebase that takes quite a long time to compile.

But somehow, the time reported by @time is 70 seconds less than the time I measured! It's weird, right!?


We finally found the answer, and it's the same answer for both cases. What is happening here is that the @time macro causes its entire body to be compiled before executing it. And this includes the body of the code that we want to infer with @snoopi!

So basically, by the time we actually get to the part of the code that enables snooping within julia, we've already finished inferring the top-level call we wanted to measure!

We can of course fix this with any kind of inference barrier inside the body we're measuring. For example, an @eval will do it:

julia> module M
          i(x) = x+5
          i2(x) = x+2
          h(a::Array) = i2(a[1]::Integer) + i(a[1]::Integer) + 2
          g(y::Integer, x) = h(Any[y]) + Int(x)
       end;
WARNING: replacing module M.

julia> @time timings = SnoopCompileCore.@snoopi let
              @eval M.g(2,3.0)   # thanks to this @eval, we now delay inferring `g()` until _after_ we've enabled snooping.
          end
  0.017935 seconds (11.62 k allocations: 569.847 KiB)
3-element Array{Tuple{Float64,Core.MethodInstance},1}:
 (0.0002300739288330078, MethodInstance for i(::Int64))
 (0.0002770423889160156, MethodInstance for i2(::Int64))
 (0.0019459724426269531, MethodInstance for g(::Int64, ::Float64))

And so in my example above, where we'd always thought our create_database() function takes 30 seconds to compile the first time, it ACTUALLY takes 100 seconds! The @time timer was only started after all the top-level inference had already completed! And since it's fairly type-stable code, the compiler was apparently able to infer most of the way through it up front.


Hooray! :) It all makes sense now!

But I'm not sure what should be done about it.

It strikes me that this may surprise other users, since users of @snoopi really are intending to measure inference times for the body of the macro. So it seems to me that we should change the macro to go out of its way to prevent the body from being inferred before we enable snooping.

I'm not sure if we should go as far as adding an @eval, since that does actually change the meaning of the code, but is there any other clever way to add an inference barrier that we could use?
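For illustration, one candidate barrier is Base.invokelatest: wrapping the work in a zero-argument closure and calling it through invokelatest forces a runtime dispatch, so the closure's body shouldn't be inferred until the call actually executes. This is just a sketch of the idea; whether it fully solves the @snoopi problem here is untested:

```julia
f(x) = x + 1

# Wrap the work in a zero-argument closure; calling it via Base.invokelatest
# forces a dynamic dispatch, so `body`'s code is not inferred until this call
# actually runs (i.e. after snooping would already have been enabled).
body = () -> f(2)
result = Base.invokelatest(body)
# result == 3
```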

:) Thanks!


NHDaly commented Oct 1, 2020

For completeness, the relevant part of the @time macro that makes it different from my manual elapsed-time measurement is this `while false end` trick to force the compiler to compile the whole block:

https://github.com/JuliaLang/julia/blob/4422648a3c4d136dc1c2f5617041cae95e05dbaf/base/timing.jl#L182
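A minimal sketch of the trick (greatly simplified; the real @time expansion does much more): placing `while false end` at the top of a block makes Julia compile, and hence infer, the entire block before running any of it:

```julia
# Hypothetical macro illustrating the idea behind @time's expansion:
# the `while false end` never executes, but its presence makes the top-level
# interpreter hand the whole block to the compiler up front.
macro fully_compiled(ex)
    quote
        while false end          # forces compilation of the entire block
        $(esc(ex))
    end
end

x = @fully_compiled(1 + 2)
# x == 3
```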


NHDaly commented Oct 2, 2020

After thinking more about this, I don't see any way to avoid this other than to put the body of the macro into an eval statement.

And I now increasingly think this would be a good change; it mirrors the way @snoopc (and @snoopl) run the body in an entirely new Julia process. It seems quite reasonable to me to run the body at top level. That's a less wild change, and I think it will provide results that are closer to what users expect! :)

Does that sound reasonable to you?
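A rough sketch of what that change could look like; the hooks and names here are placeholders I've made up, not SnoopCompileCore's real internals:

```julia
# Placeholder hooks standing in for enabling/disabling inference snooping;
# these are NOT the real SnoopCompileCore internals.
start_snooping() = nothing
stop_snooping() = nothing

# Splice the macro body into a Core.eval at the expanding module, so the body
# isn't inferred until after snooping has been switched on.
macro snoop_toplevel(ex)
    mod = __module__
    quote
        start_snooping()
        try
            Core.eval($mod, $(QuoteNode(ex)))
        finally
            stop_snooping()
        end
    end
end

y = @snoop_toplevel 1 + 2
# y == 3
```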


timholy commented Nov 12, 2020

Yes, it does seem that we need to hide it a bit.
