This repository has been archived by the owner on Feb 26, 2021. It is now read-only.

Meeting notes 3/20/2018 #38

Open
divega opened this issue Mar 20, 2018 · 19 comments

@divega

divega commented Mar 20, 2018

Attendees: @anpete @ajcvickers @bgrainger @DamianEdwards @sebastienros @divega

(feel free to make edits or reply if you have any clarification to make)

Status

MySqlConnector

A few new optimizations were added to MySqlConnector between 0.32 and 0.37 (thanks Bradley for joining the call!):

  • Fast path for single connection string in the pool
  • Opt out from ping to keep idle connections alive
  • Allocation reduction
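
As a rough illustration of the first item, a fast path for a single connection string could remember the most recently used pool and check it before doing a dictionary lookup. This is only a sketch under that assumption: `Pool`, `PoolLookup` and `GetPool` are hypothetical names, not MySqlConnector's actual types.

```csharp
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical stand-in for a real connection pool.
public sealed class Pool
{
    public Pool(string connectionString) { ConnectionString = connectionString; }
    public string ConnectionString { get; }
}

public static class PoolLookup
{
    private static readonly ConcurrentDictionary<string, Pool> s_pools =
        new ConcurrentDictionary<string, Pool>();

    // Most recently used pool; an app with one connection string always hits this.
    private static Pool s_lastPool;

    public static Pool GetPool(string connectionString)
    {
        // Fast path: a single string comparison (string equality checks references first).
        var last = Volatile.Read(ref s_lastPool);
        if (last != null && last.ConnectionString == connectionString)
            return last;

        // Slow path: dictionary lookup, then remember the result for next time.
        var pool = s_pools.GetOrAdd(connectionString, cs => new Pool(cs));
        Volatile.Write(ref s_lastPool, pool);
        return pool;
    }
}
```

Applications that alternate between several connection strings would mostly take the slow path, which is no worse than a plain dictionary lookup.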

Results are very encouraging:

| MySqlConnector, Linux, Fortunes | Best RPS |
|---|---|
| .NET Core 2.0 Libuv + 0.32.0 | 19,012 |
| .NET Core 2.0 Libuv + 0.37.0 | 80,754 |
| .NET Core 2.1 Sockets + 0.32.0 | 70,626 |
| .NET Core 2.1 Sockets + 0.37.0 | 129,195 |
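
To put the MySqlConnector numbers above in relative terms, the gain from 0.32.0 to 0.37.0 can be computed as a simple ratio. The helper below is only an illustrative sketch; `Throughput` and `Speedup` are made-up names.

```csharp
public static class Throughput
{
    // Ratio of new to old requests per second; 2.0 means twice the throughput.
    public static double Speedup(double oldRps, double newRps)
    {
        return newRps / oldRps;
    }
}
```

With the figures above, `Throughput.Speedup(19012, 80754)` is about 4.25 on .NET Core 2.0 Libuv, and `Throughput.Speedup(70626, 129195)` is about 1.83 on .NET Core 2.1 Sockets.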

@bgrainger has also implemented a "micro-provider" (homologous to Peregrine) for MySQL that is useful for experimenting with new features and interpolating their potential gains.

Next planned steps:

  • Lock free connection pool (~3% improvement expected)
  • Prepared and auto-prepared statements (~15% improvement expected)
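
For readers unfamiliar with the first item, here is a minimal sketch of what a lock-free pool of idle connections can look like: a fixed array of slots claimed and released with compare-and-swap instead of a lock. It is an assumption-laden illustration, not the MySqlConnector or Npgsql design; `LockFreePool`, `TryRent` and `TryReturn` are hypothetical names, and a real pool would also need maximum-size accounting, waiting and idle timeouts.

```csharp
using System.Threading;

// Hypothetical sketch of a lock-free idle-item pool.
public sealed class LockFreePool<T> where T : class
{
    private readonly T[] _slots;

    public LockFreePool(int capacity)
    {
        _slots = new T[capacity];
    }

    // Takes an idle item, or returns null when no idle item is available.
    public T TryRent()
    {
        for (var i = 0; i < _slots.Length; i++)
        {
            var item = Volatile.Read(ref _slots[i]);
            // Claim the slot only if it still holds the item we just observed.
            if (item != null &&
                ReferenceEquals(Interlocked.CompareExchange(ref _slots[i], null, item), item))
            {
                return item;
            }
        }
        return null;
    }

    // Puts an item back; false means the pool is full and the caller should close the item.
    public bool TryReturn(T item)
    {
        for (var i = 0; i < _slots.Length; i++)
        {
            // Succeeds only if the slot was empty at the moment of the swap.
            if (Interlocked.CompareExchange(ref _slots[i], item, null) == null)
                return true;
        }
        return false;
    }
}
```

The linear scan keeps the sketch short; under contention, threads that lose a compare-and-swap simply move on to the next slot rather than blocking.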

Npgsql

@roji released 4.0.0 preview1 to NuGet and @sebastienros took it for a spin. These are the results:

| Npgsql, Linux, Fortunes | Best RPS |
|---|---|
| .NET Core 2.0 Libuv + 3.2.7 | 100,683 |
| .NET Core 2.0 Libuv + 4.0.0 | 123,990 |
| .NET Core 2.1 Sockets + 3.2.7 | 152,406 |
| .NET Core 2.1 Sockets + 4.0.0 | 217,057 |

With these numbers we should land in the Top 10.

Next steps for Npgsql

  • @divega to ask for a review on the lock free code
  • @sebastienros is looking at some inconsistencies between the results we get in our own runs and TE CI.

Note that for both Npgsql and MySqlConnector:

  1. We are still seeing consistently 15-20% faster perf on Windows
  2. We can't get .NET Core 2.1 bits into the official benchmarks until we are at least in RC

EF Core

@anpete has checked in a few improvements for EF Core 2.1 that reduce allocations. We don't have results yet.

Query demultiplexing (aka gating, etc.)

We briefly discussed the concept of demultiplexing identical queries issued from different requests into a single query. A couple of variations of this have been raised by @sebastienros and @anpete. Although we don't know if this would be in the spirit of the benchmarks, it sounds like it could be very useful for real-world workloads, and is something we would like to pursue.
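
One possible shape for this idea, purely as a sketch: requests that ask for the same SQL text while a query is already in flight share that query's task instead of issuing their own. `QueryCoalescer`, `ExecuteAsync` and the `runQuery` delegate are hypothetical names, and keying on the raw SQL string (ignoring parameters and transactions) is a simplifying assumption.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical sketch: share one in-flight query among identical concurrent requests.
public sealed class QueryCoalescer<TResult>
{
    private readonly ConcurrentDictionary<string, Lazy<Task<TResult>>> _inFlight =
        new ConcurrentDictionary<string, Lazy<Task<TResult>>>();

    public Task<TResult> ExecuteAsync(string sql, Func<string, Task<TResult>> runQuery)
    {
        // Lazy ensures the query starts only once even if GetOrAdd runs the factory twice.
        var lazy = _inFlight.GetOrAdd(
            sql,
            key => new Lazy<Task<TResult>>(() => RunAndRemoveAsync(key, runQuery)));
        return lazy.Value;
    }

    private async Task<TResult> RunAndRemoveAsync(string sql, Func<string, Task<TResult>> runQuery)
    {
        try
        {
            return await runQuery(sql).ConfigureAwait(false);
        }
        finally
        {
            // Once finished, later requests start a fresh query rather than reuse a stale result.
            Lazy<Task<TResult>> removed;
            _inFlight.TryRemove(sql, out removed);
        }
    }
}
```

Results are shared only for the lifetime of the in-flight query, so this is coalescing rather than caching; callers must also treat the shared result as read-only.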

@anpete
Contributor

anpete commented Mar 21, 2018 via email

@roji
Member

roji commented Mar 21, 2018

We are still seeing consistently 15-20% faster perf on Windows

Hmm... Do you guys have a plan for investigating this? Since the numbers above are for Linux, that means that there's an additional 15-20% speedup that can be done... If we manage to get this fixed we could land in an even better place, so I'm guessing this should be pretty high priority?

There's also considerable potential here to optimize things for everyone using dotnet on Linux, not just data.

BTW I still intend to do cross-platform socket benchmarks (#30) but my work list is long and time is short...

We can't get .NET Core 2.1 bits into the official benchmarks until we are at least in RC

I'm assuming that's some sort of TE rule? What does it mean for Npgsql 4.0.0-preview1, should I be planning an RC (or RTM) for when you guys RC?

Lock free connection pool (~3% improvement expected)

That's odd, I saw significantly more when switching to lock-free (7-8% IIRC). In any case, I'll be looking into making my work available as an external component so that maybe it can be shared with the MySQL provider.

Prepared and auto-prepared statements (~15% improvement expected)

Here as well there's some potential for at least some code-copying across... Feel free to involve me when you start working on this @bgrainger.
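
To make the auto-prepare idea concrete, here is a minimal sketch of a per-connection cache that prepares a statement once its SQL text has been seen a given number of times. It is not the Npgsql or MySqlConnector implementation; `AutoPrepareCache`, `TryGetPrepared`, `minUsages` and the `prepare` delegate are hypothetical, and a real driver would also bound the cache and evict old entries.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical per-connection cache (not thread-safe; one connection, one caller at a time).
public sealed class AutoPrepareCache<TStatement>
{
    private readonly int _minUsages;
    private readonly Func<string, TStatement> _prepare;
    private readonly Dictionary<string, int> _usages = new Dictionary<string, int>();
    private readonly Dictionary<string, TStatement> _prepared =
        new Dictionary<string, TStatement>();

    public AutoPrepareCache(int minUsages, Func<string, TStatement> prepare)
    {
        _minUsages = minUsages;
        _prepare = prepare;
    }

    // Returns true with a prepared statement once the SQL text has been used minUsages times.
    public bool TryGetPrepared(string sql, out TStatement statement)
    {
        if (_prepared.TryGetValue(sql, out statement))
            return true;

        int count;
        _usages.TryGetValue(sql, out count);
        count++;
        if (count < _minUsages)
        {
            _usages[sql] = count;
            return false; // caller sends the query unprepared this time
        }

        statement = _prepare(sql); // stands in for asking the server to prepare the SQL
        _prepared[sql] = statement;
        _usages.Remove(sql);
        return true;
    }
}
```

The usage threshold keeps one-off queries from paying the cost of preparation, while hot queries such as the Fortunes query would be prepared after a few executions.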

@roji
Member

roji commented Mar 21, 2018

PS opened #39 to track and discuss the 15-20% Windows/Linux gap.

@niemyjski

This is amazing! Keep up the good work!

@tmds

tmds commented Mar 21, 2018

@divega The Sockets numbers are higher than Libuv, is that due to comparing .NET Core 2.1 with .NET Core 2.0? Does .NET Core 2.1 Libuv perform even better?

@divega
Author

divega commented Mar 21, 2018

@tmds I heard from @DamianEdwards and @sebastienros that sockets is actually faster in this scenario, but switching to sockets doesn't explain the whole delta. There have been various other improvements in .NET Core and ASP.NET Core 2.1.

@tmds

tmds commented Mar 21, 2018

@tmds I heard from @DamianEdwards and @sebastienros that sockets is actually faster in this scenario

That's surprising and interesting! My expectation for this test is that libuv and sockets should perform very similarly, with libuv on top. Can you share the numbers for libuv 2.1?

@roji
Member

roji commented Mar 21, 2018

Just to make sure everyone's on the same page, Npgsql itself is of course always on managed sockets, the libuv vs. sockets affects only the web side.

@tmds

tmds commented Mar 21, 2018

Exactly. That is why I expect similar results, since most of the work is the same. My expectation that libuv performs a bit better comes from it performing better on all other benchmarks.

@divega
Author

divega commented Mar 21, 2018

Can you share the numbers for libuv 2.1?

@sebastienros can you?

@sebastienros
Member

No, I obviously didn't do that on purpose ;) because I ran them when I built the tables and the differences were about noise level. Also, I chose Libuv for 2.0 and Sockets for 2.1 because this is what customers are actually on (it is the default chosen by WebHostBuilder).

@sebastienros
Member

If you want to see the differences between Libuv and Sockets on common scenarios I can refer you to these charts: https://aka.ms/aspnet/benchmarks

@tmds

tmds commented Mar 22, 2018

and the differences were about noise level

ok, that matches with my expectations.

We are still seeing consistently 15-20% faster perf on Windows

A large part of this probably is due to Npgsql using managed sockets. If it could leverage the Kestrel Transport, then for Libuv we'd see better performance.

@roji
Member

roji commented Mar 22, 2018

We are still seeing consistently 15-20% faster perf on Windows

A large part of this probably is due to Npgsql using managed sockets. If it could leverage the Kestrel Transport, then for Libuv we'd see better performance.

I'm not sure why that should be... Npgsql uses managed sockets on both Windows and Linux, they should be on par unless something's off in the .NET Linux socket implementation... Or it could be somewhere completely different in the runtime that's less optimized on Linux - we keep assuming it's socket-related but that remains to be shown. I'm also told that with the latest 2.1 managed socket performance is supposed to be very similar to libuv (at least on web-only perf).

At some point we did discuss testing Npgsql with libuv. But the problem is that Npgsql is by nature a client networking model - there typically isn't any event loop to hook into...

@sebastienros, do you see the same 15-20% Linux/Windows gap on plaintext, where no database is being used? This is in order to isolate whether the Linux/Windows gap is a result of something Npgsql does or just a general overall platform difference.

@tmds

tmds commented Mar 22, 2018

we keep assuming it's socket-related but that remains to be shown.

If you look at the performance of Json for Libuv vs Sockets here: https://aka.ms/aspnet/benchmarks
You see 463,160 vs 373,789, so +24% for Libuv.
For plaintext the difference is much smaller, because plaintext uses pipelined HTTP: only 1 in 16 requests actually uses the Transport.

At some point we did discuss testing Npgsql with libuv. But the problem is that Npgsql is by nature a client networking model

To do this, you need some 'Socket' type that gets handled by the Libuv thread. The Transport would need to give you a way to get this. For example ITransport (https://github.com/aspnet/KestrelHttpServer/blob/dev/src/Kestrel.Transport.Abstractions/Internal/ITransport.cs) needs a method like:

Task<IDuplexPipe> ConnectTcpAsync(string host, int port);

@roji
Member

roji commented Mar 22, 2018

we keep assuming it's socket-related but that remains to be shown.

If you look at the performance of Json for Libuv vs Sockets here: https://aka.ms/aspnet/benchmarks
You see 463,160 vs 373,789, so +24% for Libuv.
For plaintext the difference is much smaller, because plaintext uses pipelined HTTP: only 1 in 16 requests actually uses the Transport.

I guess I'm confused, I was under the impression that in 2.1 managed sockets got to a point where it was very similar to libuv.

To do this, you need some 'Socket' type that gets handled by the Libuv thread. The Transport would need to give you a way to get this. For example ITransport (https://github.com/aspnet/KestrelHttpServer/blob/dev/src/Kestrel.Transport.Abstractions/Internal/ITransport.cs) needs a method like:

I know this can be done, but does it really make sense for a database driver to open a libuv thread+loop like this? I'm not sure this is something that should be forced upon users (especially if the gains are small). The programming model simply seems more appropriate for a server such as Kestrel.

And again, what's really intriguing me is the Windows/Linux performance gap, with both using the same API (managed sockets). If replacing managed sockets with libuv on Linux brings perf up to something comparable with Windows, that may point towards managed socket implementation issues on the .NET Linux side; it's better for these to be resolved rather than bypassing the socket layer entirely by using libuv...

(of course if the differences are big and there's no other alternative libuv could be an option)

@tmds

tmds commented Mar 22, 2018

I was under the impression that in 2.1 managed sockets got to a point where it was very similar to libuv.

For plaintext and fortunes performance is similar (on Linux).

[image: benchmark chart]

I know this can be done, but does it really make sense for a database driver to open a libuv thread+loop like this? I'm not sure this is something that should be forced upon users (especially if the gains are small). The programming model simply seems more appropriate for a server such as Kestrel.

Indeed, it wouldn't make sense for the database driver to do this. More like: when running on Kestrel, Kestrel could expose an ITcpConnect feature that the database driver could consume.
Even then, it would be a lot of work for Kestrel to support this and also for database drivers to use the 'socket' type (e.g. IDuplexPipe) created by the Transport.
As Kestrel is moving towards Sockets, it would also be a change in the wrong direction (since Sockets doesn't benefit).
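
To illustrate the shape of that idea only: a driver could accept an optional host-provided connect feature and fall back to plain managed sockets when none is offered. Everything here is hypothetical. `ITcpConnect` is just the name floated above, `SocketTcpConnect` and `DriverConnector` are invented for the example, and the sketch returns a `Stream` instead of an `IDuplexPipe` so that it needs no extra packages.

```csharp
using System.IO;
using System.Net.Sockets;
using System.Threading.Tasks;

// Hypothetical feature a host such as Kestrel could expose to client libraries.
public interface ITcpConnect
{
    Task<Stream> ConnectTcpAsync(string host, int port);
}

// Default used when the host offers nothing: ordinary managed sockets.
public sealed class SocketTcpConnect : ITcpConnect
{
    public async Task<Stream> ConnectTcpAsync(string host, int port)
    {
        var socket = new Socket(SocketType.Stream, ProtocolType.Tcp) { NoDelay = true };
        try
        {
            await socket.ConnectAsync(host, port).ConfigureAwait(false);
            return new NetworkStream(socket, ownsSocket: true);
        }
        catch
        {
            socket.Dispose();
            throw;
        }
    }
}

public static class DriverConnector
{
    // hostFeature may be null, in which case the driver uses its own sockets.
    public static Task<Stream> OpenAsync(ITcpConnect hostFeature, string host, int port)
    {
        var connector = hostFeature ?? new SocketTcpConnect();
        return connector.ConnectTcpAsync(host, port);
    }
}
```

The driver code stays the same in both cases; only the origin of the connection changes, which is also why the approach would bring no benefit when the host itself runs on the Sockets transport.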

that may point towards managed socket implementation issues on the .NET Linux side; it's better for these to be resolved rather than bypassing the socket layer entirely by using libuv

The managed implementation has improved vastly compared to 2.0. I think part of the performance difference is because the Socket API adds a layer of abstraction (a cross-platform API for .NET) that is not present for Libuv (which is closer to OS primitives).

@roji
Member

roji commented Mar 22, 2018

that may point towards managed socket implementation issues on the .NET Linux side; it's better for these to be resolved rather than bypassing the socket layer entirely by using libuv

The managed implementation has improved vastly compared to 2.0. I think part of the performance difference is because the Socket API adds a layer of abstraction (a cross-platform API for .NET) that is not present for Libuv (which is closer to OS primitives).

That's definitely possible, although at least in theory both are layers above OS primitives. libuv is probably lower-level than managed sockets (which has to be cross-platform) which could explain the difference.

But the big question for me is again, not sockets vs. libuv (which seem to be very similar in fortunes), but rather Windows vs. Linux where we may have an opportunity for another 15-20% gain. And there really shouldn't be any reasonable cause for this to perform so much better on Windows.

@tmds

tmds commented Mar 22, 2018

not sockets vs. libuv (which seem to be very similar in fortunes), but rather Windows vs. Linux where we may have an opportunity for another 15-20% gain. And there really shouldn't be any reasonable cause for this to perform so much better on Windows.

The cause is that the Socket API was designed for Windows and then made cross-platform. corefx has a SocketAsyncContext for non-Windows platforms, and I think it does things that are all handled by the kernel on Windows. That said, I also hope we see more improvements that close the gap further.
