Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

performance regression on Windows #79

Open
ian-p-cooke opened this issue Nov 29, 2019 · 3 comments
Open

performance regression on Windows #79

ian-p-cooke opened this issue Nov 29, 2019 · 3 comments

Comments

@ian-p-cooke
Copy link

ian-p-cooke commented Nov 29, 2019

after updating my dependencies I had a performance regression from 0.5s to 1.8s. I tracked it down to repeated calls to allocation_granularity() every time that consume() is called in my buf_redux created ring buffer (they updated their slice_deque dependency from 0.1 to 0.2) . allocation_granularity() calls GetSystemInfo() on Windows which is very slow (1.3 seconds of my 1.8 second run-time).

can you tell if all of the calls to allocation_granularity are necessary or if buf_redux is misusing the API? Stack trace looks like this:

   5: slice_deque::mirrored::winapi::allocation_granularity
             at C:\Users\icooke\tmp\slice_deque\src\mirrored\winapi.rs:49
   6: slice_deque::mirrored::buffer::Buffer<u8>::size_in_bytes<u8>
             at C:\Users\icooke\tmp\slice_deque\src\mirrored\buffer.rs:112
   7: slice_deque::SliceDeque<u8>::move_head_unchecked<u8>
             at C:\Users\icooke\tmp\slice_deque\src\lib.rs:683
   8: slice_deque::SliceDeque<u8>::move_head<u8>
             at C:\Users\icooke\tmp\slice_deque\src\lib.rs:735
   9: buf_redux::buffer::slice_deque_buf::SliceDequeBuf::consume
             at C:\Users\icooke\tmp\buf_redux\src\buffer\slice_deque_buf.rs:81
  10: buf_redux::buffer::BufImpl::consume
             at C:\Users\icooke\tmp\buf_redux\src\buffer\mod.rs:42
  11: buf_redux::Buffer::consume
             at C:\Users\icooke\tmp\buf_redux\src\lib.rs:1175
  12: buf_redux::{{impl}}::consume<std::fs::File,buf_redux::policy::MinBuffered>
             at C:\Users\icooke\tmp\buf_redux\src\lib.rs:413
  13: std::io::read_until<buf_redux::BufReader<std::fs::File, buf_redux::policy::MinBuffered>>
             at /rustc/4560ea788cb760f0a34127156c78e2552949f734\src\libstd\io\mod.rs:1625
  14: std::io::BufRead::read_line::{{closure}}<buf_redux::BufReader<std::fs::File, buf_redux::policy::MinBuffered>>
             at /rustc/4560ea788cb760f0a34127156c78e2552949f734\src\libstd\io\mod.rs:1866

buf_redux code is here: https://github.com/abonander/buf_redux/blob/4c59a0f6c423f80bcd099f543c547d449d92b1d3/src/buffer/slice_deque_buf.rs#L80

so you can see every line I read involves a GetSystemInfo() call which is killing me.

If the calls to allocation_granularity() are necessary I recommend caching the value somehow. I made a local copy of slice_deque and used lazy_static to calculate it once and my performance did return to 0.5s. If you want to add the dependency on lazy_static I can put together a pull request for that.

@gnzlbg
Copy link
Owner

gnzlbg commented Nov 29, 2019

Dear @ian-p-cooke, thank you for the report!

I've pushed a PR here: #80 with a potential fix. Sadly, I don't have a windows machine to test if it solves your issue.

Would you be so kind of patching your slice_deque dependency to use the branch in my PR, and try it out ?

Thank you!

@ian-p-cooke
Copy link
Author

@gnzlbg that works well for me! performance is better than my branch with lazy-static and better than slice-deque 0.1 by a little bit too. All tests pass for me.

@Boscop
Copy link

Boscop commented Mar 11, 2020

Any update on this? I'm also on Windows..

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants