Adds a C++ parser for our EVENT logs #171

Open
wants to merge 17 commits into
base: master

Quincunx271
Member

Benefits:

  • Parses a 4 GB log file in ~4 GB of memory and 10-12 seconds, compared to an unknown amount of memory (>10 GB) and more than a minute.

Drawbacks:

  • A lot of code that would have to be maintained...
  • Uses C++20 and a whole bunch of libraries

WIP - need to:
 - Benchmark name parsing
 - File name parsing
 - RawLog parsing (lazy; we don't want to hold onto the memory unless we need it)
 - Integration into the Python code
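One way the lazy RawLog item could look, as a rough sketch only: remember where a block's raw text sits in the file and read it on first access. `LazyRawLog` and all of its members are hypothetical names for illustration, not code from this PR.

```cpp
#include <cstddef>
#include <fstream>
#include <ios>
#include <optional>
#include <string>
#include <utility>

// Hypothetical: records where a block's raw text lives instead of storing it.
class LazyRawLog {
public:
    LazyRawLog(std::string path, std::streamoff offset, std::size_t length)
        : path_(std::move(path)), offset_(offset), length_(length) {}

    // Reads the bytes from disk on first access and caches them afterwards.
    const std::string& text() {
        if (!cache_) {
            std::ifstream in(path_, std::ios::binary);
            std::string buf(length_, '\0');
            in.seekg(offset_);
            in.read(buf.data(), static_cast<std::streamsize>(length_));
            // Keep only what was actually read (short or failed reads shrink it).
            buf.resize(static_cast<std::size_t>(in.gcount()));
            cache_ = std::move(buf);
        }
        return *cache_;
    }

    bool loaded() const { return cache_.has_value(); }

    // Drops the cached text so its memory can be reclaimed.
    void release() { cache_.reset(); }

private:
    std::string path_;
    std::streamoff offset_;
    std::size_t length_;
    std::optional<std::string> cache_;
};
```

The trade-off is a file seek per first access in exchange for not keeping the raw text resident alongside the parsed data.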

Previously, accessing a single element required creating the entire
array. Adding these reference types solves that issue.
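A minimal sketch of the reference-type idea, assuming a column-oriented layout: indexing returns a small handle that reads from the underlying storage on demand, so nothing is copied or materialized. `EventTable`, `Ref`, and the `names`/`times` columns are hypothetical and are not the types this PR adds.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical column-oriented storage for parsed events.
struct EventTable {
    std::vector<std::string> names;
    std::vector<std::int64_t> times;

    // Lightweight reference to one row; it holds a pointer and an index only.
    struct Ref {
        const EventTable* table;
        std::size_t index;

        std::string_view name() const { return table->names[index]; }
        std::int64_t time() const { return table->times[index]; }
    };

    std::size_t size() const { return names.size(); }

    // Returns a handle to row i without building any intermediate array.
    Ref operator[](std::size_t i) const { return Ref{this, i}; }
};
```

A `Ref` is only valid while the table it points into is alive, which is the usual cost of this kind of view type.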

absl::flat_hash_{set,map} are called "swiss tables", and they provide a
small perf win for us.
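For illustration, the swap can be kept to a single alias, since `absl::flat_hash_map` covers most of the `std::unordered_map` interface. This is a sketch, not the PR's code: the `USE_ABSL` macro, the `HashMap` alias, and `count_names` are hypothetical, and the Abseil branch assumes Abseil is installed and linked.

```cpp
#include <cstddef>
#include <string>
#include <vector>

#ifdef USE_ABSL  // hypothetical build flag; requires Abseil to be available
#include "absl/container/flat_hash_map.h"
template <class K, class V>
using HashMap = absl::flat_hash_map<K, V>;
#else
#include <unordered_map>
template <class K, class V>
using HashMap = std::unordered_map<K, V>;
#endif

// Counts how often each name occurs; works unchanged with either map type.
inline HashMap<std::string, std::size_t>
count_names(const std::vector<std::string>& names) {
    HashMap<std::string, std::size_t> counts;
    counts.reserve(names.size());
    for (const auto& name : names) {
        ++counts[name];
    }
    return counts;
}
```

One caveat when swapping: unlike `std::unordered_map`, `absl::flat_hash_map` does not keep pointers or references to elements stable across rehashes, so code that stores such pointers needs checking.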
@Quincunx271 added the utils label (Concerning utilities rather than the main project) on Aug 24, 2021
@Quincunx271 force-pushed the feature/c++logparsing branch from a8642dc to a61e637 on August 24, 2021 23:13
This way, by placing all of these modules in a directory on the
LD_LIBRARY_PATH, g++-11, libtbb, etc. do not need to be installed to use
the Python module, as the needed libraries are bundled here.