binary ninja: optimize feature extraction #2402
0953cc3b77ed2974b09e3a00708f88de931d681e2d0cb64afbaf714610beabe6 (100KB or so) takes a huge amount of time to load into Binary Ninja. Maybe there's an infinite loop somewhere.
To run capa against 321338196a46b600ea330fc5d98d0699, it takes 2:48. But :36 is spent just in [profiler screenshot omitted]. We can also see that [profiler screenshot omitted].
edit: maybe we can cache the results of fetching the llil/mlil to save some time. It's still surprising that it takes 3x longer to fetch the llil than to do the complete analysis. Maybe it's Python serialization overhead?
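The caching idea above could look something like the sketch below. This is a hypothetical illustration, not capa's actual code: `fetch_llil` stands in for whatever expensive call crosses the Python/native boundary (in the real Binary Ninja API that would be an access like `func.llil`), and the cache is keyed by function start address.

```python
class ILCache:
    """Memoize an expensive IL fetch, keyed by function start address."""

    def __init__(self, fetch):
        self._fetch = fetch   # the expensive lookup to wrap
        self._cache = {}      # addr -> IL object
        self.misses = 0       # how many times we actually hit the API

    def get(self, addr):
        # only cross the (presumed slow) boundary on the first request
        if addr not in self._cache:
            self.misses += 1
            self._cache[addr] = self._fetch(addr)
        return self._cache[addr]


# usage with a stand-in fetch function (placeholder for the real binja call):
def slow_fetch(addr):
    return f"llil@{hex(addr)}"

cache = ILCache(slow_fetch)
cache.get(0x401000)
cache.get(0x401000)   # second call is served from the cache
print(cache.misses)   # -> 1
```

This only helps if the same function's IL is fetched more than once per run; if each function is visited exactly once, the cost must be attacked on the API side instead.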
I opened the file in the binja GUI and the analysis only took 4.3 seconds: [screenshot omitted]
My machine is probably faster than the CI box used by GitHub, but it's still quite surprising to see such a huge difference.
@xusheng6 on my test rig it took maybe 13s to load the binary, then much longer (minutes) to extract the features. So accessing the LLIL/MLIL is taking integer multiples of the total load time 😕 Maybe the 3s vs 13s gap comes from only having about two cores available in the test environment.
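To separate load time from feature-extraction time as described above, a small timing harness is enough. This is a generic sketch: the phases and lambdas are placeholders, and in a real run they would wrap the actual Binary Ninja load and the per-function IL accesses.

```python
import time

def timed(label, fn):
    """Measure wall-clock time of one phase and report it."""
    start = time.perf_counter()
    result = fn()
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.2f}s")
    return result

# hypothetical phases; real code would time binaryninja's load/analysis
# and then a loop over each function's llil/mlil separately
bv = timed("load + analysis", lambda: "bv-placeholder")
features = timed("IL access", lambda: sum(range(10**6)))
```

Timing each phase independently makes it obvious whether the minutes are spent in analysis itself or in the Python-side IL traversal.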
Thanks for letting me know about it. It seems either I wrote the backend in a bad way, or the Python wrapping adds significant overhead to it.
The profiler didn't expose any invocation counts, so I'm not yet sure whether we're calling the API far too many times or the API itself is slow. Given that it affects both LLIL and MLIL, I suspect the latter. But in the few minutes I looked at the bindings, it didn't seem like all that much was happening on the Python side.
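The missing invocation counts can be recovered with the standard library's `cProfile`/`pstats`, which record `ncalls` per function and so distinguish "called too many times" from "each call is slow". A minimal, self-contained sketch (the workload is a stand-in, not capa's extractor):

```python
import cProfile
import io
import pstats

def api_call(x):
    # stand-in for a Binary Ninja API hit that crosses into native code
    return x * x

def extract_features(n):
    # stand-in for the feature-extraction loop
    return [api_call(i) for i in range(n)]

prof = cProfile.Profile()
prof.enable()
extract_features(1000)
prof.disable()

stats = pstats.Stats(prof, stream=io.StringIO())
# stats.stats maps (file, line, name) -> (cc, ncalls, tottime, cumtime, callers)
ncalls = {key[2]: value[1] for key, value in stats.stats.items()}
print(ncalls["api_call"])  # -> 1000
```

If `ncalls` for the IL accessors is proportional to the number of instructions rather than functions, the fix is fewer boundary crossings; if the counts are modest, the per-call cost itself is the problem.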
During some initial profiling, I'm finding that the Binary Ninja backend is substantially slower than vivisect or IDA. This thread will enumerate all the things we discover. It might include: bugs in Binary Ninja, things we're doing wrong, workarounds, etc.
Given how good Binary Ninja's code analysis is, we'd really like to be able to use it widely. So, let's prepare the code for this.