Improving fuzzing speed ideas #52
Comments
Hey, thanks for taking the time to look into all of these mechanisms and sharing your ideas. The "put int3s on whole code section" idea did come up several times before (also recently due to problems with ARM64 macOS, specifically libraries in the dyld cache no longer starting/ending on a page boundary). However, as you noted, without a deterministic and performant way to distinguish code from data, it results in a much less clean model than the current approach of using memory protection flags. I am not a fan of using external tooling to get code/data information because (a) I think it would make TinyInst more complicated to use and (b) external tools can (and occasionally do) also fail to distinguish code from data correctly. So this is still an open problem. However, I'd like to point out that using breakpoints is not the only way of solving slowdowns due to entries. This comes from the way in which entries occur, basically
The first case could be solved by going through the imports of every loaded library and replacing the instrumented library's imports with their translated equivalents (via |
Sup! Thanks for the answer,
Not at all, I'm currently implementing my own instrumentor alongside, inspired by your ideas :)
Yep, this one is trivial, but there is the next one.
This one doesn't look clear to me. Let's say we have something like I know it's only speculation right now; in reality it could be easier, considering the assumptions we could make about the binaries emitted by standard compilers. But it sounds a bit vague to me.
I think this is quite rare and mostly instead of manually calling
We wouldn't blindly patch everything that looks like a pointer. Let's say on one run you observe an entry at 0x7fff12345678. You would then search the instrumented library's data specifically for the value 0x7fff12345678 and replace only that one. It is still possible to make a mistake, though, and if this gets implemented in TinyInst it should be guarded by a flag and never enabled by default, but the chance of error would be much smaller than if we replaced everything that looks like a data->code pointer.
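A minimal sketch of that targeted search, for illustration only. It is not TinyInst code: the function name `patch_observed_entry`, the translated address, and the assumption that pointers are little-endian and naturally aligned within the buffer are all hypothetical.

```python
import struct

def patch_observed_entry(data: bytearray, observed: int, translated: int,
                         ptr_size: int = 8) -> list[int]:
    """Replace aligned occurrences of one observed entry address.

    Only the exact value `observed` is searched for; nothing else that
    merely looks like a pointer is touched. Returns the patched offsets.
    """
    fmt = "<Q" if ptr_size == 8 else "<I"  # assumption: little-endian pointers
    needle = struct.pack(fmt, observed)
    replacement = struct.pack(fmt, translated)
    patched = []
    pos = data.find(needle)
    while pos != -1:
        # Assumption: the buffer starts at an aligned address and real
        # pointers are naturally aligned, so unaligned hits are skipped.
        if pos % ptr_size == 0:
            data[pos:pos + ptr_size] = replacement
            patched.append(pos)
        pos = data.find(needle, pos + 1)
    return patched

# Example: a fake data section holding three 64-bit values.
entry = 0x7fff12345678
translated = 0x7ffe00001000  # hypothetical address of the translated code
section = bytearray(struct.pack("<QQQ", 0x1111, entry, entry + 1))
offsets = patch_observed_entry(section, entry, translated)
```

Only the slot holding exactly 0x7fff12345678 is rewritten; the neighbouring value that differs by one is left alone, which is the point of searching for a previously observed entry rather than for anything pointer-like.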
FYI there is now an implementation of the "search and replace pointers to the previously observed entrypoints with their instrumented equivalents" idea using |
Interesting, let me look :)
Hello @ifratric! I really enjoyed the elegant instrumentation idea behind TinyInst.
However, I was thinking about reducing the slowdown caused by "entries" into the instrumented module, and the first idea that came to my mind was the following.
Why not put int3s on the whole code section, instrument the code as usual, and after that replace particular int3s with jumps?
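A rough sketch of the idea, assuming x86-64 and modelling the code section as a plain byte buffer. The helper names (`encode_jmp_rel32`, `patch_entry`) and the addresses are hypothetical and only illustrate the byte-level mechanics, not how TinyInst works.

```python
import struct

INT3 = 0xCC        # x86 breakpoint opcode
JMP_REL32 = 0xE9   # x86 near jump with a 32-bit relative displacement
JMP_LEN = 5        # one opcode byte plus four displacement bytes

def encode_jmp_rel32(src: int, dst: int) -> bytes:
    """Encode `jmp dst` placed at address `src`."""
    rel = dst - (src + JMP_LEN)  # displacement is relative to the next instruction
    if not -2**31 <= rel < 2**31:
        raise ValueError("target is out of rel32 range")
    return bytes([JMP_REL32]) + struct.pack("<i", rel)

def patch_entry(code: bytearray, base: int, entry: int, translated: int) -> None:
    """Overwrite the int3 at an observed entry with a jump to translated code."""
    off = entry - base
    if off < 0 or off + JMP_LEN > len(code):
        raise ValueError("not enough room for jmp rel32")
    code[off:off + JMP_LEN] = encode_jmp_rel32(entry, translated)

# Example: a 32-byte section filled with int3, loaded at a made-up base.
section = bytearray([INT3]) * 32
patch_entry(section, base=0x1000, entry=0x1008, translated=0x2000)
```

The sketch deliberately ignores the hard parts raised in this thread: the five patched bytes may cover another entry point, and bytes in the section that are actually data cannot simply be replaced with int3.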
Several issues with this approach immediately arose:
jmp <rel32>. This could be tackled in several ways, and seems realistically solvable.
This is the major problem for me, and it is actually my question. Several solutions came to mind:
2.1) It could be solved by taking information about basic blocks from large disassemblers like IDA or Ghidra (this is what Mesos does) and placing int3s only at the start of each basic block. This solution works (at least in my tests on regular Microsoft DLLs), but requires an additional dependency.
2.2) Instrument each indirect mov instruction, check whether the data is taken from the code section, and redirect it to the proper data (similar to the current instrumentation of indirect branches). This would be slow and a bit complex to implement.
Am I overlooking anything? Maybe there is some fast code flow analysis tactic to distinguish data from code?
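For the variant that relies on basic-block information from an external disassembler, a minimal sketch could look like this. It assumes the block start offsets are already available as a list; `place_breakpoints` and `remove_breakpoint` are hypothetical helpers, not part of any tool mentioned here.

```python
INT3 = 0xCC  # x86 breakpoint opcode

def place_breakpoints(code: bytearray, block_starts: list[int]) -> dict[int, int]:
    """Write int3 at each basic-block start and return the saved original bytes."""
    saved: dict[int, int] = {}
    for off in block_starts:
        if not 0 <= off < len(code):
            raise ValueError(f"block start {off:#x} is outside the section")
        if off not in saved:  # avoid saving 0xCC if an offset is listed twice
            saved[off] = code[off]
            code[off] = INT3
    return saved

def remove_breakpoint(code: bytearray, saved: dict[int, int], off: int) -> None:
    """Restore the original byte once the block at `off` has been handled."""
    code[off] = saved.pop(off)

# Example: two tiny blocks, each starting with `push rbp` (0x55).
section = bytearray(b"\x55\x48\x89\xe5\xc3\x90\x90\x55")
saved = place_breakpoints(section, [0, 7])
remove_breakpoint(section, saved, 0)
```

Because only known block starts are touched, data embedded in the code section stays intact, at the cost of trusting the disassembler's output.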