Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a function to provide the exact name of the relocation, show it in Rizin and Cutter outputs for relocations #4641

Open
4 tasks
Roeegg2 opened this issue Sep 18, 2024 · 14 comments

Comments

@Roeegg2
Copy link
Contributor

Roeegg2 commented Sep 18, 2024

  • rz_bin_reloc_as_string() function
  • use it in the ir command output
  • use it in the rz-bin -R output
  • use it in the Cutter output

Is your feature request related to a problem? Please describe.

Couldn't find a way to change the relocation type naming to the names that readelf outputs (same names which are used in glibc, elf.h, etc) (eg. change ADD_64 relocation to appear as R_X86_64_64)

Describe the solution you'd like

An option on Preferences->Analysis (or whatever place seems fit for this option) to be able to toggle between the default and the classic relocation names.
That way it would be easier to read the relocations since their names are aligned to the common naming convention, and the relocations would be more specific (from what I've seen so far, for example SET_64 gets assigned to both R_X86_64_TPOFF64 and R_X86_64_GLOB_DAT)
Describe alternatives you've considered

Additional context
If this sounds like a good idea I'll be more than happy to add it myself.

Here is an example of what the current naming looks like in Cutter:
image
And this is naming option I would like to add (not in order, just an example to what the naming looks like):

 R_X86_64_TPOFF64 
 R_X86_64_GLOB_DAT 
 R_X86_64_64
.
.
.
@XVilka XVilka added the rizin label Sep 18, 2024
@wargio
Copy link
Member

wargio commented Sep 18, 2024

we could introduce e reloc.real or similar to have this sort of output.

@Roeegg2
Copy link
Contributor Author

Roeegg2 commented Sep 18, 2024

we could introduce e reloc.real or similar to have this sort of output.

What do you mean by e reloc.real?

@karliss
Copy link
Member

karliss commented Sep 18, 2024

I don't think this is a simple matter of different naming style. To me it looks like the problem is that underlying rizin data structures do not properly distinguish between different relocation types.

The current code both in Cutter and rizin more or less does format(reloc->additive ? "ADD_%d" : "SET_%d", reloc->size).

But the real set of different relocation types supported by ELF format is much more complex.

For example here is the full table for x86_64 ones
image

The set of different relocation types is specific to architecture, and also executable format (PE has different set of relocation types).

And if the name/type isn't even properly handled, I doubt that the actual relocation address calculations are done correctly (at least in all the cases). I am guessing that in the best case it currently handles only the most common relocation types.

@wargio
Copy link
Member

wargio commented Sep 18, 2024

what i meant is that we could just have an option to just print R_X86_64_TPOFF64, after all is just a switch

@XVilka
Copy link
Member

XVilka commented Sep 18, 2024

In any case, we just need something like rz_bin_reloc_as_string() function, that we can use both in Cutter and Rust. And we can remove old string representation completely, I think.

Currently relocations are handled in:

  • librz/bin/format/elf/elf_reloc.s
  • librz/bin/format/elf/elf_import.c
  • librz/bin/p/bin_elf.inc

@karliss
Copy link
Member

karliss commented Sep 18, 2024

Currently relocations are handled in:

There is also bin_pe.inc, mach0_relocs.c and possibly more for other executable formats.

@Roeegg2
Copy link
Contributor Author

Roeegg2 commented Sep 18, 2024

I'll start working on it.
And by the way since this change should go in rizin, should I open a new issue there?

@XVilka
Copy link
Member

XVilka commented Sep 19, 2024

Lets just transfer

@XVilka XVilka transferred this issue from rizinorg/cutter Sep 19, 2024
@XVilka XVilka changed the title Add option to change ELF relocation naming to the classic naming Add a function to provide the exact name of the relocation, show it in Rizin and Cutter outputs for relocations Sep 19, 2024
@XVilka XVilka added RzBin API and removed rizin labels Sep 19, 2024
@Roeegg2
Copy link
Contributor Author

Roeegg2 commented Sep 19, 2024

I came up with this solution:

break down the printing to 2 parts: the machine prefix, and the actual relocation type suffix (for example break R_X86_64_RELATIVE into R_X86_64 suffix and and RELATIVE)
add a machine_str_index and reloc_str_index members to the RzBinReloc type.
create 2 arrays of string literals - one for the suffixes and one for the prefixes.
create 2 enums for prefix indices and suffix indices.

on reloc_convert() we could assign these 2 indices, and then when printing index into the suffix and prefix arrays to get the final correct string to print.

We could also alternatively use a hash map. I could implement both methods and check which one is faster

@wargio
Copy link
Member

wargio commented Sep 19, 2024

i think having the name as by documentation for each format is better, since a user can search its meaning.

@karliss
Copy link
Member

karliss commented Sep 19, 2024

No need to complicate things more than necessary with two indexes. The code needs to deal with more than just the ELF files, not every binary format will have the same naming scheme. Space savings from doing a split will be negligible, there will be few dozen relocation types per target platform at most. Even worse having two indexes means that you have replaced the fixed cost of few dozen relocation types with the runtime RAM usage for each relocation and there could be thousands of relocations per executable. Having a split strings also makes searching code more difficult.

And the last thing I am somewhat skeptical about str_index in general (with or without split). Having str_index means that there needs to be single central relocation name string table. But the set of relaction types are specific to architecture/ binary format combination. And rizin can have dynamic plugins for adding support of new binary formats, which is incompatible with the idea of single central string table.

You can probably just have a simple non owned const char* pointing to a string literal.

@Roeegg2
Copy link
Contributor Author

Roeegg2 commented Sep 19, 2024

Yeah you're right.

Anyway I'll open a PR soon with the changes

@Roeegg2
Copy link
Contributor Author

Roeegg2 commented Sep 19, 2024

I fixed the ELF relocations, seems to be running well.

I wanted to start working on the other formats, but apparently they aren't even checking for relocation types, just setting the additive field... Shouldn't be a problem to implement though.

Should I PR each format, or PR everything in one chunk?

@wargio
Copy link
Member

wargio commented Sep 20, 2024

you can do it in one pr and on commit for format

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants