Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proto: Add comment on location address adjustment #882

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

brancz
Copy link
Contributor

@brancz brancz commented Jul 24, 2024

This comment reflects how the pprof toolchain already treats this case, so this change is primarily to signify that it is the expected behavior from consumers.

@aalexand

This comment reflects how the pprof toolchain already treats this case,
so this change is primarily to signify that it is the expected behavior
from consumers.
@@ -186,6 +186,10 @@ message Location {
// for the corresponding mapping. A non-leaf address may be in the
// middle of a call instruction. It is up to display tools to find
// the beginning of the instruction if necessary.
// If Mapping.memory_start is 0, Mapping.memory_limit is 0 or the max uint64,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for sending the change!

I'm not sure I agree with the proposed text, some comments below.

What I expect tools to do is to use the Mapping.* values together with the Offset and VirtAddr for the executable segment in ELF program headers to translate the addresses to the objdump-compatible addresses.

In reality, if Offset == VirtAddr for the segment in the ELF headers, then setting Mapping.memory_start == Mapping.file_offset would mean effectively no translation, for any values (can be zero as in the proposed comment). I think we should document these mechanics.

I don't think we should give any special meaning to Mapping.memory_limit of zero or max uint64. I would expect it to be set to Mapping.memory_start + size of the VMA always.

I hope this makes sense, happy to discuss this more. I'm glad we are trying to improve the docs around these nuances.

$ readelf --segments --wide /bin/ls | grep -P "LOAD|Type"
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  LOAD           0x000000 0x0000000000000000 0x0000000000000000 0x0037b8 0x0037b8 R   0x1000
  LOAD           0x004000 0x0000000000004000 0x0000000000004000 0x015549 0x015549 R E 0x1000
  LOAD           0x01a000 0x000000000001a000 0x000000000001a000 0x009090 0x009090 R   0x1000
  LOAD           0x0232b0 0x00000000000242b0 0x00000000000242b0 0x001330 0x002610 RW  0x1000

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In our case this is because debuginfo provided by distributions may have a different offset or virtaddr of the executable that the data was obtained from, this is because the debuginfo is stripped, meaning only the actual production executable has the correct values for these.

As the code mentions some tools may want to specifically signify that the location has already been adjusted. Do you have a suggestion on how we can document this behavior accurately?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Let me think about it and propose something concrete.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, let me know if I can be helpful! Also happy to discuss on CNCF Slack if you want.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants