#14895: enable gp-rel in kernels #15043

nathan-TT

Ticket

#14895

Problem description

The compiler does not know that extern global variables are placed near each other, and thus generates each address separately. This can be alleviated by enabling GP-relative addressing in kernels (it is already enabled in firmware).

What's changed

  1. Export __global_pointer$ from the firmware's linker script. Do not recompute it in the kernel.
  2. Refer to __global_pointer$ from the kernel's startup -- this is sufficient to tell the linker to do the relaxation we want.
  3. Do not apply this relaxation to erisc kernels, where it did not work; those kernels no longer use tmu-crt0k.

Of the 4 kernels I looked at, this reduced the text size by between 4% and 10%. In particular, the assembly from the issue becomes:

    3b98:       ffb207b7                lui     a5, 0xffb20
    3b9c:       2087a703                lw      a4, 520(a5) # ffb20208 <__global_pointer$+0x1fa18>
    3ba0:       98e1aa23                sw      a4, -1644(gp) # ffb00184 <noc_reads_num_issued>
    3ba4:       2287a603                lw      a2, 552(a5)
    3ba8:       98c1a623                sw      a2, -1652(gp) # ffb0017c <noc_nonposted_writes_num_issued>
    3bac:       2047a603                lw      a2, 516(a5)
    3bb0:       98c1a223                sw      a2, -1660(gp) # ffb00174 <noc_nonposted_writes_acked>
    3bb4:       2007a603                lw      a2, 512(a5)
    3bb8:       9ac1a423                sw      a2, -1624(gp) # ffb00198 <noc_nonposted_atomics_acked>
    3bbc:       22c7a703                lw      a4, 556(a5)
    3bc0:       96e1ae23                sw      a4, -1668(gp) # ffb0016c <noc_posted_writes_num_issued>
    3bc4:       ffb487b7                lui     a5, 0xffb48
    3bc8:       0107a583                lw      a1, 16(a5) # ffb48010 <__global_pointer$+0x47820>
    3bcc:       ffb007b7                lui     a5, 0xffb00
    3bd0:       1a478713                addi    a4, a5, 420 # ffb001a4 <__global_pointer$+0xfffff9b4>
    3bd4:       00c75503                lhu     a0, 12(a4)
    3bd8:       99418693                addi    a3, gp, -1644 # ffb00184 <noc_reads_num_issued>
    3bdc:       ffb48837                lui     a6, 0xffb48
    3be0:       0ff00613                li      a2, 255
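As a rough illustration of which accesses above can be relaxed, the sketch below checks whether an address is reachable from gp with a single 12-bit signed immediate. The gp value is not stated in this PR; it is inferred from the listing (`sw a4, -1644(gp)` targets `0xffb00184`, so gp would be `0xffb007f0`). The helper names `gp_offset`, `fits_gp_relative`, and `check_against_listing` are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: gp inferred from the listing, 0xffb00184 + 1644. */
#define GP_VALUE 0xffb007f0u

/* Signed distance from gp to a 32-bit address. */
static int64_t gp_offset(uint32_t addr) {
    return (int64_t)addr - (int64_t)GP_VALUE;
}

/* RISC-V load/store and addi immediates are 12-bit signed: [-2048, 2047]. */
static bool fits_gp_relative(uint32_t addr) {
    int64_t off = gp_offset(addr);
    return off >= -2048 && off <= 2047;
}

/* Cross-check the helpers against offsets shown in the disassembly. */
static void check_against_listing(void) {
    assert(gp_offset(0xffb00184u) == -1644); /* noc_reads_num_issued */
    assert(gp_offset(0xffb0017cu) == -1652); /* noc_nonposted_writes_num_issued */
    assert(gp_offset(0xffb00174u) == -1660); /* noc_nonposted_writes_acked */
    assert(gp_offset(0xffb00198u) == -1624); /* noc_nonposted_atomics_acked */
    assert(gp_offset(0xffb0016cu) == -1668); /* noc_posted_writes_num_issued */
    assert(!fits_gp_relative(0xffb20208u));  /* gp+0x1fa18: still needs lui */
    assert(!fits_gp_relative(0xffb48010u));  /* gp+0x47820: still needs lui */
}
```

Under that assumption, the five `noc_*` counters sit within 2 KiB of gp and each store needs only one instruction, while the loads from `0xffb20208` and `0xffb48010` are too far away and keep their `lui`.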

Checklist

  • [YES] Post commit CI passes
  • Blackhole Post commit (if applicable)
  • Model regression CI testing passes (if applicable)
  • Device performance regression CI testing passes (if applicable)
  • New/Existing tests provide coverage for changes
