Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Age-ordered RS #783

Open
wants to merge 11 commits into
base: master
Choose a base branch
from
Open

Age-ordered RS #783

wants to merge 11 commits into from

Conversation

tilk
Copy link
Member

@tilk tilk commented Jan 16, 2025

This PR changes RS so that oldest entries are taken first. The idea is that the indexes in get_ready_list and take are no longer physical RS row numbers, but they are ordered from oldest to newest. The priority encoder in WakeupSelect then naturally selects the oldest ready entry.

The main RS test was extended to introduce random delays and testing multiple ready lists. The individual method tests are dropped, for three reasons:

  1. These tests don't just verify conformance to some protocol, they need to look into RS internals to work. Because of that, changes to internals invalidate these tests, creating more work for the developer.
  2. There is quite a lot of lines of code in these tests, which increases the amount of maintenance work even more.
  3. The main test seems good enough, and even if some error is not found there, the tests for the whole core will most probably fail then.

@tilk tilk added enhancement New feature or request optimization This is *just* an optimization! benchmark Benchmarks should be run for this change labels Jan 16, 2025
Copy link

Benchmarks summary

Performance benchmarks

aha-mont64 crc32 minver nettle-sha256 nsichneu slre statemate ud
▲ 0.421 (+0.004) ▲ 0.539 (+0.014) ▲ 0.373 (+0.003) ▲ 0.632 (+0.001) ▲ 0.359 (+0.000) ▲ 0.293 (+0.002) ▼ 0.327 (-0.000) ▲ 0.442 (+0.002)

You can view all the metrics here.

Synthesis benchmarks (basic)

Device utilisation: (ECP5) LUTs used as DFF: (ECP5) LUTs used as carry: (ECP5) LUTs used as ram: (ECP5) Max clock frequency (Fmax)
▼ 13274 (-209) ▲ 4407 (+9) 1456 (0) 1164 (0) ▲ 50 (+1)

Synthesis benchmarks (full)

Device utilisation: (ECP5) LUTs used as DFF: (ECP5) LUTs used as carry: (ECP5) LUTs used as ram: (ECP5) Max clock frequency (Fmax)
▼ 21961 (-3858) ▲ 7022 (+9) 1818 (0) 1216 (0) ▲ 41 (+1)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
benchmark Benchmarks should be run for this change enhancement New feature or request optimization This is *just* an optimization!
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant