-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove func blocks unifier indirections #774
base: master
Are you sure you want to change the base?
Remove func blocks unifier indirections #774
Conversation
Benchmarks summaryPerformance benchmarks
You can view all the metrics here. Synthesis benchmarks (basic)
Synthesis benchmarks (full)
|
Benchmarks summaryPerformance benchmarks
You can view all the metrics here. Synthesis benchmarks (basic)
Synthesis benchmarks (full)
|
@@ -116,30 +113,24 @@ async def producer(sim: TestbenchContext): | |||
|
|||
async def consumer(self, sim: TestbenchContext): | |||
# TODO: this test doesn't do anything, fix it! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this comment up-to-date?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe so. The condition in while
looks to be false in the beginning. Maybe a negation was intended there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After adding the negation, the test fails. Created issue #775 for this.
self.insert.proxy(m, self.rs.insert) | ||
self.select.proxy(m, self.rs.select) | ||
self.update.proxy(m, self.rs.update) | ||
self.get_result.proxy(m, collector.method) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self.get_result
should be removed (+ in docstring too)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Benchmarks summaryPerformance benchmarks
You can view all the metrics here. Synthesis benchmarks (basic)
Synthesis benchmarks (full)
|
Benchmarks summaryPerformance benchmarks
You can view all the metrics here. Synthesis benchmarks (basic)
Synthesis benchmarks (full)
|
afc3583
to
df115b7
Compare
Benchmarks summaryPerformance benchmarks
You can view all the metrics here. Synthesis benchmarks (basic)
Synthesis benchmarks (full)
|
The Fmax drop is a little bit worrying and it looks like the FuncBlockUnifier is on critical path now. I checked the synthesis results (https://github.com/kuznia-rdzeni/coreblocks/actions/runs/12359494095/job/34492356220) and it looks like:
|
@@ -18,19 +17,10 @@ def __init__( | |||
): | |||
self.rs_blocks = [(block.get_module(gen_params), block.get_optypes()) for block in blocks] | |||
|
|||
self.result_collector = Collector([block.get_result for block, _ in self.rs_blocks]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I correctly see, removing that Collector cause that all FUs get_results methods are joined with the announcement methods (so with RS and RF), which make scheduling more complex and critical path longer. In Collector there is hidden a Forwarder which cut the critical path on data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LNGTM
The changes are affecting Fmax and make transactron network more complicated. Additionaly they removed buffers in announcement.
@@ -52,6 +52,16 @@ class FetchResumeKey(UnifierKey, unifier=Collector): | |||
pass | |||
|
|||
|
|||
@dataclass(frozen=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should start adding the doc strings to our keys? In practice they are a global variables and we haven't documented them...
@@ -87,12 +85,9 @@ def elaborate(self, platform): | |||
m.submodules[f"func_unit_{n}"] = func_unit | |||
m.submodules[f"wakeup_select_{n}"] = wakeup_select | |||
|
|||
m.submodules.collector = collector = Collector([func_unit.accept for func_unit, _ in self.func_units]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also complicates the transactron network. Probably its is connected with observed IPC drop. Previously, when two results were ready in the same cycle, one have been announced and second stored in Forwarder for a cycle, what made the FU ready to process the next instruction. Now the FU have to stall till it can push out its result.
This PR simplifies the announcement mechanism using the dependency system. At the same time, two layers of
Collector
s for accepting results were flattened to a single collector. Thefunc_blocks_unifier
module became trivial, and it might make sense to remove it later.Benchmark results dropped slightly for some reason,
but device utilization also seems to be reduced.