Asynchronous interrupts #532

piotro888 · 2023-12-03T11:19:33Z

Implementation of asynchronous interrupts, based on an idea that we came up with together with @lekcyjna123 and @Arusekk.

Async interrupts are handled just like exceptions (via the internal ExceptionCauseRegister) and are inserted only at branch instructions, whose FU already knows their PC (plus, conditionally, immediately after/at MRET and CSR instructions, because the spec requires it).
This way we don't need to store the PC in the ROB or pass it to all FUs. Additionally, the same logic could be used for handling mispredictions (very easy to do)!
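To illustrate the idea above, here is a minimal behavioural sketch in plain Python (not Amaranth/Transactron, and not the code from this PR). The names `ExceptionCauseModel`, `ExceptionReport`, `execute_branch` and the `"async_interrupt"` cause label are hypothetical. It also assumes that ROB ids grow with program order and that the interrupted branch is the instruction whose PC gets reported.

```python
from dataclasses import dataclass
from typing import Optional

INTERRUPT_CAUSE = "async_interrupt"  # hypothetical cause label, for illustration only


@dataclass
class ExceptionReport:
    cause: str
    pc: int
    rob_id: int


class ExceptionCauseModel:
    """Toy stand-in for an exception cause register: keeps the oldest report."""

    def __init__(self) -> None:
        self.report: Optional[ExceptionReport] = None

    def submit(self, report: ExceptionReport) -> None:
        # Assumption: a smaller rob_id means an older instruction (no wrap-around).
        if self.report is None or report.rob_id < self.report.rob_id:
            self.report = report


def execute_branch(pc: int, rob_id: int, interrupt_pending: bool,
                   cause_reg: ExceptionCauseModel) -> bool:
    """The branch unit already knows `pc`, so it can attach a pending
    interrupt to this instruction and report it like any other exception."""
    if interrupt_pending:
        cause_reg.submit(ExceptionReport(INTERRUPT_CAUSE, pc, rob_id))
        return True
    return False


reg = ExceptionCauseModel()
execute_branch(pc=0x104, rob_id=3, interrupt_pending=False, cause_reg=reg)
execute_branch(pc=0x120, rob_id=7, interrupt_pending=True, cause_reg=reg)
```

In this model only the branch unit needs access to the PC; every other unit is untouched, which is the point made above about not storing the PC in the ROB.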

InterruptController is a temporary solution and will be replaced with a proper CSR-based controller with the same interface in another PR (in progress).

Marked as draft because I want to add some simple tests (and probably port the ones from feature/interrupts).

Of course, this depends on #523 (which is also needed to see a nice diff :) )

lekcyjna123

lgtm

Co-authored-by: Kristopher38 <[email protected]>

piotro888 · 2023-12-13T20:01:34Z

The PR is ready: I integrated @Kristopher38's tests from feature/interrupts and extended them to cover pending interrupts during ISR execution.

Two bugs were found and fixed:

- A missing conflict between exception_stall and fetch_verify in the same cycle (this could probably also happen on exceptions): fetch was stalling due to an invalid state.
- The core was not resumed after committing an instruction with an async interrupt when it was the only instruction in the core (introduced by, and only related to, this PR).

I added Zicsr to basic_core_config, because it already supported ExceptionUnit, and handling exceptions wouldn't make much sense without CSR support (mepc etc.).

Kristopher38 · 2023-12-14T14:10:46Z

Does this mean we're really close to getting speculative execution?

Kristopher38

An (un)healthy dose of bikeshedding: I finished reviewing this and it turns out almost all of my comments are about comments, but that's probably a good sign if anything.

coreblocks/core.py

coreblocks/stages/retirement.py

test/stages/test_retirement.py

Kristopher38 · 2023-12-15T13:35:02Z

test/test_core.py

+        # 10-15 is the smallest feasible cycle count between interrupts to provide forward progress
+        ("interrupt.asm", 300, {4: 21, 8: 9349}, {2: 21, 7: 9349, 31: 0xDE}, 10, 15),


Have you tried lowering the hi and lo params? This comment was true in my implementation but might no longer hold.

It turns out we always make forward progress! Updated the values.

Aren't we required by the spec to handle the next interrupt immediately after the previous one returns?

Yes, but hi/lo refers only to execution cycles while not in the ISR. Interrupts that happen during the ISR and are taken immediately at MRET are triggered by an independent random() < 0.4 condition in the test.
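The two independent random choices described in this reply can be sketched roughly as follows. This is not the code from test/test_core.py: `plan_interrupts`, its parameters and the seeded generator are hypothetical, and only the lo/hi bounds and the 0.4 probability come from the discussion.

```python
import random


def plan_interrupts(lo: int, hi: int, count: int,
                    pending_probability: float = 0.4, seed: int = 0):
    """Return a list of (gap, pending_in_isr) pairs.

    gap            - cycles executed outside the ISR before the next interrupt,
                     drawn uniformly from [lo, hi] (both ends inclusive)
    pending_in_isr - whether another interrupt is raised while the ISR runs,
                     so that it has to be taken right at MRET
    """
    rng = random.Random(seed)  # seeded only to make the sketch reproducible
    plan = []
    for _ in range(count):
        gap = rng.randint(lo, hi)
        pending_in_isr = rng.random() < pending_probability
        plan.append((gap, pending_in_isr))
    return plan


plan = plan_interrupts(lo=10, hi=15, count=100)
```

Because the second draw does not depend on lo/hi, lowering those bounds changes only how often the main program is interrupted, not how often an interrupt is already pending at MRET.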

test/test_core.py

coreblocks/structs_common/csr.py

coreblocks/fu/priv.py

piotro888 · 2023-12-16T15:40:17Z

> Does this mean we're really close to getting speculative execution?

Yes! And I plan to integrate it (at least partially) by reporting branch mispredictions as exceptions in the next PR or the one after it.
Things that will still be TODO:

- Support for a branch predictor (for now I think I will hard-code a pc+4 prediction in the JB unit)
- Experimenting with the performance of different branch predictors

Things TODO for the future (that we forgot about):

- Our exception handling/core flushing method is the slowest possible one; maybe optimize it or write a new one (or make a faster one for jumps, see the next point)
- Further optimization or alternative: make checkpoints on speculative jumps, so that on a misprediction we don't need to wait until the jump is retired to clear the whole ROB; instead we can discard the mis-speculated part of it and start fetching and executing the proper path immediately. This can be done with a separate structure, and only for jumps; real exceptions/interrupts could use the slower method.
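The checkpoint idea from the last point could look roughly like this as a plain-Python model. It is only a sketch of the concept, not a proposed implementation: `CheckpointedRob` and its methods are hypothetical names, and the ROB is modelled as a simple list in program order.

```python
class CheckpointedRob:
    """Toy reorder buffer that remembers where each speculative jump sits."""

    def __init__(self) -> None:
        self.entries: list[str] = []          # program order, oldest first
        self.checkpoints: dict[str, int] = {}  # jump -> index of first younger entry

    def push(self, name: str) -> None:
        self.entries.append(name)

    def push_speculative_jump(self, name: str) -> None:
        self.entries.append(name)
        self.checkpoints[name] = len(self.entries)

    def mispredict(self, jump: str) -> list[str]:
        """Squash everything younger than `jump` without waiting for it to retire."""
        cut = self.checkpoints.pop(jump)
        squashed = self.entries[cut:]
        del self.entries[cut:]
        # Checkpoints of squashed (younger) jumps are no longer valid.
        self.checkpoints = {j: i for j, i in self.checkpoints.items() if i < cut}
        return squashed


rob = CheckpointedRob()
rob.push("add")
rob.push_speculative_jump("beq")
rob.push("sub")
rob.push_speculative_jump("jal")
rob.push("xor")
squashed = rob.mispredict("beq")
```

Older entries (here `add` and the jump itself) stay in place and retire normally, so fetching from the correct path could start right away, while real exceptions and interrupts could still go through the slower full flush.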

Kristopher38

> Yes! And I plan to integrate it (at least partially) by reporting branch mispredictions as exceptions in the next PR or the one after it.

Awesome, can't wait to see it happen! I might even bang out some nontrivial branch predictor if my time constraints allow it.

tilk

LGTM

tilk · 2023-12-16T19:13:22Z

Unfortunately, Fmax went down a lot, this needs to be investigated.

piotro888 · 2023-12-16T21:46:18Z

@tilk
I managed to fix the Fmax problem by inserting a FIFO in front of the ExceptionCauseRegister report method (back to 52 MHz). I will post a PR tomorrow, but I don't think this is the real problem that we are trying to solve.
Before this PR (logs from the previous commit), the JB unit was on the critical path with a ~~weird~~ dependency of RS_Optype->Decoder->JB_taken->RS_Funct3(???)->FIFO_branch. ~~Maybe I've read it incorrectly, but in any case,~~ it doesn't seem good to me that the JB FU takes our entire timing margin. This PR increased the already critical path in JBUnit by "only" 2 ns, so it became visible. I think it needs further investigation.
Also, none of the above explains the gigantic spike in Carry LUT usage and the IPC loss on some benchmarks.
What do you think about it?

tilk · 2023-12-16T21:58:46Z

> I managed to fix the Fmax problem by inserting a FIFO in front of the ExceptionCauseRegister report method (back to 52 MHz).

Is the additional delay to ExceptionCauseRegister safe? I fear a race condition can happen.

> Maybe I've read it incorrectly, but in any case, it doesn't seem good to me that the JB FU takes our entire timing margin. This PR increased the already critical path in JBUnit by "only" 2 ns, so it became visible. I think it needs further investigation.

Indeed, this looks fishy. I wonder what will happen when jumps no longer stall the fetcher.

> Also, none of the above explains the gigantic spike in Carry LUT usage and the IPC loss on some benchmarks.

I wouldn't worry about the Carry LUT spike on the basic core; a corresponding spike is not seen on the full core. The IPC loss is possibly due to more transaction conflicts. I want to introduce a simple Transactron profiler soon, which will hopefully help find such bottlenecks.

piotro888 · 2023-12-17T11:41:50Z

Haha, I know now.

- Carry LUT and RAM LUT usage increased because I added CSRUnit to the basic core config. Without it, all current CSR registers were optimized away. This is also why there is no spike on the full core config. What's interesting is that device utilization decreased, but that has previously happened for no reason too.
- IPC problem: yes, it is caused by adding method conflicts in Fetch. When implementing it, I looked at the waveforms and I was sure that I had seen the Retirement transaction running at the same time as the Fetcher verify, so there should have been no degradation. But it looks like I was wrong. Anyway, adding conflicts was only a nicer-looking alternative to adding a run condition for the verify transaction in Retirement. It is equivalent, because stall_exception is also only used in Retirement. I only need to revert to the previous solution and IPC should be the same.
- Fmax: as in my previous comment (I haven't looked at JBUnit yet). It is probably safe, but I need to think it through for a moment.

piotro888 added the enhancement New feature or request label Dec 3, 2023

piotro888 added this to the Implement machine mode ISA milestone Dec 3, 2023

piotro888 linked an issue Dec 3, 2023 that may be closed by this pull request

Interrupt implementation #97

Closed

piotro888 added 3 commits December 5, 2023 10:21

Interrupt handling core

b4e3f49

mret witk re-trigger

78c9352

trigger interrupt after CSR change

b7f42b2

piotro888 force-pushed the async-interrupts branch from dc75ef0 to b7f42b2 Compare December 5, 2023 09:23

piotro888 added 2 commits December 6, 2023 21:01

Separate FU for proper mret handling

5302823

Revert previous mret handling method

7b21cc3

piotro888 added the microarch Involves the processor's microarchitecture label Dec 6, 2023

All fixes

0515e2c

piotro888 force-pushed the async-interrupts branch from f369afc to 0515e2c Compare December 6, 2023 21:57

lekcyjna123 reviewed Dec 11, 2023

piotro888 and others added 5 commits December 13, 2023 19:44

Include interrupt support in core configs

0cc1bbf

Port @Kristopher38 asm interrupt tests from feature/interrupts

1df2e4e

Co-authored-by: Kristopher38 <[email protected]>

Retirement resume fix

084a486

add pending interrupts to test

5c7c24d

Fix missing fetch conflict

6ca784d

piotro888 marked this pull request as ready for review December 13, 2023 19:50

piotro888 requested review from lekcyjna123 and tilk December 13, 2023 20:01

lekcyjna123 approved these changes Dec 15, 2023

Kristopher38 reviewed Dec 15, 2023

piotro888 added 3 commits December 16, 2023 15:21

Comments update

1bde8c7

Test adjustments

9f4498f

Merge branch 'master' into async-interrupts

b302b08

piotro888 force-pushed the async-interrupts branch from d42cac7 to b302b08 Compare December 16, 2023 15:30

piotro888 requested a review from Kristopher38 December 16, 2023 15:30

Kristopher38 approved these changes Dec 16, 2023

tilk approved these changes Dec 16, 2023

tilk merged commit c63beb4 into master Dec 16, 2023
8 checks passed

tilk deleted the async-interrupts branch December 16, 2023 18:50

github-actions bot pushed a commit that referenced this pull request Dec 16, 2023

Asynchronous interrupts (#532)

2275d3b

piotro888 mentioned this pull request Dec 18, 2023

Fix performance problems after #532 #545

Merged

tilk pushed a commit that referenced this pull request Jan 4, 2024

Fix performance problems after #532 (#545)

9b5a156

github-actions bot pushed a commit that referenced this pull request Jan 4, 2024

Fix performance problems after #532 (#545)

c3152ee
