Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Asynchronous interrupts #532

Merged
merged 14 commits into from
Dec 16, 2023
Merged

Asynchronous interrupts #532

merged 14 commits into from
Dec 16, 2023

Conversation

piotro888
Copy link
Member

Implementations of asynchronous interrupts based on idea that we came up with @lekcyjna123 and @Arusekk

Async interrupts are handled just like exceptions (via internal ExceptionCauseRegister) and are inserted only at branch instructions (which already know its pc in FU) (+ conditonally immediately after (at) MRET and CSR because it is required by spec)
This way we don't need to store PC in ROB or use it in all FUs. Additionaly the same logic would be used for handling mispredictions (very easy to do)!

InterruptController is temporary solution and will be replaced with proper CSR based controller with the same interface in other PR (in progress)

Marked as draft because I want to add some simple tests (+ probably port ones from feature/interrupts)

Of course depends on #523 (and needed to see a nice diff :) )

@piotro888 piotro888 added the enhancement New feature or request label Dec 3, 2023
@piotro888 piotro888 added this to the Implement machine mode ISA milestone Dec 3, 2023
@piotro888 piotro888 linked an issue Dec 3, 2023 that may be closed by this pull request
@piotro888 piotro888 added the microarch Involves the processor's microarchitecture label Dec 6, 2023
Copy link
Contributor

@lekcyjna123 lekcyjna123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@piotro888 piotro888 marked this pull request as ready for review December 13, 2023 19:50
@piotro888
Copy link
Member Author

PR is ready - I integrated @Kristopher38 's tests from feature/interrupts and extended them to cover pending interrupts during ISR execution.

Two bugs were found and fixed:

  • Missing conflict for exception_stall and fetch_verify at the same cycle (probably could happen also on exceptions) - fetch was stalling due to invalid state
  • Core not resumed after commiting instruction with async interrupt that is the only instruction in core (introduced and only related to this PR)

I added Zicsr to basic_core_config, because it supported ExceptionUnit, and handling them wouldn't make to much sense without csr support (mepc etc)

@Kristopher38
Copy link
Member

Does this mean we're really close to getting speculative execution?

Copy link
Member

@Kristopher38 Kristopher38 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Un)healthy does of bikeshedding - finished reviewing this and turns out almost all of my comments are about comments, but it's probably a good sign if anything.

coreblocks/core.py Show resolved Hide resolved
coreblocks/stages/retirement.py Outdated Show resolved Hide resolved
test/stages/test_retirement.py Show resolved Hide resolved
Comment on lines 340 to 341
# 10-15 is the smallest feasible cycle count between interrupts to provide forward progress
("interrupt.asm", 300, {4: 21, 8: 9349}, {2: 21, 7: 9349, 31: 0xDE}, 10, 15),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you tried to lower hi and lo params? This comment was true in my implementation but might no longer hold.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it turns out we always have forward progression! updated values

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't we required by the spec to handle next interrupt immediately after the previous one returns?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, but hi/lo reffers only to execution cycles while not in ISR. Interrupts that happen during ISR and are executed immediately at MRET are triggred by independent random() < 0.4 condition in test.

test/test_core.py Outdated Show resolved Hide resolved
coreblocks/structs_common/csr.py Show resolved Hide resolved
coreblocks/fu/priv.py Outdated Show resolved Hide resolved
@piotro888
Copy link
Member Author

Does this mean we're really close to getting speculative execution?

Yes! And I plan it to integrate if (at least partially) via reporting branch mispredictions as exceptions in next, or nextnext PR.
Things that will be still TODO:

  • Support for branch predictor (for now I think i will hard-code pc+4 prediction to jb unit)
  • Experimenting on performance of different branch predictors

Things TODO for future (that we forgot about):

  • Our exception handing method/core flushing is the slowest possible one - maybe optimize it/write new (or make faster one for jumps, see next point)
  • Further optimization or alternative - make checkpoints on speculative jumps, so if we have missprediction, we won't need to wait unit jump is retired to clear whole ROB, but we can discard missed speculation path prefix of it, and start fetching and executing proper path immediately. Can be done with separate structure, and only for jumps, real exceptions/interrupts could use slower method.

Copy link
Member

@Kristopher38 Kristopher38 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! And I plan it to integrate if (at least partially) via reporting branch mispredictions as exceptions in next, or nextnext PR.

Awesome, can't wait to see it happen! I might even bang out some nontrivial branch predictor if my time constraints allow it.

Copy link
Member

@tilk tilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@tilk tilk merged commit c63beb4 into master Dec 16, 2023
8 checks passed
@tilk tilk deleted the async-interrupts branch December 16, 2023 18:50
github-actions bot pushed a commit that referenced this pull request Dec 16, 2023
@tilk
Copy link
Member

tilk commented Dec 16, 2023

Unfortunately, Fmax went down a lot, this needs to be investigated.

@piotro888
Copy link
Member Author

piotro888 commented Dec 16, 2023

@tilk
I managed to fix the FMax problem by inserting FIFO in front of ExceptionCauseRegister report method (back to 52 MHz). I will post PR tommorow, but I don't think this is the real problem that we are trying to solve.
Before this PR (logs from previous commit), JB unit was on critical path with a some weird dependency of RS_Optype->Decoder->JB_taken->RS_Funct3(???)->FIFO_branch. Maybe I've read it incorrectly, but in any case, it doesn't seem good to me, that JB FU takes our entire timing margin. This PR increased the already critical path in JBUnit by "only" 2 ns, so it become visible. I think it needs further investigation.
Also none of above explain gigantic spike in Carry LUT usage and IPC loss on some benchamarks.
What do you think about it?

@tilk
Copy link
Member

tilk commented Dec 16, 2023

I managed to fix the FMax problem by inserting FIFO in front of ExceptionCauseRegister report method (back to 52 MHz).

Is the additional delay to ExceptionCauseRegister safe? I fear a race condition can happen.

Maybe I've read it incorrectly, but in any case, it doesn't seem good to me, that JB FU takes our entire timing margin. This PR increased the already critical path in JBUnit by "only" 2 ns, so it become visible. I think it needs further investigation.

Indeed, this looks fishy. I wonder what will happen when jumps will no longer stall the fetcher.

Also none of above explain gigantic spike in Carry LUT usage and IPC loss on some benchamarks.

I wouldn't worry about the Carry LUT spike on the basic core, a corresponding spike is not seen on the full core. IPC loss is possibly due to more transaction conflicts. I want to introduce a simple Transactron profiler soon, which will hopefully help find such bottlenecks.

@piotro888
Copy link
Member Author

piotro888 commented Dec 17, 2023

Haha, I know now.

  • Carry LUT and RAM LUTs increased, because I added CSRUnit to Basic Core Config. Without it all current CSR registers were optimized away. This is also why there is no spike on full core config. Whats interesting is that device utilization decreased, but it happened previously for no reason too.
  • IPC problem - yes it is caused by adding method conflicts in Fetch. When implementing it, I looked at waveforms and I'm sure that I checked and saw that Retirement transaction could run in the same time with Fetcher verify - so there should be no degradation. But it looks like I was wrong. Anyway adding conflicts was only a nicer looking alternative for adding run condition for Verify Transaction in Retirement. It is equivalent, because stall_exception is also only used in Retirement. I only need to revert to the previous solution and IPC should be the same.
  • FMax as in my previous comment (I didn't look at JBUnit yet). Probably it is safe, I need to think it through for a moment

tilk pushed a commit that referenced this pull request Jan 4, 2024
github-actions bot pushed a commit that referenced this pull request Jan 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request microarch Involves the processor's microarchitecture
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Interrupt implementation
4 participants