Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Striped multiport fifo #15

Open
lekcyjna123 opened this issue Jul 5, 2023 · 2 comments
Open

Striped multiport fifo #15

lekcyjna123 opened this issue Jul 5, 2023 · 2 comments

Comments

@lekcyjna123
Copy link
Contributor

lekcyjna123 commented Jul 5, 2023

Currently we have implemented a MultiportFifo, that gives very strong guarantees:

  • all elements are in time order
  • always maximum subset of methods is usable

But this comes at cost of exponential complexity, and sometimes we can live with weaker guarantees:

  • all elements are in time order
  • always maximum subset of methods which operates on elements from the same stripe are ready.

As an example, lets take 2 port fifo with content:

  • cycle 1: a, b
  • cycle 2: c, d

Now we start to pop elements.
If we pop from such queue one element we get in first cycle a or b. Let assume that we got a, on the beginning of the second cycle of popping we have:

  • cycle 1: b
  • cycle 2: c, d

MultiportFifo guarantees that we can pop now two elements b and either c or d. In StrippedMultiportFifo it will be enough if we get only b (so second port will be unused).

Such structure can be useful to pass elements between pipeline elements if we care about order between cycles, but don't care about full pipeline usage.

@tilk
Copy link
Member

tilk commented Jul 8, 2023

If you don't care about throughput, why not use a normal FIFO?

The thing you are suggesting will have pretty weird performance behavior, and the only way to reduce the weirdness is to increase the number of internal queues while keeping the number of ports.

Also, did you mean "striped"?

@lekcyjna123 lekcyjna123 changed the title Stripped multiport fifo Striped multiport fifo Jul 8, 2023
@lekcyjna123
Copy link
Contributor Author

lekcyjna123 commented Jul 8, 2023

If you don't care about throughput, why not use a normal FIFO?

Because in some cases we can optimistically assume, that nearly always all output port will be ready if at least one is ready. In such case using normal FIFO will be not useful because it will always limit to only 1 port instead. Additionally we can not use simply n different normal FIFO because we can need to keep in order data which are send in different cycles.

One example about which I thought while creating this issue:
I can get an vector instruction, which need to be translated to 24 simpler ones. Original instruction do widening addition data in 8 registers simultaneously. So it have to be translated to:

  • widen rs1
  • widen rs2
  • add rs1 and rs2
    So for each register pair this instructions have to be ordered (I have to first widen and after that add data), but job can be parallelised over all 8 destination registers. So I get 3 batches each of 8 instructions. I don't care about order of execution inside a batch but order between batches is important, so each batch can be a stripe in fifo.

The thing you are suggesting will have pretty weird performance behavior, and the only way to reduce the weirdness is to increase the number of internal queues while keeping the number of ports.

I don't understand why performance behaviour will be weird? In optimistic case whole stripe will be read at once. If there are some stalls and the stripe can not be read at once, then we have to split it to parts and first we process as much data as possible and in next cycle rest.

Also, did you mean "striped"?

Fixed


EDIT: I solved my issue with vector registers without striped queue, but I think that this can be useful in future, because patter seems generic.

@tilk tilk transferred this issue from kuznia-rdzeni/coreblocks Nov 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants