-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[major] Fine-grind parameters and capabilities for memories #96
base: main
Are you sure you want to change the base?
Conversation
We may need an additional custom port for mbist/dft. This port will be ignored in the behavioral simulation. But exist in the post synthesis flow. |
@sequencer Is it sufficient to have a
|
spec.md
Outdated
read => no | ||
write => with-mask | ||
write-latency => 1 | ||
custom-port => {a:UInt<4>, flip b:UInt<2>} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this match syntax more? Something like:
port custom: {a:UInt<4>, flip b:UInt<2>}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed. Maybe we should use another keyward (e.g. custom-port) to avoid syntax ambiguity when parsing:
custom-port custom_a => {a:Uint<4>, flip b:Uint<2>}
The motivation behind the custom-port =>
syntax is that there should only be at most one custom port for a single memory. But this may cause implicit naming collision with other ports. So let's also include the port name here.
Using =>
is to keep in line with other single-value configuration. See below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated in e6311a7 to use custom-port custom => {a:UInt<4>, flip b:UInt<2>}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why should there be at most one custom port?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the custom-port custom => ...
syntax, this limitation is lifted, and there can be multiple custom ports now. However it might still be desirable.
If people requires multiple custom ports, they can always aggregate them into a single one. Limiting the number of custom ports may simplify compilers.
spec.md
Outdated
corresponding element within the memory holds the new value. | ||
|
||
6. A read-under-write flag indicating the behavior when a memory location is | ||
3. A variable number of named ports, each having following parameters: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"each having the following parameters in this order:"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in e6311a7.
read-under-write => undefined | ||
port r1: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's weird to mix =>
syntax and :
syntax.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The motivation of using :
is to distinguish a indented block from a single configuration value. Using a different delimitator might make writing parser a little bit easier?
flip rdata: T, | ||
wdata: T, | ||
wmask: M | ||
} | ||
``` | ||
|
||
where `M`{.firrtl} is the mask type calculated from the element type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the mask actually help anywhere? It seems it is a feature which really only makes sense assuming a certain compilation strategy (lowering aggregate memories to multiple memories, in which the mask becomes the enable flag).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, this is a abstraction over different possible hardware memory implementations (some SRAM allows bit or byte masks). Aggregated memory lowering might be part of memory synthesis, which one might want to delay until backend flow.
Also, this might be good for behavioral simulations performance?
port or a readwrite port that is used to initiate a write operation on a given | ||
clock edge, where `en`{.firrtl} is set and, for a readwriter, `wmode`{.firrtl} | ||
is set. An "active read port" is a read port or a readwrite port that is used to | ||
For the purpose of defining such collisions, an "active write port" is a port |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This paragraph made more sense for a single clock. When each port can have a different clock, what is "a given clock edge"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a great point.
This description mostly came from the original read-under-write specification. It also has a separate paragraph after this one specific to independently clocked ports, which has been kept.
After a closer look at the original specification, it seems that it has a very imprecise definition of the semantics of a read or write. The first part of the original specification seems to imply that all ports share a single clock (a clock that's globally agreed on). Also writes to memories are atomic w.r.t. a globally-agreed memory content, and reads are also atomic.
There is also the problem of overlapping writes. The original spec completely omitted such possibility, since it modeled that writes happens atomically after write-latency.
- If we would like to formally define the semantics of ALL possible overlapping transactions, it might be more convenient and precise to use the language of observability, similar to those used in defining memory orderings.
- If we are happy with atomic writes, then we can just change mentions of "cycle" to "edge w.r.t. the port's clock", and everything will be fine ™️.
If we settled with the language of observability of read / write transactions, the original specification seems to intend to define the following semantics:
- If the writer and the reader have the same clock, and there's an overlap timewise (w.r.t. wall time) between the write transaction and the read transaction:
- If the
read-under-write
flag isold
, and the read transaction's start is before the entire write transaction (w.r.t. wall time), the read SHOULD NOT observe the write. 1 - If the
read-under-write
flag isnew
, and the read transaction's end is after the entire write transaction (w.r.t. wall time), the read SHOULD observe the write. - Otherwise, the result of the read is entirely undefined. This is stricter than saying that whether the read observe the write is undefined: it may read part of the write, or some other random bits out of nowhere.
- If the
- If the writer and the reader have different clocks, and there's an overlap timewise (w.r.t. wall time) between the write transaction and the read transaction, the result of the read is entirely undefined.
Would this be a more precise definition of the read under write policy? If we decide to use this definition, we may keep a shorter version of the original paragraph as a more intuitive explanation of the aforementioned formal semantics.
Footnotes
-
This description differs from the original spec in the sense that if a read starts inside a active write, the original spec says the read SHOULD NOT observe the write (the write happens atomically after write-latency). Here we change it to undefined. ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not well versed in what real memories prefer to do and how they deal with some of these issues. Things like "what happens when you clock gate in the middle of a write window?" , "is there an internal clock all transactions are referenced too (e.g. 'wall-clock' time above)?". (Hopefully) Obviously firrtl semantics will pick a reasonable subset of all possible behaviors with an eye towards 90% user-case behavior.
Incidentally, talking to some FPGA folks, they classically build their own bypass logic around simpler memories to get the read-under-write behavior they want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sequencer What's your thoughts on these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So basically there is already an annotation designed for the custom port which makes MBIST work. However the annotation makes it directly export to Top level, this may satisfy an IP vendor, but for those who use chisel to design the full chip, this introduces additional problems.
We all agree that MBIST is necessary for ASIC designers. However it's always provided by vendors, thus a custom port connecting from each memory, and wire to the BIST controller is also necessary.
The main issue is we cannot model it in firrtl since for each memory IP vendor, their MBIST design might be different. Thus I think it's necessary to provide a custom port attribute in FIRRTL, but forbid it to generate a behavior model.
Sorry for the late reply...
another thing to mention is: it's not possible to emulate the behavior for different DFT/MBIST implementation. I think it's necessary to forbid compiler to lower behavior memory with the custom port being configured. |
@sequencer Then it's more convenient to define the behavior of custom ports during behavioral simulations to be completely undefined. Updated in e6311a7 |
This PR changes memory statements in FIRRTL to allow more fine-grind control over parameters on memories, allowing:
In short, a memory in the syntax proposed in this PR would look like this:
Note that this will be a breaking change. Although it would be easy to make the old port declaration syntax (e.g.
reader => r1
) a shorthand for the new ones, signal names is also different in this new approach (e.g.wdata
vsdata
forwriter
s). It would add a lot of burden to the spec to have different signal names in ports with different sets of capabilities.This is a draft PR. The ideas presented here is still in early design stage,
and we've not yet added changelogs for it.Pending problems
Port-pairwise read-under-write policy
In theory, there might be memories that have different read-under-write policies between different pairs of ports. If we would like to characterize such behavior, we would need to allow specifying RUW policies for all (ordered) pairs of ports in the form of, let's say, a matrix.
Though possible, we believe this would be a highly uncommon scenario. So another way of settling this is to allow a special value indicating a non-existing or non-standard global read-under-write policy for the memory instance, and allow compilers and tools to have their own implementation-dependent way of specifying those policies:
Another way is to simply provide no support such memories. Users would need to resort to using
extmodule
s.Alternative read / write latency syntax
Currently from a pure syntactical POV, we doesn't enforce ports that have read capability to have a
read-latency
settings. We may make this requirement purely syntactical by requiring users to provide latencies when specifying capabilities, e.g.:Backward compatibility
To make this new syntax backward compatible, we would have to:
reader => r1
a shorthand for the new syntax.read-latency
andwrite-latency
in global scope, and specify the validity and semantics of ports that have no read / write latency specified, or two incompatible values specified.