-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LZA (Leading zeros anticipation) #741
Conversation
coreblocks/func_blocks/fu/fpu/lza.py
Outdated
m = TModule() | ||
|
||
@def_method(m, self.predict_request) | ||
def _(arg): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def _(arg): | |
def _(sig_a, sig_b, carry): |
"new" argument syntax is preferred for small number of arguments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty well, but I left some comments.
|
||
|
||
class LZAModule(Elaboratable): | ||
"""LZA module |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please extend the docstring. Let assume, that reader doesn't know anything about floating point operation. Will he know what is the goal of that module?
Additionaly add a link to the paper, which you attached to the commit msg.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
""" | ||
|
||
def __init__(self, *, fpu_params: FPUParams): | ||
self.predict_in_layout = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the meaning of each subfield?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, added comment explaining each subfield
coreblocks/func_blocks/fu/fpu/lza.py
Outdated
m.d.av_comb += z[0].eq(1) | ||
|
||
for i in reversed(range(1, self.lza_params.sig_width + 1)): | ||
m.d.av_comb += f[i - 1].eq((t[i] ^ ~(z[i - 1]))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is the only usage of z
, so you have ~(~A & ~B)
. I think that this can be simplified and z
can be defined as A | B
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, we can define z
as (sig_a | sig_b)
plus some operation for z[0]
depending on carry
coreblocks/func_blocks/fu/fpu/lza.py
Outdated
m.d.av_comb += z[0].eq(1) | ||
|
||
for i in reversed(range(1, self.lza_params.sig_width + 1)): | ||
m.d.av_comb += f[i - 1].eq((t[i] ^ ~(z[i - 1]))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the shift by one in f
intentional? In paper you have equation (2) f[i] = t[i] ^ ~z[i+1]
which translates to our indexing: f[i] = t[i] ^ ~z[i-1]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is intentional. g
, z
, t
, has a width of sig_width + 1
(to account for carry, which allows us to predict either a+b
or a+b+1
depending on its value), while f
has a width of sig_width
. So basically i
for z
, t
or g
maps to i-1
for f
.
coreblocks/func_blocks/fu/fpu/lza.py
Outdated
m.d.av_comb += f[i - 1].eq((t[i] ^ ~(z[i - 1]))) | ||
|
||
m.d.av_comb += shift_amount.eq(0) | ||
for i in reversed(range(self.lza_params.sig_width)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it is better to use count_leading_zeros
from amaranth_ext
? It has logarithmic critical path and you have linear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem with count_leading_zeroes
is that it requires a number to have a width that is an exact power of 2. Significands usually do not fulfill this requirement. I would have to extend input widths to the nearest power of 2 and do some operations to convert shift for that number to shift for our target width. I could also do it by creating a string L
out of f
that only has 1
on the postion of left-most 1
in f
and then using count_trailing_zeros
, but I don't know if this is worth it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I thought about extending count_leading_zeros
with 1
s to the nearest power of two on LSBs. This will cause that number of zeros returned by count_leading_zeros
for the extended number will be the same as for non-extended, so no additional processing will be needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, extended f
to nearest power of two and filled lower bits with 1
test/func_blocks/fu/test_lza.py
Outdated
self.test_val_sig_a_6 = 8421376 | ||
self.test_val_sig_b_6 = 8421376 | ||
|
||
def test_manual(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add also a random test?
sig_a = randomint()
sig_b = randomint()
pred_lz = lza(sig_a, sig_b)
true_lz = count_leading_zero(sig_a+sig_b)
assert pred_lz == true_lz or pred_lz == true_lz + 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I had to do some more work to ensure that numbers are normalized and a>=b
, but overall the function random_test
looks more or less the same.
d46eadd
to
4440f6d
Compare
test/func_blocks/fu/test_lza.py
Outdated
@@ -37,6 +38,27 @@ def test_manual(self): | |||
help_values = TestLZA.HelpValues(params) | |||
lza = TestLZA.LZAModuleTest(params) | |||
|
|||
def clz(sig_a, sig_b, carry): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be a generic function in test framework to be used also in other tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
test/func_blocks/fu/test_lza.py
Outdated
@@ -121,6 +143,7 @@ def lza_test(): | |||
|
|||
def test_process(): | |||
yield from lza_test() | |||
yield from random_test() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usually when you do random test you want to set a seed and execute more than 1 iteration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
51aeafc
to
2e0a96b
Compare
Pull request #742 was merged, please refactor the test to use the new syntax. |
test/func_blocks/fu/test_lza.py
Outdated
{ | ||
"sig_a": help_values.test_val_sig_a_5, | ||
"sig_b": help_values.test_val_sig_b_5, | ||
"carry": 1, | ||
}, | ||
{ | ||
"sig_a": help_values.test_val_sig_a_5, | ||
"sig_b": help_values.test_val_sig_b_5, | ||
"carry": 1, | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two identical test cases.
Maybe it would be better to generate final test cases by adding "carry" to source list of cases? Separate help_values would also not be needed then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
coreblocks/func_blocks/fu/fpu/lza.py
Outdated
def nearestpow2(n): | ||
a = int(log2(n)) | ||
if 2**a == n: | ||
return n | ||
else: | ||
return 2 ** (a + 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Amaranth ceil_log2()
can be used instead of this function.
Not related to this case as 2**ceil_log2
is short enough to write, but we usually put helper functions that could have use in other places to transactron.utils
library
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
5355a2e
to
4ada0b1
Compare
test/func_blocks/fu/fpu/test_lza.py
Outdated
@@ -139,10 +93,16 @@ async def lza_test(sim: TestbenchContext): | |||
{"shift_amount": 7, "is_zero": 0}, | |||
{"shift_amount": 7, "is_zero": 0}, | |||
] | |||
for i in range(len(test_cases)): | |||
for i in range(len(test_cases) // 2): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why divide by 2? You seem to ignore half of the cases this way. Am I missing something?
(This is one of the reasons why using indexes is discouraged.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it was a mistake on my part. I wanted to remove cases where I explicitly set carry
to 1
by performing two tests per test case and manually setting carry
to 1
in the second one. For some reason, I decided that dividing the number of test cases by two after I deleted redundant ones is a great idea. But as I said this was a mistake, that is now fixed.
|
||
m.d.av_comb += t.eq((sig_a ^ sig_b) << 1) | ||
m.d.av_comb += g.eq((sig_a & sig_b) << 1) | ||
m.d.av_comb += z.eq(((sig_a | sig_b) << 1)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m.d.av_comb += z.eq(((sig_a | sig_b) << 1)) | |
m.d.av_comb += z.eq((sig_a | sig_b) << 1) |
Leading Zero anticipation module implementation for FPU adder-subtractor based on: https://userpages.cs.umbc.edu/phatak/645/supl/lza/lza-survey-arith01.pdf