nextpnr sometimes produces invalid bitstream for LIFCL-17 #903

danc86 · 2022-02-02T09:51:43Z

Sorry for the vague issue title. I think I have found a problem with nextpnr routing, but I'm not sure where exactly.

I'm working on a Litex-based design in CFU-Playground with a Vexriscv CPU and a large ML accelerator, targeting Crosslink-NX 17 (LIFCL-17). I arrived at a design which meets timing, but the bitstream doesn't work on my board. The CPU never produces any UART output and the Litex LED chaser pattern does not light up, which usually means that the FPGA configuration logic rejected the bitstream.

I'm testing with nextpnr current master branch (commit c306ef1) and yosys current master branch (commit bc027b2cae9a85b887684930705762fac720b529).

You can build the "bad" design from my nextpnr-bug-jan2022 branch, in proj/hps_accel.

I'm also attaching the generated Verilog sources, the build script, and the intermediate files and resulting bitstream here:
nextpnr-bug-jan2022.zip

If I change anything in the design, whether making it smaller (by taking out some small Litex CSR) or larger (by putting back some small Litex CSR), it starts working.

If I use --router router1 on the same design, it works.

If I use a different seed with router2, it works.

If I pass the new --estimate-delay-mult 30 option, it works.
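For anyone wanting to repeat these four experiments side by side, here is a minimal sketch that only builds the command lines. It assumes the usual `nextpnr-nexus` options `--device`, `--json`, `--pdc`, `--fasm`, `--seed` and `--router`; the file names, the device string and the helper name `nextpnr_variants` are hypothetical placeholders, not taken from the attached build script. The seeds 38 and 1038 are the two named in this thread.

```python
import shlex


def nextpnr_variants(json_path, pdc_path, device, bad_seed, other_seed):
    """Build one nextpnr-nexus command line per experiment described above.

    Returns a dict mapping a label to an argv list. Nothing is executed here.
    """
    base = ["nextpnr-nexus", "--device", device,
            "--json", json_path, "--pdc", pdc_path]

    def run(label, extra):
        # Each variant writes its own FASM file so the results can be compared.
        return base + extra + ["--fasm", f"{label}.fasm"]

    return {
        "router2-bad-seed": run(
            "router2-bad-seed",
            ["--router", "router2", "--seed", str(bad_seed)]),
        "router1-bad-seed": run(
            "router1-bad-seed",
            ["--router", "router1", "--seed", str(bad_seed)]),
        "router2-other-seed": run(
            "router2-other-seed",
            ["--router", "router2", "--seed", str(other_seed)]),
        "router2-delay-mult": run(
            "router2-delay-mult",
            ["--router", "router2", "--seed", str(bad_seed),
             "--estimate-delay-mult", "30"]),
    }


if __name__ == "__main__":
    # Placeholder inputs: substitute the real netlist, constraints and device.
    variants = nextpnr_variants("design.json", "design.pdc",
                                "DEVICE-PLACEHOLDER", 38, 1038)
    for label, argv in variants.items():
        print(f"{label}: {shlex.join(argv)}")
```

Only the first variant is expected to reproduce the failure; the other three correspond to the workarounds listed above.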

I think this means there is some rarely used configuration that doesn't work properly: this design got unlucky and hit it, while any change to the routing happens to avoid the problem. It feels similar to gatecat/prjoxide#10, which was fixed by #730.


gatecat · 2022-02-02T13:03:26Z

Thanks, looking into this now, it would be really useful to have a couple of the known good designs (with the bitstreams/fasm) to compare too if that's easily doable.

gatecat · 2022-02-02T13:04:50Z

> The CPU never produces any UART output and the Litex LED chaser pattern does not light up, which usually means that the FPGA configuration logic rejected the bitstream.

If you have an easy way of checking, can you see if the outputs are high-impedance or being actively driven but stuck? This would distinguish between a totally rejected bitstream and one that was accepted but with the clock not running, although either of those is easier to debug than subtly broken logic!

danc86 · 2022-02-02T23:10:24Z

Here's a tarball of the exact same design, built with seed 1038 instead of 38. This bitstream works fine:
nextpnr-bug-jan2022-good.zip

This board has a test point for the DONE pin. I can check if it stays low after loading the bad bitstream. That's probably the best way to know if the configuration logic has rejected it. I can visit the lab tomorrow and try that.

gatecat · 2022-02-03T15:01:27Z

Thanks for your help with this!

Would you be able to give #905 a try? (I need to look at the routethru situation a bit in general, but it's the most obvious potentially wrong thing I see).

danc86 · 2022-02-03T21:36:11Z

Thanks for the quick fix! I tried PR#905 on the bad design with the bad seed and it does indeed fix it.

The only thing that worries me a little is that changing almost anything about the routing would "fix" the issue before. And disabling these elements would probably send the router down a totally different solution in the same way that changing the seed would. So it might just be pure luck that it starts working now?

On the other hand, if the bad design was indeed using these unimplemented DCS elements then it makes sense that it was the cause of the problem.
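One way to check this directly would be to look for DCS-related features in the FASM of the bad build that are absent from the good one. The following is a rough sketch under stated assumptions: it treats each non-comment FASM line as one feature, and it assumes the relevant routing features contain the substring "DCS" in their names, which has not been verified against the real prjoxide naming. The feature names in the usage example are invented for illustration.

```python
def fasm_features(text):
    """Return the set of feature lines in a FASM file, ignoring '#' comments."""
    features = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            features.add(line)
    return features


def suspicious_features(bad_text, good_text, keyword):
    """List features present only in the bad FASM whose name contains keyword.

    The match is case-insensitive. The keyword is a guess at how the
    features of interest are named, not a confirmed convention.
    """
    only_in_bad = fasm_features(bad_text) - fasm_features(good_text)
    return sorted(f for f in only_in_bad if keyword.lower() in f.lower())


if __name__ == "__main__":
    # Hypothetical feature names, for illustration only.
    bad = "TILE_A.PIP.X.Y\nTILE_B.PIP.EXAMPLE_DCS_OUT.EXAMPLE_DCS_IN\n"
    good = "TILE_A.PIP.X.Y\n"
    for feature in suspicious_features(bad, good, "DCS"):
        print(feature)
```

Because a different seed changes placement and routing throughout the design, the raw difference between the two files will be large; the keyword filter is what narrows it down. An empty result would suggest the bad design never used these elements, which would point toward the "pure luck" explanation.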

gatecat · 2022-02-03T21:40:22Z

Yes, unfortunately it also seems plausible that this just permuted the routing enough to work again. I'll keep the ticket open in case this reoccurs, but I'll merge that PR as I don't think it makes anything else worse.

danc86 mentioned this issue Feb 2, 2022

hps: test without master SPI google/CFU-Playground#430

Closed

gatecat mentioned this issue Feb 3, 2022

nexus: Hotfix to disable unimplemented DCS routethru #905

Merged

danc86 mentioned this issue Feb 14, 2022

mmap: make CSR optional litex-hub/litespi#67

Merged

gatecat closed this as completed Sep 26, 2024
