Information relevant to programming Blackhole while it is being brought up.

| | Tensix: Total | Tensix: Available for Compute | Tensix: L1 | Ethernet: Total | Ethernet: Programmability | DRAM: Total | DRAM: Bank Size | DRAM: Programmability | NoC Alignment: DRAM | NoC Alignment: PCIe | NoC Alignment: L1 | NoC: Multicast |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Wormhole N150 | 8x10 | 8x8 | 1464 KB | 16 | 1x RISC-V, 256 KB L1 | 12 banks | 1 GB | N/A | Read: 32B, Write: 16B | Read: 32B, Write: 16B | Read: 16B, Write: 16B | Rectangular |
| Blackhole | 14x10 | 13x10 | 1464 KB (data cache added) | 14 | 2x RISC-V, 512 KB L1 | 8 banks | ~4 GB | 1x RISC-V, 128 KB L1 | Read: 64B, Write: 16B | Read: 64B, Write: 16B | Read: 16B, Write: 16B | Rectangular, Strided, L-shaped |

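
The alignment values in the table matter when porting code, because a DRAM or PCIe read address that was valid on Wormhole (32B-aligned) may be misaligned on Blackhole (64B-aligned). The following is a minimal host-side sketch of that check. The struct, constant, and function names are hypothetical and are not part of the runtime; only the byte values come from the table.

```cpp
#include <cassert>
#include <cstdint>

// Minimum NoC alignments in bytes, copied from the table above.
// NocAlignment and the k* constants are illustrative names, not runtime APIs.
struct NocAlignment {
    std::uint32_t read_bytes;
    std::uint32_t write_bytes;
};

constexpr NocAlignment kWormholeDram{32, 16};
constexpr NocAlignment kBlackholeDram{64, 16};
constexpr NocAlignment kBlackholePcie{64, 16};
constexpr NocAlignment kBlackholeL1{16, 16};

// True when addr is a multiple of alignment (alignment must be a power of two).
constexpr bool is_aligned(std::uint64_t addr, std::uint32_t alignment) {
    return (addr & (alignment - 1)) == 0;
}

// Round addr up to the next multiple of alignment (alignment must be a power of two).
constexpr std::uint64_t align_up(std::uint64_t addr, std::uint32_t alignment) {
    return (addr + alignment - 1) & ~static_cast<std::uint64_t>(alignment - 1);
}

// Self-check: 0x1020 is fine for a Wormhole DRAM read but not for a Blackhole one.
inline void check_examples() {
    assert(is_aligned(0x1020, kWormholeDram.read_bytes));
    assert(!is_aligned(0x1020, kBlackholeDram.read_bytes));
    assert(align_up(0x1020, kBlackholeDram.read_bytes) == 0x1040);
    // Write alignment is 16B on both architectures, so writes are unaffected.
    assert(is_aligned(0x1020, kBlackholeDram.write_bytes));
}
```

Rounding buffer addresses and sizes up with `align_up` before issuing reads is one way to keep a kernel valid on both architectures.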
Blackhole added a data cache in L1. When one core writes an address and another core reads it, the reader needs to invalidate its cache only if it has previously read that address.
The cache can be invalidated by calling `invalidate_l1_cache()`.
The cache can be disabled through an env var:
export TT_METAL_DISABLE_L1_DATA_CACHE_RISCVS=<BR,NC,TR,ER>
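
The invalidation rule above can be illustrated with a toy host-side model. This is only a sketch under stated assumptions: `L1Memory` and `CachedReader` are hypothetical classes, writes from the other core are assumed to land directly in L1, and `invalidate()` is assumed to drop every cached entry. It does not use or reproduce the actual `invalidate_l1_cache()` implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

using Address = std::uint32_t;
using Word = std::uint32_t;

// Toy stand-in for a core's L1. Writes from another core land here directly (assumption).
struct L1Memory {
    std::unordered_map<Address, Word> words;

    void write(Address addr, Word value) { words[addr] = value; }

    Word load(Address addr) const {
        auto it = words.find(addr);
        return it == words.end() ? 0 : it->second;
    }
};

// Toy stand-in for a reading RISC-V with a data cache in front of L1.
class CachedReader {
public:
    explicit CachedReader(const L1Memory& l1) : l1_(l1) {}

    // A hit returns the cached copy, even if L1 has changed since it was cached.
    Word read(Address addr) {
        auto it = cache_.find(addr);
        if (it != cache_.end()) return it->second;
        Word value = l1_.load(addr);
        cache_[addr] = value;
        return value;
    }

    // Models the effect of an invalidate call: drop every cached entry (assumption).
    void invalidate() { cache_.clear(); }

private:
    const L1Memory& l1_;
    std::unordered_map<Address, Word> cache_;
};

inline void demo_invalidation_rule() {
    L1Memory l1;
    CachedReader reader(l1);

    // Address never read before: the first read misses and sees fresh data.
    l1.write(0x100, 7);
    assert(reader.read(0x100) == 7);

    // Address already cached, then updated by another core: the read is stale.
    l1.write(0x100, 9);
    assert(reader.read(0x100) == 7);

    // After invalidating, the reader observes the new value.
    reader.invalidate();
    assert(reader.read(0x100) == 9);
}
```

In this model the stale read only occurs for an address the reader has touched before, which is the case the text says requires an invalidate.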
The runtime has not yet enabled access to the second RISC-V on the Ethernet cores.
Fast dispatch can be run out of ethernet cores.
The runtime has not yet enabled access to program the RISC-V on DRAM.
Non-rectangular multicast shapes have not been tested yet.
Kernels written for previous architectures sometimes issue NoC commands without explicit flushes. On BH this caused non-deterministic (ND) mismatches or hangs, because data and semaphore values were updated before the NoC had a chance to service the command. Adding flushes resolves these issues. Previous architectures did not need the flushes because their RISC-to-L1 latency was higher relative to NoC latency.
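
The hazard can be sketched with a toy host-side model in which a queued write samples its source buffer only when it is serviced. `ToyNoc` and both helper functions are hypothetical and exist only for illustration; the model's `flush()` stands in for whichever flush or barrier call a real kernel uses, and the sketch makes no claim about actual NoC timing.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <vector>

// Toy NoC: a write command records where to read from, not the data itself.
// The source is sampled only when the command is serviced.
class ToyNoc {
private:
    struct Command {
        const std::uint32_t* src;
        std::uint32_t* dst;
    };
    std::deque<Command> pending_;

public:
    // Queue a one-word write; nothing is copied yet.
    void async_write(const std::uint32_t* src, std::uint32_t* dst) {
        assert(src != nullptr && dst != nullptr);
        pending_.push_back(Command{src, dst});
    }

    // Service every queued command, in order.
    void flush() {
        for (const Command& c : pending_) *c.dst = *c.src;
        pending_.clear();
    }
};

// The source buffer is reused before the first command is serviced,
// so both destinations end up with the second value.
inline std::vector<std::uint32_t> send_without_flush() {
    ToyNoc noc;
    std::uint32_t src = 0, dst_a = 0, dst_b = 0;
    src = 1;
    noc.async_write(&src, &dst_a);
    src = 2;  // overwrites data the first command has not read yet
    noc.async_write(&src, &dst_b);
    noc.flush();
    return {dst_a, dst_b};
}

// Flushing before reusing the source buffer gives each destination its own value.
inline std::vector<std::uint32_t> send_with_flush() {
    ToyNoc noc;
    std::uint32_t src = 0, dst_a = 0, dst_b = 0;
    src = 1;
    noc.async_write(&src, &dst_a);
    noc.flush();
    src = 2;
    noc.async_write(&src, &dst_b);
    noc.flush();
    return {dst_a, dst_b};
}
```

In the model, `send_without_flush()` delivers the second value to both destinations, while `send_with_flush()` delivers 1 and 2 as intended.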
Debug tools are functional on BH, and it is recommended to use Watcher when triaging Op failures to catch potential alignment issues. Disabling the L1 cache can help identify missed cache invalidations.
Depending on the firmware, reset via `tt-smi -r 0` may not work and the board will need to be rebooted.
Bringing up full post-commit on BH is a work in progress; currently only the cpp tests are run. It is triggered on pushes to main, but we have seen some instability on the machines, with ND failures.
Please file any issues or instances of ND behaviour to the Blackhole board.