Blackhole Bring-Up Programming Guide

Introduction

Information relevant to programming Blackhole while it is being brought up.

Wormhole N150 vs. Blackhole

| Architecture | Tensix: Total | Tensix: Available for Compute | Tensix: L1 | Ethernet: Total | Ethernet: Programmability | DRAM: Total | DRAM: Bank Size | DRAM: Programmability | NoC Alignment: DRAM | NoC Alignment: PCIe | NoC Alignment: L1 | NoC: Multicast |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Wormhole N150 | 8x10 | 8x8 | 1464 KB | 16 | 1x RISC-V, 256 KB L1 | 12 banks | 1 GB | N/A | Read: 32B, Write: 16B | Read: 32B, Write: 16B | Read: 16B, Write: 16B | Rectangular |
| Blackhole | 14x10 | 13x10 | 1464 KB, data cache added | 14 | 2x RISC-V, 512 KB L1 | 8 banks | ~4 GB | 1x RISC-V, 128 KB L1 | Read: 64B, Write: 16B | Read: 64B, Write: 16B | Read: 16B, Write: 16B | Rectangular, Strided, L-shaped |
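The alignment values in the comparison table above can be turned into a small address check. The following is a minimal sketch, not part of any tt-metal API: the `Arch` and `Target` enums and the `noc_alignment`, `is_aligned`, and `align_up` helpers are hypothetical names introduced only for illustration, and the byte values are copied from the table.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical types for illustration only; they do not exist in tt-metal.
enum class Arch { WormholeN150, Blackhole };
enum class Target { Dram, Pcie, L1 };

// NoC alignment requirement in bytes for reads and writes.
struct NocAlignment {
    std::uint32_t read_bytes;
    std::uint32_t write_bytes;
};

// Values copied from the comparison table in this guide.
constexpr NocAlignment noc_alignment(Arch arch, Target target) {
    if (target == Target::L1) {
        return NocAlignment{16, 16};  // identical on both architectures
    }
    // DRAM and PCIe have the same requirements within each architecture.
    return arch == Arch::Blackhole ? NocAlignment{64, 16} : NocAlignment{32, 16};
}

// True if addr is a multiple of alignment.
constexpr bool is_aligned(std::uint64_t addr, std::uint32_t alignment) {
    return addr % alignment == 0;
}

// Rounds addr up to the next multiple of alignment.
constexpr std::uint64_t align_up(std::uint64_t addr, std::uint32_t alignment) {
    return (addr + alignment - 1) / alignment * alignment;
}
```

For example, address `0x1020` satisfies the 32B DRAM read alignment on Wormhole N150 but not the 64B requirement on Blackhole; `align_up(0x1020, 64)` gives `0x1040`.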

L1 Data Cache

Blackhole added a data cache in L1. When one core writes an address and another core reads it, the reader needs to invalidate its cache only if it has previously read that address.

The cache can be invalidated by calling invalidate_l1_cache().
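The invalidation rule can be illustrated with a host-side toy model. This is a sketch under stated assumptions, not hardware-accurate code: `ToyL1` and `ToyReader` are hypothetical classes, the cache is modelled per address rather than per line, and `ToyReader::invalidate()` only stands in for `invalidate_l1_cache()`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Toy stand-in for L1 memory that another core writes to directly.
struct ToyL1 {
    std::vector<std::uint32_t> words = std::vector<std::uint32_t>(16, 0);
};

// Toy reader core with a private data cache (hypothetical model).
class ToyReader {
public:
    explicit ToyReader(const ToyL1& l1) : l1_(l1) {}

    // Returns the cached value on a hit, otherwise loads from L1 and caches it.
    std::uint32_t read(std::size_t addr) {
        auto it = cache_.find(addr);
        if (it != cache_.end()) {
            return it->second;  // hit: may be stale if L1 changed since
        }
        std::uint32_t value = l1_.words[addr];
        cache_[addr] = value;
        return value;
    }

    // Stand-in for invalidate_l1_cache(): drops every cached entry.
    void invalidate() { cache_.clear(); }

private:
    const ToyL1& l1_;
    std::unordered_map<std::size_t, std::uint32_t> cache_;
};
```

In this model, a first read of an address always sees the latest value written to `ToyL1`. A later read of the same address returns the old cached value until `invalidate()` is called, which is the case the paragraph above describes.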

The cache can be disabled through an environment variable:

export TT_METAL_DISABLE_L1_DATA_CACHE_RISCVS=<BR,NC,TR,ER>

Ethernet Cores

The runtime has not yet enabled access to the second RISC-V on the Ethernet cores.

Fast dispatch can be run from the Ethernet cores.

DRAM

The runtime has not yet enabled access for programming the RISC-V on DRAM.

NoC

Non-rectangular multicast shapes have not been tested yet.

Some kernels written for previous architectures issue NoC commands without explicit flushes. On BH these caused non-deterministic (ND) mismatches or hangs, because data and semaphore values were updated before the NoC had a chance to service the command. Adding flushes resolves these failures. Previous architectures did not need the flushes because their RISC-to-L1 latency was higher relative to NoC latency.
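The hazard can be reproduced in a host-side toy model. This is only a sketch: `ToyNoc` and `first_value_received` are hypothetical names, and the model assumes the NoC reads the source buffer when it services a command rather than when the command is issued. It does not use the real dataflow API.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>

// Toy NoC (hypothetical): commands are queued and the source is read
// only when the command is serviced.
class ToyNoc {
public:
    // Queues a one-word write from src to dst without performing it yet.
    void async_write(const std::uint32_t* src, std::uint32_t* dst) {
        pending_.push_back(Command{src, dst});
    }

    // Services every queued command; stands in for an explicit flush.
    void flush() {
        while (!pending_.empty()) {
            Command c = pending_.front();
            pending_.pop_front();
            *c.dst = *c.src;
        }
    }

private:
    struct Command {
        const std::uint32_t* src;
        std::uint32_t* dst;
    };
    std::deque<Command> pending_;
};

// Issues a write of the value 1, then reuses the source buffer for 2.
// Returns what the destination ends up holding for the first write.
inline std::uint32_t first_value_received(bool flush_before_reuse) {
    ToyNoc noc;
    std::uint32_t src = 1;
    std::uint32_t dst = 0;
    noc.async_write(&src, &dst);
    if (flush_before_reuse) {
        noc.flush();  // the write leaves before the buffer changes
    }
    src = 2;      // fast core overwrites the source buffer
    noc.flush();  // the NoC eventually services anything still pending
    return dst;
}
```

Without the flush, the destination receives 2 instead of the intended 1, which corresponds to the mismatches described above; with the flush, it receives 1.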

Debug

Debug tools are functional on BH, and it is recommended to use Watcher when triaging Op failures to catch potential alignment issues. Disabling the L1 cache can help identify missed cache invalidations.

Resetting

Depending on the firmware, reset via tt-smi -r 0 may not work, in which case the board will need to be rebooted.

CI

Bringing up full post-commit on BH is a WIP; currently we only run the cpp tests. They are triggered on pushes to main, but we have seen some instability on the machines, with ND failures.

Issue Tracking

Please file issues, including any instances of ND behaviour, on the Blackhole board.